Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments
Published in Neurocomputing, 2025
Previously presented at the 2023 Adaptive and Learning Agents Workshop (AAMAS Conference)
Published in Results in Engineering, 2024
Abstract: In the realm of human-machine interaction, artificial intelligence has become a powerful tool for accelerating data modeling tasks. Object detection methods have achieved outstanding results and are widely used in critical domains like autonomous driving and video surveillance. However, their adoption in high-risk applications, where errors may cause severe consequences, remains limited. Explainable Artificial Intelligence methods aim to address this issue, but many existing techniques are model-specific and designed for classification tasks, making them less effective for object detection and difficult for non-specialists to interpret.
Published in Open-World Agents workshop at NeurIPS, 2024
Abstract: Sparse reward environments in reinforcement learning (RL) pose significant challenges for exploration, often leading to inefficient or incomplete learning processes. To tackle this issue, this work proposes a teacher-student RL framework that leverages Large Language Models (LLMs) as “teachers” to guide the agent’s learning process by decomposing complex tasks into subgoals.
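A minimal sketch of the teacher-student idea described above, with a canned stub standing in for the LLM teacher; all function names, the example task, and the shaping coefficient are illustrative assumptions, not the paper's actual API:

```python
def teacher_decompose(task: str) -> list[str]:
    """Stand-in for an LLM 'teacher' that splits a task into ordered subgoals."""
    canned = {
        "open the locked door": ["find the key", "pick up the key", "unlock the door"],
    }
    return canned.get(task, [task])

def shaped_reward(achieved: set[str], subgoals: list[str], extrinsic: float) -> float:
    """Dense bonus for the student: fraction of teacher subgoals completed so far."""
    bonus = len(achieved & set(subgoals)) / len(subgoals)
    return extrinsic + 0.1 * bonus
```

In a training loop, the shaped reward would supplement the sparse extrinsic signal whenever the agent completes one of the teacher's subgoals.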
Published in Intrinsically Motivated Open-ended Learning workshop at NeurIPS, 2024
Abstract: Exploration remains a significant challenge in reinforcement learning, especially in environments where extrinsic rewards are sparse or non-existent. The recent rise of foundation models, such as CLIP, offers an opportunity to leverage pretrained, semantically rich embeddings that encapsulate broad and reusable knowledge. In this work, we explore the potential of these foundation models not only to drive exploration, but also to analyze the critical role of the episodic novelty term in enhancing the agent's exploration effectiveness. We also investigate whether providing the intrinsic module with complete state information, rather than just partial observations, can improve exploration, despite the difficulties in handling small variations within large state spaces.
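The episodic novelty term mentioned above can be sketched as a k-nearest-neighbour bonus computed over pretrained (e.g. CLIP-like) observation embeddings: observations far from everything seen this episode earn a large bonus, and the bonus decays as similar embeddings fill the episodic memory. The class below is an illustrative approximation, not the paper's exact formulation:

```python
import numpy as np

class EpisodicNovelty:
    """k-NN episodic novelty bonus over precomputed observation embeddings."""

    def __init__(self, k: int = 3, eps: float = 1e-6):
        self.k = k
        self.eps = eps
        self.memory: list[np.ndarray] = []

    def reset(self) -> None:
        self.memory.clear()  # episodic: wiped at every episode start

    def bonus(self, emb: np.ndarray) -> float:
        if not self.memory:
            self.memory.append(emb)
            return 1.0  # first observation of the episode: maximally novel
        dists = np.linalg.norm(np.stack(self.memory) - emb, axis=1)
        knn_mean = float(np.mean(np.sort(dists)[: self.k]))
        self.memory.append(emb)
        # Squashed mean k-NN distance: near 0 for repeats, higher for novelty.
        return knn_mean / (knn_mean + 1.0 + self.eps)
```

The embeddings are assumed to come from a frozen pretrained encoder; the intrinsic reward would be added to the extrinsic reward at each step, scaled by a coefficient.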
Published in International Joint Conference on Neural Networks, IJCNN, 2024
Abstract: Safety in AI-based systems is among the highest research priorities, particularly when such systems are deployed in real-world scenarios subject to uncertainties and unpredictable inputs. Among them, the presence of long-tailed stimuli (Out-of-Distribution data, OoD) has captured much interest in recent times, giving rise to many proposals over the years to detect unfamiliar inputs to the model and adapt its knowledge accordingly.
Published in Mathematics, 2024
Abstract: Despite the increasing availability of vast amounts of data, the challenge of acquiring labeled data persists. This issue is particularly serious in supervised learning scenarios, where labeled data are essential for model training. In addition, the rapid growth in data required by cutting-edge technologies such as deep learning makes the task of labeling large datasets impractical. Active learning methods offer a powerful solution by iteratively selecting the most informative unlabeled instances, thereby reducing the amount of labeled data required. However, active learning faces some limitations with imbalanced datasets, where majority class over-representation can bias sample selection. To address this, combining active learning with data augmentation techniques emerges as a promising strategy. Nonetheless, the best way to combine these techniques is not yet clear.
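As a rough sketch of how the two ingredients above might be combined, the snippet below pairs entropy-based uncertainty sampling with naive jitter augmentation of the minority class; both functions are hypothetical simplifications, not the combination strategy studied in the paper:

```python
import numpy as np

def select_most_informative(probs: np.ndarray, budget: int) -> np.ndarray:
    """Uncertainty sampling: pick the `budget` unlabeled samples whose
    predicted class distribution has the highest entropy."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(entropy)[-budget:][::-1]

def augment_minority(X: np.ndarray, y: np.ndarray, minority: int,
                     n_new: int, noise: float = 0.01):
    """Naive augmentation: add jittered copies of minority-class samples
    to counter the imbalance that can bias active-learning selection."""
    rng = np.random.default_rng(0)
    pool = X[y == minority]
    picks = rng.choice(len(pool), size=n_new)
    X_new = pool[picks] + rng.normal(scale=noise, size=(n_new, X.shape[1]))
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])
```

In an active-learning loop, selection and augmentation would alternate: query labels for the most uncertain instances, then rebalance the labeled set before retraining.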
Published in IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, IEEE ADPRL, 2023
Abstract: Exploration poses a fundamental challenge in Reinforcement Learning (RL) with sparse rewards, limiting an agent’s ability to learn optimal decision-making due to a lack of informative feedback signals. Self-Imitation Learning (self-IL) has emerged as a promising approach for exploration, leveraging a replay buffer to store and reproduce successful behaviors. However, traditional self-IL methods, which rely on high-return transitions and assume singleton environments, face challenges in generalization, especially in procedurally-generated (PCG) environments. Therefore, new self-IL methods have been proposed to rank which experiences to persist, but they replay transitions uniformly regardless of their significance, and do not address the diversity of the stored demonstrations.
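A toy sketch of a self-imitation buffer illustrating the two ingredients discussed above, filtering which experiences to persist and replaying them non-uniformly (proportionally to return) rather than uniformly; class names, the threshold, and the weighting scheme are assumptions for illustration:

```python
import random
from collections import deque

class SelfImitationBuffer:
    """Sketch: persist only above-threshold trajectories and sample them
    with probability proportional to their episode return."""

    def __init__(self, capacity: int = 100):
        self.trajectories: deque = deque(maxlen=capacity)

    def add(self, trajectory, episode_return: float, threshold: float = 0.0) -> None:
        if episode_return > threshold:  # keep only successful behavior
            self.trajectories.append((episode_return, trajectory))

    def sample(self, rng: random.Random):
        """Return-weighted (non-uniform) replay of stored trajectories."""
        items = list(self.trajectories)
        weights = [ret for ret, _ in items]
        _, trajectory = rng.choices(items, weights=weights, k=1)[0]
        return trajectory
```

During training, sampled trajectories would supply imitation targets alongside the standard RL objective; note this toy version does nothing about demonstration diversity, which is precisely one of the gaps the abstract points at.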
Published in IEEE Symposium Series on Computational Intelligence (SSCI), 2023
Abstract: The deployment of Machine Learning models on hardware devices has motivated a notable research activity around different strategies to alleviate their complexity and size. This is the case of neural architecture search or pruning in Deep Learning. This work places its focus on simplifying randomization-based neural networks by discovering fixed-point quantization policies that optimally balance the trade-off between performance and complexity reduction featured by these models.
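A fixed-point quantization policy of the kind being searched over can be sketched as rounding weights to a signed Q(int_bits).(frac_bits) format with saturation; the function below illustrates the representation only, not the paper's policy-search procedure:

```python
import numpy as np

def quantize_fixed_point(w: np.ndarray, int_bits: int, frac_bits: int) -> np.ndarray:
    """Round to the nearest multiple of 2**-frac_bits, then saturate to the
    signed range representable with int_bits integer bits."""
    scale = 2.0 ** frac_bits
    max_val = 2.0 ** int_bits - 1.0 / scale   # largest representable value
    q = np.round(w * scale) / scale           # nearest fixed-point grid point
    return np.clip(q, -2.0 ** int_bits, max_val)
```

A quantization policy then amounts to choosing (int_bits, frac_bits) per layer so that accuracy loss stays small while total bit-width, and hence model complexity, shrinks.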
Published in IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, IEEE ADPRL, 2022
Abstract: Reinforcement Learning has emerged as a strong alternative to solve optimization tasks efficiently. The use of these algorithms highly depends on the feedback signals provided by the environment in charge of informing about how good (or bad) the decisions made by the agent are. Unfortunately, in a broad range of problems the design of a good reward function is not trivial, so in such cases sparse reward signals are instead adopted. The lack of a dense reward function poses new challenges, mostly related to exploration. Imitation Learning has addressed those problems by leveraging demonstrations from experts. In the absence of an expert (and its subsequent demonstrations), an option is to prioritize well-suited exploration experiences collected by the agent in order to bootstrap its learning process with good exploration behaviors. However, this solution highly depends on the ability of the agent to discover such trajectories in the early stages of its learning process.
Published in Neural Computing and Applications (S.I.: Human-aligned Reinforcement Learning for Autonomous Agents and Robots), 2022
Abstract: In the early stages of human life, babies develop their skills by exploring different scenarios motivated by their inherent satisfaction rather than by extrinsic rewards from the environment. This behavior, referred to as intrinsic motivation, has emerged as one solution to address the exploration challenge derived from reinforcement learning environments with sparse rewards.
Published in International Cross Domain Conference for Machine Learning and Knowledge Extraction, CD-MAKE, 2022
Abstract: In the last few years, the research activity around reinforcement learning tasks formulated over environments with sparse rewards has been especially notable. Among the numerous approaches proposed to deal with these hard exploration problems, intrinsic motivation mechanisms are arguably among the most studied alternatives to date.
Published in International Joint Conference on Neural Networks, IJCNN, 2021
Abstract: A critical goal in Reinforcement Learning is the minimization of the time needed for an agent to learn to solve a given environment. In this context, collaborative reinforcement learning refers to the improvement of this learning process through the interaction between agents, which usually yields better results than training each agent in isolation. Most studies in this area have focused on the case with homogeneous agents, namely, agents equally skilled for undertaking their task. By contrast, heterogeneity among agents could arise due to the particular capabilities on how they sense the environment and/or the actions they could perform. Those differences eventually hinder the learning process and information sharing between agents. This issue becomes even more complicated to address over hard exploration scenarios where the extrinsic rewards collected from the environment are sparse.