Publications

On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data

Authors: Aitor Martinez-Seras, Javier Del Ser*, Aitzol Olivarez-Rad, Alain Andres, Pablo Garcia-Bringas

Published in Neural Networks, 2026

Abstract: Robustness is a fundamental aspect for developing safe and trustworthy models, particularly when they are deployed in the open world. In this work we analyze the inherent capability of one-stage object detectors to robustly operate in the presence of out-of-distribution (OoD) data. Specifically, we propose a novel detection algorithm for detecting unknown objects in image data, which leverages the features extracted by the model from each sample.

Enhancing design of experiments through uncertainty estimation and synthetic data generation

Authors: Luis Moles*, Alain Andres, Goretti Echegaray, Fernando Boto

Published in Results in Engineering, 2026

Abstract: Design of Experiments is a key methodology for optimizing machine learning models, but traditional methods often depend on extensive real data collection, which is costly and time-consuming. Moreover, predefined experimental designs may struggle to adapt to complex or high-dimensional input spaces, sometimes leading to inefficient exploration, especially when data are scarce and uncertainty is high. To address these challenges, we propose a methodology that integrates uncertainty estimation with synthetic data generation.

D-CRISP: Explaining Object Detectors by combining Randomized and Segment-based Perturbations

Authors: Alain Andres*, Javier Del Ser

Published in European Conference on Artificial Intelligence, ECAI, 2025

Abstract: Explaining the decisions issued by Machine Learning models for object detection tasks is essential in high-stakes decision making scenarios, such as medical image processing and vehicular perception for autonomous driving. Despite the proliferation of post-hoc perturbation-based methods for generating visual explanations, most eXplainable AI (XAI) approaches rely exclusively on either random image masking or selective segmentation-based occlusion, missing the opportunity to synergistically leverage both strategies in a complementary fashion. In this paper we address this gap by proposing D-CRISP (Detector-Combining Randomized Input and Segment Perturbations), a novel post-hoc explanation method for object detection models. D-CRISP unifies both random and region-based occlusions derived from image segmentation, producing multiscale saliency maps that capture both granular (pixel-level) and semantic (region-level) cues about the objects detected by the model.

Large Language Models for Structured Task Decomposition in Reinforcement Learning Problems with Sparse Rewards

Authors: Unai Ruiz-Gonzalez*, Alain Andres, Javier Del Ser

Published in Machine Learning & Knowledge Extraction, 2025

Abstract: Reinforcement learning (RL) agents face significant challenges in sparse-reward environments, as insufficient exploration of the state space can result in inefficient training or incomplete policy learning. To address this challenge, this work proposes a teacher–student framework for RL that leverages the inherent knowledge of large language models (LLMs) to decompose complex tasks into manageable subgoals.

Towards Surgical Task Automation: Actor-Critic Models Meet Self-Supervised Imitation Learning

Authors: Jingshuai Liu*, Alain Andres, Yonghang Jiang, Yuning Du, Xichun Luo, Wenmiao Shu, Sotirios Tsaftaris

Published in IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, 2025

Abstract: Surgical robot task automation has recently attracted great attention due to its potential to benefit both surgeons and patients. Reinforcement learning (RL) based approaches have demonstrated promising ability to perform automated surgical manipulations on various tasks. To address the exploration challenge, expert demonstrations can be utilized to enhance the learning efficiency via imitation learning (IL) approaches. However, the success of such methods normally relies on both states and action labels. Unfortunately, action labels can be hard to capture, or their manual annotation can be prohibitively expensive owing to the requirement for expert knowledge. Emulating expert behaviour using noisy or inaccurate labels poses significant risks, including unintended surgical errors that may result in patient discomfort or, in more severe cases, tissue damage. It therefore remains an appealing and open problem to leverage expert data composed of pure states in RL.

Comparative Evaluation of RL and MPC for 6DoF AUV Control

Authors: Sümer Tunçay*, Alain Andres, Ignacio Carlucho

Published in Towards Autonomous Robotic Systems, TAROS, 2025

Abstract: Autonomous Underwater Vehicles (AUVs) require precise and robust control strategies for 3D pose regulation in dynamic underwater environments. In this study, we present a comparative evaluation of model-free and model-based control methods for AUV position control. Specifically, we analyze the performance of neural network controllers trained by three Reinforcement Learning (RL) algorithms, namely Proximal Policy Optimization (PPO), Twin Delayed Deep Deterministic Policy Gradient (TD3), and Soft Actor-Critic (SAC), alongside a Model Predictive Control (MPC) baseline. We train our RL methods in a simplified AUV simulator implemented in PyTorch, while our evaluation is done in a realistic marine robotics simulator called Stonefish.

Evaluating Reinforcement Learning-Based Neural Controllers for Quadcopter Navigation in Windy Conditions

Authors: Alain Andres*, Aritz D Martinez, Sümer Tunçay, Ignacio Carlucho

Published in Engineering Applications of Artificial Intelligence, 2025

Abstract: Accurate quadcopter navigation under windy conditions remains challenging for traditional control methods, especially in the presence of unpredictable wind gusts and strict navigational constraints. This paper evaluates Deep Reinforcement Learning (DRL) based controllers under such conditions, analysing the impact of wind domain randomisation, multi-goal training, enhanced state representations with explicit wind information, and the use of temporal data to capture affecting dynamics over time.

Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments

Authors: Alain Andres*, Lukas Schäfer, Stefano V Albrecht, Javier Del Ser

Published in Neurocomputing, 2025

Previously presented at the 2023 Adaptive and Learning Agents Workshop (AAMAS Conference).

On the Black-box Explainability of Object Detection Models for Safe and Trustworthy Industrial Applications

Authors: Alain Andres*, Aitor Martinez-Seras, Ibai Laña, Javier Del Ser

Published in Results in Engineering, 2024

Abstract: In the realm of human-machine interaction, artificial intelligence has become a powerful tool for accelerating data modeling tasks. Object detection methods have achieved outstanding results and are widely used in critical domains like autonomous driving and video surveillance. However, their adoption in high-risk applications, where errors may cause severe consequences, remains limited. Explainable Artificial Intelligence methods aim to address this issue, but many existing techniques are model-specific and designed for classification tasks, making them less effective for object detection and difficult for non-specialists to interpret.

Words as Beacons: Guiding RL Agents with High-Level Language Prompts

Authors: Unai Ruiz-Gonzalez*, Alain Andres, Pedro G Bascoy, Javier Del Ser

Published in Open-World Agents workshop at NeurIPS, 2024

Abstract: Sparse reward environments in reinforcement learning (RL) pose significant challenges for exploration, often leading to inefficient or incomplete learning processes. To tackle this issue, this work proposes a teacher-student RL framework that leverages Large Language Models (LLMs) as “teachers” to guide the agent’s learning process by decomposing complex tasks into subgoals.

Fostering Intrinsic Motivation in Reinforcement Learning with Pretrained Foundation Models

Authors: Alain Andres*, Javier Del Ser

Published in Intrinsically Motivated Open-ended Learning workshop at NeurIPS, 2024

Abstract: Exploration remains a significant challenge in reinforcement learning, especially in environments where extrinsic rewards are sparse or non-existent. The recent rise of foundation models, such as CLIP, offers an opportunity to leverage pretrained, semantically rich embeddings that encapsulate broad and reusable knowledge. In this work we explore the potential of these foundation models not just to drive exploration, but also to analyze the critical role of the episodic novelty term in enhancing exploration effectiveness of the agent. We also investigate whether providing the intrinsic module with complete state information – rather than just partial observations – can improve exploration, despite the difficulties in handling small variations within large state spaces.

Single Agent Formulation for Reinforcement Learning Based Routing of Urban Last Mile Logistics with Platooning Vehicles

Authors: Nagore Bravo, Imanol Echeverria, Alain Andres, Ibai Laña*

Published in IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), 2024

Abstract: Last mile logistics are in the midst of a deep transformation thanks to the advent of autonomous vehicles with platooning capabilities that can take the place of typical delivery methods. Platooning brings new constraints and multiple objectives to vehicle routing problems, which are addressed in this paper with a Reinforcement Learning approach. In contrast to traditional metaheuristic optimization algorithms, Reinforcement Learning provides flexibility in the face of a changing environment, shifting the challenge to the way in which the problem is formulated. While there have been successful attempts to implement RL solutions to vehicle routing problems, including some sort of optional platooning, our main contribution lies in the application to platooning vehicle routing problems for last mile delivery, considering all their particularities and proposing a formulation framework for this kind of problem.

Advancing towards Safe Reinforcement Learning over Sparse Environments with Out-of-Distribution Observations: Detection and Adaptation Strategies

Authors: Aitor Martinez-Seras*, Alain Andres, Javier Del Ser

Published in International Joint Conference on Neural Networks, IJCNN, 2024

Abstract: Safety in AI-based systems is among the highest research priorities, particularly when such systems are deployed in real-world scenarios subject to uncertainties and unpredictable inputs. Among them, the presence of long-tailed stimuli (Out-of-Distribution data, OoD) has captured much interest in recent times, giving rise to many proposals over the years to detect unfamiliar inputs to the model and adapt its knowledge accordingly.

Exploring Data Augmentation and Active Learning Benefits in Imbalanced Datasets

Authors: Luis Moles*, Alain Andres, Goretti Echegaray, Fernando Boto

Published in Mathematics, 2024

Abstract: Despite the increasing availability of vast amounts of data, the challenge of acquiring labeled data persists. This issue is particularly serious in supervised learning scenarios, where labeled data are essential for model training. In addition, the rapid growth in data required by cutting-edge technologies such as deep learning makes the task of labeling large datasets impractical. Active learning methods offer a powerful solution by iteratively selecting the most informative unlabeled instances, thereby reducing the amount of labeled data required. However, active learning faces some limitations with imbalanced datasets, where majority class over-representation can bias sample selection. To address this, combining active learning with data augmentation techniques emerges as a promising strategy. Nonetheless, the best way to combine these techniques is not yet clear.

Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards

Authors: Alain Andres*, Daochen Zha, Javier Del Ser

Published in IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, IEEE ADPRL, 2023

Abstract: Exploration poses a fundamental challenge in Reinforcement Learning (RL) with sparse rewards, limiting an agent’s ability to learn optimal decision-making due to a lack of informative feedback signals. Self-Imitation Learning (self-IL) has emerged as a promising approach for exploration, leveraging a replay buffer to store and reproduce successful behaviors. However, traditional self-IL methods, which rely on high-return transitions and assume singleton environments, face challenges in generalization, especially in procedurally-generated (PCG) environments. Therefore, new self-IL methods have been proposed to rank which experiences to persist, but they replay transitions uniformly regardless of their significance, and do not address the diversity of the stored demonstrations.

Evolutionary Multi-Objective Quantization of Randomization-Based Neural Networks

Authors: Javier Del Ser*, Alain Andres, Miren Nekane Bilbao, Ibai Laña, Jesus L Lobo

Published in IEEE Symposium Series on Computational Intelligence (SSCI), 2023

Abstract: The deployment of Machine Learning models on hardware devices has motivated a notable research activity around different strategies to alleviate their complexity and size. This is the case of neural architecture search or pruning in Deep Learning. This work places its focus on simplifying randomization-based neural networks by discovering fixed-point quantization policies that optimally balance the trade-off between performance and complexity reduction featured by these models.

Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation

Authors: Alain Andres*, Esther Villar-Rodriguez, Javier Del Ser

Published in IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, IEEE ADPRL, 2022

Abstract: Reinforcement Learning has emerged as a strong alternative to solve optimization tasks efficiently. The use of these algorithms highly depends on the feedback signals provided by the environment in charge of informing about how good (or bad) the decisions made by the learned agent are. Unfortunately, in a broad range of problems the design of a good reward function is not trivial, so in such cases sparse reward signals are instead adopted. The lack of a dense reward function poses new challenges, mostly related to exploration. Imitation Learning has addressed those problems by leveraging demonstrations from experts. In the absence of an expert (and its subsequent demonstrations), an option is to prioritize well-suited exploration experiences collected by the agent in order to bootstrap its learning process with good exploration behaviors. However, this solution highly depends on the ability of the agent to discover such trajectories in the early stages of its learning process.

Collaborative training of heterogeneous reinforcement learning agents in environments with sparse rewards: what and when to share?

Authors: Alain Andres*, Esther Villar-Rodriguez, Javier Del Ser

Published in Neural Computing and Applications (S.I.: Human-aligned Reinforcement Learning for Autonomous Agents and Robots), 2022

Abstract: In the early stages of human life, babies develop their skills by exploring different scenarios motivated by their inherent satisfaction rather than by extrinsic rewards from the environment. This behavior, referred to as intrinsic motivation, has emerged as one solution to address the exploration challenge derived from reinforcement learning environments with sparse rewards.

An Evaluation Study of Intrinsic Motivation Techniques Applied to Reinforcement Learning over Hard Exploration Environments

Authors: Alain Andres*, Esther Villar-Rodriguez, Javier Del Ser

Published in International Cross Domain Conference for Machine Learning and Knowledge Extraction, CD-MAKE, 2022

Abstract: In the last few years, the research activity around reinforcement learning tasks formulated over environments with sparse rewards has been especially notable. Among the numerous approaches proposed to deal with these hard exploration problems, intrinsic motivation mechanisms are arguably among the most studied alternatives to date.

Collaborative exploration and reinforcement learning between heterogeneously skilled agents in environments with sparse rewards

Authors: Alain Andres*, Esther Villar-Rodriguez, Javier Del Ser

Published in International Joint Conference on Neural Networks, IJCNN, 2021

Abstract: A critical goal in Reinforcement Learning is the minimization of the time needed for an agent to learn to solve a given environment. In this context, collaborative reinforcement learning refers to the improvement of this learning process through the interaction between agents, which usually yields better results than training each agent in isolation. Most studies in this area have focused on the case with homogeneous agents, namely, agents equally skilled for undertaking their task. By contrast, heterogeneity among agents could arise due to the particular capabilities on how they sense the environment and/or the actions they could perform. Those differences eventually hinder the learning process and information sharing between agents. This issue becomes even more complicated to address over hard exploration scenarios where the extrinsic rewards collected from the environment are sparse.

Alain Andres Fernandez
