Towards Surgical Task Automation: Actor-Critic Models Meet Self-Supervised Imitation Learning
Published in IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, 2025
The full paper can be found here
Abstract: Surgical robot task automation has recently attracted great attention due to its potential to benefit both surgeons and patients. Reinforcement learning (RL) based approaches have demonstrated a promising ability to perform automated surgical manipulations across various tasks. To address the exploration challenge, expert demonstrations can be used to improve learning efficiency via imitation learning (IL) approaches. However, the success of such methods typically relies on both states and action labels. Unfortunately, action labels can be hard to capture, or their manual annotation is prohibitively expensive owing to the expert knowledge required. Emulating expert behaviour using noisy or inaccurate labels poses significant risks, including unintended surgical errors that may result in patient discomfort or, in more severe cases, tissue damage. It therefore remains an appealing and open problem to incorporate expert data consisting of pure states into RL.
In this work, we present an actor-critic RL framework, termed AC-SSIL, to overcome the challenge of improving the learning process with state-only demonstrations collected by an unknown expert policy. It adopts a self-supervised IL method, dubbed SSIL, to effectively incorporate expert states into RL paradigms by retrieving from the demonstrations the nearest neighbours of the query state and exploiting the bootstrapping of actor networks. SSIL applies similarity-based regularization and improves its prediction capacity jointly with the actor network. Through experiments on an open-source surgical simulation platform, we show that our method delivers remarkable improvements over the RL baseline and performs comparably to action-based IL methods, which implies the efficacy and potential of our method for expert demonstration-guided learning scenarios. Code will be made publicly available at https://github.com/Jingshuai-cqu/AC-SSIL.
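The abstract describes SSIL as retrieving, for each query state, its nearest neighbours among the expert demonstration states and using the actor's own (bootstrapped) predictions on those neighbours as a similarity-based regularization target. The paper's implementation details are not given here, so the following is a minimal NumPy sketch of that idea only; the function names, the L2 distance metric, and the mean-squared regularizer are our assumptions, not the paper's exact formulation.

```python
import numpy as np

def nearest_expert_states(query_states, expert_states, k=1):
    """For each query state, return the indices of the k nearest expert
    states under L2 distance (the metric is an assumption here)."""
    # Pairwise squared distances, shape (num_queries, num_expert_states)
    d = ((query_states[:, None, :] - expert_states[None, :, :]) ** 2).sum(-1)
    return np.argsort(d, axis=1)[:, :k]

def ssil_regularizer(actor, query_states, expert_states, k=1):
    """Similarity-based imitation term: pull the actor's actions on the
    query states towards its own (bootstrapped) actions on the retrieved
    neighbour expert states. `actor` maps a batch of states to actions."""
    idx = nearest_expert_states(query_states, expert_states, k)
    neighbours = expert_states[idx]                     # (B, k, state_dim)
    flat = neighbours.reshape(-1, neighbours.shape[-1])
    targets = actor(flat).reshape(len(query_states), k, -1).mean(axis=1)
    preds = actor(query_states)
    return ((preds - targets) ** 2).mean()              # scalar penalty
```

In a full AC-SSIL training loop this scalar would be added, suitably weighted, to the actor's RL objective; here it only illustrates how state-only demonstrations can shape the policy without any action labels.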