Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning

2024-08-28 | Minjong Yoo, Sangwoo Cho, Honguk Woo*
This paper introduces a skill-regularized task decomposition method for multi-task offline reinforcement learning (RL), aiming to improve performance on heterogeneous datasets with varying task qualities. The proposed approach jointly learns skills and tasks in a shared latent space, using quality-aware regularization to decompose tasks into achievable subtasks aligned with high-quality skills.

A Wasserstein auto-encoder (WAE) is used to represent both skills and tasks in the same latent space, while a quality-weighted loss term ensures tasks are decomposed into subtasks consistent with high-quality skills. To further enhance performance, the method augments datasets with imaginary trajectories generated from high-quality skills.

Experiments on robotic manipulation and drone navigation tasks show that the proposed method outperforms state-of-the-art algorithms and is robust to mixed configurations of different-quality datasets. The key contributions include a novel multi-task offline RL model that enables task decomposition through quality-aware joint learning, a data augmentation scheme specific to multi-task RL on limited offline datasets, and an evaluation under multi-task robot and drone scenarios, highlighting the method's effectiveness in heterogeneous data conditions. The approach addresses the challenge of learning policies in offline settings with varying data quality by leveraging shared skills and generating plausible trajectories to improve learning efficiency.
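To make the quality-weighted objective concrete, below is a minimal, hypothetical sketch of the core idea: trajectory segments are encoded into a latent skill space, and the reconstruction term is weighted by a per-segment quality score (e.g. a normalized return), while an MMD term regularizes the latent codes toward a Gaussian prior, as in WAE training. This is an illustration under assumed names (`quality_weighted_wae_loss`, `rbf_mmd`, linear encoder/decoder weights), not the authors' implementation, which uses learned neural encoders and a shared skill-task latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_mmd(x, y, sigma=1.0):
    """Biased MMD^2 estimate with an RBF kernel between sample sets x and y."""
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def quality_weighted_wae_loss(segments, qualities, W_enc, W_dec, lam=1.0):
    """Quality-weighted reconstruction plus an MMD penalty toward a Gaussian prior.

    segments:  (N, D) flattened state-action sub-trajectories
    qualities: (N,)   per-segment quality scores in [0, 1]
    """
    z = segments @ W_enc                    # encode segments to latent skills
    recon = z @ W_dec                       # decode back to segment space
    per_seg = ((recon - segments) ** 2).mean(axis=1)
    # High-quality segments dominate the reconstruction objective,
    # so the latent space is shaped around achievable, high-quality skills.
    recon_loss = (qualities * per_seg).sum() / qualities.sum()
    prior = rng.standard_normal(z.shape)    # samples from the latent prior
    return recon_loss + lam * rbf_mmd(z, prior)

# Toy data: 32 segments of dimension 8, latent skill dimension 3.
X = rng.standard_normal((32, 8))
q = rng.uniform(0.1, 1.0, size=32)
W_enc = rng.standard_normal((8, 3)) * 0.1
W_dec = rng.standard_normal((3, 8)) * 0.1
loss = quality_weighted_wae_loss(X, q, W_enc, W_dec)
```

In a full training loop, the encoder/decoder would be neural networks optimized by gradient descent, and the same latent space would also embed task representations, which is what allows tasks to be decomposed into subtasks aligned with the learned skills.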