7 Mar 2024 | Mohammad Reza Samsami, Artem Zholus, Janarthanan Rajendran, Sarath Chandar
The paper introduces Recall to Imagine (R2I), a method that integrates state space models (SSMs) into the world models of model-based reinforcement learning (MBRL) agents. This integration enhances both long-term memory and long-horizon credit assignment, addressing the difficulty current MBRL agents have with long-term dependencies in complex tasks. R2I also improves temporal coherence and computational efficiency, making it suitable for a wide range of tasks, from memory-intensive benchmarks to credit assignment challenges.
Key contributions of the paper include:
1. **Introduction of R2I**: A new MBRL approach that combines DreamerV3 with a modified S4 model to handle temporal dependencies.
2. **State Space Models (SSMs)**: SSMs capture long-range dependencies in trajectories, improving the agent's ability to remember and recall past observations (a minimal sketch of the SSM recurrence follows this list).
3. **Performance Evaluation**: R2I achieves state-of-the-art performance on challenging memory and credit assignment tasks from BSuite and POPGym, and surpasses human performance in the complex Memory Maze domain.
4. **Computational Efficiency**: R2I is faster than the state-of-the-art MBRL method, DreamerV3, achieving up to 9 times faster wall-time convergence.
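For concreteness, here is a minimal sketch of the discretized linear SSM recurrence that S4-style models compute, h_t = Ā h_{t-1} + B̄ x_t with y_t = Re(C · h_t). This is an illustrative toy, not the paper's implementation: the function name `ssm_scan`, the diagonal complex parameterization, and the toy dimensions are all assumptions made for readability.

```python
import jax
import jax.numpy as jnp

def ssm_scan(A_bar, B_bar, C, x_seq, h0):
    """Sequentially apply a discretized diagonal SSM:
    h_t = A_bar * h_{t-1} + B_bar * x_t,  y_t = Re(C . h_t)."""
    def step(h, x):
        h = A_bar * h + B_bar * x          # linear state update (elementwise; A_bar is diagonal)
        y = jnp.real(jnp.dot(C, h))        # real-valued readout of the hidden state
        return h, y
    _, y_seq = jax.lax.scan(step, h0, x_seq)
    return y_seq

# Toy usage: 16 steps of a scalar input through an 8-dimensional state.
N, T = 8, 16
A_bar = jnp.exp(-0.1 + 1j * jnp.linspace(0.0, 3.0, N))   # stable complex-diagonal dynamics
B_bar = jnp.ones(N, dtype=jnp.complex64)
C = jnp.linspace(0.5, 1.0, N).astype(jnp.complex64)
x_seq = jnp.sin(jnp.arange(T) / 3.0)
y_seq = ssm_scan(A_bar, B_bar, C, x_seq, jnp.zeros(N, dtype=jnp.complex64))
print(y_seq.shape)   # (16,)
```

The sequential `lax.scan` form above is the easiest to read; during training, S4-family models instead evaluate the same linear recurrence in parallel across the sequence (as a convolution or associative scan), which is what underlies the wall-time speedups reported above.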
The paper also provides background on SSMs, details their integration into the world model, and describes R2I's training and actor-critic setup. Experimental results show that R2I not only excels in memory-intensive tasks but also maintains comparable performance on classic RL benchmarks such as Atari and DMC, demonstrating its generality.