Decision Transformer: Reinforcement Learning via Sequence Modeling

24 Jun 2021 | Lili Chen*, Kevin Lu*, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch
The paper introduces Decision Transformer, a framework that casts reinforcement learning (RL) as a conditional sequence modeling problem, leveraging the simplicity and scalability of the Transformer architecture. Unlike traditional RL methods that fit value functions or compute policy gradients, Decision Transformer produces actions by conditioning a causally masked autoregressive Transformer on the desired return (return-to-go), past states, and past actions. Conditioning on the desired return lets the model generate future actions that achieve it, even in sparse-reward settings. The authors evaluate Decision Transformer on offline RL benchmarks, including Atari, OpenAI Gym, and Key-to-Door tasks, showing that it matches or exceeds state-of-the-art model-free offline RL baselines. The paper also examines the benefits of longer context lengths, the model's capacity for long-term credit assignment, and the ability of Transformers to act as accurate critics in sparse-reward settings. Overall, Decision Transformer demonstrates the potential of sequence modeling for RL, bridging language modeling and reinforcement learning.
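To make the conditioning scheme concrete, below is a minimal sketch of the idea in PyTorch. The class name, layer sizes, and hyperparameters are illustrative assumptions, not the authors' exact implementation: each timestep contributes three tokens (return-to-go, state, action), a causal Transformer attends over the interleaved sequence, and the next action is predicted from the state token.

```python
import torch
import torch.nn as nn

class DecisionTransformer(nn.Module):
    """Minimal sketch (not the paper's exact code): model trajectories as
    sequences of (return-to-go, state, action) tokens and predict actions
    with a causally masked Transformer."""

    def __init__(self, state_dim, act_dim, embed_dim=128,
                 n_layers=3, n_heads=1, max_len=20):
        super().__init__()
        # Each modality gets its own linear embedding into a shared token space.
        self.embed_return = nn.Linear(1, embed_dim)       # scalar return-to-go
        self.embed_state = nn.Linear(state_dim, embed_dim)
        self.embed_action = nn.Linear(act_dim, embed_dim)
        self.embed_timestep = nn.Embedding(max_len, embed_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.predict_action = nn.Linear(embed_dim, act_dim)

    def forward(self, returns_to_go, states, actions, timesteps):
        # returns_to_go: (B, T, 1), states: (B, T, state_dim),
        # actions: (B, T, act_dim), timesteps: (B, T) integer indices.
        B, T = states.shape[0], states.shape[1]
        t_emb = self.embed_timestep(timesteps)
        r = self.embed_return(returns_to_go) + t_emb
        s = self.embed_state(states) + t_emb
        a = self.embed_action(actions) + t_emb
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...).
        tokens = torch.stack([r, s, a], dim=2).reshape(B, 3 * T, -1)
        # Causal mask: each token may attend only to itself and earlier tokens.
        mask = torch.triu(
            torch.ones(3 * T, 3 * T, device=states.device), diagonal=1).bool()
        h = self.transformer(tokens, mask=mask)
        # Predict each action from the representation of its state token.
        h = h.reshape(B, T, 3, -1)
        return self.predict_action(h[:, :, 1])  # (B, T, act_dim)

# Example usage with arbitrary shapes (all inputs are dummy data):
model = DecisionTransformer(state_dim=11, act_dim=3)
B, T = 4, 20
preds = model(
    torch.randn(B, T, 1),          # returns-to-go
    torch.randn(B, T, 11),         # states
    torch.randn(B, T, 3),          # previous actions
    torch.arange(T).repeat(B, 1),  # timesteps
)
```

In this setup, training would minimize a regression loss (e.g., mean-squared error for continuous actions, cross-entropy for discrete ones) between predicted and logged actions; at evaluation time, per the paper, the model is conditioned on a target return that is decremented by the observed reward after each environment step.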