April 10, 2024 | Jeff Guo* and Philippe Schwaller*
This paper introduces Augmented Memory, a reinforcement learning algorithm for sample-efficient generative molecular design. The algorithm combines experience replay with SMILES augmentation to improve sample efficiency while still enabling diverse sampling. It significantly outperforms existing algorithms in sample efficiency, particularly in tasks that require both exploration and exploitation. The algorithm also includes a component called Selective Memory Purge, which helps prevent mode collapse by removing replay-buffer entries that correspond to scaffolds that should be discouraged.
The method is evaluated on the PMO benchmark, where it achieves new state-of-the-art performance, outperforming previous methods on 14 out of 23 tasks. It is also applied to a drug discovery case study involving the dopamine type 2 receptor (DRD2) and a materials design case study optimizing for quantum-mechanical properties. The results show that Augmented Memory can generate molecules optimized for target property profiles with minimal calls to expensive oracles. The algorithm is general and applies to a wide range of molecular design tasks, including drug discovery and materials design. The method is implemented in Python and is available on GitHub.
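To make the replay-buffer ideas in the summary concrete, here is a minimal Python sketch of a buffer that supports augmentation and a selective purge. It is an illustration under stated assumptions, not the authors' implementation: the class `ReplayBuffer`, its methods, and the `augment` and `scaffold_of` callables are hypothetical names introduced here. In practice a cheminformatics toolkit would supply real SMILES randomization and scaffold extraction; the demo uses toy stand-ins so the block runs with the standard library alone.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Entry = Tuple[str, float]  # (SMILES string, oracle reward)


@dataclass
class ReplayBuffer:
    """Hypothetical buffer that keeps the highest-reward molecules seen so far."""
    capacity: int = 100
    entries: List[Entry] = field(default_factory=list)

    def add(self, smiles: str, reward: float) -> None:
        # Skip exact duplicates, then keep only the top `capacity` entries by reward.
        if any(s == smiles for s, _ in self.entries):
            return
        self.entries.append((smiles, reward))
        self.entries.sort(key=lambda e: e[1], reverse=True)
        del self.entries[self.capacity:]

    def purge(self, scaffold_of: Callable[[str], str], discouraged: Set[str]) -> int:
        """Selective purge: drop entries whose scaffold is in `discouraged`."""
        before = len(self.entries)
        self.entries = [
            (s, r) for s, r in self.entries if scaffold_of(s) not in discouraged
        ]
        return before - len(self.entries)

    def augmented(self, augment: Callable[[str], str]) -> List[Entry]:
        """Rewrite each SMILES with `augment`, reusing the stored reward.

        No new oracle call is made: the reward belongs to the molecule,
        not to the particular string that encodes it.
        """
        return [(augment(s), r) for s, r in self.entries]


# Toy stand-ins (assumptions for the demo only).
def toy_scaffold(smiles: str) -> str:
    # Crude label: treat any SMILES with a ring-closure digit as "ring".
    return "ring" if any(ch.isdigit() for ch in smiles) else "chain"


def toy_augment(smiles: str) -> str:
    # Reversing an unbranched, ring-free SMILES gives another valid string
    # for the same molecule; real augmentation would randomize atom order.
    return smiles[::-1]


buf = ReplayBuffer(capacity=2)
buf.add("CCO", 0.5)
buf.add("c1ccccc1", 0.9)
buf.add("CCN", 0.7)                       # evicts the lowest-reward entry, "CCO"
removed = buf.purge(toy_scaffold, {"ring"})  # discourage the "ring" scaffold
replay_batch = buf.augmented(toy_augment)
```

The sketch stops at producing `replay_batch`; in a full reinforcement learning loop, a batch like this would be fed back into the agent's update, which is how stored oracle scores can be reused without further oracle calls.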