25 Feb 2019 | Jiaxuan You, Bowen Liu, Rex Ying, Vijay Pande, Jure Leskovec
Graph Convolutional Policy Network (GCPN) is a model for goal-directed molecular graph generation using reinforcement learning. The model is trained via policy gradient to optimize domain-specific rewards and an adversarial loss, and acts in an environment that incorporates domain-specific rules.

GCPN is designed as a reinforcement learning agent that operates within a chemistry-aware graph-generation environment. A molecule is constructed step by step, either by connecting a new substructure or atom to the existing molecular graph or by adding a bond between existing atoms. GCPN predicts the bond-addition action and is trained via policy gradient to optimize a reward composed of molecular property objectives and an adversarial loss. The adversarial loss is provided by a discriminator based on a graph convolutional network, trained jointly on a dataset of example molecules. This approach allows direct optimization of application-specific objectives while ensuring that the generated molecules are realistic and satisfy chemical rules.

GCPN is evaluated on three distinct molecule generation tasks: property optimization, property targeting, and constrained property optimization. It achieves state-of-the-art results on all three: it generates molecules with property scores 61% higher than the best baseline method while resembling known molecules, and outperforms the baseline models on the constrained property optimization task by 184% on average.
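The training idea described above can be illustrated with a minimal policy-gradient (REINFORCE) sketch. This is a toy stand-in, not the paper's implementation: the real GCPN conditions a graph convolutional policy on the partially built molecular graph and learns its discriminator adversarially, whereas here the policy is a state-independent softmax over a small action set, the "property" reward is a hypothetical scoring function, and the discriminator is a constant stub.

```python
import numpy as np

# Toy REINFORCE sketch of goal-directed graph generation.
# The policy repeatedly picks a discrete "graph-building" action;
# the episode reward combines a (toy) property score with a (stub)
# discriminator score, mirroring GCPN's reward structure.

rng = np.random.default_rng(0)
N_ACTIONS = 4    # e.g. which atom/bond type to add (toy action space)
MAX_STEPS = 6    # graph-building steps per episode

theta = np.zeros(N_ACTIONS)  # logits of a state-independent toy policy

def softmax(z):
    z = z - z.max()          # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def property_reward(actions):
    # Hypothetical "chemical property": fraction of steps choosing action 2.
    return sum(1.0 for a in actions if a == 2) / len(actions)

def adversarial_reward(actions):
    # Stand-in for the GCN discriminator's realism score (constant here;
    # in GCPN this is a jointly trained discriminator network).
    return 0.5

def run_episode(theta):
    probs = softmax(theta)
    actions = [rng.choice(N_ACTIONS, p=probs) for _ in range(MAX_STEPS)]
    reward = property_reward(actions) + 0.1 * adversarial_reward(actions)
    return actions, reward

lr = 0.2
baseline = 0.0               # running-average baseline to reduce variance
for _ in range(2000):
    actions, R = run_episode(theta)
    probs = softmax(theta)
    advantage = R - baseline
    baseline = 0.95 * baseline + 0.05 * R
    # REINFORCE: sum over steps of grad log pi(a_t) * advantage,
    # where grad log softmax(theta)[a] = onehot(a) - probs.
    grad = np.zeros(N_ACTIONS)
    for a in actions:
        grad += (np.eye(N_ACTIONS)[a] - probs) * advantage
    theta += lr * grad

print(softmax(theta))  # the policy should now strongly prefer action 2
```

Because episodes that sample action 2 more often earn higher reward, the gradient consistently increases that action's logit, and the policy concentrates on it; the constant adversarial term is absorbed by the baseline. In GCPN the same mechanism steers the graph-building policy toward molecules with high property scores that the discriminator also judges realistic.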