29 Mar 2019 | Wengong Jin, Regina Barzilay, Tommi Jaakkola
The paper introduces a Junction Tree Variational Autoencoder (JT-VAE) for generating molecular graphs, aiming to automate the design of molecules based on specific chemical properties. The primary contribution is a novel generative model that directly generates molecular graphs, rather than linear SMILES strings, which are commonly used in previous approaches. JT-VAE generates molecular graphs in two phases: first, it generates a tree-structured scaffold using valid chemical substructures, and then combines these substructures into a complete molecule using a graph message passing network. This approach ensures that the generated molecules are chemically valid at every step.
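To make the idea of a tree-structured scaffold of substructures more concrete, here is a minimal sketch on a toy molecule. It is not the paper's decomposition procedure. It assumes the rings are already known and supplied as input, treats every non-ring bond as its own substructure, and links substructures that share atoms using a maximum-weight spanning tree. The function name `decompose` and the integer atom indices are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def decompose(bonds, rings):
    """Split a toy molecule into clusters and link them into a tree.

    bonds: list of (atom, atom) index pairs.
    rings: list of atom-index collections, assumed to be given.
    """
    ring_sets = [frozenset(r) for r in rings]

    # Clusters are the rings plus every bond that lies outside all rings.
    clusters = list(ring_sets)
    for a, b in bonds:
        if not any(a in r and b in r for r in ring_sets):
            clusters.append(frozenset((a, b)))

    # Candidate tree edges join clusters that share at least one atom,
    # weighted by how many atoms they share.
    candidates = []
    for i, j in combinations(range(len(clusters)), 2):
        shared = len(clusters[i] & clusters[j])
        if shared:
            candidates.append((shared, i, j))

    # Kruskal's algorithm with union-find, heaviest edges first,
    # keeps a spanning tree over the clusters.
    parent = list(range(len(clusters)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree_edges = []
    for _, i, j in sorted(candidates, reverse=True):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree_edges.append((i, j))
    return clusters, tree_edges


# Ethylbenzene-like skeleton: ring atoms 0-5, side chain 0-6-7.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7)]
rings = [[0, 1, 2, 3, 4, 5]]

clusters, tree_edges = decompose(bonds, rings)
for index, cluster in enumerate(clusters):
    print(index, sorted(cluster))
print("tree edges:", tree_edges)
```

In this toy case the ring and the two side-chain bonds become three tree nodes joined by two edges. A generative model in the spirit of JT-VAE would first produce such a tree and then decide how the neighboring substructures attach to each other at the atom level.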
The model is evaluated on molecular generation and optimization tasks, where it outperforms state-of-the-art baselines in validity and property discovery. The paper also discusses the weaknesses of previous methods, namely the limitations of SMILES representations and the difficulty of keeping intermediate structures chemically valid when a graph is generated atom by atom. JT-VAE addresses these issues by using valid substructures as building blocks, which maintains chemical feasibility throughout the generation process.
The paper includes a detailed description of the JT-VAE architecture, including the tree and graph encoders and decoders, and provides experimental results demonstrating the effectiveness of the model. The results show that JT-VAE produces 100% valid molecules when sampled from the prior distribution and excels in discovering molecules with desired properties, outperforming baselines by a significant margin.