7 Jun 2024 | Shitong Luo*¹, Wenhao Gao*², Zuofan Wu*¹, Jian Peng¹, Connor W. Coley², Jianzhu Ma³
This paper introduces a novel framework for generating new chemical structures while ensuring their synthetic accessibility. The framework uses a postfix notation of synthetic pathways to represent molecules in chemical space and a transformer-based model to translate molecular graphs into these notations. The key contributions include:
1. **Linear Representation for Synthesis Pathways**: A scalable linear representation for synthetic pathways based on postfix notations, which allows for uniform and generalizable representation of all experimental procedures.
2. **Robust Algorithm for Synthesizable Molecular Design**: An algorithm that projects unsynthesizable molecules from generative models to synthesizable chemical space, preserving desired properties such as binding and bioactivity.
3. **Empirical Performance**: Strong empirical performance in bottom-up synthesis planning and analog design, outperforming prior methods.
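To make the postfix idea concrete, here is a minimal sketch of how a linear pathway could be evaluated with a stack: building-block tokens are pushed, and each reaction token pops its reactants and pushes the product. This is an illustration, not the paper's implementation. The names `Block`, `Reaction`, and `evaluate_postfix` are hypothetical, and the "reactions" are toy string functions standing in for real reaction templates.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass(frozen=True)
class Block:
    """A purchasable building block (hypothetical stand-in: just a name)."""
    name: str


@dataclass(frozen=True)
class Reaction:
    """A reaction rule with a fixed number of reactants (toy stand-in)."""
    name: str
    arity: int
    apply: Callable[..., str]


Token = Union[Block, Reaction]


def evaluate_postfix(tokens: List[Token]) -> str:
    """Evaluate a postfix pathway and return the single final product."""
    stack: List[str] = []
    for tok in tokens:
        if isinstance(tok, Block):
            # Building blocks are operands: push them onto the stack.
            stack.append(tok.name)
        else:
            if tok.arity < 1 or len(stack) < tok.arity:
                raise ValueError(f"not enough reactants for {tok.name}")
            # Pop the reactants, restoring their original left-to-right order.
            reactants = [stack.pop() for _ in range(tok.arity)][::-1]
            stack.append(tok.apply(*reactants))
    if len(stack) != 1:
        raise ValueError("pathway must reduce to exactly one product")
    return stack[0]


# Toy reaction rules (assumptions for illustration only).
couple = Reaction("couple", 2, lambda a, b: f"({a}+{b})")
deprotect = Reaction("deprotect", 1, lambda a: f"deprot({a})")

# Postfix pathway: A B couple deprotect C couple
pathway = [Block("A"), Block("B"), couple, deprotect, Block("C"), couple]
product = evaluate_postfix(pathway)
```

Because every token is either an operand or an operator with known arity, a sequence model can emit the pathway one token at a time, and a malformed sequence is detected by the stack checks rather than by parsing a tree.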
The paper addresses the challenges in drug discovery, particularly the vast chemical space and the need for efficient exploration. It highlights the limitations of existing generative models, which often propose synthetically infeasible molecules. The proposed framework ensures that the generated molecules are derivable from purchasable chemical building blocks and expert-defined chemical reaction rules, guaranteeing their synthesizability.
The method is validated through various experiments, including bottom-up synthesis planning, projecting molecules generated by structure-based drug design models, goal-directed generative models, and hit expansion. The results demonstrate the model's ability to find valid synthetic pathways, generate synthesizable analogs, and explore local chemical spaces effectively.