Recent Advances in Automated Structure-Based De Novo Drug Design

2024 | Yidan Tang, Rocco Moretti, and Jens Meiler
Yidan Tang, Rocco Moretti, and Jens Meiler review recent advances in structure-based de novo drug design, covering conventional fragment-based methods, evolutionary algorithms, Metropolis Monte Carlo methods, and deep generative models. De novo drug design aims to generate novel molecules with desired pharmacological properties from scratch, exploring a wider chemical space more efficiently than structure-based computer-aided drug design (SB-CADD). The review also highlights efforts to address synthetic accessibility (SA) and the benchmarking strategies used to validate the proposed frameworks.

De novo design workflows typically involve candidate sampling and property evaluation, often applied iteratively. Ligand properties are evaluated with scoring functions, including physics-based force fields, empirical potentials, and knowledge-based scoring functions. Machine learning (ML) scoring functions have also emerged, though their performance depends on their training sets.

SA is a significant challenge because it determines whether proposed molecules can feasibly be synthesized. The review discusses how recent approaches have attempted to address SA concerns.

The review also examines how recent methods have been benchmarked and validated. While experimental validation is more convincing than in silico evaluation, few protocols have validated their designs in vitro or in vivo. In silico validation often relies on docking studies and molecular dynamics (MD) simulations. Because methods differ in their choice of evaluation metrics and protein targets, the review outlines these for each method so that similar protocols can be compared.

Fragment-based drug discovery begins with screening diverse fragments, often through computational virtual screening. Common strategies for building ligands from fragments include growing, linking, and merging. Growing is the most common strategy: it starts with a single core in the pocket and extends the ligand into the rest of the pocket.
Linking involves finding a suitable linker that joins fragments while maintaining their original binding modes. Merging combines fragments that occupy different but overlapping parts of the pocket. Reaction-rule-based methods are a general trend, especially with the increasing number of reaction databases. Make-on-demand libraries also present promising prospects for fragment-based methods: Enamine has generated a combinatorial library containing 36 billion readily accessible molecules, significantly mitigating SA concerns.

Evolutionary algorithms (EAs) are powerful approaches for solving search and optimization problems with multiple, conflicting objectives. They mimic Darwinian evolution, selecting the fittest molecules generation by generation. The genetic algorithm (GA) is the most commonly used type of EA; other types include genetic programming, evolution strategies, and evolutionary programming. EAs use operators such as mutation, crossover, and selection to generate new molecules. Fitness assessment in SB-CADD EAs requires fast evaluation and often uses docking, similarity, and diversity scores. SA metrics are sometimes included alongside docking fitness scores. Retrosynthesis analysis is a resource-intensive way to evaluate SA and is often reserved for post-generation inspection.

Monte Carlo methods are computational algorithms that solve problems through iterative random sampling. The Metropolis criterion decides whether the new state proposed in each iteration is accepted or rejected.
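
The Metropolis acceptance step mentioned above can be illustrated with a minimal sketch. It is not taken from the review or from any specific design tool: the one-dimensional `toy_score` function, the `temperature` and `step_size` values, and all function names are hypothetical stand-ins, and the sketch assumes that lower scores are better, as with typical docking energies. A move that improves the score is always accepted, while a move that worsens it by `delta_score` is accepted with probability exp(-delta_score / temperature).

```python
import math
import random


def metropolis_accept(delta_score, temperature, rng):
    """Return True if a move that changes the score by delta_score is accepted.

    Assumption: lower scores are better, so delta_score <= 0 is an improvement.
    """
    if delta_score <= 0:
        return True  # improvements are always accepted
    # Worse states are accepted with probability exp(-delta / T)
    return rng.random() < math.exp(-delta_score / temperature)


def toy_score(x):
    """Hypothetical stand-in for a ligand score; its minimum is at x = 3."""
    return (x - 3.0) ** 2


def metropolis_search(start, steps=5000, temperature=0.5, step_size=0.5, seed=0):
    """Run a toy Metropolis Monte Carlo search and return (best_x, best_score)."""
    rng = random.Random(seed)
    current, current_score = start, toy_score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        # Propose a new state by a small random perturbation of the current one
        candidate = current + rng.uniform(-step_size, step_size)
        candidate_score = toy_score(candidate)
        if metropolis_accept(candidate_score - current_score, temperature, rng):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


if __name__ == "__main__":
    best_x, best_value = metropolis_search(start=-5.0)
    print(f"best x = {best_x:.3f}, score = {best_value:.5f}")
```

In an actual de novo design setting the perturbation would be a chemical modification of the ligand and the score would come from a scoring function such as those described earlier. Occasionally accepting worse states is what lets the search escape local minima.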