Precise atom-to-atom mapping for organic reactions via human-in-the-loop machine learning


13 March 2024 | Shuan Chen, Sunggi An, Ramil Babazade, Yousung Jung
This article introduces LocalMapper, a machine learning model that learns precise atom-to-atom mapping (AAM) for chemical reactions through human-in-the-loop learning. AAM records which atom in the reactants corresponds to which atom in the products, so it is crucial for understanding reaction mechanisms and for the accuracy of machine learning (ML) models used in reaction prediction. Existing methods often rely on substructure alignment rather than chemical knowledge, which leads to inaccurate AAMs.
LocalMapper is trained on chemist-labeled reactions from the USPTO-50K dataset. It predicts AAMs for 50,000 reactions with 98.5% accuracy while learning from human labels on only 2% of the data, shows 100% accuracy on 3,000 randomly sampled reactions, and performs well in out-of-distribution experiments. The model is a graph neural network that combines local message passing with long-range attention. It also incorporates knowledge-based confidence identification, which lets it distinguish confident predictions from uncertain ones. In the reported comparisons, LocalMapper outperforms existing methods such as RXNMapper and GraphormerMapper in both accuracy and confidence. The study demonstrates that human-in-the-loop learning can produce reliable AAMs for reaction databases and thereby improve the quality of future ML-based reaction prediction models.
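The human-in-the-loop idea with confidence identification can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a prediction counts as "confident" when the reaction pattern it implies has already been verified by a chemist, and the names `predict_template`, `ask_chemist`, `split_by_confidence`, and `human_in_the_loop` are hypothetical. A real system would also retrain the model after each labeling round, which is omitted here.

```python
from typing import Callable, Iterable, List, Set, Tuple


def split_by_confidence(
    reactions: Iterable[str],
    predict_template: Callable[[str], str],
    verified: Set[str],
) -> Tuple[List[str], List[str]]:
    """Separate reactions whose predicted pattern is already verified."""
    confident, uncertain = [], []
    for rxn in reactions:
        # Assumption: membership in the verified set defines confidence.
        if predict_template(rxn) in verified:
            confident.append(rxn)
        else:
            uncertain.append(rxn)
    return confident, uncertain


def human_in_the_loop(
    reactions: List[str],
    predict_template: Callable[[str], str],
    ask_chemist: Callable[[str], str],
    batch_size: int = 2,
    max_rounds: int = 10,
) -> Tuple[Set[str], List[str], List[str]]:
    """Ask a chemist about a few uncertain reactions per round."""
    verified: Set[str] = set()
    for _ in range(max_rounds):
        _, uncertain = split_by_confidence(reactions, predict_template, verified)
        if not uncertain:
            break
        for rxn in uncertain[:batch_size]:
            # The chemist returns the correct pattern for this reaction.
            verified.add(ask_chemist(rxn))
        # A real system would retrain the mapping model here (omitted).
    confident, uncertain = split_by_confidence(reactions, predict_template, verified)
    return verified, confident, uncertain


# Toy data: each "reaction" is a string of the form "<pattern>:<id>".
toy_reactions = ["amide:1", "amide:2", "suzuki:1", "ester:1"]
toy_predict = lambda rxn: rxn.split(":")[0]
toy_chemist = lambda rxn: rxn.split(":")[0]

verified, confident, uncertain = human_in_the_loop(
    toy_reactions, toy_predict, toy_chemist
)
```

In this toy run every reaction ends up in the confident set after two labeling rounds, because one label per pattern covers all reactions sharing that pattern. That reuse is what lets a small fraction of human labels cover a large dataset.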