Leveraging infrared spectroscopy for automated structure elucidation

2024 | Marvin Alberts, Teodoro Laino, Alain C. Vaucher
This paper presents a transformer model that predicts the molecular structure of a compound directly from its experimental infrared (IR) spectrum. The model is pre-trained on 634,585 simulated IR spectra generated using molecular dynamics and the class II polymer consistent force field (PCFF), then fine-tuned on 3,453 experimental spectra from the NIST IR database. It achieves a top-1 accuracy of 44.4% and a top-10 accuracy of 69.8% in predicting the molecular structure, and an average F1 score of 0.856 in predicting 19 functional groups. Performance depends on the heavy atom count and on which functional groups are present: accuracy is higher for molecules with fewer heavy atoms, and the model performs better on halogens and less complex functional groups. The model's errors resemble those made by human analysts, which suggests that it captures key spectral features. The study highlights the potential of IR spectroscopy for rapid, cost-effective structure elucidation, particularly at research institutions with limited resources.
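To make the reported metrics concrete, the following is a minimal sketch of how top-k accuracy and an averaged F1 score could be computed. It is not the authors' evaluation code. It assumes that structures are compared as exact matches between already canonicalized SMILES strings and that the average F1 is an unweighted mean over functional groups; the summary above states neither. The function names and the toy data are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def top_k_accuracy(
    targets: Sequence[str],
    ranked_predictions: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Fraction of samples whose target is among the first k ranked candidates.

    Assumption: targets and candidates are canonical strings, so an exact
    string comparison is a valid structure match.
    """
    if len(targets) != len(ranked_predictions):
        raise ValueError("targets and ranked_predictions must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        target in candidates[:k]
        for target, candidates in zip(targets, ranked_predictions)
    )
    return hits / len(targets)


def f1_score(true_labels: Sequence[bool], predicted_labels: Sequence[bool]) -> float:
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if there are no positives."""
    tp = sum(t and p for t, p in zip(true_labels, predicted_labels))
    fp = sum((not t) and p for t, p in zip(true_labels, predicted_labels))
    fn = sum(t and (not p) for t, p in zip(true_labels, predicted_labels))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mean_f1(groups: Dict[str, Tuple[Sequence[bool], Sequence[bool]]]) -> float:
    """Unweighted mean of per-group F1 scores (an assumed averaging scheme)."""
    scores = [f1_score(true, pred) for true, pred in groups.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: three molecules, two ranked candidates each.
targets = ["CCO", "CC(=O)O", "c1ccccc1"]
ranked = [
    ["CCO", "COC"],           # correct at rank 1
    ["CC(=O)OC", "CC(=O)O"],  # correct at rank 2
    ["C1CCCCC1", "CC"],       # never correct
]
top1 = top_k_accuracy(targets, ranked, k=1)  # 1 of 3 correct
top2 = top_k_accuracy(targets, ranked, k=2)  # 2 of 3 correct

# Hypothetical presence/absence labels for two functional groups.
groups = {
    "alcohol": ([True, True, False, False], [True, False, True, False]),
    "halogen": ([True, False, False, True], [True, False, False, True]),
}
average_f1 = mean_f1(groups)  # (0.5 + 1.0) / 2 = 0.75
```

In this reading, a top-10 accuracy of 69.8% means that for 69.8% of test spectra the correct structure appears somewhere among the model's ten highest-ranked candidates, while top-1 requires it to be ranked first.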