Quantum chemistry structures and properties of 134 kilo molecules

05 August 2014 | Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp & O. Anatole von Lilienfeld
This study reports computed geometric, energetic, electronic, and thermodynamic properties for about 134,000 (134 k) stable small organic molecules composed of C, H, O, N, and F. The molecules are the subset of the GDB-17 chemical universe comprising all 133,885 species with up to nine heavy atoms (C, O, N, F), that is, the GDB-9 subset of neutral molecules with up to nine atoms not counting hydrogen. It includes small amino acids, nucleobases, and pharmaceutically relevant organic building blocks. All properties were calculated at the B3LYP/6-31G(2df,p) level of quantum chemistry. For the 6,095 constitutional isomers of the predominant stoichiometry, C7H10O2, more accurate energetics were also calculated at the G4MP2 level. The result is a molecular property set of unprecedented size and consistency that covers the chemical space of small organic molecules and can be used for benchmarking existing methods, developing new methods such as hybrid quantum mechanics/machine learning, and identifying structure-property relationships. The data is publicly available at Figshare in a plain text XYZ-like format. It has been validated by comparison with other high-level quantum chemistry methods, which shows that the B3LYP atomization energies are accurate to within 5 kcal/mol, and by geometry consistency checks on the molecular structures.
The data is available for use in the development, training, and evaluation of machine learning models, as well as for the discovery of new trends and materials design.
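Since the data is distributed in a plain text XYZ-like format, a reader might load one molecule along the following lines. This is a minimal sketch, not the authors' tooling. The summary above does not spell out the exact file layout, so the code assumes the conventional XYZ arrangement (atom count on the first line, a free-form line second, then one line per atom with an element symbol and three coordinates) and ignores any extra columns or trailing lines. The names `Molecule`, `parse_xyz_like`, `to_float`, and `count_heavy_atoms` are hypothetical, and the methane coordinates are illustrative values, not taken from the data set.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Molecule:
    comment: str                              # raw second line, left uninterpreted
    symbols: List[str]                        # element symbols, one per atom
    coords: List[Tuple[float, float, float]]  # Cartesian coordinates as written in the file


def to_float(token: str) -> float:
    # Assumption: some exports write exponents as "1.5*^-6" rather than "1.5e-6".
    return float(token.replace("*^", "e"))


def parse_xyz_like(text: str) -> Molecule:
    """Parse one molecule from an XYZ-like text block (assumed layout, see lead-in)."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    comment = lines[1].strip()
    symbols: List[str] = []
    coords: List[Tuple[float, float, float]] = []
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        symbols.append(parts[0])
        # Only the first three numeric columns are read; extra columns are ignored.
        x, y, z = (to_float(p) for p in parts[1:4])
        coords.append((x, y, z))
    if len(symbols) != n_atoms:
        raise ValueError(f"expected {n_atoms} atom lines, found {len(symbols)}")
    return Molecule(comment, symbols, coords)


def count_heavy_atoms(mol: Molecule) -> int:
    # Heavy atoms are everything except hydrogen (C, O, N, F in this data set).
    return sum(1 for s in mol.symbols if s != "H")


# Illustrative methane block (made-up coordinates, with one extra column per atom line).
SAMPLE = """5
illustrative comment line
C  0.000  0.000  1.0*^-4  0.0
H  0.629  0.629  0.629  0.0
H -0.629 -0.629  0.629  0.0
H -0.629  0.629 -0.629  0.0
H  0.629 -0.629 -0.629  0.0
"""

mol = parse_xyz_like(SAMPLE)
print(mol.symbols, count_heavy_atoms(mol))
```

A check such as `count_heavy_atoms(mol) <= 9` mirrors the nine-heavy-atom limit described above. Any property values stored on the second line would need the data set's own documentation to interpret, which is why the sketch keeps that line as an unparsed string.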