Published online 22 September 2015 | Sunghwan Kim, Paul A. Thiessen, Evan E. Bolton*, Jie Chen, Gang Fu, Asta Gindulyte, Lianyi Han, Jane He, Siqian He, Benjamin A. Shoemaker, Jiyao Wang, Bo Yu, Jian Zhang and Stephen H. Bryant
The paper provides an overview of the PubChem Substance and Compound databases, which are part of the PubChem public repository for chemical substances and their biological activities. PubChem, launched in 2004, has grown to become a significant resource for scientific research, particularly in cheminformatics, chemical biology, medicinal chemistry, and drug discovery. The repository consists of three interconnected databases: Substance, Compound, and BioAssay. The Substance database stores depositor-contributed chemical information, while the Compound database contains unique chemical structures extracted from Substance. The BioAssay database holds biological activity data from assay experiments.
The paper discusses data sources, content, organization, submission processes, standardization, web-based interfaces, programmatic access, and related tools such as PubChem3D and PubChemRDF. PubChem3D provides 3-D conformer models for compounds, enhancing 3-D similarity searches, and PubChemRDF uses Resource Description Framework (RDF) to facilitate data sharing and integration with other databases. The paper also highlights PubChem's commitment to continuous improvement and adaptation to new technologies.
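As a rough illustration of the programmatic access mentioned above, the following sketch builds request URLs for PubChem's PUG REST interface. The summary does not name a specific interface, so the choice of PUG REST is an assumption here, and the helper names (`compound_property_url`, `substance_cids_url`, `fetch_json`) are hypothetical conveniences rather than part of any PubChem library. CID 2244 and SID 1 are used purely as example identifiers.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL of the PUG REST service (assumed interface; not named in the summary).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(cid, properties, output="JSON"):
    """Build a URL requesting computed properties for one Compound record (CID)."""
    props = ",".join(properties)
    return (
        f"{PUG_REST_BASE}/compound/cid/{int(cid)}"
        f"/property/{quote(props, safe=',')}/{output}"
    )


def substance_cids_url(sid, output="JSON"):
    """Build a URL asking which Compound record(s) a Substance record (SID) maps to."""
    return f"{PUG_REST_BASE}/substance/sid/{int(sid)}/cids/{output}"


def fetch_json(url, timeout=30):
    """Fetch a URL and parse the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only the URLs are printed here; pass one to fetch_json() to query the service.
    print(compound_property_url(2244, ["MolecularFormula", "MolecularWeight"]))
    print(substance_cids_url(1))
```

The second helper mirrors the Substance-to-Compound relationship described in the summary: a depositor-supplied substance record is looked up to find the unique compound structure derived from it.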