2023, Vol. 51, Database issue | Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A. Shoemaker, Paul A. Thiessen, Bo Yu, Leonid Zaslavsky, Jian Zhang and Evan E. Bolton
The article provides an overview of significant updates to PubChem, a widely used chemical information resource, over the past two years. Key highlights include:
1. **Data Sources Expansion**: PubChem now integrates data from over 120 new sources, bringing the total to approximately 870 data sources. Notable additions include drug information from the FDA's National Drug Code Directory and Green Book, as well as chemical health and safety data from Haz-Map, IARC, and EPA.
2. **Google Patents Integration**: Integrating Google Patents data significantly expanded PubChem's patent data collection, which now contains 767 million links between chemical structures and patent documents.
3. **New Data Collections**: The Cell Line and Taxonomy data collections were created to provide quick access to chemical information specific to a given cell line or taxon, respectively.
4. **Bioassay Data Model Update**: The bioassay data model was updated to store panel assay data in a row-based format, making it easier to manage and interpret. The new model also includes endpoint qualifiers and supports UTF-8 characters.
5. **Programmatic Access Enhancements**: New functionalities were added to PUG-REST and PUG-View, including the 'standardize' operation for chemical structure standardization and programmatic access to target-centric data.
6. **PubChemRDF Update**: Major changes were made to PubChemRDF, including the addition of a Pathway subdomain and updates to predicates defining semantic relationships between entities.
These updates aim to enhance the breadth and depth of chemical information available in PubChem, improving its utility for researchers, health and safety officers, patent agents, educators, and students.
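
To make the programmatic access item more concrete, the following is a minimal sketch of how request URLs for the PUG-REST 'standardize' operation and a target-centric (gene) summary might be assembled. The path layouts (`standardize/smiles/...` and `gene/geneid/.../summary`) reflect the public PUG-REST URL pattern as commonly documented and should be checked against the current PubChem documentation. The helper names `pug_rest_url`, `standardize_url`, and `gene_summary_url` are hypothetical, and the inputs (the SMILES string `CCCC` and gene ID 1956) are illustrative only. The sketch only builds URLs; it does not send any request.

```python
from urllib.parse import quote

# Base address of the PUG-REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_url(*path_parts, output="JSON"):
    """Join path segments into a PUG-REST style URL.

    Each segment is percent-encoded so that characters such as
    '=' or '(' in a SMILES string do not break the URL path.
    (Hypothetical helper, not part of any PubChem library.)
    """
    encoded = [quote(str(part), safe="") for part in path_parts]
    return "/".join([PUG_REST_BASE, *encoded, output])


def standardize_url(smiles, output="SDF"):
    """URL for the 'standardize' operation on a SMILES input (assumed path layout)."""
    return pug_rest_url("standardize", "smiles", smiles, output=output)


def gene_summary_url(gene_id, output="JSON"):
    """URL for a target-centric gene summary by NCBI Gene ID (assumed path layout)."""
    return pug_rest_url("gene", "geneid", gene_id, "summary", output=output)


if __name__ == "__main__":
    print(standardize_url("CCCC"))
    print(gene_summary_url(1956))
```

The resulting URLs can be passed to any HTTP client. Structures whose SMILES contain a `/` may be better sent in a POST body or as a URL argument rather than in the path, since percent-encoded slashes are not handled consistently by all servers.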