The ChEMBL bioactivity database: an update

2014 | A. Patricia Bento, Anna Gaulton, Anne Hersey, Louisa J. Bellis, Jon Chambers, Mark Davies, Felix A. Krüger, Yvonne Light, Lora Mak, Shaun McGlinchey, Michal Nowotka, George Papadatos, Rita Santos and John P. Overington
ChEMBL is an open-access bioactivity database containing information extracted from the medicinal chemistry literature. It provides structured data on compounds and their structures, biological assays, and targets, enabling users to address a wide range of drug discovery questions. The database now holds over 1.3 million distinct compound structures and 12 million bioactivity data points, mapped to over 9000 targets, including 2827 human protein targets. Recent additions come from several sources, including neglected disease screening projects, kinase screening results, and supplementary bioactivity data from publications. The database also covers compounds in development and marketed products, derived from USAN and INN applications, giving a broad view of clinical candidate space.

A new data model has been developed to better represent drug targets. It distinguishes between targets (the entities with which compounds interact) and molecular components (usually proteins). The model introduces target types such as 'SINGLE PROTEIN', 'PROTEIN FAMILY', 'PROTEIN COMPLEX', and 'PROTEIN COMPLEX GROUP', allowing complex interactions to be represented more accurately.

The database also helps users identify high-quality data by providing physicochemical properties, ligand efficiencies, and standardized activity types and units. These enhancements allow users to assess the drug-likeness of compounds, compare bioactivity values, and identify potential errors or duplications in the data. The ChEMBL interface supports searches for compounds, targets, assays, or documents and provides a report card for each, containing clickable graphical widgets. The data are also available in Resource Description Framework (RDF) format and through web services for programmatic retrieval.
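The distinction between targets and molecular components described above can be illustrated with a small data-structure sketch. This is not the ChEMBL schema: the class names, the field names, the placeholder accessions, and the rule that a 'SINGLE PROTEIN' target has exactly one component are all assumptions made for illustration. Only the four target type names come from the text.

```python
from dataclasses import dataclass, field
from typing import List

# Target type names quoted in the summary above.
TARGET_TYPES = {
    "SINGLE PROTEIN",
    "PROTEIN FAMILY",
    "PROTEIN COMPLEX",
    "PROTEIN COMPLEX GROUP",
}


@dataclass(frozen=True)
class MolecularComponent:
    """A molecular component, usually a protein (hypothetical fields)."""
    accession: str
    description: str


@dataclass
class Target:
    """An entity with which compounds interact, built from components."""
    name: str
    target_type: str
    components: List[MolecularComponent] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.target_type not in TARGET_TYPES:
            raise ValueError(f"unknown target type: {self.target_type!r}")
        # Assumption for this sketch: a single-protein target maps to
        # exactly one component; the other types may group several.
        if self.target_type == "SINGLE PROTEIN" and len(self.components) != 1:
            raise ValueError("SINGLE PROTEIN needs exactly one component")


# Placeholder components; the accessions are not real identifiers.
subunit_a = MolecularComponent("ACC_A", "hypothetical subunit A")
subunit_b = MolecularComponent("ACC_B", "hypothetical subunit B")

# The same component can belong to more than one target.
single = Target("Subunit A alone", "SINGLE PROTEIN", [subunit_a])
complex_ab = Target("A/B complex", "PROTEIN COMPLEX", [subunit_a, subunit_b])
```

Keeping components separate from targets means an assay measured against a complex does not have to be recorded as if it were measured against each subunit individually, which is the kind of ambiguity the new model is meant to avoid.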
The database is freely available in various formats, including Oracle, MySQL, PostgreSQL, SDF, and FASTA, under a Creative Commons license. The interface also includes tools for data mining, such as the 'Browse Drugs' tab, which lists FDA-approved drugs and compounds, and the 'Drug Approvals' tab, which shows the drugs most recently approved by the FDA. The database is continuously updated with new data and features, and is an essential resource for drug discovery research.
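To make the earlier point about standardized units and ligand efficiencies concrete, here is a minimal sketch of how activity values reported in different units can be put on a common scale and turned into efficiency metrics. The formulas are the commonly cited definitions (ligand efficiency as binding free energy per heavy atom, binding efficiency index as potency per kilodalton, lipophilic ligand efficiency as potency minus logP). The summary does not state which exact formulas ChEMBL uses, and the function names and unit table here are assumptions.

```python
import math

# Conversion factors to nanomolar. An illustrative subset chosen for this
# sketch, not ChEMBL's actual unit-mapping table.
TO_NANOMOLAR = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}

GAS_CONSTANT = 0.0019872  # kcal / (mol * K)


def to_nanomolar(value: float, units: str) -> float:
    """Convert a concentration to nM so values can be compared directly."""
    try:
        return value * TO_NANOMOLAR[units]
    except KeyError:
        raise ValueError(f"unsupported units: {units!r}") from None


def p_activity(value_nm: float) -> float:
    """Negative log10 of a molar concentration given in nM (e.g. pIC50)."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    return 9.0 - math.log10(value_nm)


def ligand_efficiency(p_act: float, heavy_atoms: int,
                      temperature_k: float = 300.0) -> float:
    """LE = -dG / heavy atoms, with dG = -RT * ln(10) * pActivity."""
    return GAS_CONSTANT * temperature_k * math.log(10) * p_act / heavy_atoms


def binding_efficiency_index(p_act: float, mol_weight_da: float) -> float:
    """BEI = pActivity divided by molecular weight in kDa."""
    return p_act / (mol_weight_da / 1000.0)


def lipophilic_ligand_efficiency(p_act: float, logp: float) -> float:
    """LLE = pActivity minus logP."""
    return p_act - logp


# Hypothetical compound: IC50 of 1 uM, 20 heavy atoms, 300 Da, logP 2.5.
potency = p_activity(to_nanomolar(1.0, "uM"))
le = ligand_efficiency(potency, heavy_atoms=20)
bei = binding_efficiency_index(potency, mol_weight_da=300.0)
lle = lipophilic_ligand_efficiency(potency, logp=2.5)
```

Converting everything to one unit before taking the logarithm is what makes values from different publications comparable, and it also makes gross outliers (for example a value off by a factor of 1000 because of a unit transcription error) easier to spot.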