[slides and audio] ChEMBL: towards direct deposition of bioassay data

ChEMBL is a large, open-access bioactivity database that aims to capture medicinal chemistry data and knowledge across the pharmaceutical research and development process. It integrates information from peer-reviewed literature, approved drugs, and clinical development candidates, and exchanges bioactivity data with other databases such as PubChem BioAssay and BindingDB. Its practical applications include identifying chemical tools for a target of interest, assessing compound selectivity, training machine learning models, assisting in drug repurposing, and integration into other drug discovery resources. Several important improvements have been made in the last two years: more robust capture and representation of assay details, a new data deposition system that allows data sets to be updated and supplementary data to be deposited, and a completely redesigned web interface with enhanced search and filtering capabilities. The new interface lets users search for compounds or targets of interest, retrieve bioactivity data, and filter the data on desired properties. It also offers interactive filtering and data visualizations, such as the 'Bioactivity Heatmap' and a sunburst visualization for exploring protein target classifications. The database now contains over 15 million bioactivity measurements for 1.8 million distinct compounds, with assays annotated to over 1600 cell lines, 500 tissues/organs, and 3600 organisms. New data sources have been incorporated, including patent bioactivity data, curated drug pharmacokinetic data, CO-ADD antimicrobial screening data, and data from the K4DD project. Data for the MMV Pathogen Box, a set of 400 diverse, drug-like compounds with activities in a range of neglected diseases, are also included. ChEMBL is made available under a Creative Commons Attribution-ShareAlike 3.0 Unported license. It is a widely used drug discovery resource with a global user base in academia, industry, and charitable organizations.
The database provides high-quality, curated resources that support new discoveries, the creation of new spin-out companies, and the validation of computational tools. The current emphasis on Artificial Intelligence approaches and applications highlights the value of high-quality, curated resources such as ChEMBL. The database continues to deliver quality content and provides users with the ability to more easily interact with the data and facilitate integration with other data types.
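
The search, retrieve, and filter workflow described above can also be approached programmatically. The following is a minimal sketch, not part of the original summary, which only describes the web interface. It assumes the public ChEMBL REST web services at `https://www.ebi.ac.uk/chembl/api/data` and their `activity` endpoint with Django-style filter suffixes such as `__gte`. The helper name `activity_query_url`, its defaults, and the example target identifier `CHEMBL240` are illustrative choices. The code only builds a query URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed base URL of the public ChEMBL REST web services.
CHEMBL_API = "https://www.ebi.ac.uk/chembl/api/data"


def activity_query_url(target_chembl_id, standard_type="IC50",
                       min_pchembl=None, limit=20):
    """Build (but do not send) a URL requesting activities for one target.

    Hypothetical helper for illustration; parameter names mirror the
    filter fields assumed to be accepted by the activity endpoint.
    """
    params = {
        "target_chembl_id": target_chembl_id,  # e.g. "CHEMBL240"
        "standard_type": standard_type,        # measurement type to keep
        "limit": limit,                        # page size
    }
    if min_pchembl is not None:
        # "__gte" keeps records whose pChEMBL value is >= the threshold.
        params["pchembl_value__gte"] = min_pchembl
    return f"{CHEMBL_API}/activity.json?{urlencode(params)}"


url = activity_query_url("CHEMBL240", min_pchembl=6)
print(url)
```

Passing the resulting URL to any HTTP client should return a page of matching activity records as JSON, if the service behaves as assumed here. Keeping URL construction separate from the request makes the filters easy to inspect before any data is fetched.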
