MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports

2019 | Alistair E. W. Johnson, Tom J. Pollard, Seth J. Berkowitz, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih-ying Deng, Roger G. Mark & Steven Horng
MIMIC-CXR is a large dataset of chest radiographs paired with free-text radiology reports. It contains 377,110 images from 227,835 studies of 65,379 patients, acquired at the Beth Israel Deaconess Medical Center (BIDMC) between 2011 and 2016. Each study contains one or more images, usually a frontal and a lateral view, together with a semi-structured free-text report written by a practicing radiologist. A mapping file lists every image name with its corresponding study identifier and patient identifier.
Building the database involved three distinct data modalities: electronic health records, images, and natural language reports. Each modality was processed independently, and the results were then combined. The data are de-identified to comply with HIPAA requirements using a combination of methods, including removal of PHI, anonymization of patient identifiers, and modification of dates. The project was approved by the Institutional Review Board of BIDMC.
The dataset is available for download from the MIMIC-CXR Database project on PhysioNet. Access requires proof of completion of a course on human subjects research and a signed data use agreement. Publicly accessible Jupyter Notebooks demonstrate usage of the data, and related datasets may complement future work. The dataset is intended to support a wide range of research in medicine, including image understanding, natural language processing, and decision support, as well as computer vision and clinical data mining. By fostering collaboration in medical image processing and secondary analysis of electronic health records, it aims to accelerate research in the field and ensure the reproducibility of future studies.
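The mapping file mentioned in the summary (image name, study identifier, patient identifier) lends itself to a simple grouping step before any analysis. The following is a minimal sketch of that idea, not the dataset's actual file layout: the column names `image_name`, `study_id`, and `patient_id`, the sample rows, and the helper `group_images` are all hypothetical and chosen only for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The real mapping file's column names and
# identifier formats may differ; these are assumptions for illustration.
SAMPLE_MAPPING = """image_name,study_id,patient_id
img_a.dcm,s100,p1
img_b.dcm,s100,p1
img_c.dcm,s200,p1
img_d.dcm,s300,p2
"""


def group_images(mapping_file):
    """Return {patient_id: {study_id: [image_name, ...]}} from a CSV mapping."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(mapping_file):
        grouped[row["patient_id"]][row["study_id"]].append(row["image_name"])
    # Convert nested defaultdicts to plain dicts so missing keys raise KeyError.
    return {patient: dict(studies) for patient, studies in grouped.items()}


grouped = group_images(io.StringIO(SAMPLE_MAPPING))
for patient, studies in grouped.items():
    for study, images in studies.items():
        print(patient, study, images)
```

Grouping by patient first and study second mirrors the hierarchy the summary describes (patients have studies, and studies have one or more images), which also makes it straightforward to split data by patient so that the same person never appears in both training and evaluation sets.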