[slides and audio] 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text

The 2010 i2b2/VA challenge comprised three tasks on clinical text: concept extraction, assertion classification, and relation classification. Its aim was to evaluate systems that extract medical concepts, classify assertions, and identify relations between medical problems, tests, and treatments. i2b2 and the VA provided annotated reference standard corpora; using these, 22 systems were developed for concept extraction, 21 for assertion classification, and 16 for relation classification. The data came from three institutions and included discharge summaries and progress reports: 394 training reports, 477 test reports, and 877 unannotated reports in total. The data became available for research in November 2011 under data use agreements. The systems showed that machine learning can be combined with rule-based approaches to improve performance, and that ensemble methods, unlabeled data, and external knowledge sources help when training data are limited. Feature engineering also proved important: the most successful systems used high-dimensional feature spaces. Of the three tasks, assertion classification was the easiest and relation classification the most challenging. The best-performing system for concept extraction achieved an exact F-measure of 0.852. The top four systems for assertion classification were not significantly different from one another, nor were the top two systems for relation classification. Overall, the challenge underscored the importance of domain knowledge in clinical NLP, indicated that the systems could generalize to other institutions' data, and contributed to the advancement of NLP in clinical settings.
The 2011 i2b2/VA challenge will focus on co-reference resolution, building on the success of the 2010 challenge.
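
The summary reports an exact F-measure for concept extraction without spelling out the computation. The following is a minimal sketch of how such a score could be computed, not the challenge's actual evaluation script. It assumes that a predicted concept counts as correct only if both its span and its type match a reference concept exactly; the `Concept` tuple layout, the function name `exact_f_measure`, and the toy data are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token, concept_type).
Concept = Tuple[int, int, str]


def exact_f_measure(gold: Set[Concept], predicted: Set[Concept]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) under exact matching.

    Assumption: a prediction is a true positive only if its span and
    type are identical to those of a reference concept.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data (invented for illustration, not taken from the challenge corpus).
gold = {(0, 1, "problem"), (5, 5, "test"), (9, 10, "treatment"), (14, 15, "problem")}
predicted = {(0, 1, "problem"), (5, 5, "test"), (9, 11, "treatment")}

precision, recall, f_measure = exact_f_measure(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

In the toy data, the predicted treatment span ends one token late, so under exact matching it counts as neither a true positive for precision nor a hit for recall. A more lenient (inexact) matching scheme would credit such overlaps, which is why the two kinds of score are usually reported separately.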

2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text

2011 | Özlem Uzuner, Brett R South, Shuying Shen, Scott L DuVall