26 February 2010 | Alan R Aronson, François-Michel Lang
MetaMap is a widely used program that facilitates access to the concepts in the Unified Medical Language System (UMLS) Metathesaurus from biomedical text. This study reviews the evolution of MetaMap over more than a decade, focusing on features developed to meet the research needs of the biomedical informatics community. Key features include the detection of author-defined acronyms/abbreviations, browsing the Metathesaurus for concepts even tenuously related to input text, detecting negation in predications, word sense disambiguation (WSD), and various technical and algorithmic enhancements. Near-term plans for MetaMap include incorporating chemical name recognition and enhancing WSD. The paper also discusses MetaMap's strengths and weaknesses, its application in various tasks such as text mining, classification, and question answering, and its role in relating biomedical text to structured knowledge sources. MetaMap's development has been driven by research issues, including tokenization, output formats, genre and task-specific considerations, and algorithm tuning. The paper concludes with a discussion of future plans for MetaMap, including improvements in clinical text processing, chemical name recognition, and WSD accuracy.