Andrew McCallum, Dayne Freitag, Fernando Pereira
This paper introduces Maximum Entropy Markov Models (MEMMs), a new sequence model for information extraction and segmentation. Unlike traditional Hidden Markov Models (HMMs), MEMMs allow observations to be represented as arbitrary, overlapping features such as word identity, capitalization, formatting, and part-of-speech tags. The model defines the conditional probability of a state sequence given an observation sequence, using the maximum entropy framework to fit exponential models for the probability of each state given the current observation and the previous state. The parameters are trained with generalized iterative scaling (GIS), which is similar in form and computational cost to the expectation-maximization (EM) algorithm.
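The per-state exponential models can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature templates, state names, and the single hand-set weight are assumptions chosen for the FAQ-segmentation setting (in the paper, the weights would be learned by GIS rather than set by hand).

```python
import math

# Sketch of an MEMM's local conditional model: for each source state s',
# a separate exponential model
#     P_{s'}(s | o) = exp(sum_a lambda_a * f_a(o, s)) / Z(o, s')
# over binary features f_a of the <observation, destination-state> pair.
# Feature templates and the weight below are illustrative assumptions.

def features(obs, state):
    """Active binary features for an <observation, destination-state> pair."""
    feats = []
    if "?" in obs:
        feats.append(("contains-question-mark", state))
    if obs[:1].isdigit():
        feats.append(("begins-with-number", state))
    if obs.strip() == "":
        feats.append(("blank-line", state))
    return feats

def transition_probs(weights, prev_state, obs, states):
    """P(s | s', o): a normalized exponential model per (s', o) pair."""
    scores = {
        s: math.exp(sum(weights.get((prev_state, f), 0.0)
                        for f in features(obs, s)))
        for s in states
    }
    z = sum(scores.values())  # normalizer Z(o, s')
    return {s: v / z for s, v in scores.items()}

states = ["head", "question", "answer", "tail"]
# One hand-set weight: from state "answer", a line containing "?" favors "question".
weights = {("answer", ("contains-question-mark", "question")): 2.0}
probs = transition_probs(weights, "answer", "What is wiretapping?", states)
```

Because each source state carries its own normalizer Z(o, s'), the probabilities over destination states always sum to one, and adding a new overlapping feature never requires re-deriving an independence assumption, in contrast to HMM emission models.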
MEMMs address two key limitations of traditional HMMs: first, they accommodate non-independent, difficult-to-enumerate observation features, and second, they directly solve the conditional problem of predicting the state sequence given the observation sequence, rather than maximizing the joint likelihood of observations and states. Concretely, the model represents the probability of reaching each state given an observation and the previous state, with these conditional probabilities specified by exponential models over arbitrary observation features.
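Prediction under such a conditional model can be sketched as a Viterbi-style recursion in which P(s | s', o) replaces the separate transition and emission terms of an HMM. The `trans` callable and the toy model below are illustrative assumptions, not the paper's code:

```python
# Viterbi-style decoding for a conditional sequence model: the model gives
# P(s | s', o) directly, so the recursion multiplies these conditional
# transition probabilities with no separate emission term.
# `trans` is an assumed callable: trans(prev_state, obs) -> {state: prob}.

def viterbi(observations, states, trans, start_state="start"):
    # delta holds the probability of the best path ending in each state;
    # back holds the back-pointers used to recover that path.
    delta = [trans(start_state, observations[0])]
    back = []
    for obs in observations[1:]:
        step, ptr = {}, {}
        for s in states:
            cand = {sp: delta[-1][sp] * trans(sp, obs)[s] for sp in states}
            best = max(cand, key=cand.get)
            step[s], ptr[s] = cand[best], best
        delta.append(step)
        back.append(ptr)
    # Trace the highest-probability path backwards.
    last = max(delta[-1], key=delta[-1].get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy transition model (ignores the previous state for simplicity):
def toy_trans(prev_state, obs):
    p = 0.9 if "?" in obs else 0.1
    return {"question": p, "answer": 1.0 - p}

labels = viterbi(["What is FTP?", "It transfers files."],
                 ["question", "answer"], toy_trans)
```

The recursion has the same time complexity as HMM Viterbi decoding; the only structural change is that the observation conditions the transition distribution itself.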
The paper presents positive experimental results on the segmentation of FAQs, showing that MEMMs increase both precision and recall, with precision improved by a factor of two. The model is evaluated on a dataset of 38 files from 7 Usenet multi-part FAQs, with each line labeled as head, question, answer, or tail. The results show that MEMMs outperform competing models, including traditional HMMs and a stateless maximum entropy model, at segmenting FAQs into questions and answers. The paper also discusses related work, covering other probabilistic models and non-probabilistic methods, and concludes that MEMMs offer a more effective approach to text-related tasks such as information extraction and segmentation.