The paper "Word Association Norms, Mutual Information, and Lexicography" by Kenneth Ward Church and Patrick Hanks examines the psycholinguistic concept of word association norms and proposes a new objective measure, the association ratio, based on mutual information. The measure estimates word association norms directly from computer-readable corpora, which makes it more efficient and reliable than traditional methods that rely on human subjects. The authors demonstrate its utility in several applications, including speech recognition, optical character recognition, text retrieval, and lexicography. They show how the association ratio can surface interesting word associations, such as "dentists," "nurses," "treating," "treat," and "hospitals," and how it can help lexicographers organize concordance lines and identify semantic classes. The paper also discusses the limitations of the association ratio, such as its inability to capture compositional semantics and the need for explicit preprocessing to handle syntactic structures. Overall, the proposed measure is a practical tool for understanding and describing word associations in large corpora.
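As a rough illustration of the measure described above, the association ratio can be sketched as pointwise mutual information, log2 of P(x, y) / (P(x) P(y)), where the joint count tallies how often y follows x within a small window. The function name `association_ratios`, the default window of 5, the plain relative-frequency estimates, and the toy token list are assumptions made for this sketch, not the authors' implementation.

```python
import math
from collections import Counter


def association_ratios(tokens, window=5):
    """Return {(x, y): log2(P(x, y) / (P(x) * P(y)))} for ordered word pairs.

    Hypothetical helper for illustration only. Probabilities are plain
    relative frequencies over the corpus size n (an assumption of this sketch).
    """
    n = len(tokens)
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, x in enumerate(tokens):
        # y must follow x within the next (window - 1) positions,
        # so the pair (x, y) is counted separately from (y, x).
        for y in tokens[i + 1 : i + window]:
            pair_counts[(x, y)] += 1
    return {
        (x, y): math.log2(
            (f_xy / n) / ((word_counts[x] / n) * (word_counts[y] / n))
        )
        for (x, y), f_xy in pair_counts.items()
    }


# Toy corpus, chosen only to make the arithmetic easy to follow.
scores = association_ratios(["a", "b", "a", "b"], window=2)
print(scores[("a", "b")], scores[("b", "a")])
```

Because the count is order-sensitive, the score for (x, y) can differ from the score for (y, x). Scores well above zero suggest that the pair co-occurs more often than chance would predict, while estimates built on very small counts are unstable and should be read with caution.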