From Frequency to Meaning: Vector Space Models of Semantics

Submitted 10/09; published 02/10 | Peter D. Turney, Patrick Pantel
The paper "From Frequency to Meaning: Vector Space Models of Semantics" by Peter D. Turney and Patrick Pantel surveys the use of Vector Space Models (VSMs) for semantic processing of text. VSMs are beginning to address the limitations of computers understanding human language, which significantly impacts their ability to process and explain text. The authors organize the literature on VSMs according to the structure of the matrix in a VSM, identifying three broad classes: term-document, word-context, and pair-pattern matrices. Each class has specific applications, and the paper provides detailed overviews of each category, including specific open-source projects. The motivation for VSMs is discussed, highlighting their automatic extraction of knowledge from corpora and their performance in tasks involving measuring word similarity and semantic relations. The paper also explores the connection between VSMs and the distributional hypothesis, which posits that words occurring in similar contexts tend to have similar meanings. The authors provide a new framework for organizing the literature and discuss the breadth of applications of VSMs, focusing on practical tasks in natural language processing and computational linguistics. The paper concludes with a discussion of alternatives to VSMs and the future of VSMs, emphasizing their potential and limitations.The paper "From Frequency to Meaning: Vector Space Models of Semantics" by Peter D. Turney and Patrick Pantel surveys the use of Vector Space Models (VSMs) for semantic processing of text. VSMs are beginning to address the limitations of computers understanding human language, which significantly impacts their ability to process and explain text. The authors organize the literature on VSMs according to the structure of the matrix in a VSM, identifying three broad classes: term-document, word-context, and pair-pattern matrices. 
Each class has specific applications, and the paper provides detailed overviews of each category, including specific open-source projects. The motivation for VSMs is discussed, highlighting their automatic extraction of knowledge from corpora and their performance in tasks involving measuring word similarity and semantic relations. The paper also explores the connection between VSMs and the distributional hypothesis, which posits that words occurring in similar contexts tend to have similar meanings. The authors provide a new framework for organizing the literature and discuss the breadth of applications of VSMs, focusing on practical tasks in natural language processing and computational linguistics. The paper concludes with a discussion of alternatives to VSMs and the future of VSMs, emphasizing their potential and limitations.
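The simplest of the three matrix classes the survey identifies, the term-document matrix, can be illustrated with a toy sketch. The corpus and all names below are invented for illustration; each term becomes a row vector of per-document frequencies, and similarity between terms is measured as the cosine of the angle between their vectors:

```python
import math

# Tiny invented corpus: each string is one "document".
docs = [
    "ships sail on the sea",
    "boats sail on the ocean",
    "cars drive on the road",
]

# Build the term-document matrix: one row per term,
# one column per document, entries are raw frequencies.
vocab = sorted({w for d in docs for w in d.split()})
matrix = {t: [d.split().count(t) for d in docs] for t in vocab}

def cosine(u, v):
    """Cosine similarity between two frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# "sail" co-occurs with "sea" (same documents) but never with "road",
# so cosine(sail, sea) > cosine(sail, road).
print(cosine(matrix["sail"], matrix["sea"]))
print(cosine(matrix["sail"], matrix["road"]))
```

Real systems described in the survey weight the raw frequencies (e.g. with tf-idf) and often smooth the matrix with dimensionality reduction, but the vectors-plus-cosine core is the same.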