Extracting Policy Positions from Political Texts Using Words as Data

This paper presents a new method for extracting policy positions from political texts by treating texts as data in the form of words. The method is compared to previous text analysis techniques and used to replicate published estimates of the policy positions of political parties in Britain and Ireland on both economic and social policy dimensions. It is then applied to German political texts, including those of the PDS, and extended to legislative speeches. The technique uses a "language-blind" word-scoring approach that estimates policy positions without requiring extensive time and labor. It also provides uncertainty measures for the estimates, enabling analysts to assess the significance of differences between estimated policy positions.

The paper discusses two contrasting approaches to estimating policy positions: a priori and a posteriori (inductive). The a priori approach starts from predefined policy dimensions and estimates positions on them from the relative frequency of words in texts. The inductive approach uses content analysis to generate a matrix of similarities and dissimilarities between texts, from which policy dimensions are then derived. The a priori approach is more commonly used in political science, as it allows for the estimation of policy positions on predefined dimensions.

The paper describes a method for estimating policy positions by comparing two sets of texts: reference texts with known policy positions and virgin texts with unknown positions. From the relative frequency of each word in the reference texts, the method calculates the probability that a reader encountering that word is reading a particular reference text. Weighting the known positions of the reference texts by these probabilities gives a score for each word, and the scores of the words in a virgin text, weighted by their relative frequencies in that text, give its estimated policy position. The method is tested on British and Irish political texts and is shown to replicate published estimates of policy positions.
The method is also applied to German political texts and legislative speeches, demonstrating its versatility and effectiveness. The method provides uncertainty measures for the estimates, allowing analysts to assess the significance of differences between estimated policy positions. The paper concludes that the method is a significant advancement in the analysis of political texts, as it allows for the estimation of policy positions without the need for extensive human intervention.
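
The word-scoring steps summarised above (relative word frequencies in reference texts, the probability of a reference text given a word, a score per word, and an estimate for a virgin text) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the whitespace tokenisation, and the toy texts and positions are all assumptions made for the example, and the standard error follows the usual formulation of the technique (the frequency-weighted variance of word scores divided by the number of scored words), which the summary itself does not spell out.

```python
from collections import Counter
import math


def relative_frequencies(text):
    """Map each word in `text` to its share of the text's total word count."""
    counts = Counter(text.lower().split())  # naive tokenisation (assumption)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def word_scores(reference_texts, reference_positions):
    """Score every word that occurs in at least one reference text.

    reference_texts:     dict of name -> text
    reference_positions: dict of name -> known position on one dimension
    """
    freqs = {name: relative_frequencies(t) for name, t in reference_texts.items()}
    vocabulary = set().union(*freqs.values())
    scores = {}
    for word in vocabulary:
        total = sum(f.get(word, 0.0) for f in freqs.values())
        # f[word] / total is the probability of reading reference text `name`
        # given this word; the word score is the probability-weighted
        # average of the known reference positions.
        scores[word] = sum(
            (f.get(word, 0.0) / total) * reference_positions[name]
            for name, f in freqs.items()
        )
    return scores


def score_virgin_text(text, scores):
    """Return (estimate, standard_error) for a text of unknown position."""
    # Only words that received a score from the reference texts are counted.
    counts = Counter(w for w in text.lower().split() if w in scores)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("virgin text shares no words with the reference texts")
    estimate = sum(c / n * scores[w] for w, c in counts.items())
    variance = sum(c / n * (scores[w] - estimate) ** 2 for w, c in counts.items())
    return estimate, math.sqrt(variance / n)


# Toy data, invented purely for illustration.
references = {"left": "tax spend spend welfare", "right": "tax cut cut market"}
positions = {"left": -1.0, "right": 1.0}

scores = word_scores(references, positions)
estimate, std_error = score_virgin_text("cut cut market tax", scores)
print(estimate, std_error)
```

Nothing in the sketch depends on what the words mean, which is the sense in which the approach is "language-blind": the only inputs are word counts and the positions assigned to the reference texts. Words in a virgin text that never occur in a reference text are simply ignored here.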

Extracting Policy Positions from Political Texts Using Words as Data

May 2003 | MICHAEL LAVER, KENNETH BENOIT, and JOHN GARRY