Using Latent Semantic Analysis to assess knowledge: Some technical considerations

Bob Rehder, M. E. Schreiner, Michael B. W. Wolfe, Darrell Laham, Thomas K. Landauer, and Walter Kintsch
This paper explores the use of Latent Semantic Analysis (LSA) to assess student knowledge, building on previous research that demonstrated how LSA can be used to grade essays and match students with appropriate instructional texts. The study investigates several technical considerations that affect how well LSA measures knowledge.
The first consideration is the role of technical vocabulary. The results show that technical and non-technical words in essays contribute equally to predicting student knowledge, so separating essays into technical and non-technical terms provides no additional insight. Instead, a list of technical terms generated by students may be as effective as a written essay.
The second consideration is essay length. Essay length did not correlate significantly with pre-questionnaire scores, but longer essays (up to 200 words) were more predictive of knowledge. Essays of around 200 words therefore offer a reasonable compromise between length and accuracy.
The third consideration is the choice of LSA-derived measure. The study explores alternatives to the cosine, namely the dot product, Euclidean distance, and vector length. The dot product is a strong predictor of knowledge, but it adds no predictive value beyond the cosine measure and vector length.
The fourth consideration is the directionality problem in high-dimensional spaces. The problem arises because the cosine measure does not indicate whether an essay reflects more or less knowledge than the instructional text. The study proposes using multidimensional scaling (MDS) to overcome this issue. Three methods are presented, with Method 3 being the most effective in distinguishing between high- and low-knowledge individuals.
The study concludes that LSA can be a useful tool for assessing student knowledge, but further research is needed to address remaining questions.
The findings suggest that the cosine measure between an essay and an instructional text is a strong predictor of knowledge, and that the length of the essay vector also contributes to this prediction. The directionality problem can be mitigated using MDS, and the effectiveness of LSA in different domains requires further investigation.
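The measures named in the summary (cosine, dot product, Euclidean distance, and vector length) can be illustrated with a minimal sketch. The vectors below are hypothetical three-dimensional stand-ins, not data from the study; real LSA vectors have many more dimensions, and the function names are illustrative only. The sketch also shows the identity dot = cosine × length × length, which is one way to see why the dot product would carry no information beyond the cosine and the vector lengths.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def vector_length(a):
    """Euclidean norm (length) of a vector."""
    return math.sqrt(dot(a, a))

def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / (vector_length(a) * vector_length(b))

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical vectors standing in for an essay and an instructional text.
essay = [3.0, 4.0, 0.0]
text = [4.0, 3.0, 0.0]

cos_value = cosine(essay, text)
dot_value = dot(essay, text)

print(f"cosine:             {cos_value:.2f}")
print(f"dot product:        {dot_value:.2f}")
print(f"essay length:       {vector_length(essay):.2f}")
print(f"Euclidean distance: {euclidean_distance(essay, text):.2f}")

# The dot product factors into the cosine and the two vector lengths.
reconstructed = cos_value * vector_length(essay) * vector_length(text)
print(f"cosine * lengths:   {reconstructed:.2f}")

# The cosine is symmetric, so it cannot say which of the two vectors
# reflects more knowledge; this is the directionality problem.
print(cosine(essay, text) == cosine(text, essay))
```

Because the cosine depends only on the angle between the vectors, scaling the essay vector changes its length and dot product but leaves the cosine unchanged, which is consistent with treating vector length as a separate predictor.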