A Structural Probe for Finding Syntax in Word Representations


NAACL 2019 (June 2-7, 2019) | John Hewitt, Christopher D. Manning
This paper introduces a structural probe for evaluating whether syntax trees are embedded in the word representation spaces of deep models such as ELMo and BERT. The probe tests whether a single linear transformation of the word representations can encode the distance between two words in a parse tree as squared L2 distance, and the depth of a word in the tree as squared L2 norm. A separate probe is trained for each model, and the results show that ELMo and BERT embed parse trees with high consistency, unlike baseline representations. This suggests that entire syntax trees are implicitly embedded in the vector geometry of these models, even though the models were never explicitly trained on trees. The probe also reveals that the required linear transformation can be low-rank, indicating that only a small portion of the representation space is needed to encode syntax. The study provides insight into how deep models represent syntax and highlights the potential of structural probes for analyzing linguistic knowledge in neural models.
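
To make the probe's objective concrete, here is a minimal sketch of the distance probe in PyTorch. This is an illustrative reconstruction from the paper's description, not the authors' released code: the names StructuralProbe, probe_rank, and distance_loss are assumptions introduced here.

import torch
import torch.nn as nn

class StructuralProbe(nn.Module):
    """Distance probe: predicts squared parse-tree distance for word pairs."""
    def __init__(self, model_dim, probe_rank):
        super().__init__()
        # B is the (possibly low-rank) linear transformation the probe learns.
        self.B = nn.Parameter(0.05 * torch.randn(model_dim, probe_rank))

    def forward(self, h):
        # h: (seq_len, model_dim) contextual word vectors for one sentence.
        transformed = h @ self.B                        # (seq_len, rank)
        diffs = transformed.unsqueeze(1) - transformed.unsqueeze(0)
        # Squared L2 distance between every pair of transformed vectors:
        # d_B(h_i, h_j)^2 = ||B^T (h_i - h_j)||^2
        return (diffs ** 2).sum(dim=-1)                 # (seq_len, seq_len)

def distance_loss(predicted, gold_tree_distances):
    # L1 loss between predicted squared distances and gold parse-tree
    # distances, normalized by the squared sentence length.
    n = gold_tree_distances.size(0)
    return torch.abs(predicted - gold_tree_distances).sum() / (n ** 2)

The depth probe described in the paper has the same form but scores single words with the squared norm ||B h_i||^2 rather than pairwise differences; both probes are trained by gradient descent against gold parse trees from a treebank.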