This thesis addresses the problem of composing meanings within distributional models of semantics to form representations of multi-word structures. Distributional models of semantics have proven effective in cognitive modeling and in practical applications, such as modeling semantic similarity and association. However, the representation of larger constructions, such as phrases and sentences, has received less attention. Composing word meanings into representations of such structures is therefore crucial for modeling natural language data, which typically consists of complex constructions rather than isolated words.
The thesis introduces a framework for the composition of distributional representations that subsumes existing proposals, such as addition and tensor products, while also allowing for novel composition functions. These models are evaluated on three empirical tasks: modeling similarity judgements for short phrases, enhancing n-gram language models with semantic dependencies, and analyzing reading times from an eye-movement study. The results show that performance depends on both the task and the underlying representation, with additive functions performing better for topic-based models and multiplicative functions for co-occurrence-based models.
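To make the contrast concrete, the following is a minimal, purely illustrative sketch (not code from the thesis) of additive and multiplicative composition as pointwise operations on word vectors; the vectors, words, and helper functions are hypothetical.

```python
import numpy as np

# Hypothetical distributional vectors for two words in a short phrase;
# real models would derive these from corpus co-occurrence counts or a topic model.
u = np.array([1.0, 0.0, 3.0, 2.0])
v = np.array([2.0, 1.0, 0.0, 4.0])

def additive(u, v):
    # Additive composition: sum the component values of the two word vectors.
    return u + v

def multiplicative(u, v):
    # Multiplicative composition: element-wise product, which emphasises
    # dimensions on which both words have non-zero weight.
    return u * v

def cosine(a, b):
    # Cosine similarity, commonly used to compare composed phrase vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

p_add = additive(u, v)        # array([3., 1., 3., 6.])
p_mul = multiplicative(u, v)  # array([2., 0., 0., 8.])
print(cosine(p_add, p_mul))
```

A tensor-product composition would instead form the outer product of the two vectors (np.outer(u, v)), producing a higher-dimensional representation; the framework described above is intended to cover such alternatives within a single formulation.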
The thesis also develops a language model that integrates lexical, syntactic, and semantic dependencies using compositional representations. This model is evaluated on tasks such as predicting similarity judgements and analyzing reading times. The results indicate that compositional models generally yield better performance than non-compositional models. The thesis concludes that compositional models provide a more accurate representation of semantic structures and that further research is needed to refine these models and explore their applications in natural language processing.