How Much Knowledge Can You Pack Into the Parameters of a Language Model?

November 16-20, 2020 | Adam Roberts, Colin Raffel, Noam Shazeer
This paper explores how much knowledge can be implicitly stored in the parameters of a language model. The authors show that large, pre-trained language models can answer questions without access to any external knowledge source, performing competitively with systems that explicitly retrieve information from external corpora. They evaluate this "closed-book" approach on open-domain question answering benchmarks (Natural Questions, WebQuestions, and TriviaQA) by fine-tuning T5 models pre-trained on a large text corpus, and find that model size significantly impacts performance: larger models store more knowledge in their parameters and answer more questions correctly.

The results suggest that language models can effectively answer questions without external knowledge, pointing to a new way of building question answering systems. The authors also note limitations of current evaluation methods, which may underestimate the performance of closed-book systems. The study contributes to our understanding of how language models can be used for knowledge-intensive tasks and highlights the importance of efficient models for practical applications.