31 Jan 2024 | Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani
This study explores the parallels between large language models (LLMs) and human neural processing, particularly in language comprehension. The authors investigate what drives the alignment of LLMs with the brain's language-processing mechanisms, focusing on high-performance LLMs of similar parameter size. They find that as LLMs achieve higher performance on benchmark tasks, they become more brain-like: their embeddings better predict neural responses, and their hierarchical feature-extraction pathways map onto the brain's while using fewer layers. Comparing the feature-extraction pathways of different LLMs, the study also shows that high-performing models have converged toward similar hierarchical processing mechanisms. Furthermore, it highlights the importance of contextual information for both model performance and brain similarity. The findings reveal converging aspects of language processing in the brain and large language models, offering new directions for developing LLMs that align more closely with human cognitive processing.
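Brain-alignment analyses of this kind typically fit a linear (ridge) regression from a model layer's embeddings to recorded neural responses and score the held-out correlation per electrode. Below is a minimal sketch of that general approach using synthetic data; all shapes, the noise level, and the regularization strength are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 word-level LLM embeddings (dim 64) and
# responses from 10 simulated electrodes that are a noisy linear
# function of the embeddings. All sizes here are illustrative only.
n, d, e = 200, 64, 10
X = rng.standard_normal((n, d))
W_true = rng.standard_normal((d, e))
Y = X @ W_true + 0.5 * rng.standard_normal((n, e))

# Simple train/test split.
X_tr, X_te = X[:150], X[150:]
Y_tr, Y_te = Y[:150], Y[150:]

# Closed-form ridge regression: W = (X'X + alpha*I)^-1 X'Y
alpha = 1.0
W = np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(d), X_tr.T @ Y_tr)
Y_hat = X_te @ W

# "Brain score": per-electrode Pearson correlation on held-out data.
scores = [np.corrcoef(Y_te[:, i], Y_hat[:, i])[0, 1] for i in range(e)]
print(round(float(np.mean(scores)), 3))
```

In studies like this one, the key comparison is how such held-out prediction scores vary across model layers and across models of different benchmark performance.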