Locating and Editing Factual Associations in GPT

13 Jan 2023 | Kevin Meng*, David Bau*, Alex Andonian, Yonatan Belinkov†
This paper investigates how factual associations are stored and recalled in autoregressive transformer language models, specifically GPT. The authors find evidence that these associations correspond to localized, directly editable computations within the model. Using a causal intervention method that identifies the neuron activations decisive for factual predictions, they show that middle-layer feed-forward modules play a key role in mediating factual recall.

To test this hypothesis, they introduce Rank-One Model Editing (ROME), which modifies feed-forward weights to update specific factual associations. ROME is effective on a standard zero-shot relation extraction task and performs well on a new dataset of difficult counterfactual assertions, maintaining both specificity and generalization.

The results suggest that mid-layer feed-forward modules are important for storing factual associations, and that direct manipulation of these computational mechanisms is a feasible approach to model editing. The paper also discusses ROME's limitations, including its inability to edit many facts simultaneously and its potential misuse for inserting malicious information. The authors conclude that their findings provide insight into how facts are stored and recalled in large language models, and that ROME is a simple and principled method for model editing. The code, dataset, visualizations, and an interactive demo notebook are available at https://rome.baulab.info/.
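To make the rank-one editing idea concrete, the sketch below shows how a single feed-forward weight matrix can be updated in closed form so that a chosen key vector maps to a new value vector. This is a minimal illustration under simplifying assumptions, not the authors' implementation: the names W, k, and v_target are hypothetical, and full ROME derives the key from the subject's token representations and scales the update by an inverse second-moment statistic of observed keys so that unrelated associations are disturbed as little as possible.

    import torch

    def rank_one_edit(W: torch.Tensor, k: torch.Tensor, v_target: torch.Tensor) -> torch.Tensor:
        # W:        (d_out, d_in) feed-forward projection weight to edit
        # k:        (d_in,) key vector selecting the association (hypothetical)
        # v_target: (d_out,) value vector encoding the new fact (hypothetical)
        residual = v_target - W @ k                  # correction needed at key k
        update = torch.outer(residual, k) / (k @ k)  # rank-one outer product
        return W + update                            # now W' @ k == v_target

    # Usage: edit a random weight matrix and verify the new association holds.
    torch.manual_seed(0)
    W = torch.randn(8, 4)
    k = torch.randn(4)
    v_target = torch.randn(8)
    W_edited = rank_one_edit(W, k, v_target)
    assert torch.allclose(W_edited @ k, v_target, atol=1e-5)

Note that any input x is perturbed in proportion to its inner product with k, which is why the full method whitens keys with a covariance statistic before applying the update, limiting interference with other stored associations.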