2024 | Hanlin Zhu, Baihe Huang, Shaolun Zhang, Michael Jordan, Jiantao Jiao, Yuandong Tian, Stuart Russell
This paper investigates the "reversal curse" in auto-regressive large language models (LLMs): the phenomenon where LLMs trained on "A → B" fail to infer the reversed statement "B → A", even though the two are semantically equivalent. The authors analyze this issue through the training dynamics of (stochastic) gradient descent for two auto-regressive models: a bilinear model and one-layer transformers. Their analysis reveals that the reversal curse arises from asymmetry in the model weights: increasing the weights from token A to token B does not necessarily increase the weights from B to A. This asymmetry is caused by the training dynamics under the cross-entropy (CE) loss and by the structure of the model's parameter space.
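The weight-asymmetry argument can be illustrated with a minimal sketch (not the paper's actual construction): a bilinear next-token model with one-hot embeddings, where the logit for "token j follows token i" reduces to a single matrix entry W[i, j]. Gradient descent on the CE loss for the pair (A, B) only touches row W[A, :], so the reverse direction P(A | B) is never learned:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy vocabulary {A=0, B=1, C=2} with one-hot embeddings, so the
# bilinear logit for "token j follows token i" is simply W[i, j].
V = 3
A, B = 0, 1
W = np.zeros((V, V))

lr = 1.0
for _ in range(100):
    # Cross-entropy gradient for the training pair (A -> B):
    # dL/dW[A, :] = softmax(W[A, :]) - onehot(B).
    grad = softmax(W[A]) - np.eye(V)[B]
    W[A] -= lr * grad

p_forward = softmax(W[A])[B]  # P(B | A): driven toward 1 by training
p_reverse = softmax(W[B])[A]  # P(A | B): row W[B, :] was never updated
print(p_forward, p_reverse)
```

After training, `p_forward` is close to 1 while `p_reverse` stays at the uniform value 1/3: the gradient of the CE loss on "A → B" has no component that increases the B-to-A weight, which is the asymmetry the paper formalizes.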
The study also shows that the analysis extends to other logical reasoning settings, in particular chain-of-thought (CoT) reasoning, where the model generates intermediate reasoning steps. The authors demonstrate that without CoT, models trained on "A → B" and "B → C" struggle to directly infer "A → C", even though the conclusion is logically valid. This highlights the importance of CoT in enabling multi-step logical reasoning.
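The same one-hot bilinear toy (an illustrative sketch, not the paper's construction) makes the CoT claim concrete: training on the pairs (A, B) and (B, C) never increases the direct A-to-C weight, so one-shot inference of "A → C" fails, while chaining two greedy single-step predictions succeeds:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy vocabulary {A=0, B=1, C=2}; with one-hot embeddings the bilinear
# next-token logit for "token j follows token i" is W[i, j].
V = 3
A, B, C = 0, 1, 2
W = np.zeros((V, V))

lr = 1.0
for _ in range(100):
    # CE gradient updates for the two training facts A -> B and B -> C.
    W[A] -= lr * (softmax(W[A]) - np.eye(V)[B])
    W[B] -= lr * (softmax(W[B]) - np.eye(V)[C])

direct = softmax(W[A])[C]           # P(C | A): W[A, C] is only pushed down
step1 = softmax(W[A]).argmax()      # greedy intermediate step: A -> B
step2 = softmax(W[step1]).argmax()  # greedy second step: B -> C
print(direct, step2 == C)
```

Direct inference P(C | A) ends up below the uniform baseline, because CE training on "A → B" actively suppresses the logit for C after A, while the two-step (CoT-style) rollout reaches C through the intermediate token B.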
The paper also validates these findings through experiments on multi-layer transformers, showing that the reversal curse persists even in these models. The results suggest that the asymmetry in model weights, caused by the CE loss, limits the ability of LLMs to automatically deduce indirect conclusions. This underscores the importance of in-context learning (ICL), data augmentation, and planning for LLMs to effectively solve complex reasoning tasks. The study provides a new theoretical perspective on the reversal curse and its implications for the design and training of LLMs.