Nonparametric Modern Hopfield Models


Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu
The paper presents a nonparametric framework for modern Hopfield models, which are deep learning-compatible associative memory models. The key contribution is to interpret the memory storage and retrieval processes as a nonparametric regression problem subject to a set of query-memory pairs. This framework not only recovers the known results for the original dense modern Hopfield model but also yields an efficient sparse-structured modern Hopfield model with sub-quadratic complexity. The sparse model inherits the theoretical properties of its dense counterpart, including a connection to transformer attention, fixed-point convergence, and exponential memory capacity. The authors give rigorous characterizations of the sparsity-induced advantages, such as tighter retrieval error bounds and enhanced noise robustness. They also construct a family of modern Hopfield models connected to various attention variants, including linear, random masked, top-$K$, and positive random feature models. Empirical validation is provided through synthetic and realistic experiments.
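For context, the dense update that the framework recovers is the standard modern Hopfield retrieval rule $\mathbf{x}^{\text{new}} = \Xi\,\mathrm{Softmax}(\beta\,\Xi^\top \mathbf{x})$, where $\Xi = [\xi_1, \dots, \xi_M]$ stores the $M$ memory patterns and $\beta$ is an inverse temperature (Ramsauer et al., 2021). The NumPy sketch below contrasts this dense update with a top-$K$ sparse variant in the spirit of the paper's sparse-structured model; the function names, the top-$K$ choice, and the candidate-selection step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dense_hopfield_retrieve(query, memories, beta=1.0):
    """Dense modern Hopfield update: x_new = Xi @ softmax(beta * Xi^T x).

    memories: (d, M) matrix of M stored patterns; query: (d,) vector.
    Scoring against all M patterns costs O(M d) per query, i.e. quadratic
    in sequence length when the queries are the memories themselves.
    """
    scores = beta * memories.T @ query            # (M,) similarity scores
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights /= weights.sum()
    return memories @ weights                     # (d,) retrieved pattern

def topk_hopfield_retrieve(query, memories, beta=1.0, k=8):
    """Illustrative sparse variant: normalize over only the top-K scores.

    Zeroing all but K entries mimics a sparsity-inducing separation map;
    paired with a candidate index that avoids scoring all M patterns,
    this is the kind of structure that yields sub-quadratic retrieval.
    Assumes k <= M.
    """
    scores = beta * memories.T @ query
    idx = np.argpartition(scores, -k)[-k:]        # indices of top-K patterns
    w = np.exp(scores[idx] - scores[idx].max())   # softmax on the support
    w /= w.sum()
    return memories[:, idx] @ w

# Toy usage: retrieve a stored pattern from a noisy query.
rng = np.random.default_rng(0)
Xi = rng.standard_normal((64, 100))               # 100 patterns in R^64
x = Xi[:, 0] + 0.1 * rng.standard_normal(64)      # noisy version of pattern 0
dense = dense_hopfield_retrieve(x, Xi, beta=4.0)
sparse = topk_hopfield_retrieve(x, Xi, beta=4.0, k=8)
```

As the paper's error bounds suggest, when the query lies near a well-separated stored pattern, the top-$K$ support concentrates on that pattern, so the sparse update retrieves it at least as accurately as the dense one while touching fewer memories.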