This paper introduces hypernetworks, an approach in which a smaller network (the hypernetwork) generates the weights of a larger main network. Hypernetworks provide an abstraction reminiscent of the relationship between genotype and phenotype in biology. Unlike HyperNEAT, which is an evolutionary approach, hypernetworks are trained end-to-end with backpropagation, making them more efficient. The focus of this work is on applying hypernetworks to deep convolutional networks and long recurrent networks, where they act as a relaxed form of weight sharing across layers. The main result is that hypernetworks can generate non-shared weights for LSTMs and achieve near state-of-the-art results on sequence modeling tasks, including character-level language modeling, handwriting generation, and neural machine translation. Applied to convolutional networks, hypernetworks also achieve competitive results on image recognition with fewer learnable parameters. The paper further explores dynamic hypernetworks for recurrent networks, in which the generated weights can vary across time steps. Across both the static and dynamic settings, experiments show results that are competitive with or better than state-of-the-art models on these tasks, demonstrating that hypernetworks offer a flexible and efficient approach to generating the weights of neural networks. A minimal code sketch of the static case follows below.
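
To make the idea concrete, below is a minimal sketch of a static hypernetwork in PyTorch: a small learnable layer embedding (the "genotype") is mapped by a linear hypernetwork to the kernel of a convolutional layer (the "phenotype"), and everything is trained end-to-end with backpropagation. The class name StaticHyperConv, the embedding size, and the single-linear-layer hypernetwork are illustrative assumptions for this sketch, not the paper's exact implementation; in the dynamic (recurrent) variant, the embedding would instead be produced per time step by an auxiliary RNN.

    # Sketch of a static hypernetwork generating conv weights (assumed names).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class StaticHyperConv(nn.Module):
        """Generates the kernel of a 2-D conv layer from a small layer embedding."""
        def __init__(self, in_ch, out_ch, k, z_dim=64):
            super().__init__()
            # Learnable per-layer embedding (the "genotype"); one per main-network layer.
            self.z = nn.Parameter(torch.randn(z_dim))
            # Small hypernetwork: maps the embedding to a full conv kernel.
            self.hyper = nn.Linear(z_dim, out_ch * in_ch * k * k)
            self.kernel_shape = (out_ch, in_ch, k, k)

        def forward(self, x):
            # Generate the weights on the fly, then apply an ordinary convolution.
            w = self.hyper(self.z).view(self.kernel_shape)
            return F.conv2d(x, w, padding=self.kernel_shape[-1] // 2)

    # Usage: behaves like a normal conv layer, but its learnable parameters are
    # the embedding plus the (shared) hypernetwork rather than the kernel itself.
    layer = StaticHyperConv(in_ch=3, out_ch=16, k=3)
    y = layer(torch.randn(1, 3, 32, 32))

Because each layer only contributes a small embedding while the hypernetwork's projection can be shared, this is the sense in which hypernetworks act as a relaxed form of weight sharing across layers and can reduce the number of learnable parameters.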