This paper introduces hypernetworks, an approach in which a smaller network (the hypernetwork) generates the weights of a larger main network. Hypernetworks provide an abstraction reminiscent of the relationship between genotype and phenotype in biology. Unlike HyperNEAT, which is an evolutionary approach, hypernetworks are trained end-to-end with backpropagation, making them more efficient. The focus of this work is on applying hypernetworks to deep convolutional networks and long recurrent networks, where they act as a relaxed form of weight sharing across layers. The main result is that hypernetworks can generate non-shared weights for LSTMs and achieve near state-of-the-art results on sequence modeling tasks, including character-level language modeling, handwriting generation, and neural machine translation. Applied to convolutional networks, hypernetworks also achieve competitive results on image recognition with fewer learnable parameters. The paper further explores dynamic hypernetworks for recurrent networks, in which the generated weights can vary across time steps. Across both the static and dynamic settings, experiments show results that are competitive with or better than state-of-the-art models on these tasks, demonstrating that hypernetworks offer a flexible and efficient approach to generating the weights of neural networks. A minimal code sketch of the static case follows below.
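
To make the idea concrete, below is a minimal sketch of a static hypernetwork in PyTorch: a small learnable layer embedding (the "genotype") is mapped by a linear hypernetwork to the kernel of a convolutional layer (the "phenotype"), and everything is trained end-to-end with backpropagation. The class name StaticHyperConv, the embedding size, and the single-linear-layer hypernetwork are illustrative assumptions for this sketch, not the paper's exact implementation; in the dynamic (recurrent) variant, the embedding would instead be produced per time step by an auxiliary RNN.

    # Sketch of a static hypernetwork generating conv weights (assumed names).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class StaticHyperConv(nn.Module):
        """Generates the kernel of a 2-D conv layer from a small layer embedding."""
        def __init__(self, in_ch, out_ch, k, z_dim=64):
            super().__init__()
            # Learnable per-layer embedding (the "genotype"); one per main-network layer.
            self.z = nn.Parameter(torch.randn(z_dim))
            # Small hypernetwork: maps the embedding to a full conv kernel.
            self.hyper = nn.Linear(z_dim, out_ch * in_ch * k * k)
            self.kernel_shape = (out_ch, in_ch, k, k)

        def forward(self, x):
            # Generate the weights on the fly, then apply an ordinary convolution.
            w = self.hyper(self.z).view(self.kernel_shape)
            return F.conv2d(x, w, padding=self.kernel_shape[-1] // 2)

    # Usage: behaves like a normal conv layer, but its learnable parameters are
    # the embedding plus the (shared) hypernetwork rather than the kernel itself.
    layer = StaticHyperConv(in_ch=3, out_ch=16, k=3)
    y = layer(torch.randn(1, 3, 32, 32))

Because each layer only contributes a small embedding while the hypernetwork's projection can be shared, this is the sense in which hypernetworks act as a relaxed form of weight sharing across layers and can reduce the number of learnable parameters.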