Understanding Designing Neural Network Architectures using Reinforcement Learning

The paper introduces MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing Convolutional Neural Network (CNN) architectures for specific learning tasks. The learning agent is trained to sequentially choose CNN layers using Q-learning with an ε-greedy exploration strategy and experience replay. The agent explores a large but finite space of possible architectures and iteratively discovers designs with improved performance on the learning task.
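
To make the layer-selection loop described above concrete, here is a minimal Python sketch of tabular Q-learning with an ε-greedy policy and experience replay over a toy architecture space. It is an illustration under stated assumptions, not the authors' implementation: the three-layer vocabulary, the depth limit, the ε schedule, the learning rate, the initial Q-value of 0.5, the replay batch size, and especially the `proxy_accuracy` function (a hypothetical stand-in for training each sampled CNN and measuring its validation accuracy) are all invented for this example.

```python
import random

LAYERS = ["conv", "pool", "fc"]   # toy layer vocabulary (assumption)
TERMINATE = "softmax"             # terminal action that ends an architecture
MAX_DEPTH = 4                     # toy depth limit keeps the search space finite
Q_INIT = 0.5                      # assumed initial Q-value for unseen pairs


def actions(state):
    """Actions allowed in a state, where state = (depth, last_layer)."""
    depth, _ = state
    return [TERMINATE] if depth >= MAX_DEPTH else LAYERS + [TERMINATE]


def proxy_accuracy(arch):
    """Hypothetical reward; a real run would train the CNN and validate it."""
    score = 0.5 + 0.1 * arch.count("conv") + 0.05 * arch.count("pool")
    if "fc" in arch and arch[-1] != "fc":
        score -= 0.2  # toy penalty: an fc layer is present but is not last
    return max(0.0, min(1.0, score))


def sample_architecture(q, epsilon, rng):
    """Choose layers one at a time with an epsilon-greedy policy."""
    state, arch = (0, "input"), []
    while True:
        acts = actions(state)
        if rng.random() < epsilon:
            action = rng.choice(acts)  # explore
        else:
            action = max(acts, key=lambda a: q.get((state, a), Q_INIT))  # exploit
        if action == TERMINATE:
            return arch
        arch.append(action)
        state = (state[0] + 1, action)


def update_q(q, arch, reward, alpha=0.1, gamma=1.0):
    """Q-learning update over one stored trajectory, last transition first."""
    states = [(0, "input")] + [(i + 1, layer) for i, layer in enumerate(arch)]
    steps = list(zip(states, arch + [TERMINATE]))
    # The terminal transition receives the (proxy) accuracy as its reward.
    s, a = steps[-1]
    q[(s, a)] = (1 - alpha) * q.get((s, a), Q_INIT) + alpha * reward
    # Earlier transitions get zero reward and bootstrap from the next state.
    for i in range(len(steps) - 2, -1, -1):
        s, a = steps[i]
        nxt = states[i + 1]
        best_next = max(q.get((nxt, b), Q_INIT) for b in actions(nxt))
        q[(s, a)] = (1 - alpha) * q.get((s, a), Q_INIT) + alpha * gamma * best_next


def run(episodes_per_eps=50, seed=0):
    rng = random.Random(seed)
    q, replay = {}, []
    for epsilon in (1.0, 0.5, 0.1):  # assumed schedule: explore, then exploit
        for _ in range(episodes_per_eps):
            arch = sample_architecture(q, epsilon, rng)
            replay.append((arch, proxy_accuracy(arch)))
            # Experience replay: revisit a random batch of past architectures.
            for past_arch, past_reward in rng.sample(replay, min(len(replay), 8)):
                update_q(q, past_arch, past_reward)
    best = max(replay, key=lambda item: item[1])
    return q, replay, best


if __name__ == "__main__":
    _, _, (best_arch, best_reward) = run()
    print(best_arch, best_reward)
```

Storing every sampled architecture together with its reward lets the agent reuse each costly evaluation many times, which is the motivation for experience replay when every reward requires training a network.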
On image classification benchmarks, the agent-designed networks, consisting of only standard convolution, pooling, and fully-connected layers, outperform existing networks designed with the same layer types and are competitive against state-of-the-art methods that use more complex layer types. The paper also demonstrates that MetaQNN outperforms existing meta-modeling approaches for network design on image classification tasks. The method is validated on three standard image classification datasets: CIFAR-10, SVHN, and MNIST. The top network designs discovered by the agent on one dataset are also competitive when trained on other datasets, indicating their suitability for transfer learning tasks. Additionally, the method can generate multiple well-performing network designs, which can be ensembled to further boost prediction performance.
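
The ensembling step mentioned above can be illustrated with a short Python sketch. The summary does not say how the predictions are combined, so averaging the class probabilities of several models is an assumption here (one common scheme), and the function name `ensemble_predict` and the probability values are made up for illustration.

```python
def ensemble_predict(prob_lists):
    """Average class probabilities across models and pick the top class.

    prob_lists[m][i] is model m's class-probability vector for example i.
    """
    n_models = len(prob_lists)
    n_examples = len(prob_lists[0])
    predictions = []
    for i in range(n_examples):
        n_classes = len(prob_lists[0][i])
        avg = [
            sum(model[i][c] for model in prob_lists) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Made-up outputs of three models on two examples with two classes each.
model_a = [[0.6, 0.4], [0.3, 0.7]]
model_b = [[0.4, 0.6], [0.2, 0.8]]
model_c = [[0.7, 0.3], [0.6, 0.4]]
ensemble = ensemble_predict([model_a, model_b, model_c])
```

In this toy input the models disagree on both examples (`model_b` on the first, `model_c` on the second), and the averaged probabilities follow the majority in each case.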

DESIGNING NEURAL NETWORK ARCHITECTURES USING REINFORCEMENT LEARNING

22 Mar 2017 | Bowen Baker, Otkrist Gupta, Nikhil Naik & Ramesh Raskar