4 Mar 2014 | Min Lin, Qiang Chen, Shuicheng Yan
The paper introduces a novel deep network structure called "Network In Network" (NIN) to enhance model discriminability for local patches within the receptive field. A traditional convolutional layer applies a linear filter followed by a nonlinear activation; NIN replaces the linear filter with a micro neural network, a more expressive structure that can better abstract the data within each local patch. The micro network is instantiated as a multilayer perceptron (MLP), a universal function approximator, and feature maps are produced by sliding it over the input in the same way a CNN slides its filters. Stacking these "mlpconv" layers yields the full deep network.
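As the paper notes, sliding an MLP over the input is equivalent to a standard convolution followed by 1x1 convolutions, because a 1x1 convolution applies the same fully connected mapping at every spatial position. Below is a minimal sketch of one mlpconv layer; the use of PyTorch, the two-hidden-layer depth, and the channel widths are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class MLPConv(nn.Module):
    """One mlpconv layer: a spatial convolution followed by a per-position
    MLP, realized as 1x1 convolutions with ReLU nonlinearities."""

    def __init__(self, in_channels, mid_channels, out_channels,
                 kernel_size, padding=0):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size, padding=padding),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, kernel_size=1),  # MLP hidden layer
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, kernel_size=1),  # MLP output layer
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```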
NIN further improves classification by replacing the traditional fully connected classifier layers with global average pooling over the final feature maps. This is easier to interpret, since each final feature map corresponds directly to a category, and less prone to overfitting, since it adds no parameters. The paper demonstrates that NIN achieves state-of-the-art classification performance on CIFAR-10 and CIFAR-100, and reasonable performance on SVHN and MNIST.
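Concretely, the last mlpconv layer emits one feature map per class; each map is averaged over its spatial extent and the resulting vector is fed directly to softmax. A short sketch of this head, under the same assumed framework as above:

```python
import torch.nn.functional as F

def gap_head(feature_maps):
    """Global average pooling head.

    feature_maps: tensor of shape (batch, num_classes, H, W), one map per class.
    Averaging each map spatially yields the class logits directly, with no
    fully connected parameters to overfit.
    """
    logits = feature_maps.mean(dim=(2, 3))  # average each map over H x W
    return F.softmax(logits, dim=1)         # class probabilities
```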
The paper also compares NIN with maxout networks, arguing that the MLP micro network is a more general function approximator than the piecewise-linear maxout unit and therefore abstracts local patches better. It frames global average pooling as a structural regularizer that enforces correspondence between feature maps and categories, reducing overfitting and improving generalization. Experiments show NIN outperforming previous methods on the datasets above, and that applying dropout between mlpconv layers further improves performance; a sketch of the full network follows.
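Putting the pieces together, a NIN for CIFAR-10-scale inputs stacks three mlpconv blocks with dropout between them and ends in the global average pooling head. This sketch reuses the MLPConv and gap_head definitions above; all kernel sizes, channel widths, and the dropout rate are illustrative assumptions rather than the paper's exact hyperparameters.

```python
class NIN(nn.Module):
    """Three mlpconv blocks with dropout in between, then global average
    pooling in place of fully connected classifier layers."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            MLPConv(3, 96, 96, kernel_size=5, padding=2),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
            nn.Dropout(0.5),  # dropout between mlpconv layers
            MLPConv(96, 192, 192, kernel_size=5, padding=2),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
            nn.Dropout(0.5),
            # Final mlpconv emits one feature map per class for the GAP head.
            MLPConv(192, 192, num_classes, kernel_size=3, padding=1),
        )

    def forward(self, x):                # x: (batch, 3, H, W)
        maps = self.features(x)          # (batch, num_classes, h, w)
        return gap_head(maps)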
The paper concludes that NIN is a promising architecture for classification tasks, with the added ability to generate per-category confidence maps through global average pooling. This makes NIN a valuable tool for object detection and scene labeling.
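Because each pre-pooling feature map corresponds to one category, the maps themselves can be read as coarse spatial confidence maps. A brief usage example with the hypothetical model sketched above:

```python
model = NIN(num_classes=10)
x = torch.randn(1, 3, 32, 32)     # one CIFAR-10-sized image
conf_maps = model.features(x)     # (1, 10, h, w): one confidence map per class
probs = gap_head(conf_maps)       # class probabilities via global average pooling
```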