16 Apr 2019 | Golnaz Ghiasi, Tsung-Yi Lin, Ruoming Pang, Quoc V. Le
This paper proposes NAS-FPN, a scalable feature pyramid architecture for object detection discovered through neural architecture search (NAS). The discovered architecture combines top-down and bottom-up connections to fuse features across scales, achieving a better accuracy-latency tradeoff than state-of-the-art models. NAS-FPN works well with a variety of backbones, including MobileNet, ResNet, and AmoebaNet. Combined with MobileNetV2 in the RetinaNet framework, it outperforms SSDLite by 2 AP; with an AmoebaNet-D backbone it reaches 48.3 AP, surpassing Mask R-CNN while requiring less computation. Because the pyramid network is modular, it can be stacked multiple times for better accuracy.
The method uses a modular search space built from "merging cells" to discover efficient cross-scale connections: each cell selects two feature levels, resizes them to a target resolution, and fuses them with a binary operation, and the repeated pyramid structure also enables anytime detection (early exit from intermediate pyramids). A reinforcement learning controller samples architectures from this search space, receives the detection accuracy of each sampled child model as its reward, and learns over time to generate better architectures; the best discovered architecture is then used for object detection. Evaluated on the COCO dataset, NAS-FPN shows significant improvements in detection accuracy and efficiency, demonstrating that it is a flexible and effective architecture for object detection.
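The cross-scale feature fusion at the heart of NAS-FPN can be sketched as a "merging cell": two feature maps from different pyramid levels are resized to a common resolution and combined with a binary operation (sum, or a global-pooling attention gate). The helper names below (`resize_to`, `merge_cell`) are illustrative, not from the paper's code, and the NumPy resizing is a stand-in for the nearest-neighbor upsampling and max-pooling used in practice:

```python
import numpy as np

def resize_to(feat, target_hw):
    """Resize a CHW feature map: nearest-neighbor upsample or max-pool downsample.
    Assumes target dims are integer multiples/divisors of the input dims."""
    c, h, w = feat.shape
    th, tw = target_hw
    if th >= h:  # upsample by repeating pixels (nearest neighbor)
        return feat.repeat(th // h, axis=1).repeat(tw // w, axis=2)
    sy, sx = h // th, w // tw  # downsample by strided max pooling
    return feat[:, :th * sy, :tw * sx].reshape(c, th, sy, tw, sx).max(axis=(2, 4))

def merge_cell(a, b, out_hw, op):
    """One merging cell: bring both inputs to out_hw, then fuse with a binary op."""
    a, b = resize_to(a, out_hw), resize_to(b, out_hw)
    if op == "sum":
        return a + b
    if op == "global_attention":
        # global average pooling on one input produces a per-channel sigmoid
        # gate that scales the other input before summation
        gate = 1.0 / (1.0 + np.exp(-a.mean(axis=(1, 2), keepdims=True)))
        return a + gate * b
    raise ValueError(f"unknown op: {op}")
```

The search space then amounts to choosing, for each cell, which two levels to merge, the output resolution, and which of the two binary ops to apply; merged outputs are pushed back into the candidate pool for later cells.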
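The controller's learning loop can be illustrated with a toy REINFORCE policy over a single discrete architecture choice. This is a deliberately simplified stand-in (the paper's controller is an RNN sampling many decisions per architecture, with detection AP as the reward); `reward_fn` here is a hypothetical proxy for training and evaluating a sampled child model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_and_train_controller(reward_fn, num_choices, steps=200, lr=0.1):
    """Toy REINFORCE loop: a softmax policy over discrete architecture choices,
    nudged toward samples whose reward beats an exponential moving baseline."""
    logits = np.zeros(num_choices)
    baseline = 0.0
    for _ in range(steps):
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        choice = rng.choice(num_choices, p=probs)
        reward = reward_fn(choice)           # e.g. validation AP of the sampled FPN
        advantage = reward - baseline        # center rewards to reduce variance
        baseline = 0.9 * baseline + 0.1 * reward
        grad = -probs                        # ∇ log π(choice) for a softmax policy
        grad[choice] += 1.0
        logits += lr * advantage * grad
    return int(np.argmax(logits))            # best architecture found
```

Over many iterations the policy concentrates probability mass on high-reward choices, mirroring how the controller comes to generate better feature pyramid architectures as the search proceeds.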