4 Jan 2019 | Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell
Deep Layer Aggregation (DLA) is a method for fusing features across the layers of a convolutional neural network (CNN) more deeply than simple skip connections do, so that semantic and spatial information are better combined. The paper introduces two structures: Iterative Deep Aggregation (IDA) and Hierarchical Deep Aggregation (HDA). IDA merges features iteratively, layer by layer, while HDA uses a tree-like structure to combine features from multiple modules and channels. The paper reports that these structures improve accuracy and reduce parameter count compared to existing architectures.
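The difference between the two merge orders can be illustrated symbolically. The sketch below is not the authors' implementation: `node`, `ida`, and `hda_simplified` are hypothetical names, features are represented as plain strings, and the tree version is a simplified balanced merge that stands in for HDA, which is more elaborate in the paper.

```python
from typing import List


def node(*inputs: str) -> str:
    """Stand-in for an aggregation node: records which inputs were merged."""
    return "N(" + ", ".join(inputs) + ")"


def ida(features: List[str]) -> str:
    """Iterative aggregation: merge the first two inputs, then fold in the rest."""
    merged = features[0]
    for feature in features[1:]:
        merged = node(merged, feature)
    return merged


def hda_simplified(features: List[str]) -> str:
    """Simplified tree aggregation (assumption): merge the two halves recursively."""
    if len(features) == 1:
        return features[0]
    mid = len(features) // 2
    return node(hda_simplified(features[:mid]), hda_simplified(features[mid:]))


stages = ["x1", "x2", "x3", "x4"]
print(ida(stages))             # N(N(N(x1, x2), x3), x4)
print(hda_simplified(stages))  # N(N(x1, x2), N(x3, x4))
```

The iterative form yields a chain in which the earliest features pass through the most aggregation nodes, whereas the tree form keeps merge paths short and balanced.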
DLA is evaluated on several tasks: image classification, fine-grained recognition, semantic segmentation, and boundary detection. Experiments show that DLA compares favorably with existing models in accuracy, parameter count, and memory usage. For example, DLA networks outperform ResNet and ResNeXt on classification with fewer parameters, and they perform well on semantic segmentation and boundary detection.
DLA is compatible with different network backbones and can be combined with existing architectures such as ResNet and ResNeXt. It improves performance by aggregating features across depths, resolutions, and scales. Aggregation nodes combine and compress their inputs; IDA focuses on fusing resolutions and scales, while HDA focuses on merging features from multiple modules and channels.
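To make "combine and compress" concrete, here is a minimal sketch of an aggregation node acting on the feature vectors at a single pixel. It assumes the node is a learned linear projection per input (the per-pixel equivalent of a 1x1 convolution) followed by a ReLU; normalization is left out for brevity. The function name, the weights, and the example values are all illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[Vector]


def aggregation_node(inputs: Sequence[Vector],
                     weights: Sequence[Matrix],
                     bias: Vector) -> Vector:
    """Combine several per-pixel feature vectors into one and apply ReLU.

    inputs[i]  : feature vector from layer i (length c_in_i)
    weights[i] : matrix of shape (c_out, c_in_i), acting like a 1x1 convolution
    bias       : vector of length c_out
    Normalization is omitted to keep the sketch short (assumption).
    """
    out = list(bias)
    for x, w in zip(inputs, weights):
        for o, row in enumerate(w):
            out[o] += sum(wi * xi for wi, xi in zip(row, x))
    return [max(0.0, v) for v in out]  # ReLU


# Two 2-channel inputs (4 channels in total) are compressed to one 2-channel output.
shallow = [1.0, 2.0]
deep = [3.0, -4.0]
w_shallow = [[1.0, 0.0], [0.0, 1.0]]  # identity projection
w_deep = [[0.5, 0.0], [0.0, 0.5]]     # halve the deeper features
print(aggregation_node([shallow, deep], [w_shallow, w_deep], [0.0, 0.0]))  # [2.5, 0.0]
```

Because the output width is set by the weight matrices rather than by the number of inputs, the same node can merge any number of layers without growing the channel count.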
In classification tasks, DLA networks outperform ResNet and ResNeXt with fewer parameters. On fine-grained recognition, DLA achieves new state-of-the-art results on the Car, Plane, and Food datasets. For semantic segmentation, DLA performs well on the Cityscapes and CamVid datasets, achieving high mean intersection-over-union (IoU) scores.
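For readers unfamiliar with the segmentation metric, mean IoU averages, over classes, the ratio of correctly labeled pixels to all pixels assigned to that class in either the prediction or the ground truth. The sketch below computes it over flat label lists; the function name and the toy labels are illustrative, and benchmark protocols typically accumulate counts over a whole dataset and may ignore some labels, which this sketch does not model.

```python
from typing import List


def mean_iou(pred: List[int], target: List[int], num_classes: int) -> float:
    """Average per-class IoU over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 6 pixels and 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
# Per-class IoU is 1/3, 2/3, and 1/2, so the mean is approximately 0.5.
print(mean_iou(pred, target, num_classes=3))
```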
In boundary detection, DLA achieves state-of-the-art results on the BSDS and PASCAL Boundaries datasets, outperforming other methods in accuracy and on precision-recall curves. The approach is also efficient: it carries over to compact models with fewer parameters.
Overall, DLA is a general and effective extension to deep visual architectures, improving performance and efficiency in various tasks. The method is flexible, compatible with different network structures, and has shown significant improvements in multiple tasks.