Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning


2018 | Mohammad Sadegh Norouzzadeh, Anh Nguyen, Margaret Kosmala, Alexandra Swanson, Meredith S. Palmer, Craig Packer, and Jeff Clune
This supplementary information provides technical details and results from the study on automatically identifying, counting, and describing wild animals in camera-trap images using deep learning. Images are scaled down to 256x256 pixels and normalized to improve learning; data augmentation techniques such as random cropping, flipping, and brightness modification further improve model accuracy. The models are trained with stochastic gradient descent using momentum and weight decay, and results are evaluated on both expert-labeled and volunteer-labeled test sets.

The study compares a one-stage identification approach with a two-stage pipeline. Although the one-stage approach reduces total model size, it leads to imbalanced data and slightly worse performance: the two-stage pipeline does better on most tasks, including species identification, animal counting, and describing additional attributes. An ensemble of models performs best on the description task, achieving high accuracy and precision. Compared with the previous work of Gomez et al. (2016) on the SS-26 dataset, which contains 26 of the 48 species in the full SS dataset, the current models achieve significantly higher accuracy, especially for rare species.

Transfer learning from ImageNet was tested but did not improve performance when the full SS dataset is available, so those models were trained from scratch. Performance differs little between day and night images. When the models were tested on smaller dataset sizes, however, transfer learning did improve performance, indicating that high accuracy is achievable even with few labeled examples and that a significant portion of the data can be processed automatically.
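The augmentation steps described above (random cropping, flipping, and brightness modification on the 256x256 inputs) can be sketched in NumPy. The 224-pixel crop size and the brightness-jitter range are illustrative assumptions; the summary does not state the values the study used.

```python
import numpy as np

def augment(image, crop=224, rng=None):
    """Randomly crop, flip, and brightness-jitter one image.

    `image` is an HxWxC float array in [0, 1]. The crop size (224)
    and jitter range (0.8-1.2) are assumptions for illustration only.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = image.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    out = image[top:top + crop, left:left + crop]  # random crop
    if rng.random() < 0.5:                         # random horizontal flip
        out = out[:, ::-1]
    factor = rng.uniform(0.8, 1.2)                 # brightness modification
    return np.clip(out * factor, 0.0, 1.0)
```

At training time one such randomized variant would be produced per image per epoch, so the network rarely sees the exact same pixels twice.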
The study also explores the use of prediction averaging and confidence thresholding to improve model performance. The results show that averaging predictions across multiple models improves accuracy, and that confidence thresholding can help identify images the model is confident about, allowing humans to label the rest. The study also addresses the issue of class imbalance, using techniques such as weighted loss, oversampling, and emphasis sampling to improve performance on rare classes. The results show that these methods can significantly improve accuracy for rare species, although they may slightly reduce accuracy for frequent classes. Overall, the study demonstrates the effectiveness of deep learning in automatically identifying, counting, and describing wild animals in camera-trap images.
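The averaging and thresholding steps just described can be sketched as follows. The threshold value and the NumPy representation of the model outputs are illustrative assumptions, not the study's settings.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Prediction averaging: mean of the softmax outputs of several models.

    `prob_list` is a list of (num_images, num_classes) probability arrays,
    one per model in the ensemble.
    """
    return np.mean(prob_list, axis=0)

def split_by_confidence(probs, threshold=0.9):
    """Confidence thresholding: route confident images to the model,
    the rest to human labelers.

    Returns (confident_indices, uncertain_indices); the 0.9 threshold
    is an assumed example value.
    """
    confidence = probs.max(axis=1)  # top-class probability per image
    return (np.flatnonzero(confidence >= threshold),
            np.flatnonzero(confidence < threshold))
```

In this scheme only the `uncertain_indices` images would be sent to volunteers, which is how a large fraction of the data can be processed automatically at a chosen accuracy level.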
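Two of the class-imbalance techniques mentioned above can be sketched concretely: inverse-frequency weights for a weighted loss, and index resampling for oversampling rare classes. The normalization to mean 1 and the equal-per-class resampling target are illustrative choices, not the study's exact recipe.

```python
import numpy as np

def class_weights(labels, num_classes):
    """Inverse-frequency weights for a weighted loss: rare classes get
    larger weights, frequent ones smaller (normalized to mean 1)."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts[counts == 0] = 1.0          # guard against empty classes
    w = 1.0 / counts
    return w * num_classes / w.sum()

def oversample_indices(labels, rng=None):
    """Oversampling: draw every class equally often by resampling
    (with replacement) up to the size of the largest class."""
    if rng is None:
        rng = np.random.default_rng()
    labels = np.asarray(labels)
    per_class = int(np.bincount(labels).max())
    picks = [rng.choice(np.flatnonzero(labels == c), per_class, replace=True)
             for c in np.unique(labels)]
    return np.concatenate(picks)
```

Passing the weights to the loss makes mistakes on rare species cost more, while oversampling shows the network rare-species images more often; both trade a little accuracy on frequent classes for much better accuracy on rare ones, as the study reports.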