Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning

Proc Natl Acad Sci U S A. 2018 Jun 19;115(25):E5716-E5725. doi: 10.1073/pnas.1719367115. Epub 2018 Jun 5.

Authors

Mohammad Sadegh Norouzzadeh¹, Anh Nguyen², Margaret Kosmala³, Alexandra Swanson⁴, Meredith S Palmer⁵, Craig Packer⁵, Jeff Clune⁶,⁷

Affiliations

¹ Department of Computer Science, University of Wyoming, Laramie, WY 82071.
² Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849.
³ Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138.
⁴ Department of Physics, University of Oxford, Oxford OX1 3RH, United Kingdom.
⁵ Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN 55108.
⁶ Department of Computer Science, University of Wyoming, Laramie, WY 82071; jeffclune@uwyo.edu.
⁷ Uber AI Labs, San Francisco, CA 94103.

Abstract

Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences. Motion-sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with >93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, our system can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving >8.4 y (i.e., >17,000 h at 40 h/wk) of human labeling effort on this 3.2 million-image dataset. Those efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, reducing a roadblock for this widely used technology. Our results suggest that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild.
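
The idea of classifying only the images the system is confident about can be illustrated with a minimal sketch. This is not the authors' code. It assumes that confidence is the largest predicted class probability and that images falling below a chosen threshold are handed to human labelers. The function name `selective_accuracy`, the threshold values, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def selective_accuracy(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float,
) -> Tuple[float, Optional[float]]:
    """Return (coverage, accuracy) when only confident predictions are kept.

    coverage: fraction of images whose top class probability >= threshold.
    accuracy: fraction of those kept images that are classified correctly,
              or None if no image passes the threshold.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one image is required")

    kept = 0
    correct = 0
    for p, y in zip(probs, labels):
        # Index of the most probable class and its probability (the "confidence").
        pred = max(range(len(p)), key=p.__getitem__)
        if p[pred] >= threshold:
            kept += 1
            if pred == y:
                correct += 1

    coverage = kept / len(probs)
    accuracy = correct / kept if kept else None
    return coverage, accuracy


# Toy example: 4 images, 3 classes (made-up numbers, not from the paper).
toy_probs: List[List[float]] = [
    [0.97, 0.02, 0.01],
    [0.40, 0.35, 0.25],
    [0.05, 0.90, 0.05],
    [0.55, 0.40, 0.05],
]
toy_labels = [0, 1, 1, 1]

# Keep everything: full coverage, lower accuracy.
all_cov, all_acc = selective_accuracy(toy_probs, toy_labels, threshold=0.0)

# Keep only confident predictions: lower coverage, higher accuracy.
conf_cov, conf_acc = selective_accuracy(toy_probs, toy_labels, threshold=0.8)
```

On this toy data, raising the threshold trades coverage for accuracy. The abstract describes the same kind of trade-off when it reports automating 99.3% of the data at 96.6% accuracy.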

Keywords: artificial intelligence; camera-trap images; deep learning; deep neural networks; wildlife ecology.

Publication types

Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms
Animals
Animals, Wild / physiology*
Artificial Intelligence
Behavior, Animal / physiology*
Ecology / methods
Ecosystem
Humans
Machine Learning
Neural Networks, Computer