Fine-grained object recognition in underwater visual data
Multimedia Tools and Applications, 2016•Springer
Abstract
In this paper we investigate the fine-grained object categorization problem of determining fish species in low-quality visual data (images and videos) recorded in real-life settings. We first describe a new annotated dataset of about 35,000 fish images (MA-35K dataset), derived from the Fish4Knowledge project, covering 10 fish species from the Eastern Indo-Pacific bio-geographic zone. We then use a label propagation method to transfer the labels from MA-35K to a set of 20 million fish images in order to capture variability in fish appearance. The resulting annotated dataset, containing over one million annotations (AA-1M), was then manually checked to remove false positives as well as images in which fish occlude one another or are only partially visible. Finally, we randomly picked more than 30,000 fish images, distributed among ten fish species and extracted from about 400 ten-minute videos, and used this data (both images and videos) for the fish task of the LifeCLEF 2014 contest. Together with the release of the fine-grained visual dataset, we also present two approaches for fish species classification, one operating on still images and one on videos. Both approaches showed high performance in object classification (for some fish species the precision and recall were close to one) and outperformed state-of-the-art methods. In addition, although the dataset is unbalanced in the number of images per species, both methods (especially the one operating on still images) appear to be rather robust against the long-tail curse of data, showing the best performance on the less populated object classes.
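The abstract reports precision and recall per species on a dataset that is unbalanced across classes. As a minimal sketch of how such per-class scores can be computed, the Python snippet below counts true positives per label and divides by the number of predictions (precision) and the number of ground-truth items (recall) for that label. The function name `per_class_precision_recall`, the labels `"common"` and `"rare"`, and the toy data are all hypothetical and chosen for illustration only; they are not the paper's evaluation code or results.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen in either list.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    true_count = Counter(y_true)   # ground-truth items per label
    pred_count = Counter(y_pred)   # predictions per label
    true_pos = Counter()           # correct predictions per label
    for t, p in zip(y_true, y_pred):
        if t == p:
            true_pos[t] += 1

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # A label with no predictions (or no ground truth) scores 0.0 by convention here.
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / true_count[label] if true_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy, made-up labels: one well-populated class and one sparse class.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 7 + ["rare"] * 3
scores = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.3f} recall={recall:.3f}")
```

Reporting the scores per class, rather than a single overall accuracy, is what makes it possible to see how a classifier behaves on sparsely populated classes.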