Robust acoustic bird recognition for habitat monitoring with wireless sensor networks

Amira Boulmaiz¹, Djemil Messadeg¹, Noureddine Doghmane¹ & Abdelmalik Taleb-Ahmed²


Abstract

A key approach to studying birds in their natural habitat is continuous surveying with wireless sensor networks (WSNs). The ultimate objective of this study is to design a system for monitoring threatened bird species using audio sensor nodes. Sound is the principal feature used to recognize these species. The main limitations of this process are environmental noise and the energy consumption of the sensor nodes. Over the years, a variety of birdsong classification methods have been introduced, but very few have focused on finding one suited to WSNs. In this paper, a tonal region detector (TRD) using a sigmoid function is proposed. This approach to noise power estimation offers flexibility, since the slope and the mean of the sigmoid function can be adapted autonomously for a better trade-off between noise overestimation and underestimation. Once the tonal regions in the noisy bird sound are detected, gammatone Teager energy cepstral coefficients (GTECC), post-processed by quantile-based cepstral normalization, are extracted from the resulting signals and classified with a deep neural network classifier. Experimental results for the identification of 36 bird species from Lake Tonga (northeastern Algeria) demonstrate that the proposed TRD–GTECC feature is highly effective and performs satisfactorily compared with the popular front-ends considered in this study. Moreover, recognition performance, noise immunity and energy consumption improve considerably after tonal region detection, indicating that the approach is well suited to acoustic bird recognition in complex environments with wireless sensor nodes.
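As a rough illustration of two ingredients named in the abstract, the sketch below computes the discrete Teager-Kaiser energy operator (the "Teager energy" in GTECC) and uses a sigmoid with adjustable slope and mean as a soft decision when updating a noise power estimate. This is not the authors' implementation: the abstract does not give the update rule, so the function names, the default `slope` and `mean` values, and the recursive update itself are assumptions made for illustration only.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two boundary samples are left at zero because they lack a neighbour.
    """
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def sigmoid_weight(snr_db, slope=1.0, mean=5.0):
    """Soft tonal-presence weight in (0, 1).

    slope and mean are hypothetical defaults, not values from the paper.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    return 1.0 / (1.0 + np.exp(-slope * (snr_db - mean)))

def update_noise_power(noise_power, frame_power, slope=1.0, mean=5.0, eps=1e-12):
    """One noise-power update gated by the sigmoid weight (assumed rule).

    A high weight (frame likely tonal) keeps the previous estimate;
    a low weight (frame likely noise) moves the estimate to the frame power.
    """
    snr_db = 10.0 * np.log10((frame_power + eps) / (noise_power + eps))
    p = sigmoid_weight(snr_db, slope, mean)
    return p * noise_power + (1.0 - p) * frame_power
```

For a pure sinusoid `A * cos(w * n)`, `teager_energy` returns the constant `A**2 * sin(w)**2` at every interior sample, which is why the operator is often used to track amplitude and frequency jointly. In the update rule, a steeper `slope` makes the decision closer to hard thresholding, while a larger `mean` treats more frames as noise; this is one way to read the trade-off between noise overestimation and underestimation described above.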

Author information

Authors and Affiliations

Electronic Department, University of Badji Mokhtar, Annaba, Algeria
Amira Boulmaiz, Djemil Messadeg & Noureddine Doghmane
Université de Valenciennes et du Hainaut Cambrésis, Valenciennes, France
Abdelmalik Taleb-Ahmed

Corresponding author

Correspondence to Amira Boulmaiz.

About this article

Cite this article

Boulmaiz, A., Messadeg, D., Doghmane, N. et al. Robust acoustic bird recognition for habitat monitoring with wireless sensor networks. Int J Speech Technol 19, 631–645 (2016). https://doi.org/10.1007/s10772-016-9354-4

Received: 20 January 2016
Accepted: 18 July 2016
Published: 27 July 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10772-016-9354-4
