Towards Classifying Bird Sounds Using a Deep Transfer Learning Model

EasyChair Preprint 15712

16 pages. Date: January 14, 2025

Abstract

The conservation of bird biodiversity relies on accurately identifying and classifying species, a task that is often time-consuming and requires specialized knowledge. Recent advances in deep learning, particularly in convolutional neural networks (CNNs), have made it possible to detect species passively from acoustic signals, even in challenging environments. This paper presents a high-performance deep CNN model based on the VGG-16 architecture for the passive classification of bird sounds. Using Short-Time Fourier Transform (STFT) features, the model reaches a reported accuracy of 97.31% on the BirdCLEF 2022 dataset and 98.41% on the Cornell Birdcall Identification dataset. The model discriminates between species even in complex soundscapes with overlapping recordings. The approach also uses a tool-based consensus framework to sharpen the focus on relevant features, improving classification accuracy for rare and endangered species. This method is highly effective in various phonological and language processing tasks and enhances the model's robustness, making it suitable for real-world applications.
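The abstract names the Short-Time Fourier Transform as the input representation but gives no parameters. As an illustration only, the following is a minimal sketch of an STFT magnitude spectrogram in pure Python. The function name `stft_magnitude`, the frame length, the hop size, the Hann window, and the synthetic test tone are all assumptions chosen for the example, not the authors' pipeline.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len//2) per frame."""
    # Periodic Hann window: tapers frame edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    # Precomputed DFT basis, one row per frequency bin.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)]
             for k in range(n_bins)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([abs(sum(x * b for x, b in zip(frame, row)))
                       for row in basis])
    return frames


# Hypothetical input: a 1 kHz tone sampled at 8 kHz for 0.1 s.
SAMPLE_RATE = 8000
FRAME_LEN = 64
signal = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(800)]

spec = stft_magnitude(signal, frame_len=FRAME_LEN, hop=32)
# Strongest bin in the first frame, converted to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

Each frame becomes one column of a time-frequency image, which is the kind of two-dimensional input an image-style CNN such as VGG-16 can consume. In practice a numerical library with an FFT would replace the direct DFT used here.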

Keyphrases: BirdCLEF 2022 Dataset, Bird Species Classification, Cornell Birdcall Identification Dataset, Mel-spectrogram, Short-Time Fourier Transform, VGG-16, feature extraction

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:15712,
  author    = {Saptarshi Dey and Soumi Ghosh and Soumapriyo Mondal and Akash Harh and Spandan Bandhu and Bidhan Barai and Pawan Kumar Singh},
  title     = {Towards Classifying Bird Sounds Using a Deep Transfer Learning Model},
  howpublished = {EasyChair Preprint 15712},
  year      = {EasyChair, 2025}}