Towards Classifying Bird Sounds Using a Deep Transfer Learning Model

EasyChair Preprint 15712 · 16 pages · Date: January 14, 2025

Abstract

The conservation of bird biodiversity relies on accurately identifying and classifying species, a task that is often time-consuming and requires specialized knowledge. Recent advances in deep learning, particularly convolutional neural networks (CNNs), make it possible to detect species passively from acoustic signals, even in challenging environments. This paper presents a high-performance deep CNN model based on the VGG-16 architecture for the passive classification of bird sounds. Using Short-Time Fourier Transform (STFT) features, the model achieves an accuracy of 97.31% on the BirdCLEF 2022 dataset and 98.41% on the Cornell Birdcall Identification dataset. The model discriminates between species even in complex soundscapes with overlapping recordings. It also incorporates a tool-based consensus mechanism to sharpen the focus on relevant features, improving classification accuracy for rare and endangered species. The method is highly effective across various phonological and language-processing tasks and enhances the model's robustness, making it suitable for real-world applications.

Keyphrases: BirdCLEF 2022 Dataset, Bird Species Classification, Cornell Birdcall Identification Dataset, Mel-spectrogram, Short-Time Fourier Transform, VGG-16, feature extraction
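The abstract describes a pipeline in which audio clips are converted to STFT/Mel-spectrogram images and classified by a VGG-16 network via transfer learning. The sketch below is a minimal illustration of that general approach, not the authors' implementation; the number of classes, sample rate, spectrogram parameters, and classification head are assumptions made for the example.

```python
# Minimal sketch (not the paper's code): Mel-spectrogram extraction + VGG-16 transfer learning.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 152      # assumption: number of target species in BirdCLEF 2022
SAMPLE_RATE = 32000    # assumption: audio resampled to 32 kHz
IMG_SIZE = (224, 224)  # VGG-16's expected input resolution

def audio_to_melspec(path, n_mels=224, n_fft=2048, hop_length=512):
    """Load an audio clip and return a log-Mel spectrogram as a 3-channel image tensor."""
    y, sr = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    # Scale to [0, 1], resize to the VGG-16 input size, replicate to 3 channels.
    log_mel = (log_mel - log_mel.min()) / (log_mel.max() - log_mel.min() + 1e-8)
    spec = tf.image.resize(log_mel[..., np.newaxis], IMG_SIZE)
    return tf.repeat(spec, 3, axis=-1)

def build_model():
    """VGG-16 backbone pretrained on ImageNet with a new classification head."""
    base = VGG16(weights="imagenet", include_top=False, input_shape=(*IMG_SIZE, 3))
    base.trainable = False  # transfer learning: freeze the convolutional layers first
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In a typical transfer-learning workflow the frozen backbone is trained first with the new head, after which some of the later convolutional blocks may be unfrozen and fine-tuned at a lower learning rate.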