Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species

dc.contributor.author: Ludeña Choez, Jimmy Diestin
dc.contributor.author: Quispe Soncco, Raisa
dc.contributor.author: Gallardo Antolín, Ascención
dc.date.accessioned: 2019-01-29T22:19:50Z
dc.date.available: 2019-01-29T22:19:50Z
dc.date.issued: 2017
dc.description.abstract: Feature extraction for Acoustic Bird Species Classification (ABSC) tasks has traditionally been based on parametric representations that were specifically developed for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discrimination capabilities of these features for ABSC could be enhanced by accounting for the vocal production mechanisms of birds, and, in particular, the spectro-temporal structure of bird sounds. In this paper, a new front-end for ABSC is proposed that incorporates this specific information through the non-negative decomposition of bird sound spectrograms. It consists of the following two different stages: short-time feature extraction and temporal feature integration. In the first stage, which aims at providing a better spectral representation of bird sounds on a frame-by-frame basis, two methods are evaluated. In the first method, cepstral-like features (NMF_CC) are extracted by using a filter bank that is automatically learned by means of the application of Non-Negative Matrix Factorization (NMF) on bird audio spectrograms. In the second method, the features are directly derived from the activation coefficients of the spectrogram decomposition as performed through NMF (H_CC). The second stage summarizes the most relevant information contained in the short-time features by computing several statistical measures over long segments. The experiments show that the use of NMF_CC and H_CC in conjunction with temporal integration significantly improves the performance of a Support Vector Machine (SVM)-based ABSC system with respect to conventional MFCC. © 2017 Ludeña-Choez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. [es_PE]
dc.description.uri: Research work [es_PE]
dc.identifier.doi: https://doi.org/10.1371/journal.pone.0179403 [es_PE]
dc.identifier.issn: 19326203 [es_PE]
dc.identifier.uri: https://hdl.handle.net/20.500.12590/15785
dc.language.iso: eng [es_PE]
dc.publisher: Public Library of Science [es_PE]
dc.relation.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85020903270&doi=10.1371%2fjournal.pone.0179403&partnerID=40&md5=2ffdb02b008e1d9dd3cd7db2ef308770 [es_PE]
dc.rights: info:eu-repo/semantics/restrictedAccess [es_PE]
dc.source: Repositorio Institucional - UCSP [es_PE]
dc.source: Universidad Católica San Pablo [es_PE]
dc.source: Scopus [es_PE]
dc.subject: acoustic analysis [es_PE]
dc.subject: analytic method [es_PE]
dc.subject: animal experiment [es_PE]
dc.subject: Article [es_PE]
dc.subject: audiometry [es_PE]
dc.subject: bird [es_PE]
dc.subject: controlled study [es_PE]
dc.subject: decomposition [es_PE]
dc.subject: hidden Markov model [es_PE]
dc.subject: kernel method [es_PE]
dc.subject: mel frequency cepstral coefficients [es_PE]
dc.subject: nonhuman [es_PE]
dc.subject: sound detection [es_PE]
dc.subject: species difference [es_PE]
dc.subject: speech analysis [es_PE]
dc.subject: support vector machine [es_PE]
dc.subject: task performance [es_PE]
dc.subject: vocalization [es_PE]
dc.subject: animal [es_PE]
dc.subject: classification [es_PE]
dc.subject.ocde: https://purl.org/pe-repo/ocde/ford#2.02.01 [es_PE]
dc.title: Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species [es_PE]
dc.type: info:eu-repo/semantics/article
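
Illustrative note on the abstract: the two-stage front-end it describes (NMF decomposition of a spectrogram, cepstral-like features from the activation coefficients, then statistics over long segments) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The record does not give the NMF cost function, the number of basis vectors, the segment length, or the statistics used, so the Euclidean multiplicative updates, rank=10, n_ceps=8, seg_len=50, and the mean/standard-deviation pair are all assumptions, and every function name is hypothetical. A synthetic non-negative matrix stands in for a real bird sound magnitude spectrogram, and only an H_CC-style path is shown (the learned filter bank used for NMF_CC is not).

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (freq x frames) as W @ H.

    Uses Lee-Seung multiplicative updates for the Euclidean cost
    (an assumption; the record does not state the cost function).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis of shape (n_out, n_in)."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def cepstral_from_activations(H, n_ceps=8, eps=1e-9):
    """Log-compress the activations, then apply a DCT (H_CC-style features)."""
    return dct_matrix(n_ceps, H.shape[0]) @ np.log(H + eps)

def integrate(features, seg_len):
    """Mean and standard deviation over non-overlapping segments of frames."""
    n_seg = features.shape[1] // seg_len
    trimmed = features[:, :n_seg * seg_len]
    segs = trimmed.reshape(features.shape[0], n_seg, seg_len)
    return np.concatenate([segs.mean(axis=2), segs.std(axis=2)], axis=0)

# Synthetic stand-in for a magnitude spectrogram: 64 bins x 300 frames.
rng = np.random.default_rng(1)
V = rng.random((64, 10)) @ rng.random((10, 300))

W, H = nmf(V, rank=10)
ceps = cepstral_from_activations(H)      # short-time features, one column per frame
stats = integrate(ceps, seg_len=50)      # segment-level features, one column per segment
print(W.shape, H.shape, ceps.shape, stats.shape)
```

In a full system along the lines of the abstract, each column of the segment-level matrix would be passed to an SVM classifier; that stage is omitted here.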