Show simple item record

dc.contributor.author: Sharma, Jivitesh
dc.contributor.author: Granmo, Ole-Christoffer
dc.contributor.author: Goodwin, Morten
dc.date.accessioned: 2023-03-07T13:50:48Z
dc.date.available: 2023-03-07T13:50:48Z
dc.date.created: 2021-01-07T00:33:35Z
dc.date.issued: 2020
dc.identifier.citation: Sharma, J., Granmo, O-C. & Goodwin, M. (2020). Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network. Interspeech, 2020, 1186-1190. [en_US]
dc.identifier.issn: 2308-457X
dc.identifier.uri: https://hdl.handle.net/11250/3056508
dc.description: Author's accepted manuscript [en_US]
dc.description.abstract: In this paper, we propose a model for the Environment Sound Classification (ESC) task that consists of multiple feature channels given as input to a Deep Convolutional Neural Network (CNN) with an attention mechanism. The novelty of the paper lies in using multiple feature channels consisting of Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), the Constant Q-transform (CQT) and Chromagram. In addition, we employ a deeper CNN (DCNN) than previous models, consisting of spatially separable convolutions that operate on the time and feature domains separately. Alongside these, we use attention modules that perform channel and spatial attention together. We use the mix-up data augmentation technique to further boost performance. Our model achieves state-of-the-art performance on three benchmark environment sound classification datasets, namely UrbanSound8K (97.52%), ESC-10 (94.75%) and ESC-50 (87.45%). [en_US]
dc.language.iso: eng [en_US]
dc.publisher: International Speech Communication Association [en_US]
dc.title: Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: acceptedVersion [en_US]
dc.rights.holder: © 2020 ISCA [en_US]
dc.subject.nsi: VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550 [en_US]
dc.source.pagenumber: 1186-1190 [en_US]
dc.source.volume: 2020 [en_US]
dc.source.journal: Interspeech [en_US]
dc.identifier.doi: https://doi.org/10.21437/Interspeech.2020-1303
dc.identifier.cristin: 1866678
cristin.qualitycode: 1

