Emergency Detection with Environment Sound Using Deep Convolutional Neural Networks
Original version
Sharma, J., Granmo, OC., Goodwin, M. (2021). Emergency Detection with Environment Sound Using Deep Convolutional Neural Networks. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (Eds.) Proceedings of Fifth International Congress on Information and Communication Technology. Advances in Intelligent Systems and Computing, 1184. https://doi.org/10.1007/978-981-15-5859-7_14
Abstract
In this paper, we propose a generic emergency detection system that uses only the sound produced in the environment. For this task, we employ multiple audio feature extraction techniques: Mel-Frequency Cepstral Coefficients, Gammatone Frequency Cepstral Coefficients, the Constant Q-transform and the Chromagram. After feature extraction, a Deep Convolutional Neural Network (CNN) classifies an audio signal as a potential emergency situation or not. The entire model is based on our previous work, which set a new state of the art in the Environment Sound Classification (ESC) task [1]. We combine the benchmark ESC datasets UrbanSound8K and ESC-50 (ESC-10 is a subset of ESC-50) and reduce the task to binary classification. This is done by aggregating sound classes such as sirens, fire crackling, glass breaking and gunshots into the emergency class and treating all other classes as normal. Even though there are only two classes to distinguish, they are highly imbalanced. To overcome this difficulty, we introduce class weights when calculating the loss during model training. Our model achieves 99.56% emergency detection accuracy.
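The two steps the abstract describes for the labels, collapsing the original sound classes into a binary target and weighting the loss by class, can be illustrated with a minimal sketch. It is not the authors' implementation. The label names, the helper functions and the inverse-frequency ("balanced") weighting rule are assumptions made for illustration; the abstract does not state which classes beyond its examples were merged or how the weights were chosen.

```python
import math
from collections import Counter

# Assumed, illustrative set of labels treated as "emergency"; the abstract
# only lists sirens, fire crackling, glass breaking and gunshots as examples.
EMERGENCY_CLASSES = {"siren", "crackling_fire", "glass_breaking", "gun_shot"}


def to_binary(label):
    """Map an original sound class to 1 (emergency) or 0 (normal)."""
    return 1 if label in EMERGENCY_CLASSES else 0


def balanced_class_weights(binary_labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    This weighting rule is an assumption, not taken from the paper.
    """
    counts = Counter(binary_labels)
    n_samples = len(binary_labels)
    return {cls: n_samples / (len(counts) * count) for cls, count in counts.items()}


def weighted_bce(y_true, y_prob, weights, eps=1e-7):
    """Mean binary cross-entropy with each sample scaled by its class weight."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        sample_loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * sample_loss
    return total / len(y_true)


# Toy label list (hypothetical data): 2 emergency clips, 6 normal clips.
labels = ["siren", "dog_bark", "rain", "street_music",
          "gun_shot", "children_playing", "engine_idling", "sea_waves"]
y = [to_binary(name) for name in labels]
w = balanced_class_weights(y)
```

With these weights, an equally confident mistake on an emergency clip contributes more to the loss than the same mistake on a normal clip, which is the effect class weighting is meant to have on an imbalanced dataset.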
Description
Author's accepted manuscript