Show simple item record

dc.contributor.author      Jaiswal, Rahul Kumar
dc.contributor.author      Dubey, Rajesh Kumar
dc.date.accessioned        2023-04-24T10:25:10Z
dc.date.available          2023-04-24T10:25:10Z
dc.date.created            2022-11-24T10:08:35Z
dc.date.issued             2022
dc.identifier.citation     Jaiswal, R. K. & Dubey, R. K. (2022). Non-intrusive speech quality assessment using context-aware neural networks. International Journal of Speech Technology, 25, 947–965.  en_US
dc.identifier.issn         1572-8110
dc.identifier.uri          https://hdl.handle.net/11250/3064461
dc.description.abstract    To meet human-perceived quality of experience (QoE) when communicating over Voice over Internet Protocol (VoIP) applications such as Google Meet, Microsoft Skype, and Apple FaceTime, a precise speech quality assessment metric is needed. The metric should detect and separate the different types of noise degradation present in the surroundings before measuring and monitoring speech quality in real time. Our research is motivated by the lack of a speech quality metric that first distinguishes between different types of noise degradation before making a speech quality prediction. To that end, this paper presents a novel non-intrusive speech quality assessment metric using context-aware neural networks: the noise class (context) of the degraded or noisy speech signal is first identified by a classifier, and deep neural network (DNN) based speech quality metrics (SQMs) are then trained and optimized for each noise class to obtain noise-class-specific (context-specific) speech quality predictions (MOS scores). The noisy speech signals, that is, clean speech signals degraded by different types of background noise, are taken from the NOIZEUS speech corpus. Results demonstrate that, even with the limited number of speech samples available in the NOIZEUS speech corpus, the proposed metric outperforms, across different contexts, a metric in which the contexts are not classified before speech quality prediction.  en_US
dc.language.iso            eng  en_US
dc.publisher               Springer  en_US
dc.rights                  Attribution 4.0 International
dc.rights.uri              http://creativecommons.org/licenses/by/4.0/deed.no
dc.title                   Non-intrusive speech quality assessment using context-aware neural networks  en_US
dc.title.alternative       Non-intrusive speech quality assessment using context-aware neural networks  en_US
dc.type                    Peer reviewed  en_US
dc.type                    Journal article  en_US
dc.description.version     publishedVersion  en_US
dc.rights.holder           © 2022 The Author(s)  en_US
dc.subject.nsi             VDP::Teknologi: 500  en_US
dc.source.pagenumber       947–965  en_US
dc.source.volume           25  en_US
dc.source.journal          International Journal of Speech Technology  en_US
dc.identifier.doi          https://doi.org/10.1007/s10772-022-10011-y
dc.identifier.cristin      2079778
cristin.qualitycode        1
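
The abstract above describes a two-stage, context-aware pipeline: a classifier first identifies the noise class (context) of a degraded utterance, and a noise-class-specific DNN-based speech quality metric then predicts its MOS score. The sketch below is a minimal, hypothetical illustration of that idea only; the feature representation, the four noise-class names, the scikit-learn MLPs standing in for the paper's DNNs, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

# Illustrative sketch only: context classification followed by
# context-specific quality prediction (assumed setup, not the authors' code).
import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor

rng = np.random.default_rng(0)

# Placeholder per-utterance features (e.g. spectral statistics) for four
# hypothetical noise classes, with synthetic MOS labels.
NOISE_CLASSES = ["babble", "car", "street", "train"]
X = rng.normal(size=(400, 40))                        # 400 utterances x 40 features
noise_labels = rng.integers(0, len(NOISE_CLASSES), size=400)
mos_labels = rng.uniform(1.0, 5.0, size=400)          # synthetic MOS scores

# Stage 1: classify the noise context of each degraded utterance.
context_clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
context_clf.fit(X, noise_labels)

# Stage 2: one DNN-based speech quality metric (SQM) per noise class,
# each trained only on utterances from its own context.
sqms = {}
for k, name in enumerate(NOISE_CLASSES):
    mask = noise_labels == k
    sqm = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
    sqm.fit(X[mask], mos_labels[mask])
    sqms[name] = sqm

def predict_mos(features: np.ndarray) -> float:
    """Route an utterance to its predicted context, then score it with that context's SQM."""
    ctx = NOISE_CLASSES[int(context_clf.predict(features[None, :])[0])]
    return float(sqms[ctx].predict(features[None, :])[0])

print(predict_mos(X[0]))

In this toy setup, the baseline the paper compares against would correspond to a single MLPRegressor trained on all utterances regardless of noise class.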


Associated file(s)


This item appears in the following collection(s)


Attribution 4.0 International
Except where otherwise noted, this item's license is described as Attribution 4.0 International