Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation

Yadav, Rohan Kumar; Lei, Jiao; Granmo, Ole-Christoffer; Goodwin, Morten

dc.contributor.author	Yadav, Rohan Kumar
dc.contributor.author	Lei, Jiao
dc.contributor.author	Granmo, Ole-Christoffer
dc.contributor.author	Goodwin, Morten
dc.date.accessioned	2023-03-09T12:35:30Z
dc.date.available	2023-03-09T12:35:30Z
dc.date.created	2022-11-22T16:08:46Z
dc.date.issued	2022
dc.identifier.citation	Yadav, R. K., Lei, J., Granmo, O.-C. & Goodwin, M. (2022). Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation. International Joint Conferences on Artificial Intelligence, 4439-4446.	en_US
dc.identifier.isbn	978-1-956792-00-3
dc.identifier.uri	https://hdl.handle.net/11250/3057374
dc.description.abstract	State-of-the-art natural language processing models have raised the bar for performance on a variety of tasks in recent years. However, concerns are rising over their sensitivity to distribution biases that reside in the training and testing data. This issue severely degrades the performance of the models when they are exposed to out-of-distribution and counterfactual data. The root cause seems to be that many machine learning models are prone to learning shortcuts, modelling simple correlations rather than more fundamental and general relationships. As a result, such text classifiers tend to perform poorly when a human makes minor modifications to the data, which raises questions regarding their robustness. In this paper, we employ a rule-based architecture called the Tsetlin Machine (TM) that learns both simple and complex correlations by ANDing features and their negations. As such, it generates explainable AND-rules using negated and non-negated reasoning. Here, we explore how non-negated reasoning can be more prone to distribution biases than negated reasoning. We further leverage this finding by adapting the TM architecture to mainly perform negated reasoning using the specificity parameter s. As a result, the AND-rules become robust to spurious correlations and can also correctly predict counterfactual data. Our empirical investigation of the model's robustness uses the specificity s to control the degree of negated reasoning. Experiments on publicly available Counterfactually-Augmented Data demonstrate that the negated clauses are robust to spurious correlations and outperform Naive Bayes, SVM, and Bi-LSTM by up to 20 %, and ELMo by almost 6 % on counterfactual test data.	en_US
dc.language.iso	eng	en_US
dc.publisher	International Joint Conferences on Artificial Intelligence	en_US
dc.relation.ispartof	IJCAI International Joint Conference on Artificial Intelligence
dc.relation.ispartofseries	IJCAI International Joint Conference on Artificial Intelligence
dc.title	Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation	en_US
dc.title.alternative	Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation	en_US
dc.type	Peer reviewed	en_US
dc.type	Conference object	en_US
dc.description.version	acceptedVersion	en_US
dc.rights.holder	© 2022 IJCAI	en_US
dc.subject.nsi	VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550	en_US
dc.source.pagenumber	4439-4446	en_US
dc.source.issue	2022	en_US
dc.identifier.doi	https://doi.org/10.24963/ijcai.2022/616
dc.identifier.cristin	2080703
cristin.fulltext	postprint

Associated file(s)

File name: Proceeding.pdf
Size: 1.018Mb
Format: PDF
