An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application
Chapter, Peer reviewed
Accepted version
Permanent link: https://hdl.handle.net/11250/3122393
Publication date: 2023
Original version
Bhattarai, B., Granmo, O.-C. & Lei, J. (2023). An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application. In J. Kamps et al. (Eds.), Lecture Notes in Computer Science (LNCS, vol. 13980, pp. 167–181). Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_11

Abstract
Data representation plays a crucial role in natural language processing (NLP), forming the foundation for most NLP tasks. Indeed, NLP performance depends heavily on the effectiveness of the preprocessing pipeline that builds the data representation. Many representation learning frameworks, such as Word2Vec, encode input data based on local contextual information that interconnects words. Such approaches can be computationally intensive, and their encodings are hard to explain. Here, we propose an interpretable representation learning framework utilizing the Tsetlin Machine (TM). The TM is an interpretable logic-based algorithm that has exhibited competitive performance in numerous NLP tasks. We employ the TM clauses to build a sparse propositional (Boolean) representation of natural language text. Each clause is a class-specific propositional rule that links words semantically and contextually. Through visualization, we illustrate how the resulting data representation provides semantically more distinct features, better separating the underlying classes. As a result, the subsequent classification task becomes less demanding, benefiting simple machine learning classifiers such as the Support Vector Machine (SVM). We evaluate our approach on six NLP classification tasks and twelve domain adaptation tasks. Our main finding is that the accuracy of our proposed technique significantly exceeds that of the vanilla TM, approaching the competitive accuracy of deep neural network (DNN) baselines. Furthermore, we present a case study showing how the representations derived from our framework are interpretable. (We use an asynchronous and parallel version of the Tsetlin Machine, available at https://github.com/cair/PyTsetlinMachineCUDA.)
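To make the abstract's central idea concrete, the following is a minimal, self-contained sketch (not the authors' code) of how class-specific clauses can act as a document representation: each clause is a conjunction of positive and negated word literals, and a document is mapped to the Boolean vector of clause outputs. The vocabulary and clauses below are illustrative toy assumptions; in the paper, the clauses are learned by a Tsetlin Machine and the resulting vectors feed a simple classifier such as an SVM.

```python
# Hypothetical sketch of a clause-based Boolean text representation.
# A clause fires iff all of its positive literals (required words) are
# present and none of its negated literals (forbidden words) appear.

def binarize(document, vocabulary):
    """Bag-of-words presence vector over a fixed vocabulary."""
    words = set(document.lower().split())
    return [w in words for w in vocabulary]

def clause_output(features, vocabulary, positive, negated):
    """Evaluate one conjunctive clause on a binarized document."""
    index = {w: i for i, w in enumerate(vocabulary)}
    return (all(features[index[w]] for w in positive)
            and not any(features[index[w]] for w in negated))

def represent(document, vocabulary, clauses):
    """Map a document to the Boolean vector of clause outputs."""
    feats = binarize(document, vocabulary)
    return [clause_output(feats, vocabulary, pos, neg)
            for pos, neg in clauses]

vocabulary = ["good", "bad", "plot", "boring", "great"]
# Two toy class-specific clauses: (positive literals, negated literals).
clauses = [(["good"], ["boring"]),    # suggestive of positive sentiment
           (["boring"], ["great"])]   # suggestive of negative sentiment

print(represent("a good plot", vocabulary, clauses))  # [True, False]
print(represent("boring plot", vocabulary, clauses))  # [False, True]
```

In the full framework, such clause-output vectors are sparse and class-discriminative, which is why a linear classifier like an SVM can then separate the classes with little effort.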
Description
Author's accepted manuscript