An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application
Chapter, Peer reviewed
Accepted version
Permanent link: https://hdl.handle.net/11250/3122393
Publication date: 2023
Original version
Bhattarai, B., Granmo, O.-C. & Lei, J. (2023). An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application. In J. Kamps et al. (Eds.), Lecture Notes in Computer Science (LNCS, vol. 13980, pp. 167–181). Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_11

Abstract
Data representation plays a crucial role in natural language processing (NLP), forming the foundation for most NLP tasks. Indeed, NLP performance depends heavily on the effectiveness of the preprocessing pipeline that builds the data representation. Many representation learning frameworks, such as Word2Vec, encode input data based on local contextual information that interconnects words. Such approaches can be computationally intensive, and their encodings are hard to explain. Here, we propose an interpretable representation learning framework utilizing the Tsetlin Machine (TM). The TM is an interpretable logic-based algorithm that has exhibited competitive performance in numerous NLP tasks. We employ the TM clauses to build a sparse propositional (Boolean) representation of natural language text. Each clause is a class-specific propositional rule that links words semantically and contextually. Through visualization, we illustrate how the resulting data representation provides semantically more distinct features, better separating the underlying classes. As a result, the subsequent classification task becomes less demanding, benefiting simple machine learning classifiers such as the Support Vector Machine (SVM). We evaluate our approach on six NLP classification tasks and twelve domain adaptation tasks. Our main finding is that the accuracy of our proposed technique significantly exceeds that of the vanilla TM, approaching the competitive accuracy of deep neural network (DNN) baselines. Furthermore, we present a case study showing how the representations derived from our framework are interpretable. (We use an asynchronous and parallel version of the Tsetlin Machine, available at https://github.com/cair/PyTsetlinMachineCUDA.)
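To make the abstract's central idea concrete, the following is a minimal, self-contained sketch (not the authors' code) of how class-specific clauses can act as a document representation: each clause is a conjunction of positive and negated word literals, and a document is mapped to the Boolean vector of clause outputs. The vocabulary and clauses below are illustrative toy assumptions; in the paper, the clauses are learned by a Tsetlin Machine and the resulting vectors feed a simple classifier such as an SVM.

```python
# Hypothetical sketch of a clause-based Boolean text representation.
# A clause fires iff all of its positive literals (required words) are
# present and none of its negated literals (forbidden words) appear.

def binarize(document, vocabulary):
    """Bag-of-words presence vector over a fixed vocabulary."""
    words = set(document.lower().split())
    return [w in words for w in vocabulary]

def clause_output(features, vocabulary, positive, negated):
    """Evaluate one conjunctive clause on a binarized document."""
    index = {w: i for i, w in enumerate(vocabulary)}
    return (all(features[index[w]] for w in positive)
            and not any(features[index[w]] for w in negated))

def represent(document, vocabulary, clauses):
    """Map a document to the Boolean vector of clause outputs."""
    feats = binarize(document, vocabulary)
    return [clause_output(feats, vocabulary, pos, neg)
            for pos, neg in clauses]

vocabulary = ["good", "bad", "plot", "boring", "great"]
# Two toy class-specific clauses: (positive literals, negated literals).
clauses = [(["good"], ["boring"]),    # suggestive of positive sentiment
           (["boring"], ["great"])]   # suggestive of negative sentiment

print(represent("a good plot", vocabulary, clauses))  # [True, False]
print(represent("boring plot", vocabulary, clauses))  # [False, True]
```

In the full framework, such clause-output vectors are sparse and class-discriminative, which is why a linear classifier like an SVM can then separate the classes with little effort.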
Description
Author's accepted manuscript