Show simple item record

dc.contributor.author       Nicolae, Dragoş Constantin
dc.contributor.author       Yadav, Rohan Kumar
dc.contributor.author       Tufiş, Dan
dc.date.accessioned         2022-08-23T10:58:00Z
dc.date.available           2022-08-23T10:58:00Z
dc.date.created             2022-05-31T08:38:15Z
dc.date.issued              2022
dc.identifier.citation      Nicolae, D.C., Yadav, R.K. & Tufiş, D. (2022). A Lite Romanian BERT: ALR-BERT. Computers, 11 (4), 7.   en_US
dc.identifier.issn          2073-431X
dc.identifier.uri           https://hdl.handle.net/11250/3013063
dc.description.abstract     Large-scale pre-trained language representation and its promising performance in various downstream applications have become an area of interest in the field of natural language processing (NLP). There has been huge interest in further increasing the model’s size in order to outperform the best previously obtained performances. However, at some point, increasing the model’s parameters may lead to reaching its saturation point due to the limited capacity of GPU/TPU. In addition to this, such models are mostly available in English or a shared multilingual structure. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we called “A Lite Romanian BERT (ALR-BERT)”. Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to distribute our code and model as an open source together with the downstream task.   en_US
dc.language.iso             eng   en_US
dc.publisher                MDPI   en_US
dc.rights                   Attribution 4.0 International
dc.rights.uri               http://creativecommons.org/licenses/by/4.0/deed.no
dc.title                    A Lite Romanian BERT: ALR-BERT   en_US
dc.title.alternative        A Lite Romanian BERT: ALR-BERT   en_US
dc.type                     Peer reviewed   en_US
dc.type                     Journal article   en_US
dc.description.version      publishedVersion   en_US
dc.rights.holder            © 2022 The Author(s)   en_US
dc.subject.nsi              VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550   en_US
dc.source.pagenumber        7   en_US
dc.source.volume            11   en_US
dc.source.journal           Computers   en_US
dc.source.issue             4   en_US
dc.identifier.doi           https://doi.org/10.3390/computers11040057
dc.identifier.cristin       2028257
dc.source.articlenumber     57   en_US
cristin.qualitycode         1
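
The abstract above states that the authors intend to release ALR-BERT and its code openly. As a hedged sketch of how a BERT-family Romanian model of this kind would typically be used once published, the Python snippet below loads a masked-language model with the Hugging Face transformers library. The model identifier, the placeholder namespace, and the example sentence are assumptions for illustration only; they are not taken from this record, and the actual hub name must be substituted once the authors' release is available.

    # Hypothetical usage sketch: the model identifier below is a placeholder,
    # NOT an ID taken from this record or from the authors' release.
    from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

    MODEL_ID = "your-namespace/alr-bert-base"  # placeholder, assumed name

    # Load the tokenizer and the masked-language-modelling head.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

    # Fill-mask is the standard sanity check for a BERT-style pre-trained model.
    fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
    for prediction in fill("Bucureşti este capitala [MASK].")[:3]:
        print(prediction["token_str"], round(prediction["score"], 3))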


Associated file(s)


This item appears in the following collection(s)


Attribution 4.0 International
Except where otherwise noted, this item's license is described as Attribution 4.0 International