
dc.contributor.author: Meng, Li
dc.contributor.author: Yazidi, Anis
dc.contributor.author: Goodwin, Morten
dc.contributor.author: Engelstad, Paal
dc.date.accessioned: 2023-01-27T08:46:38Z
dc.date.available: 2023-01-27T08:46:38Z
dc.date.created: 2023-01-19T11:55:03Z
dc.date.issued: 2022
dc.identifier.citation: Meng, L., Yazidi, A., Goodwin, M. & Engelstad, P. (2022). Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples. Proceedings of the Northern Lights Deep Learning Workshop, 3, 1-9. doi: [en_US]
dc.identifier.issn: 2703-6928
dc.identifier.uri: https://hdl.handle.net/11250/3046765
dc.description.abstract: In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims to incorporate semi-supervised learning into reinforcement learning through splitting Q-values into state values and action advantages. We require that an offline expert assesses the value of a state in a coarse manner using three discrete values. An expert network is designed in addition to the Q-network, which updates each time following the regular offline minibatch update whenever the expert example buffer is not empty. Using the board game Othello, we compare our algorithm with the baseline Q-learning algorithm, which is a combination of Double Q-learning and Dueling Q-learning. Our results show that Expert Q-learning is indeed useful and more resistant to the overestimation bias. The baseline Q-learning algorithm exhibits unstable and suboptimal behavior in non-deterministic settings, whereas Expert Q-learning demonstrates more robust performance with higher scores, illustrating that our algorithm is indeed suitable to integrate state values from expert examples into Q-learning. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Septentrio Academic Publishing [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.no [*]
dc.title: Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: publishedVersion [en_US]
dc.rights.holder: © 2022 The Author(s) [en_US]
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550 [en_US]
dc.source.pagenumber: 9 [en_US]
dc.source.volume: 3 [en_US]
dc.source.journal: Proceedings of the Northern Lights Deep Learning Workshop [en_US]
dc.identifier.doi: 10.7557/18.6237
dc.identifier.cristin: 2110224
cristin.qualitycode: 1



Except where otherwise noted, this item is licensed under Attribution 4.0 International.