dc.contributor.author: Meng, Li
dc.contributor.author: Goodwin, Morten
dc.contributor.author: Yazidi, Anis
dc.contributor.author: Engelstad, Paal
dc.date.accessioned: 2023-02-21T13:11:37Z
dc.date.available: 2023-02-21T13:11:37Z
dc.date.created: 2023-01-19T11:57:07Z
dc.date.issued: 2022
dc.identifier.citation: Meng, L., Goodwin, M., Yazidi, A. & Engelstad, P. (2022). Improving the Diversity of Bootstrapped DQN by Replacing Priors with Noise. IEEE Transactions on Games (TG), 1-10.
dc.identifier.issn: 2475-1510
dc.identifier.uri: https://hdl.handle.net/11250/3052785
dc.description: Authors' accepted manuscript
dc.description: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.description.abstract: Q-learning is one of the most well-known reinforcement learning algorithms. There have been tremendous efforts to develop this algorithm using neural networks; Bootstrapped Deep Q-Learning Network is amongst them. It utilizes multiple neural network heads to introduce diversity into Q-learning. Diversity can sometimes be viewed as the number of reasonable moves an agent can take at a given state, analogous to the definition of the exploration ratio in RL. The performance of Bootstrapped Deep Q-Learning Network is therefore deeply connected with the level of diversity within the algorithm. The original research pointed out that a random prior could improve the performance of the model. In this article, we further explore the possibility of replacing priors with noise sampled from a Gaussian distribution, introducing more diversity into the algorithm. We conduct our experiments on the Atari benchmark and compare our algorithm to both the original and other related algorithms. The results show that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games. We therefore conclude that replacing priors with noise can improve Bootstrapped Deep Q-Learning's performance by preserving diversity among its heads.
dc.language.iso: eng
dc.publisher: IEEE
dc.title: Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise
dc.type: Peer reviewed
dc.type: Journal article
dc.description.version: acceptedVersion
dc.rights.holder: © 2022 IEEE
dc.subject.nsi: VDP::Teknologi: 500
dc.source.pagenumber: 1-10
dc.source.journal: IEEE Transactions on Games (TG)
dc.identifier.doi: https://doi.org/10.1109/TG.2022.3185330
dc.identifier.cristin: 2110227
cristin.qualitycode: 1
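
The idea described in the abstract can be illustrated with a short sketch. Below is a minimal, hypothetical PyTorch example, not the authors' released code: a bootstrapped Q-network with a shared torso and several heads in which, instead of adding the output of a fixed randomized prior network to each head, zero-mean Gaussian noise is added to the heads' Q-estimates during training. All names, layer sizes, and the noise scale sigma are illustrative assumptions, not values from the paper.

import torch
import torch.nn as nn

class NoisyBootstrappedQNet(nn.Module):
    """Hypothetical sketch of a Bootstrapped DQN variant with Gaussian
    noise in place of random priors; sizes and sigma are assumptions."""

    def __init__(self, obs_dim, n_actions, n_heads=10, hidden=128, sigma=0.1):
        super().__init__()
        self.sigma = sigma
        # Shared feature torso, as in standard Bootstrapped DQN.
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One independent Q-value head per bootstrap member.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, n_actions) for _ in range(n_heads)]
        )

    def forward(self, obs):
        z = self.torso(obs)
        # Stack per-head Q-values into shape (batch, n_heads, n_actions).
        q = torch.stack([head(z) for head in self.heads], dim=1)
        if self.training:
            # Instead of adding a fixed random prior network's output,
            # add freshly sampled zero-mean Gaussian noise to each head's
            # Q-estimates to keep the heads diverse.
            q = q + self.sigma * torch.randn_like(q)
        return q

# Usage: as in Bootstrapped DQN, pick one head per episode and act
# greedily with respect to that head's Q-values.
net = NoisyBootstrappedQNet(obs_dim=8, n_actions=4)
net.eval()  # no noise at evaluation time
obs = torch.randn(1, 8)
head = torch.randint(0, 10, (1,)).item()
action = net(obs)[0, head].argmax().item()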

