
dc.contributor.author: Berg, Stian
dc.date.accessioned: 2010-12-06T13:52:28Z
dc.date.available: 2010-12-06T13:52:28Z
dc.date.issued: 2010
dc.identifier.uri: http://hdl.handle.net/11250/137504
dc.description: Master's thesis in information and communication technology, 2010, University of Agder, Grimstad (en_US)
dc.description.abstract: Multi-armed bandit problems have been the subject of extensive research in computer science because they capture the fundamental dilemma of exploration versus exploitation in reinforcement learning. The goal in a bandit problem is to determine the optimal balance between gaining new information (exploration) and maximizing immediate reward (exploitation). Dynamic bandit problems are especially challenging because they involve changing environments. Combined with game theory, which analyzes the behavior of agents in multi-agent settings, bandit problems serve as a framework for benchmarking the applicability of learning algorithms in various situations. In this thesis, we investigate a novel approach to the multi-armed bandit problem, the Kalman Bayesian Learning Automaton, an algorithm that applies concepts from Kalman filtering, a powerful technique for probabilistic reasoning over time. To determine the effectiveness of this approach, we have conducted an empirical study of the Kalman Bayesian Learning Automaton in dynamic multi-armed bandit problems and selected games from game theory. Specifically, we evaluate the performance of the Kalman Bayesian Learning Automaton in randomly changing environments, switching environments, the Goore game, the Prisoner's Dilemma and zero-sum games. The scalability and robustness of the algorithm are also examined. Our results show that the strength of the Kalman Bayesian Learning Automaton lies in its excellent tracking ability, and that it is among the top performers in all experiments. Unfortunately, it depends on parameter tuning. We believe further work on the approach could solve the parameter problem, but even with the need to tune parameters we consider the Kalman Bayesian Learning Automaton a strong solution to dynamic multi-armed bandit problems, with the potential to be applied in a variety of applications and multi-agent settings. (en_US)
dc.language.iso: eng (en_US)
dc.publisher: University of Agder (en_US)
dc.title: Solving dynamic bandit problems and decentralized games using the Kalman Bayesian Learning Automaton (en_US)
dc.type: Master thesis (en_US)
dc.source.pagenumber: 129 (en_US)

