Efficient Gaussian process based optimistic knapsack sampling with applications to stochastic resource allocation

Glimsdal, Sondre

dc.contributor.author	Glimsdal, Sondre
dc.date.accessioned	2013-09-24T12:29:37Z
dc.date.available	2013-09-24T12:29:37Z
dc.date.issued	2013
dc.identifier.uri	http://hdl.handle.net/11250/137604
dc.description	Master's thesis in Information and Communication Technology IKT590 2013 – Universitetet i Agder, Grimstad	no_NO
dc.description.abstract	The stochastic non-linear fractional knapsack problem is a challenging optimization problem with numerous applications, including resource allocation. The goal is to find the most valuable mix of materials that fits within a knapsack of fixed capacity. When the value functions of the involved materials are fully known and differentiable, the most valuable mixture can be found by direct application of Lagrange multipliers. However, in many real-world applications, such as web polling, information about material value is uncertain, and in many cases missing altogether. Surprisingly, without prior information about material value, the recently proposed Learning Automata Knapsack Game (LAKG) and Hierarchy of Twofold Resource Allocation Automata (H-TRAA) offer arbitrarily accurate convergence towards the optimal solution, simply by interacting with the knapsack on-line. This paper introduces Gaussian Process based Optimistic Knapsack Sampling (GPOKS), a novel model-based reinforcement learning scheme for solving stochastic fractional knapsack problems, founded on Gaussian Process (GP) enabled Optimistic Thompson Sampling (OTS). Not only does this scheme converge significantly faster than LAKG, GPOKS also incorporates GP based learning of the material values themselves, forming the basis for OTS supported balancing between exploration and exploitation. Using resource allocation in web polling as a proof-of-concept application, our empirical results show that GPOKS consistently outperforms LAKG and H-TRAA, the current top-performers, under a wide variety of parameter settings.	no_NO
dc.language.iso	eng	no_NO
dc.publisher	Universitetet i Agder / University of Agder	no_NO
dc.title	Efficient Gaussian process based optimistic knapsack sampling with applications to stochastic resource allocation	no_NO
dc.type	Master thesis	no_NO
dc.source.pagenumber	61, [8] p.	no_NO

Associated file(s)

File name: Glimsdal, Sondre Oppgave.pdf
Size: 1.820Mb
Format: PDF

This item appears in the following collection(s)

Master's theses in Information and Communication Technology [505]
MM500, IKT590, IKT591