Show simple item record

dc.contributor.author: Zhang, Xuan
dc.contributor.author: Granmo, Ole-Christoffer
dc.contributor.author: Oommen, B. John
dc.date.accessioned: 2011-11-21T14:08:50Z
dc.date.available: 2011-11-21T14:08:50Z
dc.date.issued: 2011
dc.identifier.citation: Zhang, X., Granmo, O.-C., & Oommen, B. J. (2011). The Bayesian pursuit algorithm: A new family of estimator learning automata. In K. Mehrotra, C. Mohan, J. Oh, P. Varshney & M. Ali (Eds.), Modern Approaches in Applied Intelligence (Vol. 6704, pp. 522-531): Springer.
dc.identifier.isbn: 978-3-642-21826-2
dc.identifier.uri: http://hdl.handle.net/11250/137884
dc.description: Published version of a chapter in the book: Modern Approaches in Applied Intelligence. Also available from the publisher at http://dx.doi.org/10.1007/978-3-642-21827-9_53
dc.description.abstract: The fastest Learning Automata (LA) algorithms currently available come from the family of estimator algorithms. The Pursuit algorithm (PST), a pioneering scheme in the estimator family, obtains its superior learning speed by using Maximum Likelihood (ML) estimates to pursue the action currently perceived as being optimal. Recently, a Bayesian LA (BLA) was introduced, and empirical results demonstrating its advantages over established top performers, including the PST scheme, were reported. The BLA scheme is inherently Bayesian in nature, yet it avoids computational intractability by merely updating the hyper-parameters of sibling conjugate priors and sampling randomly from the resulting posteriors. In this paper, we integrate the foundational learning principles motivating the design of the BLA with the principles of the PST. By doing this, we have obtained a completely novel, and rather pioneering, approach to solving LA-like problems, namely, the Bayesian Pursuit algorithm (BPST). As in the BLA, the estimates are truly Bayesian (as opposed to ML) in nature. However, the action selection probability vector of the PST is used for exploration purposes. Also, unlike the ML estimate, which is usually a single value, the use of a posterior distribution permits us to choose any one of a spectrum of values in the posterior as the appropriate estimate. Thus, in this paper, we have chosen the 95th percentile of the posterior (instead of the mean) to pursue the most promising actions. Further, as advocated in [7], the pursuit has been done using both the Linear Reward-Penalty and Reward-Inaction philosophies, leading to the corresponding BPST_RP and BPST_RI schemes, respectively. It turns out that the BPST is superior to the PST, with the BPST_RI being even more robust than the BPST_RP. Moreover, by controlling the learning speed of the BPST, the BPST schemes perform either better than, or comparably to, the BLA. We thus believe that the BPST constitutes a new avenue of research in which the performance benefits of the PST and the BLA are mutually augmented, opening up improved performance in a number of applications currently being tested.
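The abstract describes the scheme concretely enough to sketch: maintain a Beta conjugate prior per action, update its hyper-parameters on each reward/penalty, and pursue the action whose posterior 95th percentile is highest, updating the PST-style selection probability vector only on rewarded steps (the Reward-Inaction variant). The following is a minimal illustrative sketch, not the authors' implementation: the function names, the Monte Carlo percentile approximation, the simulated Bernoulli environment, and all parameter values are assumptions chosen for demonstration.

```python
import random

def beta_percentile(a, b, q=0.95, n=500):
    """Monte Carlo estimate of the q-quantile of Beta(a, b) (illustrative stand-in
    for an exact inverse-CDF computation)."""
    samples = sorted(random.betavariate(a, b) for _ in range(n))
    return samples[int(q * n)]

def bpst_ri(reward_probs, steps=600, lam=0.1, seed=0):
    """Sketch of a Bayesian Pursuit scheme with Reward-Inaction updating (BPST_RI).

    reward_probs holds the true (unknown) reward probability of each action;
    it is used here only to simulate the environment's feedback.
    """
    random.seed(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r          # PST-style action selection probability vector
    a = [1] * r                # Beta hyper-parameters: rewards observed + 1
    b = [1] * r                # Beta hyper-parameters: penalties observed + 1
    for _ in range(steps):
        # Explore by sampling an action from the selection probability vector.
        i = random.choices(range(r), weights=p)[0]
        rewarded = random.random() < reward_probs[i]
        # Bayesian update of the chosen action's conjugate prior.
        if rewarded:
            a[i] += 1
        else:
            b[i] += 1
        # Pursue the action whose posterior 95th percentile is highest.
        best = max(range(r), key=lambda j: beta_percentile(a[j], b[j]))
        if rewarded:  # Reward-Inaction: adjust p only on rewarded steps
            p = [(1 - lam) * pj for pj in p]
            p[best] += lam
    return p

probs = bpst_ri([0.9, 0.6, 0.4])
# The learned vector should concentrate on action 0 (true reward rate 0.9).
print(probs)
```

Using the upper percentile rather than the posterior mean makes under-explored actions (with wide posteriors) look promising, so the pursuit retains an exploratory bias until the posteriors sharpen.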
dc.language.iso: eng
dc.publisher: Springer
dc.relation.ispartofseries: Lecture Notes in Computer Science; 6704
dc.subject: estimator algorithms, learning automata, pursuit algorithm, Bayesian learning automata, Bayesian pursuit algorithm, Beta distribution
dc.title: The Bayesian pursuit algorithm: A new family of estimator learning automata
dc.type: Chapter
dc.type: Peer reviewed
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550
dc.source.pagenumber: 522-531


Associated file(s)


This item appears in the following collection(s)
