Show simple item record

dc.contributor.author: Zhang, Xuan
dc.contributor.author: Granmo, Ole-Christoffer
dc.contributor.author: Oommen, B. John
dc.date.accessioned: 2011-11-21T14:08:50Z
dc.date.available: 2011-11-21T14:08:50Z
dc.date.issued: 2011
dc.identifier.citation: Zhang, X., Granmo, O.-C., & Oommen, B. J. (2011). The Bayesian pursuit algorithm: A new family of estimator learning automata. In K. Mehrotra, C. Mohan, J. Oh, P. Varshney & M. Ali (Eds.), Modern Approaches in Applied Intelligence (Vol. 6704, pp. 522-531): Springer.
dc.identifier.isbn: 978-3-642-21826-2
dc.identifier.uri: http://hdl.handle.net/11250/137884
dc.description: Published version of a chapter in the book: Modern Approaches in Applied Intelligence. Also available from the publisher at http://dx.doi.org/10.1007/978-3-642-21827-9_53
dc.description.abstract: The fastest Learning Automata (LA) algorithms currently available come from the family of estimator algorithms. The Pursuit algorithm (PST), a pioneering scheme in the estimator family, obtains its superior learning speed by using Maximum Likelihood (ML) estimates to pursue the action currently perceived as being optimal. Recently, a Bayesian LA (BLA) was introduced, and empirical results demonstrating its advantages over established top performers, including the PST scheme, were reported. The BLA scheme is inherently Bayesian in nature, yet it avoids computational intractability by merely updating the hyper-parameters of sibling conjugate priors and sampling randomly from the resulting posteriors. In this paper, we integrate the foundational learning principles motivating the design of the BLA with the principles of the PST. By doing this, we have obtained a completely novel, and rather pioneering, approach to solving LA-like problems, namely, the Bayesian Pursuit algorithm (BPST). As in the BLA, the estimates are truly Bayesian (as opposed to ML) in nature. However, the action selection probability vector of the PST is used for exploration purposes. Also, unlike the ML estimate, which is usually a single value, the use of a posterior distribution permits us to choose any one of a spectrum of values in the posterior as the appropriate estimate. Thus, in this paper, we have chosen the 95th percentile of the posterior (instead of the mean) to pursue the most promising actions. Further, as advocated in [7], the pursuit has been done using both the Linear Reward-Penalty and Reward-Inaction philosophies, leading to the corresponding BPST_RP and BPST_RI schemes, respectively. It turns out that the BPST is superior to the PST, with the BPST_RI being even more robust than the BPST_RP. Moreover, by controlling the learning speed of the BPST, the BPST schemes perform either better than, or comparably to, the BLA. We thus believe that the BPST constitutes a new avenue of research in which the performance benefits of the PST and the BLA are mutually augmented, opening up improved performance in a number of applications currently being tested.
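The abstract describes the scheme concretely enough to sketch: maintain a Beta conjugate prior per action, update its hyper-parameters on each reward/penalty, and pursue the action whose posterior 95th percentile is highest, updating the PST-style selection probability vector only on rewarded steps (the Reward-Inaction variant). The following is a minimal illustrative sketch, not the authors' implementation: the function names, the Monte Carlo percentile approximation, the simulated Bernoulli environment, and all parameter values are assumptions chosen for demonstration.

```python
import random

def beta_percentile(a, b, q=0.95, n=500):
    """Monte Carlo estimate of the q-quantile of Beta(a, b) (illustrative stand-in
    for an exact inverse-CDF computation)."""
    samples = sorted(random.betavariate(a, b) for _ in range(n))
    return samples[int(q * n)]

def bpst_ri(reward_probs, steps=600, lam=0.1, seed=0):
    """Sketch of a Bayesian Pursuit scheme with Reward-Inaction updating (BPST_RI).

    reward_probs holds the true (unknown) reward probability of each action;
    it is used here only to simulate the environment's feedback.
    """
    random.seed(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r          # PST-style action selection probability vector
    a = [1] * r                # Beta hyper-parameters: rewards observed + 1
    b = [1] * r                # Beta hyper-parameters: penalties observed + 1
    for _ in range(steps):
        # Explore by sampling an action from the selection probability vector.
        i = random.choices(range(r), weights=p)[0]
        rewarded = random.random() < reward_probs[i]
        # Bayesian update of the chosen action's conjugate prior.
        if rewarded:
            a[i] += 1
        else:
            b[i] += 1
        # Pursue the action whose posterior 95th percentile is highest.
        best = max(range(r), key=lambda j: beta_percentile(a[j], b[j]))
        if rewarded:  # Reward-Inaction: adjust p only on rewarded steps
            p = [(1 - lam) * pj for pj in p]
            p[best] += lam
    return p

probs = bpst_ri([0.9, 0.6, 0.4])
# The learned vector should concentrate on action 0 (true reward rate 0.9).
print(probs)
```

Using the upper percentile rather than the posterior mean makes under-explored actions (with wide posteriors) look promising, so the pursuit retains an exploratory bias until the posteriors sharpen.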
dc.language.iso: eng
dc.publisher: Springer
dc.relation.ispartofseries: Lecture Notes in Computer Science; 6704
dc.subject: estimator algorithms, learning automata, pursuit algorithm, Bayesian learning automata, Bayesian pursuit algorithm, Beta distribution
dc.title: The Bayesian pursuit algorithm: A new family of estimator learning automata
dc.type: Chapter
dc.type: Peer reviewed
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550
dc.source.pagenumber: 522-531


Associated file(s)


This item appears in the following collection(s)
