Playing Axis & Allies Revised Using Learning Automata

Lie, Gjermund Karlsen

Master thesis

Permanent link

http://hdl.handle.net/11250/137076

Publication date

2009

Collections

Master's theses in Information and Communication Technology [506]

Abstract

The Artificial Intelligence (AI) of opponents in computer games in general, and in strategy games in particular, has been plagued with performance problems of many kinds since they first appeared. Not the least of these problems is that their design schemes are often based on predefined ways to play the game, making these opponents predictable and dull to a seasoned player.

In this thesis, we propose using Learning Automata (LA) to create opponents that are able to adapt to any game situation and find a good response, much in the way a player would: by looking ahead in time to see what could happen in the game beyond the immediate next move.

As a suitable environment for these LA, we have chosen the game Axis & Allies Revised. A turn-based war game emulating the Second World War, it has many layers of complexity for the LA to struggle with: multiple moves per turn, random outcomes of combat, and highly complex rules. To play this game well, the artificial opponent would need not only to coordinate all its units into the best combined move each turn, but also to avoid performing moves in the present that it would be punished for during the following turns.

To solve these problems, we propose a two-step solution. First, each unit will be assigned its own independent LA. Second, for each possible action that this unit can select in the immediate next turn, another independent LA will be assigned. This process can then be repeated until a sufficient depth into future moves has been achieved. Each tier of LA in this structure will receive its feedback not from its immediate surroundings, but from the status of the next LA down the tree.
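To give a feel for how quickly such a structure grows, the following is a minimal sketch of the tree for a single unit. It is not taken from the thesis: the `LANode` class, the action names, and the simplifying assumption that every tier offers the same list of actions are all hypothetical, and the feedback passed between tiers is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LANode:
    """One learning automaton in the look-ahead tree (hypothetical structure)."""

    actions: List[str]
    # One child automaton per action, covering the following turn.
    children: Dict[str, "LANode"] = field(default_factory=dict)


def build_tree(actions: List[str], depth: int) -> LANode:
    """Build a tree of automata `depth` tiers deep.

    Simplifying assumption: every tier offers the same action list.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    node = LANode(actions=list(actions))
    if depth > 1:
        for action in actions:
            node.children[action] = build_tree(actions, depth - 1)
    return node


def count_automata(node: LANode) -> int:
    """Count this automaton plus every automaton below it."""
    return 1 + sum(count_automata(child) for child in node.children.values())


# Made-up action names; a real unit would have game-specific moves.
unit_tree = build_tree(["attack", "hold", "retreat"], depth=3)
print(count_automata(unit_tree))  # 1 + 3 + 9 = 13
```

Under these assumptions the number of automata per unit grows geometrically with the look-ahead depth, which is one reason decision time becomes a concern.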

In this thesis we lay the foundation for such a solution by implementing the method on a smaller scale, and by carefully testing its performance in a controlled environment. We find which approaches give the best results, which can only perform under certain conditions, and which are suitable for expanding to a larger scale.

The three types of LA chosen for our testing cover most schools of reinforcement learning: the Tsetlin Automaton, with its simple, state-based structure; the Linear Reward-Inaction Automaton, with its linear updating scheme; and finally the Bayesian Learning Automaton, which shapes conjugate distributions in order to determine the optimal action. Each has its own unique strengths and weaknesses, which are recorded in this thesis.
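As an illustration of how these three update schemes differ, here is a minimal sketch using their textbook forms with binary reward/penalty feedback. It is not the thesis implementation: the class names, the learning rate, the uniform Beta(1, 1) priors, and the two-action toy environment are all assumptions made for the example.

```python
import random
from typing import List


class TsetlinAutomaton:
    """Two-action Tsetlin automaton with n states per action."""

    def __init__(self, n: int) -> None:
        self.n = n
        # States 1..n select action 0; states n+1..2n select action 1.
        self.state = n  # start at the boundary, on the action-0 side

    def action(self) -> int:
        return 0 if self.state <= self.n else 1

    def update(self, rewarded: bool) -> None:
        if rewarded:
            # Move deeper into the current action's half.
            if self.action() == 0:
                self.state = max(1, self.state - 1)
            else:
                self.state = min(2 * self.n, self.state + 1)
        else:
            # Move toward the boundary, possibly switching action.
            self.state += 1 if self.action() == 0 else -1


class LinearRewardInaction:
    """L_RI: shift probability toward a rewarded action, ignore penalties."""

    def __init__(self, num_actions: int, rate: float) -> None:
        self.p: List[float] = [1.0 / num_actions] * num_actions
        self.rate = rate

    def choose(self, rng: random.Random) -> int:
        return rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, chosen: int, rewarded: bool) -> None:
        if not rewarded:
            return  # "inaction": penalties leave the probabilities unchanged
        for i in range(len(self.p)):
            if i == chosen:
                self.p[i] += self.rate * (1.0 - self.p[i])
            else:
                self.p[i] *= 1.0 - self.rate


class BayesianAutomaton:
    """Keeps a Beta(alpha, beta) estimate of each action's reward probability."""

    def __init__(self, num_actions: int) -> None:
        self.alpha = [1.0] * num_actions  # assumed uniform prior
        self.beta = [1.0] * num_actions

    def choose(self, rng: random.Random) -> int:
        # Sample one value per action and pick the largest.
        samples = [rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return samples.index(max(samples))

    def update(self, chosen: int, rewarded: bool) -> None:
        if rewarded:
            self.alpha[chosen] += 1.0
        else:
            self.beta[chosen] += 1.0


# Toy environment (made up): action 1 is rewarded more often than action 0.
rng = random.Random(0)
reward_prob = [0.2, 0.8]
lri = LinearRewardInaction(2, rate=0.05)
for _ in range(500):
    a = lri.choose(rng)
    lri.update(a, rng.random() < reward_prob[a])
print(lri.p)  # action probabilities after 500 simulated rounds
```

The Tsetlin automaton stores only an integer state, the L_RI automaton stores one probability per action, and the Bayesian automaton stores two counters per action and has no learning rate to tune.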

Through thorough testing and careful tuning of these automata, we conclude that while LA may in fact have the potential to perform well in almost any type of scenario, their use would still be impractical considering the time spent on deciding on a move. While the decision-making speed of our LA varies, so does their performance, even in our small-scale testing.

Nevertheless, we believe that our results should give some insight into the possibilities and benefits, both in performance and design simplicity, of using LA as the decision maker for artificial players.

Description

Master's thesis in Information and Communication Technology 2009 – University of Agder, Grimstad

Publisher

University of Agder