Generating Levels and Playing Super Mario Bros. with Deep Reinforcement Learning Using various techniques for level generation and Deep Q-Networks for playing
Master thesis
Permanent lenke
https://hdl.handle.net/11250/2683046Utgivelsesdato
2020Metadata
Vis full innførselSamlinger
Originalversjon
Engelsvoll, R. N., Gammelsrød, A. & Thoresen, B. I. S. (2020) Generating Levels and Playing Super Mario Bros. with Deep Reinforcement Learning Using various techniques for level generation and Deep Q-Networks for playing (Master's thesis). University of Agder, GrimstadSammendrag
This thesis aims to explore the behavior of two competing reinforcement learning agents in Super Mario Bros. In video games, PCG can be used to assist human game designers by generating a particular aspect of the game. A human game designer can use generated game content as inspiration to build further upon, which saves time and resources. Much research has been conducted on AI in video games, including AI for playing Super Mario Bros. Additionally, there exists a research field focused on PCG for video games, which includes generation of Super Mario Bros. levels. In this thesis, the two fields of research are combined to form a GAN-inspired system of two competing AI agents. One agent is controlling Mario, and this agent represents the discriminator. The other agent generates the level Mario is playing, and represents the generator. In an ordinary GAN system, the generator is attempting to mimic a database containing real data, while the discriminator attempts to distinguish real data samples from the generated data samples. The Mario agent utilizes a DQN algorithm for learning to navigate levels, while the level generator uses a DQN-based algorithm with different types of neural networks. The DQN algorithm utilizes neural networks to predict the expected future reward for each possible action. The expected future rewards are denoted as Q-values. The results show that the generator is capable of generating content better than random when the generator model takes a sequence of tiles as input and produces a sequence of predictions of Q-values as output.
Beskrivelse
Master's thesis in Information- and communication technology (IKT590)