Show simple item record

dc.contributor.advisor: Omlin, Christian Walter Peter
dc.contributor.author: Maree, Charl
dc.date.accessioned: 2023-01-27T14:58:39Z
dc.date.available: 2023-01-27T14:58:39Z
dc.date.created: 2023-01-18T10:37:37Z
dc.date.issued: 2023
dc.identifier.citation: Maree, C. (2023). Affinity-Based Reinforcement Learning: A New Paradigm for Agent Interpretability [Doctoral dissertation]. University of Agder.
dc.identifier.isbn: 978-82-8427-108-8
dc.identifier.issn: 1504-9272
dc.identifier.uri: https://hdl.handle.net/11250/3046928
dc.description.abstract: The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities. Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and reward hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.
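For illustration only, the sketch below shows one way the affinity regularisation described in the abstract could look in code: a policy-gradient loss with a KL-divergence penalty that pulls the learned action distribution toward a prior "affinity" distribution. The function name, the REINFORCE-style return-weighted term, the PyTorch framing, and the weight lam are assumptions made for this sketch, not the dissertation's exact formulation, which is given in the included papers.

    import torch
    import torch.nn.functional as F

    def affinity_regularised_loss(logits, actions, returns, prior_probs, lam=0.1):
        # Hypothetical sketch: a REINFORCE-style policy-gradient term plus a
        # KL(pi || prior) penalty that encourages the agent to act according to
        # a prior action distribution while still maximising returns.
        log_probs = F.log_softmax(logits, dim=-1)                # log pi(a|s), shape (B, A)
        chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
        pg_loss = -(chosen * returns).mean()                     # maximise expected return
        kl = (log_probs.exp() * (log_probs - prior_probs.log())).sum(dim=-1).mean()
        return pg_loss + lam * kl                                # lam trades reward vs. affinity

In such a sketch, prior_probs would encode the desired behaviour (for example, a prototypical policy reflecting a customer's financial personality), and lam would set how strongly the agent's affinity for those actions is enforced relative to reward maximisation.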
dc.language.iso: eng
dc.publisher: University of Agder
dc.relation.ispartofseries: Doctoral Dissertations at the University of Agder; no. 395
dc.relation.haspart: Paper I: Maree, C., Modal, J. E. & Omlin, C. W. (2020). Towards Responsible AI for Financial Transactions. In C. A. Coello (Ed.), IEEE Symposium Series on Computational Intelligence (pp. 16–21). IEEE. https://doi.org/10.1109/SSCI47803.2020.9308456. Accepted version. Full-text is available in AURA as a separate file: https://hdl.handle.net/11250/3046856.
dc.relation.haspart: Paper II: Maree, C. & Omlin, C. W. P. (2021). Clustering in Recurrent Neural Networks for Micro-Segmentation using Spending Personality. In P. Haddow (Ed.), IEEE Symposium Series on Computational Intelligence. IEEE. https://doi.org/10.1109/SSCI50451.2021.9659905. Published version. Full-text is available in AURA as a separate file: https://hdl.handle.net/11250/3046894.
dc.relation.haspart: Paper III: Maree, C. & Omlin, C. W. P. (2022). Understanding Spending Behavior: Recurrent Neural Network Explanation and Interpretation. IEEE Symposium on Computational Intelligence for Financial Engineering and Economics. https://doi.org/10.1109/CIFEr52523.2022.9776210. Accepted version. Full-text is available in AURA as a separate file: .
dc.relation.haspart: Paper IV: Maree, C. & Omlin, C. W. P. (2022). Balancing Profit, Risk, and Sustainability for Portfolio Management. IEEE Symposium on Computational Intelligence for Financial Engineering and Economics. https://doi.org/10.1109/CIFEr52523.2022.9776048. Accepted version. Full-text is available in AURA as a separate file: .
dc.relation.haspart: Paper V: Maree, C. & Omlin, C. W. P. (2022). Reinforcement Learning Your Way: Agent Characterization through Policy Regularization. AI, 3(2), 250–259. https://doi.org/10.3390/ai3020015. Published version. Full-text is available in AURA as a separate file: https://hdl.handle.net/11250/3046910.
dc.relation.haspart: Paper VI: Maree, C. & Omlin, C. W. P. (2022). Can Interpretable Reinforcement Learning Manage Prosperity Your Way? AI, 3(2), 526–537. https://doi.org/10.3390/ai3020030. Published version. Full-text is available in AURA as a separate file: https://hdl.handle.net/11250/3001459.
dc.relation.haspart: Paper VII: Maree, C. & Omlin, C. W. P. (2022). Reinforcement learning with intrinsic affinity for personalized prosperity management. Digital Finance, 4, 241–262. https://doi.org/10.1007/s42521-022-00068-4. Published version. Full-text is available in AURA as a separate file: https://hdl.handle.net/11250/3046916.
dc.relation.haspart: Paper VIII: Maree, C. & Omlin, C. W. P. (Forthcoming). Symbolic Explanation of Affinity-Based Reinforcement Learning Agents with Markov Models. Expert Systems with Applications. https://doi.org/10.48550/arXiv.2208.12627. Submitted version. Full-text is available in AURA as a separate file.
dc.relation.haspart: Paper IX: Vishwanath, A., Bøhn, E. D., Granmo, O.-C., Maree, C. & Omlin, C. W. P. (2022). Towards artificial virtuous agents: games, dilemmas and machine learning. AI and Ethics. https://doi.org/10.1007/s43681-022-00251-8. Published version. Full-text is not available in AURA as a separate file.
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no
dc.title: Affinity-Based Reinforcement Learning: A New Paradigm for Agent Interpretability
dc.type: Doctoral thesis
dc.description.version: publishedVersion
dc.rights.holder: © 2023 Charl Maree
dc.subject.nsi: VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550
dc.source.pagenumber: 182
dc.source.issue: 395
dc.identifier.cristin: 2109163


Associated file(s)


This item appears in the following collection(s)


Attribution-NonCommercial-NoDerivatives 4.0 International
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International