Devlin, Sam orcid.org/0000-0002-7769-3090 and Kudenko, Daniel orcid.org/0000-0003-3359-3255 (2011) Theoretical Considerations of Potential-Based Reward Shaping for Multi-Agent Systems. In: The 10th International Conference on Autonomous Agents and Multiagent Systems. Tenth International Conference on Autonomous Agents and Multi-Agent Systems, 02 May 2011 ACM, TWN, pp. 225-232.
Abstract
Potential-based reward shaping has previously been proven to both be equivalent to Q-table initialisation and guarantee policy invariance in single-agent reinforcement learning. The method has since been used in multi-agent reinforcement learning without consideration of whether the theoretical equivalence and guarantees hold. This paper extends the existing proofs to similar results in multi-agent systems, providing the theoretical background to explain the success of previous empirical studies. Specifically, it is proven that the equivalence to Q-table initialisation remains and the Nash Equilibria of the underlying stochastic game are not modified. Furthermore, we demonstrate empirically that potential-based reward shaping affects exploration and, consequentially, can alter the joint policy converged upon.
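The equivalence to Q-table initialisation mentioned in the abstract can be illustrated with a minimal single-agent sketch. It is not the paper's code or experimental setup. It assumes the standard potential-based shaping term F(s, s') = γΦ(s') − Φ(s) and tabular Q-learning; the four-state environment, the random transitions and rewards, the potential `phi`, and the names `q_update` and `run_demo` are all hypothetical choices made for illustration. Two learners see the same experience: one receives the shaped reward, the other starts from a Q-table initialised with Φ(s) and receives the unshaped reward.

```python
import random


def q_update(q, s, a, r, s_next, actions, alpha, gamma):
    """Apply one tabular Q-learning update to q in place."""
    best_next = max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def run_demo(steps=500, seed=0, alpha=0.1, gamma=0.9):
    """Return the largest deviation from Q_init(s, a) = Q_shaped(s, a) + phi(s)."""
    rng = random.Random(seed)
    states = [0, 1, 2, 3]   # hypothetical toy state space
    actions = [0, 1]        # hypothetical toy action space
    phi = {s: float(s) for s in states}  # hypothetical potential function

    # Learner 1: zero-initialised Q-table, trained on shaped rewards.
    q_shaped = {(s, a): 0.0 for s in states for a in actions}
    # Learner 2: Q-table initialised to phi(s), trained on unshaped rewards.
    q_init = {(s, a): phi[s] for s in states for a in actions}

    s = 0
    for _ in range(steps):
        # Both learners are fed the same experience (random toy dynamics).
        a = rng.choice(actions)
        s_next = rng.choice(states)
        r = rng.random()
        shaping = gamma * phi[s_next] - phi[s]  # F(s, s')
        q_update(q_shaped, s, a, r + shaping, s_next, actions, alpha, gamma)
        q_update(q_init, s, a, r, s_next, actions, alpha, gamma)
        s = s_next

    return max(abs(q_init[k] - q_shaped[k] - phi[k[0]]) for k in q_shaped)
```

Under these assumptions the two Q-tables differ only by Φ(s) at every step, so `run_demo()` returns a value at floating-point noise level and both learners rank actions identically in each state. The multi-agent version of this argument is the subject of the paper itself and is not reproduced here.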
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Devlin, Sam; Kudenko, Daniel |
| Dates: | |
| Institution: | The University of York |
| Academic Units: | The University of York > Faculty of Sciences (York) > Computer Science (York) |
| Date Deposited: | 19 Feb 2013 16:23 |
| Last Modified: | 01 Nov 2025 00:05 |
| Status: | Published |
| Publisher: | ACM |
| Related URLs: | |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:75111 |
Downloads
Filename: aamas11.pdf
Filename: p225_devlin.pdf
Description: p225-devlin
