Theoretical Considerations of Potential-Based Reward Shaping for Multi-Agent Systems

Devlin, Sam orcid.org/0000-0002-7769-3090 and Kudenko, Daniel orcid.org/0000-0003-3359-3255 (2011) Theoretical Considerations of Potential-Based Reward Shaping for Multi-Agent Systems. In: The 10th International Conference on Autonomous Agents and Multiagent Systems. Tenth International Conference on Autonomous Agents and Multi-Agent Systems, 02 May 2011. ACM, TWN, pp. 225-232.

Abstract

Potential-based reward shaping has previously been proven both to be equivalent to Q-table initialisation and to guarantee policy invariance in single-agent reinforcement learning. The method has since been used in multi-agent reinforcement learning without consideration of whether the theoretical equivalence and guarantees hold. This paper extends the existing proofs to similar results in multi-agent systems, providing the theoretical background to explain the success of previous empirical studies. Specifically, it is proven that the equivalence to Q-table initialisation remains and the Nash Equilibria of the underlying stochastic game are not modified. Furthermore, we demonstrate empirically that potential-based reward shaping affects exploration and, consequently, can alter the joint policy converged upon.
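
As a rough illustration of the single-agent result that the abstract builds on, the sketch below compares two tabular Q-learners fed the same experience: one starts from zero and receives the potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s) added to its reward, the other receives no shaping but has its Q-table initialised to Phi(s). The chain environment, the potential function, the uniform-random behaviour policy, and all names here are assumptions made for illustration and are not taken from the paper, which treats the multi-agent case.

```python
import random

GAMMA = 0.9          # discount factor (assumed value)
ALPHA = 0.5          # learning rate (assumed value)
N_STATES = 5         # hypothetical chain of states 0..4, goal at state 4
ACTIONS = (0, 1)     # 0 = move left, 1 = move right


def potential(state):
    """Hypothetical potential Phi(s): larger when closer to the goal."""
    return float(state)


def shaping_reward(state, next_state):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return GAMMA * potential(next_state) - potential(state)


def step(state, action):
    """Toy chain dynamics: reward 1 when the right-most state is reached."""
    move = 1 if action == 1 else -1
    next_state = max(0, min(N_STATES - 1, state + move))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_update(q, s, a, r, s2):
    """Standard one-step Q-learning update, applied in place."""
    q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])


def run(n_steps=200, seed=0):
    """Train both learners on an identical stream of transitions."""
    rng = random.Random(seed)
    # Learner 1: zero-initialised, receives the shaped reward.
    q_shaped = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    # Learner 2: unshaped reward, Q-table initialised to Phi(s).
    q_init = [[potential(s) for _ in ACTIONS] for s in range(N_STATES)]
    s = 0
    for _ in range(n_steps):
        a = rng.choice(ACTIONS)  # same random behaviour for both learners
        s2, r = step(s, a)
        q_update(q_shaped, s, a, r + shaping_reward(s, s2), s2)
        q_update(q_init, s, a, r, s2)
        # Restart from state 0 after reaching the goal; no update on reset.
        s = 0 if s2 == N_STATES - 1 else s2
    return q_shaped, q_init


if __name__ == "__main__":
    q_shaped, q_init = run()
    for s in range(N_STATES):
        for a in ACTIONS:
            gap = q_init[s][a] - (q_shaped[s][a] + potential(s))
            print(s, a, round(gap, 12))
```

Under these assumptions the two tables should differ by exactly Phi(s) in every entry (up to floating-point rounding) after every update, which is the sense in which shaping and initialisation are equivalent. The sketch says nothing about the multi-agent setting or about the exploration effects the abstract reports.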

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Devlin, Sam https://orcid.org/0000-0002-7769-3090; Kudenko, Daniel https://orcid.org/0000-0003-3359-3255
Dates:	Published: May 2011
Institution:	The University of York
Academic Units:	The University of York > Faculty of Sciences (York) > Computer Science (York)
Date Deposited:	19 Feb 2013 16:23
Last Modified:	24 Jun 2026 23:20
Status:	Published
Publisher:	ACM
Related URLs:	http://www.ifaamas.org/Proceedings/aamas...
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:75111