Reducing the Planning Horizon through Reinforcement Learning

Dunbar, L orcid.org/0000-0001-7486-7684, Rosman, B, Cohn, AG et al. (1 more author) (2023) Reducing the Planning Horizon through Reinforcement Learning. In: Machine Learning and Knowledge Discovery in Databases. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 19-23 Sep 2022, Grenoble, France. Lecture Notes in Computer Science. Springer, pp. 68-83. ISBN: 978-3-031-26411-5 ISSN: 0302-9743 EISSN: 1611-3349

Abstract

Planning is a computationally expensive process, which can limit the reactivity of autonomous agents. Planning problems are usually solved in isolation, independently of similar, previously solved problems. The depth of search that a planner requires to find a solution, known as the planning horizon, is a critical factor when integrating planners into reactive agents. We consider the case of an agent repeatedly carrying out a task from different initial states. We propose a combination of classical planning and model-free reinforcement learning to reduce the planning horizon over time. Control is smoothly transferred from the planner to the model-free policy as the agent compiles the planner’s policy into a value function. Local exploration of the model-free policy allows the agent to adapt to the environment and eventually overcome model inaccuracies. We evaluate the efficacy of our framework on symbolic PDDL domains and a stochastic grid world environment and show that we are able to significantly reduce the planning horizon while improving upon model inaccuracies.

Metadata

Item Type:	Proceedings Paper
Authors/Creators:	Dunbar, L (https://orcid.org/0000-0001-7486-7684); Rosman, B; Cohn, AG; Leonetti, M
Copyright, Publisher and Additional Information:	© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG. This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/978-3-031-26412-2_5
Keywords:	Planning; Planning horizon; Reinforcement learning
Dates:	Accepted: 14 June 2022; Published (online): 17 March 2023; Published: 17 March 2023
Institution:	The University of Leeds
Academic Units:	The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds)
Depositing User:	Symplectic Publications
Date Deposited:	29 Jul 2022 15:11
Last Modified:	17 Mar 2024 01:13
Status:	Published
Publisher:	Springer
Series Name:	Lecture Notes in Computer Science
Identification Number:	10.1007/978-3-031-26412-2_5
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:189511
