Altahhan, A orcid.org/0000-0003-1133-7744 (2018) TD(0)-Replay: An Efficient Model-Free Planning with full Replay. In: 2018 International Joint Conference on Neural Networks (IJCNN). 2018 International Joint Conference on Neural Networks (IJCNN), 08-13 Jul 2018, Rio de Janeiro, Brazil. IEEE ISBN 978-1-5090-6015-3
Abstract
In this paper we present a novel reinforcement learning method that allows full replay of all past experience at every step of a reinforcement learning agent's life with reasonable overhead. In particular, we show how to derive an efficient, equivalent backward view of replaying the full past experience online using the TD(0) error for a linear model. We call the resulting methods TD(0)-Replay and Sarsa(0)-Replan. We emphasise the already established link between replaying and planning in our algorithm design by comparing it with an Extensive Dyna Planning algorithm, and show that our method can outperform this expensive form of planning. We test the new methods on two problem domains: Random Walk, to test the prediction capabilities of TD(0)-Replay, and Dyna Maze, to test the planning capabilities of Sarsa(0)-Replan. In both, our algorithms dominate other replaying and planning methods.
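To make the cost that the abstract refers to concrete, the following is a minimal sketch of the naive forward view: after every step, the agent re-applies a linear TD(0) update to every stored transition. This is not the paper's efficient backward view, which is not given in this record. The function names, the five-state Random Walk layout, the one-hot features, and the step size are all assumptions chosen for illustration.

```python
import random

def td0_update(w, x, r, x_next, alpha, gamma):
    """One linear TD(0) update: w <- w + alpha * delta * x."""
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = sum(wi * xi for wi, xi in zip(w, x_next))
    delta = r + gamma * v_next - v  # TD(0) error
    return [wi + alpha * delta * xi for wi, xi in zip(w, x)]

def one_hot(s, n):
    # Terminal states (s outside 0..n-1) map to the zero vector,
    # so their estimated value is always 0.
    return [1.0 if i == s else 0.0 for i in range(n)]

def naive_full_replay(n_states=5, episodes=20, alpha=0.05, gamma=1.0, seed=0):
    """Naive full replay on an assumed Random Walk (illustrative baseline only).

    Each step stores the new transition and then replays the whole memory,
    so the work per step grows linearly with the number of steps seen so far.
    """
    rng = random.Random(seed)
    w = [0.0] * n_states
    memory = []  # every transition ever observed
    for _ in range(episodes):
        s = n_states // 2  # start in the middle state
        while 0 <= s < n_states:
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == n_states else 0.0  # reward only on right exit
            memory.append((one_hot(s, n_states), r, one_hot(s_next, n_states)))
            for x, rew, x_next in memory:  # full replay of past experience
                w = td0_update(w, x, rew, x_next, alpha, gamma)
            s = s_next
    return w

if __name__ == "__main__":
    print(naive_full_replay())
```

The inner loop over `memory` is the overhead that an equivalent backward view would need to avoid: here each step costs time proportional to the length of the agent's history.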
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Altahhan, A (orcid.org/0000-0003-1133-7744) |
Copyright, Publisher and Additional Information: | © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Keywords: | Planning, Prediction algorithms, Complexity theory, Learning (artificial intelligence), Task analysis, Neural networks, Mathematical model |
Dates: |
|
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 24 Nov 2020 12:28 |
Last Modified: | 24 Nov 2020 12:28 |
Status: | Published |
Publisher: | IEEE |
Identification Number: | 10.1109/ijcnn.2018.8489300 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:168207 |