Luna Gutierrez, R and Leonetti, M orcid.org/0000-0002-3831-2400 (2020) Information-theoretic Task Selection for Meta-Reinforcement Learning. In: Advances in Neural Information Processing Systems 33 (NeurIPS 2020). 34th Conference on Neural Information Processing Systems (NeurIPS 2020), 06-12 Dec 2020, Online. NeurIPS
Abstract
In Meta-Reinforcement Learning (meta-RL), an agent is trained on a set of tasks to prepare for, and learn faster in, new, unseen, but related tasks. The training tasks are usually hand-crafted to be representative of the expected distribution of target tasks, and hence are all used in training. We show that, given a set of training tasks, learning can be both faster and more effective (leading to better performance in the target tasks) if the training tasks are appropriately selected. We propose a task selection algorithm based on information theory, which optimizes the set of tasks used for training in meta-RL, irrespective of how they are generated. The algorithm establishes which training tasks are both sufficiently relevant to the target tasks and different enough from one another. We reproduce different meta-RL experiments from the literature and show that our task selection algorithm improves the final performance in all of them.
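The two selection criteria named in the abstract (relevance to the target tasks, and difference from tasks already chosen) can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given in this record. It assumes that each training task is summarized by a tabular policy (one action distribution per state), that difference is measured by the mean KL divergence between policies, and that relevance is supplied as a precomputed score. The names `mean_kl`, `select_tasks`, `min_difference`, and `min_relevance` are hypothetical.

```python
import numpy as np


def mean_kl(p, q, tiny=1e-12):
    """Average KL(p || q) over states.

    p and q are arrays of shape (n_states, n_actions) whose rows are
    action distributions. Values are clipped to avoid log(0).
    """
    p = np.clip(np.asarray(p, dtype=float), tiny, 1.0)
    q = np.clip(np.asarray(q, dtype=float), tiny, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=1)))


def select_tasks(policies, relevance, min_difference, min_relevance):
    """Greedily keep tasks that are relevant and not redundant.

    policies:       list of (n_states, n_actions) arrays, one per task
                    (assumed representation, not taken from the paper).
    relevance:      one precomputed relevance score per task (assumption).
    min_difference: a candidate is kept only if its mean KL divergence
                    from every already selected task is at least this.
    min_relevance:  a candidate is skipped if its score is below this.
    Returns the indices of the selected tasks, in input order.
    """
    selected = []
    for i, policy in enumerate(policies):
        if relevance[i] < min_relevance:
            continue  # not relevant enough for the target tasks
        if all(mean_kl(policy, policies[j]) >= min_difference for j in selected):
            selected.append(i)  # different enough from all kept tasks
    return selected


if __name__ == "__main__":
    task_a = [[0.9, 0.1], [0.9, 0.1]]
    task_b = [[0.9, 0.1], [0.9, 0.1]]  # duplicate of task_a
    task_c = [[0.1, 0.9], [0.1, 0.9]]  # clearly different from task_a
    task_d = [[0.5, 0.5], [0.5, 0.5]]  # given a low relevance score below
    chosen = select_tasks(
        [task_a, task_b, task_c, task_d],
        relevance=[1.0, 1.0, 1.0, 0.0],
        min_difference=0.1,
        min_relevance=0.5,
    )
    print(chosen)
```

In this toy input the duplicate task is dropped for being redundant and the last task is dropped for low relevance, so only the first and third tasks remain. The thresholds are illustrative values, not ones reported in the paper.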
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Luna Gutierrez, R; Leonetti, M (orcid.org/0000-0002-3831-2400) |
Copyright, Publisher and Additional Information: | This is an author produced version of a conference paper published in NeurIPS Proceedings. Uploaded in accordance with the publisher's self-archiving policy. |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 03 Nov 2020 12:29 |
Last Modified: | 08 Apr 2024 08:34 |
Published Version: | https://papers.nips.cc/paper_files/paper/2020/hash... |
Status: | Published |
Publisher: | NeurIPS |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:167448 |