Shah, A. and Gurney, K.N. (2014) Finding minimal action sequences with a simple evaluation of actions. Frontiers in Computational Neuroscience, 8 (151). ISSN 1662-5188
Abstract
Animals are able to discover the minimal number of actions that achieves an outcome (the minimal action sequence). In most accounts of this, actions are associated with a measure of behavior that is higher for actions that lead to the outcome with a shorter action sequence, and learning mechanisms find the actions associated with the highest measure. In this sense, previous accounts focus on more than the simple binary signal of “was the outcome achieved?”; they focus on “how well was the outcome achieved?” However, such mechanisms may not govern all types of behavioral development. In particular, in the process of action discovery (Redgrave and Gurney, 2006), actions are reinforced if they simply lead to a salient outcome because biological reinforcement signals occur too quickly to evaluate the consequences of an action beyond an indication of the outcome’s occurrence. Thus, action discovery mechanisms focus on the simple evaluation of “was the outcome achieved?” and not “how well was the outcome achieved?” Notwithstanding this impoverishment of information, can the process of action discovery find the minimal action sequence? We address this question by implementing computational mechanisms, referred to in this paper as no-cost learning rules, in which each action that leads to the outcome is associated with the same measure of behavior. No-cost rules focus on “was the outcome achieved?” and are consistent with action discovery. No-cost rules discover the minimal action sequence in simulated tasks and execute it for a substantial amount of time. Extensive training, however, results in extraneous actions, suggesting that a separate process (which has been proposed in action discovery) must attenuate learning if no-cost rules participate in behavioral development. We describe how no-cost rules develop behavior, what happens when attenuation is disrupted, and relate the new mechanisms to wider computational and biological context.
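The abstract contrasts rules that learn "how well was the outcome achieved?" with no-cost rules that learn only "was the outcome achieved?". The sketch below is a minimal illustration of that contrast in a toy chain task, not the paper's actual implementation: the task, the target value of 1.0, the softmax action selection, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation) contrasting a "no-cost"
# update -- every action in a sequence that ended in the outcome is credited
# with the same value -- against a conventional discounted update in which
# credit shrinks with distance from the outcome.  The chain task, parameter
# names, and learning rates below are illustrative assumptions.
import math
import random

N_STATES = 6            # states 0..5; the outcome (goal) is at state 5
ACTIONS = (-1, +1)      # step left / step right
ALPHA = 0.1             # learning rate
GAMMA = 0.9             # discount used only by the conventional rule
TEMP = 0.5              # softmax temperature for action selection


def softmax_choice(prefs):
    """Pick an action index with probability proportional to exp(pref / TEMP)."""
    exps = [math.exp(p / TEMP) for p in prefs]
    r = random.uniform(0, sum(exps))
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(prefs) - 1


def run_episode(W, rule):
    """Run one episode and return the list of (state, action-index) pairs visited."""
    s, visited = 0, []
    while s != N_STATES - 1:
        a = softmax_choice(W[s])
        visited.append((s, a))
        s = max(0, min(N_STATES - 1, s + ACTIONS[a]))
    # Outcome achieved: assign credit according to the chosen rule.
    if rule == "no_cost":
        # Binary signal "was the outcome achieved?": every visited action
        # is pushed toward the same target value (here 1.0).
        for (si, ai) in visited:
            W[si][ai] += ALPHA * (1.0 - W[si][ai])
    else:
        # "How well was it achieved?": credit decays with the number of
        # steps remaining until the outcome (a discounted-return flavour).
        T = len(visited)
        for t, (si, ai) in enumerate(visited):
            target = GAMMA ** (T - 1 - t)
            W[si][ai] += ALPHA * (target - W[si][ai])
    return visited


def train(rule, episodes=500):
    W = [[0.0, 0.0] for _ in range(N_STATES)]
    lengths = [len(run_episode(W, rule)) for _ in range(episodes)]
    return W, lengths


if __name__ == "__main__":
    random.seed(0)
    for rule in ("no_cost", "discounted"):
        _, lengths = train(rule)
        print(rule, "mean sequence length over last 100 episodes:",
              sum(lengths[-100:]) / 100.0)
```

Because the no-cost update uses only the binary outcome signal, actions on detours are eventually pushed toward the same value as actions on the direct route, which is in the spirit of the abstract's observation that extensive training produces extraneous actions unless a separate process attenuates learning.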
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Shah, A.; Gurney, K.N. |
| Copyright, Publisher and Additional Information: | © 2014 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Keywords: | Action discovery; Dopamine; Intrinsic motivation; Optimal control; Redundancy; Reinforcement learning |
| Dates: | 2014 (publication year) |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Science (Sheffield) > Department of Psychology (Sheffield) |
| Funding Information: | Funder: EPSRC; Grant number: EP/J019534/1 |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 03 Mar 2015 15:15 |
| Last Modified: | 03 Mar 2015 15:15 |
| Published Version: | http://dx.doi.org/10.3389/fncom.2014.00151 |
| Status: | Published |
| Publisher: | Frontiers Research Foundation |
| Refereed: | Yes |
| Identification Number: | 10.3389/fncom.2014.00151 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:83946 |