Shah, A. and Gurney, K.N. (2014) Finding minimal action sequences with a simple evaluation of actions. Frontiers in Computational Neuroscience, 8 (151). ISSN 1662-5188
Abstract
Animals are able to discover the minimal number of actions that achieves an outcome (the minimal action sequence). In most accounts of this, actions are associated with a measure of behavior that is higher for actions that lead to the outcome with a shorter action sequence, and learning mechanisms find the actions associated with the highest measure. In this sense, previous accounts focus on more than the simple binary signal of “was the outcome achieved?”; they focus on “how well was the outcome achieved?” However, such mechanisms may not govern all types of behavioral development. In particular, in the process of action discovery (Redgrave and Gurney, 2006), actions are reinforced if they simply lead to a salient outcome because biological reinforcement signals occur too quickly to evaluate the consequences of an action beyond an indication of the outcome’s occurrence. Thus, action discovery mechanisms focus on the simple evaluation of “was the outcome achieved?” and not “how well was the outcome achieved?” Notwithstanding this impoverishment of information, can the process of action discovery find the minimal action sequence? We address this question by implementing computational mechanisms, referred to in this paper as no-cost learning rules, in which each action that leads to the outcome is associated with the same measure of behavior. No-cost rules focus on “was the outcome achieved?” and are consistent with action discovery. No-cost rules discover the minimal action sequence in simulated tasks and execute it for a substantial amount of time. Extensive training, however, results in extraneous actions, suggesting that a separate process (which has been proposed in action discovery) must attenuate learning if no-cost rules participate in behavioral development. We describe how no-cost rules develop behavior, what happens when attenuation is disrupted, and relate the new mechanisms to wider computational and biological context.
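The abstract contrasts rules that learn "how well was the outcome achieved?" with no-cost rules that learn only "was the outcome achieved?". The sketch below is a minimal illustration of that contrast in a toy chain task, not the paper's actual implementation: the task, the target value of 1.0, the softmax action selection, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation) contrasting a "no-cost"
# update -- every action in a sequence that ended in the outcome is credited
# with the same value -- against a conventional discounted update in which
# credit shrinks with distance from the outcome.  The chain task, parameter
# names, and learning rates below are illustrative assumptions.
import math
import random

N_STATES = 6            # states 0..5; the outcome (goal) is at state 5
ACTIONS = (-1, +1)      # step left / step right
ALPHA = 0.1             # learning rate
GAMMA = 0.9             # discount used only by the conventional rule
TEMP = 0.5              # softmax temperature for action selection


def softmax_choice(prefs):
    """Pick an action index with probability proportional to exp(pref / TEMP)."""
    exps = [math.exp(p / TEMP) for p in prefs]
    r = random.uniform(0, sum(exps))
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(prefs) - 1


def run_episode(W, rule):
    """Run one episode and return the list of (state, action-index) pairs visited."""
    s, visited = 0, []
    while s != N_STATES - 1:
        a = softmax_choice(W[s])
        visited.append((s, a))
        s = max(0, min(N_STATES - 1, s + ACTIONS[a]))
    # Outcome achieved: assign credit according to the chosen rule.
    if rule == "no_cost":
        # Binary signal "was the outcome achieved?": every visited action
        # is pushed toward the same target value (here 1.0).
        for (si, ai) in visited:
            W[si][ai] += ALPHA * (1.0 - W[si][ai])
    else:
        # "How well was it achieved?": credit decays with the number of
        # steps remaining until the outcome (a discounted-return flavour).
        T = len(visited)
        for t, (si, ai) in enumerate(visited):
            target = GAMMA ** (T - 1 - t)
            W[si][ai] += ALPHA * (target - W[si][ai])
    return visited


def train(rule, episodes=500):
    W = [[0.0, 0.0] for _ in range(N_STATES)]
    lengths = [len(run_episode(W, rule)) for _ in range(episodes)]
    return W, lengths


if __name__ == "__main__":
    random.seed(0)
    for rule in ("no_cost", "discounted"):
        _, lengths = train(rule)
        print(rule, "mean sequence length over last 100 episodes:",
              sum(lengths[-100:]) / 100.0)
```

Because the no-cost update uses only the binary outcome signal, actions on detours are eventually pushed toward the same value as actions on the direct route, which is in the spirit of the abstract's observation that extensive training produces extraneous actions unless a separate process attenuates learning.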
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Shah, A.; Gurney, K.N. |
| Copyright, Publisher and Additional Information: | © 2014 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
| Keywords: | Action discovery; Dopamine; Intrinsic motivation; Optimal control; Redundancy; Reinforcement learning |
| Dates: | 2014 (publication year) |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Science (Sheffield) > Department of Psychology (Sheffield) |
| Funding Information: | Funder: EPSRC; Grant number: EP/J019534/1 |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 03 Mar 2015 15:15 |
| Last Modified: | 03 Mar 2015 15:15 |
| Published Version: | http://dx.doi.org/10.3389/fncom.2014.00151 |
| Status: | Published |
| Publisher: | Frontiers Research Foundation |
| Refereed: | Yes |
| Identification Number: | 10.3389/fncom.2014.00151 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:83946 |