A biologically plausible embodied model of action discovery

Abstract

During development, animals can spontaneously discover action-outcome pairings enabling subsequent achievement of their goals. We present a biologically plausible embodied model addressing key aspects of this process. The biomimetic model core comprises the basal ganglia and its loops through cortex and thalamus. We incorporate reinforcement learning (RL) with phasic dopamine supplying a sensory prediction error, signalling “surprising” outcomes. Phasic dopamine is used in a cortico-striatal learning rule which is consistent with recent data. We also hypothesized that objects associated with surprising outcomes acquire “novelty salience” contingent on the predictability of the outcome. To test this idea, we used a simple model of prediction governing the dynamics of novelty salience and phasic dopamine. The task of the virtual robotic agent mimicked an in vivo counterpart (Gancarz et al., 2011) and involved interaction with a target object which caused a light flash, or a control object which did not. Learning took place according to two schedules. In one, the phasic outcome was delivered after interaction with the target in an unpredictable way which emulated the in vivo protocol. Without novelty salience, the model was unable to account for the experimental data. In the other schedule, the phasic outcome was reliably delivered and the agent showed a rapid increase in the number of interactions with the target, which then decreased over subsequent sessions. We argue this is precisely the kind of change in behavior required to repeatedly present representations of context, action and outcome to the neural networks responsible for learning action-outcome contingency. The model also showed cortico-striatal plasticity consistent with learning a new action in basal ganglia. We conclude that action learning is underpinned by a complex interplay of plasticity and stimulus salience, and that our model contains many of the elements required for biological action discovery to take place.

Metadata

Item Type:	Article
Authors/Creators:	Bolado-Gomez, R.; Gurney, K. (https://orcid.org/0000-0003-4771-728X)
Copyright, Publisher and Additional Information:	© 2013 Bolado-Gomez and Gurney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/3.0/), which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
Keywords:	phasic dopamine; basal ganglia; reinforcement learning; synaptic plasticity; intrinsic motivation; action selection; operant behavior
Dates:	Accepted: 20 February 2013 Published: 12 March 2013
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Science (Sheffield) > Department of Psychology (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	27 Jun 2016 14:26
Last Modified:	27 Jun 2016 14:26
Published Version:	http://dx.doi.org/10.3389/fnbot.2013.00004
Status:	Published
Publisher:	Frontiers
Refereed:	Yes
Identification Number:	10.3389/fnbot.2013.00004
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:85402
