Learning the sound inventory of a complex vocal skill via an intrinsic reward

Abstract

Reinforcement learning (RL) is thought to underlie the acquisition of vocal skills like birdsong and speech, where sounding like one’s “tutor” is rewarding. However, what RL strategy generates the rich sound inventories for song or speech? We find that the standard actor-critic model of birdsong learning fails to explain juvenile zebra finches’ efficient learning of multiple syllables. When we replace the single actor with multiple independent actors that jointly maximize a common intrinsic reward, the birds’ empirical learning trajectories are accurately reproduced. The influence of each actor (syllable) on the magnitude of the global reward is competitively determined by its acoustic similarity to target syllables. This leads to each actor matching the target it is closest to and, occasionally, to the competitive exclusion of an actor from the learning process (i.e., from the learned song). We propose that a competitive-cooperative multi-actor RL (MARL) algorithm is key for the efficient learning of the action inventory of a complex skill.
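
The multi-actor idea summarized above can be illustrated with a toy sketch. It is not the authors' model: the names `global_reward` and `train`, the parameters `temp`, `sigma` and `lr`, the softmax form of the competition, and the perturbation-based (REINFORCE-style) update are all assumptions chosen for illustration. Each actor holds a mean point in a hypothetical acoustic feature space, every actor receives the same scalar reward, and each target's contribution to that reward is weighted toward the actor closest to it. As a further simplification, the baseline is the reward of the noise-free means rather than a running average.

```python
import math
import random


def global_reward(sounds, targets, temp=0.2):
    """Common scalar reward shared by all actors (hypothetical form).

    For each target, actors compete through a softmax over negative squared
    distance, so the closest actor dominates that target's contribution.
    """
    total = 0.0
    for t in targets:
        d2 = [sum((s_i - t_i) ** 2 for s_i, t_i in zip(s, t)) for s in sounds]
        m = min(d2)  # subtract the minimum for numerical stability
        w = [math.exp(-(d - m) / temp) for d in d2]
        z = sum(w)
        total -= sum(w_k / z * d for w_k, d in zip(w, d2))
    return total


def train(targets, n_actors, n_trials=3000, sigma=0.1, lr=0.02, seed=0):
    """Independent actors, each updated from the same global reward."""
    rng = random.Random(seed)
    dim = len(targets[0])
    means = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_actors)]
    start = global_reward(means, targets)
    for _ in range(n_trials):
        # Each actor explores independently around its own mean.
        noise = [[rng.gauss(0, sigma) for _ in range(dim)]
                 for _ in range(n_actors)]
        sounds = [[m + e for m, e in zip(mu, eps)]
                  for mu, eps in zip(means, noise)]
        # Assumption: baseline is the reward of the unperturbed means.
        advantage = global_reward(sounds, targets) - global_reward(means, targets)
        # Every actor reinforces its own perturbation by the shared advantage.
        for mu, eps in zip(means, noise):
            for i in range(dim):
                mu[i] += lr * advantage * eps[i] / sigma ** 2
    return means, start, global_reward(means, targets)


if __name__ == "__main__":
    toy_targets = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # made-up "syllables"
    means, start, end = train(toy_targets, n_actors=3)
    print("reward before:", start, "after:", end)
    print("learned means:", means)
```

In this toy setting a low `temp` makes the competition close to winner-take-all, so an actor that is never the closest to any target receives almost no learning signal. This loosely mirrors the competitive exclusion described in the abstract, but it should not be read as a reproduction of the published results.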

Metadata

Item Type:	Article
Authors/Creators:	Toutounji, H. (https://orcid.org/0000-0002-2655-3071); Zai, A.T. (https://orcid.org/0000-0002-0460-0850); Tchernichovski, O. (https://orcid.org/0000-0001-6788-614X); Hahnloser, R.H.R. (https://orcid.org/0000-0002-4039-7773); Lipkind, D. (https://orcid.org/0000-0003-0173-7066)
Copyright, Publisher and Additional Information:	© 2024 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.
Keywords:	Information and Computing Sciences; Artificial Intelligence; Machine Learning; Behavioral and Social Science; Basic Behavioral and Social Science; Animals; Finches; Vocalization, Animal; Learning; Sound; Reward
Dates:	Submitted: 26 June 2023; Accepted: 22 February 2024; Published (online): 27 March 2024; Published: March 2024
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Science (Sheffield) > Department of Psychology (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	08 Apr 2024 14:12
Last Modified:	08 Apr 2024 14:12
Status:	Published
Publisher:	American Association for the Advancement of Science (AAAS)
Refereed:	Yes
Identification Number:	10.1126/sciadv.adj3824
Related URLs:	Dataset; Software or Code
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:211268