Alomari, M, Duckworth, P orcid.org/0000-0001-9052-6919, Hogg, DC orcid.org/0000-0002-6125-9564 et al. (1 more author) (2017) Learning of Object Properties, Spatial Relations, and Actions for Embodied Agents from Language and Vision. In: The AAAI 2017 Spring Symposium on Interactive Multisensory Object Perception for Embodied Agents Technical Report SS-17-05. AAAI Spring Symposium Series: Symposium on Interactive Multi-Sensory Object Perception for Embodied Agents, 27-29 Mar 2017, Stanford University, CA. AAAI Press , pp. 444-448.
Abstract
We present a system that enables embodied agents to learn about different components of the perceived world, such as object properties, spatial relations, and actions. The system learns a semantic representation and the linguistic description of such components by connecting two different sensory inputs: language and vision. The learning is achieved by mapping observed words to extracted visual features from video clips. We evaluate our approach against state-of-the-art supervised and unsupervised systems that each learn from a single modality, and we show that an improvement can be obtained by using both language and vision as inputs.
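The abstract describes learning by mapping observed words to visual features extracted from paired video clips. As a minimal illustrative sketch (not the authors' actual method, whose representation and learning procedure are given in the paper), the core idea of word–feature association via co-occurrence can be shown with hypothetical sentence/feature pairs:

```python
from collections import defaultdict

def learn_word_feature_map(paired_data):
    """Count co-occurrences between words and visual feature labels
    across paired (sentence, features) examples, then map each word
    to its most frequently co-occurring feature label."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence, features in paired_data:
        for word in sentence.lower().split():
            for feat in features:
                counts[word][feat] += 1
    # For each word, pick the feature label it co-occurred with most often.
    return {word: max(fc, key=fc.get) for word, fc in counts.items()}

# Hypothetical paired language/vision observations (feature labels invented
# for illustration; the paper extracts features from real video clips).
data = [
    ("the red block", ["colour:red", "shape:cube"]),
    ("a red ball", ["colour:red", "shape:sphere"]),
    ("the green ball", ["colour:green", "shape:sphere"]),
]
mapping = learn_word_feature_map(data)
```

With these three examples, "red" co-occurs with `colour:red` twice but with each shape only once, so the ambiguity resolves as more paired data accumulates; the same mechanism extends to spatial relations and actions when those are among the extracted features.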
Metadata
Item Type: | Proceedings Paper
---|---
Authors/Creators: | Alomari, M; Duckworth, P; Hogg, DC et al. (1 more author)
Keywords: | language and vision; embodied robotics; natural language commands
Institution: | The University of Leeds
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds)
Funding Information: | Funder: EU - European Union; Grant number: FP7-ICT-600623
Depositing User: | Symplectic Publications
Date Deposited: | 10 Jan 2017 12:41
Last Modified: | 07 Oct 2017 07:00
Published Version: | https://www.aaai.org/ocs/index.php/SSS/SSS17/paper...
Status: | Published
Publisher: | AAAI Press
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:110303