Alomari, M, Duckworth, P orcid.org/0000-0001-9052-6919, Hogg, DC orcid.org/0000-0002-6125-9564 et al. (1 more author) (2017) Learning of Object Properties, Spatial Relations, and Actions for Embodied Agents from Language and Vision. In: The AAAI 2017 Spring Symposium on Interactive Multisensory Object Perception for Embodied Agents Technical Report SS-17-05. AAAI Spring Symposium Series: Symposium on Interactive Multi-Sensory Object Perception for Embodied Agents, 27-29 Mar 2017, Stanford University, CA. AAAI Press, pp. 444-448.
Abstract
We present a system that enables embodied agents to learn about different components of the perceived world, such as object properties, spatial relations, and actions. The system learns a semantic representation and the linguistic description of such components by connecting two different sensory inputs: language and vision. The learning is achieved by mapping observed words to extracted visual features from video clips. We evaluate our approach against state-of-the-art supervised and unsupervised systems that each learn from a single modality, and we show that an improvement can be obtained by using both language and vision as inputs.
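The abstract describes learning by connecting observed words with visual features extracted from video clips. As an illustration only (the paper's actual method is not reproduced here), a minimal co-occurrence sketch of such a word-to-feature mapping might look like this; the toy clip data and feature labels are hypothetical:

```python
from collections import Counter, defaultdict

def learn_word_feature_map(annotated_clips):
    """Count how often each word co-occurs with each visual feature,
    then map every word to its most frequent co-occurring feature."""
    cooc = defaultdict(Counter)
    for words, features in annotated_clips:
        for w in words:
            for f in features:
                cooc[w][f] += 1
    return {w: counts.most_common(1)[0][0] for w, counts in cooc.items()}

# Toy data: each clip pairs a spoken description with detected features.
clips = [
    (["red", "block"], ["colour:red", "shape:cube"]),
    (["red", "ball"], ["colour:red", "shape:sphere"]),
    (["green", "block"], ["colour:green", "shape:cube"]),
]
mapping = learn_word_feature_map(clips)
```

With enough clips, words that consistently co-occur with one feature ("red" with `colour:red`) become grounded in that feature, while coincidental pairings wash out.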
Metadata
| Item Type: | Proceedings Paper |
|---|---|
| Authors/Creators: | Alomari, M; Duckworth, P; Hogg, DC; et al. (1 more author) |
| Keywords: | language and vision; embodied robotics; natural language commands |
| Dates: | 2017 |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
| Funding Information: | EU - European Union, grant FP7-ICT-600623 |
| Depositing User: | Symplectic Publications |
| Date Deposited: | 10 Jan 2017 12:41 |
| Last Modified: | 07 Oct 2017 07:00 |
| Published Version: | https://www.aaai.org/ocs/index.php/SSS/SSS17/paper... |
| Status: | Published |
| Publisher: | AAAI Press |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:110303 |
