Gu, F, Sridhar, M, Cohn, A orcid.org/0000-0002-7652-8907 et al. (4 more authors) (2016) Weakly supervised activity analysis with spatio-temporal localisation. Neurocomputing, 216. pp. 778-789. ISSN 0925-2312
Abstract
In computer vision, an increasing number of weakly annotated videos have become available, because it is often difficult and time-consuming to annotate every detail in the videos collected. Learning methods that analyse human activities in weakly annotated video data have gained great interest in recent years. They are categorised as “weakly supervised learning”, and usually form a multi-instance multi-label (MIML) learning problem. In addition to the commonly known difficulties of MIML learning, i.e. ambiguities in instances and labels, a weakly supervised method also has to cope with large data size, high dimensionality, and the large proportion of noisy examples usually found in video data. In this work, we propose a novel learning framework that iteratively optimises over a scalable MIML model and an instance selection process incorporating pairwise spatio-temporal smoothing during training. The knowledge learned in this way is then generalised to testing via a noise removal process based on the support vector data description algorithm. In experiments on three challenging benchmark video datasets, the proposed framework yields a more discriminative MIML model and less noisy training and testing data, and thus improves system performance. It outperforms state-of-the-art weakly supervised and even fully supervised approaches in the literature at annotating and detecting the actions of a single person and the interactions between a pair of people.
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | Crown Copyright © 2016 Published by Elsevier B.V. All rights reserved. This is an author produced version of a paper published in Neurocomputing. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | Human activity analysis; Spatio-temporal localisation; Weakly labelled video data; Multi-instance multi-label learning |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) |
Funding Information: | Funder: DARPA; Grant number: W911NF |
Depositing User: | Symplectic Publications |
Date Deposited: | 01 Sep 2016 09:35 |
Last Modified: | 25 Apr 2019 16:45 |
Published Version: | https://doi.org/10.1016/j.neucom.2016.08.032 |
Status: | Published |
Publisher: | Elsevier |
Identification Number: | 10.1016/j.neucom.2016.08.032 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:104072 |