Deep convolutional neural networks for human action recognition using depth maps and postures

Abstract

In this paper, we present a method (Action-Fusion) for human action recognition from depth maps and posture data using convolutional neural networks (CNNs). Two input descriptors are used for action representation. The first input is a depth motion image that accumulates consecutive depth maps of a human action, whilst the second input is a proposed moving joints descriptor which represents the motion of body joints over time. In order to maximize feature extraction for accurate action classification, three CNN channels are trained with different inputs. The first channel is trained with depth motion images (DMIs), the second channel is trained with both DMIs and moving joint descriptors together, and the third channel is trained with moving joint descriptors only. The action predictions generated from the three CNN channels are fused together for the final action classification. We propose several fusion score operations to maximize the score of the right action. The experiments show that the results of fusing the output of three channels are better than using one channel or fusing two channels only. Our proposed method was evaluated on three public datasets: 1) Microsoft action 3-D dataset (MSRAction3D); 2) University of Texas at Dallas-multimodal human action dataset; and 3) multimodal action dataset (MAD) dataset. The testing results indicate that the proposed approach outperforms most of existing state-of-the-art methods, such as histogram of oriented 4-D normals and Actionlet on MSRAction3D. Although MAD dataset contains a high number of actions (35 actions) compared to existing action RGB-D datasets, this paper surpasses a state-of-the-art method on the dataset by 6.84%.

Metadata

Item Type:	Article
Authors/Creators:	Kamel, A. Sheng, B. Yang, P. https://orcid.org/0000-0002-8553-7127 Li, P. Shen, R. Feng, D.D.
Copyright, Publisher and Additional Information:	© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy.
Keywords:	Action Recognition; Depth Motion Image; Moving Joints Descriptor; Convolutional neural network
Dates:	Accepted: 8 June 2019 Published (online): 11 July 2019 Published: September 2019
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	23 Sep 2019 13:39
Last Modified:	11 Jul 2020 00:38
Status:	Published
Publisher:	Institute of Electrical and Electronics Engineers (IEEE)
Refereed:	Yes
Identification Number:	10.1109/tsmc.2018.2850149
Related URLs:	Author
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:151224

CORE (COnnecting REpositories)

Deep convolutional neural networks for human action recognition using depth maps and postures

Abstract

Metadata

Download

Accepted Version

Export

Statistics