FSRM-STS: Cross-dataset pedestrian retrieval based on a four-stage retrieval model with Selection–Translation–Selection

Abstract

Pedestrian retrieval is widely used in intelligent video surveillance and is closely related to people's lives. Although pedestrian retrieval from a single dataset has improved in recent years, obstacles such as a lack of sample data and domain gaps within and between datasets (arising from variation in lighting conditions, resolution, season, background and similar factors) reduce the generalizability of existing models and act as barriers to the practical use of this technology. Cross-dataset learning, which obtains high-quality images from source datasets to assist learning on target datasets, helps to address this problem. Existing studies of cross-dataset learning directly apply translated images from source datasets to target datasets, and seldom consider systematic strategies for further improving the quality of the translated images, so there is room for improvement. This paper proposes a four-stage retrieval model based on Selection–Translation–Selection (FSRM-STS) to help address this problem. In the first stage, images in pedestrian retrieval datasets are semantically segmented to provide information for image translation. In the second stage, STS is proposed, based on four strategies, to obtain high-quality translation results from all source datasets and to generate auxiliary datasets. In the third stage, a pedestrian feature extraction model is proposed, based on both the auxiliary and target datasets, which converts each image in the target datasets into an n-dimensional vector. In the final stage, the extracted image vectors are used for cross-dataset pedestrian retrieval. Because the translation quality is improved, FSRM-STS achieves promising results for cross-dataset pedestrian retrieval. Experimental results on four benchmark datasets (Market-1501, DukeMTMC-reID, CUHK03 and VIPeR) show the effectiveness of the proposed model. Finally, the use of parallel computing to accelerate training and to realize online applications is also discussed. A preliminary demo based on cloud computing is designed to verify the engineering solution in future work.
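
The final stage described in the abstract, retrieval over extracted n-dimensional image vectors, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the names rank_gallery, gallery and the image IDs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return (image_id, score) pairs sorted from most to least similar.

    `gallery` maps a hypothetical image ID to its feature vector. Cosine
    similarity is an assumed metric, not one confirmed by the abstract.
    """
    scored = [(image_id, cosine_similarity(query, vec))
              for image_id, vec in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for extracted pedestrian features.
gallery = {
    "g1": [1.0, 0.0, 0.0],
    "g2": [0.0, 1.0, 0.0],
    "g3": [0.9, 0.1, 0.0],
}
ranking = rank_gallery([1.0, 0.0, 0.0], gallery)
```

In this toy example the query vector is identical to that of g1, so g1 ranks first, followed by the nearby g3 and then the orthogonal g2.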

Metadata

Item Type:	Article
Authors/Creators:	Li, D.; Huang, L.; Ye, B.; Wan, F.; Madden, A. (ORCID: https://orcid.org/0000-0003-2305-7790); Liang, X.
Copyright, Publisher and Additional Information:	© 2020 Elsevier B.V. This is an author produced version of a paper subsequently published in Future Generation Computer Systems. Uploaded in accordance with the publisher's self-archiving policy. Article available under the terms of the CC-BY-NC-ND licence (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords:	Pedestrian retrieval; Transfer learning; Image translation; Semantic segmentation; Image retrieval
Dates:	Accepted: 7 February 2020; Published (online): 12 February 2020; Published: June 2020
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	21 May 2021 17:09
Last Modified:	24 May 2021 09:32
Status:	Published
Publisher:	Elsevier
Refereed:	Yes
Identification Number:	10.1016/j.future.2020.02.028
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:174437

Download

Accepted Version

Filename: ManuscriptFGCS20200121_v6.0.pdf

Licence: CC-BY-NC-ND 4.0
