Li, D., Huang, L., Ye, B. et al. (3 more authors) (2020) FSRM-STS: Cross-dataset pedestrian retrieval based on a four-stage retrieval model with Selection–Translation–Selection. Future Generation Computer Systems, 107. pp. 601-619. ISSN 0167-739X
Abstract
Pedestrian retrieval is widely used in intelligent video surveillance and is closely related to people’s lives. Although pedestrian retrieval on a single dataset has improved in recent years, obstacles such as a lack of sample data and domain gaps within and between datasets (arising from variation in lighting conditions, resolution, season, background and other factors) reduce the generalizability of existing models and act as barriers to the practical use of this technology. Cross-dataset learning, which obtains high-quality images from source datasets to assist learning on target datasets, can help to address this problem. Existing studies of cross-dataset learning directly apply translated images from source datasets to target datasets and seldom consider systematic strategies for further improving the quality of the translated images, so there is room for improvement. This paper proposes a four-stage retrieval model based on Selection–Translation–Selection (FSRM-STS) to help address this problem. In the first stage, images in pedestrian retrieval datasets are semantically segmented to provide information for image translation. In the second stage, STS is proposed; it uses four strategies to obtain high-quality translation results from all source datasets and to generate auxiliary datasets. In the third stage, a pedestrian feature extraction model based on both the auxiliary and target datasets is proposed; it converts each image in the target datasets into an n-dimensional vector. In the final stage, the extracted image vectors are used for cross-dataset pedestrian retrieval. Because the translation quality is improved, FSRM-STS achieves promising results for cross-dataset pedestrian retrieval. Experimental results on four benchmark datasets (Market-1501, DukeMTMC-reID, CUHK03 and VIPeR) show the effectiveness of the proposed model.
Finally, the use of parallel computing to accelerate training and to enable online applications is also discussed. A preliminary demo based on cloud computing is designed, with a view to verifying the engineering solution in future work.
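As an illustration of the final stage only, the following minimal sketch ranks gallery images by similarity to a query once every image has been reduced to an n-dimensional vector. It is not the paper's implementation: the choice of cosine similarity, the function names `cosine_similarity` and `rank_gallery`, and the toy image identifiers and vectors are all assumptions made for this example, since the abstract does not state which distance measure is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery, top_k=5):
    """Return up to top_k (image_id, score) pairs, most similar first."""
    scored = [
        (image_id, cosine_similarity(query, vector))
        for image_id, vector in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional feature vectors; real features would be
# produced by the third-stage feature extraction model.
gallery = {
    "cam1_0001": [0.9, 0.1, 0.0],
    "cam2_0417": [0.1, 0.9, 0.2],
    "cam3_0088": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_gallery(query, gallery, top_k=2)
print([image_id for image_id, _ in ranking])
```

In this toy example the two gallery vectors that point in nearly the same direction as the query are returned first. A production system would typically replace the linear scan with a vectorized or indexed search.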
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2020 Elsevier B.V. This is an author produced version of a paper subsequently published in Future Generation Computer Systems. Uploaded in accordance with the publisher's self-archiving policy. Article available under the terms of the CC-BY-NC-ND licence (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
Keywords: | Pedestrian retrieval; Transfer learning; Image translation; Semantic segmentation; Image retrieval |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 21 May 2021 17:09 |
Last Modified: | 24 May 2021 09:32 |
Status: | Published |
Publisher: | Elsevier |
Refereed: | Yes |
Identification Number: | 10.1016/j.future.2020.02.028 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:174437 |