Gao, L. orcid.org/0000-0003-1617-1325, Chen, L. orcid.org/0009-0005-9615-1333, Jiang, Y. et al. (3 more authors) (2025) Feature-level fusion network for hyperspectral object tracking via mixed multi-head self-attention learning. Remote Sensing, 17 (6). 997. ISSN 2072-4292
Abstract
Hyperspectral object tracking has emerged as a promising task in visual object tracking. The rich spectral information in hyperspectral images benefits accurate tracking in challenging scenarios. The performance of existing hyperspectral object tracking networks is limited because they neglect the interactive information among the bands of hyperspectral images. Moreover, designing an accurate deep learning-based algorithm for hyperspectral object tracking is challenging because of the substantial amount of training data required. To address these challenges, a new mixed multi-head attention-based feature fusion tracking (MMFT) algorithm for hyperspectral videos is proposed. Firstly, MMFT introduces a feature-level fusion module, mixed multi-head attention feature fusion (MMFF), which fuses false-color features and augments the fused feature with interactive information using one mixed multi-head attention (MMA) block, increasing the representational ability of the features for tracking. Specifically, MMA learns the interactive information across the bands in the false-color images and incorporates it into the fused feature, which is obtained by combining the features of the false-color images. Secondly, a new training procedure is introduced, in which the modules designed for hyperspectral object tracking are first pre-trained on a sufficient amount of modified RGB data to enhance generalization, and then fine-tuned on a limited amount of hyperspectral (HS) data for task adaptation. Extensive experiments verify the effectiveness of MMFT, demonstrating its state-of-the-art (SOTA) performance.
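The feature-level fusion idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' MMA or MMFF implementation, whose internals the abstract does not specify. It assumes that each false-color image has already been encoded into an (N, D) token matrix, that the fused feature is a plain element-wise sum, and that standard scaled dot-product multi-head self-attention stands in for the MMA block. All function names, shapes, and the random weights are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Standard multi-head self-attention over an (N, D) token matrix."""
    n, d = tokens.shape
    head_dim = d // num_heads

    def split(x):
        # (N, D) -> (num_heads, N, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(tokens @ w_q), split(tokens @ w_k), split(tokens @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (H, N, N)
    attn = softmax(scores, axis=-1)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, d)      # merge heads
    return out @ w_o, attn


def fuse_false_color_features(features, params, num_heads=4):
    """Fuse per-image features and add cross-image attention output.

    features: list of G arrays of shape (N, D), one per false-color image.
    params:   (w_q, w_k, w_v, w_o), each of shape (D, D).
    """
    g = len(features)
    n, d = features[0].shape
    fused = np.sum(features, axis=0)           # assumed fusion: element-wise sum
    tokens = np.concatenate(features, axis=0)  # (G*N, D): all images attend jointly
    interactive, attn = multi_head_self_attention(tokens, *params, num_heads)
    # Fold back to (G, N, D) and average over images (an illustrative choice).
    interactive = interactive.reshape(g, n, d).mean(axis=0)
    return fused + interactive, attn


rng = np.random.default_rng(0)
G, N, D = 3, 16, 32  # hypothetical: 3 false-color images, 16 tokens, 32 channels
features = [rng.standard_normal((N, D)) for _ in range(G)]
params = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(4)]
out, attn = fuse_false_color_features(features, params, num_heads=4)
print(out.shape, attn.shape)  # (16, 32) (4, 48, 48)
```

Concatenating the tokens of all false-color images before attention lets every token attend to tokens from the other band groups, which is one simple way to model cross-band interaction. The residual addition keeps the fused feature as the base signal.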
Metadata
Item Type: | Article |
---|---|
Copyright, Publisher and Additional Information: | © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
Keywords: | feature fusion; mixed multi-head attention; Transformer; hyperspectral object tracking |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Electronic and Electrical Engineering (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 30 Apr 2025 15:03 |
Last Modified: | 30 Apr 2025 15:03 |
Status: | Published |
Publisher: | MDPI AG |
Refereed: | Yes |
Identification Number: | 10.3390/rs17060997 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:225906 |