Decoupling multimodal transformers for referring video object segmentation

Gao, M., Yang, J., Han, J. et al. (3 more authors) (2023) Decoupling multimodal transformers for referring video object segmentation. IEEE Transactions on Circuits and Systems for Video Technology. ISSN 1051-8215

Metadata

Authors/Creators:
  • Gao, M.
  • Yang, J.
  • Han, J.
  • Lu, K.
  • Zheng, F.
  • Montana, G.
Copyright, Publisher and Additional Information: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Reproduced in accordance with the publisher's self-archiving policy.
Keywords: Decoupled multimodal transformers; Referring video object segmentation; Vision-language pre-training
Dates:
  • Accepted: 27 May 2023
  • Published (online): 9 June 2023
Institution: The University of Sheffield
Academic Units: The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield)
Depositing User: Symplectic Sheffield
Date Deposited: 09 Jun 2023 15:59
Last Modified: 12 Jun 2023 15:43
Status: Published online
Publisher: Institute of Electrical and Electronics Engineers
Refereed: Yes
Identification Number: https://doi.org/10.1109/TCSVT.2023.3284979

Download

Accepted Version


Embargoed until: 9 June 2024

Filename: FINAL_VERSION.PDF
