Alattas, E. orcid.org/0009-0004-8411-655X, Clark, J. orcid.org/0000-0002-9230-9739, Alsulami, B. et al. (1 more author) (2026) Beyond frames: 3D-CoAtNet for generalizable deepfake video detection. IEEE Access, 14. pp. 29692-29705. ISSN: 2169-3536
Abstract
Deepfakes pose a growing risk to digital integrity and public trust, driving the need for robust video-level forgery-detection methods. Many existing approaches analyse individual frames independently and overlook temporal dependencies, thereby weakening the generalisation to unseen manipulation techniques. This paper introduces 3D-CoAtNet, a spatiotemporal architecture for deepfake video detection that processes multiple frames simultaneously, thereby reducing reliance on single-frame artefacts. The model inflates CoAtNet’s 2D convolutional, residual, pooling, and self-attention layers into their 3D counterparts to learn spatial and temporal representations from multiple frames. We evaluated two input modalities: RGB 15-frame clips sampled from each video, and 15-frame optical-flow sequences that capture motion cues. Extensive experiments on FaceForensics++ (FF++), DFDC, and Celeb-DF under intra- and cross-dataset settings show that 3D-CoAtNet is competitive in intra-dataset evaluations (best in the DeepFakes dataset) and transfers well to Celeb-DF. Moreover, although frame-based CoAtNet16A achieves strong within-dataset accuracy, 3D-CoAtNet improves cross-dataset generalisation. These findings highlight the importance of the proposed 3D-CoAtNet model for deepfake forensics.
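The inflation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the inflated 3D layers are initialised, so the scheme below is an assumption: it repeats a 2D kernel along the time axis and divides by the temporal depth, a common inflation strategy for video models. The function names (`inflate_kernel`, `conv2d_at`, `conv3d_at`) are hypothetical and not taken from the paper.

```python
def inflate_kernel(kernel_2d, depth):
    """Repeat a 2D kernel `depth` times along time, scaling each copy by 1/depth.

    Assumed initialisation scheme; the paper's actual method may differ.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return [[[w / depth for w in row] for row in kernel_2d] for _ in range(depth)]


def conv2d_at(frame, kernel_2d, top, left):
    """Response of a 2D kernel on one frame at a single position (no padding)."""
    return sum(
        kernel_2d[i][j] * frame[top + i][left + j]
        for i in range(len(kernel_2d))
        for j in range(len(kernel_2d[0]))
    )


def conv3d_at(clip, kernel_3d, start, top, left):
    """Response of a 3D kernel on a clip at a single (time, row, column) position."""
    return sum(
        conv2d_at(clip[start + t], kernel_3d[t], top, left)
        for t in range(len(kernel_3d))
    )


# A 3x3 spatial kernel inflated to a 3x3x3 spatiotemporal kernel.
kernel_2d = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
kernel_3d = inflate_kernel(kernel_2d, depth=3)

# A 15-frame clip in which every frame is identical (a "static" video).
frame = [[float(r * 4 + c * c) for c in range(4)] for r in range(4)]
static_clip = [frame] * 15

response_2d = conv2d_at(frame, kernel_2d, 0, 0)
response_3d = conv3d_at(static_clip, kernel_3d, 0, 0, 0)
```

On a static clip the inflated kernel gives the same response as the original 2D kernel (up to floating-point rounding), so under this assumed scheme an inflated layer starts out behaving like its frame-based counterpart and only departs from it where consecutive frames differ.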
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Alattas, E.; Clark, J.; Alsulami, B.; et al. (1 more author) |
| Copyright, Publisher and Additional Information: | © 2026 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ |
| Keywords: | Convolutional neural networks (CNNs); CoAtNet; deepfake detection; digital forensics; generative adversarial networks (GANs); vision transformers (ViTs) |
| Dates: | Published: 2026 |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
| Date Deposited: | 12 Mar 2026 14:22 |
| Last Modified: | 12 Mar 2026 14:22 |
| Status: | Published |
| Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
| Refereed: | Yes |
| Identification Number: | 10.1109/access.2026.3666623 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:239064 |
Download
Filename: Beyond_Frames_3D-CoAtNet_for_Generalizable_Deepfake_Video_Detection.pdf
Licence: CC-BY 4.0
