Wang, X. and Wang, B. orcid.org/0000-0003-2404-5214 (2024) Exploring automatic methods for the construction of multimodal interpreting corpora. How to transcribe linguistic information and identify paralinguistic properties? Across Languages and Cultures, 25 (1). pp. 48-70. ISSN 1585-1923
Abstract
In corpus-based interpreting studies, transcribing spoken data and identifying prosodic properties are typically time-consuming and labour-intensive. This paper addresses these challenges by exploring methods for the automatic compilation of multimodal interpreting corpora, with a focus on English/Chinese Consecutive Interpreting. The results show that: 1) automatic transcription can achieve an accuracy rate of 95.3% in transcribing consecutive interpretations; 2) prosodic properties related to filled pauses, unfilled pauses, articulation rate, and mispronounced words can be automatically extracted using our rule-based programming; 3) mispronounced words can be effectively identified by employing a Confidence Measure, with any word whose Confidence Measure is lower than 0.321 considered mispronounced; 4) automatic alignment can be achieved through the use of automatic segmentation, sentence embedding, and alignment techniques. This study contributes to interpreting studies by broadening the empirical understanding of orality, enabling multimodal analyses of interpreting products, and providing a new methodological solution for the construction and utilisation of multimodal interpreting corpora. It also has implications for exploring the applicability of new technologies in interpreting studies.
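As an illustration of the rule-based extraction summarised in the abstract, the following minimal Python sketch flags words whose confidence falls below the reported 0.321 threshold and locates unfilled pauses from word timestamps. It is not the authors' implementation: the `Word` record, the function names, the sample data, and the 0.25-second pause cut-off are all assumptions made for this example, and only the 0.321 threshold comes from the abstract.

```python
from dataclasses import dataclass

CM_THRESHOLD = 0.321  # Confidence Measure threshold reported in the abstract
MIN_PAUSE_S = 0.25    # assumed cut-off for an unfilled pause; not from the paper


@dataclass
class Word:
    """Hypothetical record for one recognised word with timing and confidence."""
    text: str
    start: float       # onset time in seconds
    end: float         # offset time in seconds
    confidence: float  # recogniser confidence in [0, 1]


def flag_mispronounced(words, threshold=CM_THRESHOLD):
    """Return the text of every word whose confidence is below the threshold."""
    return [w.text for w in words if w.confidence < threshold]


def unfilled_pauses(words, min_pause=MIN_PAUSE_S):
    """Return (start, end) spans of silence between consecutive words
    that last at least min_pause seconds."""
    pauses = []
    for prev, nxt in zip(words, words[1:]):
        if nxt.start - prev.end >= min_pause:
            pauses.append((prev.end, nxt.start))
    return pauses


# Invented sample data for illustration only.
sample = [
    Word("the", 0.0, 0.2, 0.95),
    Word("econmy", 0.3, 0.8, 0.21),
    Word("grew", 1.5, 1.9, 0.88),
]
print(flag_mispronounced(sample))
print(unfilled_pauses(sample))
```

Keeping the thresholds as named parameters makes it easy to substitute a different pause cut-off or a recogniser-specific confidence scale.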
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Wang, X.; Wang, B. (orcid.org/0000-0003-2404-5214) |
| Copyright, Publisher and Additional Information: | This item is protected by copyright. This is an author produced version of an article accepted for publication in Across Languages and Cultures. Uploaded in accordance with the publisher's self-archiving policy. |
| Keywords: | multimodal interpreting corpus; multi-layer model; automatic extraction of paralinguistic features; disfluency; mispronounced words; automatic alignment |
| Dates: | |
| Institution: | The University of Leeds |
| Academic Units: | The University of Leeds > Faculty of Arts, Humanities and Cultures (Leeds) > School of Languages Cultures & Societies (Leeds) > Translation Studies (Leeds) |
| Depositing User: | Symplectic Publications |
| Date Deposited: | 01 May 2024 09:50 |
| Last Modified: | 18 Jun 2024 14:28 |
| Status: | Published |
| Publisher: | Akadémiai Kiadó |
| Identification Number: | 10.1556/084.2023.00407 |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:212127 |