Brierley, C and Atwell, ES (2007) Using Nltk_lite's Chunk Parser to Detect Prosodic Phrase Boundaries in the Aix-MARSEC Corpus of Spoken English. University of Leeds, School of Computing research report 2007.02.
Abstract
Prosodic phrasing is the means by which speakers of any given language break up an utterance into meaningful chunks. The term ‘prosody’ itself refers to the tune or intonation of an utterance and therefore prosodic phrases literally signal the end of one tune and the beginning of another. This study uses phrase break annotations in the Aix-MARSEC Corpus of spoken English as a “gold standard” for measuring the degree of correspondence between prosodic phrases and the discrete syntactic grouping of prepositional phrases, where the latter is defined via a chunk parse rule using nltk_lite’s regular expression chunk parser. A three-way comparison is also introduced between “gold standard”, chunk parse rule and human judgement in the form of intuitive predictions about phrasing. Results show that even with a discrete syntactic grouping and a small sample of text (around 1400 words), problems arise for this rule-based method due to uncategorical behaviour in parts of speech. Lack of correspondence between intuitive prosodic phrases and corpus annotations highlights the optional nature of certain boundary types. Finally, there are clear indications, supported by corpus annotations, that significant prosodic phrase boundaries occur within sentences and not just at full stops.
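The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use the nltk_lite API: a plain regular expression over a string of part-of-speech tags stands in for a chunk parse rule. The sample sentence, the Penn Treebank style tags, the rule itself, the gold-standard break positions, and the choice to predict a break immediately before each prepositional phrase are all assumptions made for illustration.

```python
import re

# Hypothetical POS-tagged sample; the tagset (Penn Treebank style) is an assumption.
TAGGED = [
    ("the", "DT"), ("cat", "NN"), ("sat", "VBD"),
    ("on", "IN"), ("the", "DT"), ("mat", "NN"),
    ("in", "IN"), ("silence", "NN"),
]

# Hypothetical rule: preposition, optional determiner, any adjectives, one or more nouns.
PP_RULE = re.compile(r"<IN>(?:<DT>)?(?:<JJ>)*(?:<NN[A-Z]*>)+")


def pp_start_indices(tagged):
    """Return the token indices at which a prepositional-phrase chunk starts."""
    tag_string = ""
    offset_to_index = {}
    for index, (_, tag) in enumerate(tagged):
        # Remember where each tag begins so a match offset maps back to a token.
        offset_to_index[len(tag_string)] = index
        tag_string += f"<{tag}>"
    return {offset_to_index[m.start()] for m in PP_RULE.finditer(tag_string)}


def precision_recall(predicted, gold):
    """Score predicted boundary positions against gold-standard positions."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Assumption: a phrase break is predicted immediately before each chunk.
predicted_breaks = pp_start_indices(TAGGED)
# Hypothetical gold-standard annotation: a single break before token 3 ("on").
gold_breaks = {3}
precision, recall = precision_recall(predicted_breaks, gold_breaks)
print(sorted(predicted_breaks), precision, recall)  # [3, 6] 0.5 1.0
```

In this toy case the rule finds the annotated break but also predicts one that the annotation lacks. That is one way a purely rule-based method can diverge from corpus annotations when a boundary type is optional.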
Metadata
Item Type: | Book |
---|---|
Authors/Creators: | Brierley, C; Atwell, ES |
Copyright, Publisher and Additional Information: | Brierley, C and Atwell, ES (c) 2007, University of Leeds. Reproduced with permission from the copyright holders. |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 17 Dec 2014 12:35 |
Last Modified: | 19 Jan 2018 13:02 |
Published Version: | http://www.comp.leeds.ac.uk/claireb/acmchunk_final... |
Status: | Published |
Publisher: | University of Leeds, School of Computing research report 2007.02. |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:81927 |