Geertzen, J., Blevins, J.P. and Milin, P. orcid.org/0000-0001-9708-7031 (2016) The informativeness of linguistic unit boundaries. Italian Journal of Linguistics, 28 (1). pp. 25-48. ISSN 1120-2726
Abstract
Contemporary models of structural analysis tend to operate with discrete units at different linguistic levels. There is, however, considerable debate regarding the choice of units and the validity of the cues that guide their demarcation. At the level of grammatical analysis, this debate focuses largely on the status of words vs sub-word units and on the generality of the linguistic properties that mark each type of unit. This paper suggests that the status of a unit type can be evaluated in terms of its informativity. A measure of informativity is obtained by assessing the influence that different unit boundary types have on text compressibility. The results obtained from this initial study support a pair of general conclusions. The first is that unit boundaries primarily reflect a statistical structure, and that the typological variability of linguistic cues reflects the fact that they serve a secondary reinforcing function. The second is that word boundaries are the most informative boundary type, and that the demarcation of words provides the most informative description of the regular patterns in a language.
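The compressibility idea in the abstract can be illustrated with a rough sketch. The code below is not the authors' procedure: it assumes a general-purpose compressor (zlib) as a stand-in for the paper's measure, and the function names `compressed_size` and `boundary_cost` are hypothetical. It compares the compressed size of a text with a given boundary marker kept against the same text with that marker removed.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def boundary_cost(text: str, boundary: str = " ") -> int:
    """Compressed bytes attributable to keeping `boundary` in the text.

    Assumption: the difference in compressed size with and without the
    marker is used as a crude proxy for how much the boundary contributes
    to the description of the text. A negative value would mean the text
    compresses better with the boundaries left in.
    """
    stripped = text.replace(boundary, "")
    return compressed_size(text) - compressed_size(stripped)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the log " * 20
    print("with boundaries:   ", compressed_size(sample))
    print("without boundaries:", compressed_size(sample.replace(" ", "")))
    print("boundary cost:     ", boundary_cost(sample))
```

Comparing `boundary_cost` across marker types (for example, a word separator against a hypothetical morpheme separator in a pre-segmented corpus) gives a feel for the kind of comparison the abstract describes; results on short toy strings should not be read as evidence either way.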
Metadata
| Item Type: | Article |
|---|---|
| Authors/Creators: | Geertzen, J.; Blevins, J.P.; Milin, P. |
| Copyright, Publisher and Additional Information: | © 2016 Pacini. |
| Keywords: | linguistic units; words; abstractive perspective; information theory; Shannon entropy; Kolmogorov complexity |
| Institution: | The University of Sheffield |
| Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Department of Journalism Studies (Sheffield) |
| Depositing User: | Symplectic Sheffield |
| Date Deposited: | 14 Oct 2016 14:06 |
| Last Modified: | 14 Oct 2016 14:07 |
| Published Version: | http://www.italian-journal-linguistics.com/wp-cont... |
| Status: | Published |
| Publisher: | Pacini |
| Refereed: | Yes |
| Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:105907 |