Jost, U and Atwell, ES (1994) Proposal for a mutual-information based language model. In: Evett, L and Rose, T, (eds.) Proceedings of the 1994 AISB Workshop on Computational Linguistics for Speech and Handwriting Recognition. 1994 AISB Workshop on Computational Linguistics for Speech and Handwriting Recognition, 11-13 Apr 1994, University of Leeds, UK. AISB
Abstract
We propose a probabilistic language model that is intended to overcome some of the limitations of the well-known n-gram models, namely the strong dependence of the parameter values of the model on the discourse domain and the constant size of word context taken into account. The new model is based on the mutual information (MI) measurement for the correlation of events and derives a hierarchy of categories from unlabelled training text. It has close analogies to the bi-gram model and is therefore explained by comparing it with this model.
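As a rough illustration of the kind of MI measure the abstract refers to (not the authors' model, which this record does not describe in detail), the sketch below estimates pointwise mutual information between adjacent words from bigram counts. The function name `bigram_pmi`, the toy corpus, the unsmoothed relative-frequency estimates, and the base-2 logarithm are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in `tokens`.

    PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated by relative frequency (an assumption; no smoothing is applied).
    """
    if len(tokens) < 2:
        return {}

    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)       # number of word positions
    n_bi = len(tokens) - 1    # number of adjacent pairs

    pmi = {}
    for (w1, w2), count in bigrams.items():
        p_joint = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        pmi[(w1, w2)] = math.log2(p_joint / (p_w1 * p_w2))
    return pmi


if __name__ == "__main__":
    # Hypothetical toy corpus, only to exercise the function.
    corpus = "the cat sat on the mat the cat ran".split()
    scores = bigram_pmi(corpus)
    for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 3))
```

A positive score means the pair co-occurs more often than independence of the two words would predict, while a negative score means it co-occurs less often. A bigram model conditions on the same adjacent-pair counts, which is one way to read the analogy the abstract draws.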
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Jost, U; Atwell, ES |
Editors: | Evett, L; Rose, T |
Copyright, Publisher and Additional Information: | (c) 1994, AISB. Reproduced with permission from the publisher. |
Dates: | 1994 |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) > Artificial Intelligence & Biological Systems (Leeds) |
Depositing User: | Symplectic Publications |
Date Deposited: | 14 Jan 2015 12:30 |
Last Modified: | 19 Dec 2022 13:29 |
Published Version: | http://www.aisb.org.uk/ |
Status: | Published |
Publisher: | AISB |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:82287 |