Preiss, J. and Stevenson, R.M. (2017) Quantifying and filtering knowledge generated by literature based discovery. BMC Bioinformatics, 18 (Suppl 7). 249. ISSN 1471-2105
Abstract
Background: Literature based discovery (LBD) automatically infers missed connections between concepts in literature. It is often assumed that LBD generates more information than can be reasonably examined.
Methods: We present a detailed analysis of the quantity of hidden knowledge produced by an LBD system and the effect of various filtering approaches upon this. The investigation of filtering combined with single or multi-step linking term chains is carried out on all articles in PubMed.
Results: The evaluation is carried out using both replication of existing discoveries, which provides justification for multi-step linking chain knowledge in specific cases, and using timeslicing, which gives a large scale measure of performance.
Conclusions: While the quantity of hidden knowledge generated by LBD can be vast, we demonstrate that (a) intelligent filtering can greatly reduce the number of hidden knowledge pairs generated, (b) for a specific term, the number of single step connections can be manageable, and (c) in the absence of single step hidden links, considering multiple steps can provide valid links.
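The ideas in the abstract (single or multi-step linking term chains, filtering, and timeslicing) can be illustrated with a minimal sketch. This is not the authors' system: the function names, the stop-term filter, and the toy co-occurrence pairs are assumptions made for illustration. Only the fish oil, blood viscosity, Raynaud disease chain echoes a well-known LBD replication target, and `term_x` and `term_y` are placeholders.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected co-occurrence graph from (term, term) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def hidden_targets(graph, source, max_links=1, stop_terms=frozenset()):
    """Terms reachable from `source` through 1..max_links linking terms
    that are not already directly connected to `source`.

    `stop_terms` is a hypothetical filter: terms in it are never used
    as linking terms (e.g. very general concepts)."""
    direct = graph.get(source, set())
    visited = {source} | direct
    frontier = direct - stop_terms      # usable linking terms
    targets = set()
    for _ in range(max_links):
        reached = set()
        for link in frontier:
            reached |= graph.get(link, set())
        reached -= visited              # keep only previously unconnected terms
        targets |= reached
        visited |= reached
        frontier = reached - stop_terms  # next step links through new terms
    return targets


def timeslice_scores(predicted, earlier, later, source):
    """Precision and recall of predictions made from the earlier graph,
    scored against connections that first appear in the later graph."""
    gold = later.get(source, set()) - earlier.get(source, set())
    hits = predicted & gold
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented): co-occurrences before and after a cutoff date.
earlier = build_graph([
    ("fish oil", "blood viscosity"),
    ("blood viscosity", "raynaud disease"),
    ("fish oil", "humans"),
    ("humans", "term_x"),
    ("raynaud disease", "term_y"),
])
later = build_graph([("fish oil", "raynaud disease")])

unfiltered = hidden_targets(earlier, "fish oil")
filtered = hidden_targets(earlier, "fish oil", stop_terms={"humans"})
two_step = hidden_targets(earlier, "fish oil", max_links=2, stop_terms={"humans"})
```

In this toy setting, filtering out the general linking term shrinks the candidate set for one source term, and allowing a second linking step enlarges it again. `timeslice_scores(filtered, earlier, later, "fish oil")` then scores the candidates against connections that appear only after the cutoff.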
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Preiss, J.; Stevenson, R.M. |
Copyright, Publisher and Additional Information: | © The Author(s) 2017. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
Keywords: | data mining; literature based discovery in the biomedical domain; biomedical text |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Engineering (Sheffield) > Department of Computer Science (Sheffield) |
Funding Information: | Funder: Engineering and Physical Sciences Research Council (EPSRC); Grant number: EP/J008427/1 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 29 Mar 2017 09:22 |
Last Modified: | 26 Jun 2017 13:22 |
Published Version: | https://doi.org/10.1186/s12859-017-1641-9 |
Status: | Published |
Publisher: | BioMed Central |
Refereed: | Yes |
Identification Number: | 10.1186/s12859-017-1641-9 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:114019 |