Zhang, Z. (2017) Effective and efficient Semantic Table Interpretation using TableMiner(+). Semantic Web, 8 (6). pp. 921-957. ISSN 1570-0844
Abstract
This article introduces TableMiner+, a Semantic Table Interpretation method that annotates Web tables in a both effective and efficient way. Built on our previous work TableMiner, the extended version advances state-of-the-art in several ways. First, it improves annotation accuracy by making innovative use of various types of contextual information both inside and outside tables as features for inference. Second, it reduces computational overheads by adopting an incremental, bootstrapping approach that starts by creating preliminary and partial annotations of a table using ‘sample’ data in the table, then using the outcome as ‘seed’ to guide interpretation of remaining contents. This is then followed by a message passing process that iteratively refines results on the entire table to create the final optimal annotations. Third, it is able to handle all annotation tasks of Semantic Table Interpretation (e.g., annotating a column, or entity cells) while state-of-the-art methods are limited in different ways. We also compile the largest dataset known to date and extensively evaluate TableMiner+ against four baselines and two re-implemented (near-identical, as adaptations are needed due to the use of different knowledge bases) state-of-the-art methods. TableMiner+ consistently outperforms all models under all experimental settings. On the two most diverse datasets covering multiple domains and various table schemata, it achieves improvement in F1 by between 1 and 42 percentage points depending on specific annotation tasks. It also significantly reduces computational overheads in terms of wall-clock time when compared against classic methods that ‘exhaustively’ process the entire table content to build features for inference. As a concrete example, compared against a method based on joint inference implemented with parallel computation, the non-parallel implementation of TableMiner+ achieves significant improvement in learning accuracy and almost orders of magnitude of savings in wall-clock time.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: |
|
Copyright, Publisher and Additional Information: | © 2017 IOS Press and the authors. This is an author produced version of a paper subsequently published in Semantic Web. Uploaded in accordance with the publisher's self-archiving policy. |
Keywords: | Web table; named entity recognition; named entity disambiguation; Relation Extraction; linked data; Semantic Table Interpretation; table annotation |
Dates: |
|
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield) |
Funding Information: | Funder Grant number ENGINEERING AND PHYSICAL SCIENCE RESEARCH COUNCIL (EPSRC) EP/J019488/1 |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 25 Jan 2018 11:36 |
Last Modified: | 01 Feb 2018 09:52 |
Published Version: | https://doi.org/10.3233/SW-160242 |
Status: | Published |
Publisher: | IOS Press |
Refereed: | Yes |
Identification Number: | 10.3233/SW-160242 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:126465 |