Ruddle, R and Hall, M orcid.org/0000-0003-1246-2627 (2019) Using Miniature Visualizations of Descriptive Statistics to Investigate the Quality of Electronic Health Records. In: Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF. HEALTHINF 2019, 22-24 Feb 2019, Prague, Czech Republic. SciTePress, pp. 230-238. ISBN 978-989-758-353-7
Abstract
Descriptive statistics are typically presented as text, but that quickly becomes overwhelming when datasets contain many variables or analysts need to compare multiple datasets. Visualization offers a solution, but is rarely used except to show cardinalities (e.g., the percentage of missing values) or the distributions of a small set of variables. This paper describes dataset- and variable-centric designs for visualizing three categories of descriptive statistics (cardinalities, distributions and patterns), which scale to more than 100 variables and use multiple channels to encode important semantic differences (e.g., zero vs. one or more missing values). We evaluated our approach using large (multi-million record) primary and secondary care datasets. The miniature visualizations provided our users with a variety of important insights, including differences in character patterns that indicate data validation issues, missing values for a variable that should always be complete, and inconsistent encryption of patient identifiers. Finally, we highlight the need for research into methods of identifying anomalies in the distributions of dates in health data.
Metadata
Item Type: | Proceedings Paper |
---|---|
Authors/Creators: | Ruddle, R; Hall, M (orcid.org/0000-0003-1246-2627) |
Copyright, Publisher and Additional Information: | This is an author produced version of a paper accepted for publication in the Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies. |
Keywords: | Data Visualization; Electronic Health Records; Data Quality |
Dates: | |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds); The University of Leeds > Faculty of Medicine and Health (Leeds) > School of Medicine (Leeds) > Leeds Institute of Cardiovascular and Metabolic Medicine (LICAMM) > Clinical & Population Science Dept (Leeds) |
Funding Information: | EPSRC (grant number EP/N013980/1); Wellcome Trust (grant number 206470/Z/17/Z) |
Depositing User: | Symplectic Publications |
Date Deposited: | 10 Jan 2019 12:54 |
Last Modified: | 14 May 2019 13:06 |
Status: | Published |
Publisher: | SciTePress |
Identification Number: | 10.5220/0007354802300238 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:140847 |