Ruddle, RA, Cheshire, J and Fernstad, SJ (2023) Tasks and Visualizations Used for Data Profiling: A Survey and Interview Study. IEEE Transactions on Visualization and Computer Graphics. ISSN 1077-2626
Abstract
The use of good-quality data to inform decision making is entirely dependent on robust processes to ensure it is fit for purpose. Such processes vary between organisations, and between those tasked with designing and following them. In this paper, we report on a survey of 53 data analysts from many industry sectors, 24 of whom also participated in in-depth interviews, about computational and visual methods for characterizing data and investigating data quality. The paper makes contributions in two key areas. The first is to data science fundamentals, because our lists of data profiling tasks and visualization techniques are more comprehensive than those published elsewhere. The second concerns the application question “what does good profiling look like to those who routinely perform it?”, which we answer by highlighting the diversity of profiling tasks, unusual practice and exemplars of visualization, and recommendations about formalizing processes and creating rulebooks.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Ruddle, RA; Cheshire, J; Fernstad, SJ |
Copyright, Publisher and Additional Information: | © 2023, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Keywords: | Data visualization, Task analysis, Data integrity, Interviews, Visualization, Bars, Industries |
Institution: | The University of Leeds |
Academic Units: | The University of Leeds > Faculty of Engineering & Physical Sciences (Leeds) > School of Computing (Leeds) |
Funding Information: | Funder: Alan Turing Institute; Grant number: Not Known |
Depositing User: | Symplectic Publications |
Date Deposited: | 07 Mar 2023 14:57 |
Last Modified: | 07 Mar 2023 18:34 |
Status: | Published online |
Publisher: | Institute of Electrical and Electronics Engineers |
Identification Number: | 10.1109/TVCG.2023.3234337 |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:197083 |