Journal of Business Analytics | 2021

Root-cause analysis of process-data quality problems

 
 
 
 

Abstract


Big data's rise has amplified the role of information systems in process management. Process mining, a branch of data science, provides analytical tools and methods which can distil insights about process behaviour from big process-related data. Yet challenges remain, including dealing with the quality of big data and the impact of poor-quality data on event logs as the input to process mining analyses. In previous work, we have shown that despite researchers raising concerns about event log data quality, the event log preparation (data pre-processing) phase of process mining case studies is generally handled mechanistically, focusing on fixing symptoms and getting the log to a state where it can be consumed by process mining tools, rather than uncovering the root causes of event log data quality issues. This paper considers event log data quality problems from a new angle. We introduce the Odigos (Greek for 'guide') framework, adapted from Mingers and Willcocks (2014), based on semiotics and Peircean abductive reasoning, that explains the notion of process mining context at a conceptual level. The Odigos framework facilitates an informed way of dealing with data quality issues in event logs through supporting both prognostic (foreshadowing potential quality issues) and diagnostic (identifying root causes of discovered quality issues) approaches. We examine in depth how the framework supports a detailed root-cause analysis of a well-known collection of event log imperfection patterns.

DOI 10.1080/2573234x.2021.1947751
Language English
Journal Journal of Business Analytics

Full Text