2IMI35 - Introduction to process mining

Data science is the profession of the future, because organizations that are unable to use (big) data in a smart way will not survive. It is not sufficient to focus on data storage and data analysis. The data scientist also needs to relate data to process analysis. Process mining bridges the gap between traditional model-based process analysis (e.g., simulation and other business process management techniques) and data-centric analysis techniques such as machine learning and data mining. Process mining seeks the confrontation between event data (i.e., observed behavior) and process models (hand-made or discovered automatically). This technology has become available only recently, but it can be applied to any type of operational process (in organizations and in systems). Example applications include: analyzing treatment processes in hospitals, improving customer service processes in a multinational, understanding the browsing behavior of customers using a booking site, analyzing failures of a baggage handling system, and improving the user interface of an X-ray machine. All of these applications have in common that dynamic behavior needs to be related to process models. Hence, we refer to this as “data science in action”.

The course covers the three main types of process mining. The first type of process mining is discovery. A discovery technique takes an event log and produces a process model without using any a-priori information. An example is the α-algorithm, which takes an event log and produces a Petri net explaining the behavior recorded in the log. The second type of process mining is conformance. Here, an existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa. The third type of process mining is enhancement. Here, the idea is to extend or improve an existing process model using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. An example is the extension of a process model with performance information, e.g., showing bottlenecks. Process mining techniques can be used not only in an offline setting, but also in an online setting. The latter is known as operational support. An example is the detection of non-conformance at the moment the deviation actually takes place. Another example is time prediction for running cases, i.e., given a partially executed case, the remaining processing time is estimated based on historic information about similar cases.
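To give a feel for discovery, the first step of the α-algorithm can be sketched in a few lines of Python. The sketch below only derives the ordering relations (the "footprint") from the directly-follows pairs in a log; it does not construct the Petri net. The function name `footprint`, the relation symbols, and the three-trace example log are illustrative assumptions, not course material.

```python
from itertools import product


def footprint(log):
    """Classify every pair of activities by how they follow each other in the log.

    `log` is a list of traces; each trace is a sequence of activity names.
    This is a sketch of the first step of the alpha algorithm only.
    """
    activities = sorted({a for trace in log for a in trace})
    # Directly-follows pairs: a is immediately followed by b in some trace.
    follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}

    relations = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        if ab and not ba:
            relations[a, b] = "->"   # causality: a before b, never b before a
        elif ba and not ab:
            relations[a, b] = "<-"   # reverse causality
        elif ab and ba:
            relations[a, b] = "||"   # parallel: both orders observed
        else:
            relations[a, b] = "#"    # unrelated: never directly follow each other
    return relations


# Hypothetical example log with three distinct traces.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]
fp = footprint(log)
print(fp["a", "b"], fp["b", "c"], fp["b", "e"])
```

In this example `b` and `c` occur in both orders and are therefore classified as parallel, whereas `b` and `e` never follow each other directly, which hints at a choice between them.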

Process mining provides not only a bridge between data mining and business process management; it also helps to address the classical divide between “business” and “IT”. Evidence-based business process management based on process mining helps to create a common ground for business process improvement and information systems development. The course illustrates the concepts and algorithms with many examples based on real-life event logs. After taking this course, one is able to run process mining projects and has a good understanding of the data science field.

Objectives

After taking this course students should:

  • have a good understanding of process mining,
  • understand the role of data science in today's society,
  • be able to relate process mining techniques to other analysis techniques such as simulation, business intelligence, data mining, machine learning, and verification,
  • be able to apply basic process discovery techniques such as the alpha algorithm to learn a process model from an event log (both manually and using tools),
  • be able to apply basic conformance checking techniques (such as token-based replay) to compare event logs and process models (both manually and using tools),
  • be able to extend a process model with information extracted from the event log (e.g., show bottlenecks),
  • have a good understanding of the data needed to start a process mining project,
  • be able to characterize the questions that can be answered based on such event data,
  • be able to explain how process mining can also be used for operational support (prediction and recommendation),
  • be able to use tools such as ProM and Disco, and
  • be able to execute process mining projects in a structured manner using the L* life-cycle model.
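
As an illustration of the conformance-checking objective, token-based replay can be sketched as follows. The sketch replays a single trace on a small Petri net, counts produced (p), consumed (c), missing (m), and remaining (r) tokens, and computes the fitness ½(1 − m/c) + ½(1 − r/p). The net `NET`, the place names, and the function `replay_fitness` are hypothetical and chosen only for this example; the sketch assumes every activity in the trace appears in the net, and tools such as ProM handle many cases that are ignored here.

```python
from collections import Counter

# Hypothetical Petri net: transition -> (input places, output places).
# After "a", either "b" and "c" run in parallel or "e" runs alone; "d" closes.
NET = {
    "a": (["start"], ["p1", "p2"]),
    "b": (["p1"], ["p3"]),
    "c": (["p2"], ["p4"]),
    "e": (["p1", "p2"], ["p3", "p4"]),
    "d": (["p3", "p4"], ["end"]),
}


def replay_fitness(net, trace, source="start", sink="end"):
    """Replay one trace and return its token-based fitness between 0 and 1."""
    marking = Counter({source: 1})
    produced, consumed, missing = 1, 0, 0  # the initial token counts as produced

    def consume(place):
        nonlocal consumed, missing
        if marking[place] == 0:
            missing += 1          # token had to be created artificially
        else:
            marking[place] -= 1
        consumed += 1

    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            consume(place)
        for place in outputs:
            marking[place] += 1
            produced += 1

    consume(sink)                 # the environment removes the final token
    remaining = sum(marking.values())
    return 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)


fit_ok = replay_fitness(NET, ("a", "b", "c", "d"))   # trace fits the model
fit_bad = replay_fitness(NET, ("a", "b", "d"))       # "c" was skipped
print(fit_ok, fit_bad)
```

The second trace skips `c`, so one token is missing when `d` fires and one token is left behind in `p2`; both effects lower the fitness. For a whole event log, the four counters would be summed over all traces before applying the formula.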

Staff involved