Want to win?

Win the Process Discovery Contest (PDC) 2018!

Project setting

The Process Discovery Contest is dedicated to the assessment of tools and techniques that discover business process models from event logs. The objective is to compare the efficiency of techniques to discover process models that provide a proper balance between “overfitting” and “underfitting”. A process model overfits (the event log) if it is too restrictive, disallowing behavior that is part of the underlying process. This typically occurs when the model only allows the behavior recorded in the event log. Conversely, it underfits (the reality) if it is not restrictive enough, allowing behavior that is not part of the underlying process. This typically occurs when it overgeneralizes the example behavior in the event log.

A number of event logs will be provided. These event logs are generated from business process models that show different behavioral characteristics. The process models will be kept secret: only “training” event logs showing a portion of the possible behavior will be disclosed. The winners are the contestants who provide the technique that discovers the process models closest to the original process models, in terms of balancing “overfitting” and “underfitting”.

To assess this balance we take a classification perspective, using a “test” event log. The test event log contains traces representing real process behavior and traces representing behavior not related to the process. Each trace of the training and test logs records a complete execution of one instance of the business process. In other words, each trace records all events of one process instance from the start state to the end state.

A model balances “overfitting” and “underfitting” well to the extent that it correctly classifies the traces in the “test” event log (a small illustrative sketch follows the list):

  • Given a trace representing real process behavior, the model should classify it as allowed.
  • Given a trace representing behavior not related to the process, the model should classify it as disallowed.
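
To make this criterion concrete, the sketch below (in Python) scores a hypothetical discovered model against a labeled test log. The accepts predicate, the trace representation, and the function name classification_accuracy are assumptions made for illustration only; in practice the check would be performed by replaying each trace on the discovered model in whatever tool or notation the contestant uses.

  # Minimal sketch of the classification criterion above (illustrative only,
  # not part of the official contest tooling). A "model" is represented here
  # by an accepts(trace) predicate and a trace by a tuple of activity labels;
  # both representations are assumptions made for this example.

  def classification_accuracy(accepts, labeled_test_log):
      """Fraction of test traces the model classifies correctly.

      labeled_test_log is an iterable of (trace, is_real_behavior) pairs:
      is_real_behavior is True for traces of the underlying process and
      False for traces that do not belong to it.
      """
      correct = total = 0
      for trace, is_real_behavior in labeled_test_log:
          total += 1
          # The model should allow real behavior and disallow the rest.
          if accepts(trace) == is_real_behavior:
              correct += 1
      return correct / total if total else 0.0

  if __name__ == "__main__":
      # Toy model: allows only traces that start with "a" and end with "d".
      toy_accepts = lambda trace: trace[:1] == ("a",) and trace[-1:] == ("d",)
      test_log = [
          (("a", "b", "c", "d"), True),   # real process behavior
          (("a", "c", "b", "d"), True),   # real process behavior
          (("b", "a", "d"), False),       # behavior not related to the process
      ]
      print(classification_accuracy(toy_accepts, test_log))  # prints 1.0

A model that balances “overfitting” and “underfitting” well scores close to 1.0 on such a labeled test log.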

The contest is not restricted to any modelling notation and no preference is made. Any procedural (e.g., Petri net or BPMN) or declarative (e.g., Declare) notation is equally welcome. The contest is not restricted to open-source tools; proprietary tools can also participate.

Project description

The goal of the Master Project is to participate in the 2018 edition of the Process Discovery Contest, if possible using the techniques that were developed for the 2017 edition (see the figure below for an example model) and that allowed us to classify all traces correctly. However, these techniques did not allow us to win the 2017 edition because the models we generated were considered to be less informative than the BPMN models as discovered by the winning competitor. As a result, a possibility would be to develop a conversion from our models to BPMN models. Furthermore, the 2018 edition might bring new challenges to the Contest, which might require extensions to our techniques. As the call for the 2018 edition is not out yet, it is hard to say what kind of extensions would be needed.

For the earlier editions of the Contest, the prize for the winner included a flight to the BPM conference, the lodging expenses during the conference, and a full registration for the conference. Provided that the same prizes are available for the 2018 edition, and provided that the result of the Master Project wins the Contest, the master student will receive these prizes and visit the BPM 2018 Conference, which will be held in Sydney.

Time restrictions

Given that the main goal is to participate in the 2018 edition of the Contest, it would be ideal if the master student starts just before the Contest starts. This way, the student can first get acquainted with the techniques using the 2017 edition, and then start applying and extending them for the 2018 edition. As the Contest typically starts in March, it would be ideal if the student starts in February or March.

Project Team

Principal Supervisor

Renata Medeiros de Carvalho

Position: Assistant Professor (UD)
Room: MF 7.059
Tel (internal):
Projects: BPR4GDPR
Courses: 2IAB0, 2IMC93, 2IMC98, JM0200
Links: Scopus page, DBLP page, TU/e employee page

Daily Supervisor

Eric Verbeek

Position: Scientific Programmer
Room: MF 7.099
Tel (internal): 3755
Projects: BPR4GDPR, CoseLoG, Philips Flagship
Courses: 2IIH0
Links: Personal home page, Google scholar page, Scopus page, ORCID page, DBLP page, TU/e employee page
Eric is the scientific programmer in the AIS group. As such, he is the custodian of the process mining framework ProM. If you want access to the ProM repository, or have any questions related to ProM and its development, ask Eric. Recently, he has been working on a decomposition framework for both process discovery and conformance checking in ProM. Earlier, he also worked on ExSpect and Woflan.