Archive of possible assignments

Note that these assignments are no longer current. They are retained here to give an idea of possible assignments.

Celonis’ special internship & master thesis program for TU Eindhoven students

Improving the Evolutionary Tree Miner

The Evolutionary Tree Miner (ETM) is a genetic process discovery algorithm that works on process trees, a specific process modelling formalism. Recently work has started on an interactive process discovery algorithm where the user is guided to modify a free-choice Petri net in such a way that the Petri net is always sound.

The master project would consist initially of ‘connecting’ the ETM to this interactive process discovery approach, hence replacing the human. The main benefit would be that the ETM does not solely work on process trees any more (which can be somewhat restrictive), but directly on Petri nets in such a way that they are guaranteed to remain sound.

After this initial step, several other improvements can be made to the ETM, such as extending the work on deriving alignments, estimating the quality of a process model, smarter mutations, smarter application of mutations, starting from solutions created by other algorithms, etc.
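To make the idea of a mutation on a process tree concrete, here is a minimal sketch in Python. It is not the ETM's actual implementation (which lives in ProM and is written in Java): the `Node` class, the operator names `seq`, `xor` and `and`, and the `mutate_operator` function are all assumptions made for illustration only.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical operator labels for this sketch (sequence, exclusive choice, parallel).
OPERATORS = ("seq", "xor", "and")


@dataclass
class Node:
    """A process tree node: an operator with children, or an activity leaf."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def operator_nodes(tree: Node) -> List[Node]:
    """Collect all operator (non-leaf) nodes in pre-order."""
    found = [] if tree.is_leaf() else [tree]
    for child in tree.children:
        found.extend(operator_nodes(child))
    return found


def mutate_operator(tree: Node, rng: random.Random) -> Optional[Node]:
    """Replace the operator of one randomly chosen inner node, in place.

    Returns the mutated node, or None if the tree has no operator nodes.
    """
    candidates = operator_nodes(tree)
    if not candidates:
        return None
    node = rng.choice(candidates)
    node.label = rng.choice([op for op in OPERATORS if op != node.label])
    return node


def to_text(tree: Node) -> str:
    """Render the tree as nested text, e.g. seq(a, xor(b, c), d)."""
    if tree.is_leaf():
        return tree.label
    return f"{tree.label}({', '.join(to_text(c) for c in tree.children)})"


tree = Node("seq", [Node("a"), Node("xor", [Node("b"), Node("c")]), Node("d")])
before = to_text(tree)
mutated = mutate_operator(tree, random.Random(0))
after = to_text(tree)
print(before, "->", after)
```

Because every operator node keeps its children, a mutation of this kind always yields another well-formed tree, which illustrates why tree-based representations make it easy to stay within the space of valid models.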

An important part of the project is therefore implementing these ideas in the ETM, which is part of our toolset ProM. Good programming skills in Java are important, regardless of whether you are a BIS or a CSE student.

→ Read more...

Understanding Customer Journeys with Process Mining

In today’s customer environments, where customers use many different contact channels to resolve outstanding questions and make requests, it becomes increasingly difficult to follow customer behavior, optimize service levels and provide a memorable experience for customers, especially when the contacts of a customer are not linked to each other. Customer journeys can be product specific (e.g. changing your telephone provider), customer specific (e.g. changing a known home address) or life event specific (e.g. starting a family).

Underlined works together with several companies like CZ, Aegon and SNS to build a generic framework in which all traceable, incoming and outgoing customer contacts (call, web, e-mail, chat, etc.) are brought together as a unique dataset. The dataset of companies is further enriched by Underlined with relevant analysis that can be linked to (unique) customer events.

Underlined has worked together with the TU/e (Bart Hompes and Joos Buijs from the Architecture of Information Systems group) and CZ (one of the largest Dutch health insurance companies) to develop customer journey mining algorithms. This research showed that it is possible to distinguish the different journeys per customer without any prior process knowledge, but additional research is needed to:

  1. Apply machine learning techniques to cluster customers with similar behavior. The customer journey can vary significantly depending on customer characteristics, so trying to build a single model for all customers leads to unsatisfactory results. Applying machine learning or OLAP techniques (a.k.a. process cubes) would therefore be beneficial to split the customers into clusters, each containing customers with similar characteristics.
  2. Research a multi-dimensional similarity matrix. Current customer journey mining algorithms need to know which activities might be related to each other. This information is stored in a similarity matrix. Currently there is a working version for relatively simple datasets and processes. We would like to develop a next-level similarity matrix for more complex data that contain multiple journeys and multiple customer segments.
  3. Predictive modelling of emotions in the journey. Customers make decisions in the customer journey based on their emotions. Current datasets contain sample feedback, in which customers express their opinions and concerns regarding the service of a company. We would like to research a predictive model that, on the basis of past customer journeys, can predict the intermediate and final emotions of customers who are currently active in their journeys.
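As a rough illustration of what an activity similarity matrix could look like, the sketch below scores each pair of activities by the Jaccard overlap of the customers who performed them. This scoring rule, the record layout and the sample data are assumptions made for this example; the text above does not say how the existing similarity matrix is computed.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

# Hypothetical contact records: (customer id, activity).
contacts = [
    ("c1", "call: address change"),
    ("c1", "web: update profile"),
    ("c2", "call: address change"),
    ("c2", "web: update profile"),
    ("c2", "chat: invoice question"),
    ("c3", "chat: invoice question"),
]


def similarity_matrix(
    records: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], float]:
    """Jaccard similarity between activities, based on shared customers."""
    customers_by_activity: Dict[str, Set[str]] = {}
    for customer, activity in records:
        customers_by_activity.setdefault(activity, set()).add(customer)

    matrix: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(sorted(customers_by_activity), 2):
        shared = customers_by_activity[a] & customers_by_activity[b]
        union = customers_by_activity[a] | customers_by_activity[b]
        score = len(shared) / len(union)
        # Store both directions so lookups do not depend on argument order.
        matrix[(a, b)] = matrix[(b, a)] = score
    return matrix


matrix = similarity_matrix(contacts)
for (a, b), score in sorted(matrix.items()):
    print(f"{a!r} ~ {b!r}: {score:.2f}")
```

A matrix built this way is one-dimensional in the sense that it ignores customer segments and journey types; extending it along those dimensions is the kind of question the second research item raises.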

In the analyses mentioned above, the student should try to leverage every piece of available data. This includes logged data on past customer behavior, for example call center data, online click trails, social media data and online and offline feedback, but also non-transactional data like product usage and customer segmentation variables.

→ Read more...

Want to win?

Win the Process Discovery Contest (PDC) 2018!

→ Read more...

SAP Process Mining at Ciber

Ciber Netherlands [1] is an IT consulting company that originated in Detroit and is now part of the Manpower group. They have a strong interest in using Process Mining to improve their systems and the services they provide to their customers. In particular, they wish to apply Process Mining to the systems they use for managing their internal processes. They want to focus their efforts on the SAP platform, which many of their clients use as well. Performing this project on their own SAP systems would be a great way to demonstrate the potential of Process Mining in this kind of environment, and they could then extend the approach to their clients. Until now, Process Mining on SAP systems has been performed in an ad-hoc fashion. However, we aim to automate this procedure and apply new techniques [2,3] to extract the relevant information. These techniques make it possible to retrieve historical and execution information, enabling the application of analysis methods in a more standardized and meaningful way. Many challenges remain open before these techniques can be applied in real-life scenarios. One of them is identifying interesting views on the data in order to obtain useful representations of the process. Often, this requires the involvement of domain experts, but a more automatic way would be desirable. Therefore, the project would involve not only the application of existing techniques, but also the development of new methods to identify meaningful views on the real data.

→ Read more...

Operational Support for Analysis and Avoidance of Threats and Vulnerabilities in Global Supply-chain Processes (2 Master projects)

Supply-chain and logistics processes are facing threats and issues as never before. Because of terrorism and other forms of undesirable or illegal activities, supply chains are subject to high vulnerabilities and disruptions. In addition, competition among supply-chain providers demands a timely and more efficient flow of legitimate commerce through the European Union (EU) and other nations around the world. The aim of these Master projects is to demonstrate that vulnerabilities and inefficiencies can be predicted to some degree and that recommendations can be given to minimize threats and risks, with tangible benefits to the stakeholders involved. To achieve the expected results, students will leverage techniques that combine Process and Data Mining, such as classification based on decision/regression trees, OLAP technologies, process discovery and compliance checking.

Specifically, two Master projects will be offered. The first project is in partnership with Jan De Rijk, a leading provider of transportation and distribution services, operating a large, modern and diversified fleet of 1000 vehicles across Europe. The second project is carried out in collaboration with Portbase, a community that brings together more than 3200 customers in all sectors of the Rotterdam port and provides integration services. In both projects, students will work on the datasets of the respective companies and will be given the opportunity to visit the companies multiple times to meet with the different stakeholders in order to:

  • Learn how to access and understand the historical event data
  • Discuss the business requirements
  • Present and obtain feedback on the (intermediate) results.

→ Read more...

Certification for the XES Standard

On November 11th, 2016, the IEEE Standards Association officially published the XES Standard as IEEE Std 1849™-2016: IEEE Standard for eXtensible Event Stream (XES) for Achieving Interoperability in Event Logs and Event Streams. The IEEE Task Force on Process Mining had been driving the standardization process for over six years, because the standard allows for the exchange of event data between different process mining tools.

Through the XES Standard, event data can be transported from the location where it was generated to the location where it can be stored and analyzed, without losing semantics. The XES Standard enforces that this transport and storage is done in a standardized way, that is, in a way that is clear and well-understood. Next to providing a standardized syntax and semantics, the XES Standard also allows for extensions, e.g., adding cost information or domain-specific attributes to events.
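The sketch below shows, under stated assumptions, how a tiny event log with cost information might be serialized in the XES XML format using only the Python standard library. The attribute keys (`concept:name`, `time:timestamp`, `cost:total`) follow the commonly used Concept, Time and Cost extensions; the `xes.version` value, the sample data and the `build_xes` helper are assumptions for illustration and should be checked against the standard itself.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_xes(traces):
    """Serialize {case id: [(activity, timestamp, cost), ...]} as XES text."""
    # The version string is an assumption; verify it against the standard.
    log = ET.Element("log", {"xes.version": "1849-2016", "xes.features": ""})

    # Declare the extensions whose attribute keys are used below.
    for name, prefix in (("Concept", "concept"), ("Time", "time"), ("Cost", "cost")):
        ET.SubElement(log, "extension", {
            "name": name,
            "prefix": prefix,
            "uri": f"http://www.xes-standard.org/{prefix}.xesext",
        })

    for case_id, events in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        for activity, timestamp, cost in events:
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
            ET.SubElement(event, "date", {"key": "time:timestamp",
                                          "value": timestamp.isoformat()})
            ET.SubElement(event, "float", {"key": "cost:total", "value": str(cost)})

    return ET.tostring(log, encoding="unicode")


# Hypothetical sample data: one case with two events.
sample = {
    "case-1": [
        ("Register request", datetime(2016, 11, 11, 9, 0, tzinfo=timezone.utc), 12.5),
        ("Approve request", datetime(2016, 11, 11, 10, 30, tzinfo=timezone.utc), 40.0),
    ]
}
xes_text = build_xes(sample)
print(xes_text)
```

The cost attribute is an ordinary typed attribute whose key carries the extension prefix, which is how an extension adds meaning to an event without changing the base syntax.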

→ Read more...

Distributed optimization of energy consumption for electric vehicles with Enervalis

Sales of electric vehicles (EVs) are on the rise. This poses a number of challenges for aging grids all across Europe by introducing new peak loads on already stressed grids. The problem is exacerbated by an increasing contribution of renewable energy sources. At the same time, EVs can also form part of the solution as mobile loads that offer flexibility to the grid through demand-side management. In this context, however, it is necessary to balance a number of (often competing) constraints, foremost of which is meeting user specifications. Additional constraints include maximal self-consumption of renewable generation (e.g. solar), improving grid quality, and taking advantage of market signals, either directly or by providing flexibility of consumption.

The optimization problem is tractable for a single vehicle but becomes more convoluted when applied to many (hundreds or even thousands of) vehicles, each with its own set of constraints. Furthermore, completely centralized control is problematic from both a logistics (providing a communication framework) and a privacy (sharing user data) perspective. A number of techniques have been proposed to solve this problem in a general setting, ranging from classical optimization algorithms to meta-heuristics, distributed constrained optimization and multi-agent reinforcement learning.
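To illustrate why the single-vehicle case is tractable, here is a deliberately simplified sketch: one vehicle must receive a given amount of energy before departure, each time slot has a price and a maximum charge, and the cheapest slots are filled first. The cost model, the function name `schedule_charging` and the numbers are assumptions for this example; the greedy rule is only optimal for this simplified linear setting and ignores the grid, renewable and privacy constraints discussed above.

```python
from typing import List


def schedule_charging(
    prices: List[float],
    energy_needed_kwh: float,
    max_kwh_per_slot: float,
) -> List[float]:
    """Return the energy (kWh) to charge in each slot at minimal total cost."""
    if energy_needed_kwh > max_kwh_per_slot * len(prices) + 1e-9:
        raise ValueError("not enough slots to deliver the requested energy")

    plan = [0.0] * len(prices)
    remaining = energy_needed_kwh
    # Visit slots from cheapest to most expensive and fill each up to its cap.
    for slot in sorted(range(len(prices)), key=lambda i: prices[i]):
        if remaining <= 0:
            break
        plan[slot] = min(max_kwh_per_slot, remaining)
        remaining -= plan[slot]
    return plan


# Hypothetical prices per slot (EUR/kWh) and user requirement.
prices = [0.30, 0.10, 0.20, 0.40]
plan = schedule_charging(prices, energy_needed_kwh=15.0, max_kwh_per_slot=10.0)
cost = sum(p * e for p, e in zip(prices, plan))
print(plan, cost)
```

With many vehicles sharing one grid connection, the per-slot caps become coupled across vehicles, so this independent greedy choice no longer applies; that coupling is what motivates the distributed techniques mentioned above.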

→ Read more...

Text Mining with KPMG

Together with KPMG’s department on sustainability, we are looking for a Master student interested in text and data mining. Part of KPMG’s job is to judge yearly reports on the topic of sustainability. This task is currently done by manually assessing the sentiment of a yearly report against the actual measurable sustainability values to see if the report is too optimistic or too pessimistic.

The goal of this project is to see if this can be done automatically through text and data mining. KPMG has a large body of test and training data available for the student to start comparing reports. In-depth knowledge of both supervised and unsupervised techniques will be needed for this project.

KPMG is a highly competitive company. Therefore, they are looking for a CSE or BIS Master student who is on track to graduate cum laude (i.e. with an average grade of 8). Furthermore, the student should have a strong interest in data mining. A selection process within KPMG is part of the hiring procedure.

→ Read more...

Amey SC Student Projects

Strategic Consulting - Student Data Science Projects

→ Read more...

Data-aware process mining with QPR

Together with QPR, we are looking for a Master student interested in data-aware process mining.

The goal of this project is to do a case study with data provided by QPR. This data will be made available in their native data format. The case study will investigate what we can learn from the data using both QPR ProcessAnalyzer and the data-aware techniques as available in ProM.

As part of the project, the Master student needs to implement scripts that convert the native QPR data format to the XES data format (as used by ProM) and back. For this reason, some programming skills are required.
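A conversion script of this kind could be sketched as follows. The native QPR format is not described here, so the sketch assumes a flat table of (case id, activity, timestamp) rows as a stand-in; the function names, the `xes.version` value and the sample rows are likewise assumptions. It also ignores XML namespaces and attribute types beyond strings and dates, which a real converter would need to handle.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Row = Tuple[str, str, str]  # (case id, activity, ISO 8601 timestamp)


def rows_to_xes(rows: List[Row]) -> str:
    """Group flat rows by case and serialize them as XES text."""
    log = ET.Element("log", {"xes.version": "1849-2016"})  # assumed version string
    traces = {}
    for case_id, activity, timestamp in rows:
        if case_id not in traces:
            trace = ET.SubElement(log, "trace")
            ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
            traces[case_id] = trace
        event = ET.SubElement(traces[case_id], "event")
        ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
        ET.SubElement(event, "date", {"key": "time:timestamp", "value": timestamp})
    return ET.tostring(log, encoding="unicode")


def xes_to_rows(xes_text: str) -> List[Row]:
    """Flatten XES text back into (case id, activity, timestamp) rows."""
    rows: List[Row] = []
    log = ET.fromstring(xes_text)
    for trace in log.findall("trace"):
        # Only direct children are searched, so this is the trace's own name.
        case_id = trace.find("string[@key='concept:name']").get("value")
        for event in trace.findall("event"):
            activity = event.find("string[@key='concept:name']").get("value")
            timestamp = event.find("date[@key='time:timestamp']").get("value")
            rows.append((case_id, activity, timestamp))
    return rows


# Hypothetical sample rows, already ordered by case.
rows = [
    ("case-1", "Create order", "2016-11-11T09:00:00+00:00"),
    ("case-1", "Approve order", "2016-11-11T10:30:00+00:00"),
    ("case-2", "Create order", "2016-11-12T08:15:00+00:00"),
]
round_trip = xes_to_rows(rows_to_xes(rows))
print(round_trip == rows)
```

Checking that a round trip reproduces the input rows is a cheap sanity test for both directions of the conversion, although it only holds exactly when the input rows are already grouped by case.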

→ Read more...