
Dataset conversion error from CSV to XES (ProM 6): duplicate events observed

As part of the BPI Challenge 2017, we joined the application log and the offer log that were provided, which gave us a consolidated CSV file of 1.21M rows. However, after converting the CSV to XES using ProM 6, we observed that the number of events is exactly double the number of rows (around 2.3M).

Below are the settings we used in ProM.
We mapped the cases (Application IDs) to the events and added the timestamps (startTime and completeTime) according to their format.

 Capture1.PNG

The remaining settings are shown in the screenshot.

Kindly help, as the number of events should ideally be 1.2M.

Thank you

Regards,
Akshar Solanki


Comments

  • Dear Akshar,

    This makes sense. Each row in the table gets converted into a start event (based on the start event time) and an end event (based on the end event time). Hence there are twice as many events as rows.

    If you want to remove events and focus only on the complete events, for example, you need to filter afterwards.
  • Dear bfvdonge,

    Thank you for your reply. It really helps a lot.
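
For readers who want to see the doubling described in the first reply, here is a minimal Python sketch. It is not ProM's importer, only an imitation of the behaviour the reply describes: each row yields one start event and one complete event, and a filter applied afterwards keeps only the complete events. The column names `case_id` and `activity` and the sample rows are made up for illustration; `startTime` and `completeTime` are the timestamp columns named in the question, and the attribute keys follow the standard XES extensions.

```python
from typing import Dict, Iterable, List

Event = Dict[str, str]


def rows_to_events(rows: Iterable[Dict[str, str]]) -> List[Event]:
    """Turn every row into two events: one 'start' and one 'complete'."""
    events: List[Event] = []
    for row in rows:
        # One event per timestamp column, so each row contributes two events.
        for transition, column in (("start", "startTime"),
                                   ("complete", "completeTime")):
            events.append({
                "case": row["case_id"],            # hypothetical column name
                "concept:name": row["activity"],   # hypothetical column name
                "lifecycle:transition": transition,
                "time:timestamp": row[column],
            })
    return events


def keep_complete(events: Iterable[Event]) -> List[Event]:
    """Filter afterwards: keep only the 'complete' events."""
    return [e for e in events if e["lifecycle:transition"] == "complete"]


# Made-up sample rows, not taken from the BPI Challenge 2017 data.
rows = [
    {"case_id": "A1", "activity": "Create offer",
     "startTime": "2016-01-01T09:00:00", "completeTime": "2016-01-01T09:05:00"},
    {"case_id": "A1", "activity": "Send offer",
     "startTime": "2016-01-01T10:00:00", "completeTime": "2016-01-01T10:02:00"},
]

events = rows_to_events(rows)
complete_only = keep_complete(events)
print(len(rows), len(events), len(complete_only))
```

With two input rows the sketch produces four events, and filtering on the complete transition brings the count back to one event per row, which matches the event count the original poster expected.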
