
‘Convert CSV to XES’ plugin

I have a question about the ‘Convert CSV to XES’ plugin in ProM Lite.

When I hover over the ‘Start Time’ option, it shows the text:

“In case your lifecycle events such as ‘start’ are already separate row in the CSV file, please use the expert mode to find an appropriate mapping”

So what I have is a .csv file that contains screening (start) and screening (end) events (and a few more events, of course). Here, start and end are the lifecycle transitions. These events are in different rows, just as the message states. However, this is not available for all events; for example, ‘Registration’ only has a ‘start’ value. Can I import this directly into ProM as an .xes file, or do I need to create a different column in my .csv file? If the first option is available, do you have tips on how to create this mapping? I cannot find this online or on the forum.

Comments

  • Hi, if your data is already organized such that there are separate events (rows) for the start and complete life-cycle transitions, then just select the timestamp column as the complete timestamp. You also need a column named 'lifecycle:transition' in your CSV, in which you manually enter 'start' or 'complete' for each row (use Excel, R, or whatever to do that). If I remember correctly, the CSV importer should not overwrite an existing 'lifecycle:transition' column. If it does, then that should be fixed :-)
  • Thank you. However, I do not have a start or end time for each activity. So when I import this into ProM, it gives errors that it cannot parse the date, because there is no value, which seems logical. So what now?
  • JBuijs Posts: 914
    You only need one of the two: start or complete. If you want to ignore the 'events' where no timestamp is available, in the last wizard set the 'robustness' (the top left drop down thingy) to 'ignore event'.
    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
  • So, if I understand correctly, a summary:

    As an example, when I have four events with only a completion time and one event with both, I put all timestamps in the completion time column. In addition, I add an extra event row for the start time of that one event, with an extra column indicating the lifecycle transition (start or complete).

    Right?
    Thanks for the help!
  • JBuijs Posts: 914
    What I would do, assuming you have one row per event, all events have something filled in in the 'completeTime' (or similar) column, and only some have something in the 'startTime' column, is map the start and complete times to those columns.
    Then, in the last wizard step, the top-left dropdown sets the 'robustness'. If you set this to 'ignore event', the event won't be added if there is an error, in your case if there is no startTime value.

    Did not try this out to see if it works though.
    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
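
For readers who prefer to prepare the CSV outside of ProM, the reshaping discussed in this thread could be sketched roughly as follows. This is only an illustration in Python, not part of ProM or the CSV importer: the column names `case`, `activity`, `startTime`, `completeTime` and `timestamp`, and the helper `split_lifecycle_rows`, are assumptions made up for the example. Only the `lifecycle:transition` column name comes from the discussion above. Rows without a timestamp are simply skipped, which mimics the 'ignore event' idea.

```python
import csv
import io


def split_lifecycle_rows(rows, start_col="startTime", complete_col="completeTime"):
    """Turn one-row-per-activity records into one row per lifecycle event.

    Hypothetical helper: the column names are assumptions, not ProM conventions.
    """
    out = []
    for row in rows:
        # Copy every column except the two timestamp columns.
        base = {k: v for k, v in row.items() if k not in (start_col, complete_col)}
        for transition, col in (("start", start_col), ("complete", complete_col)):
            stamp = (row.get(col) or "").strip()
            if not stamp:
                continue  # no timestamp available: emit no event for this transition
            out.append({**base, "timestamp": stamp, "lifecycle:transition": transition})
    return out


# Made-up sample data: 'Registration' only has a start time.
sample = io.StringIO(
    "case,activity,startTime,completeTime\n"
    "1,Registration,2020-01-01 09:00,\n"
    "1,Screening,2020-01-01 09:30,2020-01-01 10:00\n"
)
events = split_lifecycle_rows(csv.DictReader(sample))

# Write the reshaped events back out as CSV text.
result = io.StringIO()
writer = csv.DictWriter(
    result, fieldnames=["case", "activity", "timestamp", "lifecycle:transition"]
)
writer.writeheader()
writer.writerows(events)
print(result.getvalue())
```

The resulting file has a single timestamp column plus a 'lifecycle:transition' column, which is the layout described in the first comment. Whether the importer picks up that column as intended is something to verify in your own ProM version.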