CSV File Import ProM 6.5 - "java.lang.NullPointerException"

ropebender
edited July 2015 in ProM 6

I greatly appreciated the Process Mining MOOC last year, but have little Java or XES file experience. I work mainly with Excel and VBA, so I was excited to see that you have added the CSV Import option to ProM 6.5. I could never get XESame to work right, so after I refreshed my familiarity with ProM and tried the examples, I started trying to work with a CSV file.

The file seemed to import OK, and the CSV Import Action allowed me to open my test file and go through the options. I was unsure of all the information to enter into the Preview pane, but did my best (as well as entering a Time Pattern of M/DD/YYYY H:MM in the appropriate block). When I clicked the Finish option I got the "java.lang.NullPointerException" noted above. I don't know what to do next. Did I leave something out?

It would be helpful if someone would write a beginner document on importing a CSV file for people who are not familiar with the XES file structure. Any takers?

Van!!

Answers

  • Dear Van,

    I'm working on the new CSV importer. It still requires quite some development and polishing, so I'm sorry that you had to discover a bug.

    About the preview pane: it is not yet well designed, so try not to change anything there. Your date format "MM/DD/YYYY HH:MM" (I think you have to enter double M and double H for month and hour) should be built in and recognized automatically.

    Could you also start ProM from a command line, keeping the console window open? Then please copy and paste the entire error (NullPointerException). This way I can investigate further :)

  • ropebender
    edited July 2015
    Sure. Will do it later tonight. I'm not sure how to start from the command line, but I think I read it somewhere. I will research and give it a try. I did try not entering anything in the preview pane and got the same error. My time code for today would be 7/13/2015 5:13 if it was a.m. Do I need to change it to 07/13/20 05:13?
  • The easiest way to start from the command line is as follows:
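The padding question above can be checked outside ProM before importing. The following is a minimal Python sketch, not part of ProM: the function names are made up for illustration, and it assumes the CSV holds month/day/year timestamps on a 24-hour clock. It rewrites values such as 7/13/2015 5:13 into a zero-padded form, or into ISO 8601, which leaves less room for a pattern mismatch.

```python
from datetime import datetime

# Assumption: source values look like "7/13/2015 5:13" (month/day/year, 24-hour clock).
SOURCE_FORMAT = "%m/%d/%Y %H:%M"


def normalize_timestamp(raw: str) -> str:
    """Return the timestamp zero-padded, keeping the four-digit year."""
    parsed = datetime.strptime(raw.strip(), SOURCE_FORMAT)
    return parsed.strftime("%m/%d/%Y %H:%M")


def to_iso(raw: str) -> str:
    """Return the timestamp in ISO 8601 form, which is unambiguous about field order."""
    parsed = datetime.strptime(raw.strip(), SOURCE_FORMAT)
    return parsed.isoformat()


if __name__ == "__main__":
    print(normalize_timestamp("7/13/2015 5:13"))
    print(to_iso("7/13/2015 5:13"))
```

In this sketch a two-digit year such as 07/13/20 does not match the four-digit-year pattern and raises a ValueError, so it seems safer to keep the full year when padding the other fields. Whether the ProM importer is equally strict is not confirmed here.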

    1. Select the folder where you installed ProM. (Usually somewhere around C:\User\#username#\prom..)

    2. Hold the SHIFT key and click with the right mouse button on the folder to open an extended context menu.

    3. Select the option 'Open command line here'

    4. Type in ProM65.bat

  • OK, that's different from what I did. I opened the CMD.exe window and typed in ProM65. The program started, I opened the CSV file and imported it, selected the options and got the same error. Nothing was listed in the CMD window, but attached is the whole error message.

    I will try again tonight (your way) and see what happens with the .bat file.

    Incidentally, I can't get the program to start at work, because the site where the packages are stored is blocked by my company's firewall. I copied the packages folder from my home PC (.ProM65\packages) to a flash drive and put it in place of the same folder on my work machine, but the packages are not recognized and I still can't run the program at work. Is there any way to install packages from a file rather than from your site?

     

  • Sorry about the late reply, I was on a short vacation.

    I tried it myself and found out that you have to start ProM by typing "ProM65.bat". If you start it by typing only "ProM65", it hides the console and, thus, all detailed error messages.

    So could you please try this again?

    Besides, I think we should really incorporate some kind of error reporting mechanism. Just hiding the details from users makes it very hard to find errors. 

    Regarding the firewall issue, I think you need to modify the ProM.ini.

  • Success! (At least partially.)

    What I found is that even though both possible selections of the Timestamp option say "Optional", I had to enter the Timestamp field under Start Time rather than End Time (even though the time is applied to the record when the event is complete).

    When I did that, I was able to generate an .XES file and begin to analyze the event logs. However, I think I need to modify the .XES logs in some way to further define what the various fields are and how they are intended to operate, but all those options and packages were not covered in the MOOC (not surprising; after all, you have a full course of study for that stuff).

    If I could get two questions answered, though, I could make a start of it:

    When I look at the events in the Dashboard display, I see each of the fields has an entry with the fields separated by the pipe symbol, like this -- for Case 25 listing three events --

    Department name | Resource Name | Status (Complete)

    where Department might be "Printing", "Shipping", "New Call"

    Resource Name might be "John Smith", "Ed Collins", "Tom Jones"

    and Status might be "Pending", "Assigned", "Resolved"

    The first issue is that the word (Complete) appears after every Status entry, when the Case is actually not finished until the status is either "Resolved" or "Cancelled". The recovered processes all seem to treat all the statuses as if they are ending the process instead of being one of a number of intermediate steps.

    The second issue is that when I try to generate a Social Network, I can't find a way to assign the Resources to the departments. No matter how I set up the parameters of the Social Network package, I just get one single "bubble" and can't find a way to see all the departments and resources listed as they should be.

    I assume that this has to do with definitions of the fields in the XES file, and am wondering if there is a resource I might get that would explain further what I need to do?

    I am able to generate a Petri Net and several other process nets by importing a CSV file with only a Time Stamp, Class, and Department, but I would like to get more information from my available data.

    Sorry for the length of this post, but I am so excited to have this great tool that I want more knowledge than I have. I would be grateful for any hints you can give me.

     

  • Regarding the Firewall issue, I modified the ini file and all is OK.  I just copied all the Packages to a folder and referenced that folder.  It works great.

    Thanks!

  • Dear ropebender,

    Regarding your question concerning the 'complete' labels: by default an event is classified using the activity name and the status (e.g. "Printing - start" and "Printing - end"), so that activity durations can be calculated. The second part is stored in the 'lifecycle:state' attribute, so you might be able to use one of your fields (the one that contains resolved/cancelled) to indicate the state of the activity.

     

    Regarding the department assignment, the social network plug-ins by default use the org:resource attribute, and not the department. You can create a specific mapping that uses this field, or you can use another plug-in, for instance the inductive miner, and set it to use the org:resource or org:group fields to discover a process model that shows how a case moves between resources/groups.

    I hope this helps!

    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
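To see what "how a case moves between resources/groups" means on the data itself, the idea can be tried outside ProM. This is a minimal Python sketch of a handover-of-work count, one of the common social-network metrics in process mining. It is not how any ProM plug-in is implemented. The rows are hypothetical: the names come from the examples earlier in the thread, but which resource belongs to which department, the timestamps, and case 26 are made up, and the dictionary keys are illustrative choices.

```python
from collections import Counter, defaultdict

# Hypothetical event rows; pairings of resource, department, and status are invented.
EVENTS = [
    {"case": "25", "time": "2015-07-13T05:13", "org:group": "New Call",
     "org:resource": "Tom Jones", "status": "Pending"},
    {"case": "25", "time": "2015-07-13T06:02", "org:group": "Printing",
     "org:resource": "John Smith", "status": "Assigned"},
    {"case": "25", "time": "2015-07-13T09:40", "org:group": "Shipping",
     "org:resource": "Ed Collins", "status": "Resolved"},
    {"case": "26", "time": "2015-07-13T07:30", "org:group": "New Call",
     "org:resource": "Tom Jones", "status": "Pending"},
    {"case": "26", "time": "2015-07-13T08:15", "org:group": "Printing",
     "org:resource": "John Smith", "status": "Cancelled"},
]


def handovers(events, key):
    """Count how often work passes from one value of `key` to a different one within a case."""
    by_case = defaultdict(list)
    for event in events:
        by_case[event["case"]].append(event)

    counts = Counter()
    for case_events in by_case.values():
        # ISO 8601 timestamps sort chronologically as plain strings.
        ordered = sorted(case_events, key=lambda e: e["time"])
        for previous, current in zip(ordered, ordered[1:]):
            if previous[key] != current[key]:
                counts[(previous[key], current[key])] += 1
    return counts


def members_by_group(events):
    """List which resources were seen working in each group."""
    groups = defaultdict(set)
    for event in events:
        groups[event["org:group"]].add(event["org:resource"])
    return {group: sorted(members) for group, members in groups.items()}


if __name__ == "__main__":
    print(handovers(EVENTS, "org:resource"))
    print(handovers(EVENTS, "org:group"))
    print(members_by_group(EVENTS))
```

Calling `handovers` with "org:resource" gives the person-to-person view, and calling it with "org:group" gives the department-to-department view. If every event carries the same value for the chosen key, no handovers are counted at all, which may be one reason a social network collapses into a single node.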