
Converting an imported CSV file to an XES event log throws java.lang.ArrayIndexOutOfBoundsException: 14

When I convert my CSV file to a .xes file in ProM 6.6, I get the following output:
Sorting CSV file (48,70 MB) by case and time using maximal 1092 MB of memory ...
Pre-sorting finished segment 1 ...
Finished sorting in 3 seconds
Reading cases ...
java.lang.ArrayIndexOutOfBoundsException: 14

It turns out that the same CSV file loads without any problem in Disco. When I export the event log from Disco to a .xes file, that file loads without problems in ProM.

What could possibly be the issue with the original .csv file? Or could I be doing something wrong?

Answers

  • Dear Graciela,
    Would it be possible to share the CSV file with me for debugging purposes (f.mannhardt@tue.nl)? I would be interested in fixing this problem for future users. Maybe you can reduce the CSV file to a smaller one in which the issue still appears?

    Also, could you start ProM with the '.bat' file instead of the '.exe' file, try the conversion again, and send me the full output of the console window? This gives me better error messages, as the error messages shown in ProM are, unfortunately, often not very useful for developers.

    Thanks

  • Thanks for the full error log sent by email. I cannot help with the package manager (it is best to open a separate forum thread for that).

    As for the CSV importer: it looks like your CSV file has an inconsistent number of columns. Apparently, at least one row has more than 14 columns, while the header has only 14 columns.

    I will improve the importer in the current nightly build to give a clear error message for such malformed CSV files. Unfortunately, I cannot fix this problem in ProM 6.6, as the policy is not to update released versions. If you use ProMLite, the fix should be included after a while.

    It seems that Disco deals with such CSV files by guessing or ignoring some cells. In ProM, I chose not to do this, as it might lead to data quality issues that go unnoticed. Maybe you can use Excel or some other tool to see which row has more columns than the header.

    Thanks for reporting this problem.
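
As an alternative to checking by hand in Excel, the rows whose column count differs from the header can be listed with a short script. This is only a sketch using Python's standard `csv` module, not part of ProM; the function name `find_ragged_rows`, the sample data, and the comma delimiter and double-quote character are assumptions that would need to match the actual file.

```python
import csv
import io


def find_ragged_rows(lines, delimiter=",", quotechar='"'):
    """Return the header's column count and a list of
    (line_number, field_count) for rows that do not match it."""
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    header = next(reader)
    expected = len(header)
    ragged = []
    for row in reader:
        # Skip completely empty lines; report every other mismatch.
        if row and len(row) != expected:
            ragged.append((reader.line_num, len(row)))
    return expected, ragged


# Made-up sample: line 3 has an unquoted comma in its last cell,
# line 4 has the same kind of text but properly quoted.
sample = io.StringIO(
    "case,activity,comment\n"
    "1,A,ok\n"
    "2,B,contains, a comma\n"
    '3,C,"quoted, comma"\n'
)
expected, ragged = find_ragged_rows(sample)
print(f"header has {expected} columns")
for line_number, count in ragged:
    print(f"line {line_number}: {count} columns")

# For a real file (hypothetical name), open it with newline="" as the
# csv module documentation recommends:
# with open("eventlog.csv", newline="", encoding="utf-8") as f:
#     expected, ragged = find_ragged_rows(f)
```

Note that this counts columns the way Python's parser splits them; another importer may treat quotes differently, so the reported lines are candidates to inspect rather than a definitive diagnosis.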

  • I found the cause of this error.
    One of my text columns contains a lot of special characters, such as double quotes, brackets, and commas.
    I tried replacing the double quotes with single quotes and converting the CSV file again, but a java.lang.ArrayIndexOutOfBoundsException was still thrown.

    However:
    Converting my event log without this column worked fine!

    Thanks a bunch Felix!
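
For readers who want to keep such a free-text column, the sketch below illustrates why commas inside a cell can change the column count, and how writing the file with a CSV library that quotes such cells keeps the count stable. It uses Python's standard `csv` module with made-up data; whether the ProM 6.6 importer accepts the resulting quoting has not been verified here and is an assumption.

```python
import csv
import io

# Made-up rows: the comment cell contains double quotes, a comma, and brackets.
rows = [
    ["case", "activity", "comment"],
    ["1", "A", 'said "hello", then left (early)'],
]

# Write with the default quoting rules: cells containing the delimiter or
# the quote character are wrapped in double quotes, and embedded double
# quotes are doubled.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(rows)
text = buffer.getvalue()

# Splitting on every comma ignores the quoting and sees an extra column.
naive_counts = [len(line.split(",")) for line in text.splitlines()]

# A quote-aware parser recovers the original cells.
parsed = list(csv.reader(io.StringIO(text)))
parsed_counts = [len(row) for row in parsed]

print(text)
print("naive split:", naive_counts)
print("csv.reader: ", parsed_counts)
```

If the importer still fails on a file written this way, dropping or cleaning the column, as described above, remains the safer workaround.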