Marco Montali e-mailed me today with the following question, which, I agree, might be of interest to the XES community.
The problem is twofold, and I think that both issues are of interest to the XES community.
The first issue concerns parsing logs whose timestamps use a format other than the one exemplified by 2005-10-24T11:57:31.000+01:00.
In particular, both the XesXMLParser and the XMXMLParser are bound to this format, and I did not find an easy way to set up a different one.
Looking into the APIs of OpenXES, I have found that both parsers rely on XsDateTimeConversion to parse timestamps. However, the XsDateTimeConversion object used for parsing is a protected member of the parsers, and therefore cannot be customized from the external world.
Indeed, XsDateTimeConversion understands only the format I have shown before (you can find this information at http://code.deckfour.org/xes/doc/org/deckfour/xes/util/XsDateTimeConversion.html).
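For readers unfamiliar with that format, here is a minimal sketch of how a timestamp such as 2005-10-24T11:57:31.000+01:00 can be parsed with the plain JDK. It is only an illustration: the class name XsLikeTimestampDemo and the pattern string are assumptions of this sketch, and it does not claim to reproduce how XsDateTimeConversion works internally. The XXX pattern letters require Java 7 or later.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class XsLikeTimestampDemo {

    // Assumed pattern for timestamps like 2005-10-24T11:57:31.000+01:00.
    // XXX matches an ISO 8601 offset such as +01:00 (or Z for UTC).
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX";

    /** Parses the text with PATTERN; returns null if the text does not match. */
    static Date parseOrNull(String text) {
        // The pattern is purely numeric, so the locale does not affect parsing here.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Date parsed = parseOrNull("2005-10-24T11:57:31.000+01:00");
        System.out.println("Milliseconds since the epoch: " + parsed.getTime());
        // A timestamp in another layout is rejected by this pattern.
        System.out.println("Other layout: " + parseOrNull("Mon Sep 24 11:57:31 GMT 2012"));
    }
}
```

Any timestamp that deviates from the pattern yields null here, which mirrors the situation described above: a parser bound to one format cannot read logs written in another.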
To make the date format customizable, I have therefore implemented an ugly patch: I subclass the parser and, inside its constructor, assign a subclass of XsDateTimeConversion that takes a date format as a parameter. You can find the implementation below.
However, there is still another issue, related to localization: by default, the SimpleDateFormat class is instantiated with the Locale of the running application. In the general case, the Locale of the user who generated the log may differ from the Locale of the user who is importing and analyzing that log. If the two Locales differ, then setting up the right date format is not sufficient when timestamps contain not only numbers but also textual information, such as month or weekday names.
For example, Fabrizio was trying to parse a log where the months in the timestamps are represented with three letters ("Sep" for September). The parser didn't work, because the format was not recognized. Hence I implemented the patch described above, making it possible to pass a customized format (EEE MMM d HH:mm:ss z yyyy) to the parser. But the parser still didn't work, because Fabrizio's Locale is "IT", and therefore the SimpleDateFormat didn't recognize "Sep" as a valid month (it expected "Set", because the Italian word for "September" is "settembre").
Once Fabrizio recognized this problem, it was easy to fix: it suffices to pass the right Locale (Locale.ENGLISH) as a second parameter when constructing the SimpleDateFormat.
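The effect of the Locale can be reproduced with the JDK alone. The following is a minimal sketch, not the patch itself: the class name LocaleAwareTimestampDemo and the sample timestamp are made up for illustration, while the pattern is the one quoted above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class LocaleAwareTimestampDemo {

    // The customized format mentioned in the message.
    static final String PATTERN = "EEE MMM d HH:mm:ss z yyyy";

    /** Parses the text with PATTERN under the given locale; returns null on failure. */
    static Date parseOrNull(String text, Locale locale) {
        // The second constructor argument selects the month and weekday names
        // that the parser accepts.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, locale);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical timestamp with English weekday and month abbreviations.
        String stamp = "Mon Sep 24 11:57:31 GMT 2012";
        System.out.println("ENGLISH: " + parseOrNull(stamp, Locale.ENGLISH));
        // Under the Italian locale the English names are not recognized.
        System.out.println("ITALIAN: " + parseOrNull(stamp, Locale.ITALIAN));
    }
}
```

With Locale.ENGLISH the sample parses, whereas with Locale.ITALIAN parseOrNull returns null, because "Mon" and "Sep" are not Italian abbreviations.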
However, this means that the user who is importing the log MUST KNOW the Locale of the user who generated it. It would definitely be better to extract this information automatically from the log.
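Until the log itself carries this information, one possible stopgap is to try a list of candidate Locales and keep the first one under which a sample timestamp parses. The sketch below is only a heuristic under stated assumptions: the class LocaleGuesser, its method, and the candidate list are hypothetical, and the guess can be wrong when several languages share the same abbreviations.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LocaleGuesser {

    /**
     * Returns the first candidate locale under which the sample parses
     * with the given pattern, or null if none of the candidates works.
     */
    static Locale guessLocale(String sample, String pattern, List<Locale> candidates) {
        for (Locale candidate : candidates) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, candidate);
            try {
                format.parse(sample);
                return candidate; // first match wins
            } catch (ParseException e) {
                // This candidate does not fit; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical candidate list; the order decides ties.
        List<Locale> candidates = Arrays.asList(Locale.ITALIAN, Locale.ENGLISH, Locale.GERMAN);
        String pattern = "EEE MMM d HH:mm:ss z yyyy";
        Locale guess = guessLocale("Mon Sep 24 11:57:31 GMT 2012", pattern, candidates);
        System.out.println("Guessed locale: " + guess);
    }
}
```

This does not remove the need for the Locale to be recorded in the log; it only reduces the manual trial and error on the importing side.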
Thanking you in advance for the attention,