Project proposed by prof.dr.ir. Jean-Bernard Martens of the TU/e Department of Industrial Design, in consultation with prof.dr.ir. Jack van Wijk from the TU/e Department of Mathematics and Computing Science.
Data visualization offers many methods for exploring data. Many of these methods are generic, in the sense that they make few explicit assumptions about the structure underlying the data. Where more explicit assumptions can be made, modeling the data is the obvious next step. Often, a class of related models rather than a single model is considered. This creates two interrelated problems. First, the user needs a way to intuitively specify which model(s) to apply. This implies that the models themselves need to be visualized, so that the user can understand them and make model choices in an intuitive (and interactive) way. Second, the user needs to be able to assess whether the model predictions agree with the observed data. A variety of existing data visualization methods, such as scatterplots, may be used for this purpose. Especially for larger data sets, it is often not beneficial to look at all data at the same time, so ways of selecting data for visualization often need to be incorporated. Appreciating how the fit between model and data is influenced by choices within the model is another important aspect.
The general problem discussed above arose within the specific context of statistical data modeling. Within Industrial Design, we use such modeling, for instance, to describe how different people appreciate different aspects of products, or how such appreciation varies over time. The software tool that we currently use is adequate for visualizing the data (although it could obviously be improved), in the sense that it helps people to appreciate how well the model predictions and the actual data agree. The main problem is that the tool does not visualize the underlying models; instead, model options are coupled to button and menu selections. As a result, users are not able to appreciate the options that are available to them, and hence fail to use the full potential of the tool.
We are looking for a student who can create a new graphical user interface for an existing statistical modeling tool (implemented in C under Unix). The new interface should provide a better integration of data visualization and model visualization, which should make the tool more useful to non-expert users. Migrating the interface from a Unix to a Windows platform (for instance, by implementing it in C#) would be an additional benefit but is not essential.
A book chapter describing the need for interactive visualization within statistical data modeling explains some of the above issues in more detail and is available upon request. Because part of the chapter is mathematical in nature (it explains the statistical model itself in more detail), it probably requires some explanation before being read, which is why it is not being distributed at this stage.