Privacy throughout the Data Cycle

Elisa Costante

Promotors: prof.dr. S. Etalle (TU/e) and prof.dr. M. Petkovic (TU/e)
Copromotor: dr. J. den Hartog (TU/e)
Technische Universiteit Eindhoven
Date: 31 March, 16:00


Modern society relies heavily on the availability of large quantities of personal information in digital form. Private data is stored not only at the user’s premises, but also at a whole range of public and private institutions, where it is often accessible remotely. We use the term data cycle for the route data typically follows from the moment it leaves the user’s premises until it is stored in data repositories, from where it can be accessed by the user and by other actors. Risks may arise at any point along this route: at the very beginning (e.g., a user trusts, and releases data to, a site that proves to be fraudulent), along the way (e.g., privacy-invasive services are used to process an order) or at the end (e.g., sensitive information is leaked from the repositories where data is stored). These risks expose individuals and society to new types of threats such as privacy breaches, identity theft and fraud. This thesis addresses the problem of data privacy protection by performing a comprehensive analysis of the privacy risks that may occur at different stages of the data cycle. In particular, we propose a suite of privacy solutions addressing the following challenges.

Understanding the user’s perception of privacy risks and how users establish trust online. To this end we conducted a user study. The study shows that awareness of privacy risks is a crucial element in determining which factors influence trust, and that by increasing awareness one can drive trust decisions.

Evaluating websites with respect to the privacy protection they offer. Providing users with an objective measure of how well a service or website protects privacy is a way to guide their decisions. For this reason we propose a solution which automatically analyzes websites by applying machine learning and natural language processing techniques to their privacy policies.

Identifying the web service composition which best preserves privacy. Although websites are often seen as single entities, they usually combine many web services to deliver more complex functionality. The way services are composed is usually invisible to users, but it does affect their privacy. Therefore, we propose a solution for privacy-aware service composition which takes privacy concerns into account and identifies the web service composition which best preserves privacy and best matches a user’s preferences.
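As a rough illustration of what selecting a composition could look like, the following minimal sketch enumerates every combination of candidate services and keeps the one with the lowest privacy risk. It is not the method developed in the thesis: the service names, the data items, the sensitivity weights and the additive risk score are all hypothetical assumptions made for this example.

```python
from itertools import product

# Hypothetical candidates: task -> {service name: data items the service collects}
CANDIDATES = {
    "payment": {"PayA": {"card", "address"}, "PayB": {"card"}},
    "shipping": {"ShipA": {"address", "phone"}, "ShipB": {"address"}},
}

# Hypothetical user preferences: a higher weight means more sensitive data
SENSITIVITY = {"card": 3.0, "address": 1.0, "phone": 2.0}


def composition_risk(composition, candidates, sensitivity):
    """Sum the sensitivity weights of every data item each chosen service collects."""
    return sum(
        sensitivity.get(item, 1.0)  # assumed default weight for unlisted items
        for task, service in composition.items()
        for item in candidates[task][service]
    )


def best_composition(candidates, sensitivity):
    """Try one service per task in every combination; return the lowest-risk choice."""
    tasks = sorted(candidates)
    options = [sorted(candidates[task]) for task in tasks]
    compositions = (dict(zip(tasks, pick)) for pick in product(*options))
    return min(
        compositions,
        key=lambda c: composition_risk(c, candidates, sensitivity),
    )


if __name__ == "__main__":
    choice = best_composition(CANDIDATES, SENSITIVITY)
    print(choice, composition_risk(choice, CANDIDATES, SENSITIVITY))
```

Exhaustive enumeration is only feasible for a handful of tasks and candidates; it is used here because it keeps the idea visible, not because it scales.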

Detecting privacy infringements at the data repositories where data is ultimately stored. Privacy breaches may happen, for example, because hackers gain access to the data or because malicious employees abuse their rights. To reduce these risks we propose a monitoring solution which analyzes transactions with the repositories and applies anomaly detection techniques to identify misuse and data leakage.
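To make the idea of anomaly detection on repository transactions concrete, the sketch below learns, for each user, how often each table was accessed in a historical log, and flags a new transaction when the user is unknown or the table is rare for that user. This is a deliberately simple frequency profile chosen for illustration, not the detection technique of the thesis; the transaction format, the names and the 5% threshold are assumptions.

```python
from collections import Counter, defaultdict


def build_profiles(history):
    """Count, per user, how often each table appears in past transactions."""
    profiles = defaultdict(Counter)
    for user, table in history:
        profiles[user][table] += 1
    return profiles


def is_anomalous(profiles, transaction, min_freq=0.05):
    """Flag a transaction from an unknown user or on a table the user rarely touches."""
    user, table = transaction
    seen = profiles.get(user)
    if not seen:
        return True  # no profile: nothing to compare against, so raise an alert
    total = sum(seen.values())
    return seen[table] / total < min_freq  # min_freq is an assumed threshold


if __name__ == "__main__":
    # Hypothetical training log of (user, table) pairs
    log = [("alice", "orders")] * 9 + [("alice", "invoices")]
    profiles = build_profiles(log)
    for tx in [("alice", "orders"), ("alice", "patients"), ("bob", "orders")]:
        print(tx, is_anomalous(profiles, tx))
```

A profile over more features (time of day, number of rows returned, query type) would follow the same pattern, at the cost of needing more history per user.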