# Similarity Measures and Algorithms for Cartographic Schematization

## Wouter Meulemans

Promotor: prof.dr. B. Speckmann (TU/e)

Technische Universiteit Eindhoven

Date: *2 September, 2014, 16:00*

Thesis: *PDF*

#### Summary

A schematic map visualizes information in its geographic context. It does so in a structured and organized manner, by using abstract and stylized shapes to represent the information and its context. The power of a schematic map lies in its emphasis on high-level structures and relations, which are often the important aspects of the information. It does not bother an observer with unnecessary details of geographic reality. Geographic accuracy is relinquished to improve upon the clarity of the map and to reduce the cognitive load required to mentally process the information.

Cartographic schematization is the process involved in making schematic maps. The advent of computers and the availability of digital geographic data gave rise to opportunities to automate the schematization process. This automation provides interesting algorithmic challenges which are investigated in this thesis. In particular, we focus on the schematization of territorial outlines such as country or province borders. Such elements are often used to convey information about a region or in conjunction with other visual elements to provide geographic context.

To model schematization as an algorithmic problem, we need to know what constitutes a good schematization of a territorial outline. We study the case in which the schematic map is represented by line segments. In this scenario we identify the following four criteria for a good schematic outline:

- it uses only a few line segments;
- each line segment is oriented according to a specified set of allowed orientations;
- it maintains its geographic relations to other regions;
- it resembles the geographic outline.

The first three criteria are comparatively easy to formalize. The fourth criterion is less straightforward: how do we quantify “resemblance”? To develop effective schematization algorithms, we require a suitable similarity measure to quantify resemblance.

In the first part of this thesis, we investigate existing similarity measures. We conclude that the Frèchet distance is most suitable to our problem. To compute this measure, we describe an algorithm that computes the Frèchet distance in O(n^2 sqrt(log n) (log log n)^(3/2)) time. This is the first asymptotic improvement on the O(n^2 log n)-time algorithm that was described in 1995. The Frèchet distance results in a single number that indicates the similarity between two curves. In many cases an actual description of the similarity (i.e., a matching) is desired. Many matchings result in the Frèchet distance. However, some matchings describe the similarity more accurately than others. To address this problem, we suggest a new criterion to distinguish between unintuitive and intuitive matchings. We call matchings that adhere to this criterion “locally correct”. We prove that any pair of curves admits a locally correct matching and show how one can be computed.

In the second part of this thesis, we investigate algorithms to schematize territorial outlines. First, we model the schematization problem under the Frèchet distance as another problem referred to as simple map matching. We prove that this leads to an NP-hard problem: it is unlikely that an efficient algorithm exists to compute the optimal schematization. We even prove that it is NP-hard to compute a close-to-optimal schematization. We show how to build an interval graph that can be used to solve the problem without a need for computing the Frèchet distance afterwards. By using a brute-force algorithm in combination with this graph, we demonstrate that instances can be solved nonetheless. The results show that the computed schematic outlines capture the prominent features of the geographic input. We also present a heuristic iterative algorithm for schematization.

First, the outline is changed such that all line segments adhere to the given set of orientations (criterion (2)). Then, the algorithm reduces the complexity (criterion (1)) step by step, while maintaining geographic relations (criterion (3)) and the orientations of line segments (criterion (2)). This complexity reduction is based on greedily performing operations that cause the least change in similarity (criterion (4)). The algorithm also preserves the exact enclosed area of an outline: this ensures that relative sizes do not change when schematizing multiple outlines simultaneously. The algorithm terminates when the desired level of complexity is reached. We prove that, for a single nonconvex outline, an operation always exists for the algorithm to perform. We conclude this part with an experimental evaluation of the schematization algorithm. The results show that the computed schematic outlines maintain the most salient geographic features.

In the third part of this thesis, we widen the scope of our schematization algorithms. We investigate the potential of our iterative algorithm for two different aspects of building generalization. The first aspect is building-wall squaring: we restore right-angle corners that have been distorted due to inaccuracy in the input data. The second aspect is the generalization of an urban area to make it appropriate for some predefined map scale: this requires the possibility to aggregate and eliminate buildings. We present simple extensions to the schematization algorithm such that it can be adequately applied for these problems in generalization.

Finally, we consider other geometric styles of schematization using our brute-force algorithm for the simple map-matching problem. We show that this approach can be used to provide good schematic outlines with concentric circular arcs. In addition, we suggest an interesting new geometric style: isothetic schematization.