A parallel or distributed algorithm is executed by a number of processes and/or processors to accomplish a particular task. Generally, it is much harder to design correct parallel protocols than their sequential counterparts. The reason is that it is hard to imagine all possible behaviours of a parallel system. This is the reason why many published parallel algorithms are wrong or have wrong correctness proofs.
There are two reasons why OAS is studying distributed algorithms
- To use the power of parallel and distributed machines to analyse the behaviour of large systems. State space exploration using a 2GByte machine can handle 30Mstates. State space exporation using a network of 100 of such machines can deal with state spaces of 3Gstate in a quite comparable time frame.
- Distributed algorithms are fine examples of stylized but very complex systems for which we can use our analysis techniques.
Our interest goes to all distributed algorithms, but the most intriguing are wait- and lock-free algorithms. These do not allow synchronization primitives such as semaphores and mutexes. Furthermore, they do not make any assumption on the relative speed of the processors, but require that each processor can finish its task (wait-free) or some processor can finish its task (lock-free) independently of the progress of other processes. In general it holds that wait and lock free algorithms are more reliable and offer a better performance, but are even harder to design correctly than ordinary parallel algorithms.
An elementary and surprising problem for which intricate wait-free algorithms have been designed is the WRITE-ALL problem. Consider an array of n booleans that are initially set to 0. How can we use p processors to set the whole array to 1, without making any assumption on the relative speed of the processors. Processes can only declare the job done, if they are sure that the whole array is set to one, or in other words they must be sure that the jobs assigned to other processes have been executed.
There is a p + Ω(p log n) lowerbound on the total amount of work that must be done to set the array. Especially, if p=n the amount of work is at least Ω(n log n), which is more than the naively expected linear time to set the array. An interesting book about the write all problem is [KanShva97].
Another interesting feature of wait- and lock-free algorithms is that special hardware instructions, such as test-and-set, compare-and-swap or load-locked/store-conditional are required. Such instructions have been made available on all modern processors. If only data can be written and read from memory simple algorithms cannot be made. A nice example is the consensus problem. Consider p processes that all contain a possibly different number. Besides this, the processes must be equal. The task of the processes is to reach consensus on a number that one of the processes contains. This means that after the algorithm terminates, all processes know this number, and know that all other processes know it. As said, without special instructions, such an algorithm cannot be constructed.
Wait- and lock-free algorithms that we have developed are for garbage collection [HesGro01], the WRITE-ALL problem [GrHeMaVe01], hashtables [GGH03]. Other interesting algorithms of the same nature are for the union-find problem [AnWo91], lock-free linked lists [Val95], and hashtables [ShSh03]. Of theoretical interest is a general construction to implement shared datastructures in general [Her93], but from a practical perspective this is not relevant.