The exact maximum clique problem

Recently I needed the size of the largest coclique in a number of sparse graphs (or, equivalently, the size of the largest clique in a number of dense graphs) and found that my old algorithm was 50 times as fast as published algorithms. So, this page describes an idea that can make searching for a maximum clique much faster for very dense graphs.

Motivating example

Find the largest coclique in the E_6,3(1) Kneser graph. This graph has 216 vertices and valency (degree) 10, and as it turns out the maximum coclique has size 81. (In terms of cliques, the problem would be to find a largest clique in a graph on 216 vertices of valency 205, where this largest clique has size 81.)

Patric Östergård has published three programs for the maximum weighted clique problem, and trying those on this case gave the result after 528172, 36092 and 50632 seconds, respectively. With the improvement described here the timing becomes 9362, 919 and 979 seconds, respectively, 40-50 times as fast.

Algorithm

The basic algorithm for finding a clique in a graph with vertices 1,2,...,n is to loop over the vertices, picking the first vertex i, and then go down recursively, each time invoking the algorithm on the subgraph of all vertices adjacent to all vertices picked so far.

This works well if the graph is sparse, so that the largest cliques are small. But each clique of size s will be found s! times, which is unacceptable unless s is really small.

An obvious improvement is to go down recursively, each time invoking the algorithm on the subgraph of all vertices with number larger than that of the largest vertex chosen and adjacent to all vertices chosen so far. This picks vertices in order, and each clique is found only once. If all maximal cliques are needed (and not only one maximum clique), one has to check at the bottom of the recursion whether the clique found is maximal. This avoids the factor s!, but suffers from a factor 2^s: all subsets of each clique are also visited.

An old improvement of mine eliminates this last problem: when one goes down recursively, do not keep only the list cand[] of candidate vertices, adjacent to all chosen so far, but also the list nocand[] of candidate vertices that were rejected. For each member x of the latter list we have to pick a vertex u of the former list where x and u are nonadjacent. (Otherwise the clique will be a proper subset of something considered already.)

I hear that this improvement is known nowadays as the Bron-Kerbosch algorithm.

There are various ways to implement this. One version takes the first element x from the nocand[] list, and finds the first element u from the cand[] list where x and u are nonadjacent. If there is no such u, then return - this branch can be pruned. Otherwise, first pick u and recurse with the two lists cand[] ∩ nbrs(u) and nocand[] ∩ nbrs(u). Then do not pick u and loop with cand[] \ {u} and nocand[] ∪ {u}.

Östergård's improvement

Patric Östergård describes a different improvement. Instead of determining the largest clique in the entire graph, he suggests to find the largest clique in each subgraph on the vertices {i,...,n} for i=n, n–1, ...,1. This enables one to prune: when picking i an upper bound for all that follows is known, and the branch can be pruned if it cannot contain a clique larger than the best one found so far.

This is a big improvement on the basic algorithm, and can be combined with the idea of a nocand[] list. In order to guarantee that vertices are picked in order we check for each x that a u exists (and if not, return). If all is ok, just follow Östergård's algorithm.

Other variations exist. The idea of Östergård's approach is to obtain a good upper bound on what one is going to find in the rest of the graph. Others have tried to achieve something similar by starting with a greedy coloring, and using the number of colors on the rest of the graph as an upper bound for the size of a clique there.

Results

Exact algorithm for finding a max (weighted) clique.

The columns wcl, wcl2, wcl3 are for the programs wclique, wclique2 and wclique3 (that are identical apart from the initial ordering of the vertices - clearly that order is very important!). The columns cl, cl2, cl3 are for the same programs, but including a nocand[] list in the sub() subroutine. The column cl-1.2 is for the program cliquer-1.2 (used without any options).

Some sparse graphs. The improved algorithm is much faster.

file N |C| cl cl2 cl3 wcl wcl2 wcl3 cl-1.2

aa7.1.5 105 37 2.95 1.94 0.72 5.08 7.64 4.74 9.87

aa8.1.6 168 58 8437 1683 34798 8198 382695 >400000s >150000s

gr216 216 81 9362 919 979 528172 36092 50632

file	N	\|C\|	cl	cl2	cl3	wcl	wcl2	wcl3	cl-1.2
aa7.1.5	105	37	2.95	1.94	0.72	5.08	7.64	4.74	9.87
aa8.1.6	168	58	8437	1683	34798	8198	382695	>400000s	>150000s
gr216	216	81	9362	919	979	528172	36092	50632

In other examples, not as sparse, the nocand list does not help and code including it is a little bit slower than code without it.