*Mathematical induction* plays a prominent role in the analysis of algorithms. There are various reasons for this, but in our setting we in particular use mathematical induction to prove the *correctness of recursive algorithms*. In this setting, commonly a *simple induction* is not sufficient, and we need to use *strong induction*.

We will, nonetheless, use simple induction as a starting point. This also gives us the opportunity to discuss sums that are relevant to the analysis of algorithms. We then move on to strong induction and correctness proofs for recursive algorithms.

Induction is most commonly used to prove a statement about natural numbers. Lets consider as example the statement $P(n)$:

$\sum_{i=0}^n 1/2^i = 2 - 1/2^i\;.$

We can easily check whether this statement is true for a couple of values $n$. For instance, $P(0)$ states

$\sum_{i=0}^0 1/2^i = 1/2^0 = 1 = 2 - 1 = 2 - 1/2^0$,

which is true. But also, for instance, $P(3)$,

$\sum_{i=0}^3 1/2^i = 1/2^0 + 1/2^1 + 1/2^2 + 1/2^3 = 1 + 0.5 + 0.25 + 0.125 = 2 - 0.125 = 2 - 1/2^3$

is true.

Knowing that $P(3)$ holds, we can add $1/2^4$ to $P(3)$ to prove that $P(4)$ holds:

$\sum_{i=0}^4 1/2^i = \sum_{i=0}^3 1/2^i + 0.0625 = (2 - 0.125) + 0.0625 = 2 - 0.0625 = 2 - 1/2^4$.

More generally, if we know that $P(n)$ holds, we can prove $P(n+1)$:

$\sum_{i=0}^{n+1} 1/2^i = \sum_{i=0}^{n} 1/2^i + 1/2^{n+1} = (2 - 1/2^n) + 1/2^{n+1} = 2 - (1/2^n - 1/2^{n+1}) = 2 - (2/2^{n+1} - 1/2^{n+1}) = 2 - 1/2^{n+1}$.

To summarize, given a statement $P(n)$ about a natural number $n$, if we can prove

1.) $P(0)$ and

2.) $P(n)$ implies $P(n+1)$,

then by the principle of induction we can conclude that $P(n)$ holds for all $n$.

It seems natural to ask why induction constitutes a valid proof. But this is actually a tricky question, since it brings us to deeper questions like: what is proof, and what are the natural numbers? The short answer is, that we simply assume that induction constitutes a valid proof.

An alternative answer is the following: Assume induction would not work, that is we have a statement $P(n)$ for which 1.) and 2.) holds but there is an $n$ for which $P(n)$ does not hold. Now suppose $n$ is the smallest natural number for which $P(n)$ does not hold. Because of 1.) we have $n>0$, which in particular means that $n-1$ is also a natural number. Since $n-1 < n$, we know that $P(n-1)$ holds. But because of 2.) we can conclude that $P(n)$ also holds. A contradiction!. This answer is valid, but we are making some fundamental assumptions about the natural numbers in this argument. In short: induction makes sense and is useful, and therefore we are going to use it in our proofs.

**bold** depend on the specific statement and need to be filled in appropriately. In the following, "I.H." is the abbreviation for "induction hypothesis".

*Proof:*
By induction on $n$ we prove the following statement for all $n$:

$P(n)$: **blabla** $n$ **blabla**.

*Basis ($n=0$):* **Show here that P(0) holds.** Thus, the statement holds for $n=0$.

*Step ($n \rightarrow n+1$):* Assume the statement $P(n)$ holds for $n$ (I.H.). **Show that $P(n+1)$ holds (assuming that P(n) holds. You should point out the place where you use the induction hypothesis by "by induction hypothesis".** Thus, the statement also holds for $n+1$.

By induction we can conclude that the statement holds for all $n$.

$\square$

*Note on choosing the basis:* So far we always assumed that the statement that we want to prove holds for all $n \geq 0$. However, often enough we may want to prove a statement that only holds for $n \geq 1$. This can be easily achieved: We simple use $P(1)$ as basis. More generally, if a statement holds for $n \geq k$, we probably want to use $P(k)$ as basis. However, sometimes we have to be careful: some inductions, in particular some strong inductions, require several base cases; since this is a problem that is more common in strong inductions, we ignore this here for now, and discuss it later, when looking at strong inductions.

*Claim:* $\sum_{i=0}^n q^i = \frac{1-q^{n+1}}{1-q}$ for all $n$ and $q \neq 1$.

*Proof:*
By induction on $n$ we prove the following statement for all $n$:

$P(n)$: $\sum_{i=0}^n q^i = \frac{1-q^{n+1}}{1-q}$ for $q \neq 1$.

*Basis ($n=0$):* $\sum_{i=0}^0 q^i = 1 = \frac{1-q^{0+1}}{1-q}$. Thus, the statement holds for $n=0$.

*Step ($n \rightarrow n+1$):* Assume the statement $P(n)$ holds for $n$ (I.H.). Now, $\sum_{i=0}^{n+1} q^i = (\sum_{i=0}^{n} q^i) + q^{n+1} = \frac{1-q^{n+1}}{1-q} + q^{n+1}$ by induction hypothesis. Further, $\frac{1-q^{n+1}}{1-q} + q^{n+1} = \frac{1-q^{n+1}}{1-q} + \frac{q^{n+1}-q^{n+2}}{1-q} = \frac{1-q^{n+2}}{1-q}$. Thus, the statement also holds for $n+1$.

By induction we can conclude that the statement holds for all $n$.

$\square$

Note that the series $\sum_{i=0}^n q^i$ is called *geometric series*. It shows up, for instance, in the proof of the Master Theorem. In terms of its asympotics, for us it is relevant that

1.) for $0

2.) for $q>1$, $\sum_{i=0}^n q^i = \Theta(q^n)$.

Both of this follows from the claim above.

Induction provides us with a simple and clean way to prove the claim stated above. It is worth noting, that a non-inductive proof can be more insightful: by nature induction uses "small steps" and as such does not provide us with the "big picture". A more intuitive proof of the statement above for instance could be

\begin{align} &\ (1 + q^1 + q^2 + q^3 \ldots + q^n) (1 - q)\\ = &\ (1-q) + q^1 (1 -q ) + q^2 (1 - q) + q^3 (1-q) \ldots + q^n (1-q)\\ = &\ 1 - q + q - q^2 + q^2 - q^3 \ldots + q^n - q^{n+1}\\ = &\ 1 - q^{n+1}\:. \end{align}When $q$ is the base of our favourite number system (e.g., $q = 10$), we actually take the claim above as granted, as it is the same as saying, e.g. for n = 4, $ 1000 - 1 = 999$.

A series that shows up frequently when analyzing loops is the arithmetic series:

$\sum_{i=1}^n i = n (n + 1)/2$.

*Proof:*
By induction on $n$ we prove the following statement for all $n$:

$P(n)$: $\sum_{i=1}^n i = n (n + 1)/2$.

*Basis ($n=0$):* $\sum_{i=1}^0 i = 0 = 0(0+1) /2$. Thus, the statement holds for $n=0$.

*Step ($n \rightarrow n+1$):* Assume the statement $P(n)$ holds for $n$ (I.H.). Now, $\sum_{i=0}^{n+1} i = (\sum_{i=0}^{n} i) + n+1 = n (n + 1)/2 + n+1$ by induction hypothesis. Further, $n (n + 1)/2 + n+1 = (n+1) (n/2 + 1) = (n+1) (n+2) /2$. Thus, the statement also holds for $n+1$.

By induction we can conclude that the statement holds for all $n$.

$\square$

$\sum_{i=1}^n i = n (n + 1)/2 = n^2/2 + n/2 = \Theta (n^2)$.

*fixed* constant $c>0$ such that the statement holds for all $n \geq n_0$. But if we use asymptotic notation in the induction step, then we allow us to pick a different constant for every $n$, which then no longer is a constant.

*Proof:*
By induction on $n$ we prove the following statement for all $n$:

$P(n)$: $\sum_{i=1}^n i^2 \leq c n^3$ for $c = \ldots$.

*Basis ($n=0$):* $\sum_{i=1}^0 i^2 = 0 \leq c 0^3$ for any $c$.

*Step ($n \rightarrow n+1$):* Assume the statement $P(n)$ holds for $n$ (I.H.). Now, $\sum_{i=1}^{n+1} i^2 = (\sum_{i=1}^{n} i^2) + (n+1)^2 \leq c n^3 + (n+1)^2$ by induction hypothesis. Further, we have $c (n+1)^3 = c n^3 + 3c n^2 + 3cn + 1$. This is larger than $c n^3 + (n+1)^2 = c n^3 + n^2 + 2n + 1$ for $c \geq 1$ (*Note: We could have picked an even smaller $c$ here, but we don't need to.*). Thus, the statement also holds for $n+1$.

By induction we can conclude that the statement holds for all $n$.

$\square$

*Notation: $\lceil n/2 \rceil$ means $n/2$ rounded up.*)

In [1]:

```
def merge_sort(A):
n = len(A)
if n>1:
mid = n//2
A1 = A[:mid]
A2 = A[mid:]
# recursive calls:
merge_sort(A1)
merge_sort(A2)
# merge solutions
i=0
j=0
while i<mid and mid+j<n:
if A1[i]<A2[j]:
A[i+j]=A1[i]
i+=1
else:
A[i+j]=A2[j]
j+=1
while i<mid:
A[i+j]=A1[i]
i+=1
while mid+j<n:
A[i+j]=A2[j]
j+=1
A = [5,2,8,6]
merge_sort(A)
print(A)
```

The recurrence for the running time is $T(1) = d$ for some constant $d>0$; and for $n>1$: $T(n) = 2 T(n/2) + c$ for some constant $c>0$. We would like to argue $T(n) = O(n \log n)$.

Now with the Master Theorem at hand this is of course easy. But lets try to use the *substitution method*, which is essentially a fancy way of saying that we want to prove the statement by induction. We cannot (easily) use simple induction, since $T(n+1)$ is not based on $T(n)$. But coming back to the principle of induction, we can observe that when proving the statement $P(n+1)$, we in some sense did not make use all that we knew by then. When proving $P(n+1)$ we assumed that we already knew $P(n)$, which we knew because of $P(n-1)$, which we knew because of $P(n-2)$, which we knew because of $\ldots$ $P(0)$. So actually, when proving $P(n+1)$ we can just as well assume that $P(k)$ holds for all $k < n+1$. Thus, in the induction step, we can assume as induction hypothesis $P(k)$ for all $k

**induction hypothesis: $P(k)$ holds for all $k**

This is called strong induction. We can re-use the structure from before with small changes:

*Proof:*
By strong induction on $n$ we prove the following statement for all $n$:

$P(n)$: **blabla** $n$ **blabla**.

*Basis ($n=0$):* **Show here that P(0) holds.** Thus, the statement holds for $n=0$. **You may need more base cases.**

*Step ($k < n \rightarrow n$):* Assume the statement $P(k)$ holds for $k

By induction we can conclude that the statement holds for all $n$.

$\square$

**Nonetheless, we need to handle the base case.** We can read it off from the algorithm as such ($n=1$ is a special case in the algorithm).

*Proof:*
By strong induction on $n$ we prove the following statement for all $n$:

$P(n)$: $T(n) \leq c n \log n + d$ with the constants $c,d$ given in the recurrence.

*Basis ($n=1$):* $T(1) = d \leq c 1 \log 1 + d$. Thus, the statement holds for $n=1$.

*Step ($k < n \rightarrow n$):* Assume the statement $P(k)$ holds for $1\leq k

Thus, the statement also holds for $n$.

By induction we can conclude that the statement holds for all $n$.

$\square$

*Proof:*
By strong induction on $n$ we prove the following statement for all $n$:

$P(n)$: MergeSort correctly sorts arrays $A$ of length $n$.

*Basis ($n=1$):* An array of size one is already sorted. And indeed the algorithm does nothing.

*Step ($k < n \rightarrow n$):* Assume the statement $P(k)$ holds for $1\leq k

Thus, the statement also holds for $n$.

By induction we can conclude that the statement holds for all $n$.

$\square$

*(a)* using the structure of induction correctly and *(b)* making use of the fact that (by induction hypothesis) the recursive calls work correctly.

In [ ]:

```
```