We want to analyze the efficiency of algorithms. We want to do so by looking at the running time of an algorithm. This sound easy and difficult.

It seems easy, because we can simply check how many seconds an algorithm runs. To illustrate this, we look at how long it takes to check whether a given number $x$ in a collection $A$ of numbers.

In [17]:

```
A=[10, 17, 22, 1, 9, 5]
```

In [18]:

```
%%timeit
11 in A
```

In [19]:

```
A= set([10, 17, 22, 1, 9, 5])
```

In [21]:

```
%%timeit
11 in A
```

In [22]:

```
A = list(range(2, 300))
```

In [23]:

```
%%timeit
1 in A
```

In [26]:

```
A = set(range(2, 300))
```

In [43]:

```
%%timeit
1 in A
```

In [51]:

```
import timeit
min(timeit.repeat(repeat=3, stmt='1 in A', setup='A = set(range(2, 300))'))
```

Out[51]:

In [58]:

```
import timeit
def test_in(A,x):
return x in A
x = range(1, 101, 10)
y = []
z = []
for n in x:
A = list(range(0, n))
y.append(min(timeit.repeat(lambda: test_in(A, 1000), repeat=3)))
B = set(range(0, n))
z.append(min(timeit.repeat(lambda: test_in(B, 1000), repeat=3)))
%matplotlib notebook
import matplotlib.pyplot as plt
list_plot, = plt.plot(x,y,'r') # plot list 'in' in red
set_plot, = plt.plot(x,z,'b') # plot set 'in' in blue
plt.legend([list_plot, set_plot], ['list', 'set'])
plt.ylabel('running time (in $\mu$s)')
plt.xlabel('size of input')
plt.title('running time for \'in\' on different data types')
plt.ylim(ymin=0)
plt.show()
```

On the slides I use pseudo-code for linear search, since at this point I assume it is easier to read for you. But we can very easily re-write the pseudo-code as python code. Recall that the pseudo-code was

Linear-Search(A, n, x)

- Set answer to Not-Found
- For each index i, going from 1 to n, in order:
- If A[i] = x, then set answer to the value of i
- Return the value of answer as the output

In Python this would read as follows (we will use -1 to represent Not-Found). Note that the index of the first element is 0 (and not 1). Further note that range(start, stop) includes the numbers from start to stop-1 (and not stop).

In [6]:

```
def linear_search(A, x):
answer = -1
for i in range(0, len(A)):
if A[i] == x: answer = i
return answer
```

In [7]:

```
linear_search([10, 5, 9, 9], 10)
```

Out[7]:

In [8]:

```
linear_search([10, 5, 9, 9], 9)
```

Out[8]:

In [9]:

```
linear_search([10, 5, 9, 9], 8)
```

Out[9]:

Also the better linear search can be directly translated into python:

Better-Linear-Search(A, n, x)

- For i = 1 to n: If A[i] = x, then return the value of i as the output
- Return Not-Found as the output

In [76]:

```
def better_linear_search(A, x):
for i in range(0, len(A)):
if A[i] == x: return i
return -1
```

In [11]:

```
better_linear_search([10, 5, 9, 9], 10)
```

Out[11]:

In [12]:

```
better_linear_search([10, 5, 9, 9], 9)
```

Out[12]:

In [13]:

```
better_linear_search([10, 5, 9, 9], 8)
```

Out[13]:

In [14]:

```
# import matplotlib.pyplot as plt
x = range(1, 31, 10)
y_1 = []
y_2 = []
y_3 = []
y_4 = []
for n in x:
A = list(range(0, n))
y_1.append(timeit.timeit(lambda: linear_search(A, 1000)))
y_2.append(timeit.timeit(lambda: linear_search(A, 0)))
y_3.append(timeit.timeit(lambda: better_linear_search(A, 1000)))
y_4.append(timeit.timeit(lambda: better_linear_search(A, 0)))
%matplotlib notebook
import matplotlib.pyplot as plt
plt.plot(x,y_1,'r')
plt.plot(x,y_2,'g')
plt.plot(x,y_3,'b')
plt.plot(x,y_4,'c')
plt.ylim(ymin=0)
plt.show()
```

In [72]:

```
import csv
with open('weather-eindhoven.csv') as csvfile:
reader = csv.reader(csvfile)
next(reader) # skip header row
datatype = int
date = []
max_temperature = []
for row in reader:
date.append(datatype(row[1])) #column 1 (that is, the second) contains the date
max_temperature.append(datatype(row[14])/10) # column 14 contains the max temperature times 10
```

In [78]:

```
index = better_linear_search(max_temperature, 30)
print(date[index])
```