# Elementary Data Structures¶

## Stacks based on fixed capacity arrays¶

In the lecture we have seen how to implement a stack based on an array. An array has a fixed capacity, and this first part is about implementing Stacks while dealing with this fixed capacity.

Important note: In Python you often don't use arrays directly, but Python lists (and we will also use them here). Python lists are actually dynamic arrays. So this first part provides a motivation and explanation of Python lists, but the implementations that we consider aren't practical. A better implementation of stacks is provided below (see Stacks based on Python lists).

Let us first implement a stack with a fixed capacity.

In [6]:
class CapacitatedStack:
def __init__(self):
self.capacity = 20
self.items = self.capacity*[None]
self.count = 0

def isEmpty(self):
return self.count == 0

def push(self, item):
if self.count < self.capacity:
self.items[self.count] = item
self.count += 1
else: raise Exception("CapacitatedStack overflow.")

def pop(self):
if self.count > 0:
self.count -= 1
return self.items[self.count]
else:
raise Exception("Cannot pop from empty stack")

def size(self):
return self.count

In [7]:
S = CapacitatedStack()
S.isEmpty()

Out[7]:
True
In [8]:
S.push(1)
S.pop()

Out[8]:
1
In [9]:
S.pop()

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-9-f3333266bd89> in <module>()
----> 1 S.pop()

<ipython-input-6-be8b61c677a1> in pop(self)
19                 return self.items[self.count]
20         else:
---> 21                 raise Exception("Cannot pop from empty stack")
22
23      def size(self):

Exception: Cannot pop from empty stack
In [10]:
for i in range(20): S.push(i)

In [11]:
S.push(20)

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-11-24f3837aa904> in <module>()
----> 1 S.push(20)

<ipython-input-6-be8b61c677a1> in push(self, item)
12                 self.items[self.count] = item
13                 self.count += 1
---> 14          else: raise Exception("CapacitatedStack overflow.")
15
16      def pop(self):

Exception: CapacitatedStack overflow.

Next we make the stack growable. Observe that we only need to slightly change the push method. However, push now no longer takes constant time.

In [32]:
class GrowableStack:
def __init__(self):
self.capacity = 20
self.items = self.capacity*[None]
self.count = 0

def isEmpty(self):
return self.count == 0

def push(self, item):
if self.count >= self.capacity:
# doubling, incremental would be += ...
self.capacity *= 2
copy_items = self.capacity*[None]
for i in range(self.count): copy_items[i] = self.items[i]
self.items = copy_items

self.items[self.count] = item
self.count += 1

def pop(self):
if self.count > 0:
self.count -= 1
return self.items[self.count]
else:
raise Exception("Cannot pop from empty stack")

def size(self):
return count

In [34]:
S = GrowableStack()
for i in range(100): S.push(i)
S.pop()

Out[34]:
99

This works, however, Python lists handle the "growing" for us, as we will see in the following.

## Stacks based on Python lists¶

In the lecture we have seen how to implement a stack based on a dynamic array. In Python dynamic arrays are the data structure behind lists, therefore lists directly provide an efficient stack implementation.

In [14]:
class Stack:
def __init__(self):
self.items = []

def isEmpty(self):
return self.items == []

def push(self, item):
self.items.append(item)

def pop(self):
return self.items.pop()

def size(self):
return len(self.items)

In [15]:
S = Stack()
S.isEmpty()

Out[15]:
True
In [16]:
S.push(5)
S.isEmpty()

Out[16]:
False
In [17]:
S.pop()

Out[17]:
5

Based on the theoretical considerations in the lecture we expect that pop() and push() on our stack -or stated differently pop() and append() on lists- will run in constant amortized time. We can try to validate this experimentally.

We compute the average running time for push for different sizes of lists. Specifically, we start with an empty list, and always record the time after a certain number of push operations. Then we compute the average time since the beginning.

In [19]:
from time import time

timing = []
S = []

start = time()
for n in range(1,1000):
for k in range(10000): S.append(1)
end = time()
timing.append((end-start)/n/10000)

%matplotlib notebook

import matplotlib.pyplot as plt
plt.plot(timing,'r')
plt.ylim(ymin=0)
plt.show()


We observe the following: The average T(n)/n seems indeed to be constant. But we also see another pattern: The average running time has small peaks, and the frequency of these peaks decreases. We would expect that these peaks from the resizing ("doubling") of the array.

To have a closer look at the resizing we now want to measure the running time without averaging. But to still have a significant time difference between measurements we will push to several lists.

In [20]:
from time import time

timing = []
manyS = []

nLists = 1000
for k in range(nLists): manyS.append([])

for n in range(1,10000):
start = time()
for k in range(nLists): manyS[k].append(1)
end = time()
timing.append((end-start)/1000)

%matplotlib notebook

import matplotlib.pyplot as plt
plt.plot(timing,'r')
plt.ylim(ymin=0)
plt.show()


Indeed we can see that the running time is mostly constant, but sometimes it is linear. We also see that the size isn't doubled, but increased by a smaller factor.

## Queues based on doubly linked lists/Python deques¶

Queues can be implemented using dynamic arrays, but the implementation is far less direct as it was for stacks. If we would want to use dynamic arrays more directly (e.g., using append() as enqueue and pop(1) as dequeue) to implement queues, we would run into the issue that deletions are expensive.

We can use Python deques, which provide constant time insertion and deletion at both ends.

In [7]:
from collections import deque
S = deque([2, 3, 5])
S.append(7)
S.popleft()
S

Out[7]:
deque([3, 5, 7])

Let us experimentally evaluate deques as queues, and contrast this to lists.

In [23]:
timing = []
manyQ = []

nLists = 1000
for k in range(nLists): manyQ.append(deque([]))

for n in range(1,10000):
start = time()
for k in range(nLists): manyQ[k].append(1)
end = time()
timing.append((end-start)/1000)

%matplotlib notebook

import matplotlib.pyplot as plt
plt.plot(timing,'r')
plt.ylim(ymin=0)
plt.show()


The running time per operation is constant. Next, we dequeue.

In [24]:
timing = []

for n in range(1,10000):
start = time()
for k in range(nLists): manyQ[k].popleft()
end = time()
timing.append((end-start)/1000)

%matplotlib notebook

import matplotlib.pyplot as plt
plt.plot(timing,'r')
plt.ylim(ymin=0)
plt.show()


As final experiment on queues, let us check what happens if we instead use lists, and pop(1).

In [25]:
timing = []
manyQ = []

nLists = 1000
for k in range(nLists): manyQ.append([])

for n in range(1,1000):
for k in range(nLists): manyQ[k].append(1)

for n in range(1,1000):
start = time()
for k in range(nLists): manyQ[k].pop(0)
end = time()
timing.append((end-start)/nLists)

%matplotlib notebook

import matplotlib.pyplot as plt
plt.plot(timing,'r')
plt.ylim(ymin=0)
plt.show()


The running time is linear in the size of the list as expected. Thus, using deques is indeed the better option. Deques are implemented using doubly linked lists.

We implement a singly linked list, and use it to implement a stack and queue.

In [11]:
class SNode:
def __init__(self, elem=None, next=None):
self.elem = elem
self.next  = next

class SList:
def __init__(self):

In [12]:
node3 = SNode("Toronto")
node2 = SNode("Seattle", node3)
node1 = SNode("Rome", node2)

list = SList()

def printList(slist):
while (currentNode):
print(currentNode.elem)
currentNode = currentNode.next

printList(list)

Rome
Seattle
Toronto

In [13]:
newNode = SNode("Baltimore")

printList(list)

Baltimore
Rome
Seattle
Toronto

In [14]:
list.head = list.head.next
printList(list)

Rome
Seattle
Toronto

In [15]:
class Stack:
def __init__(self):
self.list = SList()
self.count = 0

def isEmpty(self):
return self.count == 0

def push(self, item):
self.count += 1

def pop(self):
if self.isEmpty():
raise Exception('stack is empty.')
else:
self.count -= 1
return item

def size(self):
return self.count

In [16]:
S = Stack()
S.isEmpty()

Out[16]:
True
In [17]:
S.push(5)
S.push(7)
S.size()

Out[17]:
2
In [18]:
S.pop()

Out[18]:
7
In [19]:
S.pop()

Out[19]:
5
In [20]:
S.pop()

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-20-f3333266bd89> in <module>()
----> 1 S.pop()

<ipython-input-15-b1afc858aca3> in pop(self)
14      def pop(self):
15          if self.isEmpty():
---> 16              raise Exception('stack is empty.')
17          else:

Exception: stack is empty.
In [ ]:
class Queue:
def __init__(self):
self.tail = None
self.count = 0

def isEmpty(self):
return self.count == 0

def enqueue(self, elem):
newNode = SNode(elem)
if self.isEmpty():
self.tail = newNode
else:
self.tail.next = newNode
self.tail = newNode
self.count += 1

def dequeue(self):
if self.isEmpty():
raise Exception('Queue is empty.')
else:
else:
self.tail = None
self.count -= 1
return elem

def size(self):
return self.count

In [ ]:
Q = Queue()
Q.isEmpty()

In [ ]:
Q.enqueue('lion')
Q.enqueue('bird')
Q.enqueue('frog')
Q.enqueue('horse')
Q.dequeue()

In [ ]:
Q.size()

In [ ]:
Q.dequeue()
Q.dequeue()
Q.dequeue()

In [ ]:
Q.dequeue()


Finally we implement a doubly linked list and use it to implement a deque.

In [ ]:
class DNode:
def __init__(self, elem=None, previous=None, next=None):
self.elem = elem
self.previous  = previous
self.next  = next

class DList:
def __init__(self):
self.size = 0
self.trailer = DNode()

def delete(self, node):
elem = node.elem
previous = node.previous
next = node.next
previous.next = next
next.previous = previous
node = DNode()
self.size -= 1
return elem

def insert_between(self, element, previous, next):
node = DNode(element, previous, next)
previous.next = node
next.previous = node
self.size += 1

def is_empty(self):
return self.size == 0

def len(self):
return self.size

class Deque:
def __init__(self):
self.list = DList()

def enqueueLeft(self, elem):

def dequeueLeft(self):
if self.list.is_empty(): raise Exception('Deque is empty.')

def enqueueRight(self, elem):
self.list.insert_between(elem, self.list.trailer.previous, self.list.trailer)

def dequeueRight(self):
if self.list.is_empty(): raise Exception('Deque is empty.')
else: return self.list.delete(self.list.trailer.previous)

D = Deque()
D.enqueueLeft(5)
D.enqueueLeft(4)
D.dequeueRight()

In [ ]:
D.enqueueRight(6)
D.dequeueLeft()
D.dequeueLeft()


As a final remark we note that if Python provides a built-in data type, it is likely that you will want to use it rather than implementing it yourself. We illustrate this using the deque.

In [ ]:
MD = Deque()
start = time()
for n in range (10000): MD.enqueueRight(1)
print(time() - start)

PD = deque()
start = time()
for n in range (10000): PD.append(1)
print(time() - start)

In [ ]: