Deepayan Sarkar
Next we study another sorting algorithm called heapsort
It has the good properties of both merge sort and insertion sort
It has \(O(n \log_2 n)\) worst-case running time
It is in-place (requires only a constant amount of extra storage)
It is based on a data structure known as a heap.
The (binary) heap data structure is an object that we can view as a nearly complete binary tree.
Each node corresponds to an element.
The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.
For each node x, the following operations are defined:
PARENT(x) returns the parent node
LEFT(x) returns the left child node
RIGHT(x) returns the right child node
General graph \(G = (V, E)\) consists of
\(V\) : set of vertices or nodes
\(E\) : set of edges
Usually stored as a list of nodes and edges, or as an adjacency matrix
Trees are a subtype of graphs
They have a special root node
Each node has 0 or more child nodes
Nodes with no children are called leaves
Heaps are (almost) complete binary trees
This makes implementation of heaps easier than for general graphs
Suppose we number the nodes level by level, from left to right, starting with 1 at the root
Then we can define
PARENT(i) = \(\lfloor i/2 \rfloor\)
LEFT(i) = \(2i\)
RIGHT(i) = \(2i + 1\)
Because of this, heaps are usually implemented using an array
Specifically, a heap is an array \(A\) with two attributes:
length(A) gives the number of elements in the array
heap-size(A) gives the number of elements that belong to the heap
The first heap-size(A) elements of the array are considered part of the heap
Note that the number of elements of an array is usually fixed
As we will see, it is common to change the heap size in heap-based algorithms
Index the array by \(1, 2, ..., n\)
Root node has index 1
Then, as shown above, we can implement PARENT(i), LEFT(i), and RIGHT(i) using simple index arithmetic
In C / C++, the shift operators << and >> make these computations efficient
Implementations need to change if arrays are indexed from 0
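As a concrete illustration, here is a minimal sketch in C. The 1-indexed versions match the formulas above (pretending the root sits at index 1); the closing comment shows how the arithmetic changes for ordinary 0-indexed C arrays.

```c
/* A minimal sketch, assuming 1-based indices as in the text
   (the root is at index 1). */
static inline int parent(int i) { return i >> 1; }        /* floor(i/2) */
static inline int left(int i)   { return i << 1; }        /* 2*i        */
static inline int right(int i)  { return (i << 1) + 1; }  /* 2*i + 1    */

/* For 0-based C arrays (root at index 0) the formulas become:
   parent(i) = (i - 1) / 2, left(i) = 2*i + 1, right(i) = 2*i + 2. */
```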
View the heap as a tree
The height of a node is the number of edges on the longest simple downward path from the node to a leaf.
The height of the heap is the height of its root
A heap of size \(n\) has height \(\lfloor \log_2 n \rfloor\)
We are usually interested in heaps that satisfy a particular property
Depending on the property, the heap is called either a max-heap or a min-heap.
Max-heap: A heap \(A\) is called a max-heap if it satisfies the “max-heap property”
\[ A[PARENT(i)] \geq A[i] ~\textsf{ for all }~ i > 1 \]
That is, the value at every node (except the root node) is less than or equal to the value at its parent.
In particular,
the largest element in a max-heap is stored at the root
The subtree rooted at any node only contains values less than or equal to the value in that node
Min-heap: Similarly, a heap \(A\) is a min-heap if it satisfies the “min-heap property”
\[ A[PARENT(i)] \leq A[i] ~\textsf{for all}~ i > 1 \]
For the heapsort algorithm, we will use max-heaps
The key elements of the algorithm are
The BUILD-MAX-HEAP procedure, which produces a max-heap from an unordered input array, and
the MAX-HEAPIFY procedure, which is used to maintain the max-heap property
Suppose that we have a heap that is almost a max-heap
However, the max-heap property may not hold for the root element
MAX-HEAPIFY fixes this error and makes it a max-heap
The MAX-HEAPIFY procedure has the following inputs:
an array \(A\), and
an index \(i\) into the array
When called, MAX-HEAPIFY assumes that
the binary trees rooted at \(LEFT(i)\) and \(RIGHT(i)\) are max-heaps, but
\(A[i]\) might be smaller than its children
MAX-HEAPIFY moves \(A[i]\) down the max-heap so that the subtree rooted at \(i\) becomes a max-heap
Outline: At each step,
The largest of the elements \(A[i], A[LEFT(i)], A[RIGHT(i)]\) is determined
Its index is stored in the variable \(largest\)
If \(A[i]\) is largest, then the subtree rooted at node \(i\) is already a max-heap and the procedure terminates
Otherwise, one of the two children has the largest element, and so
\(A[i]\) is swapped with \(A[largest]\)
Node \(i\) and its immediate children now satisfy the max-heap property
But \(A[largest]\) now equals the original \(A[i]\), so that subtree might violate the max-heap property
So we call MAX-HEAPIFY recursively on that subtree
MAX-HEAPIFY(A, i)
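The following is a minimal C sketch of the outline above, assuming 0-based indexing (so LEFT(i) = 2i + 1 and RIGHT(i) = 2i + 2) and an explicitly passed heap size.

```c
#include <stddef.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Assumes the subtrees rooted at the children of i are max-heaps;
   floats A[i] down until the subtree rooted at i is a max-heap. */
static void max_heapify(int A[], size_t heap_size, size_t i)
{
    size_t l = 2 * i + 1;    /* LEFT(i)  */
    size_t r = 2 * i + 2;    /* RIGHT(i) */
    size_t largest = i;

    if (l < heap_size && A[l] > A[largest]) largest = l;
    if (r < heap_size && A[r] > A[largest]) largest = r;
    if (largest != i) {
        swap(&A[i], &A[largest]);           /* move A[i] down one level */
        max_heapify(A, heap_size, largest); /* fix the affected subtree */
    }
}
```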
Let \(T(n)\) be the running time of MAX-HEAPIFY for a subtree of size \(n\)
It requires constant time to compare the root with its two children to decide which is largest
If necessary, it additionally requires time to MAX-HEAPIFY a subtree
Claim: The size of a subtree can be at most \(2n/3\).
Proof is an exercise. Hint:
Height \(k = \lfloor \log_2 n \rfloor\)
Size of a subtree is at most \(2^k \leq 2^{\lfloor \log_2 n \rfloor}\)
Worst case is when the tree is half-full (is that obvious?)
Then \(n = 2^{k}-1 + 2^{k}/2 = \frac{3}{2} \times 2^{k} - 1\), and the size of the larger subtree is \(m = 2^{k} - 1\)
Then \(m/n = \frac{2}{3} \times \frac{1-1/L}{1-2/(3L)}\), where \(L = 2^{k}\)
The extra factor simplifies to \((3L-3)/(3L-2) < 1\)
\[ T(n) = T(2n/3) + \Theta(1) \]
By the master theorem, the solution is \(T(n) = O(\log_2 n)\)
We often state this by saying that the runtime of MAX-HEAPIFY is linear in the height of the tree
We can easily use MAX-HEAPIFY in a bottom-up manner to convert an array \(A[1,...,n]\) into a max-heap
All elements \(A[i]\) for \(i > PARENT(n)\) are leaves of the tree, and so are already 1-element max-heaps
BUILD-MAX-HEAP(A)
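A minimal C sketch (0-based, reusing max_heapify from above): in 0-based terms, nodes \(\lfloor n/2 \rfloor, ..., n-1\) are leaves, so it suffices to heapify the internal nodes bottom-up.

```c
/* Builds a max-heap from an unordered array by calling max_heapify
   on each internal node, from the last internal node up to the root. */
static void build_max_heap(int A[], size_t n)
{
    for (size_t i = n / 2; i-- > 0; )   /* i = n/2 - 1 down to 0 */
        max_heapify(A, n, i);
}
```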
To prove correctness, we can use the following loop invariant:
At the start of each iteration of the for loop, each node \(i+1, i+2, ..., n\) is the root of a max-heap.
Children of any node \(i\) are numbered higher than \(i\)
Since these are max-heaps by the loop invariant condition, it is legitimate to apply MAX-HEAPIFY(A, i)
This now makes \(i\) the root of a max-heap, and the property continues to hold for all nodes numbered \(>i\)
Decrementing \(i\) thus re-establishes the loop invariant for the next iteration
At termination, \(i = 0\). By the loop invariant, each node \(1, 2, ..., n\) is the root of a max-heap
In particular, this holds for node 1, the root node
A simple upper bound for the running time is \(O(n \log_2 n)\): there are \(O(n)\) calls to MAX-HEAPIFY, each costing \(O(\log_2 n)\)
Can we do better? Possibly yes, because
the running time of MAX-HEAPIFY is lower for nodes of low height, and
such nodes are greater in number
In particular, an \(n\)-element heap has
height \(H = \lfloor \log_2 n \rfloor\), and
at height \(h\) (i.e., at depth \(H-h\) from the root node), at most \(2^{H-h}\) nodes
The runtime of MAX-HEAPIFY on a node of height \(h\) is \(O(h)\)
So the total run time for BUILD-MAX-HEAP is bounded above by
\[ \sum_{h = 0}^H 2^{H-h} O(h) = 2^H O \left( \sum_{h = 0}^H \frac{h}{2^h} \right) \]
Recall that \[\sum_{k=0}^n kx^k < \sum_{k=0}^{\infty} kx^k = x \frac{\text{d}}{\text{d}x} \sum_{k=0}^{\infty} x^k = x \frac{\text{d}}{\text{d}x} \frac{1}{1-x} = \frac{x}{(1-x)^2} \]
Thus we can see that
\[ \sum_{h = 0}^H \frac{h}{2^h} \leq \frac{1/2}{(1-1/2)^2} = 2 \]
So the total runtime of BUILD-MAX-HEAP is \(2^H O(1) = O(2^H) = O(n)\), i.e., BUILD-MAX-HEAP runs in linear time
Finally, we come to the heapsort algorithm
Use BUILD-MAX-HEAP to build a max-heap on the input array \(A\) of length \(n\)
Initial heap size \(s = n\)
The maximum element of the array is now stored at the root \(A[1]\)
Put it into its correct final position by swapping with \(A[s]\)
Now, discard this maximum element in \(A[n]\) from the heap, by simply decreasing the heap size \(s\) by 1
The remainder is almost a max-heap, except possibly at the root node
Make it a max-heap by calling MAX-HEAPIFY
Repeat
HEAPSORT(A)
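Putting the pieces together, a minimal C sketch (0-based, reusing swap, max_heapify, and build_max_heap from the earlier sketches); s plays the role of the heap size, as in the outline above.

```c
static void heap_sort(int A[], size_t n)
{
    build_max_heap(A, n);
    for (size_t s = n; s > 1; s--) {
        swap(&A[0], &A[s - 1]);   /* move current maximum to its final place */
        max_heapify(A, s - 1, 0); /* restore the max-heap property at the root */
    }
}
```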
At the start of each iteration of the for loop, the subarray \(A[1,...,i]\) is a max-heap containing the \(i\) smallest elements of \(A[1,...,n]\), and the subarray \(A[i+1,...,n]\) contains the \(n-i\) largest elements of \(A[1,...,n]\) in sorted order.
\[ T(n) = O(n) + \sum_{i} O(\lfloor \log_2 i \rfloor) = O(n) + O\left( \sum_{i} \log_2 i \right) = O(n \log_2 n) \]
A common problem: finding the maximum
Typical approach: look at each element one by one, keeping track of the best so far
Not much we can do to improve on this
A variant of this problem: there is a substantial cost to updating the current ‘best’ value
We can phrase this as the hiring problem
Suppose that your current office assistant is horribly bad, and you need to hire a new office assistant
An employment agency sends you one candidate every day
You interview a candidate and decide either to hire or not
But if you don’t hire the candidate immediately, you cannot hire him / her later
You pay the employment agency a small fee to interview an applicant
Hiring an applicant is more costly because you must also compensate the current office assistant, who you are firing
You want to have the best possible person for the job at all times
Therefore, you decide that, after interviewing each applicant, if that applicant is better qualified than the current office assistant, you will fire the current office assistant and hire the new applicant
You are willing to pay the resulting price of this strategy, but you wish to estimate what that price will be
hire-assistant(n)
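A minimal C sketch of the strategy; here score[i] is a hypothetical stand-in for the interview outcome of candidate i, since the original procedure only assumes we can compare candidates.

```c
/* Returns m, the number of times someone new is hired.  Interviewing
   costs c_i per candidate and each hire costs c_h, so the total cost
   is n*c_i + m*c_h. */
static int hire_assistant(const double score[], int n)
{
    int best = -1;   /* index of the current office assistant; none yet */
    int m = 0;       /* number of hires */
    for (int i = 0; i < n; i++) {
        /* interview candidate i (cost c_i) */
        if (best < 0 || score[i] > score[best]) {
            best = i;   /* fire the current assistant, hire candidate i */
            m++;
        }
    }
    return m;
}
```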
Let \(c_i\) be interview cost, and \(c_h\) be hiring cost.
Then the total cost is \(nc_i + mc_h\), where \(m\) is the number of times we hired someone new.
The first part is fixed, so we concentrate on \(m c_h\).
Worst case:
we get applicants in increasing order (worst to best)
we hire everyone we interview
So \(m = n\)
Best case: \(m=1\)
What is the average case?
We need to assume a probability distribution on the input order
Simplest model: candidates come in random order
More precisely, their order is a uniformly random permutation of \(1, 2, ..., n\)
\[\begin{eqnarray*} X_i & = & \boldsymbol{1}\left\lbrace\text{Candidate } i \text{ is hired}\right\rbrace \\ X & = & \sum_i X_i \end{eqnarray*}\]
Candidate \(i\) is hired exactly when it is the best of the first \(i\) candidates, which under the random-order model happens with probability \(1/i\)
Then \(E(X_i) = 1/i \implies E(X) = \sum_{i=1}^n 1/i \approx \log n\)
Exercise: Can we write \(E(X) = \Theta(\log n)\)?
Exercise: Determine \(Var(X)\).
The final general sorting algorithm we study is called quicksort
It is among the fastest sorting algorithms in practice
Estimating the runtime theoretically is somewhat tricky
Quicksort is a divide-and-conquer algorithm (like merge sort)
The steps to sort an array \(A[p,...,r]\) are:
Choose an element in \(A\) as the pivot element \(x\)
Partition (rearrange) the array \(A[p,...,r]\) and compute an index \(p \leq q < r\) such that
Each element of \(A[p,...,q] \leq x\)
Each element of \(A[q+1,...,r] \geq x\)
Computing the index \(q\) is part of the partitioning procedure
Sort the two subarrays \(A[p,...,q]\) and \(A[q+1,...,r]\) by recursive calls to quicksort
No further work needed, because the whole array is now sorted
QUICKSORT(A, p, r)
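The recursive driver is straightforward; a minimal C sketch follows (0-based, inclusive bounds), where partition refers to Hoare's procedure sketched after its description below.

```c
static int partition(int A[], int p, int r);  /* Hoare partition, defined below */

/* Sorts A[p..r]; the whole array is sorted by quicksort(A, 0, n - 1). */
static void quicksort(int A[], int p, int r)
{
    if (p < r) {
        int q = partition(A, p, r);  /* p <= q < r */
        quicksort(A, p, q);          /* sort A[p..q]   */
        quicksort(A, q + 1, r);      /* sort A[q+1..r] */
    }
}
```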
The full array A of length n can be sorted with QUICKSORT(A, 1, n)
Of course, the important ingredient is PARTITION()
Quicksort was originally invented by C. A. R. Hoare in 1959
He proposed the following PARTITION() algorithm
PARTITION(A, p, r)
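A minimal C sketch of Hoare's scheme (0-based, inclusive bounds), using A[p] as the pivot:

```c
/* Rearranges A[p..r] and returns an index j with p <= j < r such that
   every element of A[p..j] is <= every element of A[j+1..r]. */
static int partition(int A[], int p, int r)
{
    int x = A[p];    /* pivot */
    int i = p - 1;
    int j = r + 1;
    for (;;) {
        do { j--; } while (A[j] > x);   /* scan inward from the right */
        do { i++; } while (A[i] < x);   /* scan inward from the left  */
        if (i < j) {
            int t = A[i]; A[i] = A[j]; A[j] = t;
        } else {
            return j;
        }
    }
}
```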
Exercise: Assuming \(p < r\), show that in the algorithm above,
Elements outside the subarray \(A[p, ..., r]\) are never accessed
The algorithm terminates after a finite number of steps
On termination, the return value \(j\) satisfies \(p \leq j < r\)
Every element of \(A[p, ..., j]\) is less than or equal to every element of \(A[j+1, ..., r]\)
Runtime of PARTITION is clearly \(\Theta(n)\) (linear)
Worst-case: partitioning produces one subproblem with \(n-1\) elements and one with 1 element
\[ T(n) = T(n-1) + T(1) + \Theta(n) = T(n-1) + \Theta(n) \]
Solved by \(T(n) = \Theta(n^2)\)
Best case: always balanced split
\[ T(n) = 2 T(n/2) + \Theta(n) \]
The master theorem gives \(T(n) = O(n \log_2 n)\)
This happens if we can somehow ensure that the pivot is always the median
That is of course impossible to ensure
Average case: This also turns out to be \(O(n \log_2 n)\), but the proof is more involved
We will study a slightly different version of quicksort (due to Lomuto)
Formal runtime analysis of this version is easier
PARTITION(A, p, r)
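A minimal C sketch of Lomuto's scheme (0-based, inclusive bounds), using the last element as the pivot; it is named lomuto_partition here only to distinguish it from the Hoare version above.

```c
/* Rearranges A[p..r] and returns the final index q of the pivot, so
   that A[p..q-1] <= A[q] = x and A[q+1..r] >= x. */
static int lomuto_partition(int A[], int p, int r)
{
    int x = A[r];    /* pivot */
    int i = p - 1;   /* A[p..i] is the "<= x" region so far */
    for (int j = p; j < r; j++) {
        if (A[j] <= x) {
            i++;
            int t = A[i]; A[i] = A[j]; A[j] = t;
        }
    }
    /* swap the pivot with the leftmost element greater than x */
    int t = A[i + 1]; A[i + 1] = A[r]; A[r] = t;
    return i + 1;
}
```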
This rearranges \(A[p,...,r]\) and computes index \(p \leq q \leq r\) such that
\(A[q] = x\)
Each element of \(A[p,...,q-1] \leq x\)
Each element of \(A[q+1,...,r] \geq x\)
The quicksort algorithm is modified as
QUICKSORT(A, p, r)
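The corresponding driver in C; note that the pivot position q is excluded from both recursive calls.

```c
static void lomuto_quicksort(int A[], int p, int r)
{
    if (p < r) {
        int q = lomuto_partition(A, p, r);
        lomuto_quicksort(A, p, q - 1);   /* sort A[p..q-1] */
        lomuto_quicksort(A, q + 1, r);   /* sort A[q+1..r] */
    }
}
```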
As the procedure runs, it partitions the array into four (possibly empty) regions.
At the start of each iteration of the for loop in PARTITION, the regions satisfy certain properties.
We state these properties as a loop invariant:
At the beginning of each iteration of the loop, for any array index \(k\),
If \(p \leq k \leq i\), then \(A[k] \leq x\)
If \(i+1 \leq k \leq j-1\), then \(A[k] > x\)
If \(k = r\), then \(A[k] = x\)
(The values of \(A[k]\) can be anything for \(j \leq k < r\))
Prior to the first iteration of the loop, \(i = p-1\) and \(j = p\)
No values lie between \(p\) and \(i\) and no values lie between \(i + 1\) and \(j - 1\)
So, the first two conditions of the loop invariant are trivially satisfied
The initial assignment \(x = A[r]\) satisfies the third condition
We have two cases, depending on the outcome of the test \(A[j] \leq x\)
When \(A[j] > x\), the only action is to increment \(j\), after which
condition 2 holds for \(A[j-1]\)
all other entries remain unchanged
When \(A[j] \leq x\), the loop increments \(i\), swaps \(A[i]\) and \(A[j]\), and then increments \(j\)
Because of the swap, we now have that \(A[i] \leq x\), and condition 1 is satisfied
Similarly, \(A[j-1] > x\), as the value swapped into \(A[j-1]\) is, by the loop invariant, greater than \(x\)
At termination, \(j = r\)
Every entry in the array is in one of the three sets described by the invariant
We have partitioned the values in the array into three sets:
those less than or equal to \(x\)
those greater than \(x\)
a singleton set containing \(x\)
The second-last step of PARTITION swaps the pivot element with the leftmost element greater than \(x\)
This moves the pivot into its correct place in the partitioned array
The final step returns the pivot’s new index
Again, it is easy to see that the running time of PARTITION is \(\Theta(n)\).
Worst case: \(T(n) = \Theta(n^2)\) as before
Best case: \(T(n) = O(n \log_2 n)\) as before
Examples of worst case:
Input data already sorted
All input values constant
Exercise:
Are these worst cases for the original (Hoare) partition algorithm as well?
Suggest simple modifications which can “fix” these worst cases (without increasing the order of the runtime of PARTITION)
Average case: What is the runtime of quicksort in the “average case”?
This is the expected runtime when the input order is random (uniformly over all permutations)
A related concept: Randomized Algorithms
An algorithm is randomized if it makes use of (pseudo)-random numbers
We will analyze a randomized version of quicksort
This requires a “random number generator” algorithm RANDOM(i, j)
RANDOM(i, j) should return a random integer between \(i\) and \(j\) (inclusive) with uniform probability
RANDOMIZED-PARTITION(A, p, r)
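A minimal C sketch: choose the pivot uniformly from A[p..r], move it to A[r], and reuse the Lomuto partition from above. The helper random_int is a stand-in for RANDOM(i, j); rand() % k is slightly biased for large k, but serves for illustration.

```c
#include <stdlib.h>

static int random_int(int i, int j)    /* a crude RANDOM(i, j) */
{
    return i + rand() % (j - i + 1);
}

static int randomized_partition(int A[], int p, int r)
{
    int k = random_int(p, r);
    int t = A[k]; A[k] = A[r]; A[r] = t;  /* move the random pivot to A[r] */
    return lomuto_partition(A, p, r);
}
```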
RANDOMIZED-QUICKSORT then uses RANDOMIZED-PARTITION in place of PARTITION
RANDOMIZED-QUICKSORT(A, p, r)
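The randomized driver in C, for completeness:

```c
static void randomized_quicksort(int A[], int p, int r)
{
    if (p < r) {
        int q = randomized_partition(A, p, r);
        randomized_quicksort(A, p, q - 1);
        randomized_quicksort(A, q + 1, r);
    }
}
```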
A randomized algorithm can proceed differently on different runs with the same input
In other words, the runtime for a given input is a random variable
This leads to two distinct concepts:
Expected runtime of RANDOMIZED-QUICKSORT
(on a given input)
Average case runtime of QUICKSORT
(averaged over random input order)
An alternative randomized version of quicksort is to randomly permute the input initially
The expected runtime in that case is clearly equivalent to the average case of QUICKSORT
Instead, we only choose the pivot randomly (in each partition step)
However, this does not change the resulting partitions (as sets)
A little thought shows that the number of comparisons is also the same
The number of swaps may differ, but it is at most the number of comparisons
Assume that all elements of the input \(n\)-element array \(A[1, ..., n]\) are distinct
Each call to PARTITION has a for loop where each iteration makes one comparison (\(A[j] \leq x\))
Let \(X\) be the number of such comparisons in PARTITION over the entire execution of QUICKSORT
Then the running time of QUICKSORT is \(O(n + X)\)
This is easy to see, because
PARTITION is called at most \(n\) times (actually less)
In each such call, each iteration of the for loop makes one comparison contributing to \(X\)
The remaining operations of PARTITION only contribute a constant term
To analyze the runtime of quicksort, we will try to find \(E(X)\)
In other words, we will not analyze the contribution of each PARTITION call separately
Let
\(z_1 < z_2 < \dotsm < z_n\) be the elements of \(A\) in increasing order
\(Z_{ij} = \lbrace z_i, ..., z_j \rbrace\) be the set of elements between \(z_i\) and \(z_j\), inclusive.
\(X_{ij} = { \boldsymbol{1} \left\lbrace z_i \textsf{ is compared with } z_j\right\rbrace}\) sometime during the execution of QUICKSORT
First, note that two elements may be compared at most once
One of the elements being compared is always the pivot
The pivot is never involved in subsequent recursive calls to QUICKSORT
\[ X = \sum_{i=1}^{n-1} \sum_{j=i+1}^n X_{ij} \]
\[ E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^n E(X_{ij}) = \sum_{i=1}^{n-1} \sum_{j=i+1}^n P(z_i \textsf{ is compared with } z_j) \]
It remains to compute \(P(z_i \textsf{ is compared with } z_j)\)
Consider the first element \(x\) in \(Z_{ij} = \lbrace z_i, ..., z_j \rbrace\) that is chosen as a pivot (at some point)
If \(z_i < x < z_j\), then \(z_i\) and \(z_j\) will never be compared
However, if \(x\) is either \(z_i\) or \(z_j\), then they will be compared
So, we want the probability that \(x\) is either \(z_i\) or \(z_j\)
Until the first time something in \(Z_{ij}\) is chosen as a pivot, all elements in \(Z_{ij}\) remain in the same partition in every call to PARTITION (they are either all less than or all greater than any previous pivot)
Recall that pivots are chosen uniformly at random (in RANDOMIZED-PARTITION)
So any element of \(Z_{ij}\) is equally likely to be the one chosen first
Thus the required probability is \(2/|Z_{ij}| = 2/(j-i+1)\), and so
\[ E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^n \frac{2}{j-i+1} = \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1} < \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\log_2 n) = O(n \log_2 n) \]
We have now seen four different sorting algorithms
Three of them have \(O(n \log n)\) runtime
A common property: they all use only pairwise comparison of elements to determine the result
In other words, only ranks are important, not the actual values
Such sorting algorithms are called comparison sorts
Claim: Any comparison sort algorithm requires \(\Omega(n \log n)\) comparisons in the worst case
To see why, think of any comparison sort as a decision tree
Each comparison leads to a decision
A sequence of decisions leads to the correct sorted result
For example, consider the decision tree obtained when we do insertion sort on three elements \(a_1, a_2, a_3\)
Here, \(i \leq j\) denotes the act of comparing \(a_i\) and \(a_j\)
Generally, this decision tree must be a binary tree (two outcomes of each comparison)
It must have at least \(n!\) leaf nodes (one or more for each possible permutation)
Comparisons needed to reach a particular leaf: length of the path from the root node
The worst case number of comparisons is the height of the binary tree (longest path)
A binary tree of height \(h\) can have at most \(2^h\) leaf nodes
A binary tree with at least \(n!\) leaf nodes must have height \(h \geq \log_2 n!\)
Using Stirling’s approximation \(\log n! = n \log n - n + O(\log n)\),
\[ h \geq \log_2(n!) = \log(n!) / \log(2) = \Theta(n \log n) \]
Sorting can be done in linear time in some special cases
As shown above, such algorithms cannot be comparison-based
Usually, these algorithms put restrictions on possible values
Examples:
Counting sort
Radix sort
A common requirement in randomized algorithms is to find a random permutation of an input array
One option: assign random key values to each element, then sort the elements according to these keys
PERMUTE-BY-SORTING(A)
Here \(M\) should be large enough that the probability of keys being duplicated is small
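A minimal C sketch, taking \(M = n^3\) (a common choice, an assumption here rather than part of the original) and using qsort from the standard library as the comparison sort; rand() is again a crude stand-in for a proper generator.

```c
#include <stdlib.h>

struct keyed { long key; int value; };

static int cmp_key(const void *a, const void *b)
{
    long ka = ((const struct keyed *)a)->key;
    long kb = ((const struct keyed *)b)->key;
    return (ka > kb) - (ka < kb);
}

/* Attach a random key in 1..M to each element, sort by key, copy back. */
static void permute_by_sorting(int A[], int n)
{
    long M = (long)n * n * n;   /* M = n^3 keeps duplicate keys unlikely */
    struct keyed *P = malloc(n * sizeof *P);
    for (int i = 0; i < n; i++) {
        P[i].key = 1 + rand() % M;   /* RANDOM(1, M); biased but illustrative */
        P[i].value = A[i];
    }
    qsort(P, n, sizeof *P, cmp_key);
    for (int i = 0; i < n; i++)
        A[i] = P[i].value;
    free(P);
}
```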
Exercise: Show that PERMUTE-BY-SORTING produces a uniform random permutation of the input, assuming that all key values are distinct
The runtime of PERMUTE-BY-SORTING will be \(\Omega(n \log_2 n)\) if we use a comparison sort
A better method for generating a random permutation is to permute the given array in place
The procedure RANDOMIZE-IN-PLACE does so in \(\Theta(n)\) time
RANDOMIZE-IN-PLACE(A)
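A minimal C sketch (0-based); this is the classic Fisher-Yates shuffle. As before, rand() % k is slightly biased but serves for illustration.

```c
#include <stdlib.h>

/* In iteration i, A[i] is swapped with a uniformly chosen element of
   A[i..n-1] and is never touched again afterwards. */
static void randomize_in_place(int A[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int j = i + rand() % (n - i);   /* RANDOM(i, n) in 1-based terms */
        int t = A[i]; A[i] = A[j]; A[j] = t;
    }
}
```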
In the \(i\)th iteration, \(A[i]\) is chosen randomly from among \(A[i], A[i+1], ..., A[n]\)
Subsequent to the \(i\)th iteration, \(A[i]\) is never altered.
The procedure RANDOMIZE-IN-PLACE computes a uniform random permutation
We prove this using the following loop invariant
Just prior to the \(i\)th iteration of the for loop, for each possible \((i-1)\)-permutation of the \(n\) elements, the subarray \(A[1,...,i-1]\) contains this \((i-1)\)-permutation with probability \((n-i+1)!/n!\).
Holds trivially for the initial step (\(i-1=0\), the empty permutation)
If this is not convincing, take (just before) \(i=2\) to be the initial step
Assume true up to \(i=1,...,k\)
Consider what happens just before \(i=(k+1)\)th iteration (i.e., just after \(k\)th iteration)
Let \((X_1, X_2, ..., X_k)\) be the random variable denoting the observed permutation
For any specific \(k\)-permutation \((x_1, x_2, ..., x_k)\),
\[\begin{eqnarray*} P(X_1 = x_1, X_2 = x_2, ..., X_k = x_k) &=& P(X_k = x_k | X_1 = x_1, X_2 = x_2, ..., X_{k-1} = x_{k-1}) \\ & & \times P(X_1 = x_1, X_2 = x_2, ..., X_{k-1} = x_{k-1}) \\ &=& \frac{1}{n-k+1} \times \frac{(n-k+1)!}{n!} = \frac{(n-k)!}{n!} \end{eqnarray*}\]
We will not discuss analysis of algorithms further
If you are interested, an excellent book on this topic is
Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein
We will discuss some more algorithms in second semester projects