Chapter 8 - Bounds for sorting |
Document last modified: |
Overview
How fast can we sort? We will prove a lower bound.
Comparison sorting
- The only operation that may be used to gain order information about a sequence is comparison of pairs of elements.
if A[ i ] < A[ i + 1 ]
A[ i ] ↔ A[ i + 1]
- All sorts seen so far are comparison sorts: insertion sort, selection sort, merge sort, quick sort, heap sort, tree sort.
Upper bounds
- Largest number of binary comparisons to sort a list of n elements.
- Equals the longest path in a decision tree representing the sorting algorithm.
- Height of binary decision tree with n! leaves is ⌈lg n!⌉
- At least ⌈lg n!⌉ comparisons are needed.
Lower bounds
- Ω( n ) to examine all the input.
Question 8.1 - What does that mean?
Sorts seen so far are Ω(n lg n)
Question 8.2 - We proved Θ(n lg n) for merge and heap sort.
- What is the basic argument for merge sort?
- What is the basic argument for heap sort?
- We’ll show that Ω(n lg n) is a lower bound for all comparison sorts.
Decision Tree Model
Example
Decision tree and algorithm for finding a minimum of three numbers.
Minimum( a, b, c) if a < b then
if a < c
then return a
else return c
else
if b < c
then return b
else return c
Question 8.3
Would the following produce the same results?
Would the following produce the same decision tree, that is, the same number of comparisons?
Diagram the left decision tree.
Min( x, y ) Minimum( a, b, c)if x < y then
then return x
else return y
return Min( Min( a, b ), c)
Minimum( a, b, c) if a < b then
if a < c
then return a
else return c
else
if b < c
then return b
else return c
Bubble sort
bubblesort( A ) for i ← 1 to n - 1 do
for j ← 1 to n - i do
if A[ j ] > A[ j + 1 ] then A[ j ] ↔ A[ j + 1 ]
Question 8.4
- Guess (estimate) the worst and best case run time.
- Is the sort ascending or descending?
Comparisons A[ j ] > A[ j + 1 ]
(n-1) + (n-2) + ... + 2 + 1 = n*(n-1) / 2
Question 8.5 - Worst case order?
Sort on 3 elements A[1]=a, A[2]=b and A[3]=c initially into ascending order
1:2 means to compare value at index 1 with value at index 2.
if A[1] > A[2] then A[1] ↔ A[2]
Each internal node is labeled by indices of array elements from their original positions.
Each leaf is labeled by the permutation of orders that the algorithm determines.
Question 8.6
How many possible results (leaves) are there for sorting 3 elements? Count them.
What function of n=3 gives that result? That is f(n) = ?
What would be the number of leaves for n=5?
What is: P(n, n)?
What is a good guess for the height of the binary decision tree for sorting n elements?
Permutations - n!
Claim binary decision tree has n! leaves.
n values have n! orderings.
3 things can be arranged 3! = 6 ways.
Sort must generate decisions that produce every possible ordering (e.g. [a,b,c], [a,c,b], [b,a,c], ....)
Insertion sort revisited
Question 8.7 - Which of the following analyses represent the best and the worst cases? When does each occur?
T(n) = c1n + c2(n-1) + c4(n-1) + c5(n-1) + c8(n-1) = (c1+ c2 + c4 + c5 + c8)n - (c2 + c4 + c5 + c8)
= an + b for some a and b constants
= an2 + bn + c for some constants a, b and c
Bounding sorts Ω(n lg n) and O(n lg n)
Longest path from decision tree root to leaf is worst case number of comparisons, the height of the decision tree.
Lower bound of height is therefore lower bound on running time.
Theorem 1 - If a binary tree of height h has l leaves then: ⌈lg l⌉ ≤ h
Proof: There are at most 2h leaves in a binary tree of height h (see Proof 1 Tree and Heap Proofs)
so l ≤ 2h
lg l ≤ lg 2h = h
⌈lg l⌉ ≤ h since h is an integer
lg n! ≤ ⌈lg n!⌉ ≤ h
For n elements, the binary decision tree represents each n! permutation as a leaf.
By Theorem 1, for a binary tree of height h:
lg n! ≤ ⌈lg n!⌉ ≤ h
so whatever is a lower bound for ⌈lg n!⌉ is also a lower bound for lg n!
⌈lg n!⌉ = O(n lg n) Just as n2 = O(n3)
n! = 1*2*3*...*n
≤ n * n * ... * n
= nn
log n! ≤ log nn = n log n
→ lg n! ≤ n lg n
For n ≥ 1, c = 1
O(g(n)) = {h(n): ∃ positive constants c, n0 such that 0 ≤ h(n) ≤ cg(n), ∀ n ≥ n0}
lg n! ≤ 1*n lg n
→ ⌈lg n!⌉ = O(n lg n)
⌈lg n!⌉ = Ω(n lg n)
nn/2 = n/2*n/2*...*n/2
≤ 1*2*3*...*(n-1)/2*n/2*(n+1)/2*...*n
= n!
log nn/2 = (n log n)/2
≤ log n!
→ (n lg n)/2 ≤ lg n!
For n ≥ 1, c = 1/2
Ω(g(n)) = {h(n): ∃ positive constants c, n0 such that 0 ≤ cg(n) ≤ h(n), ∀ n ≥ n0}
(1/2)(n lg n) ≤ ⌈lg n!⌉
→ ⌈lg n!⌉ = ω(n lg n)
⌈lg n!⌉ = Θ(n lg n)
⌈lg n!⌉ = O(n lg n)
⌈lg n!⌉ = Ω(n lg n)
Theorem 2 - Any binary comparison sort of n elements requires Ω(n lg n) comparisons.
Each permutation of n inputs of a binary decision tree is a leaf.
Have shown for height h of the tree :
⌈lg n!⌉ ≤ h
and that
⌈lg n!⌉ = Ω(n lg n)
Since exactly one comparison is made at each non-leaf level and the worst case is bounded by the height h.
Question 8.8
What does Any binary comparison sort of n elements requires Ω(n lg n) comparisons in the worst case really mean?
Can we get lucky and do better? worse?