The purpose of c2: no matter what value of c2 > 0 is used, f1(n) will eventually be smaller than c2·g1(n).
c2·g1(n) will always be at least as large as f1(n) above the threshold n0. (n0 is not necessarily the intersection of f1(n) and c2·g1(n); it is an arbitrarily chosen constant.)
The choices of n0 and c2 depend on each other.
L02: (Asymp) Asymptotics, big-Oh notation
notion | idea | interpretation                 | pronunciation
o(g)   | <    | f grows strictly slower than g | little-oh
O(g)   | ≤    | f grows no faster than g       | big-oh
Θ(g)   | =    | f grows as fast as g           | theta
Ω(g)   | ≥    | f grows no slower than g       | big-omega
ω(g)   | >    | f grows strictly faster than g | little-omega
Ex: If f1(n) = n^2 and g1(n) = n^3, then f1(n) ∈ O(g1(n)).
Proof: Let n0 = 1 and c2 = 1; then n^2 ≤ 1·n^3 for all n ≥ 1.
No matter how much we speed up algorithm g (the speedup comes in the form of c2), algorithm f will always be faster once the input size is big enough (the threshold comes in the form of n0).
Ex: If f1(n) = n^2 and g1(n) = n^3, then f1(n) ∈ o(g1(n)).
Proof sketch: Given the speedup c2 = 1/500, let n0 = 500; then n^2 ≤ c2·n^3 for all n ≥ n0.
In general, given any c2 > 0, set n0 = 1/c2 + 1 so there is always a threshold for any given speedup. The plus one just guarantees the threshold; any positive amount added would work.
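As a quick numeric sanity check (my own addition, not from the lecture), a few lines of Python confirm the threshold claim for the speedup c2 = 1/500:

```python
# For c2 = 1/500, the threshold n0 = 1/c2 + 1 = 501 should guarantee n^2 <= c2 * n^3.
c2 = 1 / 500
n0 = int(1 / c2) + 1                      # 501

for n in range(n0, n0 + 10_000):
    assert n**2 <= c2 * n**3, f"bound fails at n = {n}"
print(f"n^2 <= c2 * n^3 for every tested n >= {n0}")
```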
Theta is a combination of big-oh and big-omega; it isn’t faster and isn’t slower, it’s the same.
c1 is the slow down factor, c2 is the speedup factor.
Properties of big-O
f ∈ o(g) ⇒ f ∈ O(g). If f grows strictly slower than g, then f grows no faster than g.
f ∈ ω(g) ⇒ f ∈ Ω(g).
f ∈ O(g) and f ∈ Ω(g) ⇔ f ∈ Θ(g).
f ∈ O(g) ⇔ g ∈ Ω(f).
f ∈ Θ(g) ⇔ g ∈ Θ(f).
f ∈ o(g) ⇔ g ∈ ω(f).
L03: (Asymp) Limit test, logarithms, exponentials
Logs come from repeatedly cutting the search space, usually in half, hence log2(·).
We assume all logs have base 2.
Theorem: ∀ a, b > 1 and n > 0: log_a(n) = log_b(n) / log_b(a). Lets you switch the log base.
Examples
7·log(n) ∈ Θ(log(n))
(log(n))^7 ∈ ω(log(n))
log(7^n) = n·log(7) ∈ Θ(n)
Theorem: ∀ a, b > 0: lim_{n→∞} (log(n))^b / n^a = 0.
The intuition is that the bottom grows faster than the top.
Any power of a log grows slower than any polynomial power of n.
Examples
(log(n))^2 ∈ o(n)
(log(n))^2 ∈ o(√n)
(log(n))^7 ∈ o(n^(1/100))
(log(n))^2 ∈ o(n / (log(n))^2)
Ordering example
n / log(n)
log(n)
√n
log(log(n))
(log(n))^2
n
√(log(n))
log(n)·log(log(n))
n / (log(n))^2
(log(n))^2 / n (not an algorithm: it tends to 0)
Answer in increasing growth order
(log(n))^2 / n
log(log(n))
√(log(n))
log(n)
log(n)·log(log(n))
(log(n))^2
√n
n / (log(n))^2
n / log(n)
n
One helpful tip for ordering two functions is to cancel the common factors on both sides.
Theorem: log(n!) ∈ Θ(n·log(n))
Suppose we have a deck of cards, we want to find some permutation where they’re sorted.
The search space is n! and applying log() to it means we split the search space in half for every iteration.
n! ≤ n^n; take logs of both sides to get log(n!) ≤ n·log(n), which gives the big-O bound of the theorem.
n! = n·(n−1)·…·(n/2+1)·(n/2)·…·1 ≥ (n/2)^(n/2); take logs of both sides to get log(n!) ≥ (n/2)·log(n/2), which gives the big-Omega bound and, together with the big-O bound, the Θ of the theorem.
L04: (Asymp) Classification of exponentials, polynomials, and logarithms
An algorithm transforms an input into an output by performing some transformation on the input.
What is an efficient algorithm?
An algorithm that is fast. Doesn’t have to be optimal.
An algorithm is efficient if it runs in polynomial time in the input size. (Will show up on final exam)
2^n < n! < n^n
Polylogs, subpolynomials, and polynomials are efficient. Superpolynomials, exponentials, and superexponentials aren't efficient. (For exams, if asked for an efficient algorithm, it must be in the first three groups.)
Greedy (5)
L05: (Greedy) Interval scheduling 1
Proof techniques
Our greedy algorithm never makes a bad choice.
Our greedy algorithm always makes a good choice.
Our greedy algorithm stays ahead.
Our greedy algorithm is on the way to produce an optimal.
At any stage r during the computation subset Ar can be extended to an optimal solution.
Pieces of Candy
You can pick 4 pieces of candy.
We want to maximize the total weight of candy.
Greedy heuristic: Always pick the heaviest remaining piece.
Greedy algorithms are typically simple and can be stated in one line.
Candy pickings
Input: Integers n,k with 1≤k≤n. Weights w1,w2,...,wn.
Output: Heaviest subset A ⊆ {1, 2, ..., n} of size k, where weight(A) = Σ_{i ∈ A} wi.
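A minimal Python sketch of the greedy heuristic above (pick the heaviest remaining piece k times); the function name and tie-breaking are my own:

```python
def heaviest_subset(weights, k):
    """Candy picking: greedily take the heaviest remaining piece, k times.

    weights: list of weights w1..wn (0-indexed here).
    Returns (chosen indices, total weight).
    """
    # Sorting by weight descending and taking the first k items is the same as
    # picking the heaviest remaining piece k times (ties broken arbitrarily).
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen = order[:k]
    return chosen, sum(weights[i] for i in chosen)

print(heaviest_subset([9, 4, 8, 10, 2, 7], 4))   # ([3, 0, 2, 5], 34)
```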
Theorems
Our algorithm produces a valid solution.
Our algorithm produces an optimal solution.
Our algorithm runs in polynomial time.
Proof
What we want: Subset Ar can be extended to an optimal solution for all 0 ≤ r ≤ k. (If we can prove this, then we are done.)
Proof by contradiction. Assume there is some j such that Aj can’t be extended to an optimal solution. Aka assume there’s a state that can’t be extended to an optimal solution.
Let r = the smallest j such that Aj can't be extended to an optimal solution.
A0={}∪{9,4,8,10} where {9,4,8,10} is an extension that would lead to an optimal solution.
A1={9}∪{4,8,10}
Ar−1 = {a1, a2, ..., ar−1} ∪ {br, br+1, ..., bk} where {br, br+1, ..., bk} is an extension that would lead to an optimal solution.
Ar = {a1, a2, ..., ar−1} ∪ {ar} ← The step where we made a mistake, i.e., supposedly we can't extend it to an optimal solution.
Fact 1: The set Ār = {a1, ..., ar−1} ∪ {ar} ∪ {br+1, ..., bk} has size k, so it is a valid set:
The indices {a1,...,ar−1,ar} are all distinct because the greedy algorithm picks only distinct indices by definition of our greedy algorithm.
The indices {a1,...,ar−1,br+1,...,bk} are distinct because it’s part of an optimal solution by definition.
The index of ar is distinct from {br+1,...,bk} because had it been one of them, then Ar could have been extended.
A*r−1 = {a1, ..., ar−1} ∪ {br} ∪ {br+1, ..., bk} is an optimal solution.
Ār = {a1, ..., ar−1} ∪ {ar} ∪ {br+1, ..., bk} is our greedy solution.
Fact 2: We want to argue weight(Ār) ≥ weight(A*r−1) by arguing that weight(ar) ≥ weight(br).
We know that weight(ar) ≥ weight(br) because the greedy algorithm picks the heaviest among the remaining pieces.
From Fact 1 and Fact 2, we can conclude that Ār is a valid and optimal solution. That is a contradiction to our assumption that Ar can't be extended to an optimal solution.
This proof by contradiction can equivalently be phrased as a proof by induction on r.
L06: (Greedy) Interval scheduling 2
Interval scheduling
Input: Set of n jobs. Each job has a starting time si and a finishing time fi.
Output: Select as many non-overlapping jobs as possible.
Approaches
Pick the shortest job of the next available jobs.
Pick the next available job / job that starts first.
Pick the job that ends first.
Break ties arbitrarily.
Among jobs that do not overlap already picked jobs.
Answer to handout: {4, 11, 8, 1, 10}
A good algorithm
Outputs a valid solution.
Outputs an optimal solution.
Runs in poly time.
For the interval scheduling GA
By definition of the GA, we will never pick a job that overlaps so the solution is valid.
The proof that it gives an optimal solution is given below.
The GA is poly time: we can sort the jobs by finishing time in n·log(n) time.
Proof idea: GA never makes a bad choice (on our way to construct an optimal solution).
Theorem: Subset As can be extended to an optimal solution for all 0 ≤ s ≤ k*.
Proof by contradiction.
Assume there exists some j such that Aj cannot be extended to an optimal solution.
Let r be the smallest j such that Aj cannot be extended to an optimal solution. In other words, let r be the first time we make a mistake.
Then Ar−1={a1,a2,...,ar−1}∪{br,...} can be extended to an optimal solution but Ar={a1,a2,...,ar−1}∪{ar} cannot be extended.
Let {br, br+1, ..., bk} be an optimal extension.
We aren’t going to argue that Ar is in the optimal extension because there might be multiple optimal solutions.
We’re going to argue that Ar can be extended to an optimal solution to contradict our assumption.
Because ar is the greedy choice, we know that ar will end before or at the same time as br.
Then extend Ar by {br+1,...,bk} but the extension is not necessarily valid because ar and br+1 could overlap.
The solution to this problem is to sort the optimal extension and to replace the earliest job with the greedy choice. Let br be the job that finishes the earliest among jobs {br,br+1,...}.
There’s a greedy choice and that choice can be substituted into an optimal extension.
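A minimal Python sketch of the earliest-finish-time greedy algorithm (the job representation and names are my own):

```python
def interval_schedule(jobs):
    """Greedy interval scheduling: among jobs that do not overlap the ones
    already picked, always pick the job that ends first.

    jobs: list of (start, finish) pairs. Returns indices of the chosen jobs.
    O(n log n) time because of the sort by finishing time.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][1])
    chosen, last_finish = [], float("-inf")
    for i in order:
        start, finish = jobs[i]
        if start >= last_finish:          # job i does not overlap the previous pick
            chosen.append(i)
            last_finish = finish
    return chosen

print(interval_schedule([(0, 3), (2, 5), (4, 7), (6, 8)]))   # [0, 2]
```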
L07: (Greedy) Shortest paths in graphs
Dijkstra’s Greedy Algorithm
For all u ∈ V, set d(u) = ∞. Set S0 = {s}, d(s) = 0, A0 = {}.
For k = 1 to n−1, do
Find v ∈ V \ Sk−1 and u ∈ Sk−1 minimizing d(u) + l(u, v) ← Greedy choice
Set d(v) = d(u) + l(u, v), Sk = Sk−1 ∪ {v}, Ak = Ak−1 ∪ {(u, v)}
Single source shortest path problem
Input: Directed graph G = (V, E) with a source s ∈ V from which every vertex is reachable. A non-negative length l(e) for each edge.
Goal: A shortest path from s to every other vertex v ∈ V.
Solution is a tree rooted at s.
GA is to grow a shortest-path tree.
The proof depends on choosing paths that we will never regret.
Proof
We only pick edges that we’ll never regret.
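A common heap-based implementation sketch of Dijkstra's algorithm in Python (the lecture's pseudocode instead scans all candidate edges for the minimum; the heap and all names here are my own choices):

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest paths with non-negative edge lengths.

    adj: dict mapping each vertex u to a list of (v, length) pairs.
    Returns d[v], the length of a shortest path from s to v.
    """
    d = {u: float("inf") for u in adj}
    d[s] = 0
    heap = [(0, s)]                        # (tentative distance, vertex)
    done = set()
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue                       # stale heap entry
        done.add(u)                        # greedy choice: closest unfinished vertex
        for v, length in adj[u]:
            if du + length < d[v]:
                d[v] = du + length
                heapq.heappush(heap, (d[v], v))
    return d

adj = {"s": [("a", 1), ("b", 4)], "a": [("b", 2)], "b": []}
print(dijkstra(adj, "s"))                  # {'s': 0, 'a': 1, 'b': 3}
```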
L08: (Greedy) Minimum spanning tree
Kruskal algorithm
Presort the edges e1, e2, ..., em so that c(e1) ≤ c(e2) ≤ ... ≤ c(em), where c(...) is the cost of an edge.
F0 = {}, k = 0
For i = 1 to m
If Fk ∪ {ei} is acyclic, then Fk+1 = Fk ∪ {ei} and k = k + 1.
Return Fk and kgr, where kgr = k.
Minimum Spanning Tree (MST)
Undirected graph with non-negative edge costs.
Trees are graphs in which the path between any two vertices is unique.
A spanning tree has n - 1 edges.
A tree contains a unique path from every vertex to every other vertex.
If we add an edge to a spanning tree, then we introduce a unique cycle.
One GA to obtain the MST is to remove the edge with the greatest cost without making the graph disconnected.
Kruskal’s Heuristic
Pick the cheapest cost edge that doesn’t introduce a cycle.
Whenever we have a cycle, we can remove the edge with the most cost.
Theorem: Kruskal’s algorithm outputs a MST. (Assume Kruskal’s alg outputs a tree)
Proof by contradiction.
Assume there is some k such that Fk can’t be extended to an optimal solution. Aka Kruskal’s algorithm made a mistake.
Let r = smallest such k. So Fr−1 can be extended to an optimal solution but Fr can’t.
Let F*r−1 = Fr−1 ∪ {er, ..., en−1} be the optimal solution and Fr = Fr−1 ∪ {ex} be the set where we made the mistake.
Because of the greedy choice, the edge ex that the GA picked costs no more than some edge ey of the optimal extension that lies on the cycle created by adding ex to F*r−1 (any cheaper available edge would have been picked earlier).
So we can extend Fr to an optimal solution: (F*r−1 ∪ {ex}) − {ey} is a spanning tree that contains Fr and costs no more than F*r−1.
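A minimal Python sketch of Kruskal's algorithm; the union-find structure used for the acyclicity test is my own implementation choice:

```python
def kruskal(n, edges):
    """Kruskal's greedy rule: consider edges in increasing cost order and keep
    an edge whenever it does not close a cycle.

    n: number of vertices 0..n-1; edges: list of (cost, u, v).
    Returns the list of kept edges (a minimum spanning forest).
    """
    parent = list(range(n))

    def find(x):                          # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for cost, u, v in sorted(edges):      # presort edges by cost
        ru, rv = find(u), find(v)
        if ru != rv:                      # adding (u, v) keeps the forest acyclic
            parent[ru] = rv
            forest.append((cost, u, v))
    return forest

print(kruskal(4, [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3)]))
# [(1, 1, 2), (2, 2, 3), (3, 0, 2)], total cost 6
```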
L09: (Greedy) Huffman codes
Huffman Codes
Input: A set of n symbols S = {s1, s2, ..., sn} with frequencies f1, f2, ..., fn.
Output: An optimal prefix-free code over {0, 1} for those frequencies
Example
Message (M) = “appletrees”
StdEnc(M) = 01.1.1.00.0.000.10.0.0.11 ← Not a real binary encoding, because the dot separator acts as a third symbol.
Prefix-free code: a code in which no codeword is a prefix of another codeword, so decoding is unambiguous.
symbol        | frequency | standard encoding (std enc)
e             | 3         | 0
p             | 2         | 1
l, a, r, s, t | 1 each    | 00, 01, 10, 11, 000
Observation 1: More frequently used characters should have encodings that are no longer: fi > fj → |E(si)| ≤ |E(sj)|.
A prefix-free code is a binary tree.
Swap leaves to get higher frequencies closer to the root.
If you keep swapping subtrees until no longer possible, then we get an optimal prefix-free code.
In practice, the tree is built bottom up.
GA means I can locally find an improvement.
If it seems that I can make a local improvement, then GA will probably apply.
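A minimal Python sketch of the bottom-up construction using a min-heap, tried on the "appletrees" frequencies above (the tie-breaking, and therefore the exact codewords, may differ from the handout):

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-free code by repeatedly merging the two least frequent subtrees.

    freqs: dict symbol -> frequency. Returns dict symbol -> bit string.
    """
    # Heap entries are (frequency, tie-breaker, tree); a tree is either a symbol
    # (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # the two least frequent subtrees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1

    codes = {}
    def walk(tree, prefix):               # read codewords off the tree
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"e": 3, "p": 2, "l": 1, "a": 1, "r": 1, "s": 1, "t": 1}))
```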
Divide & conquer (4)
L10: (D&C) Simple recursive algorithms
Recurrences
Mergesort (Θ(n·log(n)))
T(2) = 1
T(n) = 2T(n/2) + n
Bubblesort (Θ(n^2))
T(1) = 0
T(n) = T(n−1) + n
Binary search (Θ(log(n)))
T(1) = 1
T(n) = T(n/2) + 1
Integer multiplication (Θ(n^(log2 3)) ≈ Θ(n^1.585))
T(1) = 1
T(n) = 3T(n/2) + n
Multiplication of complex numbers
Input: (a + bi)(c + di)
Output: (ac - bd) + (bc + ad)i
Faster algorithm for multiplying two complex numbers
3 multiplications (1 at r, 1 at s, 1 at t)
5 additions/subtractions (2 at r, 2 at e, 1 at f)
r = (a + b)(c − d) = ac − ad + bc − bd
s = bc
t = ad
e = ac - bd = r - s + t
f = ad + bc = s + t
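The same trick as runnable Python (variable names follow the notes; works for any numeric a, b, c, d):

```python
def complex_mult_3(a, b, c, d):
    """(a + bi)(c + di) with 3 multiplications and 5 additions/subtractions."""
    r = (a + b) * (c - d)      # ac - ad + bc - bd
    s = b * c
    t = a * d
    return r - s + t, s + t    # real part ac - bd, imaginary part bc + ad

print(complex_mult_3(1, 2, 3, 4))   # (-5, 10), i.e. (1 + 2i)(3 + 4i) = -5 + 10i
```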
Multiplication of large integers
ab∗cd where a is the upper half of the bit string, b is the lower half of the bit string, and ab is the entire number represented as a bit string.
(a·2^(n/2) + b)(c·2^(n/2) + d)
= ac·2^n + (ad + bc)·2^(n/2) + bd, where multiplying by 2^n and 2^(n/2) are shifts.
Apply the same trick when multiplying two complex numbers.
Requires 3 multiplications, 4 additions, and 2 shifts.
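A minimal recursive sketch of the integer version (Karatsuba). The base case, the splitting details, and the reliance on Python's built-in big integers are my own choices, so this is illustrative rather than a bit-level implementation:

```python
def karatsuba(x, y):
    """Multiply non-negative integers with 3 recursive multiplications."""
    if x < 16 or y < 16:                      # small base case: multiply directly
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    a, b = x >> half, x & ((1 << half) - 1)   # x = a * 2^half + b
    c, d = y >> half, y & ((1 << half) - 1)   # y = c * 2^half + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    mid = karatsuba(a + b, c + d) - ac - bd   # ad + bc from one extra multiplication
    return (ac << (2 * half)) + (mid << half) + bd   # the << are the shifts

print(karatsuba(12345678, 87654321) == 12345678 * 87654321)   # True
```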
When explaining recursion, describe the recursive step, not the entire process. Avoid describing recursion with iteration language ("for all", "for each", etc.).
Matrix multiplication
Each entry of the answer matrix requires O(n) operations, so constructing the entire answer matrix requires O(n^3), since n·n^2 = n^3.
Recursive method
Split each matrix into quadrants.
L11: (D&C) Largest sum of subarray and Recursions
Mergesort recurrence: T(n)=2T(n/2)+n+n
2T(n/2) because we split the input in half and we have to sort two subarrays.
+n+n because we divide the array and later merge it
Example
[7,3,−11,5,4,−3,7,2,−9,−8,12,−2]
Sample output: [5,4,−3,7,2]=15.
Brute force is Θ(n^3): we try all starting and ending indices, which gives Θ(n^2) pairs, and summing the elements between them costs another factor of n.
D&C is to split the array into two subarrays and to calculate the largest sum of those subarrays. We also need to care for the case when the solution crosses the border.
We ask each subarray for its
Largest subarray
Largest subarray touching the dividing border
Largest subarray touching its edges
Total sum
Each subarray returns four pieces of information.
Whatever I ask my children, I must also give to my parent.
The largest sum of an array must come from three cases
The left subarray
The right subarray
A combination of the left and right subarray
Runtime: T(n) = 2T(n/2) + Θ(n) + Θ(1) ∈ Θ(n·log n)
Dividing is linear time if we copy the array into two subarrays.
Merging is clearly constant time Θ(1). We need to compute 4 numbers from 8 numbers.
T(n) = 2T(n/2) + Θ(1) + Θ(1) ∈ Θ(n)
If the division were constant time, then the algorithm would be linear. It can be if we pass indices for the start and end of the subarray instead of copying.
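A minimal sketch of the recursion that returns the four numbers per subarray; it copies subarrays, so the divide step is linear and the whole thing runs in Θ(n log n), as in the first recurrence above. The tuple layout and names are my own:

```python
def max_subarray(arr):
    """Return (best, best prefix, best suffix, total) for the non-empty array arr."""
    if len(arr) == 1:
        return arr[0], arr[0], arr[0], arr[0]
    mid = len(arr) // 2
    lb, lp, ls, lt = max_subarray(arr[:mid])   # the left child's four numbers
    rb, rp, rs, rt = max_subarray(arr[mid:])   # the right child's four numbers
    best = max(lb, rb, ls + rp)                # left, right, or crossing the border
    prefix = max(lp, lt + rp)                  # a prefix may reach into the right half
    suffix = max(rs, rt + ls)                  # a suffix may reach into the left half
    return best, prefix, suffix, lt + rt

arr = [7, 3, -11, 5, 4, -3, 7, 2, -9, -8, 12, -2]
print(max_subarray(arr)[0])                    # 15, from [5, 4, -3, 7, 2]
```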
Master Theorem
Let T(n) = a·T(n/b) + f(n), where a is the number of subproblems, n/b is the size of each subproblem, and f(n) is the cost of dividing and merging.
Depth of the recursion tree = log_b(n).
L12: (D&C) Closest points in the plane
Input: n points (x1,y1),...,(xn,yn)
Output: the two closest points
dist(p, p′) = √((x − x′)^2 + (y − y′)^2)
Trivial algorithm: Θ(n^2)
Now: Θ(n·log(n))
Master theorem
Case 1: Many leaves so the cost is mostly the leaves. More expensive to solve subproblems.
Case 2: Leaves and root are the same cost. Same to solve subproblems.
Case 3: Few leaves so the cost is mostly the root. Cheaper to solve subproblems.
To solve for the closest points in a plane
Divide the plane in half
Find the dividing line by presorting (by x-coordinate)
Solve each half recursively and deal with the border somehow
Whatever we get as input, our children should get as input
Algorithm
Presort (n log n Time)
P_x = all n points sorted with respect to x-axis
P_y = all n points sorted with respect to y-axis
Divide (Linear Time)
Q_x = left half of points sorted with respect to x-axis
Q_y = left half of points sorted with respect to y-axis
R_x = right half of points sorted with respect to x-axis
R_y = right half of points sorted with respect to y-axis
Merge
d = min(d_left, d_right)
d_final = min(d_left, d_right, d_border)
We want to merge in linear time to get n log n time.
Discard all points that are more than delta (d) away from the dividing line.
We use P_y to get the points closer to the border in linear time.
L13: (D&C) FFT - Fast Fourier Transforms
D&C = Recursion with non-overlapping subproblems
DP = Recursion with overlapping subproblems
Develop recursively
Implement iteratively
Don’t use memoization here, because it hides the structure of the recursion.
Master theorem
f(n) is the cost of the root, n^(log_b(a)) is the cost of the leaves.
The Master theorem only applies when f(n) and n^(log_b(a)) are polynomially separated; it does not apply otherwise.
Dynamic programming (5)
L14: (DP) Scheduling
Job selection
Input: wages w1,w2,...,wn (wages for 1 day of work)
Output: Earn as much as possible
Constraint: You cannot work 2 days in a row
Examples
E.g. [4,3,2,4,1] (n = 5)
Work on days 1 and 4 to earn 8.
E.g. [6,9,2,1,4,5,3] (n = 7)
Work on days 2, 5, and 7 to earn 16.
E.g. [6,9,2,1,4,5,3,4] (n = 8)
Work on days 2, 4, 6, and 8 to earn 19.
Greedy doesn’t work because we can’t extend the solution from n = 7 to n = 8.
A small change at the end changes the solution so this is a global problem.
Solution
Recursively think about solving the subproblems.
Either we work day n or we don’t work day n.
If we work day n, then we can’t work day n - 1. Work optimally among the first n - 2 days.
If we don’t work day n, then we work optimally among the first n - 1 days.
We pick the better of the two conditionals.
Proof
Base case
(n = 1) work day 1
(n = 2) work the day with the higher wage
The important idea is to ask what is the optimal value from the children.
What value do you want from your children?
C[k] = optimal value we can earn among days 1,2,...,k. CORE OF ALGORITHM
C[n]=max{wn+C[n−2],C[n−1]}. Either we work for wage wn or we don’t work.
C[k]=max{wk+C[k−2],C[k−1]} for k≥3.
C[2]=max{w1,w2}
C[1]=w1
Implementation (Bottom-Up)
E.g. [6,9,2,1,4,5,3,4] (n = 8)
Base case: C=[6,9]
Completed C: C=[6,9,9,10,13,15,16,19]
Output 19 as the total we will earn.
Remember to output a value from your algorithm.
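A bottom-up Python sketch of the recurrence above (0-indexed internally; names are my own):

```python
def best_earnings(wages):
    """Maximum total wage when you cannot work two days in a row."""
    if not wages:
        return 0
    if len(wages) == 1:
        return wages[0]
    C = [0] * len(wages)
    C[0] = wages[0]                      # C[1] = w1
    C[1] = max(wages[0], wages[1])       # C[2] = max(w1, w2)
    for k in range(2, len(wages)):
        C[k] = max(wages[k] + C[k - 2], C[k - 1])   # work day k, or don't
    return C[-1]

print(best_earnings([6, 9, 2, 1, 4, 5, 3, 4]))      # 19
```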
Weighted Interval Scheduling
We can make a small change to the problem (changing a weight to infinity) that changes the entire solution.
Think in terms of the number of jobs (n), not the number of days (d).
Solution
Presort the intervals with respect to finishing time. (Break ties arbitrarily)
Either we work job n and earn wage w
Work optimally among jobs that finish before job n starts.
Or we don’t work job n and don’t earn w
Work optimally among jobs 1, …, n-1
C[n] = maximum amount we can earn among jobs 1 through n.
C[k] = max amount we can earn among jobs 1 through k (k≥1)
Either we work job n: wn + max{C[j] : fj ≤ sn} (the best we can add comes from jobs that finish before job n starts)
Or we don't work job n: C[n−1] (work optimally among the first n−1 jobs)
Then we take the max of both cases to get C[n].
C[1]=w1 for the base case.
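A minimal Python sketch of this DP. For simplicity it scans backwards for the latest job that finishes before job k starts, giving O(n^2) time; a binary search over finishing times would bring this down to O(n log n). The job representation and names are my own:

```python
def weighted_interval_scheduling(jobs):
    """jobs: list of (start, finish, wage). Returns the maximum total wage."""
    jobs = sorted(jobs, key=lambda j: j[1])          # presort by finishing time
    n = len(jobs)
    C = [0] * (n + 1)                                # C[k]: best wage among jobs 1..k
    for k in range(1, n + 1):
        s, f, w = jobs[k - 1]
        take = w                                     # wage if we work job k ...
        for j in range(k - 1, 0, -1):                # ... plus the best among jobs ending by s
            if jobs[j - 1][1] <= s:
                take = w + C[j]
                break
        C[k] = max(take, C[k - 1])                   # work job k, or don't
    return C[n]

print(weighted_interval_scheduling([(0, 3, 5), (2, 5, 6), (4, 7, 5), (6, 9, 3)]))   # 10
```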
L15: (DP) Knapsack and Subset Sum
Our algorithm developed today is not poly-time.
Problem
Input
n items
weights w1,w2,...,wn
values v1,v2,...,vn.
Knapsack of capacity W
Output
Fill knapsack with as valuable load as possible (not overfill)
An optimal solution: {1,2,3}. The optimal value: $135
Solution
Either we pick item n
The knapsack capacity changes. Remaining capacity is W−wn where W is the knapsack capacity before we chose n. If it doesn’t fit then don’t consider it.
We earn the value of item n plus the optimal value among items 1 to n−1.
Or we don’t pick item n
Earn optimally among items 1 to n−1.
Remaining capacity doesn’t change.
C[n,W] = maximum value in knapsack of capacity W among items 1 through n.
C[k,T] = maximum value in knapsack of capacity T among items 1 through k.
A 2D array where the rows are the number of items and the columns are the capacity of the knapsack.
Populating the array takes Θ(n·W), which is not poly-time: the capacity W is written in the input with about log W bits, so Θ(n·W) is actually exponential in the input size (pseudo-polynomial).
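A bottom-up sketch of the Θ(n·W) table (names are my own):

```python
def knapsack(weights, values, W):
    """0/1 knapsack: maximum value packable with total weight at most W."""
    n = len(weights)
    # C[k][T]: maximum value using items 1..k in a knapsack of capacity T.
    C = [[0] * (W + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        w, v = weights[k - 1], values[k - 1]
        for T in range(W + 1):
            C[k][T] = C[k - 1][T]                        # don't pick item k
            if w <= T:                                   # item k fits
                C[k][T] = max(C[k][T], v + C[k - 1][T - w])
    return C[n][W]

print(knapsack([2, 3, 4], [30, 50, 55], 6))   # 85 (items 1 and 3)
```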
L16: (DP) Longest common subsequence (LCS)
Start from the ends (backs) of X and Y and recursively apply the following procedure.
If the end symbols match, then remove the end symbols and call this algorithm again.
If the end symbols don’t match, then call this algorithm twice. Once on the full first string and the second string with the end symbol removed, and once on the first string with the end symbol removed and the full second string.
Subproblem is a prefix of X and a prefix of Y.
C[i,j] = value of the subproblem LCS(x1,...,xi,y1,...,yj).
The storage of the subproblem solutions is a 2D array where the rows are prefixes of Y and the columns are prefixes of X. Each cell holds the length of the LCS of x1,...,xi and y1,...,yj. The output is the value at the top right of the 2D array.
Number of subproblems is Θ(n·m) because there are (n+1)(m+1) pairs of prefixes. Since each subproblem takes constant time, the runtime is Θ(n·m).
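A minimal sketch filling the table from the recursion above; C[i, j] holds the LCS length of the prefixes x1..xi and y1..yj (names and the 0-indexing details are my own):

```python
def lcs_length(X, Y):
    """Length of the longest common subsequence of strings X and Y."""
    n, m = len(X), len(Y)
    C = {}
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 or j == 0:
                C[i, j] = 0                     # an empty prefix has LCS length 0
            elif X[i - 1] == Y[j - 1]:          # end symbols match: use both
                C[i, j] = C[i - 1, j - 1] + 1
            else:                               # drop one end symbol or the other
                C[i, j] = max(C[i - 1, j], C[i, j - 1])
    return C[n, m]

print(lcs_length("appletrees", "trees"))        # 5
```

The printing routine below additionally assumes a direction marker ("Diagonal", "Horizontal", or vertical) was recorded per cell while filling the table.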
Printing the LCS
def PrintLCSRec(C, i, j, X, Y):            # define subroutine
    if i == 0 or j == 0:
        return
    if C[i, j] == "Diagonal":              # C[i, j] holds the direction marker recorded while filling the table
        PrintLCSRec(C, i - 1, j - 1, X, Y)
        print(X[i - 1])                    # X is 0-indexed in Python
    elif C[i, j] == "Horizontal":
        PrintLCSRec(C, i - 1, j, X, Y)
    else:
        PrintLCSRec(C, i, j - 1, X, Y)
PrintLCSRec(C, n, m, X, Y)                 # call subroutine
L17: (DP) Matrix problems
Matrix Chain Multiplication
Input
matrices A1,A2,..,An
integer n
dimensions p0,p1,...,pn. (Matrix Ai has dimensions (pi−1,pi))
Output
the fastest way (order of multiplication) to compute the product B=A1A2...An.
Dimensions of B (p0,pn), A1(p0,p1), A2(p1,p2), … An(pn−1,pn).
Cost of multiplying two matrices, one of (p,q) and one of (q,r), is pqr.
Example
A1(4,6),A2(6,2),A3(2,2),B(4,2)
Two multiplications: (A1A2)A3 or A1(A2A3).
The first one costs 64, the second one costs 72. So the first chain multiplication is cheaper.
The key insight is that there’s a last multiplication, we just don’t know where.
Last multiplication
Cost(A1..n) = min over 1 ≤ k ≤ n−1 of { Cost(A1..k) + Cost(A(k+1)..n) + p0·pk·pn }
The cost equation is the smallest cost to compute the product A1·A2·...·An.
General cost: Cost(Ai..j) = min over i ≤ k ≤ j−1 of { Cost(Ai..k) + Cost(A(k+1)..j) + p(i−1)·pk·pj }
Base cases: Cost(Ai..i) = 0 for 1 ≤ i ≤ n.
Solution storage: C[i, j] = Cost(Ai..j), stored in a 2D array.
Rows are j, columns are i.
Output the cell at i = 1 and j = n so C[1,n].
Base cases are the diagonal so we get a diagonal of zeros.
We don’t use half of C (bottom-right triangle) because i must be less than j.
We fill out C diagonally.
Solving one cell in C requires the solutions down and right.
It takes Θ(n) per cell for Θ(n^2) cells, so the algorithm takes Θ(n^3).
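A bottom-up Python sketch that fills C diagonal by diagonal, matching the recurrence above (names are my own):

```python
def matrix_chain_cost(p):
    """p = [p0, p1, ..., pn]; matrix A_i has dimensions (p[i-1], p[i]).
    Returns the minimum scalar-multiplication cost of computing A_1 ... A_n."""
    n = len(p) - 1
    C = [[0] * (n + 1) for _ in range(n + 1)]      # C[i][j] = Cost(A_i..j), 1-indexed
    for length in range(2, n + 1):                 # fill diagonal by diagonal
        for i in range(1, n - length + 2):
            j = i + length - 1
            C[i][j] = min(C[i][k] + C[k + 1][j] + p[i - 1] * p[k] * p[j]
                          for k in range(i, j))    # k = place of the last multiplication
    return C[1][n]

print(matrix_chain_cost([4, 6, 2, 2]))   # 64: (A1 A2) A3 is the cheaper order
```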