User:Kszlim/sandbox

Dynamic Programming

 * A problem must have optimal substructure and overlapping subproblems for dynamic programming to be applicable. Note that the overlapping subproblems should be only slightly smaller; if they are something like half the size of the original problem, we'd use divide and conquer instead.

Directly from Wikipedia:
 * Top-down approach: This is the direct fall-out of the recursive formulation of any problem. If the solution to any problem can be formulated recursively using the solution to its subproblems, and if its subproblems are overlapping, then one can easily memoize or store the solutions to the subproblems in a table. Whenever we attempt to solve a new subproblem, we first check the table to see if it is already solved. If a solution has been recorded, we can use it directly, otherwise we solve the subproblem and add its solution to the table.


 * Bottom-up approach: This is the more interesting case. Once we formulate the solution to a problem recursively in terms of its subproblems, we can try reformulating the problem in a bottom-up fashion: try solving the subproblems first and use their solutions to build on and arrive at solutions to bigger subproblems. This is also usually done in a tabular form by iteratively generating solutions to bigger and bigger subproblems by using the solutions to smaller subproblems. For example, if we already know the values of F41 and F40, we can directly calculate the value of F42. (Referring to the Fibonacci problem)
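The two approaches can be sketched for the Fibonacci example (a minimal sketch; function names are my own):

```python
from functools import lru_cache

# Top-down: the direct recursive formulation plus a memo table
# (here, lru_cache stands in for the hand-rolled table).
@lru_cache(maxsize=None)
def fib_top_down(n):
    if n < 2:                      # base cases F(0) = 0, F(1) = 1
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)

# Bottom-up: fill a table from the smallest subproblems upward,
# so F(42) is computed directly from the stored F(41) and F(40).
def fib_bottom_up(n):
    if n < 2:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]
```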

Step 1: Define your sub-problem. Tell us in English what your sub-problem means, whether it looks like P(k) or R(i, j) or anything else.

Step 2: Present your recurrence. Give a mathematical definition of your sub-problem in terms of “smaller” sub-problems.

Step 3: Prove that your recurrence is correct. Usually a small paragraph.

Step 4: State your base cases. Sometimes only one or two or three base cases are needed, and sometimes you'll need a lot (say O(n)).

Step 5: Present the algorithm. This often involves initializing the base cases and then using your recurrence to solve all the remaining sub-problems. You want to ensure that by filling in your table of sub-problems in the correct order, you can compute all the required solutions. Finally, generate the desired solution. Often this is the solution to one of your sub-problems, but not always.

Step 6: Analyze the running time. This is usually (number of sub-problems) × (time to compute each sub-problem from smaller ones).

Example recurrences:
 * Matrix Chain Multiplication: $$M(i,j) = \min_{i \leq k < j} \left( M(i,k) + M(k+1,j) + a_{i-1} a_k a_j \right)$$
 * 0/1 Knapsack: $$\mathrm{Value}(i,w) = \mathrm{Value}(i-1,w)$$ if $$w_i > w$$; $$\max(\mathrm{Value}(i-1,w),\ \mathrm{Value}(i-1,w-w_i) + v_i)$$ otherwise
 * Balls and Jars: $$\mathrm{Ways}(n,k) = \mathrm{Ways}(n,k-1) + \mathrm{Ways}(n-k,k)$$
 * Servers: $$\mathrm{MinCost}(i) = 0$$ if $$i > n$$; $$\min_{k \geq 1} \left( \mathrm{cost}(i,k) + \mathrm{MinCost}(i+k) \right)$$ otherwise
 * Travelling Salesperson: $$C(S,j) = \min_{i \in S,\ i \neq j} \left( C(S \setminus \{j\}, i) + d_{ij} \right)$$
 * Coin Change Making (denominations $$d_1, \dots, d_k$$): $$C(j) = 0$$ if $$j = 0$$; $$1 + \min_{1 \leq i \leq k,\ d_i \leq j} C(j - d_i)$$ otherwise
 * Moving on a Grid: $$A(i,j) = C(i,j) + \min(A(i-1,j-1),\ A(i-1,j),\ A(i-1,j+1))$$
 * Boolean Parenthesization: $$T(i,j) = \sum_{i \leq k < j} (\text{ways the left part } (i,k) \text{ and right part } (k+1,j) \text{ combine to true under the operator at } k)$$

Text Segmentation, bottom-up version (table[i].bq is the best quality of any segmentation of $$y_1 \dots y_i$$; table[i].last-end records where the last segment begins):

BOTTOM-UP-SEGMENT(y, n)
    for i ← 1 to n
        best-choice ← 0
        best-bq ← quality(y_1 ... y_i)
        for j ← 1 to i - 1
            bq ← table[j].bq + quality(y_{j+1} ... y_i)
            if bq > best-bq
                then best-choice ← j
                     best-bq ← bq
        table[i].bq ← best-bq
        table[i].last-end ← best-choice
    return table[n].bq
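A direct Python translation of the BOTTOM-UP-SEGMENT pseudocode (a sketch; the `quality` scoring function is assumed to be supplied by the caller):

```python
def bottom_up_segment(y, quality):
    """Bottom-up segmentation: table[i] holds the best total quality of
    any segmentation of the prefix y[0:i], plus where its last segment
    begins. quality(segment) is a caller-supplied scoring function."""
    n = len(y)
    table = [(0.0, 0)] * (n + 1)   # (best quality, start of last segment)
    for i in range(1, n + 1):
        best_choice = 0
        best_bq = quality(y[0:i])          # one segment covering the prefix
        for j in range(1, i):
            bq = table[j][0] + quality(y[j:i])  # split after position j
            if bq > best_bq:
                best_choice = j
                best_bq = bq
        table[i] = (best_bq, best_choice)
    return table[n][0]
```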

Aggregate Analysis
Show that for any n, a sequence of n operations takes worst-case time $$T(n)$$ in total. The average (amortized) cost per operation is then $$\frac{T(n)}{n}$$, and this cost applies to every operation in the sequence, not just to a single type of operation in the sequence.
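For example, aggregate analysis of table doubling: n appends to a doubling dynamic array cost at most 3n element writes in total, so each append is amortized O(1). A small simulation (the function name is mine):

```python
def append_cost_total(n):
    """Simulate n appends to a doubling dynamic array; return the total
    number of element writes (each append is 1 write, and each resize
    copies every existing element)."""
    capacity, size, total = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            total += size      # copy all existing elements to the new array
            capacity *= 2
        size += 1
        total += 1             # write the newly appended element
    return total
```

The copies form the geometric sum 1 + 2 + 4 + ... &lt; 2n, so the total stays under 3n even though individual appends occasionally cost Θ(n).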

Accounting Method
Charge some operations more than their actual cost early in the sequence, and use the stored excess credit to pay for other operations later in the sequence. Here, different operations can have different amortized costs. We require

$$\sum_{i=1}^{n} \widehat{c}_i \geq \sum_{i=1}^{n} c_i$$

Where $$\widehat{c}_i$$ is the amortized (charged) cost of the $$i^{th}$$ operation and $$c_i$$ is the actual cost of the $$i^{th}$$ operation. The total credit stored in the data structure is thus $$\sum_{i=1}^{n} \widehat{c}_i - \sum_{i=1}^{n} c_i$$ and we must always have nonnegative credit.

Potential Method
Similar to the accounting method, we store prepaid charges in the data structure, but we track the credit as a whole, not per individual operation. This "total credit" is the potential. We have a potential function $$\Phi$$ mapping each data-structure state to a real number, where $$\Phi(D_i)$$ is the potential stored in the data structure $$D_i$$ after the $$i^{th}$$ operation.

The amortized cost is $$\widehat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$$. Thus, the total amortized cost is the total actual cost plus the net change in potential:

$$\sum_{i=1}^{n} \widehat{c}_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_{0}) $$

If we require that $$\forall i \; \Phi(D_i) \geq \Phi(D_{0})$$, then the total amortized cost is an upper bound on the total actual cost, i.e., we pay for each operation in advance.


 * If $$\Phi(D_i) - \Phi(D_{i-1}) > 0$$, then $$\widehat{c}_i$$ represents an overcharge on the $$i^{th}$$ operation
 * If $$\Phi(D_i) - \Phi(D_{i-1}) < 0$$, then $$\widehat{c}_i$$ represents an undercharge on the $$i^{th}$$ operation, and the potential decrease pays for the actual cost of this $$i^{th}$$ operation
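A classic illustration is a binary counter with $$\Phi$$ = number of 1-bits: each INCREMENT pays for its single 0-to-1 flip plus stored credit, and the potential drop pays for all the 1-to-0 flips, so the amortized cost is at most 2 flips per increment. A small check (sketch; names mine):

```python
def counter_flip_total(n):
    """Total bit flips over n INCREMENTs of a binary counter starting at 0."""
    bits = [0] * 64
    total = 0
    for _ in range(n):
        i = 0
        while bits[i] == 1:        # 1 -> 0 flips, paid for by the potential drop
            bits[i] = 0
            total += 1
            i += 1
        bits[i] = 1                # the single 0 -> 1 flip per increment
        total += 1
    return total
```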

Optimal vs. Greedy

 * Optimization problem: The problem of finding the best solution from all feasible solutions, according to some metric.
 * Greedy algorithm: Repeatedly make the "locally best choice" until the choices form a complete solution.
 * Optimal substructure: Optimal solution to the problem is composed of pieces which are themselves optimal solutions to subproblems.
 * Greedy-choice property: Locally optimal choices can be extended to a globally optimal solution.

How does one prove that a greedy solution is optimal? Two ways:
 * Greedy algorithm stays ahead: Measure the greedy algorithm's progress in a step-by-step fashion, and show that it does better than any other algorithm at each step.
 * Exchange argument: Gradually transform any given solution to the greedy solution. E.g., an exchange argument for Activity Selection:
 * Let O be some optimal solution and G the greedy solution (where we pick the earliest finish time) to an activity selection problem.
 * Let Ai be the first activity in G that differs from O, and let ALT be the set of activities in O that conflict with Ai.
 * Three cases:
 * |ALT| = 0 (no conflicts): contradiction, because then we could add Ai to O and get a better solution, so O was not optimal.
 * |ALT| = 1: we can swap Ai for the single element of ALT, because Ai ends earlier than it and conflicts with nothing else in O. (If Ai did not end earlier, G would have just picked the ALT element instead.)
 * |ALT| >= 2: then ALT would have one element ending before Ai and another ending after Ai. This is a contradiction, because G would have picked the element of ALT that ended before Ai.

Since O's conflict with G is always a single activity that can be swapped, keep swapping until O == G. Now Size(O) == Size(G), so G gave us an optimal solution.
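The earliest-finish-time greedy rule sketched in Python (activities given as (start, finish) pairs; names mine):

```python
def select_activities(activities):
    """Greedy activity selection: repeatedly take the compatible
    activity with the earliest finish time."""
    selected = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:   # no conflict with what we have chosen
            selected.append((start, finish))
            last_finish = finish
    return selected
```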

Graph definitions
A graph G = (V, E) is composed of vertices $$V = \{v_1, v_2, ..., v_n\}$$ and edges $$E = \{e_1, e_2, ..., e_m\}$$ where each $$e_i$$ connects a pair of vertices $$(u_i, v_i)$$.
 * sparse: $$|E|$$ is much less than $$|V|^2$$
 * dense: $$|E|$$ is close to $$|V|^2$$
 * acyclic - A graph is acyclic if it contains no cycles; unicyclic if it contains exactly one cycle; and pancyclic if it contains cycles of every possible length (from 3 to the order of the graph).
 * weakly connected - replacing all of its directed edges with undirected edges produces a connected (undirected) graph
 * strongly connected - there is a path from each vertex in the graph to every other vertex. Implies graph is directed
 * spanning tree: a subset of edges from a connected graph that touches all vertices in the graph (spans the graph) and forms a tree (is connected and contains no cycles)
 * minimum spanning tree: the spanning tree with the least total edge cost
 * cut (S, V-S): a partition of the vertices into two nonempty sets, S and V-S; an edge crosses the cut if it has one endpoint in each set
 * light edge: an edge crossing a cut whose weight is minimum among all edges crossing that cut
 * any undirected graph (not necessarily connected) has a minimum spanning forest, which is a union of minimum spanning trees for its connected components.

Representing a graph in memory

 * adjacency-list representation: efficient for sparse graphs
 * memory required: $$O(V + E)$$
 * adjacency-matrix representation: efficient for dense graphs
 * memory required: $$O(V^2)$$
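Both representations sketched in Python for an undirected graph with vertices 0..n-1 (names mine):

```python
def adjacency_list(n, edges):
    """O(V + E) memory: one list of neighbors per vertex."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def adjacency_matrix(n, edges):
    """O(V^2) memory: mat[u][v] = 1 iff edge (u, v) exists."""
    mat = [[0] * n for _ in range(n)]
    for u, v in edges:
        mat[u][v] = mat[v][u] = 1
    return mat
```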

Kruskal (MST algorithm)
TODO: proof of correctness (the runtime analysis is below).


 * 1) Let $$Q$$ be a min-priority queue of the edges in E
 * 2) Let $$T = \{\}$$ (the minimum spanning tree being built)
 * 3) While $$Q$$ is not empty do:
 * 4) $$(u,v) = Q.deleteMin$$
 * 5) If $$u$$ is not already connected to $$v$$ (they are in different components)
 * 6) $$T = T \cup \{(u,v)\}$$
 * 7) Record that $$u$$ and $$v$$ are now connected.

Note: To efficiently implement the priority queue, use a binary heap. To efficiently implement checking for and changing connectivity, use the up-tree disjoint-set union-find data structure (with union-by-rank and path compression).

Where E is the number of edges in the graph and V is the number of vertices, Kruskal's algorithm can be shown to run in O(E log E) time, or equivalently, O(E log V) time (the two are equal because E < V^2, so log E < 2 log V), all with simple data structures. We can achieve this bound as follows: first sort the edges by weight using a comparison sort in O(E log E) time; this allows the step "remove an edge with minimum weight from Q" to operate in constant time. Next, we use a disjoint-set data structure (union-find) to keep track of which vertices are in which components. We need to perform O(E) operations: two find operations and possibly one union for each edge. Even a simple disjoint-set data structure such as disjoint-set forests with union by rank can perform O(E) operations in O(E log V) time. Thus the total time is O(E log E) = O(E log V).

If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).
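The steps above sketched in Python, with a union-by-rank, path-compressing union-find standing in for the connectivity checks (edges given as (weight, u, v) triples; names mine):

```python
import heapq

def kruskal(n, edges):
    """Minimum spanning tree/forest of a graph with vertices 0..n-1.
    edges: iterable of (weight, u, v) triples. Returns the chosen edges."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):                       # find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                   # union by rank; False if already joined
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    heap = list(edges)
    heapq.heapify(heap)                # min-priority queue of edges
    tree = []
    while heap:
        w, u, v = heapq.heappop(heap)
        if union(u, v):                # u and v were not yet connected
            tree.append((w, u, v))
    return tree
```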

Prim (MST algorithm)
The algorithm continuously increases the size of a tree starting with a single vertex until it spans all the vertices.

 * Input: A connected weighted graph with vertices V and edges E.
 * Initialize: Vnew = {x}, where x is an arbitrary node (starting point) from V; Enew = {}
 * Repeat until Vnew = V:
 * Choose edge (u, v) with minimal weight such that u is in Vnew and v is not (if there are multiple edges with the same weight, choose arbitrarily but consistently)
 * Add v to Vnew; add (u, v) to Enew
 * Output: Vnew and Enew describe a minimal spanning tree

Time Complexity:

There are a number of specialized heap data structures that either supply additional operations or outperform the above approaches. The binary heap uses O(log n) time for both insert and delete-min, and allows peeking at the highest-priority element without removing it in constant time. Binomial heaps add several more operations, but require O(log n) time for peeking. Fibonacci heaps can insert elements, peek at the highest-priority element, and improve an element's priority (decrease-key in a min-heap) in amortized constant time (deletions are still O(log n)). With a binary heap and an adjacency list, Prim's algorithm runs in O(E log V) time; with a Fibonacci heap, this improves to O(E + V log V).

Proof: Let P be a connected, weighted graph. At every iteration of Prim's algorithm, an edge must be found that connects a vertex in the subgraph built so far to a vertex outside it. Since P is connected, there will always be a path to every vertex. The output Y of Prim's algorithm is a tree, because each edge and vertex added to Y is connected to Y. Let Y1 be a minimum spanning tree of P. If Y1 = Y then Y is a minimum spanning tree. Otherwise, let e be the first edge added during the construction of Y that is not in Y1, and V be the set of vertices connected by the edges added before e. Then one endpoint of e is in V and the other is not. Since Y1 is a spanning tree of P, there is a path in Y1 joining the two endpoints. As one travels along the path, one must encounter an edge f joining a vertex in V to one that is not in V. Now, at the iteration when e was added to Y, f could also have been added, and it would have been added instead of e if its weight had been less than that of e. Since f was not added, we conclude that


 * $$w(f) \ge w(e).$$

Let Y2 be the graph obtained by removing f from Y1 and adding e. It is easy to show that Y2 is connected, has the same number of edges as Y1, and the total weight of its edges is not larger than that of Y1; therefore it is also a minimum spanning tree of P, and it contains e and all the edges added before it during the construction of Y. Repeat the steps above and we will eventually obtain a minimum spanning tree of P that is identical to Y. This shows Y is a minimum spanning tree.
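Prim's algorithm sketched with Python's binary heap (`heapq`), using stale-entry skipping instead of decrease-key; adjacency given as adj[u] = list of (weight, v) pairs (names mine):

```python
import heapq

def prim(adj, start=0):
    """Total MST weight of a connected weighted graph.
    adj[u] is a list of (weight, v) pairs."""
    visited = [False] * len(adj)
    heap = [(0, start)]                # candidate edges into the tree
    total = 0
    while heap:
        w, u = heapq.heappop(heap)
        if visited[u]:
            continue                   # stale entry; u is already in Vnew
        visited[u] = True
        total += w
        for weight, v in adj[u]:
            if not visited[v]:
                heapq.heappush(heap, (weight, v))
    return total
```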

Shortest-path (Dijkstra's)
Algorithm for Dijkstra's, analyzed runtimes, and proof.

Description of the algorithm
Suppose you want to find the shortest path between two intersections on a map, a starting point and a destination. To accomplish this, you could highlight the streets (tracing the streets with a marker) in a certain order, until you have a route highlighted from the starting point to the destination. The order is conceptually simple: at each iteration, create a set of intersections consisting of every unmarked intersection that is directly connected to a marked intersection; this is your set of considered intersections. From that set, find the intersection with the shortest known distance from the starting point (this is the "greedy" part, as described above), highlight it, mark the street to that intersection, and draw an arrow with the direction traveled; then repeat. In each stage mark just one new intersection. When you get to the destination, follow the arrows backwards. There will be only one path back against the arrows, and it is the shortest one.

Pseudocode
In the following algorithm, line 8 searches for the vertex u in the vertex set Q that has the least dist[u] value; that vertex is then removed from the set Q (line 11). dist_between(u, v) calculates the length of the edge between the two neighbor nodes u and v. The variable alt on line 13 is the length of the path from the root node to the neighbor node v if it were to go through u. If this path is shorter than the current shortest path recorded for v, that current path is replaced with this alt path. The previous array is populated with a pointer to the "next-hop" node on the source graph to get the shortest route to the source.

1  function Dijkstra(Graph, source):
2      for each vertex v in Graph:           // Initializations
3          dist[v] := infinity               // Unknown distance function from source to v
4          previous[v] := undefined          // Previous node in optimal path from source
5      dist[source] := 0                     // Distance from source to source
6      Q := the set of all nodes in Graph    // All nodes in the graph are unoptimized - thus are in Q
7      while Q is not empty:                 // The main loop
8          u := vertex in Q with smallest dist[]
9          if dist[u] = infinity:
10             break                         // all remaining vertices are inaccessible from source
11         remove u from Q
12         for each neighbor v of u:         // where v has not yet been removed from Q
13             alt := dist[u] + dist_between(u, v)
14             if alt < dist[v]:             // Relax (u,v)
15                 dist[v] := alt
16                 previous[v] := u
17     return dist[]
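The scan on line 8 of the pseudocode is commonly replaced by a binary heap; a Python sketch returning only the distances (adjacency given as adj[u] = list of (v, weight) pairs; names mine):

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from source; adj[u] = [(v, weight), ...]."""
    dist = [float("inf")] * len(adj)
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                   # stale queue entry; u already settled
        for v, w in adj[u]:
            alt = d + w                # relax edge (u, v)
            if alt < dist[v]:
                dist[v] = alt
                heapq.heappush(heap, (alt, v))
    return dist
```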

Asymptotic Notations

 * $$f(n) \in O(g(n))$$ iff $$\exists c \in \mathbf{R^+}, \exists n_0 \in \mathbf{Z^+}, \forall n \in \mathbf{Z^+}, n \geq n_0 \rightarrow f(n) \leq c g(n)$$
 * $$f(n) \in \Omega(g(n))$$ iff $$g(n) \in O(f(n))$$; equivalently, $$\exists c \in \mathbf{R^+}, \exists n_0 \in \mathbf{Z^+}, \forall n \in \mathbf{Z^+}, n \geq n_0 \rightarrow 0 \leq cg(n) \leq f(n)$$
 * $$f(n) \in \Theta(g(n))$$ iff $$f(n) \in O(g(n))$$ and $$g(n) \in O(f(n))$$; equivalently, $$\exists c_{1}, c_{2} \in \mathbf{R^+}, \exists n_0 \in \mathbf{Z^+}, \forall n \in \mathbf{Z^+}, n \geq n_0 \rightarrow 0 \leq c_{1}g(n) \leq f(n) \leq c_{2}g(n)$$
 * $$f(n) \in o(g(n))$$ iff $$\forall c \in \mathbf{R^+}, \exists n_0 \in \mathbf{Z^+}, \forall n \in \mathbf{Z^+}, n \geq n_0 \rightarrow f(n) < c g(n)$$
 * $$f(n) \in \omega(g(n))$$ iff $$g(n) \in o(f(n))$$; equivalently, $$\forall c \in \mathbf{R^+}, \exists n_0 \in \mathbf{Z^+}, \forall n \in \mathbf{Z^+}, n \geq n_0 \rightarrow 0 \leq cg(n) < f(n)$$

Also, if $$\displaystyle \lim_{n\to\infty} \frac{f(n)}{g(n)} =$$


 * $$\infty$$, then $$f(n) \in \omega(g(n))$$
 * a non-zero constant, then $$f(n) \in \Theta(g(n))$$
 * $$0$$, then $$f(n) \in o(g(n))$$

Complexity of Deterministic Select
$$T(n) \leq c \lceil n/5 \rceil + c(7n/10 + 6) + an$$
$$\leq cn/5 + c + 7cn/10 + 6c + an$$
$$= 9cn/10 + 7c + an$$
$$= cn + (-cn/10 + 7c + an)$$

This is at most $$cn$$ when $$-cn/10 + 7c + an \leq 0$$, i.e.:
$$cn/10 - 7c \geq an$$
$$cn - 70c \geq 10an$$
$$c(n - 70) \geq 10an$$
$$c \geq 10a \cdot \frac{n}{n - 70}$$

Since $$\frac{n}{n-70} \leq 2$$ for $$n \geq 140$$, choosing $$c \geq 20a$$ works for all $$n \geq 140$$, giving $$T(n) \in O(n)$$.
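The recurrence above comes from median-of-medians select; a compact, not-in-place sketch (names mine):

```python
def select(a, k):
    """k-th smallest element of a (0-indexed), worst-case linear time."""
    if len(a) <= 5:
        return sorted(a)[k]
    # Median of each group of 5, then recursively the median of those
    # medians; this pivot discards at least ~3n/10 elements per round.
    medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
               for i in range(0, len(a), 5)]
    pivot = select(medians, len(medians) // 2)   # the T(n/5) term
    lo = [x for x in a if x < pivot]
    hi = [x for x in a if x > pivot]
    if k < len(lo):
        return select(lo, k)                     # at most T(7n/10 + 6)
    if k < len(a) - len(hi):
        return pivot                             # pivot (and its duplicates)
    return select(hi, k - (len(a) - len(hi)))
```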

Log Laws
For all real a > 0, b > 0, c > 0, and n,
 * $$a = b^{\log_b a}$$
 * $$\log_c ab = \log_c a + \log_c b$$
 * $$\log_b a^n = n \log_b a$$
 * $$\log_b a = \frac{\log_c a}{\log_c b}$$
 * $$\log_b \frac{1}{a} = -\log_b a$$
 * $$\log_b a = \frac{1}{\log_a b}$$
 * $$a^{\log_b c} = c^{\log_b a}$$
 * $$\log 1 = 0 $$



Decision Tree-Related Notes

 * for a list of $$n$$ elements, there's $$n!$$ permutations
 * a binary tree of height $$d$$ has at most $$2^d$$ leaves
 * therefore, a binary tree with at least $$n$$ leaves must have height at least $$\lceil \lg n\rceil$$
 * to get the height of a tree: the longest path (say the subproblem shrinks by a factor of 3/2 at each level) finishes when $$\left(\frac{2}{3}\right)^k n = 1 \rightarrow k = \log_{\frac{3}{2}} n$$, so the height is $$\log_{\frac{3}{2}} n$$
 * $$\lg(n!) \in \Theta(n \lg n)$$, which we can establish by proving big-O and big-$$\Omega$$ bounds separately (pumping "up" or "down" the values of the terms in the factorial and the overall number of terms as needed)

REDUCTION
If we reduce problem A to problem B (A -> B), then A <= B (A is no harder than B).
 * Q. Why is reducing sorting to selection not a promising approach to develop a lower bound on the selection problem?
 * A. The best we could hope for is a lower bound of Ω(n lg n), "transferring" sorting's lower bound to selection; but we already know that selection can do better than this bound (i.e., O(n)), so the reduction is hopeless.
 * Q. Why is reducing selection to sorting also not a promising approach to develop a lower bound on the selection problem?
 * A. This reduction can establish a lower bound on sorting, but it cannot establish a lower bound on selection. So it's useless for this purpose.
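The second reduction (selection to sorting) as code, which shows why it only yields an upper bound on selection: the algorithm simply inherits sorting's O(n lg n) cost:

```python
def select_via_sort(a, k):
    """Reduce selection to sorting: sort, then index.
    Runs in O(n lg n): an upper bound for selection, not a lower bound."""
    return sorted(a)[k]
```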

Master theorem
If $$\displaystyle T(n) = aT(n/b) + f(n)$$ with $$T(n)$$ constant in the base case, where $$n/b$$ can be either $$\lfloor n/b \rfloor$$ or $$\lceil n/b \rceil$$, $$a \geq 1$$, and $$b > 1$$, then:
 * If $$f(n) \in O(n^{\log_b a - \epsilon})$$ for some $$\epsilon > 0$$, then $$T(n) \in \Theta(n^{\log_b a})$$
 * Dominated by leaf cost
 * If $$f(n) \in \Theta(n^{\log_b a})$$, then $$T(n) \in \Theta(n^{\log_b a} \lg n)$$
 * Balanced cost
 * If $$f(n) \in \Omega(n^{\log_b a + \epsilon})$$ for some $$\epsilon > 0$$, and if $$\displaystyle af(n/b) \leq cf(n)$$ for some constant $$c < 1$$ and all sufficiently large $$n$$, then $$T(n) \in \Theta(f(n))$$
 * Dominated by root cost. It is important in this case to check that f(n) is well behaved, i.e., for sufficiently large n, a·f(n/b) <= c·f(n) (the regularity condition). This condition always holds when $$f(n) = n^k$$ and $$f(n) \in \Omega(n^{\log_b a + \epsilon})$$.
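Worked instances of the three cases (standard textbook examples, stated in the theorem's notation):

```latex
\text{Case 1: } T(n) = 8T(n/2) + n^2:\; n^{\log_2 8} = n^3,\; n^2 \in O(n^{3-\epsilon}) \Rightarrow T(n) \in \Theta(n^3) \\
\text{Case 2: } T(n) = 2T(n/2) + n:\; n^{\log_2 2} = n,\; n \in \Theta(n) \Rightarrow T(n) \in \Theta(n \lg n) \\
\text{Case 3: } T(n) = 2T(n/2) + n^2:\; n^2 \in \Omega(n^{1+\epsilon}),\; 2(n/2)^2 = \tfrac{1}{2}n^2 \leq \tfrac{1}{2} f(n) \Rightarrow T(n) \in \Theta(n^2)
```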


Summation

 * Arithmetic Series: $$\displaystyle{\sum_{k=1}^n(k) = 1 + 2 + \dots + n} = \frac{1}{2}n(n + 1)$$
 * Sum of Squares: $$\displaystyle{\sum_{k=0}^n(k^2)} = \frac{n(n+1)(2n+1)}{6}$$
 * Sum of Cubes: $$\displaystyle{\sum_{k=0}^n(k^3)} = \frac{n^2(n+1)^2}{4}$$
 * Geometric Series: $$\displaystyle{\sum_{k=0}^n(x^k) = 1 + x + x^2 + \dots + x^n} = \frac{x^{n + 1} - 1}{x - 1},$$ for real $$x \ne 1$$
 * Infinite decreasing: $$\displaystyle{\sum_{k=0}^{\infty}(x^k) = \frac{1}{1 - x}}$$, for $$|x| < 1$$