- Maximum Flow and Minimum Cut
- Flow Network
- Cut, Cut-Set, \(s\)-\(t\) Cut, Capacity of a Cut
- Mincut Problem
- Maxflow Problem
- Ford-Fulkerson Algorithm
- Flow Cut Relationship
- Augmenting Path Theorem
- Maxflow-Mincut Theorem
- Proofs
- Application: Bipartite Matching
- Application: Baseball Elimination
- Radix Sorts
- Key-Indexed Counting
- LSD Radix Sort
- MSD Radix Sort
- 3-way Radix Quicksort (Multi-key quicksort)
- Suffix Sort
- Manber-Myers Algorithm
- Tries
- R-way Trie
- Ternary Search Trie (TST)
- TST with R² Branching at the Root
- Radix tree
- Suffix Tree
- Trie-based Character Operations
- Substring Search
- Regular Expressions
- Data Compression
- Other Notes

The assignment of flow values to edges is called an \(st\)-flow.


- There are many algorithms to solve maxflow problems
- Best in-practice: push-relabel method with gap relabeling (in \(E^{3/2}\))
- Worst-case time complexity is generally not useful for predicting or comparing maxflow algorithm performance in practice

- Start with 0 flow at all edges
- Repeatedly find an "augmenting" *undirected* path from \(s\) to \(t\) such that the flow on forward edges is less than the capacity and the flow on backward edges is more than 0.
- Increase/decrease the flow along the path on all forward/backward edges.
- Terminate when no augmenting paths are left.
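The loop above can be sketched in Python. This is a minimal Edmonds-Karp variant (BFS chooses the augmenting path); residual capacities on reverse edges stand in for "decrease flow on backward edges". The `capacity` dict encoding is an illustrative choice, not from the source.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Ford-Fulkerson with BFS augmenting paths (Edmonds-Karp sketch).
    capacity: dict {(u, v): cap} over directed edges."""
    residual = dict(capacity)
    adj = {}
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)   # reverse edge for "undoing" flow
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                  # no augmenting path left: maxflow
        # bottleneck capacity along the path
        bottleneck = float('inf')
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        # augment: increase forward edges, decrease backward edges
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck
```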

The performance of this algorithm depends heavily on the augmenting path selection algorithm.

Note that the method of finding augmenting paths is not specified in the Ford-Fulkerson algorithm. When BFS is used, the algorithm is referred to as the *shortest augmenting path* algorithm (also known as Edmonds-Karp).

With a bad augmenting path selection algorithm (e.g. longest path first), Ford-Fulkerson can be pretty slow (e.g. 200 iterations instead of 2, when each augmenting path adds only 1 instead of 100 units of flow).

So overall we know that \(\text{value of the flow} = \text{net flow across cut} \le \text{capacity of the cut}\).

Value of the maxflow \(=\) capacity of the mincut

1. There exists a cut whose capacity equals the value of the flow \(f\).
2. \(f\) is a maxflow.
3. There is no augmenting path with respect to \(f\).

Prove 1 \(\implies\) 2, ~3 \(\implies\) ~2 (i.e. 2 \(\implies\) 3), and 3 \(\implies\) 1; together these establish the equivalence of all three statements.

How to compute a mincut from a maxflow follows directly from the proof.

With a Ford-Fulkerson implementation specifically optimized for bipartite matching (the Hopcroft-Karp algorithm) the runtime is proportional to \(E\sqrt{V}\).

LSD radix sort can be used to efficiently sort a list of integers by grouping the numbers into bytes (or \(n\)-bit sized blocks) and sorting the string of bytes (or string of blocks).

LSD radix sort runs in \(\mathcal{O}(2\cdot nw)\) where \(n\) is the number of keys and \(w\) is the key length.
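A minimal sketch of LSD radix sort for fixed-length string keys: one stable pass of key-indexed counting per character position, rightmost first. `R = 256` assumes extended-ASCII keys.

```python
def lsd_sort(keys):
    """LSD radix sort for fixed-length ASCII string keys (R = 256)."""
    if not keys:
        return []
    w = len(keys[0])
    assert all(len(k) == w for k in keys), "LSD needs fixed-length keys"
    R = 256
    a = list(keys)
    for d in range(w - 1, -1, -1):        # rightmost character first
        count = [0] * (R + 1)
        for key in a:                     # count character frequencies
            count[ord(key[d]) + 1] += 1
        for r in range(R):                # frequencies -> start indices
            count[r + 1] += count[r]
        aux = [None] * len(a)
        for key in a:                     # distribute; stability is key
            aux[count[ord(key[d])]] = key
            count[ord(key[d])] += 1
        a = aux
    return a
```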

Note that key-indexed counting never calls `less()`: it is not a comparison-based sort, so the \(\sim n \log n\) lower bound on compares does not apply.

MSD radix sort is generally linear \(\mathcal{O}(2\cdot nw)\) but can be sublinear \(\mathcal{O}(n \log_R n)\) in input size, since for (almost) random keys only very few digits need to be examined to determine the correct order.

Note: LSD radix sort is generally only defined for fixed length keys whereas MSD radix sort can operate on mixed length keys.
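A compact MSD sketch for variable-length keys. For brevity it buckets with a dict rather than the usual count[]/aux[] arrays, and sends exhausted keys to a virtual \(-1\) bucket so that shorter keys come first.

```python
def msd_sort(keys):
    """MSD radix sort sketch for variable-length string keys.
    Buckets on the d-th character, then recurs within each bucket
    on character d+1."""
    def sort(a, d):
        if len(a) <= 1:
            return a
        buckets = {}
        for key in a:
            c = ord(key[d]) if d < len(key) else -1   # -1 = end of key
            buckets.setdefault(c, []).append(key)
        out = []
        for c in sorted(buckets):
            if c == -1:
                out.extend(buckets[c])    # exhausted keys are all identical here
            else:
                out.extend(sort(buckets[c], d + 1))
        return out
    return sort(list(keys), 0)
```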

- Accesses memory randomly (→ cache inefficient)
- Needs lots of extra space for count[] and aux[]
- Inner loop with a lot of instructions

- Linearithmic number of string compares
- Has to rescan many characters in keys with long prefix matches (since comparison with the pivot is done by comparing the entire strings)

3-way string/radix quicksort combines radix sort with quicksort to create a sorting algorithm with quicksort-like performance characteristics that performs better than quicksort when many keys share prefixes (as is common in many applications of sorting).

- Short inner loop
- Cache friendly
- In place

3-way string quicksort partitions the array on the \(d\)th character based on a selected pivot character (median/random/...) such that the first/second/third partition contains keys whose \(d\)th character is less than/equal to/greater than the pivot character. Then the algorithm is applied recursively to the partitions, advancing to character \(d+1\) only in the "equal" partition.
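The partitioning scheme can be sketched as follows (without median-of-three pivoting or the insertion-sort cutoff for small subarrays):

```python
def three_way_string_qsort(a):
    """3-way radix quicksort (in place) on a list of strings."""
    def char_at(s, d):
        return ord(s[d]) if d < len(s) else -1   # -1 = end of key

    def sort(lo, hi, d):
        if hi <= lo:
            return
        pivot = char_at(a[lo], d)         # pivot on d-th char of first key
        lt, i, gt = lo, lo + 1, hi
        while i <= gt:                    # Dijkstra-style 3-way partition
            c = char_at(a[i], d)
            if c < pivot:
                a[lt], a[i] = a[i], a[lt]; lt += 1; i += 1
            elif c > pivot:
                a[i], a[gt] = a[gt], a[i]; gt -= 1
            else:
                i += 1
        sort(lo, lt - 1, d)               # keys with smaller d-th char
        if pivot >= 0:
            sort(lt, gt, d + 1)           # "equal" partition: next character
        sort(gt + 1, hi, d)               # keys with larger d-th char

    sort(0, len(a) - 1, 0)
    return a
```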

3-way radix quicksort benefits from typical quicksort optimizations like median-of-three pivoting and switching to insertion sort for small arrays. 3-way radix quicksort is isomorphic to ternary search trees in the same way that quicksort is isomorphic to binary search trees.

First an array of all suffixes of the string is created (in linear time and space using array views) then this array is sorted with any sorting algorithm (e.g. very efficiently with 3-way radix quicksort).
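A sketch of this approach, used here to find the longest repeated substring (adjacent entries of the sorted suffix array maximize the longest common prefix). Note the `s[i:]` slices copy, so this toy version uses quadratic space; real implementations compare suffixes through start indices instead.

```python
def suffix_array(s):
    """Indices of the suffixes of s in sorted order (toy version)."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def longest_repeated_substring(s):
    sa = suffix_array(s)

    def lcp(i, j):  # length of the longest common prefix of two suffixes
        n = 0
        while i + n < len(s) and j + n < len(s) and s[i + n] == s[j + n]:
            n += 1
        return n

    best = ""
    for a, b in zip(sa, sa[1:]):   # only adjacent suffixes need checking
        k = lcp(a, b)
        if k > len(best):
            best = s[a:a + k]
    return best
```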

When doing suffix sort with 3-way radix quicksort the worst-case input contains a very long longest repeated substring, which causes the algorithm to have a runtime quadratic (or worse) in \(D\), the length of the longest repeated substring. The suffix sort problem can also be solved in linearithmic time with the Manber-Myers algorithm or in linear time with suffix trees.

We can build balanced TSTs via rotations to achieve an \(L + \log N\) worst-case guarantee (where \(L\) is the key length and \(N\) the number of keys).

- Prefix match (find all keys that match a given prefix)
- Wildcard match (find all keys that match a given pattern)
- Longest prefix (find key that is the longest prefix of a given word, e.g. ip packet routing)
- Sorted iteration (inorder traversal of trie yields keys in sorted order)

- If character is not in pattern: skip over entire pattern
- Otherwise skip so that the rightmost occurrence of the mismatched character in the pattern lines up with the text. If this would shift the pattern backwards (to the left), just move the pattern one position to the right instead.

Step 2 can be performed efficiently by precomputing a table of rightmost character occurrences.
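A sketch of the mismatched-character heuristic, including the precomputed `right` table of rightmost occurrences (characters absent from the pattern get \(-1\)):

```python
def boyer_moore_search(text, pattern):
    """Boyer-Moore with the mismatched-character heuristic only.
    Returns the offset of the first match, or -1."""
    n, m = len(text), len(pattern)
    right = {c: -1 for c in set(text) | set(pattern)}
    for j, c in enumerate(pattern):       # rightmost occurrence wins
        right[c] = j
    i = 0
    while i <= n - m:
        skip = 0
        for j in range(m - 1, -1, -1):    # compare right to left
            if pattern[j] != text[i + j]:
                # align rightmost occurrence of the mismatched character,
                # but never shift the pattern backwards
                skip = max(1, j - right[text[i + j]])
                break
        if skip == 0:
            return i                      # full match at offset i
        i += skip
    return -1
```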

Substring search with the Boyer-Moore mismatched character heuristic takes about \(N/M\) character compares where \(N\) is the length of the string and \(M\) is the length of the pattern.

The worst-case of \(M\cdot N\) can be avoided by using a KMP (see above) hybrid rule to guard against repetitive patterns which brings the worst case runtime to \(3N\).

Above, \(Q=997\) is chosen as an arbitrary, large prime number. If the hash of the current substring matches the hash of the pattern the strings can be compared to see whether they actually match.

The hash computation can also be performed efficiently. Basically we have to evaluate the polynomial $$ x_i = t_iR^{M-1} + t_{i+1}R^{M-2} + \ldots + t_{i+M-1}R^{0} $$ for every starting position \(i\) and hash the result. To compute the hash of a single substring/pattern we can use Horner's method and apply the hash function between each step which ensures that the values can't grow arbitrarily large, no matter how long the pattern is.

We can then efficiently compute the hashes for sequential substrings in a running fashion (applying the hash function after each step): $$ x_{i+1} = (x_i - t_iR^{M-1}) R + t_{i+M} $$ Note that \(R^{M-1} \pmod{Q}\) can be precomputed (using Horner's method to prevent overflows).
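The hashing scheme above can be sketched as follows. This is the Las Vegas version (on a hash match the substring is verified explicitly); `q=997` is the small demo prime from above, while in practice a large random prime should be used.

```python
def rabin_karp_search(text, pattern, q=997):
    """Rabin-Karp substring search with a rolling hash mod q (R = 256)."""
    R = 256
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return -1

    def hash_of(s):
        h = 0
        for c in s[:m]:                   # Horner's method, mod q each step
            h = (h * R + ord(c)) % q
        return h

    rm = pow(R, m - 1, q)                 # R^(m-1) mod q, precomputed
    pat_hash = hash_of(pattern)
    txt_hash = hash_of(text)
    for i in range(n - m + 1):
        if txt_hash == pat_hash and text[i:i + m] == pattern:
            return i                      # verified match (Las Vegas)
        if i < n - m:                     # roll: drop text[i], add text[i+m]
            txt_hash = ((txt_hash - ord(text[i]) * rm) * R
                        + ord(text[i + m])) % q
    return -1
```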

Note that arithmetic ops in Rabin-Karp are generally slower than char compares.

If \(Q\) is a sufficiently large random prime (about \(MN^2\)), then the probability of a false collision is about \(1/N\). In practice, choose \(Q\) to be a large random prime; under reasonable assumptions the probability of a collision is about \(1/Q\).

Other common regex patterns like "[a-z]", "a+", "." (wildcard), "[0-9]{5}" (exactly k), ... can all be expressed through the notational devices listed above. Notations like "\0" (matches previous match) are not regular.

We use NFAs instead of DFAs in the following algorithm, since the DFA equivalent to an RE can have an exponential number of states.

- Create NFA from RE
- Simulate NFA with text as input
- If the NFA ends in a terminal state the text matches the pattern.

The NFA can be constructed in time proportional to \(M\), where \(M\) is the length of the RE with the following procedure:

- We are using states numbered from \(0\) to \(M\), one for each symbol in the RE + one terminal state.
- Every in-alphabet symbol (A, B,..) has an edge to the next state. (implicitly stored in the RE array)
- Create a separate directed graph to store \(\epsilon\)-transitions. There are simple rules to determine which \(\epsilon\)-edges are needed for each type of metacharacter.

Note: to create the \(\epsilon\)-transition digraph, maintain a stack that stores opening and closing parentheses and \(|\) (or) symbols.

- Maintain a set of all possible states that the NFA could be in after reading the first \(i\) characters.
- Read character \(i+1\) and update set of possible states after transitioning.
- Add all states reachable via \(\epsilon\) transitions by running dfs on the \(\epsilon\)-digraph starting from all possible current states.
- Repeat for all characters in the input.
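The simulation loop above, sketched in Python. The NFA construction itself is not shown; `re_chars` and `eps_edges` are assumed inputs encoding an already-built NFA, and the usage below hand-builds the machine for `a*b` following the standard construction, as an illustration.

```python
def nfa_match(re_chars, eps_edges, text):
    """Simulate an NFA with states 0..M (M = accept state).
    re_chars[i] is the symbol consumed by the match transition i -> i+1
    (metacharacters never match input); eps_edges is the eps-digraph."""
    M = len(re_chars)

    def eps_closure(states):
        # DFS on the eps-digraph from every state in the set
        stack, seen = list(states), set(states)
        while stack:
            u = stack.pop()
            for v in eps_edges.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    reachable = eps_closure({0})
    for c in text:
        # take match transitions, then close under eps-transitions
        matched = {s + 1 for s in reachable if s < M and re_chars[s] == c}
        reachable = eps_closure(matched)
        if not reachable:
            return False
    return M in reachable
```

For the RE `a*b` the states are 0 (`a`), 1 (`*`), 2 (`b`), 3 (accept), with \(\epsilon\)-edges \(0 \to 1\), \(1 \to 0\), and \(1 \to 2\).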

- Eulerian tour is tractable; Hamiltonian cycle is intractable; the complexity of graph isomorphism is unknown
- Find a topological sort with reverse DFS postorder
- Shortest paths: topological sort (restriction: no directed cycles), Dijkstra (restriction: no negative edges), Bellman-Ford (restriction: no negative cycles)
- We can also use Bellman-Ford to detect and find negative cycles (e.g. for arbitrage)