CPSC448: Midterm preparation
Design a datastructure that can support these operations:
The structure should be able to handle thousands of operations per second.
Given n>=0, return the number of binary strings of length n that have no pairs of adjacent ones.
For example, if n=4, the string "0101" is counted, but the string "0110" is unacceptable because it has two adjacent ones.
Given an unweighted graph G and a source vertex s, output the vertices of G, sorted first by their depth with respect to s and then (if depths are equal) alphabetically. The vertices are specified as distinct strings.
Given a directed graph, determine whether it has cycles.
Given a weighted, directed graph, where each vertex can be either red, blue or gray, return the shortest distance between the red side and the blue side (i.e. the shortest path from some red vertex to some blue vertex).
If each operation requires O(log(n)) time in the number of elements, this should be good enough to satisfy the speed requirement. A datastructure that can perform all four operations in O(log(n)) time is a (balanced) binary search tree (BST).
To actually implement a BST in C++, the simplest approach is to use a set<int> (which is implemented as a red-black tree). Insert() can be done using insert(), Erase() - via the erase() member function, GetMax() - by dereferencing the rbegin() iterator and GetMin() - by dereferencing the begin() iterator.
Let T[n] denote the answer to the question for a given n value. Then T=1 and T=2. Consider the first digit in a binary string of size n. We want to know how many binary strings do not have adjacent ones. If we look at all the strings whose first digit is 0, then the rest of the digits can be anything, as long as there are no adjacent ones there. If, on the other hand, the first digit is 1, then the second digit (if it exists) is forced to be 0. It can not be 1 since that would give a pair of adjacent ones. So the number of binary strings of length n that do not have any adjacent ones and start with a 1 is exactly T[n-2]. Since every binary string starts with either 0 or 1, the total number T[n] is the sum of T[n-1] (for strings that start with a 0) and T[n-2] (for those that start with 1, and hence 10).
The equation T[n]=T[n-1]+T[n-2] together with the two initial conditions, T=1 and T=2, is a recurrence closely related to the Fibonacci numbers.
One way of computing T[n] (relatively) quickly is to use a table that stores all the T[.] values from 0 to n. The table can be filled by first setting the two initial values and then using simple dynamic programming to compute the values T, T, ..., T[n] directly from the recurrence equation.
First, run BFS on the graph with s being the source. This will produce a depth array d that will contain the depth of every vertex in G, with respect to s.
Next, create an array (or a vector) of all the vertices and sort it using a custom comparator function as illustrated below.
DFS is perfectly suited for this job, but DFS is not the only possible solution.
Pick any vertex of G and call dfs() with that vertex as the source. As you may remember, DFS keeps a record of the visited (seen) vertices. If you look in the book, there is a slight generalization of this idea. Each vertex is given a colour - either white, grey or black. White vertices have not been visited yet. Grey vertices are being visited right now (they correspond to the stack frames of calls to dfs()). Black vertices are those that have already been processed (dfs() has seen this vertex and returned from it).
Suppose dfs() is in the middle of its job of traversing the graph and it is now at the vertex u. Suppose also that a vertex v is gray. That means that there is a path from the source to v, and there is also a path from v to u. Now what if dfs() encounters an edge from u to v? Since there is already a path from v to u, that would mean that we have found a cycle! Thus, if dfs() is ever called with a grey vertex as a parameter, we can stop immediately and declare to have found a cycle.
One small detail to note here is that since the graph is not guaranteed to be connected, dfs() might not visit all of the vertices. If there are still white vertices left after dfs() returns, we will need to call dfs() again on all of the white vertices.
The code below illustrates the algorithm above. Here, dfs() will terminate normally if it does not find a cycle and it will throw an exception if it does find one. This makes the code shorter.
The trick is to convert the graph to a form in which a known shortest-path algorithm can be applied. To do this, add two vertices, s and t. Add an edge of weight 0 from s to every red vertex. Add an edge of weight 0 from every blue vertex to t. The colours of s and t are irrelevant (and meaningless). Now find the shortest path from s to t using, for example, the Bellman-Ford algorithm with s as the source. Return the length of the shortest path found.