Consider a set system, a.k.a. hypergraph, where is the ground set and where . We wish to color the ground set $V$ with two colors, say red and blue, in
such way that all sets in the family are colored in a “balanced” way, i.e., each set has nearly the same number of red and blue points.
As it can be seen from the family this is not possible, since by the pidgeonhold principle at least one color will appear at least times and all the possible subsets of those points will be monochromatic. We formalize the above ideas immediately. It shall be convenient to use in the place of red/blue colorings, the coloring . For any define . Define the discrepancy of with respect to by . The discrepancy of is . It is worth outlining that the discrepancy can be defined in a linear algebraic way. Specifically, let be the incidence matrix of . Then, . We start with the following theorem, which is an easy one as long as you know Chernoff bounds.
Theorem: Let . Then, .
Proof: Select a coloring uniformly at random from the set of all possible random colorings. Let us call bad if its discrepancy exceeds . Applying the Chernoff bound for set we obtain:
Using a simple union bound we see that . This suggests that with positive probability there exists a coloring that achieves discrepancy less than . QED
The above theorem is non-constructive but with little effort can be turned into a randomized algorithm. Specifically, if we set and do the same analysis as above we see that the failure probability is . If we generate times a random coloring then with probability at least we succeed in finding at least one coloring which achieves discrepancy at most . The next natural question is whether this randomized algorithm can be turned into a deterministic algorithm. The answer is yes, and specifically through the method of conditional expectation. Here is a sketch of how to do it.
We assign colors successively in a greedy manner. The criterion of greediness here will be the conditional expectation of , where is an indicator variable taking value 1 if set is “bad”.
Specifically we know . Let be such that . Notice that this suggests $latex E[B]=E[E[B|\chi(1)]] \geq E[B|\chi(1)=\epsilon_1]$. In general pick in such way that minimizes the function of . At the end . Notice that since the right-hand side is a non-negative integer less than 1, it is unavoidably equal to 0. It is easy to check that this procedure can be done in polynomial time. Check it yourselves. QED
In the next post we will discuss Spencer’s beautiful result known as “6 standard deviations suffice”. Neat staff.
Chernoff bound: Let be mutually independent random variables with and set following the usual convention $latex S_n=X_1+..+X_n$ . Let . Then, .