
Joel Spencer
Consider a set system, a.k.a. hypergraph, where
is the ground set and
where
. We wish to color the ground set $V$ with two colors, say red and blue, in
such way that all sets in the family are colored in a “balanced” way, i.e., each set has nearly the same number of red and blue points.
As it can be seen from the family this is not possible, since by the pidgeonhold principle at least one color will appear at least
times and all the possible subsets of those points will be monochromatic. We formalize the above ideas immediately. It shall be convenient to use in the place of red/blue colorings, the coloring
. For any
define
. Define the discrepancy of
with respect to
by
. The discrepancy of
is
. It is worth outlining that the discrepancy can be defined in a linear algebraic way. Specifically, let
be the
incidence matrix of
. Then,
. We start with the following theorem, which is an easy one as long as you know Chernoff bounds.
Theorem: Let . Then,
.
Proof: Select a coloring uniformly at random from the set of all possible random colorings. Let us call
bad if its discrepancy exceeds
. Applying the Chernoff bound for set
we obtain:
.
Using a simple union bound we see that . This suggests that with positive probability there exists a coloring that achieves discrepancy less than
. QED
The above theorem is non-constructive but with little effort can be turned into a randomized algorithm. Specifically, if we set and do the same analysis as above we see that the failure probability is
. If we generate
times a random coloring then with probability at least
we succeed in finding at least one coloring which achieves discrepancy at most
. The next natural question is whether this randomized algorithm can be turned into a deterministic algorithm. The answer is yes, and specifically through the method of conditional expectation. Here is a sketch of how to do it.
We assign colors successively in a greedy manner. The criterion of greediness here will be the conditional expectation of
, where
is an indicator variable taking value 1 if set
is “bad”.
Specifically we know . Let
be such that
. Notice that this suggests $latex E[B]=E[E[B|\chi(1)]] \geq E[B|\chi(1)=\epsilon_1]$. In general pick
in such way that minimizes the function of
. At the end
. Notice that since the right-hand side is a non-negative integer less than 1, it is unavoidably equal to 0. It is easy to check that this procedure can be done in polynomial time. Check it yourselves. QED
In the next post we will discuss Spencer’s beautiful result known as “6 standard deviations suffice”. Neat staff.
Appendix
Chernoff bound: Let be mutually independent random variables with
and set following the usual convention $latex S_n=X_1+..+X_n$ . Let
. Then,
.