Discrepancy I

Joel Spencer

Consider a set system, a.k.a. hypergraph, (V,\mathcal{F}) where V=[n] is the ground set and \mathcal{F}=\{A_1,\ldots,A_m\} where A_i \subseteq V. We wish to color the ground set $V$ with two colors, say red and blue, in
such way that all sets in the family are colored in a  “balanced” way, i.e., each set has nearly the same number of red and blue points.
As it can be seen from the family \mathcal{F}=2^{[n]} this is not possible,  since by the pidgeonhold principle at least one color will appear at least n/2 times and all the possible subsets of those points will be monochromatic.  We formalize the above ideas immediately. It shall be convenient  to use in the place of red/blue colorings, the coloring \chi:V\rightarrow\{-1,+1\}. For any A \subseteq V define \chi(A) =\sum_{i \in A} \chi(i). Define the discrepancy of \mathcal{F} with respect to \chi by \text{disc}_{\chi}(\mathcal{F}) = \max_{A_i \in \mathcal{F}} |\chi(A_i)|. The discrepancy of \mathcal{F} is  \text{disc}(\mathcal{F})=\min_{\chi}\text{disc}_{\chi}(\mathcal{F}). It is worth outlining that the discrepancy can be defined in a linear algebraic way. Specifically, let A be the m \times n incidence matrix of \mathcal{F}. Then, \text{disc}(\mathcal{F}) = \min_{x \in \{-1,+1\}} ||Ax||_{+\infty}. We start with the following theorem, which is an easy one as long as you know Chernoff bounds.

Theorem: Let |V|=n,|\mathcal{F}|=m. Then, \text{disc}(\mathcal{F}) \leq \sqrt{2n \log{(2m)}}.

Proof:  Select a coloring \chi uniformly at random from the set of all possible random colorings. Let us call A_i bad if its discrepancy exceeds t=\sqrt{2n\log{2m}}. Applying the Chernoff bound for set A_i we obtain:

\text{Prob}(A_i \text{~is bad})=\text{Prob}(|\chi(A_i)| > t)<

2\exp{\Big( -\frac{t^2}{2|A_i|} \Big)} \leq 2\exp{\Big( -\frac{t^2}{2n} \Big)}=\frac{1}{m}.

Using a simple union bound we see that \text{Prob}(\text{disc}(\mathcal{F})>t )= \text{Prob}(\exists \text{~bad~}A_i )< m \times \frac{1}{m}=1. This suggests that with positive probability there exists a coloring that achieves discrepancy less than t. QED

The above theorem is non-constructive but with little effort can be turned into a randomized algorithm. Specifically, if we set t=\sqrt{3n\log{2m}} and do the same analysis as above we see that the failure probability is \frac{1}{\sqrt{m}}. If we generate k times a random coloring then with probability at least 1-\frac{1}{m^{k/2}} we succeed in finding at least one coloring which achieves discrepancy at most t. The next natural question is whether this randomized algorithm can be turned into a deterministic algorithm. The answer is yes, and specifically through the method of conditional expectation. Here is a sketch of how to do it.

We assign colors \chi(1),\chi(2),... successively in a greedy manner. The criterion of greediness here will be the conditional expectation of B=\sum_{i=1}^m B_i, where B_i is an indicator variable taking value 1 if set A_i is “bad”.
Specifically we know E[B] < \frac{1}{\sqrt{m}}. Let \epsilon_1=\pm 1 be such that E[B|\chi(1)=\epsilon_1] \leq E[B|\chi(1)=-\epsilon_1]. Notice that this suggests $latex E[B]=E[E[B|\chi(1)]] \geq E[B|\chi(1)=\epsilon_1]$. In general pick \epsilon_k=\pm 1 in such way that minimizes the function of x E[B|\chi(1)=\epsilon_1,\ldots,\chi(k-1)=\epsilon_{k-1},\chi(k)=x]. At the end 1>\frac{1}{\sqrt{m}}>E[B]\geq E[B|\chi(1)=\epsilon_1,\ldots,\chi(n)=\epsilon_n]. Notice that since the right-hand side is a non-negative integer less than 1,  it is unavoidably equal to 0. It is easy to check that this procedure can be done in polynomial time. Check it yourselves. QED

In the next post we will discuss Spencer’s beautiful result known as “6 standard deviations suffice”. Neat staff.


Chernoff bound: Let X_i, 1 \leq i \leq n be mutually independent random variables with \text{Prob}(X_i=+1)=\text{Prob}(X_i=-1)=\frac{1}{2} and set following the usual convention  $latex S_n=X_1+..+X_n$ . Let \alpha>0. Then, \text{Prob}(S_n > \alpha)< e^{-\tfrac{\alpha^2}{2n}}.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: