Empty bins

Suppose that m=cn balls,  where c>0 is a constant, are thrown sequentially to n bins, all of which are initially empty. Let X(i) be the number of empty bins after we have thrown i=0,\ldots,m balls. Clearly, n=X(0)>X(1)\geq X(2) \ldots \geq X(m). Let’s see how we can understand this sequence of random variables. When we move from step i to i+1 we throw a ball. It either falls in an empty bin or not. We quantify this by an indicator random variable Y_{i+1} which takes the value 1 if the i+1 ball falls in an empty bin, 0 otherwise. Notice that Prob(Y_{i+1}=1) = \frac{X(i)}{n}. We obtain:
X(i+1) = X(i) - Y_{i+1} and by taking expectations, E[X(i+1)-X(i)] = -\frac{ X(i)}{n}.
We now introduce a linear time scaling for this process. In particular, we define x(t) =\frac{1}{n} X(tn). In particular we map the i-th step of the process to time i/n. Our difference equation becomes x(t+1/n)-x(t) = - Y_{t+1}. By taking the limit  we obtain the differential equation dx/dt=-x. This implies x(t)=Ce^{-t} and since x(0)=1 we get x(t) = e^{-t}.
This implies that after throwing tn balls, the expected fraction of empty bins is e^{-t}. In particular after throwing i balls, we expect the number of empty bins to be really close to ne^{-i/n}. Using Azuma’s  inequality one can further prove a concentration result. This is a simple sketch of how the differential equations method is used to analyze discrete random processes. For those interested in finding out more, please  take a look at Nick Wormald’s expository paper “The differential equation method for random graph processes and greedy algorithms“.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: