Suppose there is an underlying 1 dimensional histogram stored in the cloud. As a concrete example, consider the distribution of bank deposits (x-axis is the amount of dollars, and the y-axis is the count of accounts). For simplicity let’s assume that all the amounts of money deposited are integers within the range . The histogram is queried by the bank in the following way: the bank sends a range
and the cloud returns the count of accounts with money between
. Let’s further assume that the range is selected uar, namely the bank selects the pair of indices
uniformly at random. how many queries are needed to reconstruct the histogram? This question is interesting because you can imagine that an eavesdropper can see both the query and the answer returned by the cloud and wishes to infer the histogram.
By Chernoff queries suffice with high probability. However,
queries suffice to reconstruct the high probability too. To see why consider a graph where we have
nodes, where the
-th node corresponds to the
-th prefix sum. Recall that the
-th prefix sum
is the number of bank accounts with less or equal to
dollars. If we throw, say
edges uar, by Erdős–Rényi the graph will be connected whp. This is enough to reconstruct the histogram.