## Risk-Averse Matchings over Uncertain Graph Databases

Probabilistic graph, each edge $e$ is annotated with $(p(e),w(e))$, its probability and its reward/weight. The matching $(A,B), (C,D)$ has higher expected weight than $(A,C), (B,D)$. However, the reward of the former matching is 0 with probability $\frac{1}{4}$, but the reward of the latter matching is 80 with probability 1.

A large number of applications such as querying sensor networks, and analyzing protein-protein interaction (PPI) networks, rely on mining uncertain graph and hypergraph databases. In this paper  we study the following problem:

Given an uncertain, weighted (hyper)graph, how can we efficiently find a (hyper)matching with high expected reward, and low risk?

For instance, the figure on the left shows an uncertain graph with two perfect matchings. While the matching $(A,B), (C,D)$ has expected reward $100 \times 2 \times \frac{1}{2}$, and the matching $(A,C), (B,D)$ expected reward 80, the former entails greater risk: with probability $(1-\frac{1}{2})\times (1-\frac{1}{2})$ the actual reward will be 0.

## Contributions

• Model. We introduce a general model of uncertain weighted hypergraphs, that subsumes prior work and is suitable for a wider variety of applications, including privacy related applications that inject continuous noise to the edge weights.  We provide a novel formulation for risk-averse (hyper)graph matchings.
• Approximation algorithms. For the case of uncertain weighted graphs, we provide a $\frac{1}{3}$-approximation algorithm, and a $\frac{1}{5}$-approximation algorithm with near optimal run time. For the case of uncertain weighted hypergraphs, we provide a $\Omega(\frac{1}{k})$-approximation algorithm, where $k$ is the rank of the hypergraph (i.e., any hyperedge includes at most $k$ nodes), that runs in almost (modulo log factors) linear time.
• Experiments. We complement our theoretical results by testing our approximation algorithms on a wide variety of synthetic experiments, where we observe in a controlled setting interesting findings on the trade-off between reward, and risk. We also provide an application of our formulation for providing recommendations of teams that are likely to collaborate, and have high impact.

## Data

We created an uncertain weighted hypergraph for the purpose of testing our methods from DBLP. The data is available here as pickled file. For a complete description, check our paper.