# Union-find

Imagine we have a graph (e.g., a network) like this

and we are interested in reachability queries. For example

Is *g* reachable from *a*? — NO

Is there a path between *i* and *h*? — YES

However, to make things more interesting, our graph is not static

and we can link two nodes.

Link *f* and *j*:

Is there a path between *e* and *j*? — YES

Is there a path between *c* and *i*? — NO

Link *b* and *g*:

Is there a path between *c* and *i* now? — YES

Link *a* and *h*:

Link *a* and *d*:

Is there a path between *e* and *h*? — NO

Is there a path between *d* and *g*? — YES

This problem can be formulated a little more abstractly: given a collection of disjoint sets, we would like to support the following two operations:

`union(A,B)`

– replaces sets*A*and*B*with their union`find(x)`

– find set*X*where*x*belongs.

In our network application, the nodes reachable from each other form disjoint sets (called *connected components*). By linking two nodes from two different components, we get one big component – their union. Nodes *x* and *y* are connected by a path if and only if they belong to the same component, i.e., if `find(x)=find(y)`

.

Link *f* and *j*: union *C* and *E*

Link *b* and *g*: union *A* and *D*

Link *a* and *h*; link *a* and *d*

### Applications

Kruskal’s algorithm

Unification

Other:

Gilbert et al. (1994) An efficient algorithm to compute row and column counts for sparse Cholesky factorization. SIAM Journal on Matrix Analysis and Applications, 15:1075.

Gabow, H. and Tarjan, R. (1985). A linear-time algorithm for a special case of disjoint set union. Journal of computer and system sciences, 30(2):209–221.