Directed separation in graphs

This is a simple graphical applet to calculate d- and star-separation in directed acyclic graphs. Here, one can construct a directed graph by mouse, select some sets of vertices and see what is separated given the chosen semantics.

You can try the applet right in the browser. It is part of my attempt to implement and understand the algorithms. The code can be found on Github. Explanations on how to use it and what this is about can be found below the applet.

How to use this

Right now, this applet isn’t very polished. (The reload button is called debugRefresh, to give you an idea…) Still, one can place nodes and insert directed edges (by first clicking on the source node, then the target node). There are two different separation semantics which can be toggled in the toolbar.

After placing some nodes one can select the subsets \(A\) (left-click, in blue) and $C$ (right-click, in orange) and the separated subset \(B\) gets updated immediately (in red).

What is happening here

As mentioned in the beginning, the question that this applet solves can be summarised as follows.

Given an directed graph \({\cal D}\) and two disjoint subsets \(A\) and \(C\) of its nodes, which set of nodes \(B\) is not reachable from \(A\) if we can ‘skip’ colliders in \(C\) but cannot pass non-colliders in \(C\)?

Here, a collider means a node \(C\) with a pair of edges \(A\rightarrow C\leftarrow B\), whereas a non-collider is any node without two such edges.

One the left side, a collider. On the right side, two configurations of three nodes that are not colliders.

Intuitively, when searching for a directed path from one node to another, we would strictly obey the directions of the edges. In this case, we do allow taking edges in any direction, unless our path would contain two edges meeting head-to-head, thus constituting a collider. With ‘skipping colliders’ we mean that we again allow traversing selected head-to-head edges.

D-connecting and star-connecting paths

For star-separation (or rather, star-connecting paths between two nodes), we only allow to skip at most one collider. This means that a d-connecting path might not be a star-connecting path anymore. But every star-connecting path is d-connecting.

In the picture above, path a) is d-connecting if we allow to skip over 4 and 5, thus 3 can be reached from 1. On the other hand, path b) is only star-connecting from 1 to 5, because up to this node we only need to skip at most one collider, namely 4. We cannot reach node 3 since we would have to skip another collider.

The motivation for this question about directed graphs comes from statistics. There, we might have a set of random observables $X = (X_1,\dots,X_n)$ whose causal relationships can be expressed in a directed acyclic graph whose nodes correspond precisely with the \(X_i\). This graph encodes the conditional independence statements between the random variables and we can associate separation of the nodes with conditional independence. The unblocking of colliders and blocking of non-colliders corresponds to conditioning on selected random variables.

In most types of graphical models, d-separation is sufficient to detect all conditional independence statements that hold for all networks with this graph.

Now, for so called max-linear Bayesian networks, d-separation doesn’t work (completely). Here, max-linear Bayesian networks means a graphical model with recursive structural equations \[ X_i = \bigvee_{j\in {\rm parent}(i)} c_{ij}X_j \vee Z_i. \] On the other hand, star-separation does work well for these types of models.

For further reading, the following paper are useful. The first one gives more insight in the situation for max-linear Bayesian networks. The second one contains the algorithm I implemented for the first version of this applet.