Constraint-Based Learning of Interventional Markov Equivalence Classes on High-Dimensional Data
Directed Acyclic Graphs (DAGs) are a powerful tool to model the network of dependencies among variables. They provide a basis for causal discovery, and have been widely used in many fields, especially biology. Unfortunately, structure learning is quite non-trivial for DAG. One major difficulty is that some DAGs are unidentifiable with observational data only, and undirected edges cannot be resolved to directed edges.
The opportunity to apply interventions motivates interest in the smaller interventional Markov equivalence class. In this dissertation, we discuss how to modify the classic PC algorithm for causal discovery so that it can be used safely on interventional data. We introduce invariance relations on conditional distributions with different intervention targets that provide a powerful rule for edge orientation. There are several advantages of this rule: first, it does not require the Gaussian distribution assumption, instead a general structural equation model (SEM) of DAG is sufficient; second, it works for both (structural) intervention and soft intervention. Finally, we can merge some data blocks with different interventions for edge orientation.
A new constraint-based method is proposed to recover the interventional essential graph from the CPDAG (or called as observational essential graph) based on the invariance rule. We also establish consistency guarantees for both an interventional PC and an edge orientation algorithm under a sparse high-dimensional setting. Such high-dimensional consistency results are rarely seen in this area. It is also worthwhile to emphasize that the constraints on the family of interventions throughout this dissertation are mild. Finally, simulations are used to show the effectiveness of our method.