Estimating high-dimensional directed acyclic graphs with the PC-algorithm

Estimating high-dimensional directed acyclic graphs with the PC-algorithm

February 2, 2008 | Markus Kalisch and Peter Bühlmann
The paper presents the PC-algorithm for estimating the skeleton of a high-dimensional acyclic directed graph (DAG) with a Gaussian distribution. The algorithm is computationally feasible for sparse problems with many nodes and automatically achieves high efficiency based on the sparsity of the true DAG. The authors prove that the PC-algorithm is consistent for very high-dimensional, sparse DAGs where the number of nodes grows as $ O(n^a) $ for $ 0 < a < \infty $. The sparsity assumption requires that the neighborhoods in the DAG are of lower order than the sample size. The algorithm is shown to be insensitive to the choice of its single tuning parameter, a significance level for testing conditional independence. Empirical results demonstrate that the PC-algorithm performs well in simulated data and is more reliable than alternative methods like GES and MWST, especially in high-dimensional settings. The algorithm is computationally feasible even when the number of nodes is much larger than the sample size, making it suitable for high-dimensional data. Theoretical analysis confirms the consistency of the PC-algorithm under minimal assumptions on sparsity and the decay of non-zero partial correlations. The algorithm is also shown to be robust to variations in the significance level, making it a practical and efficient method for estimating DAG skeletons in high-dimensional settings.The paper presents the PC-algorithm for estimating the skeleton of a high-dimensional acyclic directed graph (DAG) with a Gaussian distribution. The algorithm is computationally feasible for sparse problems with many nodes and automatically achieves high efficiency based on the sparsity of the true DAG. The authors prove that the PC-algorithm is consistent for very high-dimensional, sparse DAGs where the number of nodes grows as $ O(n^a) $ for $ 0 < a < \infty $. The sparsity assumption requires that the neighborhoods in the DAG are of lower order than the sample size. The algorithm is shown to be insensitive to the choice of its single tuning parameter, a significance level for testing conditional independence. Empirical results demonstrate that the PC-algorithm performs well in simulated data and is more reliable than alternative methods like GES and MWST, especially in high-dimensional settings. The algorithm is computationally feasible even when the number of nodes is much larger than the sample size, making it suitable for high-dimensional data. Theoretical analysis confirms the consistency of the PC-algorithm under minimal assumptions on sparsity and the decay of non-zero partial correlations. The algorithm is also shown to be robust to variations in the significance level, making it a practical and efficient method for estimating DAG skeletons in high-dimensional settings.
Reach us at info@study.space