2006 | Ioannis Tsamardinos · Laura E. Brown · Constantin F. Aliferis
The Max-Min Hill-Climbing (MMHC) algorithm is a new method for learning the structure of Bayesian networks. It combines ideas from local learning, constraint-based, and search-and-score techniques. The algorithm first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. In empirical evaluations, MMHC outperforms several prototypical and state-of-the-art algorithms, including PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search. MMHC offers theoretical advantages, particularly over the Sparse Candidate algorithm, and is publicly available for further study.
Bayesian networks are mathematical constructs that compactly represent joint probability distributions. They are used for modeling domain knowledge in Decision Support Systems, particularly in medicine. Learning Bayesian networks from observational data is an important problem, as it can be used to automatically construct Decision Support Systems and infer possible causal relations. The theory of learning Bayesian networks has deep connections with variable selection for classification and has been used to design algorithms that optimally solve the problem under certain conditions.
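The compactness claim above comes from the factorization property of Bayesian networks: the joint distribution is the product of each variable's conditional distribution given its parents, so only local conditional tables need to be stored. The following is a minimal sketch with a made-up three-variable chain (the network, variable names, and probabilities are illustrative, not from the paper); note that three parameters suffice where a full joint table over three binary variables would need seven.

```python
from itertools import product

# Toy network (hypothetical): Cloudy -> Rain -> WetGrass, all binary.
# Each local model stores P(var = 1 | parent value).
p_cloudy = 0.5
p_rain_given_cloudy = {0: 0.1, 1: 0.8}
p_wet_given_rain = {0: 0.05, 1: 0.9}

def joint(c, r, w):
    """P(C=c, R=r, W=w) as the product of the local conditionals."""
    pc = p_cloudy if c else 1 - p_cloudy
    pr = p_rain_given_cloudy[c] if r else 1 - p_rain_given_cloudy[c]
    pw = p_wet_given_rain[r] if w else 1 - p_wet_given_rain[r]
    return pc * pr * pw

# Sanity check: the factored joint sums to 1 over all 2^3 assignments.
total = sum(joint(c, r, w) for c, r, w in product([0, 1], repeat=3))
```

Here three numbers (one per conditional table entry set) define the full eight-entry joint, which is the sense in which the representation is compact.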
The recent explosion of high-dimensional data sets in the biomedical realm and other domains has posed a serious challenge to existing Bayesian network learning algorithms. Current state-of-the-art algorithms do not reliably scale up to thousands of variables in reasonable time. In this paper, we present an algorithm, called Max-Min Hill-Climbing (MMHC), that is able to overcome these limitations. The algorithm scales to distributions with thousands of variables and pushes the envelope of reliable Bayesian network learning in terms of both time and quality across a large variety of representative domains.
There are two main approaches for learning Bayesian networks from data: the search-and-score approach and the constraint-based approach. The search-and-score approach attempts to identify the network that maximizes a score function indicating how well the network fits the data. The constraint-based approach estimates from the data whether certain conditional independencies between the variables hold. The Max-Min Hill-Climbing (MMHC) algorithm can be categorized as a hybrid method, using concepts and techniques from both approaches. MMHC first learns the skeleton of the Bayesian network using a local discovery algorithm called Max-Min Parents and Children (MMPC), then orients the skeleton using a greedy Bayesian-scoring hill-climbing search.
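The two-phase structure described above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: phase 1 stands in for MMPC with a plain pairwise mutual-information filter (the real MMPC uses conditional-independence tests with subset conditioning), and phase 2 is a greedy hill-climb under a BIC-style score whose edge additions are restricted to the learned skeleton. The toy data generator and thresholds are assumptions made for the example.

```python
import math
import random
from collections import Counter

random.seed(0)

# Toy binary data sampled from the chain A -> B -> C (illustrative only).
def draw():
    a = random.random() < 0.5
    b = random.random() < (0.9 if a else 0.1)
    c = random.random() < (0.9 if b else 0.1)
    return (int(a), int(b), int(c))

data = [draw() for _ in range(500)]
n_vars = 3

def mutual_info(i, j):
    """Sample mutual information (in nats) between variables i and j."""
    n = len(data)
    cij = Counter((r[i], r[j]) for r in data)
    ci = Counter(r[i] for r in data)
    cj = Counter(r[j] for r in data)
    return sum(c / n * math.log(c * n / (ci[x] * cj[y]))
               for (x, y), c in cij.items())

# Phase 1 (stand-in for MMPC): keep pairs with high pairwise dependence.
skeleton = {(i, j) for i in range(n_vars) for j in range(i + 1, n_vars)
            if mutual_info(i, j) > 0.05}

def bic(parents):
    """BIC-style score: log-likelihood minus a complexity penalty."""
    n, score = len(data), 0.0
    for v in range(n_vars):
        ps = parents[v]
        joint = Counter((tuple(r[p] for p in ps), r[v]) for r in data)
        ctx = Counter(tuple(r[p] for p in ps) for r in data)
        score += sum(c * math.log(c / ctx[pa]) for (pa, _), c in joint.items())
        score -= 0.5 * (2 ** len(ps)) * math.log(n)  # one free param per config
    return score

def creates_cycle(parents, u, v):
    """Would adding the edge u -> v close a directed cycle?"""
    stack, seen = [u], set()
    while stack:
        x = stack.pop()
        if x == v:
            return True
        if x not in seen:
            seen.add(x)
            stack.extend(parents[x])
    return False

# Phase 2: greedily add the best score-improving edge, but only along
# skeleton pairs, until no addition improves the score.
parents = {v: () for v in range(n_vars)}
while True:
    best, best_gain = None, 1e-9
    for (i, j) in skeleton:
        for u, v in ((i, j), (j, i)):
            if u in parents[v] or creates_cycle(parents, u, v):
                continue
            trial = dict(parents)
            trial[v] = parents[v] + (u,)
            gain = bic(trial) - bic(parents)
            if gain > best_gain:
                best, best_gain = (u, v), gain
    if best is None:
        break
    u, v = best
    parents[v] = parents[v] + (u,)

edges = {(u, v) for v in range(n_vars) for u in parents[v]}
```

On this toy chain the skeleton filter admits the true adjacencies, and the constrained hill-climb recovers A-B and B-C in some orientation; the key point is that phase 2 never considers edges outside the phase-1 skeleton, which is what makes the search tractable in high dimensions.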
MMHC can be viewed as a particular instantiation of the Sparse Candidate algorithm (SC), one of the first Bayesian network learning algorithms to be successfully applied to datasets with several hundred variables. In a similar fashion to MMHC, Sparse Candidate constrains the search of a search-and-score algorithm: each variable X is allowed to have parents only from within a predetermined candidate parent set C(X) of size at most k, where k is defined by the user. However, there are three main problems with the Sparse Candidate algorithm: the heuristic used to estimate the candidate sets is not sound, the user has to guess the parameter k, and the algorithm may require several costly iterations of the search before it converges.
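The candidate-set constraint described above can be sketched as follows. This is a hedged illustration of the constraint alone, not of the full Sparse Candidate algorithm; the variable names, the sets C(X), and the current graph are made up for the example.

```python
# Sparse Candidate-style constraint (illustration): during the search,
# variable X may only gain parents drawn from its candidate set C(X),
# whose size is bounded by a user-chosen k.
k = 2
candidates = {        # hypothetical C(X) for each variable, |C(X)| <= k
    "A": [],
    "B": ["A"],
    "C": ["A", "B"],
}

current_parents = {"A": set(), "B": set(), "C": {"B"}}

def legal_additions():
    """Edge additions u -> v the constraint allows (cycle checks omitted)."""
    return [(u, v) for v, cs in candidates.items()
            for u in cs if u not in current_parents[v]]
```

The contrast with MMHC is where the candidate sets come from: SC fills C(X) with an unsound heuristic and a guessed k, whereas MMHC derives the allowed neighbors of each variable from the MMPC-reconstructed skeleton, with no user-specified bound.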