IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies

November 3, 2014 | Lam-Tung Nguyen, Heiko A. Schmidt, Arndt von Haeseler, Bui Quang Minh
IQ-TREE is a fast and effective stochastic algorithm for estimating maximum-likelihood (ML) phylogenies. It combines hill-climbing approaches with stochastic perturbation to explore tree space efficiently. Given the same CPU time as RAxML and PhyML, IQ-TREE found higher likelihoods in 62.2% to 87.1% of the studied alignments. Under IQ-TREE's own stopping rule, RAxML and PhyML were faster in 75.7% and 47.1% of the DNA alignments and in 42.2% and 100% of the protein alignments, respectively, but IQ-TREE's share of higher likelihoods rose to 73.3–97.1%. IQ-TREE is freely available at http://www.cibiv.at/software/iqtree.

Phylogenetic inference under ML involves estimating substitution model parameters, branch lengths, and tree topology. Efficient methods exist for estimating the parameters, but finding the optimal tree topology is NP-hard, so search heuristics are used to find the "best" tree. ML tree searches improve the current tree through local rearrangements such as nearest neighbor interchange (NNI), subtree pruning and regrafting (SPR), or tree bisection and reconnection (TBR). Such searches can get stuck in local optima. Stochastic algorithms were developed to overcome this issue, but they have not matched SPR-based hill-climbing algorithms in likelihood maximization or computation time. IQ-TREE combines hill-climbing algorithms, random perturbation of the current best trees, and broad sampling of initial starting trees.

Comparative analyses on large DNA and amino acid (AA) alignments from TreeBASE showed that IQ-TREE often achieved higher likelihoods than RAxML and PhyML. When restricted to the same CPU time as RAxML and PhyML, IQ-TREE found higher likelihoods in 87.1% of DNA alignments and 62.2% of AA alignments. Compared with PhyML, IQ-TREE found higher likelihoods in 66.7% of AA alignments. IQ-TREE required more CPU time than RAxML for 75.7% of DNA alignments but was faster for 57.8% of AA alignments.
When not restricted to the running time of RAxML or PhyML, IQ-TREE found higher likelihoods in 97.1% of DNA alignments and 73.3% of AA alignments. Compared with PhyML, it found higher likelihoods in 91.4% of DNA and 77.8% of AA alignments. IQ-TREE's CPU time was generally comparable to that of RAxML, with an average difference of less than 13 minutes per run. IQ-TREE also outperformed PhyML in both log-likelihood and computation time.
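
The three ingredients named in this summary (hill-climbing, random perturbation of the current best candidates, and broad sampling of starting points) can be illustrated on a toy problem. The sketch below is not IQ-TREE's implementation and does not operate on trees: it maximizes a made-up rugged function over the integers 0 to 1000, where a step to an adjacent integer stands in for a local rearrangement. The function names, the objective, and the parameters `pool_size`, `max_jump`, and `patience` are all assumptions chosen for illustration.

```python
import math
import random


def score(x: int) -> float:
    """Toy rugged objective with many local optima (stands in for a log-likelihood)."""
    return -0.002 * (x - 700) ** 2 + 40.0 * math.sin(x / 7.0)


def hill_climb(x: int, lo: int = 0, hi: int = 1000) -> int:
    """Step to the better adjacent integer until no neighbor improves the score."""
    while True:
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbors, key=score)
        if score(best) <= score(x):
            return x  # local optimum reached
        x = best


def stochastic_search(starts, pool_size=5, max_jump=30, patience=50,
                      seed=0, lo=0, hi=1000):
    """Hill-climbing plus random perturbation of a small pool of best optima."""
    rng = random.Random(seed)
    # Broad sampling: hill-climb every start, keep the best few local optima.
    pool = sorted({hill_climb(s, lo, hi) for s in starts},
                  key=score, reverse=True)[:pool_size]
    stale = 0  # iterations since the best score last improved
    while stale < patience:
        # Perturb a randomly chosen pool member, then hill-climb again.
        base = rng.choice(pool)
        jumped = min(hi, max(lo, base + rng.randint(-max_jump, max_jump)))
        candidate = hill_climb(jumped, lo, hi)
        improved_best = score(candidate) > score(pool[0])
        # Admit the candidate only if it beats the worst pool member.
        if candidate not in pool and score(candidate) > score(pool[-1]):
            pool.append(candidate)
            pool = sorted(pool, key=score, reverse=True)[:pool_size]
        stale = 0 if improved_best else stale + 1
    return pool[0]


best = stochastic_search(range(0, 1001, 200))
print(best, round(score(best), 2))
```

Plain hill-climbing from a single start stops at the nearest peak of the sine term. The perturbation step lets the search leave that peak, and the `patience` counter plays the role of a stopping rule: the search ends once a fixed number of consecutive perturbations fail to improve the best score.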