jModelTest: Phylogenetic Model Averaging

April 8, 2008 | David Posada
jModelTest is a new program, built on PhyML, for the statistical selection of nucleotide substitution models. It implements five selection strategies: hierarchical and dynamic likelihood ratio tests, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and a decision-theoretic performance-based approach. The program also calculates the relative importance of substitution parameters and model-averaged parameter estimates, including a model-averaged phylogeny. Written in Java, jModelTest runs on Mac OS X, Windows, and Unix systems with a Java Runtime Environment. It is freely downloadable from http://darwin.uvigo.es.

Models of nucleotide substitution allow the calculation of probabilities of change between nucleotides along the branches of a phylogenetic tree. Because the choice of substitution model can affect the outcome of a phylogenetic analysis, statistical model selection is an essential step in estimating phylogenies from DNA sequence alignments. Several programs exist for model selection, including Modeltest. jModelTest supersedes Modeltest by allowing restricted sets of candidate models, customizable hierarchical and dynamic likelihood ratio tests, and model ranking based on AIC, BIC, or the decision-theoretic performance-based approach. It also calculates parameter importance and model-averaged estimates, including the tree topology.

jModelTest is a front-end for a computational pipeline that delegates individual tasks to existing programs: ReadSeq for sequence alignment conversion, PhyML for likelihood calculations, Ted for performance-based model selection, and Consense for consensus tree calculation.

Likelihood calculations, including estimates of model parameters and trees, are performed with PhyML. The tree topology can be fixed across models or optimized for each model. Branch lengths are estimated and counted as parameters. jModelTest includes 11 nucleotide substitution schemes, which yield 88 distinct models when combined with base frequency, invariable sites, and rate variation options.
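As a rough illustration of where the figure of 88 comes from, the sketch below enumerates every combination of 11 substitution schemes with three on/off options. The scheme labels are placeholders, since this summary does not name the schemes. Treating each of the three extra components as a binary choice (equal or unequal base frequencies, with or without invariable sites, with or without among-site rate variation) is an assumption made here because 11 x 2 x 2 x 2 = 88. The +F, +I, and +G suffixes are illustrative naming only.

```python
from itertools import product

# Placeholder labels: the summary gives the number of schemes (11), not their names.
SCHEMES = [f"scheme{i}" for i in range(1, 12)]

# Assumed binary options for the three additional model components.
UNEQUAL_BASE_FREQS = [False, True]   # equal vs. unequal base frequencies
INVARIABLE_SITES = [False, True]     # without / with a proportion of invariable sites
RATE_VARIATION = [False, True]       # without / with among-site rate variation


def enumerate_models():
    """Return one label per combination of scheme and options."""
    models = []
    for scheme, freqs, inv, rates in product(
        SCHEMES, UNEQUAL_BASE_FREQS, INVARIABLE_SITES, RATE_VARIATION
    ):
        label = scheme
        if freqs:
            label += "+F"
        if inv:
            label += "+I"
        if rates:
            label += "+G"
        models.append(label)
    return models


models = enumerate_models()
print(len(models))  # 11 * 2 * 2 * 2 combinations
```

Restricting the candidate set, as the program allows, would correspond to shortening any of the four lists before taking the product.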
Sequential likelihood ratio tests (LRTs) can be carried out under a fixed hierarchy (hLRTs) or dynamically (dLRTs). The program also implements three information criteria: AIC, BIC, and a performance-based decision-theoretic approach (DT). A corrected AIC (AICc) is available for small samples. Model selection uncertainty is addressed by assigning a score to each model, from which AIC or BIC weights can be calculated. For DT the calculation is cruder: DT weights are simply the rescaled reciprocals of the DT scores. Confidence intervals (CIs) of models can be defined from the cumulative weights, by adding models in order of decreasing weight until a chosen cumulative weight is reached. Model-averaged phylogenies are computed by building a consensus of the maximum likelihood trees obtained under each model, weighted by the model weights (AIC, BIC, or DT). jModelTest is written in Java and is available for academic use from http://darwin.uvigo.es. It provides increased flexibility for exploring the data and the effect of the substitution model on phylogenetic tree estimation, and so helps address open questions in statistical phylogenetics.
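The scores and weights described above can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions of AIC, AICc, and BIC and of Akaike-style weights, not code taken from jModelTest. The function names, the three example models, their log-likelihoods and parameter counts, and the use of alignment length as the sample size are all hypothetical.

```python
import math


def aic(lnl, k):
    """AIC = -2 lnL + 2K, where K is the number of free parameters."""
    return -2.0 * lnl + 2.0 * k


def aicc(lnl, k, n):
    """Small-sample corrected AIC for sample size n (requires n > k + 1)."""
    return aic(lnl, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def bic(lnl, k, n):
    """BIC = -2 lnL + K ln(n)."""
    return -2.0 * lnl + k * math.log(n)


def score_weights(scores):
    """Turn AIC or BIC scores into weights: exp(-delta/2), rescaled to sum to 1."""
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


def reciprocal_weights(dt_scores):
    """DT weights as rescaled reciprocal DT scores, as the text describes."""
    inv = {m: 1.0 / s for m, s in dt_scores.items()}
    total = sum(inv.values())
    return {m: v / total for m, v in inv.items()}


def confidence_set(weights, level=0.95):
    """Add models by decreasing weight until the cumulative weight reaches level."""
    chosen, cumulative = [], 0.0
    for model, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(model)
        cumulative += w
        if cumulative >= level:
            break
    return chosen


# Hypothetical fits: model -> (log-likelihood, number of parameters incl. branch lengths)
fits = {"modelA": (-2500.0, 20), "modelB": (-2490.0, 24), "modelC": (-2489.5, 29)}
n_sites = 1000  # assumed sample size (alignment length)

aic_scores = {m: aic(lnl, k) for m, (lnl, k) in fits.items()}
bic_scores = {m: bic(lnl, k, n_sites) for m, (lnl, k) in fits.items()}
aic_w = score_weights(aic_scores)
bic_w = score_weights(bic_scores)

print(min(aic_scores, key=aic_scores.get), confidence_set(aic_w))
print(min(bic_scores, key=bic_scores.get), confidence_set(bic_w))
```

With these made-up numbers the two criteria prefer different models, because BIC penalizes each extra parameter more heavily than AIC once the sample size is large. That kind of disagreement is what model weights and model averaging are meant to summarize.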