2017 June | Subha Kalyaanamoorthy, Bui Quang Minh, Thomas KF Wong, Arndt von Haeseler, and Lars S Jermiin
ModelFinder is a fast and accurate method for selecting the best substitution model for phylogenetic analysis. It improves accuracy by incorporating a model of rate heterogeneity across sites not previously considered and by allowing simultaneous searches in model and tree space. The method uses a flexible rate-heterogeneity-across-sites (RHAS) model, including a probability-distribution-free (PDF) model, which allows for more complex rate distributions than the traditional discrete Gamma distribution. ModelFinder is implemented in IQ-TREE and supports various data types, including nucleotides, codons, and amino acids. It offers options for comparing models based on the same or different trees and includes 22 DNA substitution models, 36 protein substitution models, and 13 RHAS models, including the PDF model with up to 10 rate categories.
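To make the contrast with the discrete Gamma distribution concrete, here is a minimal Python sketch of a distribution-free rate model. It assumes the constraints commonly used for such models: the category weights sum to 1 and the weighted mean rate equals 1. The class name `FreeRateModel` and the numbers are hypothetical and are not taken from IQ-TREE.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FreeRateModel:
    """Hypothetical container for a k-category distribution-free rate model."""
    weights: List[float]  # proportion of sites in each rate category
    rates: List[float]    # relative substitution rate of each category

    def normalized(self) -> "FreeRateModel":
        # Rescale so the weights sum to 1 and the weighted mean rate is 1.
        total = sum(self.weights)
        weights = [w / total for w in self.weights]
        mean_rate = sum(w * r for w, r in zip(weights, self.rates))
        rates = [r / mean_rate for r in self.rates]
        return FreeRateModel(weights, rates)

    def free_parameters(self) -> int:
        # k weights plus k rates, minus the two constraints applied above.
        return 2 * len(self.weights) - 2


# Made-up starting values for a 3-category model.
model = FreeRateModel(weights=[2.0, 1.0, 1.0], rates=[0.1, 1.0, 4.0]).normalized()
```

Under these assumptions a k-category model of this kind has 2k - 2 free parameters, whereas a discrete Gamma distribution with equally probable categories has a single shape parameter. That difference is why the former can describe more complex rate distributions, at the cost of more parameters to estimate.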
ModelFinder was tested on 100 amino-acid alignments generated on a 100-tipped tree and showed accurate parameter estimation when using the correct tree and model. It performed well regardless of the optimality criterion (AIC, AICc, or BIC) or search option (default or advanced). The method was also tested on other phylogenetic data and showed significant improvements in the fit between tree, model, and data. ModelFinder outperformed other model-selection methods in terms of accuracy and was found to be more flexible and accurate than existing methods. It is particularly effective in detecting complex rate distributions and can identify models of sequence evolution that other methods are unable to detect.
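The three optimality criteria can be computed from a model's maximized log-likelihood. The sketch below uses the standard textbook definitions and assumes that the sample size n is the number of alignment sites. The function name, model names, and scores are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict


def information_criteria(log_likelihood: float, k: int, n: int) -> Dict[str, float]:
    """Return AIC, AICc and BIC for a model with k free parameters and n sites."""
    aic = 2 * k - 2 * log_likelihood
    # AICc adds a small-sample correction; it is undefined when n <= k + 1.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Made-up fits: model name -> (log-likelihood, number of free parameters).
candidates = {"simple": (-10250.0, 10), "complex": (-10230.0, 25)}
n_sites = 1000

scores = {
    name: information_criteria(lnl, k, n_sites)
    for name, (lnl, k) in candidates.items()
}
# The preferred model is the one with the lowest score under each criterion.
best = {
    criterion: min(scores, key=lambda name: scores[name][criterion])
    for criterion in ("AIC", "AICc", "BIC")
}
```

With these invented numbers, AIC and AICc prefer the parameter-rich model while BIC prefers the simpler one, because BIC penalizes each extra parameter by ln(n) rather than 2.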
ModelFinder is fast and efficient, with a 39- to 289-fold speedup compared to jModelTest and a 16- to 52-fold speedup compared to ProtTest. It is suitable for data evolving under time-reversible conditions but not for data evolving under non-time-reversible conditions. The method uses the expectation-maximization (EM) algorithm to estimate parameters of the PDF model and allows for the optimization of multiple parameters. ModelFinder is recommended for use with the advanced search option to avoid local optima and improve accuracy. The method has been validated on various data sets and is available in multiple phylogenetic programs, making it a versatile tool for phylogenetic analysis.
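As an illustration of how EM can estimate part of a rate-mixture model, the following sketch updates only the category weights while holding the per-category site likelihoods fixed. This is a simplifying assumption: a full implementation would also re-estimate the category rates and other parameters between iterations. The function name and the toy likelihoods are hypothetical and do not reproduce IQ-TREE's code.

```python
from typing import List


def em_update_weights(site_likelihoods: List[List[float]],
                      weights: List[float],
                      iterations: int = 100) -> List[float]:
    """Estimate mixture weights by EM with fixed per-category site likelihoods.

    site_likelihoods[i][k] is the likelihood of site i under rate category k.
    """
    n_sites = len(site_likelihoods)
    for _ in range(iterations):
        totals = [0.0] * len(weights)
        for row in site_likelihoods:
            # E-step: posterior probability that the site belongs to each category.
            joint = [w * lik for w, lik in zip(weights, row)]
            norm = sum(joint)
            for k, value in enumerate(joint):
                totals[k] += value / norm
        # M-step: each new weight is the mean posterior over all sites.
        weights = [t / n_sites for t in totals]
    return weights


# Toy data (made up): three sites favour category 0, one favours category 1.
toy = [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1], [0.1, 0.9]]
estimated = em_update_weights(toy, [0.5, 0.5])
```

Each iteration cannot decrease the likelihood of the data, which is the usual reason for choosing EM for mixture weights. It can still converge to a local optimum, so the starting values matter.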