2016 | OLGA CHERNOMOR, ARNDT VON HAESELER, AND BUI QUANG MINH
The article introduces a phylogenetic terrace aware (PTA) data structure for efficient phylogenomic inference from supermatrices under partition models. It exploits phylogenetic terraces, which are sets of trees that have the same likelihood or parsimony score because they induce identical trees on every partition. The PTA data structure consists of the species tree, the induced partition trees, and maps from species tree edges to partition tree edges. With it, partial terraces can be detected and handled efficiently during tree searches, which reduces computation time by avoiding unnecessary likelihood calculations. PTA is implemented in IQ-TREE, where it yields speedups of up to 4.5 and 8 times over the standard IQ-TREE and RAxML implementations, respectively. Because PTA applies to all partition models and common topological rearrangements, it is compatible with a range of phylogenomic inference software and can be integrated into existing ML software packages. The data structure was tested on 12 real alignments with varying levels of missing data and improved performance under all three partition models (EUL, EL-equal, EL-proportional). The results show that accounting for partial and full terraces substantially speeds up tree searches. The study also highlights the importance of considering terraces in phylogenetic inference to obtain accurate and reliable results, and the findings suggest that terrace-aware strategies are essential for handling gappy data efficiently in phylogenomic analyses.
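
The three components named above (species tree, induced partition trees, edge maps) can be illustrated with a minimal Python sketch. This is not the IQ-TREE implementation. It assumes trees are written as nested tuples of taxon names and that edges are represented by the splits (bipartitions) they define. All function names (`splits`, `restrict`, `build_pta`, `same_terrace`) and the toy taxa and partitions are hypothetical and chosen only for illustration.

```python
from itertools import chain


def leaves(tree):
    """Return the taxon names below a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def splits(tree):
    """Return the internal edges of a tree as a set of splits.

    Each split is a frozenset holding the two sides of the bipartition.
    """
    all_taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        for child in node:
            visit(child)
        side = leaves(node)
        other = all_taxa - side
        if len(side) >= 2 and len(other) >= 2:
            found.add(frozenset([side, other]))

    visit(tree)
    return found


def restrict(split, taxa):
    """Map a species tree split to a partition tree split.

    Returns None when the edge is not an internal edge of the induced tree.
    """
    sides = [side & taxa for side in split]
    if all(len(side) >= 2 for side in sides):
        return frozenset(sides)
    return None


def build_pta(species_tree, partition_taxa):
    """Build induced partition trees and edge maps for each partition."""
    species_splits = splits(species_tree)
    pta = {}
    for name, taxa in partition_taxa.items():
        taxa = frozenset(taxa)
        edge_map = {s: restrict(s, taxa) for s in species_splits}
        induced = {m for m in edge_map.values() if m is not None}
        pta[name] = {"induced": induced, "edge_map": edge_map}
    return pta


def same_terrace(tree_a, tree_b, partition_taxa):
    """Two trees share a terrace if all induced partition trees agree."""
    pta_a = build_pta(tree_a, partition_taxa)
    pta_b = build_pta(tree_b, partition_taxa)
    return all(pta_a[p]["induced"] == pta_b[p]["induced"] for p in partition_taxa)


# Toy data: D and E never occur together in a partition (hypothetical).
parts = {"p1": {"A", "B", "C", "D"}, "p2": {"A", "B", "C", "E"}}
t1 = (("A", "B"), ("C", ("D", "E")))
t2 = (("A", "B"), (("C", "D"), "E"))
t3 = (("A", "C"), ("B", ("D", "E")))

print(same_terrace(t1, t2, parts))  # True: different trees, same terrace
print(same_terrace(t1, t3, parts))  # False: partition trees differ
```

In this toy example `t1` and `t2` are different species trees, yet both partitions see the same induced tree, so a terrace-aware search could reuse the partition likelihoods instead of recomputing them. The sketch only tracks internal edges; a fuller edge map would also record which external edge of a partition tree an edge collapses onto.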