February 26, 2016 | Jaime Huerta-Cepas, François Serra and Peer Bork
ETE 3 is a computational framework for the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. It features an improved API library and a new set of standalone tools for comparative genomics and phylogenetics. The new features include building gene-based and supermatrix-based phylogenies with a single command, testing and visualizing evolutionary models, calculating distances between trees of different sizes, and integrating with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org.
The ete-build tool provides a unified interface for executing reproducible phylogenetic workflows, including the reconstruction of gene trees and supermatrix-based species trees. It relies on a versioned collection of external tools to run complex phylogenetic pipelines that cover sequence alignment, trimming, substitution-model testing, tree inference, and image rendering. The supermatrix-based reconstruction mode builds and concatenates multiple sequence alignments, simplifying the inference of species trees based on multiple genes. Advanced options include automatic switching between amino-acid and nucleotide alignments based on sequence identity, resuming workflows, and testing multiple strategies in parallel.
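The concatenation step behind a supermatrix can be illustrated with a short, self-contained Python sketch. This is not ete-build's implementation: the function name `concatenate_alignments`, the dictionary layout, and the choice to pad missing genes with gap characters are assumptions made for illustration only.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments into one supermatrix (illustrative sketch).

    alignments: list of dicts mapping species name -> aligned sequence.
    Species missing from a gene are padded with '-' (an assumption of
    this sketch, not a documented ete-build behaviour).
    """
    species = sorted({sp for aln in alignments for sp in aln})
    parts = {sp: [] for sp in species}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for sp in species:
            parts[sp].append(aln.get(sp, "-" * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy input: two already-aligned genes with partially overlapping species.
gene_a = {"human": "ATG", "mouse": "ATA"}
gene_b = {"human": "CC", "chimp": "CT"}
supermatrix = concatenate_alignments([gene_a, gene_b])
```

Every row of the resulting matrix has the same length, which is what a downstream tree-inference program generally expects from a concatenated alignment.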
The ete-evol tool automates CodeML/SLR-based analyses by using pre-configured evolutionary models and directly producing a graphical representation of the results. These models include site, branch, branch-site, and clade models. The tool can also test differential selective pressures along each branch of a given phylogeny, running the tests in parallel. Evolutionary measures from the best-fitting models are then plotted or interactively visualized by mapping the predicted selective pressures acting on sites and branches onto the tested topology, as well as onto the multiple sequence alignment.
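A common way to decide which of two nested evolutionary models fits better is a likelihood ratio test. The sketch below shows that calculation with the Python standard library only. It is an illustration under stated assumptions, not ete-evol's code: the function names are hypothetical and the log-likelihood values in the example are invented.

```python
import math


def chi2_survival(statistic, df):
    """Upper-tail probability of a chi-square distribution, integer df >= 1."""
    if df < 1 or statistic < 0:
        raise ValueError("need df >= 1 and statistic >= 0")
    half = statistic / 2.0
    if df % 2:                      # odd df: start from the df = 1 closed form
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:                           # even df: start from the df = 2 closed form
        a, q = 1.0, math.exp(-half)
    # Recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (test statistic, p-value) for a null model nested in an alternative."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return statistic, chi2_survival(statistic, extra_params)


# Invented log-likelihoods: the alternative model has two extra parameters.
stat, p_value = likelihood_ratio_test(-1000.0, -995.0, 2)
```

A small p-value suggests that the richer model explains the alignment significantly better than the simpler one.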
ETE v3 provides three measures to compute distances between trees: the Robinson-Foulds distance, a branch congruence measure, and the TreeKO speciation distance. It calculates all three distances at the same time, accepts trees of different sizes as well as trees containing duplication events, allows branches with low support to be filtered out, and is optimized for comparing large datasets. The TreeKO method for splitting gene trees into duplication-free subtrees has been optimized and integrated into ETE's API library.
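To make the first of these measures concrete, the following sketch computes a rooted Robinson-Foulds distance between two trees of different sizes by first restricting both to their shared leaves. It is a simplified illustration, not ETE's optimized implementation: trees are written as nested tuples, duplication events and branch supports are ignored, and all function names are hypothetical.

```python
def leaf_names(tree):
    """Collect leaf names from a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaf_names(child) for child in tree))


def clades(tree, keep):
    """Return (leaves under this node, set of clades), counting only leaves in keep."""
    if isinstance(tree, str):
        return (frozenset([tree]) if tree in keep else frozenset()), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_found = clades(child, keep)
        leaves |= child_leaves
        found |= child_found
    if len(leaves) > 1:             # single-leaf clades carry no information
        found.add(leaves)
    return leaves, found


def rooted_rf(tree_a, tree_b):
    """Number of clades found in exactly one tree, over the shared leaf set."""
    shared = leaf_names(tree_a) & leaf_names(tree_b)
    _, clades_a = clades(tree_a, shared)
    _, clades_b = clades(tree_b, shared)
    full = frozenset(shared)        # the root clade is trivially shared
    clades_a.discard(full)
    clades_b.discard(full)
    return len(clades_a ^ clades_b)


t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "c"), "b"), ("d", "e"))
t3 = (t1, "f")                      # same topology plus one extra leaf
distance_12 = rooted_rf(t1, t2)
distance_13 = rooted_rf(t1, t3)
```

Here `t1` and `t2` disagree only on the grouping of `a`, `b`, and `c`, while `t3` differs from `t1` only by a leaf that is dropped before the comparison.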
The ete-ncbiquery tool allows for efficient queries to the NCBI-taxonomy database, enabling tasks such as extracting pruned subtrees, converting NCBI taxids into scientific names, obtaining full lineage tracks, and annotating user-trees with taxonomic data. All queries are carried out locally, avoiding unnecessary lags and permitting the integration of the tool into genomic and metagenomic pipelines.
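The kind of local lookup described above can be sketched with plain dictionaries. This is not how ete-ncbiquery stores or queries the taxonomy: the tables below are a tiny hand-written excerpt that skips most ranks and treats Hominidae as its root, and the function names are hypothetical.

```python
# Hand-written excerpt for illustration; the real NCBI taxonomy is far larger
# and should be treated as the authority for identifiers and names.
PARENT = {9606: 9605, 9605: 207598, 9598: 9596, 9596: 207598,
          207598: 9604, 9604: None}
NAMES = {9606: "Homo sapiens", 9605: "Homo", 9598: "Pan troglodytes",
         9596: "Pan", 207598: "Homininae", 9604: "Hominidae"}


def lineage(taxid):
    """Return the taxids from the excerpt's root down to the given taxid."""
    track = []
    while taxid is not None:
        track.append(taxid)
        taxid = PARENT[taxid]
    return track[::-1]


def translate(taxids):
    """Convert taxids into scientific names."""
    return [NAMES[taxid] for taxid in taxids]


def last_common_ancestor(taxid_a, taxid_b):
    """Return the deepest taxid shared by both lineages, or None."""
    common = None
    for a, b in zip(lineage(taxid_a), lineage(taxid_b)):
        if a != b:
            break
        common = a
    return common


human_track = translate(lineage(9606))
shared_taxon = NAMES[last_common_ancestor(9606, 9598)]
```

Because every lookup is a dictionary access on local data, no network round trip is involved, which is the property that makes such queries suitable for use inside larger pipelines.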
ETE offers a unified framework for computing and analyzing genome-wide collections of evolutionary data while providing unique visualization capabilities. With the addition of command line tools, ETE has significantly broadened its scope, simplifying many common tasks in phylogenomics for both expert and casual users.