2011 | Catherine Lozupone, Manuel E Lladser, Dan Knights, Jesse Stombaugh and Rob Knight
UniFrac is a β-diversity measure that uses phylogenetic information to compare microbial communities. It has been used in more than 150 research publications to study microbial communities in a range of systems, including human disease and ecology. UniFrac measures the evolutionary history unique to each community as the fraction of branch length in a phylogenetic tree that leads to descendants from one sample or the other, but not both. A weighted version of UniFrac also accounts for differences in relative abundance. UniFrac is used either to test whether the phylogenetic lineages in different samples differ significantly or to cluster samples using multivariate statistical techniques. It is implemented in software such as QIIME and mothur.
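The branch-length definition above can be illustrated on a toy tree. The following is a minimal sketch, not the QIIME or mothur implementation: the tree, the tip names, and the function names are hypothetical, and the weighted variant shown is the unnormalized form (branch length times the difference in the proportion of sequences descending from each branch).

```python
# Hypothetical toy tree: each node maps to (parent, length of the branch above it).
# The root has no entry, so walking upward stops there.
TREE = {
    "t1": ("n1", 1.0),
    "t2": ("n1", 1.0),
    "t3": ("n2", 2.0),
    "t4": ("n2", 2.0),
    "n1": ("root", 0.5),
    "n2": ("root", 0.5),
}


def branches_leading_to(tips, tree):
    """Return the nodes whose branch lies on a path from any given tip to the root."""
    covered = set()
    for tip in tips:
        node = tip
        while node in tree:
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(sample_a, sample_b, tree):
    """Fraction of observed branch length that leads to only one of the two samples."""
    a = branches_leading_to(sample_a, tree)
    b = branches_leading_to(sample_b, tree)
    unique = sum(tree[n][1] for n in a ^ b)   # branches seen in exactly one sample
    total = sum(tree[n][1] for n in a | b)    # branches seen in either sample
    return unique / total if total else 0.0


def branch_proportions(counts, tree):
    """Map each branch to the fraction of a sample's sequences descending from it.

    Assumes `counts` is a {tip: count} table with a positive total.
    """
    total = sum(counts.values())
    props = {}
    for tip, n in counts.items():
        node = tip
        while node in tree:
            props[node] = props.get(node, 0.0) + n / total
            node = tree[node][0]
    return props


def weighted_unifrac(counts_a, counts_b, tree):
    """Unnormalized weighted variant: sum of length * |proportion difference|."""
    pa = branch_proportions(counts_a, tree)
    pb = branch_proportions(counts_b, tree)
    return sum(
        tree[n][1] * abs(pa.get(n, 0.0) - pb.get(n, 0.0))
        for n in pa.keys() | pb.keys()
    )
```

With this toy tree, `unweighted_unifrac({"t1", "t2"}, {"t3", "t4"}, TREE)` gives 1.0 because the two samples share no branch, and comparing a sample with itself gives 0.0.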
UniFrac satisfies the formal requirements of a distance metric: it is non-negative, symmetric, and obeys the triangle inequality. However, its values can be influenced by the number of sequences per sample, and sequence jackknifing is recommended to avoid this issue. Recent simulations suggested that UniFrac is not suitable for multivariate analysis, but that conclusion rests on flawed assumptions. UniFrac values correlate well with community overlap in simulations, and its values are not sensitive to sampling once the number of sequences per sample is increased. In fact, deeper sequencing can help resolve relationships among similar samples.
UniFrac is sensitive to sampling depth, which can inflate distances between small samples. The effect is most pronounced for diverse communities with a high fraction of rare species. Jackknifing techniques are used to assess how robust UniFrac clustering results are to these sampling effects, and they have supported the biologically relevant clustering patterns identified with uneven sampling.
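Sequence jackknifing can be sketched as repeated random subsampling of every sample to the same depth, recomputing the distance each time and looking at the spread of the results. The code below is an illustration under stated assumptions: the function names and the `{taxon: count}` input format are hypothetical, and `jaccard_distance` is only a simple stand-in for a UniFrac calculation so that the block is self-contained.

```python
import random
from collections import Counter


def subsample(counts, depth, rng):
    """Draw `depth` sequences without replacement from a {taxon: count} table."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences in the sample")
    return Counter(rng.sample(pool, depth))


def jackknife_distances(sample_a, sample_b, depth, distance, n_reps=100, seed=0):
    """Recompute `distance` on `n_reps` even-depth subsamples of both samples."""
    rng = random.Random(seed)  # fixed seed so the replicates are reproducible
    return [
        distance(subsample(sample_a, depth, rng), subsample(sample_b, depth, rng))
        for _ in range(n_reps)
    ]


def jaccard_distance(a, b):
    """Stand-in for UniFrac (assumption): fraction of observed taxa not shared."""
    seen_a, seen_b = set(a), set(b)
    union = seen_a | seen_b
    return len(seen_a ^ seen_b) / len(union) if union else 0.0


# Hypothetical samples with very different sequencing depths.
deep_sample = {"x": 50, "y": 45, "z": 5}
shallow_sample = {"x": 10, "y": 10}

replicates = jackknife_distances(
    deep_sample, shallow_sample, depth=20, distance=jaccard_distance, n_reps=50
)
mean_distance = sum(replicates) / len(replicates)
```

A narrow spread across `replicates` suggests that the comparison is robust to which sequences happened to be sampled; a wide spread suggests that the chosen depth is too shallow to resolve the two samples.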
Despite potential shortcomings in undersampled environments, pairing UniFrac with multivariate statistical methods is still a powerful approach for analyzing complex microbial datasets. These methods allow visualization of which measured variables correlate best with differences between samples. However, significance tests for UniFrac are limited to pairwise comparisons and require correction for multiple comparisons. The P-value for a particular pair of samples is also affected by sampling depth, which can lead to misleading conclusions. The authors recommend using sequence jackknifing to standardize the number of sequences per sample and to ensure reliable results.
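Because the significance test is pairwise, the number of tests grows quickly with the number of samples, and each raw P-value needs adjusting. The sketch below uses the Bonferroni correction as one common, conservative choice; it is an assumption that this particular correction suits a given study, and the sample names and P-values are made up for illustration.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each raw P-value by the number of tests, capping the result at 1.0."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Hypothetical raw P-values for every pair among three samples.
samples = ["A", "B", "C"]
pairs = list(combinations(samples, 2))   # n * (n - 1) / 2 pairwise tests
raw = dict(zip(pairs, [0.01, 0.04, 0.5]))

corrected = bonferroni(raw)
significant = [pair for pair, p in corrected.items() if p <= 0.05]
```

In this made-up example only the pair ("A", "B") remains below 0.05 after correction, even though ("A", "C") was below 0.05 before it.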