October 17, 2024 | Robert Lanfear, Matthew W. Hahn
The Meaning and Measure of Concordance Factors in Phylogenomics
Concordance factors (CFs) are important tools in phylogenetics that describe topological variation among loci rather than statistical support. They provide information about the predictive power of the species tree that support measures do not capture. A CF is best read as a biological parameter: the proportion of the genome for which a given clade is true. Unlike support measures, CFs should stay approximately constant as the amount of data grows, because they describe biological variation among loci, not confidence in an estimate.
CFs and similar statistics have become increasingly important in phylogenetics as larger datasets reveal biological variation in topologies across individual loci. The single species tree inferred by most phylogenetic methods is a limited summary of the data for many purposes. CFs help identify which branches of the species tree are most relevant to predicting the history of the genome at any given locus.
CFs can be estimated from empirical data in several ways, based on gene trees, quartets, or sites, and each approach has its own advantages and disadvantages. Gene concordance factors (gCFs) are estimated from gene trees, quartet concordance factors (qCFs) from quartets, and site concordance factors (sCFs) from site patterns. All three can be used to calculate and compare measures of concordance and discordance.
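As a rough illustration of the gene-tree approach, the sketch below computes a gCF-like proportion for one clade. It rests on simplifying assumptions that are not from the paper: each gene tree is represented only by the set of clades it contains, every gene tree includes all taxa, and the function name `gene_concordance_factor` and the toy data are hypothetical. Real implementations also deal with unrooted trees and missing taxa.

```python
from typing import FrozenSet, Iterable, Set

Clade = FrozenSet[str]


def gene_concordance_factor(clade: Clade, gene_trees: Iterable[Set[Clade]]) -> float:
    """Fraction of gene trees that contain the given clade (simplified gCF)."""
    trees = list(gene_trees)
    if not trees:
        raise ValueError("need at least one gene tree")
    # A gene tree is concordant with the branch if it contains the same clade.
    concordant = sum(1 for tree in trees if clade in tree)
    return concordant / len(trees)


# Toy data (hypothetical): five gene trees for taxa A, B, C plus an outgroup,
# each reduced to the single clade it resolves among A, B, and C.
ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})
bc = frozenset({"B", "C"})
gene_trees = [{ab}, {ab}, {ab}, {ac}, {bc}]

gcf = gene_concordance_factor(ab, gene_trees)
print(f"gCF for clade (A,B): {gcf:.2f}")
```

In this toy example three of the five gene trees contain the clade (A,B), so the proportion is 0.6; the other two trees contribute to discordance.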
The concordance vector is a simple summary that describes fundamental aspects of concordance and discordance for a branch. It consists of four proportions that sum to 1, with the first entry being the CF and the remaining three entries summarizing the DFs. The concordance vector helps reveal the relationships between different methods of estimating and interpreting CFs and DFs.
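A minimal sketch of how such a vector could be assembled from raw counts follows. The split of the three discordance entries (two for the nearest alternative resolutions of the branch and one for everything else) is an assumption made for illustration, as are the function name `concordance_vector` and the counts.

```python
from typing import Sequence, Tuple


def concordance_vector(counts: Sequence[int]) -> Tuple[float, ...]:
    """Turn four counts into four proportions that sum to 1.

    counts[0]: items (e.g. gene trees) concordant with the branch
    counts[1], counts[2]: items supporting two alternative resolutions (assumed)
    counts[3]: all other discordant items (assumed)
    """
    if len(counts) != 4:
        raise ValueError("expected exactly four counts")
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one count must be positive")
    return tuple(c / total for c in counts)


# Hypothetical counts for one branch across 100 loci.
vec = concordance_vector([60, 20, 15, 5])
cf, dfs = vec[0], vec[1:]
print("CF:", cf)
print("DFs:", dfs)
```

Because the same four-entry layout can be filled from gene trees, quartets, or sites, a vector like this gives a common frame for comparing the different estimators on the same branch.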
Because CFs are biological parameters rather than measures of statistical support, they should not be read as confidence in a branch. A branch can have very high statistical support and still have a low CF if many loci have histories that conflict with it. The distinction matters because it highlights the need to consider both concordance and discordance when interpreting evolutionary history.
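To illustrate why the amount of data changes confidence but not the CF itself, the sketch below treats loci as independent draws with a fixed underlying CF and uses the binomial standard error. This model, the function name `cf_standard_error`, and the value 0.6 are assumptions chosen for illustration, not the paper's analysis.

```python
import math


def cf_standard_error(cf: float, n_loci: int) -> float:
    """Binomial standard error of a CF estimated from n independent loci."""
    if not 0.0 <= cf <= 1.0:
        raise ValueError("cf must be between 0 and 1")
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return math.sqrt(cf * (1.0 - cf) / n_loci)


true_cf = 0.6  # hypothetical underlying proportion of concordant loci
for n in (100, 10_000):
    se = cf_standard_error(true_cf, n)
    # The target value stays at 0.6; only the uncertainty around it shrinks.
    print(f"n={n:>6}  CF={true_cf}  standard error={se:.4f}")
```

Under these assumptions, adding loci narrows the uncertainty around the estimate, but the quantity being estimated stays where it is, which is the sense in which a CF describes variation rather than confidence.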
CFs are important for understanding evolutionary histories and testing evolutionary hypotheses. They provide information about the amount of data needed to have high statistical support and help identify which branches of the species tree are most relevant to predicting the history of the genome at any given locus. The concordance vector, in turn, offers a compact summary of concordance and discordance that makes different measures directly comparable.