ESTIMATING CHARACTER WEIGHTS DURING TREE SEARCH

1993 | Pablo A. Goloboff
A new method for weighting characters according to their homoplasy is proposed. The method is non-iterative and does not require initial weight estimates. It searches for trees with maximum total fit, where the fit of each character is a concave function of its homoplasy. As a result, when trees are compared, differences in steps in characters that show more homoplasy are less influential. The reliability of the characters is inferred, during the analysis, from the trees being compared. The "fittest" trees imply that the characters are maximally reliable and resolve character conflict by minimizing steps in the characters that fit better. If other trees save steps in some characters, it is at the expense of gaining steps in less homoplastic characters.

Farris (1982, 1983) argued that the most parsimonious tree best explains the data. Initially, some authors believed parsimony and weighting were mutually exclusive, but Farris showed that the most parsimonious cladogram is the hypothesis with the greatest explanatory power given the character weights. It is now widely accepted that parsimony does not preclude weighting, and that weighting is necessary for accurate analysis.

Three types of weighting schemes have been proposed. The first requires prior knowledge of character properties, such as rate of change. The second is based on character compatibility. The third is based on homoplasy and is defensible on cladistic grounds, since more homoplasy indicates a less reliable character. Homoplasy-based weighting is the only type justifiable without evolutionary or statistical assumptions.

Farris (1969) proposed successive weighting, an iterative method in which the weights implied by the most parsimonious tree are used to reanalyze the data. The final stable solution depends on the initial weights. A tree is self-consistent if it is shortest under the weights it implies, so that character conflict is resolved in favor of the less homoplastic characters. Self-consistency is better defined per tree, without requiring iterations. Trees that are not self-consistent should be rejected.
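The per-tree notion of self-consistency can be illustrated with a small sketch. Everything in it is hypothetical: the step counts are invented, the candidate set stands in for a real tree search, and the consistency index c = m/s is used as the implied weight purely as one possible choice, not necessarily the weighting function Farris or Goloboff had in mind.

```python
def implied_weights(steps, min_steps):
    """Weights implied by a tree; here the consistency index c = m / s
    of each character (an assumption made for illustration)."""
    return [m / s for s, m in zip(steps, min_steps)]


def weighted_length(steps, weights):
    """Length of a tree when each character's steps are multiplied by a weight."""
    return sum(w * s for w, s in zip(weights, steps))


def is_self_consistent(name, candidates, min_steps):
    """A tree is self-consistent (within this candidate set) if no other
    candidate is shorter under the weights that the tree itself implies."""
    weights = implied_weights(candidates[name], min_steps)
    own = weighted_length(candidates[name], weights)
    return all(own <= weighted_length(steps, weights)
               for steps in candidates.values())


# Hypothetical data: steps per character on two candidate trees.
# Each of the three characters needs at least one step.
min_steps = [1, 1, 1]
candidates = {
    "A": [1, 1, 5],  # 7 steps, homoplasy concentrated in one character
    "B": [2, 2, 3],  # 7 steps, homoplasy spread over all characters
}

print(is_self_consistent("A", candidates, min_steps))  # True
print(is_self_consistent("B", candidates, min_steps))  # False
```

Both hypothetical trees have the same unweighted length, yet only tree A is shortest under its own implied weights. No iteration or starting weights are involved: each tree is judged on its own.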
The criterion of self-consistency is better implemented by examining each tree separately. Trees that imply higher weights are preferred: the "total fit" is maximized, with fit being a concave function of homoplasy. The consistency index (c) is a well-known measure of fit; with c as the measure, the fittest tree is the one with the highest average c. The shortest tree may not be the fittest. Convex functions of homoplasy are rejected because they give more importance to step differences in highly homoplastic characters. Farris preferred concave functions for estimating reliability. The "heaviest" trees are those that imply higher weights for the characters. Searching for the "heaviest" trees aligns with cladistic ideas and provides more accurate results than self-consistency alone. Fitting functions must be finely grained to reflect character reliability accurately. The consistency index is therefore modified to account for homoplasy and steps, and the function $ f_{i} = (k + 1) / (s_{i} + k + 1 - m_{i}) $ is used.
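A minimal sketch of the fit function and of total fit may make the concavity argument concrete. It reads s_i as the number of steps of character i on the tree, m_i as its minimum possible number of steps, and k as a constant controlling concavity; the text above does not define these symbols, so this reading is an assumption. The value k = 3 and the step counts are hypothetical.

```python
def fit(steps, min_steps, k):
    """Concave fit of one character: (k + 1) / (s + k + 1 - m).
    Equals 1 when the character has no homoplasy (s == m)."""
    return (k + 1) / (steps + k + 1 - min_steps)


def total_fit(tree_steps, min_steps, k):
    """Sum of the character fits for one tree."""
    return sum(fit(s, m, k) for s, m in zip(tree_steps, min_steps))


# Hypothetical data: three characters, each needing at least one step.
min_steps = [1, 1, 1]
tree_a = [1, 1, 6]  # 8 steps: two perfect characters, one very homoplastic
tree_b = [2, 2, 3]  # 7 steps: shorter, but every character is homoplastic
k = 3               # illustrative value only

print(f"{total_fit(tree_a, min_steps, k):.3f}")  # 2.444
print(f"{total_fit(tree_b, min_steps, k):.3f}")  # 2.267

# Concavity: each additional step costs less fit than the previous one.
first_extra = fit(1, 1, k) - fit(2, 1, k)   # cost of the first extra step
later_extra = fit(5, 1, k) - fit(6, 1, k)   # cost of the fifth extra step
print(first_extra > later_extra)  # True
```

With these invented numbers the longer tree A has the higher total fit, which is the sense in which the shortest tree may not be the fittest. The last comparison shows why: an extra step in an already homoplastic character lowers the fit far less than the first extra step in a perfectly fitting one.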