A new method for inferring evolutionary trees using nucleotide sequence data is presented. The method uses a birth-death process to model speciation and extinction, specifying the prior distribution of phylogenies and branching times. Nucleotide substitution is modeled by a continuous-time Markov process. Parameters of the branching model and substitution model are estimated by maximum likelihood. Posterior probabilities of different phylogenies are calculated, and the phylogeny with the highest posterior probability is chosen as the best estimate of the evolutionary relationship among species. This is referred to as the maximum posterior probability (MAP) tree. The posterior probability provides a natural measure of the reliability of the estimated phylogeny. Two example data sets are analyzed to infer the phylogenetic relationship of human, chimpanzee, gorilla, and orangutan. The best trees estimated by the new method are the same as those from the maximum likelihood analysis of separate topologies, but the posterior probabilities are quite different from the bootstrap proportions. The results of the method are found to be insensitive to changes in the rate parameter of the branching process.
Maximum likelihood (ML) estimation of phylogenetic trees differs from conventional maximum likelihood parameter estimation in that the functional form of the likelihood depends on the tree topology, and the regularity conditions required for the asymptotic properties of maximum likelihood estimators are not satisfied. Another difficulty is the lack of a reliable method for evaluating the significance of the estimated tree. Nonparametric bootstrapping has been found to give somewhat unreliable results.
In this paper, the problem of phylogenetic tree estimation is approached from a different perspective. A birth-death process is used to specify the prior distribution of tree topologies and divergence times, and a Markov process is used to model nucleotide substitution. Parameters of the birth-death process and substitution model are estimated by maximizing the likelihood. The posterior probability of each tree topology, conditional on the nucleotide sequence data and the estimated parameters, is then calculated. The tree with the highest posterior probability is taken as the estimate of phylogeny. This is referred to as the maximum posterior probability (MAP) tree. The MAP method differs from the ML method in that topologies and branch lengths are treated as random variables rather than parameters.
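The final step described above, turning a prior over topologies and a likelihood for each topology into posterior probabilities and selecting the MAP tree, can be illustrated with a minimal Python sketch. It assumes that the log marginal likelihood of each topology (with branching times already integrated out at the estimated parameter values) has been computed elsewhere. The function names, the three candidate topologies, the equal prior, and all numerical values are hypothetical and chosen only for illustration; they are not results from the analyses reported here.

```python
import math


def topology_posteriors(log_marginal_likelihoods, priors):
    """Return P(topology | data) for each candidate topology.

    log_marginal_likelihoods: dict mapping topology label -> log f(data | topology)
    priors: dict mapping topology label -> prior probability of that topology
    """
    # Log of (prior x marginal likelihood) for each topology.
    log_joint = {
        t: math.log(priors[t]) + log_marginal_likelihoods[t] for t in priors
    }
    # Log-sum-exp: subtract the maximum before exponentiating so that very
    # negative log-likelihoods do not underflow to zero.
    m = max(log_joint.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {t: math.exp(v - log_norm) for t, v in log_joint.items()}


def map_topology(posteriors):
    """Return the topology with the highest posterior probability."""
    return max(posteriors, key=posteriors.get)


# Hypothetical inputs: three rooted topologies for human (H), chimpanzee (C),
# and gorilla (G), with made-up log marginal likelihoods and an equal prior.
log_ml = {"((H,C),G)": -2450.0, "((H,G),C)": -2453.0, "((C,G),H)": -2454.0}
priors = {t: 1.0 / len(log_ml) for t in log_ml}

post = topology_posteriors(log_ml, priors)
best = map_topology(post)
```

Because the posterior probabilities sum to one over the candidate set, the value attached to the MAP tree can be read directly as a measure of support for it, which is the role the bootstrap proportion plays in the ML analysis.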