An improved general amino acid replacement matrix

Molecular Biology and Evolution, 2008, 25 (7), pp. 1307-1320 | Si Quang Le, Olivier Gascuel
The paper presents LG, an improved general amino acid replacement matrix designed to better capture the substitution patterns of amino acids. The authors refine the estimation method of Whelan and Goldman (2001) by accounting for the variability of evolutionary rates across sites and by using a larger and more diverse database than the one used to estimate the WAG matrix. LG is estimated with an adaptation of the XRATE software from 3,912 Pfam alignments, comprising approximately 50,000 sequences and 6.5 million residues. Its performance is evaluated on an independent sample of 59 alignments from TreeBase and on the Pfam alignments themselves, which are randomly divided into 3,412 training alignments and 500 test alignments.
The results show that LG significantly outperforms WAG and JTT in likelihood, with an average Akaike information criterion (AIC) gain per site of 0.25 over WAG and 0.42 over JTT. LG also yields different tree topologies than WAG and JTT, so it affects the inferred tree and not only the likelihood value. The LG matrix and its PHYML implementation are available for download.
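To make the "AIC gain per site" figure concrete, here is a minimal Python sketch of how such a gain could be computed for one alignment. It assumes the standard definition AIC = 2k - 2 ln L and that the gain is the AIC difference divided by the alignment length; the paper's exact protocol may differ. The function names, log-likelihoods, and site count are hypothetical values chosen for illustration, not results from the paper.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def aic_gain_per_site(lnl_new: float, lnl_ref: float, n_sites: int,
                      k_new: int = 0, k_ref: int = 0) -> float:
    """AIC improvement of a new model over a reference model, per alignment site.

    Positive values mean the new model fits better. Fixed matrices such as
    LG, WAG, and JTT add no free substitution parameters, so k defaults to 0
    for both models (an assumption of this sketch).
    """
    return (aic(lnl_ref, k_ref) - aic(lnl_new, k_new)) / n_sites


# Hypothetical log-likelihoods for one 400-site alignment (illustrative only).
lnl_lg = -12000.0
lnl_wag = -12050.0
n_sites = 400

gain = aic_gain_per_site(lnl_lg, lnl_wag, n_sites)
print(f"AIC gain per site (LG vs WAG): {gain:.2f}")
```

When both models have the same number of free parameters, the penalty terms cancel and the gain reduces to twice the log-likelihood difference divided by the number of sites.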