Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions

2013, Vol. 41, No. 12 | Jaina Mistry, Robert D. Finn, Sean R. Eddy, Alex Bateman, Marco Punta
The study investigates the challenges of detecting homology from sequence similarity, focusing on regions shaped by convergent evolution and compositional bias. The authors test how accurately HMMER3, a profile hidden Markov model method, assigns homologous sequences to manually curated families in the Pfam database. They identify problem families by looking for protein regions that match multiple Pfam families not currently annotated as related. The results show that HMMER3's E-value estimates are less accurate for families with periodic compositional bias, such as coiled-coil regions. This underlines the need for manually curated inclusion thresholds in Pfam and for new methods that correct for compositional bias. The study also identifies a subset of families with both high overlap and compositional bias, suggesting that these families may be enriched in false positives. Overall, the findings emphasize the importance of accounting for higher-order correlations in protein sequences and the need for improved bias correction methods.
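The overlap analysis described above, finding regions of the same protein matched by two families that are not annotated as related, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the match tuples, the family names, and the `clan_of` mapping (used here as a stand-in for "annotated as related") are all hypothetical, and the overlap test is a plain interval intersection with no minimum length.

```python
from collections import defaultdict
from itertools import combinations


def overlapping_family_pairs(matches, clan_of):
    """Count overlapping matches between families not annotated as related.

    matches: iterable of (protein_id, family, start, end), 1-based inclusive
             coordinates (hypothetical record layout, assumed for this sketch).
    clan_of: dict mapping family -> clan identifier; families missing from
             the dict are treated as belonging to no clan.
    """
    # Group matches by protein so only matches on the same sequence are compared.
    by_protein = defaultdict(list)
    for protein, family, start, end in matches:
        by_protein[protein].append((family, start, end))

    counts = defaultdict(int)
    for hits in by_protein.values():
        for (fam_a, s_a, e_a), (fam_b, s_b, e_b) in combinations(hits, 2):
            if fam_a == fam_b:
                continue  # repeated matches to one family are not a conflict
            clan_a, clan_b = clan_of.get(fam_a), clan_of.get(fam_b)
            if clan_a is not None and clan_a == clan_b:
                continue  # already annotated as related, so overlap is expected
            if s_a <= e_b and s_b <= e_a:  # the two intervals intersect
                counts[tuple(sorted((fam_a, fam_b)))] += 1
    return dict(counts)


# Toy data with made-up family names.
matches = [
    ("P1", "FamA", 10, 50),
    ("P1", "FamB", 40, 90),    # overlaps FamA on P1
    ("P1", "FamC", 100, 120),  # no overlap
    ("P2", "FamA", 1, 30),
    ("P2", "FamD", 20, 60),    # overlaps FamA, but both are in the same clan
]
clan_of = {"FamA": "CL1", "FamD": "CL1"}
result = overlapping_family_pairs(matches, clan_of)
```

Family pairs with many such overlaps would be candidates for closer inspection, either as unannotated relationships or as possible false positives driven by compositional bias.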