[slides] Automated de novo identification of repeat sequence families in sequenced genomes.

Zhirong Bao and Sean R. Eddy have developed an automated method for de novo identification and classification of repeat sequence families in sequenced genomes. Their approach, implemented in the program RECON, uses multiple alignment information to define the boundaries of individual copies of repeats and distinguish homologous but distinct repeat element families. The method addresses the limitations of traditional single linkage clustering by using multiple alignments to infer element boundaries and improve the accuracy of family clustering. When tested on the human genome, RECON successfully identified and grouped known transposable elements, demonstrating its utility for first-pass automatic classification of repeats in newly sequenced genomes. The authors discuss the challenges and limitations of their approach, including issues related to segmental duplications and interfamily similarity, and provide a detailed description of the RECON algorithm and its implementation.
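To make the single linkage baseline concrete, here is a minimal Python sketch of that style of clustering, in which any pairwise alignment hit is enough to put two elements in the same family. It is not RECON's code: the function name `single_linkage`, the element labels, and the toy hit list are all hypothetical, chosen only to illustrate one commonly cited weakness, namely that a single element sharing similarity with two otherwise unrelated groups chains them into one family.

```python
from collections import defaultdict


def single_linkage(elements, pairs):
    """Group elements into families; any pairwise hit links two elements.

    Illustrative only: `elements` are hypothetical labels for repeat copies,
    and `pairs` stands in for pairwise alignment hits between them.
    """
    parent = {e: e for e in elements}

    def find(x):
        # Walk up to the representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = defaultdict(set)
    for e in elements:
        families[find(e)].add(e)
    return sorted(sorted(members) for members in families.values())


# Hypothetical toy input: two families (A*, B*) plus one element X
# that happens to align to a member of each.
elements = ["A1", "A2", "B1", "B2", "X"]
pairs = [("A1", "A2"), ("B1", "B2"), ("A2", "X"), ("X", "B1")]

print(single_linkage(elements, pairs[:2]))  # hits within families only
print(single_linkage(elements, pairs))      # X bridges the two families
```

With only the within-family hits, the A and B elements stay separate; once the two hits involving X are added, everything collapses into a single family. The summary above describes RECON as using multiple alignment information to infer element boundaries so that families are clustered more accurately than this kind of purely pairwise linking allows.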

Automated De Novo Identification of Repeat Sequence Families in Sequenced Genomes

July 2002 | Zhirong Bao and Sean R. Eddy