Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner

2004 | Mathieu Blanchette, W. James Kent, Cathy Riemer, Laura Elnitski, Arian F.A. Smit, Krishna M. Roskin, Robert Baertsch, Kate Rosenbloom, Hiram Clawson, Eric D. Green, David Haussler, Webb Miller
The paper introduces the "threaded blockset," a new conceptual framework for aligning multiple genomic sequences, and a corresponding program, TBA (threaded blockset aligner). A threaded blockset generalizes the classic multiple sequence alignment: it is a set of blocks (local alignments) in which each position of the given sequences appears exactly once. As a framework it can accommodate complex evolutionary events such as inversions and duplications, but TBA itself builds a blockset under the assumption that matching segments occur in the same order and orientation in all sequences. TBA is designed for aligning megabase-sized regions of mammalian genomes. Its output can be projected onto any genome chosen as the reference, and the different projections give consistent predictions of which positions are orthologous.
For the dynamic-programming alignment step, TBA runs MULTIZ, a program that can also be used to align rearranged or incompletely sequenced genomes. The authors used MULTIZ to produce the whole-genome multiple alignments for the UCSC Genome Browser. TBA and MULTIZ are implemented as a suite of independent programs, and the paper discusses the advantages and disadvantages of this approach.
The paper also discusses the challenges of aligning genomic sequences, the importance of accurate alignment for identifying conserved regions, and the limitations of current methods. The authors evaluated accuracy on simulated evolutionary sequences, comparing TBA with CLUSTALW, DIALIGN, MAVID, and MLAGAN. TBA was more accurate than the existing programs and performed better on diverged sequences.
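The defining property of a threaded blockset, and the idea of projecting it onto a reference, can be illustrated with a toy model. The sketch below is not TBA's or MULTIZ's code: the names `Row`, `is_threaded`, and `project` are hypothetical, and it assumes that each block holds at most one row per sequence, that rows are half-open intervals on the forward strand, and that gap columns inside a block can be ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One row of a block: the interval [start, end) of a named sequence."""
    seq: str
    start: int
    end: int


def is_threaded(blocks, lengths):
    """Return True if every position of every sequence lies in exactly one row.

    `blocks` is a list of blocks, each a list of Row objects.
    `lengths` maps each sequence name to its length.
    """
    for name, length in lengths.items():
        intervals = sorted(
            (row.start, row.end)
            for block in blocks
            for row in block
            if row.seq == name
        )
        expected = 0  # next position that must be covered
        for start, end in intervals:
            if start != expected or end <= start:
                return False  # gap, overlap, or empty row
            expected = end
        if expected != length:
            return False  # tail of the sequence is not covered
    return True


def project(blocks, ref):
    """Return the blocks that contain `ref`, ordered along `ref` (toy projection)."""
    def ref_start(block):
        return next(row.start for row in block if row.seq == ref)

    with_ref = [b for b in blocks if any(row.seq == ref for row in b)]
    return sorted(with_ref, key=ref_start)


# Hypothetical two-species example (made-up coordinates).
lengths = {"human": 10, "mouse": 8}
blocks = [
    [Row("human", 4, 10), Row("mouse", 3, 8)],  # aligned in both species
    [Row("human", 0, 4)],                       # human-only segment
    [Row("mouse", 0, 3)],                       # mouse-only segment
]

threaded = is_threaded(blocks, lengths)
human_view = project(blocks, "human")
mouse_view = project(blocks, "mouse")
```

In this toy example the shared block appears in both `human_view` and `mouse_view`, which is the sense in which projections onto different references stay consistent: both views are read off the same set of blocks rather than computed separately.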
The authors conclude that their method provides a more accurate and flexible approach to multiple sequence alignment, particularly for complex genomic regions.