SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes

April 24, 2012 | Elmar Pruesse, Jörg Peplies, Frank Oliver Glöckner
SINA is a high-throughput multiple sequence alignment (MSA) tool designed for ribosomal RNA (rRNA) genes. It combines k-mer searching with partial order alignment (POA) to achieve both high accuracy and high throughput. Rather than computing an alignment from scratch, SINA aligns each sequence against a reference MSA, which lets it scale to large datasets. For each query sequence, SINA first selects a set of reference sequences based on k-mer similarity. It then constructs a directed acyclic graph (DAG) from those reference sequences and aligns the query to the graph by dynamic programming. Specific policies govern how sequence ends and insertions are handled, and parameters can be tuned to optimize alignment accuracy. SINA was evaluated against PyNAST and mothur and outperformed both in benchmarking.
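To make the reference selection step concrete, here is a minimal sketch that ranks reference sequences by the number of distinct k-mers they share with a query. It is an illustration of the general idea, not SINA's actual implementation: the function names `kmers` and `select_references`, the scoring by shared k-mer count, and the default values of `k` and `max_refs` are all assumptions made for this example.

```python
def kmers(seq, k=10):
    """Return the set of distinct k-mers in seq, ignoring alignment gaps."""
    s = seq.upper().replace("-", "").replace(".", "").replace("T", "U")
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def select_references(query, references, k=10, max_refs=40):
    """Rank aligned references by k-mers shared with the query.

    `references` maps a name to an aligned (gapped) sequence. The defaults
    for k and max_refs are illustrative assumptions only.
    """
    query_kmers = kmers(query, k)
    scored = []
    for name, aligned_seq in references.items():
        shared = len(query_kmers & kmers(aligned_seq, k))
        scored.append((shared, name))
    # Most shared k-mers first; ties are broken by name for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for shared, name in scored[:max_refs] if shared > 0]


# Toy example with a short k so the tiny sequences share k-mers.
refs = {
    "refA": "ACGU-ACGUAC",
    "refB": "GGGGCCCCAA",
    "refC": "ACGUAGG-UU",
}
print(select_references("ACGUACGUAC", refs, k=4, max_refs=2))
# → ['refA', 'refC']
```

Only the selected references would then be passed on to the alignment step, which keeps the graph small even when the reference MSA is very large.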
Across the benchmarks tested, SINA was more accurate than the other tools, especially for sequences with low identity to the reference. The tool is available for use with the latest SILVA SSU/LSU Ref datasets and is provided under a personal use license. SINA is a flexible and reliable tool for high-throughput MSA and is particularly useful for analyzing large-scale rRNA data.
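The graph-based alignment step mentioned in the summary can be sketched in a few lines. The example below builds a DAG whose nodes are (column, base) pairs taken from aligned reference sequences, then aligns a query to it by dynamic programming and places each query base into a reference column. This is a simplified illustration under stated assumptions, not SINA's implementation: the function names, the linear gap penalty, the score values, and the strictly global treatment of sequence ends are all choices made for this sketch, and SINA's own policies for ends and insertions are not modeled.

```python
def build_dag(aligned_refs):
    """Build a DAG with (column, base) nodes; returns (preds, ends)."""
    preds = {}    # node -> set of predecessors (None is a virtual start)
    ends = set()  # last node of each reference path
    for seq in aligned_refs:
        prev = None
        for col, base in enumerate(seq.upper()):
            if base in "-.":
                continue
            node = (col, base)
            preds.setdefault(node, set()).add(prev)
            prev = node
        if prev is not None:
            ends.add(prev)
    return preds, ends


def align_to_dag(query, preds, ends, match=2, mismatch=-1, gap=-2):
    """Globally align query to the DAG; scores are illustrative assumptions."""
    query = query.upper()
    m = len(query)
    score = {None: [j * gap for j in range(m + 1)]}
    back = {}
    # Edges always point to higher columns, so sorting by column
    # yields a valid topological order.
    for node in sorted(preds):
        row = []
        for j in range(m + 1):
            options = []
            for p in preds[node]:
                # Skip this reference node (gap in the query).
                options.append((score[p][j] + gap, p, j))
                if j > 0:
                    s = match if query[j - 1] == node[1] else mismatch
                    options.append((score[p][j - 1] + s, p, j - 1))
            if j > 0:
                # Query base with no reference column (insertion).
                options.append((row[j - 1] + gap, node, j - 1))
            best = max(options, key=lambda o: o[0])
            row.append(best[0])
            back[(node, j)] = (best[1], best[2])
        score[node] = row
    node = max(ends, key=lambda n: score[n][m])
    total, j, placement = score[node][m], m, {}
    while node is not None:
        prev, pj = back[(node, j)]
        if prev != node and pj == j - 1:
            placement[pj] = node[0]  # query index -> alignment column
        node, j = prev, pj
    return total, placement


def render(query, placement, width):
    """Write placed query bases into a gapped row; unplaced bases are dropped."""
    row = ["-"] * width
    for i, col in placement.items():
        row[col] = query[i]
    return "".join(row)


preds, ends = build_dag(["ACG-U", "AC-AU"])
score, placement = align_to_dag("ACAU", preds, ends)
print(score, render("ACAU", placement, 5))
# → 8 AC-AU
```

Because the two references share nodes where they agree and branch where they differ, the query can follow whichever branch matches it best at each position, which is the main benefit of aligning to a graph instead of to a single reference or a consensus.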