Fast and SNP-tolerant detection of complex variants and splicing in short reads

Advance Access publication February 10, 2010 | Thomas D. Wu* and Serban Nacu
The paper presents GSNAP (Genomic Short-read Nucleotide Alignment Program), a computational method for detecting complex variants and splicing in short reads from next-generation sequencing data. GSNAP aligns both single- and paired-end reads as short as 14 nucleotides and detects short- and long-distance splicing, including interchromosomal splicing. It also supports SNP-tolerant alignment against a reference space of all possible combinations of major and minor alleles, and it can align reads from bisulfite-treated DNA for studying methylation state. To achieve high sensitivity and speed, GSNAP uses a successively constrained search process that merges and filters position lists from a genomic index. In comparisons with other alignment programs, GSNAP is particularly effective in detecting complex variants with four or more mismatches or insertions/deletions of specific lengths. The paper also highlights the program's handling of SNP tolerance and bisulfite-treated DNA, with simulations showing its effectiveness in both scenarios.
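To make the idea of merging position lists from a genomic index more concrete, the following is a minimal toy sketch in Python. It is not GSNAP's implementation or its actual search strategy: it uses a plain k-mer index with simple vote counting, and the names `build_index`, `candidate_starts`, `count_mismatches` and `align`, the k-mer size `K`, and the `snps` dictionary format are all assumptions made for illustration. The sketch also shows one simple way to treat a known SNP: either allele at that position counts as a match.

```python
from collections import Counter, defaultdict

K = 4  # illustrative k-mer size (assumption; small only because the genome is a toy)


def build_index(genome, k=K):
    """Map each k-mer to the list of genome positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def candidate_starts(read, index, k=K):
    """Merge the position lists of all read k-mers into votes per read start."""
    support = Counter()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset  # where the read would begin on the genome
            if start >= 0:
                support[start] += 1
    return support


def count_mismatches(read, genome, start, snps=None):
    """Count mismatches; at a known SNP, the alternate allele also matches.

    `snps` is a hypothetical {genome_position: alternate_base} mapping.
    """
    snps = snps or {}
    mismatches = 0
    for i, base in enumerate(read):
        pos = start + i
        if base != genome[pos] and base != snps.get(pos):
            mismatches += 1
    return mismatches


def align(read, genome, index, snps=None, max_mismatches=2, k=K):
    """Return sorted (start, mismatches) pairs that pass the mismatch filter."""
    hits = []
    for start in candidate_starts(read, index, k):
        if start + len(read) > len(genome):
            continue  # candidate would run off the end of the genome
        mm = count_mismatches(read, genome, start, snps)
        if mm <= max_mismatches:
            hits.append((start, mm))
    return sorted(hits)


genome = "ACGTTGCAAGGCTTACGGATCCATG"
index = build_index(genome)
# Read taken from positions 5..16, carrying a T instead of C at position 11.
read = genome[5:11] + "T" + genome[12:17]

print(align(read, genome, index))                 # [(5, 1)]
print(align(read, genome, index, snps={11: "T"})) # [(5, 0)]
```

In this toy run the read is placed at position 5 in both cases; declaring position 11 as a known C/T SNP removes the single mismatch, which is the effect SNP-tolerant alignment is meant to have on reads carrying a minor allele.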