2009 | Heng Li, Bob Handsaker, Alec Wysoker, Tim Fennell, Jue Ruan, Nils Homer, Gabor Marth, Goncalo Abecasis, Richard Durbin and 1000 Genome Project Data Processing Subgroup
The Sequence Alignment/Map (SAM) format is a generic format for storing read alignments against reference sequences. It supports short and long reads (up to 128 Mbp) produced by different sequencing platforms, is flexible, compact, and efficient for random access, and is the format in which the 1000 Genomes Project releases its alignments. A SAM file consists of a header section and an alignment section; each alignment line carries a set of mandatory fields followed by a variable number of optional fields. Extended CIGAR operations describe complex alignments, such as clipped, spliced, or padded ones. The Binary Alignment/Map (BAM) format is the compressed binary equivalent of SAM, designed for compact storage and fast retrieval. SAMtools is a software package for post-processing alignments in SAM/BAM: it converts from other alignment formats, sorts, merges, and indexes alignments, calls variants, and provides an alignment viewer. SAMtools has both C and Java implementations. Once alignments are sorted and indexed, those overlapping a specific region can be retrieved quickly without reading the whole file, which makes the format practical for large-scale alignments. By acting as a common interface between alignment and downstream analysis, SAM/BAM and SAMtools allow the two steps to be handled separately and genomic data to be processed efficiently. The paper describes the SAM format and SAMtools and highlights their utility in genomic research.
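To make the layout of an alignment line and the role of the extended CIGAR string concrete, the following is a minimal Python sketch, not the SAMtools implementation. It assumes the eleven mandatory columns as named in the current SAM specification (the 2009 paper calls the seventh to ninth columns MRNM, MPOS, and ISIZE), and it assumes the CIGAR operator set M, I, D, N, S, H, P plus the later additions = and X. The names SamRecord, parse_cigar, reference_span, and parse_alignment_line are hypothetical helpers introduced here for illustration, and the XD tag in the example line is made up.

```python
import re
from typing import NamedTuple


class SamRecord(NamedTuple):
    """One SAM alignment line: 11 mandatory columns plus optional tags."""
    qname: str
    flag: int
    rname: str
    pos: int      # 1-based leftmost mapping position
    mapq: int
    cigar: str
    rnext: str    # MRNM in the 2009 paper
    pnext: int    # MPOS in the 2009 paper
    tlen: int     # ISIZE in the 2009 paper
    seq: str
    qual: str
    tags: dict    # optional TAG:TYPE:VALUE fields as {tag: (type, value)}


CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if cigar == "*":  # CIGAR unavailable
        return []
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # Reject strings containing anything the pattern did not account for.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def reference_span(cigar):
    """Number of reference bases covered by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)


def parse_alignment_line(line):
    """Parse one tab-separated alignment line (not a '@' header line)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("expected at least 11 tab-separated columns")
    tags = {}
    for field in cols[11:]:
        tag, typ, value = field.split(":", 2)
        tags[tag] = (typ, value)
    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10],
        tags,
    )


# Illustrative record; the XD tag is a made-up user-defined tag.
EXAMPLE = (
    "read1\t99\tchr1\t7\t30\t8M2I4M1D3M\t=\t37\t39\t"
    "TTAGATAAAGGATACTG\t*\tXD:Z:demo"
)

if __name__ == "__main__":
    rec = parse_alignment_line(EXAMPLE)
    end = rec.pos + reference_span(rec.cigar) - 1
    print(rec.rname, rec.pos, end)
```

In this sketch the insertion (2I) contributes to the read length but not to the reference span, while the deletion (1D) does the opposite, which is why the 17-base read covers 16 reference positions. Region retrieval in BAM relies on the same start and end coordinates, although the actual index structure is not shown here.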