Nomenclature recommendations for describing human sequence variations have been proposed to standardize the naming of mutations and polymorphisms in DNA and protein sequences. These recommendations are now widely accepted but need to be expanded to cover more complex mutations. The document outlines the existing recommendations and suggests ways to describe additional, more complex changes. It emphasizes the importance of using systematic names for sequence variations, with each description based on a reference sequence (genomic, cDNA, mitochondrial, RNA, or protein). Each level of description has its own prefix and formatting rules: "g." for genomic, "c." for cDNA, "m." for mitochondrial, "r." for RNA, and "p." for protein.
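As a rough illustration of the prefix convention, the sketch below maps the prefix of a description to the type of reference sequence it implies. The function name `description_level` and the example descriptions are hypothetical and chosen only for illustration; the prefix table itself is taken from the list above.

```python
# Prefix -> reference sequence type, as listed in the text.
LEVEL_PREFIXES = {
    "g": "genomic",
    "c": "cDNA",
    "m": "mitochondrial",
    "r": "RNA",
    "p": "protein",
}


def description_level(description: str) -> str:
    """Return the reference sequence type implied by a description's prefix.

    Hypothetical helper: it only inspects the text before the first ".",
    and does not validate the rest of the description.
    """
    prefix, sep, rest = description.partition(".")
    if not sep or not rest or prefix not in LEVEL_PREFIXES:
        raise ValueError(f"unrecognised level prefix in {description!r}")
    return LEVEL_PREFIXES[prefix]


if __name__ == "__main__":
    for example in ("g.12T>A", "c.76A>G", "m.100del"):
        print(example, "->", description_level(example))
```

Keeping the prefixes in a single table means that a description with a missing or unknown prefix is rejected rather than silently interpreted at the wrong level.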
Sequence variations are described relative to a reference sequence, and the accession number of that reference sequence should be included in publications and database submissions. The notation uses ">" for substitutions, "del" for deletions, and "dup" for duplications. Intronic nucleotides are numbered relative to the nearest exon, with specific notations for the start and end of introns. Polymorphic variants follow the same notation and should not be written with a slash, as in 76A/G. The document also recommends assigning a unique identifier to each mutation, using OMIM identifiers if available.
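The notation described above can be sketched as a handful of string-building helpers. This is a minimal sketch rather than a complete implementation: the function names are hypothetical, the underscore used to join the two ends of a range and the "+"/"-" offsets for intronic positions are assumptions drawn from the recommendations rather than from the summary here, and no check is made against an actual reference sequence.

```python
def span(start, end=None) -> str:
    """Format a single position or a start_end range (underscore is an assumption)."""
    return str(start) if end is None or end == start else f"{start}_{end}"


def substitution(level: str, position, ref: str, alt: str) -> str:
    """Describe a substitution, e.g. c.76A>G."""
    return f"{level}.{position}{ref}>{alt}"


def deletion(level: str, start, end=None) -> str:
    """Describe a deletion of one nucleotide or a range."""
    return f"{level}.{span(start, end)}del"


def duplication(level: str, start, end=None) -> str:
    """Describe a duplication of one nucleotide or a range."""
    return f"{level}.{span(start, end)}dup"


def intron_position(exon_boundary: int, offset: int) -> str:
    """Number an intronic nucleotide relative to the nearest exon nucleotide.

    A positive offset counts into the intron from the last nucleotide of the
    preceding exon; a negative offset counts back from the first nucleotide
    of the following exon.
    """
    if offset == 0:
        raise ValueError("offset 0 is the exon nucleotide itself")
    return f"{exon_boundary}{offset:+d}"


if __name__ == "__main__":
    print(substitution("c", 76, "A", "G"))
    print(deletion("c", 76, 78))
    print(duplication("c", 76, 77))
    print(substitution("c", intron_position(88, 1), "G", "T"))
```

Writing the variant as 76A>G rather than 76A/G makes explicit which nucleotide is the reference and which is the change.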
At the DNA level, nucleotides are numbered starting from the ATG initiation codon, with specific rules for non-coding regions and introns. Substitutions, deletions, and duplications are described with specific notations, and the document provides examples of how to describe these changes. The goal is to create a uniform and unambiguous system for describing sequence variations, which will eventually evolve into a widely accepted standard.
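Numbering from the ATG initiation codon can be illustrated with a small conversion from a 0-based index in a cDNA reference sequence to a coding position. This is a sketch under stated assumptions: the function name `coding_position` and the toy sequence are hypothetical, the A of the ATG is taken as nucleotide 1, and the nucleotide immediately upstream is taken as -1 with no nucleotide 0, which follows the recommendations but is not spelled out in the summary above. Introns and the region after the stop codon are not handled.

```python
def coding_position(index: int, atg_index: int) -> int:
    """Convert a 0-based sequence index to a coding (c.) position.

    atg_index is the 0-based index of the A of the ATG initiation codon.
    Assumes the A is nucleotide 1 and that there is no nucleotide 0, so the
    base just upstream of the ATG is -1.
    """
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


if __name__ == "__main__":
    # Toy reference sequence (hypothetical): 4 upstream bases, then the ATG.
    reference = "GCCAATGGCT"
    atg_index = reference.index("ATG")
    for i, base in enumerate(reference):
        print(base, coding_position(i, atg_index))
```

Skipping zero means that positions on either side of the initiation codon differ by one more than simple subtraction would suggest, which is worth keeping in mind when converting between coordinate systems.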