VOL 5 | NOVEMBER 2020 | 1403-1407 | Andrew Rambaut, Edward C. Holmes, Áine O'Toole, Verity Hill, John T. McCrone, Christopher Ruis, Louis du Plessis, Oliver G. Pybus
The authors propose a dynamic and rational nomenclature system for SARS-CoV-2 lineages to aid genomic epidemiology during the ongoing pandemic. With tens of thousands of virus genome sequences already generated, there is a need for a coherent naming scheme to track and understand the global spread of the virus. The proposed system uses a phylogenetic framework to identify the lineages that contribute most to active spread, focusing on those spreading to new locations. The system constrains the number and depth of hierarchical lineage labels, and it flags or delabels lineages that become unobserved and are therefore likely inactive.
Key principles include capturing local and global patterns of genetic diversity, tracking emerging lineages, and being robust and flexible enough to accommodate new diversity. The nomenclature system is designed to handle tens to hundreds of thousands of genomes while using no more than 100 active lineage labels. Lineages are classified as 'active,' 'unobserved,' or 'inactive,' and a lineage in either of the latter two categories is reclassified if new evidence appears. The system aims to provide commonly agreed labels for viruses circulating in different parts of the world, facilitating real-time epidemiology and understanding of the virus's dynamics.
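
The three-way status classification can be illustrated with a minimal Python sketch. It assumes that status depends only on the time since a lineage was last sampled. The 30-day and 90-day thresholds, the function name `lineage_status`, and the lineage names and dates are all hypothetical choices made for illustration; this summary does not state them.

```python
from datetime import date, timedelta

# Hypothetical thresholds: the summary does not specify these values.
ACTIVE_WINDOW = timedelta(days=30)
UNOBSERVED_WINDOW = timedelta(days=90)


def lineage_status(last_sampled: date, today: date) -> str:
    """Classify a lineage by the time elapsed since its most recent sample."""
    elapsed = today - last_sampled
    if elapsed <= ACTIVE_WINDOW:
        return "active"
    if elapsed <= UNOBSERVED_WINDOW:
        return "unobserved"
    return "inactive"


# Made-up example data: hypothetical lineage names and last-sample dates.
today = date(2020, 11, 1)
last_seen = {
    "lineage_1": date(2020, 10, 20),
    "lineage_2": date(2020, 9, 1),
    "lineage_3": date(2020, 5, 1),
}
statuses = {name: lineage_status(d, today) for name, d in last_seen.items()}

# New evidence (a fresh sample) moves a lineage back to 'active'.
last_seen["lineage_3"] = date(2020, 10, 28)
updated_status = lineage_status(last_seen["lineage_3"], today)
```

Because the status is recomputed from the most recent sampling date, no separate reassignment step is needed in this sketch: a new sample updates the date, and the next classification returns the new status.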