A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology

NOVEMBER 2020 | Andrew Rambaut, Edward C. Holmes, Áine O'Toole, Verity Hill, John T. McCrone, Christopher Ruis, Louis du Plessis, Oliver G. Pybus
This paper proposes a dynamic nomenclature system for SARS-CoV-2 lineages to assist genomic epidemiology. The system uses a phylogenetic framework to identify the lineages that contribute most to active spread. It stays tractable by limiting the number and depth of hierarchical labels and by flagging lineages that are no longer observed. Its focus is on active lineages and on those spreading to new locations, which helps in tracking and understanding the global spread of SARS-CoV-2.

At the time of writing, over 35,000 SARS-CoV-2 genomes were publicly available, and the number was continuing to grow. However, there was no coherent system for naming and discussing the growing number of phylogenetic lineages, and the authors argue that one is urgently needed to avoid confusion in scientific communication. Classification systems for virus genetic diversity are typically based on clades, which are monophyletic groups on a phylogenetic tree. For a rapidly evolving virus like SARS-CoV-2, a more dynamic system is needed.

The proposed system tracks emerging lineages as they move between countries and populations, and marks each lineage as active, unobserved, or inactive according to how recently it has been observed. Lineage labels are defined by a set of evolutionary and phylogenetic principles. Major lineage labels begin with a letter, and the lineages at the root of the SARS-CoV-2 phylogeny are denoted A and B. Further lineages are defined where there is phylogenetic evidence of emergence from an ancestral lineage into a geographically distinct population. These lineages are assigned numerical values (for example, B.1), with additional sublevels introduced as needed. The system is designed to remain practical, with no more than 100 or 200 active lineage labels at a time.
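The labelling and status rules summarised above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation. The label pattern (a letter prefix followed by dot-separated numbers), the function names, and the 30-day and 90-day cut-offs are assumptions made for the example; the summary says only that status depends on how recently a lineage was observed.

```python
import re
from datetime import date, timedelta

# Assumption: labels look like "A", "B.1" or "B.1.1"
# (a letter prefix followed by dot-separated numeric sublevels).
LABEL_RE = re.compile(r"^[A-Z]+(\.\d+)*$")

# Hypothetical cut-offs, chosen for illustration only.
ACTIVE_WINDOW = timedelta(days=30)
UNOBSERVED_WINDOW = timedelta(days=90)


def parse_label(label):
    """Split a lineage label into its letter prefix and numeric sublevels."""
    if not LABEL_RE.match(label):
        raise ValueError(f"not a valid lineage label: {label!r}")
    prefix, *levels = label.split(".")
    return prefix, [int(level) for level in levels]


def parent_label(label):
    """Return the label one level up, or None for a major lineage such as 'A'."""
    prefix, levels = parse_label(label)
    if not levels:
        return None
    return ".".join([prefix] + [str(level) for level in levels[:-1]])


def lineage_status(last_seen, today):
    """Classify a lineage by the time elapsed since it was last sampled."""
    age = today - last_seen
    if age <= ACTIVE_WINDOW:
        return "active"
    if age <= UNOBSERVED_WINDOW:
        return "unobserved"
    return "inactive"
```

For instance, `parent_label("B.1.1")` gives `"B.1"`, and a lineage last sampled on 1 March is reported as `"unobserved"` on 1 May of the same year under these assumed cut-offs.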
The system supports real-time epidemiology by providing commonly agreed labels for viruses circulating in different parts of the world, which helps reveal links between outbreaks. It also helps describe virus lineages that vary in phenotypic or antigenic properties.

The proposed nomenclature is practical and robust, but phylogenetic inference carries statistical uncertainty. The system applies a genome coverage threshold when new lineages are proposed and requires at least 70% coverage of the coding region for lineage designation. The system is flexible and can be adjusted as the dynamics of lineage generation and extinction become better understood.

The system is intended for tracking currently circulating lineages, not for capturing the entire history of a lineage. It is designed for real-time genomic epidemiology and may be adopted for other viral epidemics. It is expected to be most useful during the global pandemic, which may last a few years. After that time, the remaining endemic or seasonal lineages can retain their names from the dynamic nomenclature system.
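The 70% coding-region coverage requirement mentioned in the summary can be illustrated with a small check. This is a sketch under stated assumptions, not the procedure used by the authors. Coverage is taken here to mean the fraction of coding-region positions with an unambiguous base (A, C, G or T). The function names and the `coding_start` and `coding_end` coordinates are hypothetical and supplied by the caller.

```python
MIN_COVERAGE = 0.70  # threshold quoted in the summary


def coding_coverage(sequence, coding_start, coding_end):
    """Fraction of positions in [coding_start, coding_end) with an unambiguous base.

    Positions missing from a truncated sequence count as uncovered.
    """
    length = coding_end - coding_start
    if length <= 0:
        raise ValueError("coding_end must be greater than coding_start")
    region = sequence[coding_start:coding_end].upper()
    called = sum(base in "ACGT" for base in region)
    return called / length


def meets_coverage_threshold(sequence, coding_start, coding_end, minimum=MIN_COVERAGE):
    """True if the coding-region coverage reaches the minimum threshold."""
    return coding_coverage(sequence, coding_start, coding_end) >= minimum
```

With a toy sequence such as `"NNACGTACGTNNAC"` and a coding region spanning positions 2 to 12, eight of the ten coding positions are called, so the sequence passes the 70% threshold.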