3 April 2024 | Glenn A. Logsdon, Allison N. Rozanski, Fedor Ryabov, Tamara Potapova, Valery A. Shepelev, Claudia R. Catacchio, David Porubsky, Yafei Mao, DongAhn Yoo, Mikko Rautiainen, Sergey Koren, Sergey Nurk, Julian K. Lucas, Kendra Hoekzema, Katherine M. Munson, Jennifer L. Gerton, Adam M. Phillippy, Mario Ventura, Ivan A. Alexandrov, Evan E. Eichler
The study presents a comprehensive analysis of human centromeres, which have traditionally been challenging to sequence and assemble due to their repetitive nature and large size. Using long-read sequencing, the researchers completely sequenced and assembled all centromeres from a second human genome and compared them to the reference genome. They found that the two sets of centromeres show a significant increase in single-nucleotide variation and size variation compared to their unique flanks. Additionally, 45.8% of centromeric sequences cannot be reliably aligned due to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments revealed that 26% of centromeres differ in their kinetochore position by more than 500 kb. To understand evolutionary changes, the researchers sequenced and assembled 31 orthologous centromeres from common chimpanzee, orangutan, and macaque genomes. Comparative analyses revealed a nearly complete turnover of α-satellite HORs, with characteristic changes in each species. Phylogenetic reconstruction supported limited to no recombination between the short and long arms of centromeres and showed that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA. The study highlights the rapid evolution and diversity of human centromeres and the importance of long-read sequencing and assembly technologies in advancing our understanding of these complex regions.