April 2024 | David Yao, Josh Tycko, Jin Woo Oh, Lexi R. Bounds, Sager J. Gosai, Lazaros Lataniotis, Ava Mackay-Smith, Benjamin R. Doughty, Idan Gabdank, Henri Schmidt, Tania Guerrero-Altamirano, Keith Siklenka, Katherine Guo, Alexander D. White, Ingrid Youngworth, Kalina Andreeva, Xingjie Ren, Alejandro Barrera, Yunhai Luo, Galip Gürkan Yardimci, Ryan Tewhey, Anshul Kundaje, William J. Greenleaf, Pardis C. Sabeti, Christina Leslie, Yuri Pritykin, Jill E. Moore, Michael A. Beer, Charles A. Gersbach, Timothy E. Reddy, Yin Shen, Jesse M. Engreitz, Michael C. Bassik & Steven K. Reilly
The ENCODE Consortium's efforts to annotate noncoding cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes. Pooled, noncoding CRISPR screens offer a systematic approach to investigate cis-regulatory mechanisms. The ENCODE4 Functional Characterization Centers conducted 108 screens in human cell lines, comprising >540,000 perturbations across 24.85 megabases of the genome. Using 332 functionally confirmed CRE–gene links in K562 cells, they established guidelines for screening endogenous noncoding elements with CRISPR interference (CRISPRi), including accurate detection of CREs that exhibit variable, often low, transcriptional effects. Benchmarking five screen analysis tools, they found that CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity single guide RNAs. They uncovered a subtle DNA strand bias for CRISPRi in transcribed regions with implications for screen design and analysis. Together, they provide an accessible data resource, predesigned single guide RNAs for targeting 3,275,697 ENCODE SCREEN candidate CREs with CRISPRi and screening guidelines to accelerate functional characterization of the noncoding genome.
The noncoding genome contains critical regulators of gene expression and harbors >90% of trait-associated human genetic variation. Major efforts over the past decade have mapped hundreds of thousands of noncoding candidate cis-regulatory elements (cCREs). Such efforts have relied primarily on mapping sequence conservation and biochemical markers that are correlated with regulatory activity rather than on direct functional characterization. Site-specific, programmable and highly scalable CRISPR genome and epigenome manipulation methods have enabled massively parallel perturbation assays to identify and characterize functional CREs. However, the overlap between CREs (elements with empirically characterized endogenous function) and cCREs (elements nominated by biochemical markers, screens or sequence content) is unknown.
Various CRISPR-based perturbation methods have been developed to determine the effects of different cCREs on target gene expression and/or downstream phenotypes. Systematic benchmarking of noncoding CRISPR screening methods and attempts to harmonize the results have been limited by low numbers of available datasets and inconsistent reporting.
The ENCODE4 Functional Characterization Centers have generated the largest collective dataset of endogenous cCRE perturbation screens to date, including many loci perturbed to saturation in K562 cells, using diverse experimental approaches. Here, they compare noncoding CRISPR screening approaches and provide technical suggestions and data file formats potentially generalizable to such screens. They analyze various CRISPR noncoding screens extensively in K562 cells and other biological systems at each screening stage, including (1) library design, (2) CRISPR perturbation selection, (3) phenotyping strategy and (4) analytical methods. By assembling and jointly analyzing this large repository of bulk