The ENCODE Blacklist: Identification of Problematic Regions of the Genome

27 June 2019 | Haley M. Amemiya, Anshul Kundaje & Alan P. Boyle
The ENCODE Blacklist is a comprehensive set of regions in the human, mouse, worm, and fly genomes that show anomalous, unstructured, or high signal in next-generation sequencing experiments, independent of cell line or experiment. These regions, often associated with repetitive sequences or assembly issues, can introduce substantial noise and bias into functional genomics assays such as ChIP-seq. Removing them is crucial for accurate peak calling and downstream analysis, so that biological conclusions are not compromised by artifact signal. The ENCODE project developed an automated procedure that identifies and flags these regions by applying uniform criteria across multiple samples, using input ChIP-seq data from each organism. Filtering with the resulting blacklists has been shown to reduce background noise and improve the quality of ChIP-seq data, which makes it an essential step for reliable functional genomics studies. Each blacklist is specific to a genome assembly and is integrated into analysis pipelines to improve the accuracy of genomic assays.
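As an illustration of the filtering step described above, the following is a minimal sketch of removing called peaks that overlap blacklisted regions. It is not the ENCODE pipeline's implementation. The function names (`build_index`, `overlaps_blacklist`, `filter_peaks`) and all coordinates are hypothetical, and intervals are assumed to be BED-style, 0-based and half-open.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(blacklist):
    """Group blacklist intervals by chromosome, then sort and merge them."""
    by_chrom = defaultdict(list)
    for chrom, start, end in blacklist:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end:
                # Overlapping or touching: extend the previous interval.
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_blacklist(index, chrom, start, end):
    """Return True if half-open [start, end) overlaps a blacklisted interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Rightmost merged interval that begins before the peak ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def filter_peaks(peaks, blacklist):
    """Keep only the peaks that do not overlap any blacklisted interval."""
    index = build_index(blacklist)
    return [p for p in peaks if not overlaps_blacklist(index, p[0], p[1], p[2])]


# Hypothetical coordinates, for illustration only (not real blacklist entries).
blacklist = [("chr1", 1000, 2000), ("chr1", 1500, 3000), ("chr2", 500, 800)]
peaks = [
    ("chr1", 100, 900),
    ("chr1", 2900, 3100),
    ("chr1", 3000, 3200),
    ("chr2", 700, 900),
    ("chr3", 10, 20),
]
kept = filter_peaks(peaks, blacklist)
print(kept)
```

In this toy input, the peaks at chr1:2900-3100 and chr2:700-900 are dropped because they overlap a blacklisted interval; the peak starting exactly at position 3000 is kept because half-open intervals that only touch do not overlap. In practice, the same operation is commonly done with an existing interval tool, for example `bedtools intersect -v -a peaks.bed -b blacklist.bed`, using the blacklist file that matches the genome assembly of the alignment.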