No country for old methods: New tools for studying microproteins

February 16, 2024 | Fabiola Valdivia-Francia and Ataman Sendoel
Microproteins encoded by small open reading frames (sORFs) have emerged as a significant area of genomics research. sORFs were traditionally overlooked because of their small size, but technologies such as ribosome profiling, mass spectrometry, and computational methods have now enabled the annotation of over 7000 sORFs in the human genome. Despite this progress, only a small fraction of the encoded microproteins have been characterized, and identifying the functionally relevant ones remains a challenge. This review highlights recent advances in sORF research, focusing on new methodologies and computational approaches that facilitate sORF identification and functional characterization. These tools hold promise for understanding the roles of microproteins in cellular processes and disease pathogenesis.

The mammalian genome contains many uncharacterized sORFs, which were often dismissed as "junk DNA." New technologies, such as proteomics and ribosome profiling, along with advanced bioinformatics, have helped show that many sORFs encode functional proteins. These sORFs include those on non-coding RNAs, those overlapping annotated coding sequences, and those in regions such as 5' and 3' UTRs. Recent efforts to standardize sORF catalogs have identified over 7000 human sORFs, suggesting that they are a significant part of eukaryotic genomes. Microproteins are involved in various cellular functions, including tumor suppression and cell proliferation, and hold potential as drug targets.

The translation of sORFs can yield functional peptides or serve a regulatory role, with upstream ORFs (uORFs) being particularly significant. About 50% of mammalian genes contain uORFs, which can modulate ribosome access to downstream ORFs and affect translational efficiency. Under stress, uORF-mediated regulation can induce translation of certain genes, such as ATF4, to mount cellular responses. Genome-wide uORF translation may itself be regulated and contribute to translational programs in embryonic stem cells or during tumor initiation.
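
As a rough illustration of how sORFs and uORFs can be located on a transcript, the following Python sketch scans a sequence for ATG-initiated ORFs of at most 100 codons and flags those that begin upstream of an annotated main ORF. The 100-codon cutoff, the restriction to ATG start codons, and the names `find_sorfs` and `find_uorfs` are assumptions made for this example and are not taken from the review; real annotation pipelines also weigh near-cognate start codons and experimental evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, max_codons=100, min_codons=2):
    """Return (start, end, n_codons) for each ATG-initiated ORF.

    start/end are 0-based, end-exclusive, and end includes the stop codon.
    n_codons counts sense codons (start codon included, stop excluded).
    max_codons=100 is an assumed size cutoff for calling an ORF "small".
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if min_codons <= n_codons <= max_codons:
                    hits.append((start, pos + 3, n_codons))
                break
    return hits


def find_uorfs(transcript, main_cds_start):
    """Keep only sORFs whose start codon lies upstream of the main ORF."""
    return [orf for orf in find_sorfs(transcript) if orf[0] < main_cds_start]


# Toy transcript: a 5' UTR carrying one short uORF, followed by a main ORF.
five_prime_utr = "GG" + "ATGAAATGA" + "CCC"
main_orf = "ATGGCCGCCTAA"
transcript = five_prime_utr + main_orf

print(find_uorfs(transcript, main_cds_start=len(five_prime_utr)))
# → [(2, 11, 2)]
```

The sketch reports every in-frame ATG to stop pair, so nested ORFs sharing a stop codon are listed separately; choosing among them is exactly where translation evidence becomes necessary.
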
New computational approaches and RNA sequencing data have improved transcriptome annotations and facilitated microprotein classification. Tools like PhyloCSF and OpenProt have been developed to adjust classification parameters for ORF annotation. Machine learning tools such as RNASamba and DeepCPP predict sORFs based on sequence patterns and codon bias. These tools help identify sORFs by analyzing ribosome profiling and mass spectrometry data.

Ribosome profiling (Ribo-Seq) provides real-time snapshots of translation by assessing ribosome-protected fragments, aiding in the identification of previously unannotated ORFs. Ribo-Seq has shown that translation occurs in the 5' UTRs of mRNAs and has revealed widespread translation on long non-coding RNAs and other regions. Ribo-Seq combined with inhibitors like harringtonine helps map alternative start sites and identify truncated proteins.

Mass spectrometry (MS) is the gold standard for characterizing the proteome and verifying the presence of microproteins. MS-based proteomics has been
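
To make the Ribo-Seq idea above more concrete, here is a minimal Python sketch of one commonly used signal: ribosome-protected fragments from an actively translated ORF tend to pile up in a single reading frame (three-nucleotide periodicity). The review summary does not describe this calculation, so treat it as an illustrative assumption. The function name `frame_fractions` and the input format (a list of P-site positions already assigned to transcript coordinates) are hypothetical, and real pipelines first have to estimate P-site offsets from read lengths.

```python
from collections import Counter


def frame_fractions(psite_positions, orf_start, orf_end):
    """Fraction of footprint P-sites falling in frames 0, 1 and 2 of an ORF.

    psite_positions: 0-based transcript coordinates, one per footprint
                     (assumed to be P-site corrected already).
    orf_start/orf_end: 0-based, end-exclusive ORF coordinates.
    """
    counts = Counter()
    for pos in psite_positions:
        if orf_start <= pos < orf_end:
            counts[(pos - orf_start) % 3] += 1
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts[frame] / total for frame in range(3))


# Toy data: five footprints inside a candidate sORF at 2..11, one outside it.
psites = [2, 5, 8, 5, 6, 30]
print(frame_fractions(psites, orf_start=2, orf_end=11))
# → (0.8, 0.2, 0.0)
```

A strong bias toward frame 0 is consistent with translation of the candidate ORF, whereas roughly equal fractions are closer to what background or non-ribosomal fragments would give. Any threshold applied to these fractions would be a further assumption.
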