2016 September | Mihaela Pertea, Daehwan Kim, Geo Pertea, Jeffrey T. Leek, Steven L. Salzberg
This protocol describes the use of HISAT, StringTie, and Ballgown for transcript-level expression analysis of RNA-seq experiments. Together, these tools allow scientists to align reads to a genome, assemble transcripts, compute the abundance of each transcript, and identify differentially expressed genes and transcripts. HISAT aligns reads to the genome, StringTie assembles the alignments into transcripts and estimates their expression levels, and Ballgown determines which transcripts are differentially expressed. Starting from raw RNA-seq reads, the protocol produces lists of genes and transcripts, their expression levels, and the genes and transcripts that are differentially expressed. Execution time depends on the available computing resources but is typically under 45 minutes. The protocol is designed for RNA-seq experiments comparing two biological conditions and is demonstrated on human RNA-seq samples, though it applies to any species with a sequenced genome and can be adapted to other experimental designs. It covers software installation, data preprocessing, read alignment, transcript assembly, quantification, and differential expression analysis, gives guidance on using R for data visualization and statistical analysis, and includes troubleshooting tips and examples of output. Familiarity with the Unix command line and basic R scripting is assumed. Alternative analysis packages are mentioned, but the focus is on HISAT, StringTie, and Ballgown. The protocol is efficient: it requires few computational resources while providing accurate results for transcript-level expression analysis.
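The command-line part of the workflow summarized above (align, assemble, merge, quantify) can be sketched as a small Python helper that only builds the command lines and does not run them. This is a minimal sketch under stated assumptions, not the protocol's own script: it assumes the HISAT2 executable `hisat2`, uses `samtools sort` to convert alignments to sorted BAM (a tool not named in the summary), and the function names, file-naming scheme, index path, and annotation path are all hypothetical. The Ballgown step runs in R and is not covered here.

```python
import shlex
from typing import Dict, List


def sample_commands(sample: str, index: str, annotation: str,
                    threads: int = 8) -> Dict[str, List[str]]:
    """Build align/sort/assemble commands for one paired-end sample.

    Assumption: reads are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.bam"
    return {
        # --dta asks HISAT2 to report alignments tailored for transcript assemblers
        "align": ["hisat2", "-p", str(threads), "--dta", "-x", index,
                  "-1", f"{sample}_1.fastq.gz", "-2", f"{sample}_2.fastq.gz",
                  "-S", sam],
        # Assumption: samtools is used to produce the sorted BAM StringTie reads
        "sort": ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # Per-sample assembly guided by a reference annotation (-G)
        "assemble": ["stringtie", "-p", str(threads), "-G", annotation,
                     "-o", f"{sample}.gtf", "-l", sample, bam],
    }


def merge_command(annotation: str, gtf_list: str, merged: str = "merged.gtf",
                  threads: int = 8) -> List[str]:
    """Merge per-sample assemblies; gtf_list is a text file of GTF paths."""
    return ["stringtie", "--merge", "-p", str(threads), "-G", annotation,
            "-o", merged, gtf_list]


def quantify_command(sample: str, merged: str = "merged.gtf",
                     threads: int = 8) -> List[str]:
    """Re-estimate abundances against the merged transcripts.

    -e restricts estimation to the given transcripts; -B writes the
    table files that Ballgown reads.
    """
    return ["stringtie", "-e", "-B", "-p", str(threads), "-G", merged,
            "-o", f"ballgown/{sample}/{sample}.gtf", f"{sample}.bam"]


if __name__ == "__main__":
    # Hypothetical sample name and paths, for illustration only
    for cmd in sample_commands("sampleA", "indexes/genome", "genes.gtf").values():
        print(shlex.join(cmd))
    print(shlex.join(merge_command("genes.gtf", "mergelist.txt")))
    print(shlex.join(quantify_command("sampleA")))
```

Building the commands as argument lists keeps them easy to inspect before anything runs; each list could later be passed to `subprocess.run(cmd, check=True)` once the paths have been checked against the actual data layout.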