
RiboSeq Pipeline Specification

Pipeline Details

  • Name: RiboSeq
  • Pipeline UUID: f9319knx3wbl58m4ti1wos8f7znb0s
  • Version: 1.0.1

Overview

The RiboSeq pipeline processes ribosome profiling (Ribo-seq) data. Ribo-seq sequences ribosome-protected mRNA fragments, giving a snapshot of active translation in a cell. This allows ribosome positions to be mapped precisely on transcripts and provides insight into translation dynamics, ribosome occupancy, and coding potential. The pipeline automates preprocessing, quality control, alignment, quantification, and ORF prediction so that results are reliable and reproducible.

Key Use Cases:

  • Translation Dynamics Analysis: Mapping ribosome positions on transcripts to study active translation.
  • ORF Discovery: Identifying and predicting open reading frames and their translation potential.
  • Ribosome Occupancy Profiling: Quantifying ribosome density across different genomic features.
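To make the ORF discovery use case concrete, the sketch below scans a nucleotide sequence for candidate open reading frames in the three forward frames. It is a naive, sequence-only illustration and not the pipeline's method, which predicts ORFs from Ribo-seq evidence. The names `find_orfs`, `Orf`, and `min_codons` are hypothetical and chosen only for this example.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


class Orf(NamedTuple):
    start: int  # 0-based index of the first base of the start codon
    end: int    # 0-based index one past the last base of the stop codon
    frame: int  # reading frame: 0, 1, or 2


def find_orfs(seq: str, min_codons: int = 2) -> List[Orf]:
    """Return forward-strand ORFs (ATG ... stop) found in all three frames.

    min_codons counts every codon in the ORF, including start and stop.
    """
    seq = seq.upper()
    orfs: List[Orf] = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3
                if n_codons >= min_codons:
                    orfs.append(Orf(start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

A candidate list like this only says where translation could occur. Ribo-seq based prediction goes further by asking whether ribosome footprints actually support translation of each candidate.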

Features

  • Comprehensive Quality Control: Runs FastQC at multiple stages, with optional adapter removal, trimming, and quality filtering.
  • Multiple Quantification Methods: Supports both featureCounts and Salmon for gene and transcript-level quantification.
  • UMI Support: Includes UMI extraction and deduplication so that PCR duplicates are not counted as independent reads.
  • ORF Prediction: Implements ORF detection and translation prediction based on published methods (see References & Additional Documentation).
  • Flexible Read Processing: Handles both single-end and paired-end sequencing data with customizable trimming and filtering parameters.
  • Modular Design: Supports customization with optional preprocessing steps and multiple alignment strategies.
  • Comprehensive Reporting: Generates detailed summary reports and visualizations for each processing step.
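The idea behind UMI deduplication can be sketched in a few lines: reads that share the same mapping position, strand, and UMI are treated as copies of one original molecule, and only one is kept. The code below is a minimal illustration under that assumption, not the pipeline's implementation. The `Read` record and the `dedup_by_umi` function are hypothetical, and UMIs are compared by exact match only, whereas production tools typically also tolerate sequencing errors in the UMI.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    name: str
    chrom: str
    pos: int      # mapping start position
    strand: str   # "+" or "-"
    umi: str      # UMI sequence extracted from the read
    mapq: int     # mapping quality


def dedup_by_umi(reads: Iterable[Read]) -> List[Read]:
    """Keep one read per (chrom, pos, strand, UMI), preferring higher MAPQ."""
    best: Dict[Tuple[str, int, str, str], Read] = {}
    for read in reads:
        key = (read.chrom, read.pos, read.strand, read.umi)
        kept = best.get(key)
        # The first read seen for a key wins ties on mapping quality.
        if kept is None or read.mapq > kept.mapq:
            best[key] = read
    return list(best.values())


if __name__ == "__main__":
    reads = [
        Read("r1", "chr1", 100, "+", "ACGT", 30),
        Read("r2", "chr1", 100, "+", "ACGT", 40),  # duplicate of r1, higher MAPQ
        Read("r3", "chr1", 100, "+", "TTTT", 10),  # same position, different UMI
    ]
    for read in dedup_by_umi(reads):
        print(read.name)
```

Keying on the UMI as well as the position matters for Ribo-seq in particular: footprints are short, so independent fragments frequently map to identical coordinates, and position-only deduplication would discard genuine signal.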

Input/Output Specification

Note: This pipeline uses dynamic input/output configuration. Specific inputs and outputs are defined during pipeline execution based on the selected processing modules.

Inputs

Required

The pipeline accepts standard sequencing inputs including FASTQ files and reference annotations, with specific requirements determined by the selected processing modules and analysis parameters.

Outputs

Reported Outputs

  • Gene Expression Matrix: Quantified gene expression levels from ribosome profiling data.
  • Transcript Expression Matrix: Transcript-level quantification results.
  • ORF Predictions: Identified open reading frames with translation potential scores.
  • Quality Control Reports: Comprehensive QC summaries from FastQC and processing steps.
  • Alignment Statistics: Detailed mapping and alignment summary statistics.
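Raw counts in an expression matrix are not directly comparable between features of different lengths, so downstream analysis usually applies a length-aware normalization. The sketch below shows one common choice, transcripts per million (TPM), computed from raw counts. This document does not state which units the pipeline's matrices use, so the input dictionaries and the function names `reads_per_kilobase` and `tpm` are assumptions made for illustration.

```python
from typing import Dict


def reads_per_kilobase(counts: Dict[str, int],
                       lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Divide each feature's read count by its length in kilobases."""
    return {name: counts[name] / (lengths_bp[name] / 1000) for name in counts}


def tpm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Scale length-normalized counts so they sum to one million."""
    rpk = reads_per_kilobase(counts, lengths_bp)
    total = sum(rpk.values())
    if total == 0:
        # No reads at all: report zero for every feature.
        return {name: 0.0 for name in rpk}
    return {name: value / total * 1e6 for name, value in rpk.items()}


if __name__ == "__main__":
    counts = {"geneA": 100, "geneB": 300}
    lengths_bp = {"geneA": 1000, "geneB": 3000}
    print(tpm(counts, lengths_bp))
```

In this example geneB has three times the reads of geneA but is also three times as long, so both end up with the same normalized value.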

Supporting Outputs

  • Processed BAM Files: Aligned and processed sequencing reads.
  • Summary Tables: Detailed statistics from each processing step.
  • Log Files: Processing logs for debugging and quality assessment.

Associated Processes

References & Additional Documentation

  • Related Papers:
      • "Accurate detection of short and long active ORFs using Ribo-seq data" - https://pubmed.ncbi.nlm.nih.gov/31750902/
      • "Scikit-ribo Enables Accurate Estimation and Robust Modeling of Translation Dynamics at Codon Resolution"
  • Workflow Diagram: Available in pipeline description pages