DE module Pipeline Specification
Pipeline Details
- Name: DE module
- Pipeline UUID: 7107a162a03a4d1fa8a9ae45335e96dd
- Version: 2.1.5
Overview
The DE module pipeline performs differential expression analysis on RNA-sequencing (RNA-seq) data using DESeq2 or Limma Voom. It automates data preprocessing, quality control, statistical analysis, and visualization to produce reproducible differential expression results.
Key use cases:
- Differential Gene Expression Analysis: Identify significantly up- and down-regulated genes between experimental conditions using DESeq2 or Limma Voom algorithms.
- Multi-factor Design Support: Handle complex experimental designs including batch effects and multiple covariates with customizable design formulas.
- Quality Control and Visualization: Generate comprehensive QC reports including PCA plots, count distributions, and sample correlation analyses.
Features
- Support for Multiple DE Methods: Includes DESeq2 and Limma Voom algorithms for robust differential expression analysis.
- Batch Effect Correction: Optional batch correction using CombatSeq package for improved data quality.
- Comprehensive Quality Control: Implements distribution plots, PCA analysis, and reproducibility assessments with detailed visualizations.
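- Flexible Input Modes: Supports 'All' and 'Comparison-only' modes for DESeqDataSet creation. 'All' builds the dataset from every sample, while 'Comparison-only' uses only the samples in the groups being compared; the choice affects dispersion estimation.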
- Multi-factor Design Support: Handles complex experimental designs with custom design formulas and interaction terms.
- Automated Visualization: Generates summary reports, MA plots, volcano plots, and heatmaps for easy interpretation of results.
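The two input modes can be illustrated with a minimal Python sketch. It is not the pipeline's implementation; the function name `select_samples` and the sample and group labels are hypothetical, and it only shows which samples would enter the dataset for one comparison under each mode.

```python
from typing import Dict, List


def select_samples(sample_groups: Dict[str, str], control: str, treat: str,
                   mode: str = "All") -> List[str]:
    """Return the sample names used to build the dataset for one comparison.

    sample_groups maps sample name -> group label (hypothetical structure).
    """
    if mode == "All":
        # Every sample contributes, including groups outside the comparison.
        return list(sample_groups)
    if mode == "Comparison-only":
        # Only samples from the control and treatment groups are kept.
        return [s for s, g in sample_groups.items() if g in (control, treat)]
    raise ValueError(f"unknown mode: {mode!r}")


# Illustrative data: three groups, two samples each.
sample_groups = {
    "s1": "ctrl", "s2": "ctrl",
    "s3": "drugA", "s4": "drugA",
    "s5": "drugB", "s6": "drugB",
}

all_mode = select_samples(sample_groups, "ctrl", "drugA", mode="All")
comparison_only = select_samples(sample_groups, "ctrl", "drugA",
                                 mode="Comparison-only")
print(all_mode)
print(comparison_only)
```

With 'All', dispersion is estimated from more samples; 'Comparison-only' can be preferable when the groups outside the comparison have very different variability.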
Input/Output Specification
Inputs
Required
Counts File
- Description: A tab- or comma-separated file containing gene or transcript counts, with samples as columns and features as rows.
- Format: .tsv, .csv
- Requirements: The first column must contain unique feature names. The header row must contain sample names that match the sample_name column of the groups file.
Groups File
- Description: Sample metadata file specifying experimental conditions and covariates.
- Format: .tsv, .csv
- Required Columns: sample_name, group
- Additional Columns: Optional metadata columns for multi-factor designs and batch correction.
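The requirements on the counts and groups files can be checked before a run. The sketch below is one way to do that with the Python standard library; the helper names `read_table` and `validate_inputs` and the example data are hypothetical, and the pipeline's own validation may differ.

```python
import csv
import io


def read_table(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def validate_inputs(counts_rows, groups_rows):
    """Return a list of problems; an empty list means the two files agree."""
    problems = []

    # Counts file: first column holds feature names, header holds sample names.
    counts_header, *counts_body = counts_rows
    count_samples = counts_header[1:]
    features = [row[0] for row in counts_body]
    if len(set(features)) != len(features):
        problems.append("duplicate feature names in counts file")

    # Groups file: must provide the sample_name and group columns.
    groups_header, *groups_body = groups_rows
    for column in ("sample_name", "group"):
        if column not in groups_header:
            problems.append(f"groups file is missing column {column!r}")

    if "sample_name" in groups_header:
        idx = groups_header.index("sample_name")
        group_samples = [row[idx] for row in groups_body]
        for sample in count_samples:
            if sample not in group_samples:
                problems.append(f"sample {sample!r} is not in the groups file")
        for sample in group_samples:
            if sample not in count_samples:
                problems.append(f"sample {sample!r} is not in the counts file")
    return problems


# Illustrative inputs: a .tsv counts file and a .csv groups file.
counts_text = "gene\ts1\ts2\ts3\nGeneA\t10\t12\t30\nGeneB\t0\t3\t5\n"
groups_text = "sample_name,group\ns1,ctrl\ns2,ctrl\ns3,treated\n"

problems = validate_inputs(read_table(counts_text, "\t"),
                           read_table(groups_text, ","))
print(problems)
```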
Comparison File
- Description: Specifies which groups to compare in differential expression analysis.
- Format: .tsv, .csv
- Required Columns: controls, treats, names
- Optional Column: design (for custom design formulas)
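As an illustration of how the comparison file might be read, the sketch below parses the required columns and falls back to a default design when the optional design column is empty. The function name `load_comparisons`, the default formula `~ group`, and the assumption that each cell of controls and treats holds a single group name are all illustrative and are not confirmed by this specification.

```python
import csv
import io


def load_comparisons(text, delimiter=",", default_design="~ group"):
    """Parse comparison-file text into a list of comparison dicts.

    default_design is an assumed fallback, not a documented pipeline default.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    required = {"controls", "treats", "names"}
    missing = required - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"comparison file is missing columns: {sorted(missing)}")

    comparisons = []
    for row in reader:
        # Use the custom design formula when present, else the fallback.
        design = (row.get("design") or "").strip() or default_design
        comparisons.append({
            "name": row["names"],
            "control": row["controls"],
            "treat": row["treats"],
            "design": design,
        })
    return comparisons


# Illustrative file: the second comparison adds a batch covariate.
comparison_text = (
    "controls,treats,names,design\n"
    "ctrl,drugA,drugA_vs_ctrl,\n"
    "ctrl,drugB,drugB_vs_ctrl,~ batch + group\n"
)

comparisons = load_comparisons(comparison_text)
for comparison in comparisons:
    print(comparison["name"], comparison["design"])
```

Any covariate named in a custom design formula, such as `batch` here, would need a matching column in the groups file.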
Outputs
Reported Outputs
- Differential Expression Results:
  - Description: Statistical results tables with fold changes, p-values, and adjusted p-values
  - Format: .csv
  - Visualization App: DE Browser, Excel
  - Location: Results folder
- Quality Control Reports:
  - Description: PCA plots, count distribution plots, and sample correlation matrices
  - Format: .html, .png
  - Visualization App: Web browser, Image viewer
  - Location: QC folder
- MA and Volcano Plots:
  - Description: Statistical visualization plots showing significance and fold change relationships
  - Format: .png, .pdf
  - Visualization App: Image viewer, PDF viewer
  - Location: Plots folder
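The relationship between significance and fold change that the results tables and volcano plots convey can be sketched as a simple classification rule. The column names `log2FoldChange` and `padj` follow DESeq2's result naming and are assumed here; the pipeline's actual column names, and the cutoffs of 1.0 and 0.05, are illustrative choices rather than documented defaults.

```python
def classify_gene(log2_fold_change, adjusted_p, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label a gene as 'up', 'down', or 'not significant'.

    Cutoffs are illustrative; adjusted_p may be None when no value was reported.
    """
    if adjusted_p is None or adjusted_p >= padj_cutoff:
        return "not significant"
    if log2_fold_change >= lfc_cutoff:
        return "up"
    if log2_fold_change <= -lfc_cutoff:
        return "down"
    # Significant but with a small effect size.
    return "not significant"


# Illustrative rows, as they might be read from a results .csv file.
rows = [
    {"feature": "GeneA", "log2FoldChange": 2.3, "padj": 0.001},
    {"feature": "GeneB", "log2FoldChange": -1.7, "padj": 0.01},
    {"feature": "GeneC", "log2FoldChange": 3.0, "padj": 0.40},
    {"feature": "GeneD", "log2FoldChange": 0.2, "padj": 0.0001},
]

calls = {row["feature"]: classify_gene(row["log2FoldChange"], row["padj"])
         for row in rows}
print(calls)
```

GeneC shows why both criteria matter: its fold change is large, but the adjusted p-value does not pass the cutoff, so it is not called as differentially expressed.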
Supporting Outputs
- Normalized Count Matrices:
  - Description: DESeq2 or Limma Voom normalized expression values
  - Format: .csv
  - Location: Intermediate folder
- Batch Corrected Data:
  - Description: CombatSeq batch-corrected count matrices when batch correction is enabled
  - Format: .csv
  - Location: BatchCorrection folder
Associated Processes
References & Additional Documentation
- Related Papers:
  - DESeq2 paper
  - Limma Voom paper
  - CombatSeq paper
- Software Documentation:
  - DESeq2 documentation
  - Limma documentation