Crispr Screen Pipeline Specification
Pipeline Details
- Name: Crispr Screen
- Pipeline UUID: Kdz5esil4bu1QCajNleLoxbhkTVB4U
- Version: 1.0.0
- View Pipeline:
Overview
The Crispr Screen pipeline analyzes CRISPR screening data using the MAGeCK (Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout) toolkit. It counts guide RNAs in the sequencing reads, then runs statistical tests to identify genes that are significantly enriched or depleted in CRISPR knockout screens.
Key Use Cases:
- Gene Essentiality Screens: Identify essential genes by analyzing guide RNA depletion in cell survival screens.
- Drug Resistance Screens: Discover genes involved in drug resistance or sensitivity by comparing treated versus control conditions.
- Functional Genomics Studies: Systematically analyze gene function through large-scale CRISPR knockout experiments.
Features
- Integrated MAGeCK Workflow: Combines both counting and statistical testing phases of MAGeCK analysis in a single pipeline.
- Automated Report Generation: Produces comprehensive HTML and PDF reports with visualizations and statistical summaries.
- Quality Control Integration: Generates count reports with quality metrics and clustering analysis for data validation.
- Flexible Sample Comparison: Supports treatment versus control comparisons with customizable sample labeling.
- Reproducible Analysis: Implements error handling and retry mechanisms for robust pipeline execution.
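The source does not describe how the retry mechanism is implemented. As a rough illustration only, a retry wrapper around a pipeline step might look like the sketch below. The names `run_with_retry` and `shell_step`, the attempt limit, and the delay are all hypothetical and not taken from the pipeline.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(
    run: Callable[[], int],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> int:
    """Call `run` until it returns exit code 0 or attempts run out.

    Returns the number of attempts used; raises RuntimeError if every
    attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if run() == 0:
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # wait before the next attempt
    raise RuntimeError(f"step failed after {max_attempts} attempts")


def shell_step(command: Sequence[str]) -> Callable[[], int]:
    """Wrap a command line so it can be passed to run_with_retry."""
    return lambda: subprocess.run(list(command)).returncode


# Hypothetical flaky step: fails twice, then succeeds on the third call.
outcomes = iter([1, 1, 0])
attempts_used = run_with_retry(lambda: next(outcomes), max_attempts=3)
print(attempts_used)
```

A real step would be passed as `shell_step([...])` with the actual command line. The in-memory iterator is used here only so that the sketch runs without any external tool.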
Input/Output Specification
Inputs
Required
Library File
- Description: Text file containing guide RNA library information with target gene mappings.
- Format: .txt
- Example File Path: /libs/lib.txt
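To illustrate what "guide RNA library information with target gene mappings" could look like in practice, here is a minimal sketch of a library reader. It assumes a tab-separated file with three columns (sgRNA ID, guide sequence, target gene) and no header line. That layout is an assumption, since the source only states that the file is a .txt file. The function name `read_library` and the example rows are hypothetical.

```python
import csv
import io
from typing import Dict, TextIO, Tuple


def read_library(handle: TextIO) -> Dict[str, Tuple[str, str]]:
    """Map guide sequence -> (sgRNA id, gene) from a tab-separated library.

    Assumed layout (not confirmed by the pipeline docs): id, sequence, gene.
    """
    library: Dict[str, Tuple[str, str]] = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        sgrna_id, sequence, gene = row[0], row[1].upper(), row[2]
        if sequence in library:
            raise ValueError(f"duplicate guide sequence: {sequence}")
        library[sequence] = (sgrna_id, gene)
    return library


# Made-up example rows, used only to exercise the parser.
example = io.StringIO(
    "s1\tACGTACGTACGTACGTACGT\tGENE_A\n"
    "s2\tTTTTACGTACGTACGTAAAA\tGENE_B\n"
)
library = read_library(example)
print(len(library))
```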
FASTQ Reads
- Description: Sequencing reads containing guide RNA sequences from CRISPR screen samples.
- Format: .fastq or .fastq.gz
- Example File Path: /fastq/Treat.fastq, /fastq/Control.fastq
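The sketch below shows the idea behind counting guide RNAs in FASTQ reads: read the sequence line of each record and tally exact matches against a known set of guides. It is a simplified illustration, not MAGeCK's counting algorithm. The fixed guide offset (`start`, `length`), the function names, and the toy reads are all assumptions.

```python
import gzip
import io
from collections import Counter
from typing import Iterable, Iterator, Set, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a plain (.fastq) or gzip-compressed (.fastq.gz) file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_sequences(handle: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for index, line in enumerate(handle):
        if index % 4 == 1:
            yield line.strip().upper()


def count_guides(
    handle: Iterable[str], guides: Set[str], start: int, length: int
) -> Counter:
    """Count reads whose substring at [start, start + length) is a known guide."""
    counts: Counter = Counter()
    for sequence in read_sequences(handle):
        candidate = sequence[start:start + length]
        if candidate in guides:
            counts[candidate] += 1
    return counts


# Toy reads with 4-base "guides" at offset 4, for illustration only.
reads = io.StringIO(
    "@r1\nACGTAAAA\n+\nIIIIIIII\n"
    "@r2\nACGTCCCC\n+\nIIIIIIII\n"
    "@r3\nGGGGAAAA\n+\nIIIIIIII\n"
)
counts = count_guides(reads, {"AAAA", "CCCC"}, start=4, length=4)
print(dict(counts))
```

For a file on disk, the handle would come from `open_fastq(path)` instead of the in-memory example.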
Sample Labels
- Description: Comma-separated labels for treatment and control samples.
- Format: Text string
- Example: "Treat,Control"
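The three required inputs correspond to the arguments of MAGeCK's `count` and `test` subcommands. The sketch below assembles those two command lines from the example values above. How the pipeline actually invokes MAGeCK is not shown in the source, so the helper names and the output prefix `demo` (inferred from the example output paths) are assumptions.

```python
from typing import List, Sequence


def build_count_command(
    library: str, fastqs: Sequence[str], labels: str, prefix: str
) -> List[str]:
    """Build a `mageck count` command; labels is a comma-separated string."""
    label_list = [label.strip() for label in labels.split(",")]
    if len(label_list) != len(fastqs):
        raise ValueError("need exactly one sample label per FASTQ file")
    return [
        "mageck", "count",
        "-l", library,
        "-n", prefix,
        "--sample-label", ",".join(label_list),
        "--fastq", *fastqs,
    ]


def build_test_command(
    count_table: str, treatment: str, control: str, prefix: str
) -> List[str]:
    """Build a `mageck test` command comparing treatment against control."""
    return [
        "mageck", "test",
        "-k", count_table,
        "-t", treatment,
        "-c", control,
        "-n", prefix,
    ]


count_cmd = build_count_command(
    "/libs/lib.txt",
    ["/fastq/Treat.fastq", "/fastq/Control.fastq"],
    "Treat,Control",
    "demo",
)
test_cmd = build_test_command("demo.count.txt", "Treat", "Control", "demo")
print(" ".join(count_cmd))
print(" ".join(test_cmd))
```

Checking that the number of labels matches the number of FASTQ files catches a mislabeled comparison before any counting starts.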
Outputs
Reported Outputs
- Count Matrix:
  - Description: Read count matrix for all guide RNAs across samples
  - Format: .txt
  - Example File Path: /results/demo.count.txt
  - Location: Results folder
- Statistical Test Results:
  - Description: Gene-level statistical analysis results with p-values and fold changes
  - Format: .txt
  - Example File Path: /test/demo.gene_summary.txt
  - Location: Results folder
- Analysis Reports:
  - Description: Comprehensive HTML and PDF reports with visualizations and quality metrics
  - Format: .html, .pdf
  - Example File Path: /results/demo.count_report.html, /test/demo.report.pdf
  - Visualization App: Web browser for HTML reports, PDF viewer for summary reports
  - Location: Results folder
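As a quick sanity check on the count matrix, one can total the reads per sample and count the guides with zero reads. The sketch below assumes the tab-separated layout MAGeCK typically writes, with `sgRNA` and `Gene` columns followed by one column per sample. The function name and the toy table are hypothetical.

```python
import csv
import io
from typing import Dict, TextIO, Tuple


def summarize_counts(handle: TextIO) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (total reads per sample, zero-count guides per sample)."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Every column other than the two identifier columns is a sample.
    samples = [
        name for name in (reader.fieldnames or [])
        if name not in ("sgRNA", "Gene")
    ]
    totals = {sample: 0 for sample in samples}
    zeros = {sample: 0 for sample in samples}
    for row in reader:
        for sample in samples:
            count = int(row[sample])
            totals[sample] += count
            if count == 0:
                zeros[sample] += 1
    return totals, zeros


# Toy count table using the sample labels from the inputs example.
table = io.StringIO(
    "sgRNA\tGene\tTreat\tControl\n"
    "s1\tGENE_A\t10\t0\n"
    "s2\tGENE_A\t5\t7\n"
    "s3\tGENE_B\t0\t3\n"
)
totals, zeros = summarize_counts(table)
print(totals, zeros)
```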
Supporting Outputs
- Count Report:
  - Description: Quality control report for guide RNA counting with clustering analysis
  - Format: .html
  - Example File Path: /results/demo.count_report.html
- Statistical Summary:
  - Description: Detailed statistical analysis output with gene rankings
  - Format: .txt
  - Example File Path: /test/demo.gene_summary.txt
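A common next step is to pull significant genes out of the gene summary. The sketch below assumes the file is tab-separated with an `id` column and the FDR columns `neg|fdr` (depletion) and `pos|fdr` (enrichment), which is the layout MAGeCK typically produces. The 0.05 cutoff, the function name, and the toy rows are illustrative choices, not pipeline defaults.

```python
import csv
import io
from typing import List, TextIO, Tuple


def significant_genes(
    handle: TextIO, direction: str = "neg", fdr_cutoff: float = 0.05
) -> List[Tuple[str, float]]:
    """Return (gene, FDR) pairs at or below the cutoff, lowest FDR first.

    direction is "neg" for depleted genes or "pos" for enriched genes.
    """
    column = f"{direction}|fdr"
    reader = csv.DictReader(handle, delimiter="\t")
    hits = [
        (row["id"], float(row[column]))
        for row in reader
        if float(row[column]) <= fdr_cutoff
    ]
    return sorted(hits, key=lambda hit: hit[1])


# Made-up summary rows containing only the columns this sketch reads.
summary_text = (
    "id\tneg|fdr\tpos|fdr\n"
    "GENE_A\t0.001\t0.9\n"
    "GENE_B\t0.5\t0.02\n"
    "GENE_C\t0.04\t0.8\n"
)
depleted = significant_genes(io.StringIO(summary_text), direction="neg")
enriched = significant_genes(io.StringIO(summary_text), direction="pos")
print(depleted, enriched)
```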
Associated Processes
References & Additional Documentation
- Related Papers: Li, W., Xu, H., Xiao, T. et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol 15, 554 (2014).
- MAGeCK Documentation: https://sourceforge.net/projects/mageck/
- Pipeline Repository: https://github.com/dolphinnext/dcrispr