FastQC Specifications
Process Details
- Name: FastQC
- Process UUID: 20bdb3aa-ed05-11e8-9ce4-85221b21564c
- Process Group: QC
Overview
FastQC provides a simple way to perform quality control checks on raw sequence data from high-throughput sequencing pipelines. This process runs FastQC on FASTQ files and generates quality reports that help researchers assess their sequencing data before downstream analysis.
This process is implemented in Bash.
Key Functionality
- Quality Assessment: Performs comprehensive quality control analysis on raw sequencing reads
- Report Generation: Creates detailed HTML reports with quality metrics and visualizations
- Data Organization: Organizes input reads into a structured directory format for downstream processing
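The steps listed above can be sketched in Bash roughly as follows. This is a minimal sketch under stated assumptions, not the process's actual script: the helper names `organize_reads` and `run_fastqc` and the directory names used in the usage note are hypothetical. Only the `fastqc` command and its `--outdir` option come from the FastQC tool itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: copy the input FASTQ files into one directory
# so that later steps find them in a single, predictable place.
# Usage: organize_reads <target_dir> <fastq> [<fastq> ...]
organize_reads() {
    local target_dir="$1"
    shift
    mkdir -p "$target_dir"
    local f
    for f in "$@"; do
        cp -- "$f" "$target_dir/"
    done
}

# Hypothetical helper: run FastQC on every file in <reads_dir> and
# write the reports to <report_dir>.
# Usage: run_fastqc <reads_dir> <report_dir>
run_fastqc() {
    local reads_dir="$1" report_dir="$2"
    mkdir -p "$report_dir"
    fastqc --outdir "$report_dir" "$reads_dir"/*
}
```

A call such as `organize_reads reads sample_R1.fastq sample_R2.fastq` followed by `run_fastqc reads fastqc_out` would then cover the organization and report-generation steps, assuming `fastqc` is on the `PATH`.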
Input/Output Specification
Inputs
Required Inputs
- mate
  - Description: Mate pair information for paired-end sequencing data
  - Format: mate
- reads
  - Description: Raw sequencing reads to be analyzed for quality control
  - Format: fastq
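To illustrate how the two inputs relate, the following sketch checks that the number of FASTQ files matches the mate value. The allowed values `single` and `pair` and the helper name `check_mate_reads` are assumptions for illustration; this specification does not list the values that mate may take.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: verify that the number of FASTQ files fits the mate value.
# Assumed mate values: "single" (one file) or "pair" (two files).
# Usage: check_mate_reads <mate> <fastq> [<fastq>]
check_mate_reads() {
    local mate="$1"
    shift
    local expected
    case "$mate" in
        single) expected=1 ;;
        pair)   expected=2 ;;
        *)
            echo "unknown mate value: $mate" >&2
            return 2
            ;;
    esac
    if [ "$#" -ne "$expected" ]; then
        echo "mate=$mate expects $expected file(s), got $#" >&2
        return 1
    fi
}
```

Under these assumptions, `check_mate_reads pair s1_R1.fastq s1_R2.fastq` succeeds, while passing a single file with `pair` returns a non-zero status and prints a message to standard error.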
Outputs
- FastQCout
  - Description: Comprehensive quality control report containing analysis results and visualizations
  - Format: html
- reads
  - Description: Original sequencing reads organized in a structured directory format
  - Format: fastq
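FastQC names each HTML report after its input file: it strips the sequence-file suffix and appends `_fastqc.html`. The sketch below predicts the report name for the common `.fastq`, `.fq`, and gzip-compressed cases. The helper name `fastqc_report_name` is hypothetical, and other suffixes that FastQC recognizes are deliberately not handled here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: derive the HTML report name FastQC would write
# for a given FASTQ file. Covers only .fastq, .fq and their .gz variants.
# Usage: fastqc_report_name <path/to/file.fastq[.gz]>
fastqc_report_name() {
    local base
    base="$(basename "$1")"
    base="${base%.gz}"      # drop a trailing .gz, if present
    base="${base%.fastq}"   # drop a trailing .fastq, if present
    base="${base%.fq}"      # drop a trailing .fq, if present
    echo "${base}_fastqc.html"
}
```

For instance, `fastqc_report_name data/s1.R1.fastq.gz` prints `s1.R1_fastqc.html`, which can be used to check that the expected FastQCout file exists after the run.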
Parameters & Settings
This process is conditional: it runs only when the run_FastQC parameter in the pipeline workflow enables it.
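A minimal sketch of such a gate is shown here. It assumes that `run_FastQC` reaches the script as a string and that `yes` is the enabling value. Both the accepted value and the helper name `should_run_fastqc` are assumptions, not taken from the pipeline definition.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical gate: succeed only when the setting equals "yes".
# The enabling value "yes" is an assumption made for this sketch.
should_run_fastqc() {
    [ "${1:-no}" = "yes" ]
}

# Example wiring: read the setting from an environment variable
# (assumed name), defaulting to "no" when it is unset.
run_FastQC="${run_FastQC:-no}"
if should_run_fastqc "$run_FastQC"; then
    echo "FastQC enabled"
else
    echo "FastQC skipped"
fi
```

Keeping the check in its own function makes it easy to change the accepted values once the real parameter format is known.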
References & Resources
- Tool Documentation: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- Related Papers: Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc