Kraken2 Specifications
Process Details
- Name: kraken2
- Process UUID: f931wfyxpqpzg9ay3ri3lzruvy43ns
- Process Group: short_read_taxonomy
Overview
This process performs taxonomic classification of sequencing reads using the Kraken2 algorithm and a reference database. It assigns taxonomic labels to DNA sequences by comparing k-mers in the input reads against a pre-built taxonomic database, enabling rapid species identification and microbiome analysis.
The process is implemented in Bash and runs the containerized Kraken2 tool to perform the classification.
Key Functionality
- Taxonomic Classification: Assigns taxonomic labels to input sequencing reads using k-mer matching against a reference database
- Report Generation: Creates taxonomic reports that summarize classification results as per-taxon read counts and percentages
- Multi-threading Support: Leverages available CPU cores for accelerated processing of large datasets
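The functionality listed above maps onto a single `kraken2` call. The following is a minimal dry-run sketch, not this process's actual script: the function name `build_kraken2_cmd`, the argument order, and the output file names are assumptions made for illustration. Only the flags `--db`, `--threads`, `--paired`, `--report`, `--output`, and `--gzip-compressed` are standard Kraken2 options. The function prints the command instead of running it, so it can be inspected without Kraken2 installed.

```shell
# Sketch only: names and file suffixes are hypothetical, not from this process.
# Print a kraken2 command line for one paired-end sample (dry run).
# Arguments: database dir, sample name, R1 FASTQ, R2 FASTQ, thread count.
build_kraken2_cmd() (
  db=$1 sample=$2 r1=$3 r2=$4 threads=${5:-1}

  # Ask kraken2 to read gzip input directly when the reads end in .gz.
  gz_flag=""
  case "$r1" in
    *.gz) gz_flag=" --gzip-compressed" ;;
  esac

  # --paired      treat R1/R2 as mates of the same fragment
  # --report      per-taxon summary report
  # --output      per-read classification results
  printf 'kraken2 --db %s --threads %s --paired --report %s --output %s%s %s %s\n' \
    "$db" "$threads" \
    "${sample}.kraken2.report.tsv" \
    "${sample}.kraken2.classified.tsv" \
    "$gz_flag" "$r1" "$r2"
)

# Example: show the command for a hypothetical sample using 4 threads.
build_kraken2_cmd /data/kraken2_db sample1 sample1_R1.fastq.gz sample1_R2.fastq.gz 4
```

The printed line is for display only and does not quote paths containing spaces. The thread count is passed in explicitly because how this process discovers available CPU cores is not stated here.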
Input/Output Specification
Inputs
Required Inputs
- Kraken Database
  - Description: Pre-built Kraken2 taxonomic database containing k-mer to taxonomy mappings
  - Format: kraken format
- Reads
  - Description: Paired-end sequencing reads to be taxonomically classified
  - Format: FASTQ format
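Before classification starts, it can help to confirm that the database input is usable. A built Kraken2 database normally contains the files `hash.k2d`, `opts.k2d`, and `taxo.k2d`. The sketch below assumes the database is provided as a directory; how it is actually delivered to this process is not specified here, and the function name `check_kraken2_db` is hypothetical.

```shell
# Sketch only: assumes the Kraken Database input is a directory.
# Return 0 if the directory holds the three files a built Kraken2 database
# normally contains; otherwise report what is missing on stderr and return 1.
check_kraken2_db() {
  db_dir=$1
  missing=0
  for f in hash.k2d opts.k2d taxo.k2d; do
    # -s is true only for files that exist and are non-empty.
    if [ ! -s "$db_dir/$f" ]; then
      echo "missing or empty: $db_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A caller could run `check_kraken2_db "$DB_DIR" || exit 1` before invoking `kraken2`, where `DB_DIR` is a placeholder for wherever the database is staged.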
Outputs
- Classification Results
  - Description: Taxonomic classification results for each input read
  - Format: TSV format
- Taxonomic Reports
  - Description: Summary reports showing taxonomic abundance and classification statistics
  - Format: TSV format
- Log Output
  - Description: Processing log containing runtime information and statistics
  - Format: OUT format
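Both TSV outputs can be inspected with standard tools. The sketch below assumes Kraken2's default layouts: the report has six tab-separated columns (percentage, reads in clade, reads assigned directly, rank code, NCBI taxid, indented name), and the per-read output starts with `C` or `U` for classified or unclassified. The function names are hypothetical, and the layouts would differ if this process enables non-default report options.

```shell
# Sketch only: assumes Kraken2's default report and per-read output layouts.

# Print "percent<TAB>taxid<TAB>name" for species-rank rows of a report.
species_from_report() {
  awk -F '\t' '$4 == "S" {
    pct = $1
    gsub(/ /, "", pct)       # the percentage column is space-padded
    name = $6
    sub(/^ +/, "", name)     # leading spaces encode depth in the taxonomy
    printf "%s\t%s\t%s\n", pct, $5, name
  }' "$1"
}

# Print "classified/total" read counts from the per-read results file.
classified_fraction() {
  awk -F '\t' '$1 == "C" { c++ } END { if (NR > 0) printf "%d/%d\n", c, NR }' "$1"
}
```

For example, `species_from_report sample1.kraken2.report.tsv | sort -k1,1nr | head` would list the species with the largest share of reads, given a report file of that (hypothetical) name.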
References & Resources
- Tool Documentation: Contact the team for details on Kraken2 classification parameters
- Related Papers: Wood, D.E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0