Kraken2 Specifications

Process Details

  • Name: kraken2
  • Process UUID: f931wfyxpqpzg9ay3ri3lzruvy43ns
  • Process Group: short_read_taxonomy

Overview

This process performs taxonomic classification of sequencing reads using the Kraken2 algorithm and a reference database. It assigns taxonomic labels to DNA sequences by comparing k-mers in the input reads against a pre-built taxonomic database, enabling rapid species identification and microbiome analysis.

The process is implemented as a Bash script that runs the containerized Kraken2 tool.

Key Functionality

  • Taxonomic Classification: Assigns taxonomic labels to input sequencing reads using k-mer matching against a reference database
  • Report Generation: Creates taxonomic reports summarizing the classification results, with read counts and percentages per taxon
  • Multi-threading Support: Uses multiple CPU threads to speed up processing of large datasets
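
The k-mer matching idea in the first bullet can be illustrated with a toy sketch. This is a deliberate simplification and not Kraken2's actual algorithm: Kraken2 uses minimizers stored in a compact hash table, maps each hit to a lowest common ancestor (LCA) taxon, and scores paths in the taxonomy tree, whereas the sketch below takes a plain majority vote over exact k-mer hits. The names `kmers`, `classify`, and `toy_db` are hypothetical, and the database contents are invented for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def classify(read, kmer_to_taxon, k):
    """Return the taxon with the most k-mer hits, or None if nothing matches.

    Simplified majority vote; Kraken2 itself resolves hits through the
    taxonomy tree rather than counting labels directly.
    """
    hits = Counter(
        kmer_to_taxon[km] for km in kmers(read, k) if km in kmer_to_taxon
    )
    if not hits:
        return None  # analogous to an unclassified read
    return hits.most_common(1)[0][0]


# Hypothetical toy database: k-mer -> taxon label (invented values).
toy_db = {"ACGT": "taxon_A", "CGTA": "taxon_A", "TTTT": "taxon_B"}

print(classify("ACGTA", toy_db, k=4))
print(classify("GGGGG", toy_db, k=4))
```

A read whose k-mers never appear in the database stays unlabeled, which mirrors how unclassified reads arise when the reference database lacks a related organism.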

Input/Output Specification

Inputs

Required Inputs

  • Kraken Database

    • Description: Pre-built Kraken2 taxonomic database containing k-mer to taxonomy mappings
    • Format: Kraken2 database format (typically a directory containing hash.k2d, opts.k2d, and taxo.k2d)
  • Reads

    • Description: Paired-end sequencing reads to be taxonomically classified
    • Format: FASTQ format
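
As a rough sketch of how the inputs above might be handed to the tool, the snippet below assembles a kraken2 argument list for paired-end reads without executing it. The flags shown (`--db`, `--threads`, `--paired`, `--output`, `--report`) are standard kraken2 command-line options, but the exact invocation this process uses is not documented here, so treat the flag selection as an assumption. The function name `build_kraken2_command` and all file names are hypothetical.

```python
import shlex


def build_kraken2_command(db_dir, reads_1, reads_2,
                          output_path, report_path, threads=4):
    """Assemble a kraken2 argument list for paired-end reads.

    Nothing is executed here; the list can be passed to subprocess.run
    or printed for inspection.
    """
    return [
        "kraken2",
        "--db", db_dir,              # pre-built Kraken2 database
        "--threads", str(threads),   # number of CPU threads
        "--paired",                  # the two FASTQ files are mates
        "--output", output_path,     # per-read classification results
        "--report", report_path,     # per-taxon summary report
        reads_1, reads_2,
    ]


# Hypothetical paths, for illustration only.
cmd = build_kraken2_command(
    "kraken_db", "sample_R1.fastq", "sample_R2.fastq",
    "sample.kraken2.tsv", "sample.report.tsv", threads=8,
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.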

Outputs

  • Classification Results

    • Description: Taxonomic classification results for each input read
    • Format: TSV format
  • Taxonomic Reports

    • Description: Summary reports showing taxonomic abundance and classification statistics
    • Format: TSV format
  • Log Output

    • Description: Processing log containing runtime information and statistics
    • Format: Plain-text log (.out)
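
The TSV outputs above could be read along the following lines. This sketch assumes the files follow Kraken2's standard layouts, which this page does not confirm: a six-column report (percentage, clade read count, directly assigned read count, rank code, NCBI taxonomy ID, indented scientific name) and a per-read output whose first column is `C` (classified) or `U` (unclassified). The helper names and the sample rows are invented for illustration.

```python
import csv
import io
from typing import NamedTuple


class ReportRow(NamedTuple):
    percent: float      # percentage of reads in this clade
    clade_reads: int    # reads assigned to this taxon or its descendants
    direct_reads: int   # reads assigned directly to this taxon
    rank: str           # rank code, e.g. "U", "R", "G", "S"
    taxid: int          # NCBI taxonomy ID
    name: str           # scientific name, indentation stripped


def parse_report(handle):
    """Parse a standard six-column Kraken2 report into ReportRow tuples."""
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != 6:
            continue  # skip blank lines or non-standard report variants
        pct, clade, direct, rank, taxid, name = fields
        rows.append(ReportRow(float(pct), int(clade), int(direct),
                              rank, int(taxid), name.strip()))
    return rows


def classified_fraction(handle):
    """Fraction of reads flagged 'C' in the first column of per-read output."""
    total = classified = 0
    for line in handle:
        if not line.strip():
            continue
        total += 1
        if line.split("\t", 1)[0] == "C":
            classified += 1
    return classified / total if total else 0.0


# Invented sample data, not real results.
sample_report = (
    " 20.00\t20\t20\tU\t0\tunclassified\n"
    " 80.00\t80\t5\tR\t1\troot\n"
    " 60.00\t60\t60\tS\t562\t        Escherichia coli\n"
)
sample_reads = (
    "C\tread1\t562\t150|150\t562:116 |:| 562:116\n"
    "U\tread2\t0\t150|150\t0:116 |:| 0:116\n"
)

rows = parse_report(io.StringIO(sample_report))
species = [r for r in rows if r.rank == "S"]
print(species[0].name, species[0].clade_reads)
print(classified_fraction(io.StringIO(sample_reads)))
```

Filtering on the rank code is a simple way to pull out species-level rows; if the reports were produced with additional options, the column count may differ and the parser would need adjusting.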

References & Resources

  • Tool Documentation: Contact the team for details on Kraken2 classification parameters
  • Related Papers: Wood, D.E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0