Draft Consensus Calling Specifications

Process Details

  • Name: draftConsensusCalling
  • Process UUID: e2c4e4cc81d0411f8ce4089419ce889f
  • Process Group: uminator

Overview

This process generates draft consensus sequences from UMI-tagged reads. It consolidates reads that share a UMI and then builds a consensus sequence for each UMI group using alignment-based methods. The process is part of the UMInator workflow, which processes unique molecular identifier (UMI) data to reduce sequencing errors and improve the accuracy of variant calling.

The process is implemented as a Bash script that invokes an R script for consensus sequence generation and alignment operations.
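
As an illustration of the consolidation step, the Python sketch below merges reads that share a UMI across chunks and writes each group out as FASTA. It is not the Bash or R code of the process itself: the in-memory chunk layout (a dictionary mapping each UMI to FASTQ text) and the function names parse_fastq, consolidate_chunks, and to_fasta are assumptions made for this example.

```python
from collections import defaultdict


def parse_fastq(text):
    """Yield (read_id, sequence) pairs from 4-line FASTQ records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")
    for i in range(0, len(lines), 4):
        header, seq, plus, _qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record number {i // 4 + 1}")
        # Keep only the read ID (text up to the first whitespace).
        yield header[1:].split()[0], seq


def consolidate_chunks(chunks):
    """Merge per-chunk {umi: fastq_text} mappings into {umi: [(id, seq), ...]}.

    The chunk layout is hypothetical; the real process works on files.
    """
    merged = defaultdict(list)
    for chunk in chunks:
        for umi, fastq_text in chunk.items():
            merged[umi].extend(parse_fastq(fastq_text))
    return dict(merged)


def to_fasta(records):
    """Render (read_id, sequence) pairs as FASTA text, dropping qualities."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq in records)


# Two chunks that both contain reads for the same (made-up) UMI label.
chunk_a = {"umi_1": "@read1\nACGT\n+\nIIII\n"}
chunk_b = {
    "umi_1": "@read2 extra\nACGA\n+\nIIII\n",
    "umi_2": "@read3\nTTTT\n+\nIIII\n",
}
merged = consolidate_chunks([chunk_a, chunk_b])
print(to_fasta(merged["umi_1"]), end="")
```

Grouping by UMI before conversion keeps one FASTA per UMI group, which is the unit the consensus step operates on.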

Key Functionality

  • UMI Read Consolidation: Concatenates reads from different chunks that share the same UMI
  • Format Conversion: Converts FASTQ files to FASTA format for downstream consensus calling
  • Draft Consensus Generation: Creates consensus sequences from aligned UMI-grouped reads using configurable parameters for read coverage and plurality thresholds
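
To make the coverage and plurality parameters concrete, the following Python sketch calls a consensus from reads that are already aligned to equal length. It is a simplified stand-in, not the logic of the R script: the function name plurality_consensus, the parameter names min_coverage and plurality, their default values, and the handling of gaps and ambiguous columns are all assumptions.

```python
from collections import Counter


def plurality_consensus(aligned_reads, min_coverage=3, plurality=0.5):
    """Call a per-column consensus from equal-length aligned reads.

    Assumed semantics (hypothetical, for illustration only):
    - UMI groups with fewer than min_coverage reads yield None.
    - A column emits its most common symbol when that symbol's share of
      the reads is at least `plurality`; otherwise it emits "N".
    - A column whose winning symbol is a gap ("-") is omitted.
    Ties are resolved in favor of the symbol seen first in the column.
    """
    if not aligned_reads or len(aligned_reads) < min_coverage:
        return None
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        symbol, count = Counter(column).most_common(1)[0]
        if count / len(column) < plurality:
            consensus.append("N")
        elif symbol != "-":
            consensus.append(symbol)
    return "".join(consensus)


# Four aligned reads from one UMI group; the third carries two errors.
reads = ["ACGT-A", "ACGT-A", "ACCTGA", "ACGT-A"]
draft = plurality_consensus(reads, min_coverage=3, plurality=0.5)
print(draft)
```

Raising min_coverage discards small UMI groups, while raising plurality makes the call stricter and turns more columns into "N".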

Input/Output Specification

Inputs

Required Inputs

  • UMI Dataset

    • Description: Set of UMI-tagged sequencing reads organized by sample and UMI
    • Format: UMI file set
  • Output Directory

    • Description: Directory structure for organizing consensus calling results
    • Format: Directory

Outputs

  • UMI Dataset

    • Description: Processed UMI dataset with draft consensus sequences generated for each UMI group
    • Format: UMI file set
  • Output Directory

    • Description: Directory containing organized consensus calling results and intermediate files
    • Format: Directory

References & Resources

  • Tool Documentation: Contact the team for details on Obtain_draft_consensus.R