Meta-CAMP Pipeline Specification

Pipeline Details

  • Name: Meta-CAMP
  • Pipeline UUID: 6d34fe61976640178ed79dd450bd0a64
  • Version: 1.1.1

Overview

The Meta-CAMP pipeline is designed for dynamic and educational analyses of metagenomes, bacterial isolates, and microbial communities. The MetaSUB Core Modular Analysis Pipeline (CAMP) is a software toolkit that serves as the primary analytic workflow for the MetaSUB Consortium. Its core philosophy is modularity: each step is a consistently documented and parameterized process, which gives users full control over, and a clear understanding of, their bioinformatic analyses.

Key Use Cases:

  • Metagenomic Community Analysis: Comprehensive taxonomic profiling and functional analysis of microbial communities from environmental samples.
  • MAG (Metagenome-Assembled Genome) Reconstruction: Binning and quality assessment of metagenome-assembled genomes using multiple algorithms.
  • Gene Cataloguing and Functional Annotation: Identification, clustering, and functional annotation of open reading frames across samples.

Features

  • Modular Design: Each analytical step is defined as a single, consistently documented and parameterized process, allowing for complete workflow customization.
  • Multiple Taxonomic Classification Tools: Integrates MetaPhlAn4, Kraken2/Bracken, and XTree for comprehensive taxonomic profiling with standardized output formats.
  • Comprehensive Quality Control: Uses FastQC, MultiQC, and fastp for read quality assessment, AdapterRemoval for adapter removal, and Bowtie2 for host read removal.
  • Multi-Algorithm MAG Binning: Utilizes six binning algorithms (MetaBAT2, CONCOCT, SemiBin2, MaxBin2, VAMB, MetaBinner) with DAS Tool ensemble refinement.
  • Dual Assembly Options: Supports both MetaSPAdes and MegaHIT assemblers with optional metaviral and plasmid assembly modes.
  • Advanced Gene Cataloguing: Uses Bakta for ORF identification, MMSeqs for clustering, and Diamond for functional profiling.
  • Interactive Visualization: Generates outputs compatible with microViz and animalcules apps for comparative analysis.
  • Error Correction: Implements BayesHammer for sequencing error correction.

Input/Output Specification

Inputs

Required

Reads

  • Description: Paired forward and reverse reads, grouped into a single collection drawn from one type of source (e.g., mouse, human)
  • Format: .fastq.gz
  • Example File Path: /path/to/reads/sample_1.fastq.gz, /path/to/reads/sample_2.fastq.gz
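
As a minimal sketch of how the forward and reverse files could be grouped before upload, the snippet below pairs files by name. It assumes the `_1`/`_2` suffix convention shown in the example paths; the function name `pair_reads` is hypothetical and not part of the pipeline.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed naming convention (taken from the example paths above):
# <sample>_1.fastq.gz is the forward file, <sample>_2.fastq.gz the reverse file.
READ_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq\.gz$")


def pair_reads(paths):
    """Group read files into {sample: (forward, reverse)}.

    Raises ValueError if a file name does not follow the assumed convention
    or if a sample lacks one of its two mates.
    """
    mates = defaultdict(dict)
    for path in paths:
        match = READ_PATTERN.match(PurePosixPath(path).name)
        if match is None:
            raise ValueError(f"Unexpected read file name: {path}")
        mates[match["sample"]][match["mate"]] = path

    pairs = {}
    for sample, found in mates.items():
        if set(found) != {"1", "2"}:
            raise ValueError(f"Sample {sample!r} is missing a forward or reverse file")
        pairs[sample] = (found["1"], found["2"])
    return pairs
```

Calling `pair_reads` on the two example paths yields a single entry keyed by `sample`, holding the forward and reverse paths in that order.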

Host Genome

  • Description: Reference genome selection that determines which databases are used in the pipeline for host read removal
  • Format: Bowtie2 index files
  • Options: Human (GRCh38), Mouse (mm10), or other available reference genomes

Adapter File

  • Description: Sequencing adapters for trimming adapter sequences from insert DNA
  • Format: .txt
  • Example File Path: /path/to/adapters/adapters.txt

Metadata

  • Description: Sample metadata for the microViz and animalcules applications. The first column should contain sample names; additional columns can describe sample features (age, sex, disease, etc.)
  • Format: Tab-separated values (.tsv)
  • Example File Path: /path/to/metadata/sample_metadata.tsv
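
The layout described above (sample names in the first column, features in the remaining columns) can be checked before a run. The following is only a sketch: the function name `load_metadata` and the specific checks (duplicate names, ragged rows) are assumptions, not documented pipeline behavior.

```python
import csv


def load_metadata(handle):
    """Read a tab-separated metadata table into {sample_name: {feature: value}}.

    The first column is treated as the sample name, as the specification
    requires. Duplicate names and rows with the wrong number of fields are
    rejected (an assumed, stricter-than-documented check).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = {}
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"Line {line_number}: expected {len(header)} fields, got {len(row)}")
        name = row[0]
        if name in samples:
            raise ValueError(f"Line {line_number}: duplicate sample name {name!r}")
        samples[name] = dict(zip(header[1:], row[1:]))
    return samples
```

A typical call would be `load_metadata(open("/path/to/metadata/sample_metadata.tsv", newline=""))`.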

Optional Inputs

Binner Tool Selections

  • Description: Selection of at least three of the six available binning tools (MetaBAT2, CONCOCT, SemiBin2, MaxBin2, VAMB, MetaBinner) for accurate bin creation
  • Default: All six tools selected
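
The selection rule above (at least three of the six tools, all six by default) can be expressed as a small validation step. This is an illustrative sketch; the names `AVAILABLE_BINNERS` and `validate_binners` are hypothetical and the pipeline's own validation may differ.

```python
# Tool names as listed in this specification.
AVAILABLE_BINNERS = ("MetaBAT2", "CONCOCT", "SemiBin2", "MaxBin2", "VAMB", "MetaBinner")
MIN_BINNERS = 3


def validate_binners(selected=None):
    """Return the binners to run, applying the documented default and minimum."""
    if selected is None:
        return list(AVAILABLE_BINNERS)  # default: all six tools

    chosen = list(dict.fromkeys(selected))  # drop duplicates, keep order
    unknown = [name for name in chosen if name not in AVAILABLE_BINNERS]
    if unknown:
        raise ValueError(f"Unknown binning tools: {', '.join(unknown)}")
    if len(chosen) < MIN_BINNERS:
        raise ValueError(f"Select at least {MIN_BINNERS} binning tools, got {len(chosen)}")
    return chosen
```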

MicroViz Analysis App Selection

  • Description: Option to run microViz comparative analysis (requires 3 or more different samples)
  • Format: Boolean selection
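
The sample-count requirement for microViz can be sketched as a simple guard. The function name is hypothetical, and counting distinct sample names is an assumed reading of "different samples".

```python
MIN_MICROVIZ_SAMPLES = 3  # minimum stated in this specification


def should_run_microviz(requested, sample_names):
    """Return True only if microViz was requested and enough distinct samples exist."""
    return bool(requested) and len(set(sample_names)) >= MIN_MICROVIZ_SAMPLES
```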

Outputs

Reported Outputs

  • Short Read Quality Control Reports:
      • Description: Pre- and post-processing quality control reports
      • Format: .html (MultiQC reports)
      • Location: summary/fastqc_pre/, summary/fastqc_post/
      • Visualization App: MultiQC

  • Taxonomic Profiling Results:
      • Description: Standardized taxonomic abundance tables at multiple taxonomic levels (species, genus, family, order, class, phylum)
      • Format: .csv (XTree, Kraken2/Bracken, MetaPhlAn outputs)
      • Location: final_reports/
      • Visualization App: Pavian, Krona, animalcules

  • Assembly Files:
      • Description: Assembled contigs from MetaSPAdes and/or MegaHIT
      • Format: .fasta.gz
      • Location: assembly/
      • Visualization App: MetaQUAST reports (.html)

  • MAG Binning Results:
      • Description: Refined metagenome-assembled genomes from DAS Tool consensus binning
      • Format: .fa
      • Location: bins/
      • Visualization App: CheckM2, GTDB-Tk classification reports

  • Gene Cataloguing Outputs:
      • Description: ORF cluster tables including relative abundance, counts, sizes, and functional annotations
      • Format: .csv, .tsv
      • Location: final_reports/
      • Files: orf_cluster_sizes.csv, orf_rel_abund.tsv, orf_read_cts.tsv, orf_annotations.tsv
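
After a run, the gene cataloguing files listed above can be checked for presence. The sketch below assumes the `final_reports/` directory sits directly under a run directory; that layout and the function name `missing_orf_outputs` are assumptions.

```python
from pathlib import Path

# File names as listed under Gene Cataloguing Outputs.
EXPECTED_ORF_FILES = (
    "orf_cluster_sizes.csv",
    "orf_rel_abund.tsv",
    "orf_read_cts.tsv",
    "orf_annotations.tsv",
)


def missing_orf_outputs(run_dir):
    """Return the expected gene cataloguing files that are absent from final_reports/."""
    reports = Path(run_dir) / "final_reports"
    return [name for name in EXPECTED_ORF_FILES if not (reports / name).is_file()]
```

An empty list means all four tables were produced.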

Supporting Outputs

  • MAG Quality Control Summary:
      • Description: Aggregated quality metrics from GUNC, GTDB-Tk, CheckM2, and QUAST
      • Format: .csv
      • Location: final_reports/mag_qc_summary.csv

  • Error Correction Statistics:
      • Description: Statistical properties of reads after error correction
      • Format: .csv
      • Location: summary/

  • Assembly Statistics:
      • Description: Contig length and assembly descriptive statistics
      • Format: .csv
      • Location: assembly/stats/
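
To illustrate the kind of descriptive statistic that can be derived from contig lengths, the sketch below computes N50, a widely used assembly summary. Whether the pipeline's assembly statistics include N50 specifically is an assumption; the function is a stand-alone example, not the pipeline's implementation.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length of the shortest contig in the smallest set of longest
    contigs that together cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("At least one contig length is required")

    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
```

For contig lengths 2 through 10 (total 54), the three longest contigs (10, 9, 8) sum to 27, which reaches half the total, so the N50 is 8.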

Associated Processes

References & Additional Documentation

  • MetaSUB Consortium: International MetaSUB Consortium
  • Pipeline Repository: Meta-CAMP GitHub Repository
  • Related Publications: Mason, C.E., et al. The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium inaugural meeting report. Microbiome 4, 24 (2016).