Merge MetaPhlAn Specifications
Process Details
- Name: merge_metaphlan
- Process UUID: f9311tpae6qzx7vt1mfdc6yqifxgi2
- Process Group: short_read_taxonomy
Overview
This process merges the standardized MetaPhlAn outputs of all samples into consolidated taxonomic abundance tables, one set per taxonomic level. MetaPhlAn (Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic sequencing data. The merged tables are suitable for downstream comparative analysis and visualization.
This process is implemented as a Bash script that invokes a Python script to concatenate the tables.
Key Functionality
- Multi-level Taxonomic Merging: Consolidates MetaPhlAn results at six taxonomic levels (species, genus, family, order, class, and phylum)
- Cross-sample Integration: Combines individual sample taxonomic profiles into unified abundance matrices
- Standardized Output Generation: Produces consistently formatted CSV files for each taxonomic level suitable for downstream analysis
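The merging described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual `concat_tbls.py` logic: it assumes each sample profile is already loaded as a mapping from a pipe-delimited MetaPhlAn lineage string to a relative abundance, and the names `RANK_PREFIXES` and `merge_at_level` are hypothetical. The rank prefixes (`p__`, `c__`, and so on) follow MetaPhlAn's lineage naming convention.

```python
# Rank prefixes used in MetaPhlAn lineage strings, keyed by level name.
RANK_PREFIXES = {
    "phylum": "p__",
    "class": "c__",
    "order": "o__",
    "family": "f__",
    "genus": "g__",
    "species": "s__",
}


def merge_at_level(profiles, level):
    """Build a taxa-by-sample matrix for one taxonomic level.

    profiles: {sample_id: {clade_name: relative_abundance}} (assumed shape).
    Returns (sample_ids, {clade_name: [abundance per sample]}).
    """
    prefix = RANK_PREFIXES[level]
    sample_ids = sorted(profiles)
    merged = {}
    for column, sample_id in enumerate(sample_ids):
        for clade_name, abundance in profiles[sample_id].items():
            # The last lineage segment identifies the clade's own rank.
            if not clade_name.split("|")[-1].startswith(prefix):
                continue
            # Taxa absent from a sample keep an abundance of 0.0.
            row = merged.setdefault(clade_name, [0.0] * len(sample_ids))
            row[column] = abundance
    return sample_ids, merged


# Made-up example data for two samples.
profiles = {
    "sample_A": {
        "k__Bacteria|p__Firmicutes": 60.0,
        "k__Bacteria|p__Bacteroidetes": 40.0,
        "k__Bacteria|p__Firmicutes|c__Bacilli": 60.0,
    },
    "sample_B": {
        "k__Bacteria|p__Firmicutes": 100.0,
    },
}

# One merged table per taxonomic level.
tables = {level: merge_at_level(profiles, level) for level in RANK_PREFIXES}
```

Filling absent taxa with 0.0 is a design choice made for this sketch; the real process may represent missing values differently.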
Input/Output Specification
Inputs
Required Inputs
- MetaPhlAn CSV Outputs
  - Description: Individual MetaPhlAn taxonomic abundance files from multiple samples
  - Format: CSV files containing standardized MetaPhlAn output with taxonomic classifications and relative abundances
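As a rough illustration of what reading one such input could look like, the sketch below parses a per-sample CSV into a mapping of clade to relative abundance. The column names `clade_name` and `relative_abundance` and the helper `read_profile` are assumptions made for this example; the exact layout of the standardized files is not specified here.

```python
import csv
import io


def read_profile(handle):
    """Parse one per-sample CSV into {clade_name: relative_abundance}.

    Assumes a header row with `clade_name` and `relative_abundance` columns.
    """
    reader = csv.DictReader(handle)
    return {row["clade_name"]: float(row["relative_abundance"]) for row in reader}


# In-memory stand-in for one per-sample file; real inputs would be opened from disk.
example_csv = io.StringIO(
    "clade_name,relative_abundance\n"
    "k__Bacteria|p__Firmicutes,60.0\n"
    "k__Bacteria|p__Bacteroidetes,40.0\n"
)
profile = read_profile(example_csv)
```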
Outputs
- Merged Taxonomic Tables
  - Description: Consolidated taxonomic abundance tables at six different taxonomic levels (species, genus, family, order, class, phylum)
  - Format: CSV files with samples as columns and taxonomic units as rows
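To make the output layout concrete, the following sketch writes a small merged table with taxonomic units as rows and samples as columns. The header label `taxon`, the helper `write_merged_table`, and the sample values are hypothetical; the real files may use different column names and ordering.

```python
import csv
import io


def write_merged_table(handle, sample_ids, rows):
    """Write one merged table: taxa as rows, samples as columns."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["taxon"] + list(sample_ids))
    # Sort taxa so the row order is stable between runs.
    for taxon in sorted(rows):
        writer.writerow([taxon] + list(rows[taxon]))


# Made-up phylum-level abundances for two samples.
buffer = io.StringIO()
write_merged_table(
    buffer,
    ["sample_A", "sample_B"],
    {"p__Firmicutes": [60.0, 100.0], "p__Bacteroidetes": [40.0, 0.0]},
)
merged_csv = buffer.getvalue()
```

Writing to an in-memory buffer keeps the example self-contained; passing an open file handle instead would produce one CSV file per taxonomic level.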
References & Resources
- Tool Documentation: Contact the team for details on concat_tbls.py
- Related Papers: Beghini, F., McIver, L.J., Blanco-Míguez, A. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021). DOI: 10.7554/eLife.65088