Cell Type Mapper (ABA Reference) Pipeline Specification
Pipeline Details
- Name: Cell Type Mapper (ABA Reference)
- Pipeline UUID: 45a14d9c68bd4497ae5a6ff9e7293819
- Version: 1.1.2
- View Pipeline:
Overview
The Cell Type Mapper (ABA Reference) pipeline maps single-cell RNA sequencing data onto cell type taxonomies such as those provided by the Allen Institute for Brain Science. It automates cell type classification using reference taxonomies and marker gene lists, producing reproducible cell type assignments.
Key Use Cases:
- Single-cell RNA-seq Cell Type Classification: Automated assignment of cell types to single cells based on established taxonomies from the Allen Institute for Brain Science.
- Cross-dataset Cell Type Harmonization: Standardizing cell type annotations across different single-cell datasets using reference taxonomies.
- Quality Control for Cell Type Assignments: Bootstrap-based validation of cell type mapping confidence through iterative sampling.
Features
- Allen Institute Integration: Specifically designed to work with Allen Institute for Brain Science cell type taxonomies and marker gene sets.
- Bootstrap Validation: Estimates assignment confidence by repeating the mapping over a configurable number of bootstrap iterations, each using a random subset of the marker genes.
- Flexible Input Formats: Accepts RDS and h5ad input; RDS input is converted to h5ad automatically.
- Normalization Options: Accepts either raw count data or log2(CPM+1) normalized data for type assignment.
- Comprehensive Output: Generates both JSON and CSV formatted results for downstream analysis and visualization.
- Containerized Execution: Runs in a Docker container (quay.io/viascientific/cell-type-mapper) to provide a reproducible execution environment.
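To make the normalization and bootstrap features above more concrete, here is a minimal sketch in plain Python. It is not the pipeline's actual implementation: the nearest-centroid comparison is a simplified stand-in for the real mapping step, and the names `log2_cpm`, `nearest_type`, `bootstrap_assign`, `n_iter`, and `factor` are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import Counter


def log2_cpm(counts):
    """Convert one cell's raw counts to log2(CPM + 1)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log2(c / total * 1_000_000 + 1) for c in counts]


def nearest_type(cell, centroids, gene_idx):
    """Return the type whose mean profile is closest on the chosen genes.

    Assumption: squared Euclidean distance to a per-type mean profile;
    the real pipeline may compare cells to the taxonomy differently.
    """
    def dist(profile):
        return sum((cell[g] - profile[g]) ** 2 for g in gene_idx)
    return min(centroids, key=lambda name: dist(centroids[name]))


def bootstrap_assign(cell, centroids, marker_idx, n_iter=100, factor=0.5, seed=0):
    """Vote over n_iter random subsets of the marker genes.

    Returns (winning type, fraction of iterations that chose it).
    """
    rng = random.Random(seed)
    k = max(1, int(len(marker_idx) * factor))  # markers kept per iteration
    votes = Counter()
    for _ in range(n_iter):
        subset = rng.sample(marker_idx, k)
        votes[nearest_type(cell, centroids, subset)] += 1
    best, count = votes.most_common(1)[0]
    return best, count / n_iter


# Toy data (made up): two reference types, four genes, two marker genes.
centroids = {
    "type_A": log2_cpm([90, 5, 5, 0]),
    "type_B": log2_cpm([5, 90, 5, 0]),
}
cell = log2_cpm([80, 10, 10, 0])
label, support = bootstrap_assign(cell, centroids, marker_idx=[0, 1], n_iter=50)
print(label, support)
```

The fraction of iterations that agree on a label serves as a rough confidence score: a cell whose assignment changes when some markers are dropped receives a lower value.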
Input/Output Specification
Inputs
Required
Reference Taxonomy (h5 format)
- Description: A reference taxonomy file containing cell type hierarchies and associated metadata from Allen Institute taxonomies.
- Format: .h5
- Example File Path: /path/to/reference/taxonomy.h5
Marker Gene List (JSON format)
- Description: A structured list of marker genes in JSON format used for cell type classification.
- Format: .json
- Example File Path: /path/to/markers/marker_genes.json
Single-cell Dataset
- Description: Unlabeled single-cell RNA-sequencing dataset containing gene expression data for cell type mapping.
- Format: .h5ad or .rds
- Example File Path: /path/to/data/single_cell_data.h5ad
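Before launching a run, it can help to check that the three required inputs are present and carry the expected file extensions. The sketch below assumes only the formats listed above. The role names (`reference_taxonomy`, `marker_genes`, `single_cell_dataset`) and both helper functions are hypothetical and are not part of the pipeline.

```python
from pathlib import Path

# Extensions taken from the input specification above.
EXPECTED_SUFFIXES = {
    "reference_taxonomy": {".h5"},
    "marker_genes": {".json"},
    "single_cell_dataset": {".h5ad", ".rds"},
}


def check_inputs(paths):
    """Return a list of problems found in a {role: path} mapping (empty if none)."""
    problems = []
    for role, allowed in EXPECTED_SUFFIXES.items():
        if role not in paths:
            problems.append(f"missing input: {role}")
            continue
        suffix = Path(paths[role]).suffix.lower()
        if suffix not in allowed:
            problems.append(f"{role}: expected {sorted(allowed)}, got '{suffix}'")
    return problems


def needs_conversion(dataset_path):
    """True when the dataset is RDS and would need converting to h5ad."""
    return Path(dataset_path).suffix.lower() == ".rds"


# The example paths from the specification pass the check.
example = {
    "reference_taxonomy": "/path/to/reference/taxonomy.h5",
    "marker_genes": "/path/to/markers/marker_genes.json",
    "single_cell_dataset": "/path/to/data/single_cell_data.h5ad",
}
problems = check_inputs(example)
print(problems)
```

This checks names only. It does not open the files or validate their contents.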
Outputs
Reported Outputs
- Cell Type Mapping Results (CSV):
- Description: Comma-separated file containing the cell type assignment for each cell, with confidence scores
- Format: .csv
- Example File Path: /output/directory/mapping_output.csv
- Visualization App: Excel, R, Python pandas
- Location: Output Folder
- Extended Mapping Results (JSON):
- Description: Comprehensive mapping results including detailed statistics and bootstrap validation metrics
- Format: .json
- Example File Path: /output/directory/mapping_output.json
- Visualization App: JSON viewers, custom analysis scripts
- Location: Output Folder
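For downstream analysis, the delimited mapping results can be loaded with the Python standard library and filtered by confidence. This is a sketch under stated assumptions: the column names `cell_id`, `cell_type`, and `confidence` are hypothetical, and the skipping of `#` comment lines is a precaution, so both should be checked against a real output file. The delimiter is left as a parameter so it can be matched to the actual file.

```python
import csv
import io


def read_mapping(handle, delimiter=","):
    """Yield one dict per cell, skipping blank lines and '#' comment lines."""
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    yield from csv.DictReader(rows, delimiter=delimiter)


def confident_cells(records, score_column, threshold):
    """Keep records whose score in score_column is at least threshold."""
    return [r for r in records if float(r[score_column]) >= threshold]


# Hypothetical content; real column names and values will differ.
sample = io.StringIO(
    "cell_id,cell_type,confidence\n"
    "cell_1,type_A,0.95\n"
    "cell_2,type_B,0.40\n"
)
kept = confident_cells(read_mapping(sample), "confidence", 0.9)
print([r["cell_id"] for r in kept])
```

For a file on disk, pass an open file handle instead of the in-memory `sample`, for example `open(path, newline="")`.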
Supporting Outputs
- Processed h5ad File:
- Description: Converted and processed single-cell dataset in h5ad format (when input is RDS)
- Format: .h5ad
- Example File Path: /intermediate/directory/processed_data.h5ad
Associated Processes
References & Additional Documentation
- Related Papers/links: Cell Type Mapper Documentation
- Pipeline Repository: Allen Institute Cell Type Mapper
- Input Data Documentation: Running Online Taxonomies Locally