Cell Ranger Count Pipeline Specification

Pipeline Details

  • Name: Cell Ranger Count Pipeline
  • Pipeline UUID: c9205e92847b4658898d8585e4733953
  • Version: 3.6.0

Overview

The Cell Ranger Count Pipeline analyzes Chromium single-cell RNA-seq data from 10x Genomics experiments. It automates the workflow from BCL file conversion through read alignment, barcode identification, and UMI counting to clustering and downstream visualization, so that runs are reproducible.

Key Use Cases:

  • Single-cell Gene Expression Analysis: Process 3' Gene Expression libraries for comprehensive transcriptomic profiling at single-cell resolution.
  • Feature Barcoding Integration: Analyze combined Gene Expression with Antibody Capture or CRISPR Guide Capture data for multi-modal single-cell experiments.
  • BCL to FASTQ Conversion: Convert raw Illumina sequencing data (BCL files) to FASTQ format with proper sample demultiplexing.

Features

  • Complete Cell Ranger v9.0.0 Integration: Runs the Cell Ranger v9.0.0 suite for single-cell data processing.
  • Flexible Input Support: Accepts both FASTQ files and BCL directories with automatic sample sheet processing for streamlined workflow initiation.
  • Custom Reference Genome Support: Enables addition of custom sequences to reference genomes with automated GTF generation and indexing.
  • Advanced Quality Control: Implements QC steps including ambient RNA removal using DecontX, mitochondrial gene filtering, and multi-metric cell filtering.
  • Multi-modal Analysis: Supports integration of Gene Expression with Feature Barcoding data (Antibody Capture, CRISPR screens).
  • Batch Effect Correction: Incorporates Harmony algorithm for batch effect correction across multiple samples or experimental conditions.
  • Comprehensive Clustering: Automated clustering with resolution optimization, marker gene identification, and visualization via UMAP/tSNE.
  • Multiple Output Formats: Generates outputs in various formats including H5, H5AD, Loom, and RDS for compatibility with different analysis platforms.
  • Gene Regulatory Network Analysis: Optional pySCENIC integration for transcription factor regulon analysis and gene regulatory network inference.
  • Scalable Processing: Supports parallel processing of multiple samples with configurable computational resources.
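
The mitochondrial and multi-metric cell filtering listed above can be illustrated with a minimal sketch. This is not the pipeline's implementation: the function names, the "MT-" gene prefix, and every threshold below are assumptions chosen for illustration, and the pipeline's own defaults may differ.

```python
from typing import Dict, List


def cell_qc_metrics(counts: Dict[str, int], mito_prefix: str = "MT-") -> Dict[str, float]:
    """Compute QC metrics for one cell from a gene -> UMI count mapping."""
    total = sum(counts.values())
    # Upper-casing the gene name also catches mouse-style "mt-" symbols.
    mito = sum(n for gene, n in counts.items() if gene.upper().startswith(mito_prefix))
    detected = sum(1 for n in counts.values() if n > 0)
    return {
        "total_umis": total,
        "genes_detected": detected,
        "mito_fraction": mito / total if total else 0.0,
    }


def filter_cells(
    cells: Dict[str, Dict[str, int]],
    min_genes: int = 200,            # assumed threshold, not a pipeline default
    min_umis: int = 500,             # assumed threshold, not a pipeline default
    max_mito_fraction: float = 0.2,  # assumed threshold, not a pipeline default
) -> List[str]:
    """Return the barcodes that pass all three QC thresholds."""
    kept = []
    for barcode, counts in cells.items():
        metrics = cell_qc_metrics(counts)
        if (
            metrics["genes_detected"] >= min_genes
            and metrics["total_umis"] >= min_umis
            and metrics["mito_fraction"] <= max_mito_fraction
        ):
            kept.append(barcode)
    return kept
```

Combining several metrics in one pass means a barcode is dropped as soon as any single criterion fails, which is the usual meaning of multi-metric filtering.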

Input/Output Specification

Inputs

Required Inputs

reads

  • Description: FASTQ files containing raw single-cell RNA sequencing reads from the 10x Genomics Chromium platform
  • Format: .fastq.gz
  • Example File Path: /path/to/sample_S1_L001_R1_001.fastq.gz, /path/to/sample_S1_L001_R2_001.fastq.gz

mate

  • Description: Sequencing configuration specifying single-end or paired-end reads
  • Format: String value ("single" or "pair")
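
As a sketch of how the two required inputs relate, the snippet below parses Illumina-style FASTQ file names such as the example paths above and infers whether the set is single-end or paired-end. The regular expression and the helper names `parse_fastq_name` and `infer_mate` are assumptions for illustration; the pipeline may validate its inputs differently.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Illumina-style name: <sample>_S<n>_L<lane>_<R1|R2|I1|I2>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[RI][12])_001\.fastq\.gz$"
)


def parse_fastq_name(path):
    """Split a FASTQ file name into sample, sample number, lane, and read."""
    match = FASTQ_NAME.match(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"not an Illumina-style FASTQ name: {path}")
    return match.groupdict()


def infer_mate(paths):
    """Return "pair" if every lane has R1 and R2, "single" if only R1."""
    reads_by_lane = defaultdict(set)
    for path in paths:
        info = parse_fastq_name(path)
        if info["read"].startswith("R"):  # ignore index reads (I1/I2)
            reads_by_lane[(info["sample"], info["lane"])].add(info["read"])
    if not reads_by_lane:
        raise ValueError("no R1/R2 FASTQ files found")
    if all(reads == {"R1", "R2"} for reads in reads_by_lane.values()):
        return "pair"
    if all(reads == {"R1"} for reads in reads_by_lane.values()):
        return "single"
    raise ValueError("inconsistent R1/R2 files across lanes")
```

With the two example paths from the reads input, `infer_mate` returns "pair"; a mismatch between this result and the supplied mate value would indicate a missing or misnamed file.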

Optional Inputs

BCL Directory

  • Description: Illumina BCL directory containing raw sequencing data for conversion to FASTQ format
  • Format: Directory structure with BCL files and RunInfo.xml
  • Example File Path: /path/to/bcl_directory/
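
Before starting a conversion, it can help to confirm that the BCL directory contains RunInfo.xml and to inspect the read layout it declares. The following is a minimal sketch using only the standard library; the function name `summarize_run_info` is hypothetical, and the sketch assumes the usual RunInfo.xml layout in which each Read element carries Number, NumCycles, and IsIndexedRead attributes.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def summarize_run_info(bcl_dir):
    """List the reads declared in <bcl_dir>/RunInfo.xml, ordered by read number."""
    run_info = Path(bcl_dir) / "RunInfo.xml"
    if not run_info.is_file():
        raise FileNotFoundError(f"RunInfo.xml not found in {bcl_dir}")
    root = ET.parse(run_info).getroot()
    reads = []
    for read in root.iter("Read"):
        reads.append(
            {
                "number": int(read.get("Number")),
                "cycles": int(read.get("NumCycles")),
                "is_index": read.get("IsIndexedRead") == "Y",
            }
        )
    return sorted(reads, key=lambda r: r["number"])
```

The cycle counts of the non-index reads give a quick check that the run matches the expected library chemistry before any FASTQ files are generated.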

Sample Sheet CSV

  • Description: Sample sheet file for BCL to FASTQ conversion specifying sample information and barcodes
  • Required Columns: Sample_ID, Sample_Name, index, index2
  • Format: Comma-separated values (.csv)
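
The required columns can be checked before a run is submitted. The sketch below is an assumption about how such a check might look, not the pipeline's own validation: it accepts either a plain CSV whose first row is the header, or an Illumina-style sheet in which the column row follows a [Data] section marker.

```python
import csv
import io

REQUIRED_COLUMNS = ("Sample_ID", "Sample_Name", "index", "index2")


def missing_sample_sheet_columns(handle):
    """Return the required columns that are absent from the sheet's header row."""
    rows = [row for row in csv.reader(handle) if any(cell.strip() for cell in row)]
    header = rows[0] if rows else []
    for i, row in enumerate(rows):
        # In sectioned sheets the column names follow the [Data] marker.
        if row[0].strip().lower() == "[data]":
            header = rows[i + 1] if i + 1 < len(rows) else []
            break
    present = {name.strip() for name in header}
    return [column for column in REQUIRED_COLUMNS if column not in present]


# Usage with an in-memory sheet; a real file would be opened with newline="".
sheet = "Sample_ID,Sample_Name,index,index2\nS1,pbmc_1k,AAACGGCG,TTTGCATA\n"
print(missing_sample_sheet_columns(io.StringIO(sheet)))  # empty list: nothing missing
```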

Custom Reference Sequences

  • Description: Additional FASTA sequences to be added to the reference genome (e.g., transgenes, viral sequences)
  • Format: .fasta or .fa
  • Example File Path: /path/to/custom_sequences.fasta
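
How the pipeline generates annotation for custom sequences is not described here. One common approach is to treat each added sequence as its own contig and write a single exon record spanning its full length. The sketch below illustrates that approach under those assumptions; the helper names, the "custom" source field, and the "protein_coding" biotype are illustrative choices.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without a sequence name")
            name, chunks = fields[0], []  # keep only the ID before any whitespace
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gtf_record(name, length):
    """Build one GTF exon line covering the whole custom sequence (1-based, inclusive)."""
    attributes = (
        f'gene_id "{name}"; transcript_id "{name}"; '
        f'gene_name "{name}"; gene_biotype "protein_coding";'
    )
    return "\t".join([name, "custom", "exon", "1", str(length), ".", "+", ".", attributes])


# Usage: one GTF line per sequence in a (hypothetical) custom FASTA.
fasta = io.StringIO(">GFP example transgene\nATGGTGAGC\nAAGGGC\n")
for seq_name, seq in read_fasta(fasta):
    print(gtf_record(seq_name, len(seq)))
```

The resulting lines would be appended to the reference GTF, and the sequences to the reference FASTA, before the reference is re-indexed.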

Outputs

Reported Outputs

Web Summary HTML

  • Description: Comprehensive quality control report with alignment metrics, cell detection statistics, and gene expression summary
  • Format: .html
  • Example File Path: /output/sample_web_summary.html
  • Visualization App: Web browser
  • Location: Cell Ranger Count output folder

Feature-Barcode Matrix (H5)

  • Description: Filtered gene expression count matrix in HDF5 format containing cell barcodes and gene counts
  • Format: .h5
  • Example File Path: /output/sample_filtered_feature_bc_matrix.h5
  • Visualization App: Seurat, Scanpy, Cell Ranger Loupe Browser
  • Location: Cell Ranger Count output folder

Clustering Analysis Report

  • Description: HTML report containing PCA results, UMAP/tSNE visualizations, clustering analysis, and marker gene identification
  • Format: .html
  • Example File Path: /output/clustering_analysis_report.html
  • Visualization App: Web browser
  • Location: Clustering output folder

Seurat Object

  • Description: Processed single-cell data object containing normalized expression data, dimensionality reduction, and clustering results
  • Format: .rds
  • Example File Path: /output/processed_seurat_object.rds
  • Visualization App: R/Seurat
  • Location: Analysis output folder

Supporting Outputs

BAM Files

  • Description: Aligned reads in BAM format with cellular barcode and UMI information
  • Format: .bam
  • Example File Path: /output/sample_possorted_genome_bam.bam

Raw Feature-Barcode Matrix

  • Description: Unfiltered gene expression count matrix including all detected barcodes
  • Format: Directory with matrix.mtx, barcodes.tsv, features.tsv
  • Example File Path: /output/sample_raw_feature_bc_matrix/

H5AD File

  • Description: AnnData format file compatible with Scanpy and Python-based single-cell analysis tools
  • Format: .h5ad
  • Example File Path: /output/sample_data.h5ad

Loom File

  • Description: Loom format file for visualization and analysis in tools like SCope and pySCENIC
  • Format: .loom
  • Example File Path: /output/sample_data.loom
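
The raw feature-barcode matrix directory listed among the supporting outputs can be inspected without any third-party packages. The sketch below sums UMI counts per barcode from matrix.mtx and barcodes.tsv. It assumes the Matrix Market coordinate layout with features as rows and barcodes as columns, and it falls back to gzip-compressed files (matrix.mtx.gz, barcodes.tsv.gz) when the plain files are absent. The function name `umis_per_barcode` is hypothetical.

```python
import gzip
from pathlib import Path


def _open_text(directory, name):
    """Open <name> from the matrix directory, falling back to <name>.gz."""
    plain = Path(directory) / name
    if plain.exists():
        return open(plain, "rt")
    return gzip.open(str(plain) + ".gz", "rt")


def umis_per_barcode(matrix_dir):
    """Return a barcode -> total UMI count mapping for a feature-barcode matrix directory."""
    with _open_text(matrix_dir, "barcodes.tsv") as fh:
        barcodes = [line.strip() for line in fh if line.strip()]
    totals = [0] * len(barcodes)
    with _open_text(matrix_dir, "matrix.mtx") as fh:
        dims_seen = False
        for line in fh:
            if line.startswith("%") or not line.strip():
                continue  # skip the banner and comment lines
            if not dims_seen:
                # First data line: number of features, barcodes, and non-zero entries.
                _n_features, n_barcodes, _n_entries = (int(x) for x in line.split())
                if n_barcodes != len(barcodes):
                    raise ValueError("barcodes.tsv does not match matrix dimensions")
                dims_seen = True
                continue
            _feature_idx, barcode_idx, value = line.split()
            totals[int(barcode_idx) - 1] += int(value)  # indices are 1-based
    return dict(zip(barcodes, totals))
```

Streaming the file line by line keeps memory use proportional to the number of barcodes, which matters for unfiltered matrices that include every detected barcode.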

Associated Processes

References & Additional Documentation

  • 10x Genomics Cell Ranger Documentation: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/count
  • Example Datasets: https://www.viafoundry.com/test_data/fastq_10x_pbmc_1k_v3/
  • Supported Reference Genomes: human_hg38_gencode_v32_cellranger_v6 and other Cell Ranger-compatible references