Prepare Cellranger Arc Specifications
Process Details
- Name: prepare_cellranger_arc
- Process UUID: f93154cxym0jhts2hq1hwxnnpdpega
- Process Group: SingleCell
Overview
This process prepares the directory structure and organizes input files for Cell Ranger ARC count analysis. Cell Ranger ARC processes single-cell multiome data, in which gene expression (RNA-seq) and chromatin accessibility (ATAC-seq) are measured in the same cells. The process takes FASTQ files and a metadata configuration file and produces correctly structured inputs for downstream Cell Ranger ARC processing.
This process is implemented in Groovy.
Key Functionality
- File Organization: Arranges RNA-seq and ATAC-seq FASTQ files according to Cell Ranger ARC requirements
- Metadata Processing: Parses configuration files containing sample information including input names, groups, and library types
- Quality Control: Validates that required R1 and R2 files are present for each sample, with optional R3 files for specific library types
- Directory Structure Preparation: Creates the proper directory structure needed for Cell Ranger ARC count execution
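The Quality Control step above can be illustrated with a minimal sketch. It is written in Python for illustration only; the actual process is implemented in Groovy and its script is not shown here. The sketch assumes Illumina-style file names such as `<sample>_S1_L001_R1_001.fastq.gz`, and the names `group_reads` and `missing_reads` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Illumina-style names, e.g. pbmc_rna_S1_L001_R1_001.fastq.gz
READ_TAG = re.compile(
    r"^(?P<sample>.+?)_S\d+_L\d{3}_(?P<read>[RI]\d)_\d{3}\.fastq\.gz$"
)

def group_reads(filenames):
    """Map each sample name to the set of read tags (R1, R2, ...) found."""
    reads = defaultdict(set)
    for name in filenames:
        match = READ_TAG.match(name)
        if match:  # names that do not fit the assumed pattern are skipped
            reads[match.group("sample")].add(match.group("read"))
    return dict(reads)

def missing_reads(filenames, required=("R1", "R2")):
    """Return {sample: [missing tags]} for samples lacking a required read."""
    problems = {}
    for sample, tags in group_reads(filenames).items():
        absent = [tag for tag in required if tag not in tags]
        if absent:
            problems[sample] = absent
    return problems
```

For library types that also need an R3 file, the caller could pass `required=("R1", "R2", "R3")`; which library types those are is decided by the real implementation, not by this sketch.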
Input/Output Specification
Inputs
Required Inputs
- reads (RNA-seq)
  - Description: Set of RNA-seq FASTQ files containing gene expression reads
  - Format: FASTQ (.fastq.gz)
- reads (ATAC-seq)
  - Description: Set of ATAC-seq FASTQ files containing chromatin accessibility reads
  - Format: FASTQ (.fastq.gz)
- inputText
  - Description: Configuration file containing sample metadata with columns for input_name, group, and library_type
  - Format: Tab-delimited text file
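As a rough illustration of the inputText format, the sketch below parses a tab-delimited file with the three documented columns. It assumes the file begins with a header row naming those columns, which the specification does not confirm. The function name `parse_sample_sheet` and the example values are hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = ("input_name", "group", "library_type")

def parse_sample_sheet(text):
    """Parse tab-delimited metadata text into a list of row dicts."""
    # Assumption: the first line is a header naming the columns.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    # Keep only the documented columns and trim stray whitespace.
    return [
        {col: (row[col] or "").strip() for col in EXPECTED_COLUMNS}
        for row in reader
    ]

# Hypothetical example content; the library_type values are illustrative.
example = (
    "input_name\tgroup\tlibrary_type\n"
    "pbmc_rna\tpbmc\tGene Expression\n"
    "pbmc_atac\tpbmc\tChromatin Accessibility\n"
)
rows = parse_sample_sheet(example)
```

Failing early on a missing column keeps a malformed configuration file from silently producing an incomplete directory layout later.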
Outputs
- directory
  - Description: Organized directory structure containing properly arranged FASTQ files and metadata ready for Cell Ranger ARC processing
  - Format: Directory structure
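The exact layout of the output directory is not documented here. Purely as a sketch of the idea, the snippet below plans one possible arrangement, `<root>/<group>/<library_type>/<file>`, without touching the filesystem. The layout, the default root name `arc_input`, the function name `plan_layout`, and the rule that a sample's files start with its input_name followed by an underscore are all assumptions, not the behavior of the actual Groovy implementation.

```python
from pathlib import PurePosixPath

def plan_layout(rows, filenames, root="arc_input"):
    """Map each FASTQ name to a destination path under root/group/library."""
    plan = {}
    for row in rows:
        # Assumed convention: "Gene Expression" -> "gene_expression"
        subdir = row["library_type"].lower().replace(" ", "_")
        target = PurePosixPath(root) / row["group"] / subdir
        for name in filenames:
            # Assumed: a sample's files begin with its input_name plus "_".
            if name.startswith(row["input_name"] + "_"):
                plan[name] = str(target / name)
    return plan
```

Returning a plan rather than moving files lets the mapping be inspected or logged before any copy or symlink step runs.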
References & Resources
- Tool Documentation: Contact the team for details on the Groovy implementation script