Subclustering Module Pipeline Specification
Pipeline Details
- Name: Subclustering Module
- Pipeline UUID: c663y0w1an4t4wuwxs4f13gz9dop5h
- Version: 1.1.1
- View Pipeline:
Overview
The Subclustering Module pipeline subsets and reclusters single-cell RNA sequencing data stored in Seurat objects. It filters cells on metadata criteria and then runs the downstream clustering analysis, enabling focused analysis of specific cell populations.
Key Use Cases:
- Cell Type-Specific Analysis: Isolate and analyze specific cell types or populations from larger single-cell datasets.
- Condition-Based Subsetting: Extract cells based on experimental conditions, treatments, or sample metadata for comparative analysis.
- Quality Control Refinement: Remove low-quality cells or outliers and reanalyze the remaining high-quality cell populations.
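The metadata-based subsetting behind these use cases can be illustrated with a minimal sketch. The pipeline itself operates on Seurat objects; the Python below only models the idea with plain dictionaries. The function name `subset_cells`, the columns `cell_type` and `condition`, and the barcodes are all hypothetical and are not taken from the pipeline.

```python
from typing import Dict, Iterable, List

# Hypothetical per-cell metadata: barcode -> {column: value}.
# This stands in for the metadata table of a Seurat object.
CellMetadata = Dict[str, Dict[str, str]]


def subset_cells(metadata: CellMetadata, column: str, values: Iterable[str]) -> List[str]:
    """Return the barcodes of cells whose `column` equals one of `values`."""
    wanted = set(values)
    # Fail early if the requested column is absent for any cell.
    missing = [barcode for barcode, row in metadata.items() if column not in row]
    if missing:
        raise KeyError(f"column {column!r} missing for {len(missing)} cell(s)")
    return [barcode for barcode, row in metadata.items() if row[column] in wanted]


# Illustrative data only.
metadata = {
    "AAAC": {"cell_type": "T cell", "condition": "treated"},
    "AAAG": {"cell_type": "B cell", "condition": "control"},
    "AACT": {"cell_type": "T cell", "condition": "control"},
}

t_cells = subset_cells(metadata, "cell_type", ["T cell"])        # cell type-specific
control_cells = subset_cells(metadata, "condition", ["control"])  # condition-based
print(t_cells)
print(control_cells)
```

The retained barcodes would then be passed on to reclustering. Raising on a missing column, rather than silently returning an empty subset, is a design choice of this sketch.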
Features
- Flexible Metadata Filtering: Supports subsetting based on any metadata column and value in the Seurat object.
- Automated Reclustering: Performs complete reclustering workflow including normalization, variable feature detection, scaling, and dimensionality reduction.
- Multiple Normalization Methods: Supports LogNormalize, CLR, RC, and SCT normalization methods.
- Comprehensive Analysis Pipeline: Includes PCA, UMAP, t-SNE, nearest neighbor graph construction, and cluster marker identification.
- Dual Output Format: Generates both RDS (Seurat) and h5ad (AnnData) format outputs for compatibility with R and Python workflows.
- Interactive Reporting: Produces HTML reports with visualizations and analysis summaries.
- Configurable Parameters: Allows customization of principal components, normalization settings, and feature selection criteria.
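As a rough illustration of how the configurable parameters might be validated before a run, here is a sketch of a parameter bundle. Only the four normalization method names come from the list above. The class name `ReclusterConfig`, its field names, and the default values are assumptions made for this example and may not match the pipeline's actual parameters.

```python
from dataclasses import dataclass

# Normalization methods named in the Features list.
SUPPORTED_NORMALIZATION = ("LogNormalize", "CLR", "RC", "SCT")


@dataclass(frozen=True)
class ReclusterConfig:
    """Hypothetical parameter bundle; field names and defaults are illustrative."""

    metadata_column: str
    metadata_value: str
    normalization_method: str = "LogNormalize"  # assumed default
    n_pcs: int = 30                             # assumed default
    n_variable_features: int = 2000             # assumed default

    def __post_init__(self) -> None:
        # Reject settings the pipeline could not act on.
        if self.normalization_method not in SUPPORTED_NORMALIZATION:
            raise ValueError(
                f"normalization_method must be one of {SUPPORTED_NORMALIZATION}, "
                f"got {self.normalization_method!r}"
            )
        if self.n_pcs < 1:
            raise ValueError("n_pcs must be a positive integer")
        if self.n_variable_features < 1:
            raise ValueError("n_variable_features must be a positive integer")


config = ReclusterConfig("cell_type", "T cell", normalization_method="SCT", n_pcs=20)
print(config)
```

Validating at construction time means a misspelled method name fails before any data is loaded.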
Input/Output Specification
Inputs
Required
RDS File
- Description: A Seurat object stored as an RDS file containing single-cell RNA sequencing data with associated metadata.
- Format: .rds
- Example File Path: /path/to/input/seurat_object.rds
Outputs
Reported Outputs
- Subsetted RDS File:
- Description: Filtered and reclustered Seurat object containing only the specified cell population
- Format: .rds
- Example File Path: /output/directory/subsetted_seurat.rds
- Location: Main output folder
- h5ad File:
- Description: AnnData format file compatible with Python-based single-cell analysis tools like scanpy
- Format: .h5ad
- Example File Path: /output/directory/seurat_object.h5ad
- Visualization App: scanpy, cellxgene
- Location: Output folder
Supporting Outputs
- Analysis Report:
- Description: Interactive HTML report containing clustering results, UMAP/t-SNE plots, and cluster marker analysis
- Format: .html
- Example File Path: /output/directory/analysis_report.html
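A downstream script may want to confirm that a run produced all three output formats listed above. The sketch below assumes that the outputs sit at the top level of a single output directory, which the specification does not state explicitly. The helper name `collect_outputs` is hypothetical, and the file names are the example paths from this page, created as empty placeholder files for the demonstration.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Output formats named in the specification.
EXPECTED_SUFFIXES = (".rds", ".h5ad", ".html")


def collect_outputs(output_dir: Path) -> Dict[str, List[Path]]:
    """Group files in `output_dir` by expected suffix; raise if a format is absent."""
    found = {
        suffix: sorted(output_dir.glob(f"*{suffix}")) for suffix in EXPECTED_SUFFIXES
    }
    missing = [suffix for suffix, paths in found.items() if not paths]
    if missing:
        raise FileNotFoundError(f"no {', '.join(missing)} output in {output_dir}")
    return found


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    for name in ("subsetted_seurat.rds", "seurat_object.h5ad", "analysis_report.html"):
        (out / name).touch()
    outputs = collect_outputs(out)
    names = {suffix: [p.name for p in paths] for suffix, paths in outputs.items()}

print(names)
```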
Associated Processes
References & Additional Documentation
- Related Papers/links: Seurat - Guided Clustering Tutorial
- Workflow Diagram: Available in pipeline description page