HLA Typing Pipeline Specification
Pipeline Details
- Name: HLA_typing
- Pipeline UUID: s0ki8cd82jwjlhegscpueyxhi3wwyi
- Version: 1.0.1
Overview
The HLA Typing pipeline performs HLA class I genotyping from high-throughput sequencing data using OptiType, an algorithm for precise HLA typing from DNA or RNA sequencing reads. It is designed as part of neoantigen prediction from tumor and normal samples (WES or WGS), with optional RNA-seq data from the tumor.
Key Use Cases:
- HLA Class I Genotyping: Accurate 4-digit HLA genotyping predictions for HLA-A, HLA-B, and HLA-C loci from NGS data.
- Neoantigen Prediction: Supporting workflow for identifying potential neoantigens in cancer immunotherapy research.
- Immunogenomics Research: HLA typing for population genetics studies and personalized medicine applications.
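To illustrate what "4-digit" resolution means in the use cases above: in current HLA nomenclature it corresponds to the first two colon-separated fields of an allele name (for example, A*02:01). The following is a minimal sketch that trims a longer allele name down to that resolution. The helper name `to_two_field` is hypothetical and is not part of the pipeline or of OptiType.

```python
import re

# Matches names such as "HLA-A*02:01:01:01", "A*02:01", or "C*07:02:01".
# Only the gene name and the first two fields are captured.
_ALLELE = re.compile(r"^(?:HLA-)?([A-Z0-9]+)\*(\d+):(\d+)")


def to_two_field(allele: str) -> str:
    """Return the two-field ("4-digit") form of an HLA allele name.

    Hypothetical helper for illustration only.
    """
    match = _ALLELE.match(allele.strip())
    if match is None:
        raise ValueError(f"not a recognizable HLA allele name: {allele!r}")
    gene, field1, field2 = match.groups()
    return f"{gene}*{field1}:{field2}"


if __name__ == "__main__":
    print(to_two_field("HLA-A*02:01:01:01"))
```

Names that are already at two-field resolution pass through unchanged apart from the optional "HLA-" prefix being dropped.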
Features
- Support for Multiple Input Types: Accepts both DNA and RNA sequencing data, selected through a configurable data type parameter.
- Flexible Input Format Support: Handles both single-end and paired-end FASTQ files.
- High Precision Algorithm: Uses OptiType's integer linear programming approach for accurate 4-digit HLA genotyping.
- Comprehensive Output: Generates detailed results including allele predictions, confidence scores, and coverage visualizations.
- Containerized Execution: Runs in a Docker container (quay.io/mustafapir/optitype:1.3.0) for reproducible results across environments.
- IMGT/HLA Database Integration: Uses the IMGT/HLA reference database for allele matching.
Input/Output Specification
Inputs
Required
reads
- Description: Primary sequencing reads file containing raw sequencing data for HLA typing analysis.
- Format: FASTQ (.fastq or .fastq.gz)
- Example File Path: /path/to/input/sample_R1.fastq.gz
mate
- Description: Second-read (mate) file for paired-end sequencing data.
- Format: FASTQ (.fastq or .fastq.gz)
- Example File Path: /path/to/input/sample_R2.fastq.gz
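As a sketch of how the format requirement on the two read files could be checked before a run, the snippet below tests file names against the accepted FASTQ extensions. The function names `is_fastq_path` and `check_inputs` are hypothetical, and the check looks only at the file name, not at the file contents.

```python
from pathlib import Path
from typing import Optional

# Extensions accepted by the specification for both read files.
FASTQ_SUFFIXES = (".fastq", ".fastq.gz")


def is_fastq_path(path: str) -> bool:
    """True if the file name ends in .fastq or .fastq.gz."""
    return Path(path).name.endswith(FASTQ_SUFFIXES)


def check_inputs(reads: str, mate: Optional[str] = None) -> None:
    """Raise ValueError if either supplied path has an unexpected extension.

    Hypothetical helper; `mate` may be None for single-end data.
    """
    for label, path in (("reads", reads), ("mate", mate)):
        if path is not None and not is_fastq_path(path):
            raise ValueError(
                f"{label} must be a .fastq or .fastq.gz file, got: {path}"
            )


if __name__ == "__main__":
    check_inputs(
        "/path/to/input/sample_R1.fastq.gz",
        "/path/to/input/sample_R2.fastq.gz",
    )
    print("inputs look like FASTQ files")
```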
Optional Inputs
Data Type Parameter
- Description: Specifies whether the input sequencing data is from DNA or RNA sources.
- Options: "dna" or "rna"
- Default: "rna"
- Format: String parameter
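The inputs above map naturally onto OptiType's command line, which takes one or two FASTQ files via `--input`, a mandatory `--dna` or `--rna` switch, and an `--outdir`. The sketch below assembles such a command as an argument list. How this pipeline actually wraps OptiType inside its container is not documented here, so the function name `build_optitype_command`, the executable name on the PATH, and the default output directory are assumptions.

```python
from typing import List, Optional


def build_optitype_command(
    reads: str,
    mate: Optional[str] = None,
    data_type: str = "rna",            # default taken from the specification
    outdir: str = "/output/directory",  # assumed output location
) -> List[str]:
    """Assemble an OptiType argument list (hypothetical wrapper)."""
    if data_type not in ("dna", "rna"):
        raise ValueError(f'data_type must be "dna" or "rna", got: {data_type!r}')

    command = ["OptiTypePipeline.py", "--input", reads]
    if mate is not None:
        # Paired-end data: the second FASTQ follows the first after --input.
        command.append(mate)
    command.append(f"--{data_type}")
    command += ["--outdir", outdir]
    return command


if __name__ == "__main__":
    cmd = build_optitype_command(
        "/path/to/input/sample_R1.fastq.gz",
        "/path/to/input/sample_R2.fastq.gz",
        data_type="dna",
    )
    print(" ".join(cmd))
```

Returning a list rather than a single string lets the command be passed to `subprocess.run` without shell quoting issues.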
Outputs
Reported Outputs
- csvFile:
- Description: Table of predicted HLA alleles with confidence scores and detailed typing results
- Format: .csv/.tsv
- Example File Path: /output/directory/sample_result.tsv
- Location: Main output folder
- outputFilePdf:
- Description: Coverage visualization plots showing alignment coverage for each predicted HLA allele
- Format: .pdf
- Example File Path: /output/directory/sample_coverage_plot.pdf
- Location: Main output folder
Supporting Outputs
- bamFile:
- Description: Aligned BAM file containing HLA-specific read alignments used for genotyping
- Format: .bam
- Example File Path: /output/directory/sample_aligned.bam
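For downstream steps such as neoantigen prediction, the result table usually needs to be read back into a per-locus structure. The sketch below assumes the tab-separated layout that OptiType normally writes (an unnamed index column, then A1, A2, B1, B2, C1, C2, Reads, Objective); the exact columns produced by this pipeline are not listed in this specification, so treat the layout as an assumption. The sample row is made-up illustration data, and `read_hla_calls` is a hypothetical helper.

```python
import csv
import io
from typing import Dict, List, TextIO

LOCI = ("A", "B", "C")

# Illustrative content only; not output from a real run.
SAMPLE_TSV = (
    "\tA1\tA2\tB1\tB2\tC1\tC2\tReads\tObjective\n"
    "0\tA*31:01\tA*68:01\tB*40:01\tB*51:01\tC*15:02\tC*03:04\t1127.0\t1067.6\n"
)


def read_hla_calls(handle: TextIO) -> Dict[str, List[str]]:
    """Return the two predicted alleles per locus from the first result row."""
    reader = csv.DictReader(handle, delimiter="\t")
    row = next(reader, None)
    if row is None:
        raise ValueError("result file contains no prediction rows")
    return {locus: [row[f"{locus}1"], row[f"{locus}2"]] for locus in LOCI}


if __name__ == "__main__":
    calls = read_hla_calls(io.StringIO(SAMPLE_TSV))
    for locus, alleles in calls.items():
        print(f"HLA-{locus}: {', '.join(alleles)}")
```

To read a real file, open it in text mode and pass the handle, for example `with open("/output/directory/sample_result.tsv") as fh: read_hla_calls(fh)`.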
Associated Processes
References & Additional Documentation
- Related Papers: Szolek A. et al. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics, 30(23), 3310–3316 (2014).
- OptiType Repository: https://github.com/FRED-2/OptiType
- IMGT/HLA Database: https://www.ebi.ac.uk/ipd/imgt/hla/