ATAC-seq Pipeline Specification
Pipeline Details
- Name: ATAC-seq Pipeline
- Pipeline UUID: c663r5ndnj9cle5z9uud80fkexijm8
- Version: 1.2.0
- View Pipeline:
Overview
The ATAC-seq Pipeline analyzes data from the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq). It automates data preprocessing, quality control, read mapping, peak calling, and downstream analysis to identify accessible chromatin regions and generate count matrices for differential accessibility analysis.
Key Use Cases:
- Chromatin Accessibility Analysis: Identifies accessible chromatin regions using MACS2 peak calling with Tn5 transposase cut site positioning.
- Differential Accessibility Studies: Generates count matrices for peak regions that can be analyzed with DEBrowser for differential accessibility analysis.
- Multi-sample Peak Consensus: Creates consensus peak calls by merging peaks from multiple samples using Bedtools for comparative analysis.
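The consensus step described above can be illustrated with a minimal Python sketch of interval merging. This is not the pipeline's code: the pipeline uses Bedtools, and the function name `merge_peaks`, the tuple layout, and the sample coordinates are hypothetical. The sketch assumes 0-based, half-open BED-style intervals and merges intervals that overlap or touch.

```python
from typing import Iterable, List, Tuple

# (chromosome, start, end) in 0-based, half-open BED-style coordinates.
Peak = Tuple[str, int, int]


def merge_peaks(peaks: Iterable[Peak]) -> List[Peak]:
    """Merge overlapping or touching intervals, chromosome by chromosome."""
    merged: List[Peak] = []
    # Sorting puts intervals on the same chromosome next to each other,
    # ordered by start, so one left-to-right pass is enough.
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical peaks from two samples.
sample_a = [("chr1", 100, 200), ("chr1", 500, 600)]
sample_b = [("chr1", 150, 250), ("chr2", 10, 50)]
consensus = merge_peaks(sample_a + sample_b)
print(consensus)
```

Pooling the per-sample peaks before merging gives one shared set of regions, so every sample can later be counted against the same coordinates.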
Features
- Support for Multiple Quality Control Options: Includes FastQC, Trimmomatic, and Cutadapt for comprehensive read quality assessment and filtering.
- Specialized ATAC-seq Processing: Positions Tn5 transposase cut sites by taking the 9th base upstream of each 5' read end and extending 29 bases downstream from it.
- Dual Mapping Strategy: Uses Bowtie2 both for sequential mapping, which filters out common reads (ERCC, rmsk), and for genome alignment. Picard then removes duplicates from the aligned reads.
- Comprehensive Peak Analysis: Employs MACS2 for peak calling and Bedtools for consensus peak generation and read quantification across samples.
- Visualization Support: Generates IGV (TDF) and Genome Browser (BigWig) files for interactive data exploration.
- Integrated Analysis Platform: Direct integration with DEBrowser for differential accessibility analysis.
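The Tn5 cut site positioning listed above comes down to a small piece of coordinate arithmetic. The sketch below is one possible reading of that rule, not the pipeline's implementation. It assumes 0-based, half-open read coordinates, assumes that minus-strand reads mirror the plus-strand rule, and uses a hypothetical function name, `tn5_cut_window`. The exact off-by-one conventions used by the pipeline are not stated in this document.

```python
from typing import Tuple


def tn5_cut_window(start: int, end: int, strand: str,
                   upstream: int = 9, length: int = 29) -> Tuple[int, int]:
    """Return a (start, end) window around the estimated Tn5 cut site.

    The window begins `upstream` bases upstream of the read's 5' end and
    extends `length` bases downstream, following the read's strand.
    """
    if strand == "+":
        # The 5' end of a plus-strand read is its leftmost coordinate.
        lo = start - upstream
        hi = lo + length
    elif strand == "-":
        # The 5' end of a minus-strand read is its rightmost coordinate,
        # so "upstream" means higher coordinates (assumed mirror rule).
        hi = end + upstream
        lo = hi - length
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # Clamp at the chromosome start so the window never goes negative.
    return max(lo, 0), hi


print(tn5_cut_window(100, 150, "+"))
print(tn5_cut_window(100, 150, "-"))
```

Under these assumptions every read is reduced to a fixed 29-base window, regardless of the original read length.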
Input/Output Specification
Inputs
Required
Reads
- Description: FASTQ files containing ATAC-seq sequencing reads from transposase-accessible chromatin regions.
- Format: .fastq.gz
- Example File Path: /path/to/input/sample.fastq.gz
ATAC-prep Section
- Description: Sample definitions required for peak calling with MACS2. Configure through the run_ATAC_MACS2 settings.
- Fields: Output-Prefix (required), Sample-Prefix (required), Input-Prefix (optional)
- Format: Tab-separated configuration
| Output-Prefix | Sample-Prefix | Input-Prefix (optional) |
|---|---|---|
| exper-rep1 | exper-rep1 | |
| control-rep1 | control-rep1 | |
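To make the tab-separated layout concrete, here is a minimal Python sketch that parses sample definitions like those in the table. It assumes one sample per line, no header row, and an empty or missing third column when no input sample is used. The function name `parse_atac_prep` and the dictionary keys are hypothetical; the pipeline itself takes these values through its run settings.

```python
import csv
import io
from typing import Dict, List, Optional


def parse_atac_prep(text: str) -> List[Dict[str, Optional[str]]]:
    """Parse tab-separated lines of Output-Prefix, Sample-Prefix, Input-Prefix."""
    samples: List[Dict[str, Optional[str]]] = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        fields = [f.strip() for f in fields]
        if not fields or not fields[0]:
            continue  # skip blank lines
        if len(fields) < 2 or not fields[1]:
            raise ValueError(f"missing Sample-Prefix for {fields[0]!r}")
        # Input-Prefix is optional: treat an absent or empty cell as None.
        input_prefix = fields[2] if len(fields) > 2 and fields[2] else None
        samples.append({
            "output_prefix": fields[0],
            "sample_prefix": fields[1],
            "input_prefix": input_prefix,
        })
    return samples


example = "exper-rep1\texper-rep1\t\ncontrol-rep1\tcontrol-rep1\n"
samples = parse_atac_prep(example)
for sample in samples:
    print(sample)
```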
Outputs
Reported Outputs
- Peak Count Matrix:
  - Description: Matrix containing read counts for each peak region across all samples
  - Format: .tsv
  - Example File Path: /output/directory/peak_counts_matrix.tsv
  - Visualization App: DEBrowser
  - Location: Results folder
- Peak Calls:
  - Description: Individual and consensus peak calls from MACS2 analysis
  - Format: .bed, .narrowPeak
  - Example File Path: /output/directory/consensus_peaks.bed
  - Visualization App: IGV, UCSC Genome Browser
  - Location: Results folder
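As a rough illustration of working with the peak count matrix, the sketch below reads a tab-separated matrix into nested dictionaries and sums the counts per sample. The column layout is an assumption, since this document does not specify it: a first column with a peak identifier followed by one column per sample. The function name `read_count_matrix`, the peak identifiers, and the counts are hypothetical.

```python
import csv
import io
from typing import Dict, TextIO


def read_count_matrix(handle: TextIO) -> Dict[str, Dict[str, int]]:
    """Read a TSV count matrix as {peak_id: {sample_name: count}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    sample_names = header[1:]  # assumed: first column holds the peak id
    matrix: Dict[str, Dict[str, int]] = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        matrix[row[0]] = {
            name: int(value) for name, value in zip(sample_names, row[1:])
        }
    return matrix


# Hypothetical matrix contents; a real run would open the .tsv file instead.
example = (
    "peak\texper-rep1\tcontrol-rep1\n"
    "chr1_100_250\t42\t7\n"
    "chr1_500_600\t3\t0\n"
)
matrix = read_count_matrix(io.StringIO(example))
totals = {
    name: sum(counts[name] for counts in matrix.values())
    for name in ("exper-rep1", "control-rep1")
}
print(totals)
```

Per-sample totals like these are a quick sanity check before loading the matrix into DEBrowser.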
Supporting Outputs
- Alignment Files:
  - Description: Bowtie2 aligned reads with duplicates removed by Picard
  - Format: .bam, .bai
  - Example File Path: /intermediate/directory/sample_aligned_dedup.bam
- Quality Control Reports:
  - Description: FastQC quality assessment reports and MultiQC summary
  - Format: .html, .zip
  - Example File Path: /intermediate/directory/sample_fastqc.html
- Visualization Files:
  - Description: IGV TDF files and BigWig files for genome browser visualization
  - Format: .tdf, .bw
  - Example File Path: /output/directory/sample.tdf
Associated Processes
- Add custom seq to genome gtf
- Adapter Removal
- Adapter Removal Summary
- ATAC CHIP summary
- ATAC MACS
- ATAC Prep
- bed merge
- bedtools coverage
- Bowtie Summary
- Check BED12
- Check Build Bowtie2 Index
- check Bowtie2 files
- check files
- Check Genome GTF
- Check chrom sizes and index
- Check Sequential Mapping Indexes
- convert gtf attributes
- Deduplication Summary
- Download build sequential mapping indexes
- FastQC
- FastQC after Adapter Removal
- featureCounts
- featureCounts Prep
- featureCounts summary
- IGV BAM2TDF converter
- Map Bowtie2
- Merge Bam
- Merge TSV Files
- MultiQC
- Overall Summary
- Picard
- Picard MarkDuplicates
- Picard Summary
- Quality Filtering
- Quality Filtering Summary
- RSeQC
- RSeQC Summary
- Sequential Mapping
- Sequential Mapping Bam count
- Sequential Mapping Summary
- Trimmer
- Trimmer Summary
- UCSC BAM2BigWig converter
- UMIextract
- Umitools Summary
References & Additional Documentation
- Related Papers:
  - Yukselen, O., Turkyilmaz, O., Ozturk, A.R. et al. DolphinNext: a distributed data processing platform for high throughput genomics. BMC Genomics 21, 310 (2020). https://doi.org/10.1186/s12864-020-6714-x
  - Buenrostro et al. 2013; Donnard et al. 2018 (ATAC-seq methodology)
  - Zhang et al. 2008 (MACS2); Quinlan and Hall 2010 (Bedtools)
- Pipeline Repository: Available through DolphinNext platform
- Workflow Diagram: Available in pipeline description pages