ShapeMapper2 Pipeline Specification

Pipeline Details

  • Name: ShapeMapper2
  • Pipeline UUID: 05365fb58d3f49e0893aab9b3ce8b443
  • Version: 1.1.0

Overview

The ShapeMapper2 pipeline automates the calculation of RNA chemical probing reactivities from mutational profiling (MaP) experiments. In these experiments, chemical adducts on RNA are recorded as internal mutations in the cDNA during reverse transcription and are then read out by massively parallel sequencing.

Key Use Cases:

  • SHAPE Structure Probing Analysis: Processing SHAPE-MaP data to determine RNA secondary structure through chemical probing reactivity profiles.
  • DMS Chemical Probing: Analyzing DMS (dimethyl sulfate) experiments with specialized per-nucleotide normalization for base accessibility mapping.
  • RNA Structure Validation: Generating reactivity profiles for validating predicted RNA secondary structures and identifying flexible regions.

Features

  • Support for Multiple Chemical Probing Methods: Includes SHAPE and DMS modes with optimized analysis parameters for each probing chemistry.
  • Flexible Aligner Options: Supports both Bowtie2 and STAR aligners, with STAR recommended for sequences longer than several thousand nucleotides.
  • Comprehensive Data Processing Pipeline: Performs reference sequence correction, quality trimming, paired read merging, alignment, mutation handling, and reactivity calculation.
  • Amplicon and Random Primer Support: Handles both amplicon-based experiments with primer trimming and random-primed experiments.
  • Quality Control and Validation: Implements heuristic quality control checks, minimum depth requirements, and mutation frequency thresholds.
  • Automated Reporting: Generates PDF visualization reports and comprehensive output files for downstream analysis.

Input/Output Specification

Inputs

Required

target

  • Description: FASTA file containing one or more target DNA sequences (use 'T', not 'U'). Lowercase positions are excluded from the reactivity profile; use lowercase to mark primer-binding sites when using directed primers.
  • Format: .fa or .fasta
  • Example File Path: /path/to/input/target_sequence.fa
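
The target conventions above (DNA alphabet, lowercase positions excluded) can be checked before a run. The following is a minimal sketch, not part of the pipeline: the helper names `parse_fasta` and `check_target` and the example sequence are hypothetical, introduced only for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def check_target(seq):
    """Return (has_u, masked_positions) for one target sequence.

    has_u is True if the sequence contains 'U' or 'u' (RNA alphabet).
    masked_positions lists the 1-based positions written in lowercase,
    i.e. the positions that would be excluded from the reactivity profile.
    """
    has_u = "u" in seq.lower()
    masked = [i for i, base in enumerate(seq, start=1) if base.islower()]
    return has_u, masked


# Hypothetical target with lowercase primer-binding sites at both ends.
example = ">demo_rna\nggccTTAGCATCGAacgt\n"
for name, seq in parse_fasta(example).items():
    has_u, masked = check_target(seq)
    print(name, "contains U:", has_u, "masked positions:", masked)
```

Running this kind of check first makes it easier to catch a target file that was exported as RNA ('U') or that lost its lowercase primer masking.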

modified_fastq

  • Description: Sequencing data from the SHAPE-modified RNA sample. Folder containing the FASTQ files from the chemical probing treatment.
  • Format: .fastq/.fastq.gz
  • Example File Path: /path/to/modified_samples/

untreated_fastq

  • Description: Sequencing data from the untreated (unmodified) RNA control sample. Folder containing the FASTQ files used for background correction.
  • Format: .fastq/.fastq.gz
  • Example File Path: /path/to/untreated_samples/
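
Because modified_fastq and untreated_fastq are folders rather than single files, it can help to confirm that each folder contains files in the accepted formats. The sketch below assumes the FASTQ files sit directly inside the folder (no subfolders); the helper `list_fastq_files` and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path

# Extensions taken from the "Format" entries above.
FASTQ_SUFFIXES = (".fastq", ".fastq.gz")


def list_fastq_files(folder):
    """Return sorted paths of FASTQ files found directly inside folder."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.name.endswith(FASTQ_SUFFIXES)
    )


# Demo on a temporary folder holding two FASTQ files and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("s1_R1.fastq.gz", "s1_R2.fastq.gz", "notes.txt"):
        (Path(tmp) / fname).touch()
    found = [p.name for p in list_fastq_files(tmp)]

print(found)
```

An empty result for either folder would indicate a wrong path or unexpected file extensions.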

Optional Inputs

denatured_fastq

  • Description: Sequencing data from the denatured RNA control sample. Folder containing the FASTQ files used to normalize the reactivity profile.
  • Format: .fastq/.fastq.gz
  • Example File Path: /path/to/denatured_samples/
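
To illustrate how the three sample types relate, here is a minimal sketch of a per-position calculation, assuming the relationship commonly described for SHAPE-MaP: the untreated mutation rate is subtracted from the modified rate as background, and the result is divided by the denatured rate when a denatured control is supplied. The function name and the example rates are hypothetical, and any later profile-wide scaling or DMS-specific normalization is deliberately left out.

```python
def raw_reactivity(modified_rate, untreated_rate, denatured_rate=None):
    """Background-subtracted reactivity for a single nucleotide position.

    Rates are mutation rates (mutations / effective read depth).
    If a denatured control rate is supplied, the difference is divided by it.
    Returns NaN when the denatured rate is not positive.
    """
    diff = modified_rate - untreated_rate
    if denatured_rate is None:
        return diff
    if denatured_rate <= 0:
        return float("nan")
    return diff / denatured_rate


# Hypothetical rates for one position.
without_denatured = raw_reactivity(0.012, 0.002)
with_denatured = raw_reactivity(0.012, 0.002, denatured_rate=0.02)
print(without_denatured, with_denatured)
```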

dms

  • Description: Enable DMS mode, which normalizes reactivities on a per-nucleotide basis and is optimized for DMS-MaP data collected using Marathon and/or TGIRT enzymes.
  • Format: Checkbox (TRUE/FALSE)
  • Default: TRUE

aligner

  • Description: Aligner used for sequence alignment, either Bowtie2 or STAR. STAR is recommended for sequences longer than several thousand nucleotides.
  • Format: Dropdown selection
  • Options: "BowTie2", "STAR"
  • Default: "BowTie2"

amplicon

  • Description: Require reads to align near expected primer pair locations and intelligently trim primer sites. Use for targeted amplicon experiments.
  • Format: Checkbox (TRUE/FALSE)
  • Default: FALSE
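
The dms, aligner, and amplicon inputs plausibly correspond to options of the underlying ShapeMapper2 command-line tool. The sketch below shows one way such a mapping could look. It assumes the `--dms`, `--star-aligner`, and `--amplicon` flags of the ShapeMapper2 CLI; how this pipeline actually passes its form inputs to the tool is not documented here, so both the mapping and the function name `optional_flags` are assumptions.

```python
def optional_flags(dms=True, aligner="BowTie2", amplicon=False):
    """Translate the pipeline's optional toggles into a list of CLI flags.

    Defaults mirror the defaults listed in this specification.
    Bowtie2 is assumed to need no flag because it is the default aligner.
    """
    if aligner not in ("BowTie2", "STAR"):
        raise ValueError(f"unsupported aligner: {aligner!r}")

    flags = []
    if dms:
        flags.append("--dms")
    if aligner == "STAR":
        flags.append("--star-aligner")
    if amplicon:
        flags.append("--amplicon")
    return flags


print(optional_flags())
print(optional_flags(dms=False, aligner="STAR", amplicon=True))
```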

min_depth

  • Description: Minimum effective sequencing depth for including data (threshold must be met for all provided samples).
  • Format: Numeric input
  • Default: 5000
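
The "all provided samples" condition means a position is kept only if every sample reaches the threshold there. A minimal sketch of that rule follows, using hypothetical per-position depths; the function name and data layout are assumptions made for illustration, not the pipeline's internal representation.

```python
def positions_passing_depth(depths_by_sample, min_depth=5000):
    """Return 0-based positions where every sample meets min_depth.

    depths_by_sample maps a sample name to a list of effective read depths,
    one value per nucleotide position.
    """
    samples = list(depths_by_sample.values())
    if not samples:
        return []
    n_positions = min(len(depths) for depths in samples)
    return [
        i for i in range(n_positions)
        if all(depths[i] >= min_depth for depths in samples)
    ]


# Hypothetical depths for a four-nucleotide target.
depths = {
    "modified": [6200, 5100, 4900, 8000],
    "untreated": [7000, 4800, 5200, 9000],
}
passing = positions_passing_depth(depths)
print(passing)
```

In this example, positions 1 and 2 are dropped because one of the two samples falls below 5000 at each of them, even though the other sample passes.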

Outputs

Reported Outputs

  • PDF Reports:
      • Description: Comprehensive visualization reports including reactivity profiles, mutation rates, and quality control plots
      • Format: .pdf
      • Example File Path: /output/shapemapper_plots/shapemapper_out/profiles.pdf
      • Location: shapemapper_plots/shapemapper_out/

  • Modified Parsed Mutations:
      • Description: Processed mutation data from modified RNA samples with calculated reactivity values
      • Format: .mut
      • Example File Path: /output/modified_mut/shapemapper_out/sample_modified.mut
      • Location: modified_mut/shapemapper_out/

  • Untreated Parsed Mutations:
      • Description: Processed mutation data from untreated RNA samples for background correction
      • Format: .mut
      • Example File Path: /output/untreated_mut/shapemapper_out/sample_untreated.mut
      • Location: untreated_mut/shapemapper_out/

Supporting Outputs

  • Log Files:
      • Description: Detailed execution logs with process information, quality control metrics, and error reporting
      • Format: .txt
      • Example File Path: /output/shapemapper_log.txt

  • Reactivity Profiles:
      • Description: Normalized reactivity values for each nucleotide position in the target RNA sequence
      • Format: .txt/.csv
      • Example File Path: /output/reactivity_profiles.txt

Associated Processes

References & Additional Documentation