Liftoff Pipeline Specification
Pipeline Details
Overview
The Liftoff pipeline maps annotations between assemblies of the same or closely related species. It automates the transfer of gene annotations from a reference genome to a target genome while preserving transcript and gene structure.
Key Use Cases:
- Gene Annotation Transfer: Mapping gene annotations from a well-annotated reference genome to a newly assembled genome of the same species.
- Comparative Genomics: Transferring annotations between closely related species for comparative analysis.
- Assembly Updates: Updating gene annotations when moving from an older genome assembly to a newer version.
Features
- Standalone Tool: No pre-generated "chain" file is required; the tool takes the genome assemblies and the reference annotation as direct input.
- Minimap2 Integration: Uses Minimap2 for efficient alignment of gene sequences rather than whole genomes.
- Structure Preservation: Maintains transcript and gene structure while maximizing sequence identity.
- Conflict Resolution: Automatically detects overlapping gene mappings and resolves them by determining which genes are most likely mis-mapped.
- Additional Copy Detection: Identifies additional gene copies in the target assembly not present in the reference annotation.
- Flexible Parameters: Customizable alignment parameters, coverage thresholds, and sequence identity requirements.
- Containerized Execution: Runs in a Docker container (quay.io/viascientific/liftoff:1.0) for reproducible results.
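As an illustration of the coverage, identity, and extra-copy settings listed above, the following minimal Python sketch translates them into Liftoff command-line options. The function name `liftoff_options` and its parameter names are hypothetical. The flags `-a`, `-s`, `-copies`, and `-sc` are those documented in the Liftoff README; how this pipeline exposes them to users is not stated here, so treat the mapping as an assumption.

```python
from typing import List, Optional


def liftoff_options(min_coverage: float = 0.5,
                    min_identity: float = 0.5,
                    find_copies: bool = False,
                    copy_identity: Optional[float] = None) -> List[str]:
    """Translate threshold settings into Liftoff command-line options.

    Hypothetical helper: the parameter names are illustrative, not part
    of the pipeline. Flags follow the Liftoff README:
      -a       minimum alignment coverage for a feature to count as mapped
      -s       minimum sequence identity of child features (exons/CDS)
      -copies  look for extra gene copies in the target genome
      -sc      minimum identity for a gene to be considered a copy
    """
    # Both thresholds are fractions, so reject values outside [0, 1].
    for name, value in (("min_coverage", min_coverage),
                        ("min_identity", min_identity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    opts = ["-a", str(min_coverage), "-s", str(min_identity)]
    if find_copies:
        opts.append("-copies")
        # -sc is only meaningful together with -copies.
        if copy_identity is not None:
            opts += ["-sc", str(copy_identity)]
    return opts
```

The returned list is meant to be appended to a Liftoff command; nothing is executed by this helper.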
Input/Output Specification
Inputs
Required
Reference Genome FASTA
- Description: Reference genome assembly in FASTA format from which annotations will be lifted.
- Format: .fasta
- Example File Path: /path/to/reference/genome.fasta
Target Genome FASTA
- Description: Target genome assembly in FASTA format to which annotations will be mapped.
- Format: .fasta
- Example File Path: /path/to/target/genome.fasta
Reference Annotation GTF
- Description: Reference gene annotation file containing gene structures to be lifted over.
- Format: .gtf
- Example File Path: /path/to/reference/annotation.gtf
Optional Inputs
Feature List File
- Description: Text file specifying particular features to lift over (optional filtering).
- Format: .txt
- Example File Path: /path/to/feature_list.txt
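To show how the inputs above might fit together, here is a minimal Python sketch that assembles a Liftoff command line. The function `build_liftoff_command` is hypothetical, and the actual invocation used inside the pipeline is not given in this specification. The argument layout (`-g` for the annotation, `-o` for the output file, then the target and reference FASTA as positional arguments, in that order) follows the Liftoff README. Passing the feature list file through `-f` is an assumption.

```python
from typing import List, Optional


def build_liftoff_command(target_fasta: str,
                          reference_fasta: str,
                          annotation: str,
                          output: str,
                          feature_list: Optional[str] = None) -> List[str]:
    """Assemble a Liftoff command as an argument list (not executed here)."""
    # -g: annotation to lift over; -o: where to write the lifted annotation.
    cmd = ["liftoff", "-g", annotation, "-o", output]
    if feature_list:
        # Assumption: the optional feature list file is passed via -f.
        cmd += ["-f", feature_list]
    # Liftoff expects the target genome first, then the reference genome.
    cmd += [target_fasta, reference_fasta]
    return cmd


if __name__ == "__main__":
    # Example file paths taken from the input descriptions above.
    command = build_liftoff_command(
        target_fasta="/path/to/target/genome.fasta",
        reference_fasta="/path/to/reference/genome.fasta",
        annotation="/path/to/reference/annotation.gtf",
        output="/output/directory/lifted_annotation.gtf",
    )
    print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` in an environment where `liftoff` is installed, for example inside the container.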
Outputs
Reported Outputs
Lifted Annotation GTF
- Description: Gene annotation file for the target genome containing the lifted-over features.
- Format: .gtf
- Example File Path: /output/directory/lifted_annotation.gtf
- Visualization App: IGV, UCSC Genome Browser
- Location: Output Folder
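A quick sanity check on the lifted annotation is to count how many features of each type it contains and compare the totals with the reference annotation. The sketch below is one way to do this in Python, assuming the output follows the standard nine-column, tab-separated GTF layout. The function name `count_gtf_features` is hypothetical and not part of the pipeline.

```python
from collections import Counter
from typing import Iterable


def count_gtf_features(lines: Iterable[str]) -> Counter:
    """Count GTF records per feature type (third column)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A valid GTF record has nine tab-separated columns.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    # Example path from the output description above.
    with open("/output/directory/lifted_annotation.gtf") as handle:
        for feature_type, n in sorted(count_gtf_features(handle).items()):
            print(f"{feature_type}\t{n}")
```

Running the same count on the reference GTF gives a rough view of how many genes, transcripts, and exons were carried over.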
Associated Processes
References & Additional Documentation
- Related Papers: Alaina Shumate, Steven L Salzberg, Liftoff: accurate mapping of gene annotations, Bioinformatics, Volume 37, Issue 12, June 2021, Pages 1639–1643, https://doi.org/10.1093/bioinformatics/btaa1016
- Pipeline Repository: https://github.com/agshumate/Liftoff