
Check Files Specifications

Process Details

  • Name: check_files
  • Process UUID: zO0kEVdaXn1bg1bd3ntumkH9kRxBIj
  • Process Group: Index

Overview

This process validates essential genomic reference files. It checks whether the specified GTF annotation file, genome FASTA sequence, genome sizes file, and BED file exist, and passes each file's path and existence status to downstream analysis workflows.

The process is implemented in Groovy and uses a custom pathChecker function to validate files.

Key Functionality

  • File Existence Validation: Checks whether specified genomic reference files exist in the file system
  • Path Resolution: Resolves and validates file paths for GTF, FASTA, sizes, and BED files
  • Status Reporting: Provides existence status and validated paths for all checked files
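
The implementation of pathChecker is not documented on this page. As an illustration only, a helper of this kind might look like the Groovy sketch below. The function name pathCheckerSketch, its signature, and the returned map keys (path, exists) are assumptions made for this example, not the actual code.

```groovy
// Hypothetical sketch: the real pathChecker signature and return type are not documented here.
// Returns a map holding the resolved absolute path and whether a regular file exists there.
Map pathCheckerSketch(String rawPath) {
    // A null or blank value is treated as "not provided", since every input is optional.
    if (rawPath == null || rawPath.trim().isEmpty()) {
        return [path: '', exists: false]
    }
    File f = new File(rawPath.trim())
    return [path: f.absolutePath, exists: f.isFile()]
}

// Usage: a temporary file stands in for a real reference file.
File tmp = File.createTempFile('genome', '.fa')
tmp.deleteOnExit()
Map found = pathCheckerSketch(tmp.path)
Map missing = pathCheckerSketch(tmp.path + '.does_not_exist')
println "found:   ${found.exists}"
println "missing: ${missing.exists}"
```

Returning the status alongside the path, rather than throwing on a missing file, lets a downstream step decide how to react to an absent optional input.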

Input/Output Specification

Inputs

Optional Inputs

  • GTF File

    • Description: Gene annotation file in GTF format containing genomic features
    • Format: GTF
  • Genome

    • Description: Reference genome sequence file in FASTA format
    • Format: FASTA
  • Genome Sizes

    • Description: Chromosome/contig size information file, typically a tab-separated list of sequence names and their lengths
    • Format: sizes
  • BED File

    • Description: Genomic interval file in BED format
    • Format: BED

Outputs

  • GTF File

    • Description: Validated GTF annotation file with confirmed existence status
    • Format: GTF
  • Genome

    • Description: Validated genome FASTA file with confirmed existence status
    • Format: FASTA
  • Genome Sizes

    • Description: Validated genome sizes file with confirmed existence status
    • Format: sizes
  • BED File

    • Description: Validated BED format file with confirmed existence status
    • Format: BED
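
To make the shape of these outputs concrete, the following Groovy sketch collects a path and an existence flag for each of the four optional inputs. The function buildReport, the input keys (gtf, genome, sizes, bed), and the report layout are hypothetical and chosen only for illustration; the actual process may structure its outputs differently.

```groovy
// Hypothetical sketch: input names and report layout are assumptions, not the real process code.
Map buildReport(Map<String, String> inputs) {
    Map report = [:]
    inputs.each { String name, String rawPath ->
        // Optional inputs may be null or blank; those are reported as not existing.
        boolean provided = rawPath != null && !rawPath.trim().isEmpty()
        File f = provided ? new File(rawPath.trim()) : null
        report[name] = [
            path  : provided ? f.absolutePath : '',
            exists: provided && f.isFile()
        ]
    }
    return report
}

// Usage: one real (temporary) file, one unset input, one missing path, one blank value.
File gtf = File.createTempFile('genes', '.gtf')
gtf.deleteOnExit()
Map report = buildReport([
    gtf   : gtf.path,
    genome: null,
    sizes : '/no/such/dir/genome.sizes',
    bed   : ''
])
report.each { name, status -> println "${name}\t${status.exists}\t${status.path}" }
```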

References & Resources

  • Tool Documentation: Contact the team for details on the custom pathChecker function implementation