App Development Guide
We use these practices internally!
Container Development Recommendations
Via Foundry pipelines and applications rely on containerized environments to ensure reproducibility, performance, and scalability. This guide outlines the best practices we follow when working with containers.
Container Sources and Registries
Use Community Containers When Possible
For common bioinformatics tools, community-maintained containers are often the most reliable choice. These images are widely used and tested by the research community, saving time and reducing maintenance overhead.
Recommended registries:
- Biocontainers on Quay — trusted images for tools like FastQC, STAR, Picard, and more.
- NVIDIA NGC Catalog — GPU-optimized images for machine learning and computational workloads.
Container Build Best Practices
Keep Image Size Under 5 GB
Larger images slow down both development and execution. Keeping images lightweight ensures:
- Faster build and deploy times
- Lower storage and compute costs
- More efficient scaling across pipelines
Use Multi-Stage Builds
Multi-stage builds separate build-time and run-time dependencies, producing smaller, cleaner images.
Benefits:
- Reduce image size by removing unnecessary build tools
- Create streamlined runtime environments
- Improve reproducibility and security
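The separation described above can be sketched as a two-stage Dockerfile. This is a minimal illustration, not the layout Via Foundry uses for its own images: the `python:3.11-slim` base, the `/opt/venv` location, and the `requirements.txt` file are assumptions chosen for the example.

```dockerfile
# Stage 1 (builder): has the compilers needed to build Python packages.
FROM python:3.11-slim AS builder
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*
# requirements.txt is a hypothetical file of pinned dependencies.
COPY requirements.txt /tmp/requirements.txt
RUN python -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir -r /tmp/requirements.txt

# Stage 2 (runtime): starts from a clean base; only the built environment is copied in.
FROM python:3.11-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:${PATH}"
# run-analysis.sh is assumed to sit next to the Dockerfile in the build context.
COPY run-analysis.sh /usr/local/bin/run-analysis.sh
RUN chmod +x /usr/local/bin/run-analysis.sh
```

Only the final stage ends up in the tagged image, so `build-essential` and the pip download cache never reach the runtime environment. Building with `docker build -t pipeline_base_image:1.0 .` would tag the result with the image name used in the process example later in this guide.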
Dockerfile Recommendations
General recommendations:
- Use official base images where possible
- Pin versions of dependencies for reproducibility
- Remove unnecessary files and packages after installation
- Document your build steps clearly in the Dockerfile
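The recommendations above might look like the following in practice. This is a sketch under assumptions: the base image tag, the `procps` package, and the pinned Python packages are illustrative choices, not requirements of Via Foundry.

```dockerfile
# Official base image, pinned to a specific tag rather than "latest".
FROM python:3.11-slim

# Install OS packages and clear the apt package lists in the same RUN step,
# so the lists are not stored in an image layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends procps \
    && rm -rf /var/lib/apt/lists/*

# Pin Python dependencies to exact versions (illustrative packages);
# --no-cache-dir keeps pip's download cache out of the image.
RUN pip install --no-cache-dir numpy==1.26.4 pandas==2.2.2
```

The cleanup happens inside the same `RUN` instruction as the install because files deleted in a later instruction still occupy space in the earlier layer.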
Defining Containers in Via Foundry Pipelines
Via Foundry pipelines allow container definitions directly within Nextflow configurations.
Example:
// Base image
params.IMAGE_BASE = "quay.io/viascientific"

// Example process definition
process run_analysis {
    container "${params.IMAGE_BASE}/pipeline_base_image:1.0"

    script:
    """
    run-analysis.sh input_data
    """
}
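To pull images from a private registry (e.g., AWS ECR), set the params.IMAGE_BASE variable to that registry's address. Each process resolves its container from this prefix, so the process definitions themselves stay unchanged.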
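As a sketch of the private-registry case, the override could look like this. The account ID, region, and `my-team` namespace are placeholders, not real values; substitute your own registry address.

```groovy
// Point the image prefix at a private AWS ECR registry.
// 123456789012, us-east-1, and my-team are placeholder values.
params.IMAGE_BASE = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-team"

// The process above would then resolve its container to:
//   123456789012.dkr.ecr.us-east-1.amazonaws.com/my-team/pipeline_base_image:1.0
```

The run environment still needs credentials for the private registry. How those are supplied depends on your setup and is not covered by this sketch.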