Using GCP Tools (gsutil / Cyberduck)

Tip

When uploading data from your local machine to the cloud, organize it into clearly named folders (for example, one folder per dataset). This makes your files much easier to locate later.

Once your organization is registered in Via Foundry, a default Google Cloud Storage (GCS) bucket is created for you. You can access this bucket directly within Via Foundry, or upload data to it with external tools such as gsutil or Cyberduck.

Prerequisites

Contact Support

Before you begin, contact Via Scientific support at support@viascientific.com to obtain:

  • The default bucket name for your organization (e.g., gs://your-org-bucket)
  • The GCP project ID associated with your bucket access
  • Confirmation that Google OAuth access is enabled for your organization
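
Once support has sent these values, it can help to keep them in shell variables so that later commands can be pasted without editing. The following is a minimal sketch, not something Via Foundry requires: the helper normalize_bucket and the variable names ORG_BUCKET and GCP_PROJECT_ID are illustrative, and the values shown are placeholders.

```shell
# Hypothetical helper: normalize a bucket name to the form gs://NAME.
# It adds the gs:// scheme if it is missing and strips one trailing slash.
normalize_bucket() {
  nb_name="${1#gs://}"   # drop a leading gs:// if present
  nb_name="${nb_name%/}" # drop one trailing slash if present
  if [ -z "$nb_name" ]; then
    echo "bucket name is empty" >&2
    return 1
  fi
  printf 'gs://%s\n' "$nb_name"
}

# Placeholder values: replace them with what support sends you.
ORG_BUCKET="$(normalize_bucket "your-org-bucket")"
GCP_PROJECT_ID="your-project-id"
```

With the variables set, a command such as gsutil ls "$ORG_BUCKET/" refers to the same bucket no matter how the name was originally written.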

Option 1: Upload with gsutil (CLI)

gsutil is a Python application that lets you access Google Cloud Storage from the command line. It is recommended for large uploads.

1. Install Google Cloud CLI

Install the Google Cloud CLI, which includes gsutil, by following Google's official installation instructions.

2. Authenticate via Google OAuth

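Run the following command to initialize the Google Cloud CLI and authenticate. It also prompts for a default project; choose the GCP project ID you received from support: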

gcloud init

Or, if you already have gcloud configured:

gcloud auth login
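
Before moving on to the upload, it may be worth confirming that the login succeeded. The sketch below assumes the Google Cloud CLI is on your PATH; the function name check_gcloud_login is illustrative. It relies on gcloud auth list with a filter on the active account.

```shell
# Sketch: confirm that gcloud is installed and that an account is logged in.
check_gcloud_login() {
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud not found: install the Google Cloud CLI first" >&2
    return 1
  fi
  # Print only the e-mail address of the currently active account, if any.
  cgl_account="$(gcloud auth list --filter=status:ACTIVE --format='value(account)')"
  if [ -z "$cgl_account" ]; then
    echo "no active account: run 'gcloud auth login'" >&2
    return 1
  fi
  echo "authenticated as $cgl_account"
}
```

Calling check_gcloud_login prints the active account or explains what is missing. If you skipped gcloud init, gcloud config set project YOUR_PROJECT_ID selects the project ID provided by support.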

3. Step-by-Step Upload

  1. Verify Access: List the contents of your bucket to ensure you have access:

    gsutil ls gs://YOUR_ORG_BUCKET/
    
  2. Upload Data:

    • Single File:
      gsutil cp /path/to/local/file.fastq.gz gs://YOUR_ORG_BUCKET/my-dataset/
      
    • Folder (Recursive): Use the -m flag for parallel uploads (faster for many files):
      gsutil -m cp -r /path/to/local/folder gs://YOUR_ORG_BUCKET/my-dataset/
      
  3. Verify Upload:

    gsutil ls gs://YOUR_ORG_BUCKET/my-dataset/
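
For folders with many files, reading the listing by eye is error-prone. One possible check, sketched below, compares the number of local files with the number of objects under the remote prefix. It assumes gsutil is installed and authenticated; the function names are illustrative. Matching counts do not prove the contents are identical, only that nothing is obviously missing.

```shell
# Count regular files under a local folder (recursively).
count_local_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Count objects under a remote prefix. The ** wildcard makes gsutil ls
# list every object below the prefix, one URL per line.
count_remote_objects() {
  # Skip any names ending in "/" (folder placeholder objects).
  gsutil ls "${1%/}/**" | grep -v '/$' | wc -l | tr -d ' '
}

# Compare the two counts and report the result.
verify_counts() {
  vc_local="$(count_local_files "$1")"
  vc_remote="$(count_remote_objects "$2")"
  if [ "$vc_local" -eq "$vc_remote" ]; then
    echo "OK: $vc_local files"
  else
    echo "MISMATCH: local=$vc_local remote=$vc_remote" >&2
    return 1
  fi
}
```

Call it as verify_counts /path/to/local/folder REMOTE_PREFIX, where REMOTE_PREFIX is the location under gs://YOUR_ORG_BUCKET/my-dataset/ where the files actually landed, as shown by the gsutil ls output.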
    

Option 2: Upload with Cyberduck (GUI)

If you prefer a graphical user interface, you can use Cyberduck to upload your data. Please refer to our Cyberduck Guide for detailed instructions on how to set up and use Cyberduck with your cloud storage.

Using Uploaded Data in Via Foundry

Once your data is uploaded, you can connect it as a Data Source:

  1. Go to Data -> Create Dataset.
  2. Select Google Cloud Storage.
  3. In Choose Credentials, select "Account Default".
  4. In Data Source Path, enter the path to your data (e.g., gs://YOUR_ORG_BUCKET/my-dataset/).
  5. Click Connect.