Running Lacuna on an HPC cluster

This guide walks you through setting up and running Lacuna analyses on a high-performance computing (HPC) cluster using SLURM and Apptainer.

Overview

The typical HPC workflow is:

  1. Pull the Lacuna container image once
  2. Prepare a batch script for the analysis you want to run
  3. Submit it as a SLURM job array that distributes subjects across nodes

Ready-to-use scripts for functional network mapping (FNM) and structural network mapping (SNM) are provided in the hpc_scripts/ directory.

Regional damage analysis is fast enough to run locally and does not require HPC resources.

Prerequisites

  • SLURM workload manager
  • Apptainer (or Singularity) available as a module
  • BIDS-formatted dataset on a shared filesystem
  • Connectome data for the analysis you want to run

Pull the container

Pull the Lacuna image once and store the resulting .sif file on the shared filesystem:

module load apptainer
apptainer pull lacuna.sif docker://ghcr.io/m-petersen/lacuna:latest 

This creates lacuna.sif in the current directory.

Batch scripts

Each batch script follows the same structure:

  1. Initialization — load Apptainer, create log directory
  2. Configuration — define paths to BIDS data, outputs, connectomes, and the SIF image
  3. Subject slicing — discover all subjects and assign a batch to the current array task
  4. Execution — run Lacuna inside the container with bind mounts

Functional network mapping

lacuna_fnm.batch
#!/bin/bash
#SBATCH --job-name=lacuna_fnm
#SBATCH --output=logs/lacuna_%A_%a.out
#SBATCH --error=logs/lacuna_%A_%a.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=04:00:00

# --- 1. Initialization ---
module load apptainer
mkdir -p logs

BATCH_SIZE=${BATCH_SIZE:-200}

# --- 2. Configuration & Paths ---
# Adapt these paths to your cluster and project
BIDS_DIR=/path/to/bids
OUTPUT_DIR=/path/to/output
CONNECTOMES_DIR=/path/to/connectomes
SIF_IMAGE=/path/to/lacuna.sif

# Unique cache per array task to prevent write conflicts when tasks share a node
export LACUNA_CACHE_DIR=/tmp/lacuna_cache_${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}
export LACUNA_TMP_DIR=$LACUNA_CACHE_DIR

mkdir -p "$OUTPUT_DIR" "$LACUNA_CACHE_DIR"

# --- 3. Subject Slicing ---
# Use SUBJECT_LIST if provided, otherwise discover from BIDS_DIR
if [ -n "${SUBJECT_LIST:-}" ]; then
    read -ra ALL_SUBJECTS <<< "$SUBJECT_LIST"
    mapfile -t ALL_SUBJECTS < <(printf '%s\n' "${ALL_SUBJECTS[@]}" | sed 's/^sub-//' | sort)
else
    mapfile -t ALL_SUBJECTS < <(find "$BIDS_DIR" -maxdepth 1 -name "sub-*" -type d -printf "%f\n" | sed 's/sub-//' | sort)
fi

START_INDEX=$(( SLURM_ARRAY_TASK_ID * BATCH_SIZE ))
SUBJECT_SUBSET=( "${ALL_SUBJECTS[@]:$START_INDEX:$BATCH_SIZE}" )

if [ ${#SUBJECT_SUBSET[@]} -eq 0 ]; then
    echo "No subjects for array task $SLURM_ARRAY_TASK_ID (start index: $START_INDEX). Exiting."
    exit 0
fi

echo "--- Job Array ID: $SLURM_ARRAY_TASK_ID ---"
echo "Processing ${#SUBJECT_SUBSET[@]} subjects starting at index $START_INDEX"
echo "Subjects: ${SUBJECT_SUBSET[*]}"

# --- 4. Run functional network mapping ---
apptainer run \
    --bind "$BIDS_DIR":/bids:ro \
    --bind "$OUTPUT_DIR":/output \
    --bind "$CONNECTOMES_DIR":/connectomes:ro \
    --bind "$LACUNA_CACHE_DIR":/lacuna_cache \
    --env LACUNA_CACHE_DIR="/lacuna_cache" \
    --env LACUNA_TMP_DIR="/lacuna_cache" \
    "$SIF_IMAGE" run fnm \
    /bids /output \
    --participant-label ${SUBJECT_SUBSET[*]} \
    --mask-space MNI152NLin6Asym \
    --connectome-path /connectomes/gsp1000/processed/ \
    --method boes \
    --fdr-alpha 0.05 \
    --t-threshold 9 \
    --output-resolution 2 \
    --parcel-atlases schaefer2018parcels100networks7 \
    --batch-size -1 \
    --nprocs $SLURM_CPUS_PER_TASK \
    --verbose

rm -rf "$LACUNA_CACHE_DIR"

Structural network mapping

SNM is the most resource-intensive analysis. The script copies the tractogram to node-local storage (under /tmp) for faster I/O and uses a smaller batch size:

lacuna_snm.batch
#!/bin/bash
#SBATCH --job-name=lacuna_snm
#SBATCH --output=logs/lacuna_%A_%a.out
#SBATCH --error=logs/lacuna_%A_%a.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=04:00:00

# --- 1. Initialization ---
module load apptainer
mkdir -p logs

BATCH_SIZE=${BATCH_SIZE:-20}

# --- 2. Configuration & Paths ---
# Adapt these paths to your cluster and project
BIDS_DIR=/path/to/bids
OUTPUT_DIR=/path/to/output
CONNECTOMES_DIR=/path/to/connectomes
SIF_IMAGE=/path/to/lacuna.sif

# Node-local cache, unique per array task, for large tractogram I/O
export LACUNA_CACHE_DIR=/tmp/lacuna_cache_${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}
export LACUNA_TMP_DIR=$LACUNA_CACHE_DIR

mkdir -p "$OUTPUT_DIR" "$LACUNA_CACHE_DIR"

# Copy tractogram to node-local storage for faster I/O
cp "$CONNECTOMES_DIR"/dTOR_full_tractogram.tck "$LACUNA_CACHE_DIR"/

# --- 3. Subject Slicing ---
# Use SUBJECT_LIST if provided, otherwise discover from BIDS_DIR
if [ -n "${SUBJECT_LIST:-}" ]; then
    read -ra ALL_SUBJECTS <<< "$SUBJECT_LIST"
    mapfile -t ALL_SUBJECTS < <(printf '%s\n' "${ALL_SUBJECTS[@]}" | sed 's/^sub-//' | sort)
else
    mapfile -t ALL_SUBJECTS < <(find "$BIDS_DIR" -maxdepth 1 -name "sub-*" -type d -printf "%f\n" | sed 's/sub-//' | sort)
fi

START_INDEX=$(( SLURM_ARRAY_TASK_ID * BATCH_SIZE ))
SUBJECT_SUBSET=( "${ALL_SUBJECTS[@]:$START_INDEX:$BATCH_SIZE}" )

if [ ${#SUBJECT_SUBSET[@]} -eq 0 ]; then
    echo "No subjects for array task $SLURM_ARRAY_TASK_ID (start index: $START_INDEX). Exiting."
    exit 0
fi

echo "--- Job Array ID: $SLURM_ARRAY_TASK_ID ---"
echo "Processing ${#SUBJECT_SUBSET[@]} subjects starting at index $START_INDEX"
echo "Subjects: ${SUBJECT_SUBSET[*]}"

# --- 4. Run structural network mapping ---
apptainer run \
    --bind "$BIDS_DIR":/bids:ro \
    --bind "$OUTPUT_DIR":/output \
    --bind "$LACUNA_CACHE_DIR":/lacuna_cache \
    --env LACUNA_CACHE_DIR="/lacuna_cache" \
    --env LACUNA_TMP_DIR="/lacuna_cache" \
    "$SIF_IMAGE" run snm \
    /bids /output \
    --participant-label ${SUBJECT_SUBSET[*]} \
    --mask-space MNI152NLin6Asym \
    --connectome-path /lacuna_cache/dTOR_full_tractogram.tck \
    --compute-roi-disconnection \
    --compute-disconnectivity-matrix \
    --parcel-atlases schaefer2018parcels100networks7 \
    --output-resolution 2 \
    --batch-size 10 \
    --nprocs $SLURM_CPUS_PER_TASK \
    --verbose

rm -rf "$LACUNA_CACHE_DIR"

Submit scripts

Each submit script counts the subjects, calculates how many array tasks are needed, and submits the corresponding batch script. You can pass specific subject names as arguments to process a subset; if none are given, all subjects in BIDS_DIR are submitted:

submit_fnm_jobs.sh
#!/bin/bash

# --- Configuration ---
BATCH_SIZE=200
BIDS_DIR="/path/to/bids"

# --- Determine subjects ---
# Pass subject names as arguments to process a subset, e.g.:
#   bash submit_fnm_jobs.sh sub-001 sub-002 sub-003
# If no arguments are given, all subjects in BIDS_DIR are used.
if [ $# -gt 0 ]; then
    num_subjects=$#
    SUBJECT_LIST="$*"
else
    num_subjects=$(find "$BIDS_DIR" -maxdepth 1 -name "sub-*" -type d | wc -l)
    SUBJECT_LIST=""
fi

if [ "$num_subjects" -eq 0 ]; then
    echo "Error: No subjects found in $BIDS_DIR"
    exit 1
fi

# --- Calculate array limit ---
num_batches=$(( (num_subjects + BATCH_SIZE - 1) / BATCH_SIZE ))
array_limit=$(( num_batches - 1 ))

# --- Submit ---
echo "Found $num_subjects subjects."
echo "Batch size: $BATCH_SIZE"
echo "Submitting $num_batches jobs (array indices 0-$array_limit)."

sbatch --array=0-$array_limit --export=ALL,BATCH_SIZE=$BATCH_SIZE,SUBJECT_LIST="$SUBJECT_LIST" lacuna_fnm.batch

Usage:

# Edit BIDS_DIR in the submit script, then:

# Submit all subjects
bash submit_fnm_jobs.sh

# Submit specific subjects only
bash submit_fnm_jobs.sh sub-001 sub-002 sub-003

The SNM submit script (submit_snm_jobs.sh) follows the same pattern.

Adapting the scripts

Before running, update the placeholder paths in both the batch and submit scripts:

Variable Description Example
BIDS_DIR BIDS dataset on shared storage /data/projects/my_study/bids
OUTPUT_DIR Output directory (read-write) /scratch/$USER/lacuna_output
CONNECTOMES_DIR Connectome files /data/connectomes
SIF_IMAGE Path to the .sif container /containers/lacuna_latest.sif
LACUNA_CACHE_DIR Node-local per-task cache directory /tmp/lacuna_cache

Resource requirements

Analysis CPU Memory Time (per batch) Batch size
Functional network mapping 16 64 GB ~4 h 200
Structural network mapping 16 64 GB ~4 h 20

Adjust --cpus-per-task, --mem, and --time to match your cluster's constraints and dataset size.

Subject slicing

The scripts use SLURM job arrays to distribute subjects across nodes. Each array task processes a slice of BATCH_SIZE subjects:

  • Array task 0 processes subjects 0–199
  • Array task 1 processes subjects 200–399
  • etc.

The submit script calculates the required number of array tasks automatically.
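
The slicing arithmetic can be checked outside SLURM with plain bash (subject IDs below are made up):

```shell
# Toy run: 7 subjects, batch size 3 -> ceil(7/3) = 3 array tasks
ALL_SUBJECTS=(001 002 003 004 005 006 007)
BATCH_SIZE=3

# Ceiling division, as in the submit scripts
num_batches=$(( (${#ALL_SUBJECTS[@]} + BATCH_SIZE - 1) / BATCH_SIZE ))
echo "batches=$num_batches"

# Each array task takes one contiguous slice; the last slice may be shorter
for task_id in $(seq 0 $(( num_batches - 1 ))); do
    start=$(( task_id * BATCH_SIZE ))
    subset=( "${ALL_SUBJECTS[@]:$start:$BATCH_SIZE}" )
    echo "task $task_id: ${subset[*]}"
done
```

Task 2 ends up with a single subject (007), which is exactly the partial final batch the submit script's ceiling division accounts for.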

Filtering masks

Use --pattern to select specific masks within each subject directory:

--pattern "*label-acuteinfarct*"

This is useful when subjects have multiple lesion masks (e.g., acute vs. chronic infarct).
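
Assuming --pattern uses shell-style glob matching against mask filenames (check the Lacuna documentation for the exact semantics), the selection behaves like this bash sketch (filenames are made up):

```shell
# Illustration only: glob matching against hypothetical mask filenames
pattern="*label-acuteinfarct*"
masks=(
    "sub-001_label-acuteinfarct_mask.nii.gz"
    "sub-001_label-chronicinfarct_mask.nii.gz"
)
selected=()
for f in "${masks[@]}"; do
    # an unquoted $pattern on the right-hand side enables glob matching
    if [[ "$f" == $pattern ]]; then
        selected+=("$f")
    fi
done
printf 'selected: %s\n' "${selected[@]}"
```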

Caching

Each array task gets its own cache directory to prevent write conflicts between parallel jobs. The cache is cleaned up after each job completes.

For SNM, the tractogram is copied to node-local storage (under /tmp, via LACUNA_CACHE_DIR) before processing. This avoids repeated reads from the shared filesystem and significantly improves I/O performance.
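
The stage-in step can be factored into a small helper that skips the copy when the file is already on the node, so re-runs pay the cost only once. This is a sketch, not part of the shipped scripts, and the paths are illustrative:

```shell
# Hypothetical helper: copy a large file to node-local storage once
stage_in() {
    local src=$1 dest_dir=$2
    local dest="$dest_dir/$(basename "$src")"
    mkdir -p "$dest_dir"
    # copy only if not already staged on this node
    [ -e "$dest" ] || cp "$src" "$dest"
    echo "$dest"
}

# Demo with a dummy file standing in for the tractogram
printf 'demo' > /tmp/demo_tractogram.tck
staged=$(stage_in /tmp/demo_tractogram.tck /tmp/demo_cache)
echo "staged at: $staged"
```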

Monitoring jobs

# Check job status
squeue -u $USER

# View logs for a specific array task
cat logs/lacuna_<job_id>_<array_id>.out

# Cancel all jobs in an array
scancel <job_id>
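
Individual array tasks can fail while the rest of the array succeeds; a cheap first check is scanning for non-empty .err logs. This is only a heuristic (some clusters write harmless warnings to stderr). The demo below writes its own sample logs so it runs anywhere:

```shell
# Demo log directory with one clean and one failing task
mkdir -p /tmp/demo_logs
: > /tmp/demo_logs/lacuna_12345_0.err                 # empty: likely fine
echo "Traceback" > /tmp/demo_logs/lacuna_12345_1.err  # non-empty: inspect

failed=0
for err in /tmp/demo_logs/lacuna_*.err; do
    if [ -s "$err" ]; then    # -s: file exists and is non-empty
        echo "check: $err"
        failed=$(( failed + 1 ))
    fi
done
echo "non-empty error logs: $failed"
```

Point the loop at your real logs/ directory to triage a finished array.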

Collecting results

After all jobs complete, aggregate results into group-level tables:

apptainer run \
    --bind /path/to/output:/output \
    lacuna.sif \
    collect /output --output-dir /output/group