Running Lacuna on an HPC cluster

This guide walks you through setting up and running Lacuna analyses on a high-performance computing (HPC) cluster using SLURM and Apptainer.

Overview

The typical HPC workflow is:

  1. Pull the Lacuna container image once
  2. Prepare a batch script for the analysis you want to run
  3. Submit it as a SLURM job array that distributes subjects across nodes

Ready-to-use scripts for functional network mapping (FNM) and structural network mapping (SNM) are provided in the hpc_scripts/ directory.

Regional damage analysis is fast enough to run locally and does not require HPC resources.

Prerequisites

  • SLURM workload manager
  • Apptainer (or Singularity) available as a module
  • BIDS-formatted dataset on a shared filesystem
  • Connectome data for the analysis you want to run

Pull the container

Pull the Lacuna image once and store the resulting .sif file on the shared filesystem:

module load apptainer
apptainer pull lacuna.sif docker://ghcr.io/m-petersen/lacuna:latest 

This creates lacuna.sif in the current directory.

Batch scripts

Each batch script follows the same structure:

  1. Initialization — load Apptainer, create log directory
  2. Configuration — define paths to BIDS data, outputs, connectomes, and the SIF image
  3. Subject slicing — discover all subjects and assign a batch to the current array task
  4. Execution — run Lacuna inside the container with bind mounts

Functional network mapping

lacuna_fnm.batch
#!/bin/bash
#SBATCH --job-name=lacuna_fnm
#SBATCH --output=logs/lacuna_%A_%a.out
#SBATCH --error=logs/lacuna_%A_%a.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=04:00:00

# --- 1. Initialization ---
module load apptainer
mkdir -p logs

BATCH_SIZE=${BATCH_SIZE:-200}

# --- 2. Configuration & Paths ---
# Adapt these paths to your cluster and project
BIDS_DIR=/path/to/bids
OUTPUT_DIR=/path/to/output
CONNECTOMES_DIR=/path/to/connectomes
SIF_IMAGE=/path/to/lacuna.sif

# Unique cache per array task to prevent write conflicts when tasks share a node
export LACUNA_CACHE_DIR=/tmp/lacuna_cache_${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}
export LACUNA_TMP_DIR=$LACUNA_CACHE_DIR

mkdir -p "$OUTPUT_DIR" "$LACUNA_CACHE_DIR"

# --- 3. Subject Slicing ---
# Use SUBJECT_LIST if provided, otherwise discover from BIDS_DIR
if [ -n "${SUBJECT_LIST:-}" ]; then
    read -ra ALL_SUBJECTS <<< "$SUBJECT_LIST"
    mapfile -t ALL_SUBJECTS < <(printf '%s\n' "${ALL_SUBJECTS[@]}" | sed 's/^sub-//' | sort)
else
    mapfile -t ALL_SUBJECTS < <(find "$BIDS_DIR" -maxdepth 1 -name "sub-*" -type d -printf "%f\n" | sed 's/sub-//' | sort)
fi

START_INDEX=$(( SLURM_ARRAY_TASK_ID * BATCH_SIZE ))
SUBJECT_SUBSET=( "${ALL_SUBJECTS[@]:$START_INDEX:$BATCH_SIZE}" )

if [ ${#SUBJECT_SUBSET[@]} -eq 0 ]; then
    echo "No subjects for array task $SLURM_ARRAY_TASK_ID (start index: $START_INDEX). Exiting."
    exit 0
fi

echo "--- Job Array ID: $SLURM_ARRAY_TASK_ID ---"
echo "Processing ${#SUBJECT_SUBSET[@]} subjects starting at index $START_INDEX"
echo "Subjects: ${SUBJECT_SUBSET[*]}"

# --- 4. Run functional network mapping ---
apptainer run \
    --bind "$BIDS_DIR":/bids:ro \
    --bind "$OUTPUT_DIR":/output \
    --bind "$CONNECTOMES_DIR":/connectomes:ro \
    --bind "$LACUNA_CACHE_DIR":/lacuna_cache \
    --env LACUNA_CACHE_DIR="/lacuna_cache" \
    --env LACUNA_TMP_DIR="/lacuna_cache" \
    "$SIF_IMAGE" run fnm \
    /bids /output \
    --participant-label ${SUBJECT_SUBSET[*]} \
    --mask-space MNI152NLin6Asym \
    --connectome-path /connectomes/gsp1000/processed/ \
    --method boes \
    --fdr-alpha 0.05 \
    --t-threshold 9 \
    --output-resolution 2 \
    --parcel-atlases schaefer2018parcels100networks7 \
    --batch-size -1 \
    --nprocs $SLURM_CPUS_PER_TASK \
    --verbose

rm -rf "$LACUNA_CACHE_DIR"

Structural network mapping

SNM is the most resource-intensive analysis. The script copies the tractogram to node-local storage (under /tmp) for faster I/O and uses a smaller batch size:

lacuna_snm.batch
#!/bin/bash
#SBATCH --job-name=lacuna_snm
#SBATCH --output=logs/lacuna_%A_%a.out
#SBATCH --error=logs/lacuna_%A_%a.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=04:00:00

# --- 1. Initialization ---
module load apptainer
mkdir -p logs

BATCH_SIZE=${BATCH_SIZE:-20}

# --- 2. Configuration & Paths ---
# Adapt these paths to your cluster and project
BIDS_DIR=/path/to/bids
OUTPUT_DIR=/path/to/output
CONNECTOMES_DIR=/path/to/connectomes
SIF_IMAGE=/path/to/lacuna.sif

# Node-local cache, unique per array task, for large tractogram I/O
export LACUNA_CACHE_DIR=/tmp/lacuna_cache_${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}
export LACUNA_TMP_DIR=$LACUNA_CACHE_DIR

mkdir -p "$OUTPUT_DIR" "$LACUNA_CACHE_DIR"

# Copy tractogram to node-local storage for faster I/O
cp "$CONNECTOMES_DIR"/dTOR_full_tractogram.tck "$LACUNA_CACHE_DIR"/

# --- 3. Subject Slicing ---
# Use SUBJECT_LIST if provided, otherwise discover from BIDS_DIR
if [ -n "${SUBJECT_LIST:-}" ]; then
    read -ra ALL_SUBJECTS <<< "$SUBJECT_LIST"
    mapfile -t ALL_SUBJECTS < <(printf '%s\n' "${ALL_SUBJECTS[@]}" | sed 's/^sub-//' | sort)
else
    mapfile -t ALL_SUBJECTS < <(find "$BIDS_DIR" -maxdepth 1 -name "sub-*" -type d -printf "%f\n" | sed 's/sub-//' | sort)
fi

START_INDEX=$(( SLURM_ARRAY_TASK_ID * BATCH_SIZE ))
SUBJECT_SUBSET=( "${ALL_SUBJECTS[@]:$START_INDEX:$BATCH_SIZE}" )

if [ ${#SUBJECT_SUBSET[@]} -eq 0 ]; then
    echo "No subjects for array task $SLURM_ARRAY_TASK_ID (start index: $START_INDEX). Exiting."
    exit 0
fi

echo "--- Job Array ID: $SLURM_ARRAY_TASK_ID ---"
echo "Processing ${#SUBJECT_SUBSET[@]} subjects starting at index $START_INDEX"
echo "Subjects: ${SUBJECT_SUBSET[*]}"

# --- 4. Run structural network mapping ---
apptainer run \
    --bind "$BIDS_DIR":/bids:ro \
    --bind "$OUTPUT_DIR":/output \
    --bind "$LACUNA_CACHE_DIR":/lacuna_cache \
    --env LACUNA_CACHE_DIR="/lacuna_cache" \
    --env LACUNA_TMP_DIR="/lacuna_cache" \
    "$SIF_IMAGE" run snm \
    /bids /output \
    --participant-label ${SUBJECT_SUBSET[*]} \
    --mask-space MNI152NLin6Asym \
    --connectome-path /lacuna_cache/dTOR_full_tractogram.tck \
    --compute-roi-disconnection \
    --compute-disconnectivity-matrix \
    --parcel-atlases schaefer2018parcels100networks7 \
    --output-resolution 2 \
    --batch-size 10 \
    --nprocs $SLURM_CPUS_PER_TASK \
    --verbose

rm -rf "$LACUNA_CACHE_DIR"

Submit scripts

Each submit script counts the subjects, calculates how many array tasks are needed, and submits the corresponding batch script. You can pass specific subject names as arguments to process a subset; if none are given, all subjects in BIDS_DIR are submitted:

submit_fnm_jobs.sh
#!/bin/bash

# --- Configuration ---
BATCH_SIZE=200
BIDS_DIR="/path/to/bids"

# --- Determine subjects ---
# Pass subject names as arguments to process a subset, e.g.:
#   bash submit_fnm_jobs.sh sub-001 sub-002 sub-003
# If no arguments are given, all subjects in BIDS_DIR are used.
if [ $# -gt 0 ]; then
    num_subjects=$#
    SUBJECT_LIST="$*"
else
    num_subjects=$(find "$BIDS_DIR" -maxdepth 1 -name "sub-*" -type d | wc -l)
    SUBJECT_LIST=""
fi

if [ "$num_subjects" -eq 0 ]; then
    echo "Error: No subjects found in $BIDS_DIR"
    exit 1
fi

# --- Calculate array limit ---
num_batches=$(( (num_subjects + BATCH_SIZE - 1) / BATCH_SIZE ))
array_limit=$(( num_batches - 1 ))

# --- Submit ---
echo "Found $num_subjects subjects."
echo "Batch size: $BATCH_SIZE"
echo "Submitting $num_batches jobs (array indices 0-$array_limit)."

sbatch --array=0-$array_limit --export=ALL,BATCH_SIZE=$BATCH_SIZE,SUBJECT_LIST="$SUBJECT_LIST" lacuna_fnm.batch

Usage:

# Edit BIDS_DIR in the submit script, then:

# Submit all subjects
bash submit_fnm_jobs.sh

# Submit specific subjects only
bash submit_fnm_jobs.sh sub-001 sub-002 sub-003

The SNM submit script (submit_snm_jobs.sh) follows the same pattern.

Adapting the scripts

Before running, update the placeholder paths in both the batch and submit scripts:

Variable Description Example
BIDS_DIR BIDS dataset on shared storage /data/projects/my_study/bids
OUTPUT_DIR Output directory (read-write) /scratch/$USER/lacuna_output
CONNECTOMES_DIR Connectome files /data/connectomes
SIF_IMAGE Path to the .sif container /containers/lacuna_latest.sif
LACUNA_CACHE_DIR Node-local per-task cache directory /tmp/lacuna_cache

Resource requirements

Analysis CPU Memory Time (per batch) Batch size
Functional network mapping 16 64 GB ~4 h 200
Structural network mapping 16 64 GB ~4 h 20

Adjust --cpus-per-task, --mem, and --time to match your cluster's constraints and dataset size.

Subject slicing

The scripts use SLURM job arrays to distribute subjects across nodes. Each array task processes a slice of BATCH_SIZE subjects:

  • Array task 0 processes subjects 0–199
  • Array task 1 processes subjects 200–399
  • etc.

The submit script calculates the required number of array tasks automatically.
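
The slicing arithmetic can be checked outside SLURM with plain bash (subject IDs below are made up):

```shell
# Toy run: 7 subjects, batch size 3 -> ceil(7/3) = 3 array tasks
ALL_SUBJECTS=(001 002 003 004 005 006 007)
BATCH_SIZE=3

# Ceiling division, as in the submit scripts
num_batches=$(( (${#ALL_SUBJECTS[@]} + BATCH_SIZE - 1) / BATCH_SIZE ))
echo "batches=$num_batches"

# Each array task takes one contiguous slice; the last slice may be shorter
for task_id in $(seq 0 $(( num_batches - 1 ))); do
    start=$(( task_id * BATCH_SIZE ))
    subset=( "${ALL_SUBJECTS[@]:$start:$BATCH_SIZE}" )
    echo "task $task_id: ${subset[*]}"
done
```

Task 2 ends up with a single subject (007), which is exactly the partial final batch the submit script's ceiling division accounts for.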

Filtering masks

Use --pattern to select specific masks within each subject directory:

--pattern "*label-acuteinfarct*"

This is useful when subjects have multiple lesion masks (e.g., acute vs. chronic infarct).
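
Assuming --pattern uses shell-style glob matching against mask filenames (check the Lacuna documentation for the exact semantics), the selection behaves like this bash sketch (filenames are made up):

```shell
# Illustration only: glob matching against hypothetical mask filenames
pattern="*label-acuteinfarct*"
masks=(
    "sub-001_label-acuteinfarct_mask.nii.gz"
    "sub-001_label-chronicinfarct_mask.nii.gz"
)
selected=()
for f in "${masks[@]}"; do
    # an unquoted $pattern on the right-hand side enables glob matching
    if [[ "$f" == $pattern ]]; then
        selected+=("$f")
    fi
done
printf 'selected: %s\n' "${selected[@]}"
```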

Caching

Each array task gets its own cache directory to prevent write conflicts between parallel jobs. The cache is cleaned up after each job completes.

For SNM, the tractogram is copied to node-local storage (under /tmp, via LACUNA_CACHE_DIR) before processing. This avoids repeated reads from the shared filesystem and significantly improves I/O performance.
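
The stage-in step can be factored into a small helper that skips the copy when the file is already on the node, so re-runs pay the cost only once. This is a sketch, not part of the shipped scripts, and the paths are illustrative:

```shell
# Hypothetical helper: copy a large file to node-local storage once
stage_in() {
    local src=$1 dest_dir=$2
    local dest="$dest_dir/$(basename "$src")"
    mkdir -p "$dest_dir"
    # copy only if not already staged on this node
    [ -e "$dest" ] || cp "$src" "$dest"
    echo "$dest"
}

# Demo with a dummy file standing in for the tractogram
printf 'demo' > /tmp/demo_tractogram.tck
staged=$(stage_in /tmp/demo_tractogram.tck /tmp/demo_cache)
echo "staged at: $staged"
```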

Monitoring jobs

# Check job status
squeue -u $USER

# View logs for a specific array task
cat logs/lacuna_<job_id>_<array_id>.out

# Cancel all jobs in an array
scancel <job_id>
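
Individual array tasks can fail while the rest of the array succeeds; a cheap first check is scanning for non-empty .err logs. This is only a heuristic (some clusters write harmless warnings to stderr). The demo below writes its own sample logs so it runs anywhere:

```shell
# Demo log directory with one clean and one failing task
mkdir -p /tmp/demo_logs
: > /tmp/demo_logs/lacuna_12345_0.err                 # empty: likely fine
echo "Traceback" > /tmp/demo_logs/lacuna_12345_1.err  # non-empty: inspect

failed=0
for err in /tmp/demo_logs/lacuna_*.err; do
    if [ -s "$err" ]; then    # -s: file exists and is non-empty
        echo "check: $err"
        failed=$(( failed + 1 ))
    fi
done
echo "non-empty error logs: $failed"
```

Point the loop at your real logs/ directory to triage a finished array.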

Collecting results

After all jobs complete, aggregate results into group-level tables:

apptainer run \
    --bind /path/to/output:/output \
    lacuna.sif \
    collect /output --output-dir /output/group