Cluster and batch workflows

This chapter is for everyone who installs StochasticGene.jl (Pkg.add("StochasticGene")) and needs to:

run many fit jobs on a cluster (including NIH Biowulf) using swarm files and makeswarm;
follow the recommended coupled-model workflow: fit individual units first, then merge those fitted rates into one initial rate file for the coupled model.

The implementations live in biowulf.jl (swarms, run-spec presets) and io.jl (merging rate tables). Function signatures and defaults are in the docstrings; this page is the narrative guide published with the GitHub-hosted documentation.

Coupled models: recommended workflow (single units → merge → coupled fit)

For coupled transcriptional units (e.g. enhancer + gene, with or without a hidden unit), the suggested workflow is:

1. Fit each unit as its own single-unit model. Run separate fits (e.g. on enhancer-only and gene-only traces or histograms) so that each produces a standard MCMC rates_*.txt (rows = posterior samples, columns = rate headers for that unit). Use normal fit calls, or batch them with makeswarm_models / makeswarmfiles in single-unit mode.
2. Merge the fitted rates into one wide table. Stack the columns from the two (or more) unit files and append coupling placeholder columns using create_combined_file (two units) or create_combined_file_mult (more than two). Choose Nenh / Ngene (or per-unit column counts) to match how each set of rates is laid out in your files (see the docstrings). For many keys (e.g. from a CSV of model names), use create_combined_files or create_combined_files_driver, which call create_combined_file once per key and name outputs with combined_rates_key.
3. Run the coupled fit using the combined file as the starting rates. Point infolder / inlabel (or your run spec) at the merged file so the coupled MCMC warm-starts from the stacked single-unit posteriors. The coupled fit takes tuple-valued G and R, a coupling specification, a joint datatype (e.g. tracejoint), and so on. Coupling strengths are then estimated in the coupled run (the placeholder columns from step 2 get updated).
4. Optional: batch everything on the cluster. Use makeswarm or makeswarmfiles so each job runs fit(; key=..., ...) from prewritten info_<key> specs (see Run specification (info TOML)).

This order (individual fits → merge → coupled fit) is the standard way to obtain a sensible initial combined rate file for coupled models, rather than fitting all parameters from a cold start.
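
The merge in step 2 amounts to a column-wise concatenation plus placeholder columns. The sketch below illustrates the idea with plain matrices. It is not the package's create_combined_file: the helper name stack_rate_tables, the header labels, the comma delimiter, and the truncation to the shortest table are all assumptions made for illustration. For real runs, use create_combined_file and follow its docstring.

```julia
using DelimitedFiles

# Hypothetical helper for illustration only; NOT part of StochasticGene.jl.
# Stacks per-unit rate tables side by side and appends placeholder coupling columns.
function stack_rate_tables(tables, headers; ncoupling::Int=1, placeholder::Float64=0.0)
    # Sample counts may differ between units; keep the common number of rows (assumption).
    nrows = minimum(size(t, 1) for t in tables)
    body = reduce(hcat, [t[1:nrows, :] for t in tables])
    coupling = fill(placeholder, nrows, ncoupling)
    header = vcat(headers..., ["Coupling_$(i)" for i in 1:ncoupling])
    return header, hcat(body, coupling)
end

enh_rates  = [0.1 0.2; 0.3 0.4; 0.5 0.6]   # 3 samples x 2 rates (made-up values)
gene_rates = [1.0 2.0 3.0; 4.0 5.0 6.0]    # 2 samples x 3 rates (made-up values)

header, combined = stack_rate_tables(
    [enh_rates, gene_rates],
    [["Rate_e1", "Rate_e2"], ["Rate_g1", "Rate_g2", "Rate_g3"]],
)

# Write a header row followed by the data as comma-delimited text (format is an assumption).
outfile = joinpath(mktempdir(), "rates_combined_demo.txt")
writedlm(outfile, vcat(permutedims(header), combined), ',')
```

Truncating to the shortest table is only one way to reconcile unequal sample counts; check the create_combined_file docstring for how the package handles this.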

NIH Biowulf: using `makeswarm`

makeswarm does not submit jobs to the scheduler by itself. It writes files you submit with Biowulf’s swarm (or your own sbatch wrappers):

<swarmfile>.swarm — one command line per run key (each line runs julia with your project and one fit script).
fitscript_<key>.jl per key — typically calls fit(; key="<key>", ...) with shared options (resultfolder, maxtime, samplesteps, etc.).
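
To make the file layout concrete, here is a rough sketch of how a swarm file with one line per key could be assembled. The helper swarm_line and the exact command layout are assumptions for illustration. The lines makeswarm actually writes may differ, so inspect the generated .swarm file rather than relying on this.

```julia
# Hypothetical helper; the real command lines are produced by makeswarm.
function swarm_line(key::AbstractString; project::AbstractString="", nprocs::Int=1, nthreads::Int=1)
    parts = ["julia"]
    # Only pass --project when a project path was given.
    isempty(project) || push!(parts, "--project=$(project)")
    push!(parts, "-p", string(nprocs), "-t", string(nthreads), "fitscript_$(key).jl")
    return join(parts, " ")
end

keys_to_run = ["runA", "runB"]
lines = [swarm_line(k; project="/path/to/your/StochasticGene.jl", nprocs=4) for k in keys_to_run]

# One command line per key, written to a demo file in a temporary directory.
swarmfile = joinpath(mktempdir(), "fit.swarm")
write(swarmfile, join(lines, "\n") * "\n")
```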

Typical use on Biowulf

Install StochasticGene in your Julia environment (see Installation, including Biowulf Installation).
From Julia (interactive session or batch script), run something like:

using StochasticGene

makeswarm(
    ["runA", "runB"];           # keys; must match info_<key> / rates_<key> naming you use
    filedir      = "my_swarm",  # directory where .swarm and .jl files are written
    resultfolder = "my_results",
    root         = ".",
    project      = "/path/to/your/StochasticGene.jl",  # or "" if using the default environment
    nchains      = 4,
    nthreads     = 1,
    maxtime      = 72000.0,
    samplesteps  = 1_000_000,
)

Submit the swarm from the shell (example):

cd my_swarm
swarm -g 4 -t 16 -b 1 --time 24:00:00 --module julialang -f fit.swarm

Adjust -g, -t, --time, and --module to match your allocation and the Julia module name on Biowulf.

Generating keys and info_<key> in bulk

write_run_spec_preset — write info_<key>.jld2 + marker TOML for one key.
makeswarm_models — sweep single-unit G,R,S,insertstep, write presets, then call makeswarm.
makeswarmfiles — unified entry: coupled key lists (CSV, explicit base_keys, or H3 grids) or single-unit sweeps; writes presets and runs makeswarm. See its docstring for the mutually exclusive modes.
makeswarmfiles_h3_latent — convenience for H3 latent key grids.

Swarm `julia -p`, `nchains`, merged `info_<key>`, and `root`

Parallel workers: The swarm command should use -p N (or equivalent) consistent with how many chains run in parallel. For makeswarmfiles / makeswarm_models, if you do not pass an explicit swarm-only nchains= in kwargs, the generated -p is taken from each run spec’s nchains (e.g. coupled defaults often use 16), so it stays aligned with fit(; …, nchains=…). See the makeswarmfiles docstring.
Merged presets: With merge_existing_info=true (default), older info_<key>.jld2 files are merged into new specs. Legacy trace_specs sometimes used a huge t_end (historical “open end” sentinel). When saving, write_run_spec_preset runs normalize_trace_specs_legacy_t_end! so those values are rewritten to t_end = -1.0, matching current default_trace_specs_for_coupled and avoiding invalid frame indices in read_tracefiles.
root in generated fit scripts: Scripts write root exactly as it appears in the run spec (no forced abspath). Use root="." if the job’s working directory is the project root (cd there in the swarm command, or submit from the right folder). Paths resolved in an interactive Biowulf session can differ from those in batch jobs; "." avoids baking in an interactive-only absolute path.
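
The rule for the generated -p value described under "Parallel workers" can be pictured as a simple fallback. This is a sketch of the described behavior, not the package's code; resolve_workers and the NamedTuple stand-in for a run spec are hypothetical.

```julia
# Hypothetical illustration: an explicit swarm-level nchains wins;
# otherwise fall back to the nchains stored in the run spec.
resolve_workers(spec; kwargs...) = get(kwargs, :nchains, spec.nchains)

coupled_spec = (key = "runA", nchains = 16)        # stand-in for a run spec

from_spec = resolve_workers(coupled_spec)              # falls back to the spec's nchains
override  = resolve_workers(coupled_spec; nchains = 4) # explicit value takes precedence
```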

Key-based naming

Many batch helpers assume a string key per run:

results/<resultfolder>/info_<key>.toml and info_<key>.jld2
rates_<key>.txt

See Run specification (info TOML). Presets for cluster reruns are written with write_run_spec_preset.
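
A quick check before submitting is to confirm that the key-based files exist. The sketch below assumes the layout listed above, with rates_<key>.txt in the same folder as the info files. That placement is an assumption, and expected_files and missing_files are hypothetical helpers rather than package functions.

```julia
# Hypothetical helpers; adjust the folder layout to your own setup.
function expected_files(root, resultfolder, key)
    dir = joinpath(root, "results", resultfolder)
    return [joinpath(dir, "info_$(key).toml"),
            joinpath(dir, "info_$(key).jld2"),
            joinpath(dir, "rates_$(key).txt")]
end

# Return only the expected paths that are not present on disk.
missing_files(root, resultfolder, key) =
    filter(!isfile, expected_files(root, resultfolder, key))

# Demo in a temporary directory where only the TOML file exists.
root = mktempdir()
dir = joinpath(root, "results", "my_results")
mkpath(dir)
touch(joinpath(dir, "info_runA.toml"))
still_missing = missing_files(root, "my_results", "runA")
```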

Topic	Canonical place
User workflows, Biowulf, coupled merge order	This page (hosted docs)
`info_<key>` file format	runspectoml.md
README on GitHub	Short pointer + link to stable docs
Exact function signatures	Docstrings in `biowulf.jl` / `io.jl`
