simulator Function
Simulate stochastic gene expression models using the Gillespie algorithm.
Syntax
simulator(rin, transitions, G, R, S, insertstep; kwargs...)Arguments
Required Arguments
rin::Vector{Float64}: Initial transition ratestransitions::Tuple: Tuple of vectors specifying state transitionsG::Int: Number of gene statesR::Int: Number of pre-RNA stepsS::Int: Number of splice sites (must be ≤ R - insertstep + 1)insertstep::Int: Step where reporter becomes visible
Optional Keyword Arguments
Simulation Parameters
warmupsteps::Int = 0: Number of warmup steps before recordingnalleles::Int = 1: Number of allelesnhist::Int = 20: Number of histogram binsbins::Vector{Float64} = Float64[]: Custom histogram binstraceinterval::Float64 = 0.0: Time interval for trace recording (0 = no traces)
Model Configuration
coupling::Tuple = tuple(): Coupling parameters for multi-unit modelsonstates::Vector{Int} = Int[]: States where transcription is activesplicetype::String = "": Splicing configuration ("", "offeject", "offdecay")
Observation Model
probfn::Function = prob_Gaussian: Probability function for observationsnoise::Vector{Float64} = Float64[]: Noise parametersreportersteps::Vector{Int} = Int[]: Steps where reporter is visible
Output Control
tspan::Tuple{Float64, Float64} = (0., 1000.): Time span for simulationntrials::Int = 1: Number of simulation trialsresultfolder::String = "": Output folder for results
Returns
histogram::Vector{Float64}: Steady-state RNA count distributiontraces::Vector{Vector{Float64}}: Intensity traces (iftraceinterval > 0)dtimes::Vector{Float64}: Dwell times (if applicable)
Examples
Basic Two-State Model
using StochasticGene
# Simple two-state telegraph model
rates = [0.1, 0.2] # G1->G2, G2->G1
transitions = ([1,2], [2,1])
G, R, S = 2, 0, 0
insertstep = 1
# Simulate RNA histogram
histogram = simulator(
rates, transitions, G, R, S, insertstep,
nhist = 50,
ntrials = 1000
)GRS Model with Traces
# Gene-Reporter-Splice model
rates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
transitions = ([1,2], [2,1])
G, R, S = 2, 3, 2
insertstep = 1
# Simulate with intensity traces
histogram, traces = simulator(
rates, transitions, G, R, S, insertstep,
traceinterval = 1.0, # 1 minute intervals
tspan = (0., 500.), # 500 minute simulation
noise = [10.0, 5.0], # Gaussian noise parameters
probfn = prob_Gaussian
)Coupled Model
# Two coupled transcriptional units
rates1 = [0.1, 0.2, 0.3]
rates2 = [0.15, 0.25, 0.35]
transitions1 = ([1,2], [2,1])
transitions2 = ([1,2], [2,1])
# Coupling: units share some transition rates
coupling = (2, 2, [1,2], [3,4], 2) # Coupling specification
histogram = simulator(
[rates1; rates2],
(transitions1, transitions2),
(2, 2), # G states for each unit
(1, 1), # R steps for each unit
(0, 0), # S sites for each unit
(1, 1), # Insert steps
coupling = coupling,
nalleles = 2
)Hierarchical Model
# Simulate data for hierarchical fitting
rates = [0.1, 0.2, 0.3]
transitions = ([1,2], [2,1])
G, R, S = 2, 1, 0
insertstep = 1
# Generate multiple traces for hierarchical analysis
ntraces = 10
all_traces = Vector{Vector{Float64}}()
for i in 1:ntraces
# Add noise to rates for each trace
noisy_rates = rates .* (1 .+ 0.1 * randn(length(rates)))
histogram, traces = simulator(
noisy_rates, transitions, G, R, S, insertstep,
traceinterval = 0.5,
tspan = (0., 200.),
noise = [20.0, 10.0]
)
append!(all_traces, traces)
endCustom Observation Model
# Define custom observation function
function prob_Poisson(y, μ, σ)
return pdf(Poisson(μ), round(Int, y))
end
# Simulate with Poisson observation noise
histogram = simulator(
rates, transitions, G, R, S, insertstep,
probfn = prob_Poisson,
noise = [15.0], # Poisson rate parameter
nhist = 30
)Rate Ordering
The rate vector rin must follow this specific ordering:
- G transitions: Rates between gene states
- R transitions: Rates between pre-RNA steps
- S transitions: Splicing rates
- Decay rates: mRNA decay rates
- Noise parameters: Observation noise parameters
Example Rate Ordering
For a model with G=2, R=3, S=2:
rates = [
# G transitions
0.1, # G1 -> G2
0.2, # G2 -> G1
# R transitions
0.3, # R1 -> R2
0.4, # R2 -> R3
0.5, # R3 -> eject
# S transitions
0.6, # Splice site 1
0.7, # Splice site 2
# Decay
0.05 # mRNA decay
]Performance Notes
- Memory Usage: Large
nhistvalues require more memory - Simulation Time: Longer
tspanand smaller rates increase runtime - Trace Recording:
traceinterval > 0significantly increases memory usage - Parallel Processing: Use multiple calls for embarrassingly parallel simulations
Common Use Cases
Parameter Estimation Validation
# Generate synthetic data for testing parameter estimation
true_rates = [0.1, 0.2, 0.3]
synthetic_data = simulator(
true_rates, transitions, G, R, S, insertstep,
nhist = 100,
ntrials = 5000
)
# Use synthetic_data to validate fitting algorithmsModel Comparison
# Compare different model structures
models = [
(2, 0, 0), # Simple telegraph
(2, 1, 0), # With pre-RNA
(3, 1, 0), # Three gene states
]
for (G, R, S) in models
histogram = simulator(rates, transitions, G, R, S, insertstep)
# Analyze and compare histograms
endBurst Analysis
# Simulate for burst size analysis
histogram, traces = simulator(
rates, transitions, G, R, S, insertstep,
traceinterval = 0.1, # High temporal resolution
tspan = (0., 1000.),
onstates = [2] # State 2 is transcriptionally active
)
# Analyze burst properties from tracesError Handling
The function includes validation for:
- Rate vector length consistency
- Valid state transitions
- Proper model dimensions (G, R, S relationships)
- Positive rate values
- Valid time spans
See Also
simulate_trace: Generate traces onlysimulate_trials: Multiple simulation runsfit: Fit models to dataprob_Gaussian: Gaussian observation model