Hierarchical Trace Analysis
This example demonstrates how to analyze trace data using hierarchical models to account for cell-to-cell variability.
Setup
First, let's set up our project directory and load the package:
using StochasticGene
# Create project directory
mkdir("hierarchical_example")
cd("hierarchical_example")
# Generate example data using test_fit_trace_hierarchical
fitted_rates, target_rates = test_fit_trace_hierarchical(
traceinfo=Dict("cell" => "example", "gene" => "test"), # Trace metadata
G=2, # 2 gene states
R=2, # 2 RNA states
S=2, # 2 splicing states
transitions=([1, 2], [2, 1]), # Simple two-state model
rtarget=[0.33, 0.19, 2.5, 1.0], # Target rates for simulation
totaltime=1000, # Total simulation time
ntrials=10, # Number of simulation trials
fittedparam=[1, 2, 3], # Parameters to fit
hierarchical=true, # Use hierarchical model
nchains=1 # Single MCMC chain for example
)
# Print results
println("Fitted rates: ", fitted_rates)
println("Target rates: ", target_rates)
Data Preparation
Place your trace data in the data/
directory. The data should be organized as follows:
data/
├── gene_name/
│ ├── condition1/
│ │ ├── cell1.csv
│ │ ├── cell2.csv
│ │ └── metadata.csv
│ └── condition2/
│ ├── cell1.csv
│ ├── cell2.csv
│ └── metadata.csv
Each cell CSV file should contain:
- Time points
- Fluorescence intensity values
- Optional background values
Model Definition
We'll fit a hierarchical model with:
- 2 gene states (G=2) - ON and OFF
- 1 pre-RNA step (R=1)
- Simple transitions between states
- Cell-specific parameters
# Define model parameters
G = 2 # Number of gene states (ON and OFF)
R = 1 # Number of pre-RNA steps
# Define state transitions
# Format: (from_states, to_states)
transitions = (
[1, 2], # From states
[2, 1] # To states
)
# Define transcription rates for each state
transcription_rates = [0.0, 1.0] # No transcription in OFF state
# Define hierarchical structure
hierarchical = true
Fitting the Model
Now we can fit the hierarchical model to our trace data:
# Fit the model
fits, stats, measures, data, model, options = fit(
G = G,
R = R,
transitions = transitions,
transcription_rates = transcription_rates,
datatype = "trace",
datapath = "data/",
gene = "MYC",
datacond = "CONTROL",
hierarchical = true
)
Analyzing Results
Basic Analysis
# Print basic statistics
println(stats)
# Plot the results
using Plots
plot(fits)
# Save results
save_results(fits, "results/")
Cell-Specific Analysis
# Analyze cell-specific parameters
cell_params = analyze_cell_parameters(fits)
plot_cell_parameters(cell_params, "results/cell_parameters/")
# Plot individual cell traces
plot_cell_traces(fits, "results/cell_traces/")
Population Analysis
# Analyze population statistics
pop_stats = analyze_population(fits)
plot_population_stats(pop_stats, "results/population/")
# Calculate population averages
pop_avg = calculate_population_average(fits)
plot_population_average(pop_avg, "results/population_average/")
Advanced Analysis
Variability Analysis
# Analyze cell-to-cell variability
variability = analyze_variability(fits)
plot_variability(variability, "results/variability/")
# Calculate correlation structure
correlations = calculate_correlations(fits)
plot_correlations(correlations, "results/correlations/")
Model Comparison
# Compare hierarchical and non-hierarchical models
models = [
(hierarchical=true, name="Hierarchical"),
(hierarchical=false, name="Non-hierarchical")
]
model_fits = []
for (hier, name) in models
fits, stats, measures, data, model, options = fit(
G = G,
R = R,
transitions = transitions,
transcription_rates = transcription_rates,
datatype = "trace",
datapath = "data/",
gene = "MYC",
datacond = "CONTROL",
hierarchical = hier
)
push!(model_fits, (fits, stats, name))
end
# Compare models
compare_models(model_fits, "results/model_comparison/")
Best Practices
Data Quality
- Check for photobleaching
- Verify signal-to-noise ratio
- Consider background correction
Model Selection
- Start with simple models
- Use model selection criteria
- Validate assumptions
Analysis Pipeline
- Document preprocessing steps
- Save intermediate results
- Version control your analysis
Common Issues and Solutions
Parameter Identifiability
# Check parameter identifiability
identifiability = check_identifiability(fits)
plot_identifiability(identifiability, "results/identifiability/")
Convergence
# Check model convergence
convergence = check_convergence(fits)
plot_convergence(convergence, "results/convergence/")
# Analyze parameter correlations
correlations = analyze_parameter_correlations(fits)
plot_parameter_correlations(correlations, "results/parameter_correlations/")
Next Steps
- Try different hierarchical structures
- Experiment with parameter priors
- Compare results across different genes
For more advanced examples, see: