Skip to main content

The System of Record

The Data Layer is a versioned, relational database that serves as the immutable system of record for all modeling activities. Every input, execution, and output is permanently archived with full audit trails.
Core Principle: Nothing is ever overwritten. All data is append-only, creating perfect reproducibility and temporal analysis capabilities.

Database Architecture

Core Tables

Input Tables

Immutable records of all versioned scenario configurations.
CREATE TABLE scenarios (
    id VARCHAR PRIMARY KEY,
    created_at TIMESTAMP,
    created_by VARCHAR,
    name VARCHAR,
    description TEXT,
    plan_rules_id VARCHAR,
    operating_assumptions_id VARCHAR,
    version INTEGER,
    parent_scenario_id VARCHAR  -- For scenario variants
);
Key Features:
  • Every scenario is versioned
  • Scenarios can fork from parent scenarios
  • Immutable once created

Processing Tables

Execution metadata connecting inputs to outputs.
CREATE TABLE simulation_runs (
    id VARCHAR PRIMARY KEY,
    scenario_id VARCHAR,
    census_id VARCHAR,
    system_config JSONB,
    
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    status VARCHAR,  -- 'running', 'completed', 'failed'
    
    projection_start_year INTEGER,
    projection_end_year INTEGER,
    
    error_message TEXT,
    execution_metrics JSONB  -- performance stats
);
The Logbook: This table is the permanent record connecting specific inputs to specific outputs.

Output Tables

Year-by-year company financial snapshots.
CREATE TABLE annual_company_states (
    simulation_run_id VARCHAR,
    year INTEGER,
    source_type VARCHAR,  -- 'actual' or 'simulated'
    
    revenue DECIMAL,
    ebitda DECIMAL,
    total_payroll DECIMAL,
    esop_contribution DECIMAL,
    
    financial_metrics JSONB,
    
    PRIMARY KEY (simulation_run_id, year)
);
Source Type: Distinguishes historical actuals from future projections.

Versioning Strategy

Scenario Versioning

Scenarios can be versioned to track changes over time:
# Original scenario (June 2024)
scenario_v1 = Scenario(
    id="acme_base",
    version=1,
    contribution_amount=500_000
)

# Updated scenario (September 2024)
scenario_v2 = Scenario(
    id="acme_base",
    version=2,
    parent_scenario_id="acme_base_v1",
    contribution_amount=450_000,  # Reduced
    change_log="Reduced contribution due to market conditions"
)

Source Type Classification

All data is tagged with source_type:
  • 'actual': Historical, verified data
  • 'simulated': Forecasted data from model runs
This enables powerful queries:
-- Compare actual vs. projected for year 2024
SELECT 
    actual.repurchase_amount as actual_repurchases,
    simulated.repurchase_amount as projected_repurchases,
    (actual.repurchase_amount - simulated.repurchase_amount) as variance
FROM annual_trust_states actual
JOIN annual_trust_states simulated
    ON actual.year = simulated.year
WHERE actual.source_type = 'actual'
    AND simulated.source_type = 'simulated'
    AND actual.year = 2024;

Temporal Analysis Capabilities

The immutable architecture enables sophisticated temporal queries:
Track how your forecasts changed over time:
-- How did our 2028 repurchase forecast evolve?
SELECT 
    sr.completed_at as forecast_date,
    ats.repurchase_amount as projected_2028_repurchases
FROM simulation_runs sr
JOIN annual_trust_states ats 
    ON sr.id = ats.simulation_run_id
WHERE ats.year = 2028
    AND ats.source_type = 'simulated'
ORDER BY sr.completed_at;
Identify which assumption changes drove result differences:
-- What changed between runs?
SELECT 
    jsonb_diff(
        old_run.operating_assumptions,
        new_run.operating_assumptions
    ) as assumption_changes
FROM scenarios old_run, scenarios new_run
WHERE old_run.id = 'acme_v1'
    AND new_run.id = 'acme_v2';
Compare forecasts to actuals to measure model accuracy:
-- How accurate was our 2024 forecast made in 2023?
SELECT 
    projected.repurchase_amount as forecast,
    actual.repurchase_amount as actual,
    ABS(projected.repurchase_amount - actual.repurchase_amount) / 
        actual.repurchase_amount as error_percentage
FROM annual_trust_states projected
JOIN annual_trust_states actual
    ON projected.year = actual.year
WHERE projected.source_type = 'simulated'
    AND projected.simulation_run_id = 'run_2023_forecast'
    AND actual.source_type = 'actual'
    AND actual.year = 2024;

Data Integrity

Constraints & Validation

-- Ensure share conservation
ALTER TABLE annual_trust_states
    ADD CONSTRAINT shares_balance_check
    CHECK (
        total_shares_outstanding = 
        allocated_shares + suspense_shares + unallocated_shares
    );

-- Prevent negative cash
ALTER TABLE annual_trust_states
    ADD CONSTRAINT cash_positive_check
    CHECK (
        (cash_ledger->>'total_cash')::DECIMAL >= 0
    );

Referential Integrity

-- All simulation runs must reference valid scenarios
ALTER TABLE simulation_runs
    ADD FOREIGN KEY (scenario_id) 
    REFERENCES scenarios(id);

-- All participant snapshots must reference valid runs
ALTER TABLE annual_participant_snapshots
    ADD FOREIGN KEY (simulation_run_id)
    REFERENCES simulation_runs(id)
    ON DELETE CASCADE;

Query Patterns

Common Queries

SELECT * FROM simulation_runs
WHERE scenario_id = 'acme_base'
ORDER BY completed_at DESC
LIMIT 1;

Backup & Recovery

Point-in-Time Recovery

Restore database to any point in history

Snapshot Exports

Export specific runs for offline archival

Incremental Backups

Daily backups with 90-day retention

Disaster Recovery

Multi-region replication for business continuity

Performance Considerations

-- Fast simulation run lookups
CREATE INDEX idx_runs_scenario ON simulation_runs(scenario_id, completed_at);

-- Fast year-based queries
CREATE INDEX idx_trust_year ON annual_trust_states(simulation_run_id, year);

-- Fast participant lookups
CREATE INDEX idx_participant ON annual_participant_snapshots(participant_id, year);
Large tables partitioned by simulation_run_id for faster queries and easier archival.
Pre-computed aggregations for common dashboard queries (e.g., total repurchase obligations by year).

Next Steps