Config Schema

Purpose

This page documents the authored YAML contract used by:

  • scripts/dsambayes.R
  • DSAMbayes::run_from_yaml()
  • runme.R

The only supported authored schema is schema_version: 2. Older formula-driven YAML files are intentionally rejected.

Processing order

The runner processes configs in this order:

  1. Parse YAML.
  2. Coerce YAML infinity tokens (.Inf, -.Inf).
  3. Apply v2 defaults.
  4. Resolve relative paths against the config file directory.
  5. Validate the authored v2 contract.
  6. Compile the authored config into the internal runner config.
  7. Apply managed holiday terms, then build the model and run.
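
As a hedged illustration of step 4, suppose the fragment below is saved as config/blm_timeseries.yaml (one of the example paths listed on this page; its real contents are not reproduced here, so this pairing is an assumption). The relative data.path is then interpreted from the config/ directory rather than from the working directory:

```yaml
# Assumed location of this file: config/blm_timeseries.yaml
data:
  # Resolved against the config file directory (config/), so this points at
  # data/timeseries/demo_data_synthetic.csv under the repository root.
  path: ../data/timeseries/demo_data_synthetic.csv
  format: csv
  date_var: date
```

Because resolution happens before validation (step 5), the "file must exist" rule for data.path is checked against the resolved location.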

Root sections

Key             Required     Purpose
schema_version  yes          Must be 2.
data            yes          Input data path, format, and date handling.
target          yes          Outcome column, KPI type, and response transform.
media           yes          Modeled media terms.
controls        yes          Non-media predictors, including manual trend/seasonality terms.
effects         no           Managed effects. In M1 this is holidays only.
model           yes          Model class and scaling options.
hierarchy       conditional  Required for model.type: re and model.type: cre.
pooling         conditional  Required for model.type: pooled.
priors          no           Default priors plus grouped or explicit overrides.
boundaries      no           Grouped or explicit parameter boundaries.
fit             no           MCMC or optimise settings.
diagnostics     no           Diagnostics, model selection, and time-series selection settings.
allocation      no           Budget optimisation settings.
outputs         no           Output paths and artifact toggles.
forecast        no           Reserved forecast toggle.

Unknown keys fail validation.

Minimal valid config

schema_version: 2

data:
  path: ../data/timeseries/demo_data_synthetic.csv
  format: csv
  date_var: date

target:
  column: revenue
  type: revenue
  transform: identity

media:
  - channel0_signal
  - channel1_signal

controls:
  - t_scaled
  - sin52_1
  - cos52_1

model:
  type: blm

Key differences from the retired schema

  • model.formula is no longer authored directly.
  • schema_version: 1 configs are rejected.
  • Trend and seasonality stay user-authored as ordinary columns under controls.
  • Managed time effects are limited to holidays under effects.holidays.
  • re and cre models use hierarchy, not cre.enabled flags.
  • pooled models use pooling, not pooling.enabled.

Section reference

schema_version

Key             Type     Rules
schema_version  integer  Must be 2.

data

Key                     Type            Rules
data.path               string          Required. File must exist. Relative paths resolve from the config directory.
data.format             string          csv, rds, or long.
data.date_var           string          Required in M1.
data.date_format        string or null  Optional parser format for date columns.
data.na_action          string          omit or error.
data.long_id_col        string or null  Required when data.format: long.
data.long_variable_col  string or null  Required when data.format: long.
data.long_value_col     string or null  Required when data.format: long.
data.dictionary_path    string or null  Optional metadata CSV.
data.dictionary         mapping         Optional inline metadata keyed by term name.
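
A minimal sketch of a long-format data block, using only the keys listed above. The file name and the three column names are hypothetical, and the on-disk file type accepted for data.format: long is not stated on this page, so the .csv extension is an assumption:

```yaml
data:
  path: ../data/panel_long.csv     # hypothetical file
  format: long
  date_var: date
  na_action: error
  # All three long_* keys are required when data.format is long.
  long_id_col: geo                 # hypothetical column name
  long_variable_col: variable      # hypothetical column name
  long_value_col: value            # hypothetical column name
```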

target

Key                   Type            Rules
target.column         string          Required response column.
target.type           string          revenue or subscriptions.
target.transform      string          identity or log.
target.offset_column  string or null  Supported only for model.type: blm in M1.
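
For illustration, a target block that combines a log transform with an offset might look like the sketch below. Both column names are hypothetical, and the model block is included only because the offset is limited to model.type: blm in M1:

```yaml
target:
  column: signups                  # hypothetical response column
  type: subscriptions
  transform: log
  offset_column: log_population    # hypothetical offset column

model:
  type: blm                        # offsets are supported only for blm in M1
```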

media and controls

  • media is a required list of modeled media terms.
  • controls is a required list, but it may be empty ([]).
  • A term may not appear in both lists.
  • Manual trend and seasonality terms belong in controls.

Compiled formula order is:

  1. generated holiday terms
  2. controls
  3. media
  4. generated CRE mean terms
  5. optional offset
  6. hierarchical random-effects term

effects.holidays

Managed holidays are optional and are the only managed effect in M1.

effects:
  holidays:
    enabled: true
    path: ../data/holidays.csv
    label_col: holiday
    country: gb
    country_col: country
    week_start: monday
    prefix: holiday_

Key                                  Type            Rules
effects.holidays.enabled             boolean         Enables holiday feature generation.
effects.holidays.path                string          Required when enabled. CSV or RDS.
effects.holidays.date_col            string or null  Optional calendar date column override.
effects.holidays.label_col           string          Holiday label column.
effects.holidays.country             string or null  Optional single-country filter.
effects.holidays.country_col         string          Calendar column used with country.
effects.holidays.date_format         string or null  Optional parser format for non-ISO dates.
effects.holidays.week_start          string          monday through sunday.
effects.holidays.timezone            string          Timezone used in parsing/alignment.
effects.holidays.prefix              string          Prefix for generated holiday columns.
effects.holidays.window_before       integer         Non-negative.
effects.holidays.window_after        integer         Non-negative.
effects.holidays.aggregation_rule    string          count or any.
effects.holidays.overlap_policy      string          count_all or dedupe_label_date.
effects.holidays.overwrite_existing  boolean         Replaces existing columns only when true.

Notes:

  • The data date column must be aligned to the configured weekly anchor.
  • Country filtering materializes a filtered calendar artifact before the compiled config is written.
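
The windowing and overlap keys can be combined as in the sketch below. The values are illustrative choices from the allowed sets in the table, not recommended settings, and the unit of the window_* integers is not stated on this page:

```yaml
effects:
  holidays:
    enabled: true
    path: ../data/holidays.csv
    label_col: holiday
    week_start: monday
    prefix: holiday_
    window_before: 1               # must be a non-negative integer
    window_after: 0                # must be a non-negative integer
    aggregation_rule: any          # alternative: count
    overlap_policy: dedupe_label_date   # alternative: count_all
    overwrite_existing: false      # existing columns are replaced only when true
```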

model

Key                    Type     Rules
model.name             string   Defaults to the config filename stem.
model.type             string   blm, re, cre, or pooled.
model.scale            boolean  Controls internal scaling before fit.
model.force_recompile  boolean  Forces Stan recompilation when true.
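
A fully spelled-out model block might look like the following sketch. The name is hypothetical, and the boolean values are illustrative rather than documented defaults:

```yaml
model:
  name: blm_weekly_demo      # hypothetical; omit to use the config filename stem
  type: blm
  scale: true                # illustrative value, not a stated default
  force_recompile: false     # set to true to force Stan recompilation
```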

hierarchy

Required for model.type: re and model.type: cre.

Key                         Type             Rules
hierarchy.group             string           Grouping column for panel models.
hierarchy.random_intercept  boolean          Include a random-intercept term of the form `(1 | group)`.
hierarchy.random_slopes     list of strings  Optional subset of authored media and controls.
hierarchy.cre_variables     list of strings  Required and non-empty for model.type: cre.
hierarchy.cre_prefix        string           Prefix for generated CRE mean terms. Default cre_mean_.
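
As a sketch of how these keys fit together for a CRE model, the fragment below reuses the term names from the minimal config. The geo grouping column is hypothetical, and drawing cre_variables from the authored media terms is an assumption. The fragment omits the data and target sections, so it is not a complete config:

```yaml
model:
  type: cre

media:
  - channel0_signal
  - channel1_signal

controls:
  - t_scaled

hierarchy:
  group: geo                 # hypothetical grouping column
  random_intercept: true
  random_slopes:             # optional subset of authored media and controls
    - channel0_signal
  cre_variables:             # required and non-empty for model.type: cre
    - channel0_signal
    - channel1_signal
  cre_prefix: cre_mean_      # the documented default, written out explicitly
```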

pooling

Required for model.type: pooled.

Key                    Type             Rules
pooling.grouping_vars  list of strings  Required and non-empty.
pooling.map_path       string           Required. CSV or RDS.
pooling.map_format     string           csv or rds.
pooling.min_waves      integer or null  Optional positive integer.
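
A pooled run could be declared roughly as follows. The grouping variable and map file are hypothetical names, and the fragment leaves out the other required root sections:

```yaml
model:
  type: pooled

pooling:
  grouping_vars:
    - brand                          # hypothetical grouping variable
  map_path: ../data/pooling_map.csv  # hypothetical map file
  map_format: csv
  min_waves: 2                       # optional; must be a positive integer

fit:
  method: mcmc                       # pooled runs require mcmc
```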

priors

Key                  Type     Rules
priors.use_defaults  boolean  Must remain true in M1.
priors.likelihood    mapping  Optional Abacus-style alias for noise_sd.
priors.overrides     list     Explicit parameter-level overrides.

Grouped families are available when applicable:

  • intercept
  • media_beta
  • control_beta
  • holiday_beta
  • cre_beta
  • pooling_beta
  • random_effect_sd
  • noise_sd

Each grouped family accepts either the legacy DSAMbayes style:

family: normal    # or lognormal_ms where supported
mean: 0
sd: 0.5

or the more explicit Abacus-style alias:

distribution: Normal   # or HalfNormal / LogNormalMS where supported
mu: 0
sigma: 0.5

HalfNormal compiles to a zero-centered Normal prior plus an implied lower bound of 0 on each targeted parameter that is not already constrained. Parameters that are positive by construction, such as noise_sd and the hierarchical sd_*[...] terms, do not receive an extra boundary row.

The residual-noise prior also accepts this alias:

priors:
  likelihood:
    sigma:
      distribution: HalfNormal
      sigma: 2
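
The two spellings can be mixed across grouped families, as in the sketch below. It assumes that grouped families are authored as keys directly under priors and that HalfNormal is supported for media_beta; neither point is confirmed on this page:

```yaml
priors:
  use_defaults: true           # must remain true in M1
  media_beta:                  # Abacus-style alias (assumed supported for this family)
    distribution: HalfNormal
    sigma: 0.5
  control_beta:                # legacy DSAMbayes style
    family: normal
    mean: 0
    sd: 0.5
```

Under the HalfNormal rule described above, the media_beta entry would also imply a lower bound of 0 on the targeted media coefficients.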

boundaries

Boundary families mirror the grouped prior families and may also use explicit boundaries.overrides.

Each grouped or explicit boundary row uses:

lower: -Inf
upper: Inf
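
For example, a grouped boundary that keeps media coefficients non-negative might be sketched as follows. It assumes grouped boundary families sit directly under boundaries, mirroring the prior families, and it writes infinity with the .Inf token named in the processing order:

```yaml
boundaries:
  media_beta:
    lower: 0
    upper: .Inf      # YAML infinity token, coerced in step 2 of the processing order
```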

fit

Key                                        Type             Rules
fit.method                                 string           mcmc or optimise. Pooled runs require mcmc.
fit.seed                                   numeric or null  Optional scalar seed.
fit.optimise.*                             mapping          Optimisation controls.
fit.mcmc.*                                 mapping          Stan sampling controls.
fit.mcmc.parameterization.positive_priors  string           centered or noncentered.
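
A minimal fit block using only the keys named in the table could look like this sketch. The seed value is arbitrary, and the remaining fit.mcmc.* sampling controls are omitted because their names are not listed on this page:

```yaml
fit:
  method: mcmc
  seed: 42                     # arbitrary illustrative seed
  mcmc:
    parameterization:
      positive_priors: noncentered   # alternative: centered
```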

diagnostics

This section retains the current runner surface for:

  • model_selection
  • time_series_selection
  • identifiability
  • publish-gate controls

Important M1 rule:

  • diagnostics.time_series_selection.enabled: true is not supported for pooled runs.

allocation

This section retains the current runner surface for budget optimisation, with channel targeting based on authored media terms.

outputs

outputs.root_dir and outputs.run_dir behave as before, but the metadata contract now includes:

  • config.original.yaml
  • config.resolved.yaml
  • config.compiled.yaml

forecast

Reserved toggle. No authored schema changes in M1.

Examples in this repository

  • config/blm_timeseries.yaml — weekly time-series BLM example
  • config/cre_geo_panel.yaml — weekly geo-panel CRE example