scenario-runner.ts

PURPOSE

Executes deterministic in-engine test suites against the live game sandbox. Three modes: (1) AABB artifact A/B suite — baseline×2 vs artifact×2 with variance/effect/verdict reporting, (2) tier-sweep matrix — one or all artifacts × 4 tiers × 3 scenario templates with scaling-shape analysis, (3) weapon gauntlet — every weapon × 5 levels capturing DPS/kills/damage. Owns seeding, run-orchestration, metric capture, and KPI comparison; defers sim mutation to mission methods exposed on window.__mission.

OWNS

  • ScenarioRunner class — singleton exported as scenarioRunner.
  • RunnerState / MatrixState / WeaponGauntletState instance state, including _abResults, _baselineCache (Map<ScenarioType, MatrixRunMetrics>), and update-callback registrations.
  • Exported types: ABResult, WeaponGauntletResult, WeaponGauntletState.
  • Module-local seeded RNG (_mulberry32, _installSeededRandom, _restoreRandom) that swaps Math.random for the duration of each run and restores it in finally.
  • Verdict taxonomy: PROVEN | WEAK | NO_EFFECT | NOISY | BROKEN | ERROR.
  • Scaling-shape taxonomy: flat | linear | exponential | inverted.
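
The seeded-RNG ownership above can be pictured with the following minimal sketch. The function names mirror the ones listed in OWNS, but the bodies are assumptions: this is the commonly published Mulberry32 routine, and withSeededRandom is a hypothetical wrapper that illustrates the install / try / finally-restore shape rather than the runner's actual code.

```typescript
// Sketch only: bodies are assumptions, not the module's real implementation.
const TEST_SEED = 20260416;

/** Mulberry32: a small 32-bit PRNG that returns floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Keep a handle on the native implementation so it can always be put back.
const nativeRandom = Math.random;

function installSeededRandom(seed: number): void {
  Math.random = mulberry32(seed);
}

function restoreRandom(): void {
  Math.random = nativeRandom;
}

/** Hypothetical helper: runs `body` with Math.random seeded, restoring it even if `body` throws. */
async function withSeededRandom<T>(seed: number, body: () => T | Promise<T>): Promise<T> {
  installSeededRandom(seed);
  try {
    return await body();
  } finally {
    restoreRandom();
  }
}
```

Two installs with the same seed replay the same Math.random sequence, which is the property the A1/A2 and B1/B2 determinism checks rely on.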

READS FROM

  • ./scenario-types — type-only imports for ScenarioDef, TestResult, TestMetrics, RunnerState, ScenarioTemplate, ScenarioType, MatrixRunMetrics, MatrixCell, TierSweep, ArtifactMatrixResult, MatrixState.
  • ./artifact-scenarios — ARTIFACT_AB_TESTS, ARTIFACT_AB_MAP, SCENARIO_TEMPLATE_LIST, getArtifactOverrides, ArtifactABTest, ArtifactKPI.
  • ../data/artifacts — ARTIFACT_DEFS (name lookup only).
  • ../data/weapons — WEAPONS (gauntlet enumeration).
  • window.__mission (or injected _missionRef.current) — sandbox controller. Methods consumed: sandboxResetForTest(), sandboxSetSpawnRate(), sandboxGrantArtifact(), setSpawnerEnabled(), setWeapons(), setWorldKnobs(), setGodMode(), patchShipStats(), fullHeal(), spawnEnemyAt(), testSetManualPump(), testPumpFrame().
  • window.__dev — dev hooks: speed(), teleport(), autopilot(), getState(), setHeat(), spawnCrateAt(), fireSignal().
  • window.__effectTriggerCounts — effect-engine trigger map keyed by artifact:<id>; reset to {} before each run.

PUSHES TO

  • console.log — structured tagged lines prefixed [AABB_TEST]: per-test result lines plus [AABB_TEST]_RESULT (one per artifact) and [AABB_TEST]_SUMMARY (suite-wide JSON). Matrix and gauntlet modes emit their own tagged lines.
  • _onUpdate(state) — RunnerState snapshot pushed after every step transition (runSequence, runABSuite).
  • _onMatrixUpdate(state) — MatrixState snapshot pushed on each cell completion (runTierSweep, runFullMatrix).
  • _onWeaponGauntletUpdate(state) — WeaponGauntletState snapshot pushed on each weapon×level slice (runWeaponGauntlet).
  • Return values: arrays of ABResult, ArtifactMatrixResult, or WeaponGauntletResult.

DOES NOT

  • Mutate the simulation directly. All world / ship / artifact mutations go through m.* mission methods.
  • Use Math.random() during runs without first installing the seeded PRNG; _runOnce and _runCellAveraged wrap their implementations in try/finally to guarantee restoration.
  • Persist results between sessions. Caller is responsible for surfacing or storing ABResult[] / matrix results.
  • Render UI. Consumers subscribe via onUpdate / onMatrixUpdate / onWeaponGauntletUpdate and render their own progress chrome.
  • Spawn bosses (the planet-boss artifact-scenario hook was deleted with the planet-boss system).
  • Drive the ship. The runner relies on stationary-ship-at-origin + enemies-chase-in; the previous circle-patrol attempt is documented as removed in inline comments.

Signals

  • Run determinism — _installSeededRandom(TEST_SEED) swaps Math.random for a Mulberry32 PRNG (TEST_SEED = 20260416) for AABB single runs; matrix _runCellAveraged uses TEST_SEED + r per run so averaging dampens any residual non-determinism.
  • Manual pump — Matrix and weapon-gauntlet runs call m.testSetManualPump(true) before sandboxResetForTest() so no auto frames fire during setup, then drive frames synchronously with m.testPumpFrame(fakeT) and yield to the browser via _wait(0) per frame for live canvas repaint. finally { m.testSetManualPump(false) } always restores auto-pump. The legacy AABB _runOnce path is wall-clock driven via setInterval + _wait(durationSec / speedMult * 1000).
  • Effect-engine trigger count — window.__effectTriggerCounts['artifact:'+id] is the primary signal for B runs: zero triggers ⇒ BROKEN verdict regardless of KPI delta.
  • Variance / coefficient-of-variation — |run1 - run2| / avg. Above NOISY_THRESHOLD = 0.20 for either A or B pair ⇒ NOISY verdict.
  • KPI direction — dmgTaken is inverted (baselineAvg - artifactAvg); all other KPIs use artifactAvg - baselineAvg. Matrix path uses template.kpiHigherIsBetter.
  • Scaling shape — _analyzeScaling compares T0..T3 |deltaPct|: flat if both ≈0 or factor<1.1, inverted if factor<0.9, linear if T1/T2 lie near the T0→T3 line within t3 × 0.3 error, else exponential. T0≈0 with T3>0 short-circuits to exponential with factor=99.
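
The scaling-shape rule is dense enough that a worked sketch may help. The version below is one reading of it, not the actual _analyzeScaling. Several details are assumptions and are marked as such in the comments: the near-zero epsilon, factor being T3 / T0, the factor reported for the all-zero case, evenly spaced expected values for T1 and T2, and testing inverted before flat so that both branches stay reachable.

```typescript
type ScalingShape = "flat" | "linear" | "exponential" | "inverted";

interface ScalingAnalysis {
  shape: ScalingShape;
  factor: number;
}

// Assumption: |deltaPct| below this counts as "≈0"; the real epsilon is not stated.
const NEAR_ZERO = 0.01;

function analyzeScalingSketch(deltaPcts: [number, number, number, number]): ScalingAnalysis {
  const [t0, t1, t2, t3] = deltaPcts.map((v) => Math.abs(v));

  // Both ends ≈ 0: nothing scales. A factor of 1 here is an assumption.
  if (t0 < NEAR_ZERO && t3 < NEAR_ZERO) return { shape: "flat", factor: 1 };

  // T0 ≈ 0 with T3 > 0 short-circuits to exponential with factor = 99.
  if (t0 < NEAR_ZERO) return { shape: "exponential", factor: 99 };

  // Assumption: factor is the T3 / T0 ratio.
  const factor = t3 / t0;

  // Assumption: inverted is tested first, since factor < 0.9 also satisfies factor < 1.1.
  if (factor < 0.9) return { shape: "inverted", factor };
  if (factor < 1.1) return { shape: "flat", factor };

  // Linear if T1 and T2 sit within t3 × 0.3 of the straight T0 → T3 line.
  const step = (t3 - t0) / 3;
  const tolerance = t3 * 0.3;
  const nearLine =
    Math.abs(t1 - (t0 + step)) <= tolerance &&
    Math.abs(t2 - (t0 + 2 * step)) <= tolerance;

  return { shape: nearLine ? "linear" : "exponential", factor };
}
```

Under these assumptions, deltas of [10, 20, 30, 40] classify as linear, [20, 15, 12, 10] as inverted, and [1, 2, 4, 100] as exponential, because T1 misses the T0 → T3 line by more than 30% of T3.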

Entry points

  • scenarioRunner — module-level singleton; consumers attach via setMissionRef(ref) and onUpdate(cb) / onMatrixUpdate(cb) / onWeaponGauntletUpdate(cb).
  • runABSuite(speedMult=4) → Promise<ABResult[]> — runs every entry in ARTIFACT_AB_TESTS.
  • runABSelected(artifactIds, speedMult=4) → Promise<ABResult[]> — filtered AABB suite.
  • runSequence(scenarios, speedMult=4) → Promise<TestResult[]> — legacy linear scenario runner (TestRunnerTab compat path).
  • runTierSweep(artifactId, speedMult=4) → Promise<ArtifactMatrixResult> — one artifact × 3 templates × 4 tiers.
  • runFullMatrix(speedMult=4) → Promise<ArtifactMatrixResult[]> — all artifacts × 3 × 4 with pre-computed cached baselines.
  • runWeaponGauntlet(speedMult=8, durationSec=20) → Promise<WeaponGauntletResult[]> — every weapon × [1,5,10,15,20] levels.
  • cancel() — sets _cancelled = true; loops check and bail at every step boundary.
  • getState() / getABResults() / getMatrixState() / getWeaponGauntletState() — snapshot readers.

Pattern notes

  • AABB layout — Per artifact: A1, A2 (baseline ×2), B1, B2 (artifact ×2). A2 vs A1 and B2 vs B1 are determinism checks; B avg vs A avg is the effect. _compareAABB produces the ABResult with verdict gated by triggered → deterministic → deltaPct ≥ minDeltaPct.
  • Run-once contract (_runOnce / _runOnceImpl) — Reset → teleport(0,0) → clear __effectTriggerCounts → disable autopilot → _wait(200) → patch ship → fullHeal → optional godMode → _wait(100) → capture pre-stats → world knobs → heat → post-start actions (grant_artifact gated on withArtifact; spawn_crate + fire_signal fire on both A and B so baseline matches conditions) → initial enemy ring at INIT_ENEMY_COUNT=12, INIT_ENEMY_DIST=80 → setInterval wave spawner (ENEMIES_PER_WAVE=6, SPAWN_DIST=80, interval 1000/speedMult ms) → _wait(durationSec/speedMult*1000) → capture post stats → return TestMetrics. artifactTriggers field smuggles {hp, shield, hpPct, shieldPct} for the survivalHp KPI.
  • Matrix run-once contract (_runOnceMatrix) — Same shape but manual-pump driven. totalFrames = round(durationSec × 60 / speedMult); total sim steps = totalFrames × speedMult ≈ durationSec × 60 (exact when speedMult divides durationSec × 60). Waves are tied to frame count (framesPerWave = round(60/speedMult)), NOT wall-clock — this is what makes the matrix path deterministic across machines. Per-frame await _wait(0) yields to the browser so the canvas repaints live.
  • Weapon gauntlet (_runWeaponSliceManualPump) — Builds an ad-hoc ScenarioDef per (weapon, level) with hpMax=999999, shieldMax=0, godMode=true, spawnerEnabled=false, enemyDamageMult=0. Same manual-pump pattern as matrix mode; constants INIT_ENEMY_COUNT=12, INIT_ENEMY_DIST=80, ENEMIES_PER_WAVE=6, SPAWN_DIST=80 are re-declared locally rather than reused from the file-top constants (they govern the AABB path).
  • Baseline caching — _baselineCache: Map<ScenarioType, MatrixRunMetrics>. runFullMatrix clears it on entry and pre-warms all 3 template baselines before iterating artifacts; runTierSweep lazily fills via _getBaseline.
  • Speed multiplier — w.__dev.speed(speedMult) set on suite entry and restored to 1 on suite exit. The simulation’s accumulator multiplies wall-clock dt by timeDilation, firing speedMult sim steps per pumped frame in matrix/gauntlet mode.
  • Tier semantics — sandboxGrantArtifact(id) is invoked tier + 1 times to stack the artifact to the requested tier (T0..T3).
  • Verdict thresholds — minDeltaPct is supplied per-test in ArtifactABTest; NOISY_THRESHOLD = 0.20. No other numeric thresholds are owned by this file.
  • Restore discipline — Every code path that installs the seeded RNG or enables manual pump wraps the body in try/finally and unconditionally restores. Cancellation is cooperative: loops check _cancelled at every step, but cleanup always runs.
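
The AABB gating order (triggered → deterministic → deltaPct ≥ minDeltaPct) can be sketched as below. This is not the real _compareAABB. The input shape, the deltaPct formula (effect relative to the baseline average), the zero-average guards, and the split between WEAK and NO_EFFECT are all assumptions made for illustration. Only the gate order, the variance formula, NOISY_THRESHOLD, and the dmgTaken inversion come from the notes above.

```typescript
type Verdict = "PROVEN" | "WEAK" | "NO_EFFECT" | "NOISY" | "BROKEN" | "ERROR";

const NOISY_THRESHOLD = 0.2;

// Hypothetical input shape; the real ABResult / ArtifactABTest fields may differ.
interface AabbRuns {
  kpi: string;          // e.g. "dmgTaken"
  a1: number;           // baseline run 1
  a2: number;           // baseline run 2
  b1: number;           // artifact run 1
  b2: number;           // artifact run 2
  triggerCount: number; // __effectTriggerCounts['artifact:' + id] after the B runs
  minDeltaPct: number;  // per-test threshold
}

interface AabbOutcome {
  verdict: Verdict;
  deltaPct: number;
  varianceA: number;
  varianceB: number;
}

/** |run1 - run2| / avg, with a zero-average guard (the guard is an assumption). */
function pairVariance(run1: number, run2: number): number {
  const avg = (run1 + run2) / 2;
  return avg === 0 ? 0 : Math.abs(run1 - run2) / Math.abs(avg);
}

function compareAabbSketch(runs: AabbRuns): AabbOutcome {
  const baselineAvg = (runs.a1 + runs.a2) / 2;
  const artifactAvg = (runs.b1 + runs.b2) / 2;
  const varianceA = pairVariance(runs.a1, runs.a2);
  const varianceB = pairVariance(runs.b1, runs.b2);

  // dmgTaken is "lower is better", so its effect is inverted.
  const effect = runs.kpi === "dmgTaken" ? baselineAvg - artifactAvg : artifactAvg - baselineAvg;
  // Assumption: deltaPct is the effect as a percentage of the baseline average.
  const deltaPct = baselineAvg === 0 ? 0 : (effect / Math.abs(baselineAvg)) * 100;

  let verdict: Verdict;
  if (runs.triggerCount === 0) verdict = "BROKEN"; // gate 1: triggered
  else if (varianceA > NOISY_THRESHOLD || varianceB > NOISY_THRESHOLD) verdict = "NOISY"; // gate 2: deterministic
  else if (deltaPct >= runs.minDeltaPct) verdict = "PROVEN"; // gate 3: effect size
  else if (deltaPct > 0) verdict = "WEAK"; // assumption: positive but below threshold
  else verdict = "NO_EFFECT"; // assumption: zero or negative effect
  return { verdict, deltaPct, varianceA, varianceB };
}
```

ERROR is part of the taxonomy but is never produced by this sketch; how the runner assigns it is not described here.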

EXTRACT-CANDIDATE

  • Seeded RNG install/restore pair (_installSeededRandom / _restoreRandom + _mulberry32) — repeated locally inside _runOnce and _runCellAveraged, and a third call site in runWeaponGauntlet. Worth pulling into a shared testing/seeded-rng.ts if any other test harness needs reproducible runs.
  • Initial-ring + wave-spawn constants (INIT_ENEMY_COUNT=12, INIT_ENEMY_DIST=80, ENEMIES_PER_WAVE=6, SPAWN_DIST=80) — declared at module top for the AABB path and re-declared as locals inside _runWeaponSliceManualPump. Same values, two definitions; collapse into a single exported constant block.
  • Run-setup boilerplate — sandboxResetForTest → teleport(0,0) → clear __effectTriggerCounts → autopilot(false) → patchShipStats → setWeapons → fullHeal → setGodMode appears in three near-identical blocks (_runOnceImpl, _runOnceMatrix, _runWeaponSliceManualPump). A prepareSandboxRun(mission, scenario, withArtifact) helper would deduplicate ~30 lines per call site.
  • Manual-pump frame loop — the for (let f = 0; f < totalFrames; f++) { spawn-wave-on-cadence; testPumpFrame; await _wait(0) } pattern is shared verbatim between _runOnceMatrix and _runWeaponSliceManualPump. Lift into pumpFrames(mission, totalFrames, framesPerWave, onWave).
  • Pre/post stat-delta capture — preKills/preDmgDealt/preDmgTaken capture and the corresponding post - pre subtraction repeats in all three run paths.
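
As a starting point for the manual-pump extraction, the shared loop might look like the sketch below. Everything here is hypothetical: PumpableMission is a cut-down stand-in for the mission controller, planFrames and withManualPump are illustrative helper names, the fakeT value is assumed to be a millisecond timestamp advancing by 1000/60 per frame, frame 0 is assumed to spawn no wave (the initial ring covers it), and the Math.max(1, …) guard on framesPerWave is an addition not present in the notes.

```typescript
// Hypothetical subset of the mission controller used by the pump loop.
interface PumpableMission {
  testSetManualPump(enabled: boolean): void;
  testPumpFrame(fakeT: number): void;
}

interface FramePlan {
  totalFrames: number;
  framesPerWave: number;
}

/** Frame arithmetic from the matrix run-once contract. */
function planFrames(durationSec: number, speedMult: number): FramePlan {
  return {
    totalFrames: Math.round((durationSec * 60) / speedMult),
    framesPerWave: Math.max(1, Math.round(60 / speedMult)), // guard is an assumption
  };
}

/** Yields to the event loop, standing in for _wait(0). */
const waitTick = (): Promise<void> => new Promise((resolve) => setTimeout(resolve, 0));

/** Enables manual pump for `body` (setup plus frames) and always restores auto-pump. */
async function withManualPump<T>(mission: PumpableMission, body: () => Promise<T>): Promise<T> {
  mission.testSetManualPump(true);
  try {
    return await body();
  } finally {
    mission.testSetManualPump(false);
  }
}

/** Pumps frames synchronously, spawning waves on a frame-count cadence. Returns frames pumped. */
async function pumpFrames(
  mission: PumpableMission,
  plan: FramePlan,
  onWave: (frame: number) => void,
  isCancelled: () => boolean = () => false,
): Promise<number> {
  const FRAME_MS = 1000 / 60; // assumption: fakeT is a millisecond timestamp
  let pumped = 0;
  for (let f = 0; f < plan.totalFrames; f++) {
    if (isCancelled()) break; // cooperative cancellation at every step
    if (f > 0 && f % plan.framesPerWave === 0) onWave(f); // assumption: no wave on frame 0
    mission.testPumpFrame(f * FRAME_MS);
    pumped++;
    await waitTick(); // let the browser repaint
  }
  return pumped;
}
```

Keeping the manual-pump toggle in a separate wrapper lets the caller run sandboxResetForTest() inside withManualPump before the first frame, which matches the ordering described under Signals. With the gauntlet defaults (durationSec=20, speedMult=8) the plan is 150 frames with a wave every 8 frames.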