render-diag

PURPOSE

Per-frame render diagnostics: wall-clock timing for every named render pass, layer-health checks on canvas buffers, spike capture on bad frames, GPU-side delivery metrics, and an optional on-screen perf overlay. Overlay drawing is designed to cost nothing when disabled, while telemetry keeps feeding continuously, so the collector can reconstruct what was happening at the moment a frame missed budget.

OWNS

  • PASS_NAMES — the ordered list of instrumented render passes (gameLoop, frameCache, weapons, bulletSim, enemyAI, collision, background, parallax slots, gameplay_grid, sticker passes, shadowPass, terrain, worldObjects, bullets, postFx, stickerBlit, shieldHUD, bossVfx, particles, effects, postProcess, gpuSync, hud, frameTotal).
  • RING_SIZE = 120 frame ring buffers (Float32Array) for per-pass timings, plus _avg and _max rollups recomputed each frame.
  • _passOrdinal map for O(1) pass-name → index lookup.
  • _drawCounts (Int32Array) — per-pass entity counts reset every frame.
  • _canvasHealth[] — per-canvas record { name, physW, physH, memMB, ctxValid, lastUpdateFrame } plus _totalCanvasMemMB rolling total.
  • Entity-count scalars (_enemyCount, _visibleEnemyCount, _bulletCount, _particleCount, _stickerEntryCount, _stickerSpriteCount, _stickerPolyCount, _xpOrbCount, _enemyBulletCount, _terrainCount, _junkCount).
  • GPU-side timing state: _renderEndTime, _prevFrameStart, smoothed averages _deliveryMsAvg, _gpuOverheadMsAvg, _rafGapMsAvg, _drawImageCountAvg.
  • _drawImageCount for the current frame; _woSubTimings and _ssSubTimings sub-pass micro-timer bags.
  • Stutter ring (STUTTER_WINDOW = 60, _frameTimeRing) and _stutterScore() rolling std-dev.
  • Pool occupancy scalars _poolOrbCount, _poolEnemyBin, _poolBulletBin.
  • Spike capture state: SpikeSnapshot interface, SPIKE_THRESHOLD_MS = 20, MAX_SPIKES_PER_RUN = 30, _pendingSpikes[], _spikeCount.
  • Sustained-low-FPS tracker: LOW_FPS_THRESHOLD = 24, LOW_FPS_SUSTAINED_FRAMES = 180, _lowFpsFrameCount.
  • Rolling error log _errors[] capped at MAX_ERRORS = 10.
  • The overlay renderer drawDiagOverlay and its _makeBar helper.
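
A minimal sketch of how the flat ring-buffer layout and the _passOrdinal lookup described above could fit together. The pass list is shortened, the function names recordPass and endFrame are hypothetical, and the rollup divides by RING_SIZE even before the ring has filled, which is a simplification rather than the module's confirmed behavior.

```typescript
// Hypothetical miniature of the timing storage; names mirror the notes above,
// but the function bodies are assumptions for illustration only.
const PASS_NAMES = ["terrain", "bullets", "hud", "frameTotal"] as const;
type PassName = (typeof PASS_NAMES)[number];
const RING_SIZE = 120;

// One flat Float32Array: pass p owns slots [p * RING_SIZE, (p + 1) * RING_SIZE).
const timings = new Float32Array(PASS_NAMES.length * RING_SIZE);
const avg = new Float32Array(PASS_NAMES.length);
const max = new Float32Array(PASS_NAMES.length);

// O(1) pass-name to index lookup, built once at load.
const passOrdinal = new Map<string, number>();
PASS_NAMES.forEach((name, i) => passOrdinal.set(name, i));

let ringIdx = 0;

// Store one pass duration in the current frame's ring slot.
function recordPass(name: PassName, durMs: number): void {
  const ord = passOrdinal.get(name);
  if (ord === undefined) return;
  timings[ord * RING_SIZE + ringIdx] = durMs;
}

// Advance the ring, then recompute the avg and max rollups for every pass.
function endFrame(): void {
  ringIdx = (ringIdx + 1) % RING_SIZE;
  for (let p = 0; p < PASS_NAMES.length; p++) {
    const base = p * RING_SIZE;
    let sum = 0;
    let peak = 0;
    for (let i = 0; i < RING_SIZE; i++) {
      const v = timings[base + i];
      sum += v;
      if (v > peak) peak = v;
    }
    avg[p] = sum / RING_SIZE;
    max[p] = peak;
  }
}
```

Because every array is sized once at load, neither function allocates, which matches the steady-state behavior the notes describe.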

READS FROM

  • ./state: debugOverlay, diagExpanded, W, H, dpr.
  • ./config: BUILD_VERSION, PERF_VERSION_TAG (the latter is rendered as the first line of the overlay).
  • performance.now() for all timestamps; performance.memory.usedJSHeapSize (Chromium only, null elsewhere) for heapMB.
  • The live canvas/context handed in by callers of diagTrackCanvas — reads canvas.width, canvas.height, and checks whether ctx is non-null.

PUSHES TO

  • ./remote-errors: reportRenderError on canvas-health failures (null context, zero dimension) and reportPerfWarning on oversized buffers and sustained-low-FPS events.
  • console.warn via _pushError for any locally logged issue.
  • Callers (collector, overlay) via the exported snapshot/drain API — no direct mutation of consumer state.
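
The canvas-health checks that feed reportRenderError and reportPerfWarning can be sketched as a pure function. This is an illustration under stated assumptions: checkCanvas, CanvasLike, and HealthIssue are hypothetical names, and the memory estimate assumes 4 bytes per pixel (RGBA), which the notes do not confirm. The thresholds are the ones listed under Signals.

```typescript
// Hypothetical health check; only the thresholds come from the notes.
type HealthIssue = "null-context" | "zero-dimension" | "oversized";

interface CanvasLike {
  width: number;
  height: number;
}

interface HealthResult {
  memMB: number;
  issues: HealthIssue[];
}

const MAX_PIXELS = 8_000_000;
const MAX_MEM_MB = 30;

function checkCanvas(canvas: CanvasLike, ctx: object | null): HealthResult {
  const issues: HealthIssue[] = [];
  // Assumption: 4 bytes per pixel for an RGBA backing store.
  const memMB = (canvas.width * canvas.height * 4) / (1024 * 1024);

  // A null context is treated as a render error (likely out of memory).
  if (ctx === null) issues.push("null-context");
  // A zero-sized buffer is also treated as a render error.
  if (canvas.width === 0 || canvas.height === 0) issues.push("zero-dimension");
  // Oversized buffers are a perf warning; both conditions must hold.
  if (canvas.width * canvas.height > MAX_PIXELS && memMB > MAX_MEM_MB) {
    issues.push("oversized");
  }
  return { memMB, issues };
}
```

In this sketch the first two issues would map to reportRenderError and the third to reportPerfWarning; returning the issues instead of reporting directly keeps the check testable without the remote-errors module.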

DOES NOT

  • Does not own the render pipeline, kick off any pass, or know how a pass is implemented. It only times the boundaries callers mark with diagBeginPass / diagEndPass.
  • Does not allocate during steady-state telemetry: ring buffers are typed arrays sized at module load, and diagDrainSpikes returns the same empty array reference when no spikes are pending.
  • Does not gate timing on the visual overlay. diagBeginFrame, diagBeginPass, diagEndPass, diagEndFrame, diagSetCounts, diagSetPassDrawCount, diagSetPoolStats, diagSetWoSubTimings, diagSetSsSubTimings, diagAddDrawCalls, and diagTrackCanvas always run — only drawDiagOverlay and diagEvent short-circuit on !debugOverlay.
  • Does not timestamp spikes itself. SpikeSnapshot.t is left as 0; the telemetry collector fills it on drain.
  • Does not persist anything between runs. diagResetSpikes clears pending spikes and resets the spike counter so the per-run cap starts over for a new run; per-frame counts and ring buffers are not explicitly reset.
  • Does not measure GPU work directly. The “GPU overhead” figure is a derived value: deliveryMs - jsRenderMs, where delivery is the frame-to-frame performance.now() gap.
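
To make the derived "GPU overhead" figure concrete, here is a hedged sketch of the arithmetic: the frame-to-frame gap minus the previous frame's JS render time, smoothed with an EMA at α = 0.05. DeliveryState and beginFrame are hypothetical names, timestamps are passed in rather than read from performance.now() so the sketch stays self-contained, and clamping the overhead at zero and seeding the averages at zero are assumptions.

```typescript
const EMA_ALPHA = 0.05;

// Standard exponential moving average step.
function ema(prev: number, sample: number, alpha: number = EMA_ALPHA): number {
  return prev + alpha * (sample - prev);
}

interface DeliveryState {
  prevFrameStart: number | null;
  deliveryMsAvg: number;
  gpuOverheadMsAvg: number;
}

// Called at the start of each frame with the current timestamp and the
// JS render time measured for the previous frame.
function beginFrame(state: DeliveryState, frameStart: number, prevJsRenderMs: number): void {
  if (state.prevFrameStart !== null) {
    // Delivery is the frame-to-frame timestamp gap.
    const deliveryMs = frameStart - state.prevFrameStart;
    // Derived, not measured: whatever part of the gap JS did not account for.
    // Clamping at zero is an assumption of this sketch.
    const gpuOverheadMs = Math.max(0, deliveryMs - prevJsRenderMs);
    state.deliveryMsAvg = ema(state.deliveryMsAvg, deliveryMs);
    state.gpuOverheadMsAvg = ema(state.gpuOverheadMsAvg, gpuOverheadMs);
  }
  state.prevFrameStart = frameStart;
}
```

With a 16 ms gap and 10 ms of JS work, the derived overhead sample is 6 ms. It also includes idle time spent waiting for the next rAF tick, which is why the figure should be read as an upper bound on GPU cost rather than a measurement.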

Signals

  • Per-pass timing — ring-buffer average and max in ms per PASS_NAMES entry, plus the per-pass entity count from the most recent frame (_drawCounts).
  • Frame total + FPS: frameAvgMs and fps = 1000 / frameAvgMs, both derived from the frameTotal pass.
  • Stutter score — rolling std-dev of frame times over the last 60 frames, rounded to two decimals. Separates “smooth 60” from “60 with hitches”.
  • Spike snapshots — full per-frame context (per-pass times for THAT frame, draw counts, entity counts, VRAM) captured when frameTotalMs >= 20 and the per-run cap of 30 has not been hit.
  • Sustained low FPS: reportPerfWarning('fps', ...) fires once when frame FPS stays below 24 for 180 consecutive frames; the counter resets to zero as soon as a frame meets budget.
  • Canvas health — per-buffer { w, h, mb, ok } plus total VRAM. Failure modes emitted: null context (likely OOM), zero-dimension canvas, oversized buffer (width × height > 8_000_000 and memMB > 30).
  • GPU-side metrics: deliveryMs, gpuOverheadMs, rafGapMs, deliveredFps, smoothed drawImageCalls. Smoothing is an EMA with α = 0.05 (~20-frame window).
  • Heap: heapMB from performance.memory where available, null otherwise.
  • Pools: orbs active count plus enemyBin / bulletBin recycled-slot counts, reported by the spawner / weapons / xp-orb modules via diagSetPoolStats.
  • Sub-pass micro-timers: woSub (worldObjects) and ssSub (stickerSetup) bags pushed in by the bridge for finer attribution within heavy passes.
  • Error log — last 10 _pushError lines, formatted [f{frameCount}] {msg}, shown in the overlay’s Errors block.
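
The stutter score can be illustrated with a small rolling standard deviation over a 60-slot ring. This is a sketch, not the module's code: pushFrameTime and stutterScore are hypothetical stand-ins for the internal helpers, and the use of the population (rather than sample) standard deviation is an assumption.

```typescript
const STUTTER_WINDOW = 60;
const frameTimeRing = new Float32Array(STUTTER_WINDOW);
let ringPos = 0;
let filled = 0; // number of valid samples, capped at STUTTER_WINDOW

// Record one frame time, overwriting the oldest sample once the ring is full.
function pushFrameTime(ms: number): void {
  frameTimeRing[ringPos] = ms;
  ringPos = (ringPos + 1) % STUTTER_WINDOW;
  if (filled < STUTTER_WINDOW) filled++;
}

// Standard deviation of the recorded frame times, rounded to two decimals.
function stutterScore(): number {
  if (filled < 2) return 0;
  let sum = 0;
  for (let i = 0; i < filled; i++) sum += frameTimeRing[i];
  const mean = sum / filled;
  let sq = 0;
  for (let i = 0; i < filled; i++) {
    const d = frameTimeRing[i] - mean;
    sq += d * d;
  }
  // Assumption: population variance (divide by N, not N - 1).
  return Math.round(Math.sqrt(sq / filled) * 100) / 100;
}
```

A steady run of 16 ms frames scores 0, while a window split evenly between 10 ms and 20 ms frames scores 5, even though both can report a similar average FPS. That difference is what separates "smooth 60" from "60 with hitches".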

Entry points

  • diagBeginFrame() — start-of-frame; computes previous-frame GPU-side averages, resets _drawImageCount, stamps _prevFrameStart and _frameStart.
  • diagBeginPass(name) / diagEndPass(name) — wrap a render pass. End writes dur into _timings[ord * RING_SIZE + _ringIdx].
  • diagSetPassDrawCount(name, count) — per-pass entity count for the current frame.
  • diagSetCounts(...) — entity counts (enemies, bullets, particles, sticker, xp orbs, enemy bullets, terrain, junk).
  • diagSetPoolStats(orbs, enemyBin, bulletBin) — pool occupancy.
  • diagSetWoSubTimings(timings) / diagSetSsSubTimings(timings) — sub-pass micro-timer bags.
  • diagAddDrawCalls(count) — accumulates per-frame drawImage calls; reset in diagBeginFrame.
  • diagTrackCanvas(name, canvas, ctx) — register/refresh a buffer’s health entry, push errors/warnings when it fails the checks, recompute _totalCanvasMemMB.
  • diagEndFrame() — finalize frame total, push into the stutter ring, capture spike snapshot if over threshold, reset _drawCounts, advance ring, recompute _avg and _max for every pass; also fires the sustained-low-FPS warning.
  • diagGetPerfSnapshot() — return a RenderPerfSnapshot (averages, maxes, draw counts, entities, GPU metrics, heap, stutter, pools, canvases) for the collector.
  • diagDrainSpikes() — return and clear _pendingSpikes. Returns the same empty-array reference when nothing is pending to avoid allocation on the common path.
  • diagResetSpikes() — clear pending spikes and reset _spikeCount for a new run.
  • diagEvent(msg) — log a one-off diagnostic line; gated on debugOverlay.
  • drawDiagOverlay(ctx) — paint the top-right panel: FPS line is always shown when the overlay is on, and the detailed sections (pass timing with ASCII bars and draw-count tags, entities, canvases, errors) only render when diagExpanded is also on.
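
The end-of-frame bookkeeping and the drain contract can be sketched together. Everything below is illustrative: the SpikeSnapshot shape is reduced to three fields, endFrame takes the frame total and per-pass times as arguments instead of reading the ring, and the onLowFps callback is a hypothetical stand-in for reportPerfWarning.

```typescript
interface SpikeSnapshot {
  t: number; // left as 0 here; the collector fills it on drain
  frameTotalMs: number;
  passes: Record<string, number>;
}

const SPIKE_THRESHOLD_MS = 20;
const MAX_SPIKES_PER_RUN = 30;
const LOW_FPS_THRESHOLD = 24;
const LOW_FPS_SUSTAINED_FRAMES = 180;

let pendingSpikes: SpikeSnapshot[] = [];
let spikeCount = 0;
let lowFpsFrameCount = 0;

function endFrame(
  frameTotalMs: number,
  passTimes: Record<string, number>,
  onLowFps: (msg: string) => void,
): void {
  // Spike capture: copy this frame's pass times, subject to the per-run cap.
  if (frameTotalMs >= SPIKE_THRESHOLD_MS && spikeCount < MAX_SPIKES_PER_RUN) {
    pendingSpikes.push({ t: 0, frameTotalMs, passes: { ...passTimes } });
    spikeCount++;
  }
  // Sustained-low-FPS tracker: warn once when the streak reaches the limit.
  const fps = frameTotalMs > 0 ? 1000 / frameTotalMs : Infinity;
  if (fps < LOW_FPS_THRESHOLD) {
    lowFpsFrameCount++;
    if (lowFpsFrameCount === LOW_FPS_SUSTAINED_FRAMES) {
      onLowFps(`fps below ${LOW_FPS_THRESHOLD} for ${LOW_FPS_SUSTAINED_FRAMES} frames`);
    }
  } else {
    lowFpsFrameCount = 0;
  }
}

// Return and clear pending spikes; reuse the empty array when nothing is pending.
function drainSpikes(): SpikeSnapshot[] {
  if (pendingSpikes.length === 0) return pendingSpikes;
  const out = pendingSpikes;
  pendingSpikes = [];
  return out;
}

// Start a new run: drop pending spikes and reset the cap counter.
function resetSpikes(): void {
  pendingSpikes = [];
  spikeCount = 0;
}
```

A collector would call drainSpikes() on its own schedule and stamp t on each returned snapshot. Since the empty result is a shared reference, callers in this sketch should treat it as read-only.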

Pattern notes

  • Allocate once, write forever. All hot-path storage is a typed array (Float32Array/Int32Array) sized at module load. The only Record<string, number> objects built per call live inside diagGetPerfSnapshot and diagEndFrame’s spike branch — both off the per-pass hot path.
  • Always-on telemetry, gated rendering. Per-pass timing, entity counts, canvas health, GPU metrics, and spike capture run unconditionally so the collector and remote-error reporting can see problems without the player ever turning the overlay on. Only drawDiagOverlay and diagEvent short-circuit on !debugOverlay.
  • Two-tier expansion. The overlay shows one line (version tag + FPS) by default and only spills the detail sections when diagExpanded is also set, keeping the cheap default panel small on portrait screens.
  • Spike capture writes a self-contained record. _pendingSpikes entries copy this-frame pass times directly from _timings[p * RING_SIZE + _ringIdx] rather than averages, so the snapshot reflects the slow frame, not the smoothed neighborhood.
  • Per-run spike cap. MAX_SPIKES_PER_RUN = 30 plus diagResetSpikes keeps a continuously bad device from drowning telemetry; the cap is per run, not per drain.
  • EMA for jittery signals. α = 0.05 smoothing on delivery, GPU overhead, rAF gap, and drawImage counts gives ~20-frame smoothing — slow enough to be readable, fast enough to track a regression.
  • Drain returns the same empty array. diagDrainSpikes returns _pendingSpikes itself (length 0) when nothing is pending, avoiding even an empty [] allocation. The caller must not mutate the returned reference except via the documented drain semantics.
  • Pass-time visualization in the overlay is a simple _makeBar(passMs, totalMs) returning a 10-cell filled/empty bar, drawn in the same monospace block as the numbers.
  • Sub-pass bags are replaced wholesale. diagSetWoSubTimings and diagSetSsSubTimings reassign the internal record reference; the snapshot getter spreads them with { ..._woSubTimings } so consumers cannot accidentally hold a live mutable view.
  • Pass order is the draw order. PASS_NAMES is in the literal order draw passes execute, with frameTotal as the sentinel at the end so iteration over the timing arrays can stop at PASS_NAMES.length - 1 when “everything except frame total” is what’s wanted (used in the overlay loop, snapshot loop, and spike-capture loop).
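
As an illustration of the last two visualization points, the sketch below pairs a 10-cell bar helper with a loop that stops before the frameTotal sentinel. The glyphs ('#' and '-'), the rounding rule, and the overlayLines helper are assumptions; only the (passMs, totalMs) signature and the 10-cell width come from the notes.

```typescript
// Hypothetical bar helper: filled cells are proportional to passMs / totalMs.
function makeBar(passMs: number, totalMs: number, cells: number = 10): string {
  if (!(totalMs > 0)) return "-".repeat(cells);
  const ratio = Math.min(1, Math.max(0, passMs / totalMs));
  const filled = Math.round(ratio * cells);
  return "#".repeat(filled) + "-".repeat(cells - filled);
}

// Build one text line per pass, skipping the trailing frameTotal sentinel.
function overlayLines(passNames: readonly string[], avgMs: ArrayLike<number>): string[] {
  const last = passNames.length - 1; // index of the frameTotal sentinel
  const totalMs = avgMs[last];
  const lines: string[] = [];
  for (let p = 0; p < last; p++) {
    lines.push(`${passNames[p]} ${avgMs[p].toFixed(2)}ms ${makeBar(avgMs[p], totalMs)}`);
  }
  return lines;
}
```

Keeping frameTotal last means the same `p < length - 1` bound works for the overlay loop, the snapshot loop, and the spike-capture loop without a name comparison on each iteration.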