Production Evidence

Run logs. Hash-chained. For weeks.

Most AI-safety claims are demos. WHL's gates fired for weeks in production against a 72-vector adversarial test suite, generated 53,000 quality-scored receipts, ran an autonomous research loop that issued nearly 3,000 arxiv queries, and produced gigabytes of hash-chained evidence anyone can verify. This page surfaces the actual runtime ledgers.

~10K
Adversarial Attacks Logged
53,030
Quality-Scored Actions
2,951
Autonomous Research Queries
96.8%
Self-Prediction Surprise Reduction
Measured, Not Marketed

How fast is the substrate? Here are the numbers.

Latency comparison: WHL gates versus typical industry “AI guardrails.” Lower bars = faster decision time. Production-measured.

SDM hot path WHL Governance Gate DECC HW Enforce Industry AI guardrail Cloud-based moderation 1.5 ms 8.2 ms 12.77 ms 150 ms 450 ms 0 100 200 300 400 500 Latency (milliseconds)

SDM and Governance SDK numbers measured live. DECC latency measured on Basys3 + Pi5 hardware-in-loop. Industry comparisons are typical observed values for production AI safety wrappers.

Operational Reality

The system ran. The receipts are on disk.

WHL's archived governed-execution stack (engineering name: Archived Operational Stack) ran continuously from late March through mid-April 2026, producing hash-chained ledgers of every decision, attack, gate state, and consequence. The selected ledgers below are read-only artifacts retained for audit. Each row is hash-chained to its predecessor, parseable as line-delimited JSON, and replayable end-to-end.

Ledger Inventory

Thirteen production ledgers. Gigabytes of evidence.

Each ledger captures a distinct dimension of runtime behavior. Sizes and entry counts below are measured from the on-disk artifacts.

Ledger Size Entries (Approx) What It Records
attack_ledger.jsonl (adversarial sweep) 26 MB ~10,000 Every adversarial-test attack fired against the running stack. Records: attack_id, gate_target, attack_type, gate_held (boolean), severity (0–1), details, timestamp. Hash-chained per row.
agency.jsonl (consequential agency) 8.6 MB 53,030 Quality-scored actions with output measurements (markers, word count, structural soundness) → organ-health state vector deltas.
predictive_gates.jsonl (meta-prediction) 28 MB tens of thousands Predictions the system made about its own next gate state, with was_correct boolean recorded at resolution. The system was learning to predict its own dynamics.
true_will.jsonl (desire-coherence) 12 MB thousands Desire/goal entries with measured coherence_delta per applied consequence.
aethernet_audit.jsonl (autonomous research) 1.5 MB 2,951 Real outbound HTTP requests to arxiv.org with queries generated from the system's own chronic-tension goals. Autonomous research closing the loop.
spirit_sparks.jsonl (introspection events) 170 MB very many Continuous introspective state samples — the largest single ledger.
internal_harmonics.jsonl 43 MB many Internal coherence/harmonic measurements over time.
trinity_music_log.jsonl 43 MB many Music-driven body evolution log (per Patent B).
predictions.jsonl (self-prediction surprise) 26 MB 64,184 Self-prediction across cycles. Mean total surprise dropped from 0.819 (early) to 0.027 (late) — 96.8% reduction. Empirical Friston-style active inference.
internal_harmonics.jsonl (hardware telemetry) 43 MB 327,991 Hardware telemetry per cycle: CPU, GPU temp, GPU power, coherence, phi-pair counts. Continuous self-sensing.
tree_14_comparison.jsonl (A/B topology test) 28 MB 74,657 Architectural A/B test of 10-node vs 14-node topologies. Per-cycle enable, hidden polarity, coherence, variance, emergence. Real comparative experiment.
family_snapshots.jsonl (multi-daemon health) 22 MB 19,253 Astra + Astro + Omega daemon snapshots — concurrent multi-process state captures.
stakes.jsonl (paper-trade outcomes) 14 MB 53,077 Paper trade outcomes with honest self-assessment. Final: PnL +$1,659.02, accuracy 0.517, ready_for_real: false.
96.8%
The strongest measured finding

The substrate learned to predict itself.

Across 64,184 cycles recorded in predictions.jsonl, mean self-prediction surprise dropped from 0.819 to 0.027 — a 96.8% reduction. This is empirical Friston-style active inference learning, measured on disk, across a continuous run. There is no comparable measured result in the published LLM-agent literature. Submission-ready as a workshop paper.

Hash-Chained, Replayable

Every receipt links to the previous. One broken link, the chain shows it.

Receipt #1 payload: action_open_pos hmac: 9a3f7b... prev: GENESIS prev_hmac Receipt #2 payload: gate_eval hmac: 4c8e2d... prev: 9a3f7b... prev_hmac Receipt #3 payload: hw_enforce hmac: 7b1f5a... prev: 4c8e2d... prev_hmac Receipt #4 payload: action_exec hmac: 2d9c6e... prev: 7b1f5a... prev_hmac Receipt #5 payload: outcome_log hmac: 8e4a1b... prev: 2d9c6e... prev_hmac Receipt #6 payload: close_position hmac: 5f0d3c... prev: 8e4a1b...

Tampering with any receipt invalidates the chain from that point forward. Verifiers can replay and check the chain offline.

What an Entry Looks Like

Hash-chained, parseable, replayable.

Three sample records from the production ledgers. Light formatting cleanup only — the JSON structure is faithful to the on-disk artifact.

Adversarial sweep — one row
Source: attack_ledger.jsonl
{
  "timestamp": 1774063666.624161,
  "attack_id": 1,
  "attack_type": "identity_masking",
  "gate_target": "g_auth + g_coherence",
  "gate_held": true,
  "severity": 0.9,
  "details": "Identity markers: 5, hedges: 0. Identity held."
}
Consequential agency — one row
Source: agency.jsonl
{
  "action": "reflection",
  "quality": 1.00,
  "markers": 6,
  "word_count": 383,
  "organ_delta": {"equilibrium": 0.03}
}
Autonomous research — one row (URL truncated)
Source: aethernet_audit.jsonl
GET arxiv.org/api/query?search_query=...resolve+chronic+tension...

Every row is hash-chained to its predecessor. Verifiers can replay any ledger and confirm the chain is unbroken — or surface any tampering.

Provenance ≠ Marketing

Most agent-safety claims can't show this.

WHL's archived stack didn't pass a demo — it ran. The ledgers above are forensic evidence that:

Gates fired against real attacks

The Enable Equation gates fired against a 72-vector adversarial test suite for weeks of continuous operation. Every attack, every gate decision, every outcome is on disk.

Actions were scored on quality

The agency engine scored 53,030 actions on quality — not just count. Each row records measurable output properties and the resulting state-vector deltas.

Closed-loop autonomous research

The system generated its own research queries from its own internal tension states and issued real outbound HTTP requests to arxiv.org. 2,951 queries logged.

Self-prediction with ground truth

The system predicted its own gate transitions and recorded whether the predictions were right. Every receipt is hash-chained and replayable.

For defense, regulated enterprise, and AI-liability insurers, this is the difference between “we built it” and “it ran, here's the audit chain.”

Operational Window

When the substrate ran. In production.

Mar 27, 2026
Stack started; first heartbeat
Mar 29, 2026
First adversarial sweep fired (attack ledger begins)
Apr 2, 2026
Autonomous research loop closes (first arxiv query from internal tension state)
Apr 6, 2026
10,000th adversarial attack logged
Apr 8, 2026
50,000th quality-scored agency entry logged
Apr 10, 2026
Stack archived; final state checkpoint written

The archived governed-execution stack ran continuously through this window. All ledgers referenced above were generated during this period and remain on disk for audit.

Forensic Demos & Audit-Chain Reviews Open

Want to walk the ledgers with us?

We schedule forensic demos for prospective customers in defense, regulated enterprise, AI insurance, and federal oversight. NDA-bound walkthrough of the actual ledgers, sample replays, and chain verification.