Audit-Grade Evidence for AI Decisions: An Ed25519 + JWKS Walkthrough | Strix Blog

An auditor recently asked us a question that sounded simple: "Can you prove your AI agent was authorized to do what it did at 2:47 PM on March 15th?" Most teams answer that question with a SQL query against an audit log. We answered with a 64-byte Ed25519 signature verifiable against a public key that the auditor downloaded from our /.well-known/ endpoint and checked themselves. That difference — between "we'll show you our logs" and "verify the math yourself" — is the entire reason cryptographically signed evidence exists.

The autopsy problem with audit logs

Application audit logs are useful. They're also not evidence. When an auditor asks for proof that an AI agent's action was authorized, what they get back is a row from a database the vendor controls. The vendor could have edited the row. The vendor could have written the row after the fact. The vendor could be lying. The auditor has no way to know.

This isn't a hypothetical. It's why every regulated industry — finance, healthcare, federal contracting, EU operations under the AI Act — has spent decades layering trust mechanisms on top of basic logging: WORM storage, hash chains, time-stamping authorities, cross-vendor attestation. Each one of those is a workaround for the same root issue: the audit log lives in a database the auditor doesn't trust.

Cryptographic signatures cut through the workarounds. If you sign each evidence record with a private key, and you publish the public key, then any third party can independently verify that (a) the record was produced by the holder of the signing key, and (b) the record hasn't been altered since signing. The verifier doesn't have to trust the database. The verifier doesn't have to trust the vendor's account of what the database says, either. The math is the trust.

What we actually sign

The temptation when adding signatures is to "sign the row." That's the wrong move — rows have field ordering ambiguity, JSON serializers differ across runtimes, and the moment you add a column the signature breaks. We solved this with a locked canonical schema we call the 13-field signed payload. Every governance evidence record at Strix carries exactly these fields, in exactly this order:

schemaVersion (1)
-> evidenceId
-> evidenceHash
-> proofChainHash
-> capabilityId
-> action
-> actorId
-> actorRole
-> createdAt
-> signingKeyId
-> environment
-> tenantId
-> regulatoryContext

Three of those fields deserve a closer look:

environment and tenantId are captured at signing time and stored on the evidence record. The verifier always reads from the stored fields — never from process.env. That distinction prevents a class of false-failure bug where a record signed in production fails verification because the verifier is running in a development context. The environment is part of the cryptographic record, not part of the runtime.
regulatoryContext binds EU AI Act compliance flags directly into the signed payload. If anything about the regulatory context changes — even a single boolean — the signature no longer verifies. You can't post-hoc claim a record satisfied Article 12 if it didn't.

The serializer is also load-bearing. We use a deterministic canonicalizer (we call it SCJ v1) that produces byte-identical output across Node.js, Bun, Deno, and the browser. There's a test suite that pins 41 golden vectors with locked SHA-256 hashes — if a runtime change alters serialization output, the test fails before the change can ship.
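The field-ordering problem is easy to reproduce. The sketch below is not SCJ v1 (its rules aren't spelled out in this post); it is a hypothetical fixed-order serializer, with made-up record values, that only shows why "sign whatever JSON.stringify gives you" is fragile and why a pinned field order helps.

```javascript
// Illustrative only: NOT SCJ v1. It pins top-level field order and nothing else.
const FIELD_ORDER = [
  'schemaVersion', 'evidenceId', 'evidenceHash', 'proofChainHash',
  'capabilityId', 'action', 'actorId', 'actorRole', 'createdAt',
  'signingKeyId', 'environment', 'tenantId', 'regulatoryContext',
];

function serializeInFixedOrder(record) {
  const parts = FIELD_ORDER.map((name) => {
    if (!(name in record)) throw new Error(`missing field: ${name}`);
    // Nested values go through JSON.stringify unchanged; a real
    // canonicalizer would have to pin their key order as well.
    return `${JSON.stringify(name)}:${JSON.stringify(record[name])}`;
  });
  return `{${parts.join(',')}}`;
}

// The same data built in two different insertion orders (values are made up).
const a = {
  schemaVersion: 1, evidenceId: 'evi_abc123', evidenceHash: 'h1',
  proofChainHash: 'h0', capabilityId: 'cap-1', action: 'approve',
  actorId: 'agent-7', actorRole: 'reviewer', createdAt: '2026-03-15T14:47:00Z',
  signingKeyId: 'strix-prod-2026-01', environment: 'production',
  tenantId: 'tenant-1', regulatoryContext: { article12: true },
};
const b = Object.fromEntries(Object.entries(a).reverse());

console.log(JSON.stringify(a) === JSON.stringify(b));               // false
console.log(serializeInFixedOrder(a) === serializeInFixedOrder(b)); // true
```

Two byte strings that differ only in key order would need two different signatures, so the serializer has to remove that freedom before anything gets signed.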

The signing primitive

We use Ed25519, defined in RFC 8032. Three properties matter for evidence:

Deterministic. Same private key + same message = same 64-byte signature. Always. The per-signature nonce is derived from the key and the message instead of a random number generator, so there is no weak-RNG or nonce-reuse failure that could leak the key. Two RFC 8032-conformant implementations will produce byte-identical output for the same input.
Fast to verify. A modern CPU verifies thousands of Ed25519 signatures per second. The signature itself is 64 bytes. The public key is 32 bytes. At audit volumes, verification cost is not a meaningful constraint.
Standardized. RFC 8032 (EdDSA). RFC 7517 (JWK). RFC 8037 (Ed25519 keys and EdDSA signatures in JOSE). Every major language has a tested implementation. There's no "use our SDK" vendor lock-in path; an auditor's verifier and our signer are the same primitive.

Publishing the public key (JWKS)

Signatures are useless without the public key, so the public key has to be available. We publish it as an RFC 7517 JWK Set at a well-known URL:

GET https://www.strixgov.com/.well-known/strix-jwks.json

The response is a standard JWKS document with each key wrapped in metadata: kid (key ID), kty (key type, "OKP" for Ed25519), crv ("Ed25519"), x (the 32-byte public key in base64url), alg ("EdDSA"), and a retentionPolicy block we add for AI Act traceability.
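As a rough picture of what a verifier does with that document, the sketch below selects a key by kid and sanity-checks it before use. The JWKS object is hypothetical: the x value is the example public key from RFC 8037 Appendix A, not a Strix key, and retentionPolicy is left empty because its contents aren't described here. The helper names are ours.

```javascript
// Hypothetical JWKS document shaped like the one described above.
const exampleJwks = {
  keys: [
    {
      kid: 'strix-prod-2026-01',
      kty: 'OKP',
      crv: 'Ed25519',
      alg: 'EdDSA',
      x: '11qYAYKxCrfVS_7TyWQHOg7hcvPapiMlrwIaaPcHURo', // RFC 8037 example key
      retentionPolicy: {}, // contents not specified in this post
    },
  ],
};

// Decode base64url (no padding) into bytes.
function base64UrlToBytes(value) {
  const base64 = value.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, '=');
  return Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
}

// Find the key for a kid and refuse anything that is not a 32-byte Ed25519 key.
function selectEd25519Key(jwks, kid) {
  const jwk = jwks.keys.find((key) => key.kid === kid);
  if (!jwk) throw new Error(`no key with kid ${kid}`);
  if (jwk.kty !== 'OKP' || jwk.crv !== 'Ed25519') {
    throw new Error(`key ${kid} is not an Ed25519 key`);
  }
  if (base64UrlToBytes(jwk.x).length !== 32) {
    throw new Error(`key ${kid} does not decode to 32 bytes`);
  }
  return jwk;
}

console.log(selectEd25519Key(exampleJwks, 'strix-prod-2026-01').kid);
```

Checking kty, crv, and the decoded length up front means a malformed or unexpected key fails loudly before it ever reaches the signature check.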

Key rotation is the part that trips most teams up. The naive approach is "rotate the key, update the JWKS, hope nobody verifies an old record." The correct approach is to keep retired public keys in the JWKS for at least the regulatory retention period. We use a key ID format of strix-{env}-{YYYY-MM}, which makes rotation auditable: strix-prod-2025-12 retired, strix-prod-2026-01 took over, both verifiable through the JWKS for at least 2 years. EU AI Act Article 12 requires retention sufficient for traceability — 2 years is our floor, and the verifier supports any number of historical keys.
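The key ID format lends itself to mechanical checks. A minimal sketch, assuming the env segment is lowercase letters and that a retirement date is tracked somewhere (the retiredAt argument and the function names are hypothetical, not part of the published JWKS):

```javascript
// Parse key IDs of the form strix-{env}-{YYYY-MM}.
const KEY_ID_PATTERN = /^strix-([a-z]+)-(\d{4})-(0[1-9]|1[0-2])$/;

function parseKeyId(kid) {
  const match = KEY_ID_PATTERN.exec(kid);
  if (!match) throw new Error(`unrecognized key ID: ${kid}`);
  return { env: match[1], year: Number(match[2]), month: Number(match[3]) };
}

// Order key IDs chronologically by the month encoded in them.
function sortByRotation(kids) {
  return [...kids].sort((left, right) => {
    const a = parseKeyId(left);
    const b = parseKeyId(right);
    return a.year - b.year || a.month - b.month;
  });
}

// True while a retired key is still inside the retention window
// and therefore has to stay in the JWKS.
function mustRemainPublished(retiredAt, now, retentionYears = 2) {
  const deadline = new Date(retiredAt.getTime());
  deadline.setUTCFullYear(deadline.getUTCFullYear() + retentionYears);
  return now.getTime() < deadline.getTime();
}

console.log(sortByRotation(['strix-prod-2026-01', 'strix-prod-2025-12']));
console.log(mustRemainPublished(
  new Date('2026-01-01T00:00:00Z'), // hypothetical retirement date
  new Date('2027-06-01T00:00:00Z')
));
```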

What verification actually looks like

Here's the entire verification flow, end to end. The only Strix-specific piece is the SCJ v1 canonicalizer used in step 4; everything else is standard fetch and Web Crypto. An auditor with a laptop can run this:

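// Prerequisite: canonicalSerialize is the SCJ v1 canonicalizer from the
// reference implementation (canonical-json.ts); import it before running.

// Helper: decode base64 or base64url text into bytes.
// Assumption: record.signature uses one of those two encodings.
const decodeBase64 = (text) => {
  const b64 = text.replace(/-/g, '+').replace(/_/g, '/');
  const padded = b64.padEnd(Math.ceil(b64.length / 4) * 4, '=');
  return Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
};

// 1. Fetch the evidence record (public, no auth)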
const record = await fetch(
  'https://www.strixgov.com/api/public/verify?id=evi_abc123'
).then(r => r.json());

// 2. Fetch the public key set
const jwks = await fetch(
  'https://www.strixgov.com/.well-known/strix-jwks.json'
).then(r => r.json());

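// 3. Find the key that signed this record (fail loudly if it isn't published)
const jwk = jwks.keys.find(k => k.kid === record.signingKeyId);
if (!jwk) throw new Error(`No published key for ${record.signingKeyId}`);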

// 4. Reconstruct the canonical 13-field payload
const payload = {
  schemaVersion: 1,
  evidenceId: record.evidenceId,
  evidenceHash: record.evidenceHash,
  proofChainHash: record.proofChainHash,
  capabilityId: record.capabilityId,
  action: record.action,
  actorId: record.actorId,
  actorRole: record.actorRole,
  createdAt: record.createdAt,
  signingKeyId: record.signingKeyId,
  environment: record.environment,
  tenantId: record.tenantId,
  regulatoryContext: record.regulatoryContext,
};
const canonical = canonicalSerialize(payload); // SCJ v1

// 5. Verify the signature using stdlib crypto
const ok = await crypto.subtle.verify(
  'Ed25519',
  await crypto.subtle.importKey('jwk', jwk, { name: 'Ed25519' }, false, ['verify']),
  decodeBase64(record.signature),
  new TextEncoder().encode(canonical)
);

// 6. ok === true means the math agrees with the claim

Step 4 is where most homegrown signing systems fall over: the verifier and the signer have to agree on serialization down to the byte. SCJ v1 (Strix Canonical JSON, version 1) is the only canonicalizer either side uses. Both sides import from the same reference implementation (solo-builder-core/src/canonical-json.ts); re-implementations are forbidden. The golden-vector suite described earlier exists to catch any silent drift.

Step 5 uses the Web Crypto API. Recent versions of Node, Deno, Bun, and the major browsers support Ed25519 signature verification natively, so the cryptographic check itself has no Strix dependency. We provide an open-source verifier (npx @strixgov/verifier) for convenience, but it's just this code packaged as a CLI. The math is the math regardless of which tool you use.

The two-layer trust model

Cryptographic verification is necessary but not sufficient. A record can be cryptographically valid and still be inappropriate for a context — for example, a production record being verified in a development environment, or a record from one tenant being shown in another tenant's dashboard. We split verification into two layers:

Layer 1 — Cryptographic Validity (signatureValid). "Was this record produced by the holder of the Strix signing key?" States: VERIFIED, UNSIGNED, LEGACY_UNSIGNED, COMPLIANCE_VIOLATION. This layer is pure math.
Layer 2 — Deployment Context (environmentMatch, tenantMatch). "Is this record appropriate for this deployment context?" Cross-environment replay detection. This layer is policy.

Both layers always run. A record is "fully verified" only when both layers pass. The Layer 1 result tells you whether the signature is real; the Layer 2 result tells you whether you should be looking at this record in this place. Auditors care about Layer 1; operators care about both.
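A minimal sketch of how the two layers might combine, assuming Layer 1 has already produced a boolean and that the verifier knows its own deployment context. The function name and the deployment argument are hypothetical; only signatureValid, environmentMatch, and tenantMatch come from the description above.

```javascript
// Layer 2 compares the fields stored on the record against the context the
// verifier is running in. Both layers are always evaluated and reported.
function evaluateRecord(record, signatureValid, deployment) {
  const environmentMatch = record.environment === deployment.environment;
  const tenantMatch = record.tenantId === deployment.tenantId;
  return {
    signatureValid,   // Layer 1: pure math
    environmentMatch, // Layer 2: policy
    tenantMatch,      // Layer 2: policy
    fullyVerified: signatureValid && environmentMatch && tenantMatch,
  };
}

// A genuine production record viewed from a development deployment:
// the signature is real, but the record does not belong in this context.
console.log(evaluateRecord(
  { environment: 'production', tenantId: 'tenant-1' },
  true,
  { environment: 'development', tenantId: 'tenant-1' }
));
```

Keeping the three results separate, instead of collapsing them into one boolean, is what lets an auditor read Layer 1 on its own while an operator looks at everything.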

Compliance flags are derived, not asserted

The biggest mistake we see in homegrown evidence systems is asserting compliance: a column called article12_compliant with a boolean the application sets. That's not compliance — that's a claim about compliance. An auditor reading "article12_compliant: true" has no way to evaluate the claim.

Strix derives compliance flags from verification outcomes:

article12_tamper_resistant = hashValid AND chainValid AND signatureValid
article14_human_oversight = signaturePresent (actor fields are bound into the signed payload)
article28_provider_obligations = signatureValid (the evidence was produced by a known signing key)

The flags are output, not input. They're computed at verification time from the cryptographic state. The application never asserts them; the math derives them. We call this principle truthful representation (CI-5 in our internal terminology), and it's the difference between an evidence system that survives an adversarial audit and one that doesn't.
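Expressed as code, the derivation is a pure function of the verification outcomes. This is a sketch of the three rules listed above; the function name and the shape of its input object are assumptions.

```javascript
// Compliance flags are computed from verification results, never stored as input.
function deriveComplianceFlags({ hashValid, chainValid, signatureValid, signaturePresent }) {
  return {
    article12_tamper_resistant: hashValid && chainValid && signatureValid,
    article14_human_oversight: signaturePresent,
    article28_provider_obligations: signatureValid,
  };
}

// A record whose signature checks out but whose chain link is broken:
console.log(deriveComplianceFlags({
  hashValid: true,
  chainValid: false,
  signatureValid: true,
  signaturePresent: true,
}));
```

Because nothing in the input is a compliance claim, there is no field an application could set to true to make a failing record look compliant.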

Proof chain for sequential integrity

Single-record signatures prove each record is authentic. They don't prove no records were silently removed. For that, every evidence record carries a proofChainHash field that hashes the previous record's hash plus the current record's content. The chain has 100% coverage by invariant (SE-5: no orphan records). If anyone deletes a record from the middle of the chain, the next record's chain hash no longer matches its new predecessor, so the walk from genesis breaks at that point and nothing in the tail can be verified past the gap.

Public proof chain verification is still in progress. The data is bound cryptographically today, but the full sequential walk (genesis → tip) isn't yet exposed via the REST/tRPC verification APIs. Coming soon.

What this looks like in practice

One of our auditors fetched a public evidence record yesterday. Walked through these 6 steps. Confirmed VERIFIED. Then asked for a record from a deleted decision and confirmed that the chain hash on the next record didn't match — proving the deletion. Then asked us for a record from a different tenant and confirmed Layer 2 flagged the tenant mismatch. Then signed off.

The audit took 14 minutes. The auditor wrote zero SQL. We didn't have to share screens, walk through dashboards, or explain our internal data model. They verified the math, and the math agreed with our claims. That's what audit-grade evidence is supposed to feel like.

Try it yourself

Public verify API: GET /api/public/verify?id=<evidenceId> — unauthenticated, rate-limited, public.
JWKS endpoint: GET /.well-known/strix-jwks.json — RFC 7517 standard.
External verifier: npx @strixgov/verifier@1.2.0 approval <id> — open source, no Strix dependency.
Capability registry: GET /api/v1/capabilities — every governable action with risk tier.

If your AI deployment can't pass this audit today, we can help. Book a 15-minute walkthrough at /request-access and we'll take you end-to-end through your specific evidence requirements.