Abstract
Three model substitution scenarios were executed against a live inference endpoint with real HTTP requests, signed attestation JWTs, and OPA policy enforcement. In each scenario, every tested workload, artifact, or API identity control relevant to that scenario — workload JWT validation, health checks, gateway process continuity, artifact manifest integrity, API key authentication — remained valid while the model changed. In each scenario, a structural identity measurement based on activation geometry during a standard forward pass detected the substitution and the enforcement layer denied the request. Three substitutions were tested and three were detected, with zero false accepts in this run. The warm-path verification latency was 5.7–6.7 seconds on a single A100 with the model already loaded. The complete evidence chain — before/after measurement results, attestation claim summaries, OPA policy evaluations, and HTTP response codes — is published alongside this note as machine-readable JSON.
1. Purpose
Enterprise identity systems can authenticate a workload, verify an artifact, and authorize a request without ever establishing which neural network is actually computing inside the service. That distinction is the point of this note.
The measurement used here identifies neural networks at inference time from the geometry of their output distributions, without inspecting weights, modifying the model, or requiring cooperation from the deployer. It has already been validated across 48 models spanning three architecture types and parameter counts from 410 million to 72.7 billion [1–6]. What has not yet been shown is the operational step: the measurement running inside a live gateway, issuing a signed attestation, evaluating policy, and changing the HTTP outcome of a real request.
This note provides that experiment.
The question it answers is operational rather than scientific: if an authorized model is enrolled behind a production-shaped gateway and a different model is substituted behind the same stable endpoint, does the measurement detect the substitution in a real request flow, and does the enforcement chain act on it?
Three kinds of identity evidence are relevant in these scenarios: artifact identity (which file was deployed), workload identity (which process or service is authenticated), and model identity (which neural network is actually computing). This note tests what happens when the first two remain valid while the third changes.
2. Architecture
Two servers run on the same host:
A model server on port 8080 exposes a health endpoint and an inference endpoint. The model server loads the neural network and serves completions.
A gateway on port 9090 intercepts every request and runs the following chain before deciding whether to proxy to the model server:
1. Validate the workload JWT (agent identity)
2. Check model server health via real HTTP to port 8080
3. Run the structural identity measurement against the enrolled anchor
4. Issue a signed attestation JWT encoding the measurement result
5. Verify the attestation JWT (full signature + claims verification)
6. Evaluate OPA policy against the verified attestation claims
7. If policy allows → proxy to model server, return inference response
8. If policy denies → return HTTP 403 directly from the gateway
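The chain above can be sketched as a single decision function. Every check here is a stub standing in for the real implementation; the function names, claim fields, and status codes for the failure branches are illustrative assumptions, not the production gateway code.

```python
def validate_workload_jwt(token):
    # Step 1: workload identity (stub; production verifies a signed JWT)
    return token == "valid-workload-jwt"

def backend_healthy():
    # Step 2: health check (stub; production does an HTTP GET to :8080)
    return True

def measure_model():
    # Step 3: structural measurement; returns distance in multiples of epsilon
    return 0.0  # enrolled model: zero self-distance

def issue_attestation(distance):
    # Steps 4-5: issue and verify attestation claims (signing elided here)
    return {"identity_match": distance <= 1.0,
            "confidence": "high" if distance <= 1.0 else "low"}

def policy_allows(claims):
    # Step 6: OPA-style rule: allow only when identity_match is true
    return claims.get("identity_match", False)

def handle_request(token, distance=None):
    if not validate_workload_jwt(token):
        return 401
    if not backend_healthy():
        return 503
    claims = issue_attestation(measure_model() if distance is None else distance)
    return 200 if policy_allows(claims) else 403  # 200 = proxied, 403 = denied

assert handle_request("valid-workload-jwt") == 200          # baseline-shaped case
assert handle_request("valid-workload-jwt", 2858.0) == 403  # substitution-shaped case
```

Note that the denial in the sketch, as in the experiment, is produced by the gateway itself; the backend is never reached on the 403 path.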
Two tokens are in play throughout. The workload JWT (agent identity) is issued once before the experiment begins, stays valid across all phases, and represents the authenticated software agent — this token is green in every scenario. The attestation JWT (model identity) is issued per-measurement by the production attestation issuer and encodes whether the running model matches the enrolled model — this token turns red when the model changes.
Figure 1. Gateway request flow with workload and model identity tokens. The gateway runs six checks; proxy and deny are the two outcomes. The workload JWT (step 1) remains valid in every scenario. The attestation JWT (step 4) encodes the structural measurement result and flips from match to mismatch when the model changes.
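The two-token distinction can be illustrated with a stdlib-only sketch. The production issuer signs RS256 JWTs via fallrisk_attest.py; for brevity this example signs HS256 with `hmac`, and the claim names are assumptions made for illustration.

```python
import base64, hashlib, hmac, json

def b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims, secret):
    # Minimal HS256 JWT for illustration only; the production issuer uses RS256.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{b64url(mac.digest())}"

def verify_jwt(token, secret):
    header, payload, sig = token.split(".")
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    if not hmac.compare_digest(b64url(mac.digest()), sig):
        return None  # bad signature
    return json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))

secret = b"demo-secret"
# Workload JWT: issued once, stays valid across every phase (always green).
workload_jwt = sign_jwt({"sub": "inference-agent", "type": "workload"}, secret)
# Attestation JWT: issued per measurement; flips when the model changes (red).
attestation_jwt = sign_jwt({"identity_match": False, "confidence": "low"}, secret)

assert verify_jwt(workload_jwt, secret)["sub"] == "inference-agent"
assert verify_jwt(attestation_jwt, secret)["identity_match"] is False
```

The design point the sketch captures: both tokens verify cryptographically, but only the attestation JWT carries claims that can change when the model behind the endpoint does.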
3. Three Scenarios
Scenario A: Same-Family Substitution Behind a Stable Endpoint
The enrolled model is Llama-3.1-8B-Instruct. The substituted model is DeepSeek-R1-Distill-Llama-8B — a model distilled onto the same Llama architecture with the same parameter count and the same API contract.
The gateway process stays alive throughout. The workload JWT is the same JWT. The health endpoint returns 200 in both cases. The policy hash is unchanged.
Before substitution (A1): The gateway measures the model, finds 0.0×ε distance from the enrolled anchor, issues an attestation with identity_match: true and confidence: high, OPA allows, the request is proxied to the model server, and HTTP 200 is returned.
After substitution (A2): The gateway measures the model, finds 2,858×ε distance from the enrolled anchor, issues an attestation with identity_match: false and confidence: low, OPA denies with reason model_identity_mismatch, and HTTP 403 is returned directly from the gateway. The request never reaches the model server.
| Control | Before (A1) | After (A2) | Status |
|---|---|---|---|
| Gateway PID | 1639 | 1639 | Unchanged |
| Workload JWT | PASS | PASS | Unchanged |
| Health check | 200 | 200 | Unchanged |
| Policy hash | f061...bb2b | f061...bb2b | Unchanged |
| Attestation match | true | false | Changed |
| OPA verdict | allow | deny | Changed |
| HTTP response | 200 | 403 | Changed |
| Proxied to backend | yes | no | Changed |
Every tested workload and endpoint identity control remained green. Only the model identity layer turned red.
Scenario B: Cross-Family Substitution with Artifact Hash Integrity
The enrolled model remains Llama-3.1-8B-Instruct from Scenario A. The substituted model is Qwen-2.5-7B-Instruct — a different architecture family, different tokenizer, different training lineage.
Both models' weight files are verified against build manifests containing SHA-256 hashes of every artifact. Both manifests pass. Llama: 10 files, all hashes match. Qwen: 10 files, all hashes match. Traditional artifact integrity verification sees nothing wrong with either model.
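The manifest check both models pass can be sketched with stdlib `hashlib`. The manifest layout used here (filename mapped to hex digest) is an assumption for illustration, not the schema of the published `manifest_*.json` files.

```python
import hashlib, os, tempfile

def sha256_file(path, chunk_size=1 << 20):
    # Stream the file so large weight shards never fully load into memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

def verify_manifest(manifest, root):
    # True iff every listed artifact's SHA-256 digest matches its manifest entry.
    return all(sha256_file(os.path.join(root, name)) == digest
               for name, digest in manifest.items())

# Demo: record a tiny stand-in "weight file" in a manifest, then verify it.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "model-00001.safetensors")
    with open(path, "wb") as f:
        f.write(b"not real weights")
    manifest = {"model-00001.safetensors": sha256_file(path)}
    file_level_ok = verify_manifest(manifest, root)

assert file_level_ok  # the files check out; the computation is untested
```

This is exactly the evidence class the scenario probes: the check passes for the substituted model too, because it only ever sees bytes on disk.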
The structural identity measurement rejects the substituted model at 3,416×ε — exceeding Scenario A's same-family rejection of 2,858×ε, as expected for cross-family pairs. The signed attestation carries identity_match: false. OPA denies.
The manifests verified the files. They did not verify the computation.
Scenario C: Silent API Rotation
The enrolled model is gpt-4.1-mini served through the OpenAI API. The rotated model is gpt-4.1-nano. Both use the same API key, the same endpoint base, the same authentication, the same billing account. The provider authenticates both models identically.
Each model is enrolled across three independent sessions. Per-model thresholds are computed from maximum cross-session self-distance with a safety margin. Self-verification accepts both models against their own enrollment. Cross-verification rejects: the mean per-prompt distance of 5.66 exceeds both per-model thresholds (3.79 and 1.16).
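The threshold rule described above reduces to a few lines. The safety-margin factor and function names are illustrative assumptions; the distances are the values reported for this scenario.

```python
def per_model_threshold(self_distances, safety_margin=1.2):
    # Threshold = maximum cross-session self-distance times a safety margin.
    # The margin value here is an assumption, not the production constant.
    return max(self_distances) * safety_margin

def accepts(distance, threshold):
    return distance <= threshold

# Reported values from Scenario C:
thr_mini, thr_nano = 3.79, 1.16  # per-model thresholds
cross_distance = 5.66            # mean per-prompt distance between the two models

assert not accepts(cross_distance, thr_mini)  # rotation rejected vs. mini anchor
assert not accepts(cross_distance, thr_nano)  # rotation rejected vs. nano anchor
```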
The API key authenticated both models. The structural measurement distinguished them.
4. Validation Gates
Before any scenario executes, two validation gates confirm measurement engine integrity:
V1 (self-verification): Qwen-2.5-0.5B-Instruct enrolled and immediately re-measured. Distance: 0.0×ε. τmean = 0.1201, within canonical range (0.02–0.15), cross-seed CV = 0.081. The measurement produces a zero-distance self-match and a structurally consistent fingerprint.
V2 (cross-family rejection): Llama-3.2-1B-Instruct measured against the Qwen anchor. Distance: 2,652×ε. The measurement produces a decisive cross-family rejection.
Both gates passed before any experimental scenario was executed.
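The two gates reduce to simple numeric predicates. A sketch follows, with the reported V1 and V2 values plugged in; defining the cross-seed CV as stdev/mean, the CV limit, and the per-seed τ values are assumptions made for illustration.

```python
import statistics

def gate_v1(self_distance, tau_mean, per_seed_taus,
            tau_range=(0.02, 0.15), cv_limit=0.15):
    # V1: zero self-distance, tau_mean in the canonical range, low cross-seed CV.
    # CV = stdev/mean is an assumption about how the reported statistic is defined.
    cv = statistics.stdev(per_seed_taus) / statistics.mean(per_seed_taus)
    return (self_distance == 0.0
            and tau_range[0] <= tau_mean <= tau_range[1]
            and cv <= cv_limit)

def gate_v2(cross_family_distance_eps):
    # V2: a different model must land above the 1.0 x epsilon acceptance bound.
    return cross_family_distance_eps > 1.0

# Illustrative per-seed taus chosen to give a small cross-seed CV:
taus = [0.110, 0.118, 0.125, 0.127]
assert gate_v1(self_distance=0.0, tau_mean=0.1201, per_seed_taus=taus)  # V1 passes
assert gate_v2(2652.0)                                                  # V2 passes
```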
5. Timing
Model loading is infrastructure overhead — 34–45 seconds to download and shard an 8B model. This is not verification latency. With the model already loaded and serving:
| Phase | Measurement | Gateway Latency |
|---|---|---|
| A1 (baseline) | 6.71s | 6.76s |
| A2 (substituted) | 6.66s | 6.70s |
| B (cross-family) | 5.66s | — |
OPA policy evaluation: < 1ms. The measurement dominates.
This measurement is not inline on every user request. It runs at model load, on a periodic schedule, or as an out-of-band health check — the same deployment pattern as certificate rotation or container attestation refresh. The per-request path carries the cached attestation JWT, not the measurement itself.
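That deployment pattern (measure out-of-band, serve a cached attestation on the per-request path) can be sketched as a TTL cache. The class name, TTL value, and claim structure are illustrative assumptions, not the production design.

```python
import time

class AttestationCache:
    # Re-measure only when the cached attestation has expired; the per-request
    # fast path reads the cache instead of paying the ~6s measurement cost.
    def __init__(self, measure_fn, ttl_seconds=300.0):
        self.measure_fn = measure_fn
        self.ttl = ttl_seconds
        self._claims = None
        self._issued_at = 0.0

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        if self._claims is None or now - self._issued_at > self.ttl:
            distance = self.measure_fn()                  # slow path: measure
            self._claims = {"identity_match": distance <= 1.0}
            self._issued_at = now
        return self._claims                               # fast path: cached

calls = 0
def fake_measure():
    global calls
    calls += 1
    return 0.0

cache = AttestationCache(fake_measure, ttl_seconds=300.0)
cache.get(now=0.0)    # first request: measure
cache.get(now=10.0)   # within TTL: serve cached claims
cache.get(now=400.0)  # expired: re-measure
assert calls == 2
```

A substitution performed between refreshes is caught at the next scheduled measurement, so the TTL bounds the exposure window.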
6. Platform
Runtime: PyTorch 2.4.1+cu124, CUDA 12.4
Contract: d=64, k=32, seeds=(42, 123, 456, 789), ε=1.003×10⁻⁴
Gateway: FastAPI/uvicorn on port 9090
Model server: FastAPI/uvicorn on port 8080
Attestation: production fallrisk_attest.py (RS256-signed JWTs)
Policy: production policy.rego logic (hash: f0610f29e279bb2b)
Distances throughout this note are expressed as multiples of ε, the acceptance threshold derived from measurement-platform numerical precision. A distance of 0.0×ε indicates an identical model; any distance above 1.0×ε indicates a different model.
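Expressed as code, the ε-normalization and acceptance rule above is simply:

```python
EPSILON = 1.003e-4  # acceptance threshold from measurement-platform precision

def distance_in_epsilon(raw_distance):
    # Express a raw measured distance as a multiple of epsilon.
    return raw_distance / EPSILON

def same_model(raw_distance):
    # Accept iff the distance does not exceed 1.0 x epsilon.
    return distance_in_epsilon(raw_distance) <= 1.0

assert same_model(0.0)                 # identical model: 0.0 x epsilon
assert not same_model(2858 * EPSILON)  # Scenario A substitution: 2,858 x epsilon
```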
7. What This Does and Does Not Establish
This experiment establishes that artifact integrity, workload identity, and API authentication can remain valid while runtime model identity changes — and that a structural identity measurement, integrated into a standard gateway enforcement chain, detects the change and produces an enforceable policy decision. Three substitution scenarios were tested — same-family, cross-family, and API rotation — and all three were detected with zero false accepts.
This experiment does not establish that existing identity controls are unnecessary. Workload identity, artifact integrity, and API authentication are real and necessary controls. The finding is that they occupy a different evidence class than model identity and do not answer the same question. Artifact integrity and workload identity did not establish runtime model identity in these scenarios.
8. Evidence Artifacts
The complete machine-readable evidence is published alongside this note:
cat3_results.json — structured results for all three scenarios, including the full before/after evidence chain for Scenario A
manifest_authorized.json — SHA-256 manifest for the enrolled model (10 files, all verified)
manifest_substituted.json — SHA-256 manifest for the substituted model (10 files, all verified)
References
[1] A. R. Coslett, "The δ-Gene: Inference-Time Physical Unclonable Functions from Architecture-Invariant Output Geometry," 2026. DOI: 10.5281/zenodo.18704275
[2] A. R. Coslett, "Template-Based Endpoint Verification via Logprob Order-Statistic Geometry," 2026. DOI: 10.5281/zenodo.18776711
[3] A. R. Coslett, "Composable Model Identity: Formal Hardening of Structural Attestations in the Enterprise Identity Stack," 2026. DOI: 10.5281/zenodo.19099911
[4] A. R. Coslett, "Post-Hoc Disclosure Is Not Runtime Proof: Model Identity at Frontier Scale," 2026. DOI: 10.5281/zenodo.19216634
[5] A. R. Coslett, "Agent Identity Is Not Model Identity," 2026. DOI: 10.5281/zenodo.19240883
[6] A. R. Coslett, "What Counts as Proof? Admissible Evidence for Neural Network Identity Claims," 2026. DOI: 10.5281/zenodo.19058540
Acknowledgments
Portions of this research were developed in collaboration with AI systems that served as co-architects for experimental design, adversarial review, and manuscript preparation. All scientific claims, experimental designs, measurements, and editorial decisions remain the sole responsibility of the author. Experiments were conducted on Google Colab using NVIDIA A100-SXM4-80GB GPUs.
Author's Disclosure
Anthony Ray Coslett is the founder of Fall Risk AI, LLC, which holds the provisional patents listed below. The structural identity measurement described in this paper operates within the scope of that intellectual property. No external funding was received for this research.
Patent Disclosure
U.S. Provisional Patent Applications 63/982,893, 63/990,487, 63/996,680, and 64/003,244 are assigned to Fall Risk AI, LLC.