Paper IX · March 2026

Composable Model Identity: Formal Hardening of Structural Attestations in the Enterprise Identity Stack

Abstract

Enterprise identity systems can authenticate workloads, credentials, and attested platforms, but they do not close the composition layer where runtime model identity enters authorization. A token can verify that a service is running in a trusted environment, that its credentials are valid, and that its actions are authorized — without ever establishing which neural network is actually computing. When model identity evidence is inserted into standard authorization flows, new security properties emerge that are not inherited from the underlying protocols and must be formally established rather than presumed.

This paper presents a live integration architecture for model-identity attestations in JWT and SPIFFE-style token flows, grounded in real measurements from six neural networks executed inside an NVIDIA H100 Confidential Computing enclave. It formally verifies four composition properties — non-separability, temporal binding, issuer authenticity, and reference integrity — across three Coq proof files with zero unfinished proof obligations. Every remaining trust dependency is explicitly named, traced to an integration control, and paired with a concrete falsification witness. The result is a formally hardened composition layer where no security property is left implicit and no assumption is left silent.

1. The Composition Problem

Current enterprise identity standards solve three problems well. OAuth 2.0 manages delegated authorization: an agent can act on behalf of a user without possessing the user's credentials. SPIFFE issues cryptographic workload identities: a service can prove it is running in the expected environment with the expected software. SCIM manages identity lifecycle: accounts are provisioned, updated, and revoked across systems. These protocols compose effectively with each other. An OAuth token can carry a SPIFFE-attested workload identity. A SCIM-provisioned account can hold OAuth credentials. The enterprise identity stack is mature, interoperable, and well understood.

What it does not do is verify which model is computing inside the workload. A neural network model can be substituted with a different model, distilled from an unauthorized source, or silently modified through continued training. When this happens, every credential, workload identity, and authorization token remains valid. The OAuth token still authorizes the agent. The SPIFFE identity still attests the workload. The SCIM record still shows the expected deployment. Only the model has changed, and no layer of the existing stack detects it.

This is not a hypothetical gap. In February 2026, Anthropic disclosed that multiple organizations had conducted industrial-scale distillation campaigns against its models, using approximately 24,000 fraudulent accounts to generate over 16 million exchanges for the purpose of training unauthorized copies [15]. The resulting systems could still satisfy many standard documentary and credential checks. The identity infrastructure could remain intact while the model layer changed underneath.

The question this paper addresses is therefore not whether model identity can be measured; prior work establishes that it can. It is what must be proved when that measurement is carried inside enterprise authorization tokens.

Figure 1 · Model Identity Composition Layer. Pipeline: Measurement Engine → Evidence Bundle → Claim Derivation → JWT/SVID Issuer → Relying Party → Access Decision. The first four stages form the model-identity composition layer contributed by this paper; the relying party and access decision belong to the traditional stack. Green: formally hardened. Gray: standard enterprise infrastructure (OAuth, SPIFFE, SCIM).

Prior work in this series established three results that this paper does not re-prove. First, neural networks carry a measurable structural identity — a compact fingerprint derived from the model's internal computation — that distinguishes individual models rather than merely their files. A weight file documents a container; the structural fingerprint identifies what appears when that container is loaded and begins computing. Second, that identity can be bound to runtime attestation inside hardware-secured environments and is not reducible to ordinary artifact documentation. Third, different kinds of identity evidence — structural (derived from model internals during computation), functional (derived from the model's input-output behavior), and thermodynamic (derived from universal statistical properties of neural network outputs) — answer different questions and cannot be substituted for one another.

This paper begins from those results and asks the next question: what must be proved when structural model identity is carried inside enterprise authorization tokens? Once a model-identity claim is placed in a standard token, the composition itself becomes a security object. Properties that the underlying protocols were not designed to provide — non-separability of the claim from its authorization context, temporal binding between evidence freshness and token lifetime, issuer authenticity for the attestation service, and integrity of the off-token evidence bundle — must be formally established rather than inherited.

2. The Live Integration Architecture

The integration architecture presented here was exercised on March 17, 2026, inside a GCP Confidential VM running an NVIDIA H100 GPU in Confidential Computing mode with Intel TDX CPU-level attestation. This paper focuses on structural model identity — the strongest evidence class in the taxonomy established by prior work — carried through standard authorization infrastructure where privileged runtime measurement is available. Six neural network models spanning four architecture families and parameter counts from 0.5 billion to 7 billion were measured inside the enclave.

The measurement engine — running within the hardware-attested trust boundary — processed standardized challenge inputs through each model and observed the model's internal activation geometry to produce a 64-dimensional structural fingerprint. The fingerprint is not a hash of the model's weight file. It is a statistical summary of how the model's internal representations behave during actual computation: each of its 64 dimensions captures a geometric statistic of how activation patterns relate to one another across a diverse set of input contexts. Two models built from the same architecture template, trained on different data or for different durations, produce different activation geometries and therefore different fingerprints. Conversely, the same model measured repeatedly produces the same fingerprint within measurement noise. The fingerprint requires running the model; it cannot be extracted from static artifacts alone. Prior work in this series [1–4] established these properties across 23 models spanning 16 architecture families with zero false acceptances across over 1,000 pairwise comparisons.
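The role of the full 64-dimensional vector, as opposed to any scalar summary, can be illustrated with a toy numerical sketch. The values and the decision threshold below are synthetic and illustrative, not measurements from the paper:

```python
import math
import random

# Synthetic illustration: two 64-dimensional "fingerprint" vectors whose
# scalar means nearly coincide but whose full vectors are clearly distinct.
random.seed(0)
fp_a = [random.gauss(0.116, 0.05) for _ in range(64)]
fp_b = [random.gauss(0.116, 0.05) for _ in range(64)]

mean_gap = abs(sum(fp_a) / 64 - sum(fp_b) / 64)  # small: means can nearly collide
vector_gap = math.dist(fp_a, fp_b)               # large: full vectors are far apart

# An identity decision compares full vectors against a calibrated threshold;
# the threshold here is illustrative, not a value from the paper.
same_model = vector_gap < 0.1
print(f"mean gap={mean_gap:.4f}  vector gap={vector_gap:.4f}  same_model={same_model}")
```

The sketch mirrors the observation below that two models can share a similar fingerprint mean while remaining trivially separable on the full vector.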

Model | Family | Parameters | Structural Identity (mean)
Qwen-2.5-0.5B | Qwen | 0.5B | 0.134
Llama-3.2-1B | Llama | 1B | 0.1160
Qwen-2.5-1.5B | Qwen | 1.5B | 0.133
Gemma-2-2B | Gemma | 2B | 0.359
Llama-3.2-3B | Llama | 3B | 0.1163
Mistral-7B | Mistral | 7B | 0.105

The values shown are the mean of each model's 64-dimensional fingerprint vector, computed from the activation geometry observed during measurement. While the scalar mean provides a compact summary, the full fingerprint is a 64-dimensional vector — and it is the full vector, not the mean, that distinguishes models. Two models may share similar means (as Llama-1B and Llama-3B do at 0.1160 and 0.1163) while differing substantially across the other 63 dimensions.

Across the six-model dataset (1,536 total measurements), the measurement process produced zero execution failures and no fingerprint collisions. Each model's full 64-dimensional fingerprint digest was unique. Models from the same architecture family — Qwen-0.5B and Qwen-1.5B, Llama-1B and Llama-3B — produced entirely distinct fingerprints despite sharing tokenizer, architecture, and training lineage. Structural identity is individual, not familial.

The measurement produces an evidence bundle containing the fingerprint, a cryptographic hash of the model's weight files, hardware attestation tokens from both the CPU enclave and the GPU, and a binding value that ties the fingerprint to the specific attestation session. This bundle is the raw evidence. It is too large and too sensitive to carry inside an authorization token. Instead, a compact model-identity claim is derived from the bundle and carried as a namespaced custom claim inside a standard JWT. The claim contains digests of the fingerprint and the full bundle — not the raw values — along with a reference to the stored bundle for relying parties that require deeper audit, a timestamp indicating when the measurement was performed, and a separate evidence-freshness deadline that tracks when the measurement itself becomes stale, independent of the token's own expiry. The same claim structure fits inside a SPIFFE JWT-SVID for workload-to-workload identity at Layer 7.
The only change is the subject and audience fields, which use SPIFFE trust-domain URIs instead of application identifiers. The model-identity payload is identical. Standard OAuth and SPIFFE infrastructure passes the claim through unchanged — no protocol modifications are required.
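The derivation step can be sketched as follows. All field names (`fp_digest`, `bundle_digest`, `evidence_ref`, `measured_at`, `evidence_exp`, `model_identity`) and the evidence-store URI are illustrative assumptions, not a normative schema from the paper:

```python
import hashlib
import json
import time

def derive_claim(fingerprint: list, bundle: dict,
                 evidence_ref: str, freshness_secs: int = 3600) -> dict:
    """Derive a compact model-identity claim from a raw evidence bundle."""
    fp_bytes = json.dumps(fingerprint).encode()
    bundle_bytes = json.dumps(bundle, sort_keys=True).encode()
    now = int(time.time())
    return {
        "fp_digest": hashlib.sha256(fp_bytes).hexdigest(),        # digest, not the raw vector
        "bundle_digest": hashlib.sha256(bundle_bytes).hexdigest(),  # commits to bundle content
        "evidence_ref": evidence_ref,        # pointer to the stored bundle for deep audit
        "measured_at": now,                  # when the measurement was performed
        "evidence_exp": now + freshness_secs,  # freshness deadline, independent of token exp
    }

claim = derive_claim([0.134] * 64,
                     {"weights_hash": "placeholder", "gpu_attestation": "placeholder"},
                     "evidence-store://bundle/123")

# The claim rides as a namespaced custom claim in an otherwise standard JWT payload.
jwt_payload = {"iss": "https://attestation.example", "aud": "relying-party",
               "exp": int(time.time()) + 86400, "model_identity": claim}
```

For the SPIFFE JWT-SVID variant, only `iss`/`aud`-style subject and audience fields would change to trust-domain URIs; the `model_identity` payload is carried unchanged.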

3. The Security Perimeter

When a model-identity claim enters an authorization token, ten security properties must hold for the composition to be trustworthy. These properties are not all novel — some are inherited from standard JWT verification, some are traced to existing cryptographic infrastructure, and some are new to the model-identity layer. The contribution of this paper is to make every property explicit, classify each one, and close the ones that are novel.

Four proved. Three traced to existing standards or prior formal work. One implemented by standard JWT mechanisms. Two constrained by design. No property is left unnamed. No assumption is left silent.

Property | Statement | Status | Source
Non-separability | Claim cannot migrate across authorization contexts without re-attestation | PROVED | ComposableIdentity.v A3
Temporal validity | Freshness checking is necessary, not optional; expiry-only verification inherits stale-evidence vulnerability | PROVED | ComposableIdentity.v B4
Issuer authenticity | Accepted token traces to authorized issuer via the specific signing key | PROVED | IssuerAuthenticity.v T1
Reference integrity | Bundle modification between issuance and audit is detectable via digest mismatch | PROVED | ReferenceIntegrity.v T1, T3
Unforgeability | Valid claim requires genuine measurement in genuine enclave | TRACED | Prior impossibility proofs + TEE chain
Non-replayability | Claim from one session cannot be replayed for another | TRACED | bind_root freshness + JWT jti
Evidence-class correctness | Claim declares its evidence class; relying party checks class against required level | TRACED | Prior admissibility condition
Audience binding | Token scoped to specific relying party | IMPLEMENTED | JWT aud (RFC 7519)
Claim minimality | Token carries digests and references, not raw values | DESIGN | Token schema + redaction policy
Verification soundness | Relying party executes declared verification sequence without omitting required checks | DESIGN | Relying-party implementation

4. The Formal Model

The proof suite consists of three Coq files, each targeting a specific trust boundary of the composition layer. Coq is a proof assistant that mechanically checks every logical step in a proof and rejects any argument with unfinished obligations — a theorem either compiles completely or it does not compile at all. The files share a common proof culture: every axiom is a named, falsifiable assumption about the integration layer; every theorem closes with a complete proof; and every axiom is paired with a concrete falsification witness describing what kind of system would violate it and what security consequence would follow.

The first file, ComposableIdentity, models the composition of model-identity claims with authorization contexts. It introduces abstract types for contexts, claims, tokens, and time, along with relations for binding, verification, token derivation, and independent attestation. It proves eight theorems from six named axioms, using classical logic in exactly one proof where a contradiction argument requires excluded middle on an attestation predicate.

The second file, IssuerAuthenticity, models the PKI layer that connects signature verification to issuer attribution. It proves two theorems from three named axioms, chaining signature soundness through key exclusivity to issuer binding.

The third file, ReferenceIntegrity, models the temporal relationship between the evidence bundle committed at token issuance and the bundle retrieved at audit time. It uses a time-indexed storage retrieval function — the key modeling choice that distinguishes this formalization from an atemporal digest tautology. It proves three theorems from two named axioms.

Together, the three files close the four composition properties that are novel to this layer — migration resistance, freshness binding, issuer authenticity, and reference integrity — under a fully named assumption set where every axiom is paired with a falsification witness and an integration control.

5. Claims Must Not Migrate or Outlive Their Evidence

The first proof family addresses two properties that existing token standards were not designed to provide: non-separability and temporal binding.

Non-separability. A model-identity claim is bound to a specific authorization context — a combination of audience, presenter, and session binding. The migration impossibility theorem proves that if an attacker derives a new token from an existing one and the new token is accepted in a different context, then the new token must carry independent attestation for that context. The claim did not migrate — it was independently re-attested. The proof proceeds by contradiction: if the derived token is accepted but not independently attested, then the derivation-does-not-create-binding axiom implies the token is not genuinely bound, contradicting the verifier-soundness axiom that acceptance implies binding. This theorem depends on four named assumptions: that the verifier checks context binding, that acceptance implies genuine binding, that derivation alone does not create binding, and that a token binds to at most one context. Each assumption is falsifiable. A verifier that ignores the audience field violates the first. A verifier that accepts unsigned tokens violates the second. A system that treats model-identity claims as bearer credentials violates the third. A system that issues multi-audience attestation tokens violates the fourth. Each violation has a stated security consequence.
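A minimal sketch of the verifier discipline the theorem assumes is shown below. The function, field names, and the `attested_contexts` set are illustrative stand-ins for the paper's binding-enforcement and independent-attestation relations, not an implementation from the proof files:

```python
def accept(token: dict, expected_aud: str, attested_contexts: set) -> bool:
    """Context-binding check: accept only tokens bound and attested for this context."""
    # Binding enforcement: the token's audience must match the verifying context.
    if token.get("aud") != expected_aud:
        return False
    # Derivation does not create binding: a claim copied into a token for a new
    # context is rejected unless that context carries independent attestation.
    if token["aud"] not in attested_contexts:
        return False
    return True

original = {"aud": "service-a", "model_identity": {"fp_digest": "abc"}}
migrated = {"aud": "service-b", "model_identity": original["model_identity"]}

attested = {"service-a"}  # only service-a's session was independently measured
print(accept(original, "service-a", attested))  # accepted: bound and attested
print(accept(migrated, "service-b", attested))  # rejected: no re-attestation
```

The migrated token fails not because its signature or syntax is wrong but because copying the claim into a new audience never establishes binding for that audience.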

Temporal binding necessity. Token expiry and evidence freshness answer different questions. Token expiry asks whether the credential is still valid. Evidence freshness asks whether the measurement the credential refers to is still trustworthy. A model can be silently modified between the measurement time and the token's expiry. The temporal binding necessity theorem proves that any verification procedure at least as permissive as standard expiry-only checking inherits a specific vulnerability: it will accept at least one token whose evidence is stale. Being strictly more restrictive — checking both token expiry and evidence freshness — is not a best practice. It is a formal necessity. The proof depends on one named assumption: that there exist deployments where token lifetime exceeds evidence freshness. This is grounded in the real integration: a model measured once at deployment with a one-hour freshness window, carried in a token refreshed on a 24-hour cycle. At hour two, the token is valid but the evidence is stale.
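The vulnerability class can be made concrete with the paper's own numbers (one-hour evidence freshness, 24-hour token lifetime). The two verifier functions and the `evidence_exp` field name are illustrative sketches, not code from the proof suite:

```python
def expiry_only(token: dict, now: int) -> bool:
    """Standard verification: checks only the token's own expiry."""
    return now < token["exp"]

def dual_check(token: dict, now: int) -> bool:
    """Strictly more restrictive: checks token expiry AND evidence freshness."""
    return now < token["exp"] and now < token["model_identity"]["evidence_exp"]

issued = 0
token = {"exp": issued + 24 * 3600,                       # 24-hour token lifetime
         "model_identity": {"evidence_exp": issued + 3600}}  # 1-hour evidence freshness

two_hours = issued + 2 * 3600
print(expiry_only(token, two_hours))  # True: valid token hides stale evidence
print(dual_check(token, two_hours))   # False: the freshness deadline has lapsed
```

At hour two, the expiry-only verifier accepts a token whose underlying measurement may no longer describe the running model; the dual check is what the theorem proves necessary.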

6. Accepted Claims Must Come From the Right Issuer and the Right Bundle

The second proof family addresses two trust boundaries that would otherwise leave the composition feeling decorative rather than secure.

Issuer authenticity. The issuer authenticity theorem proves that if a token passes signature verification, there exists an authorized issuer who holds the specific key that signed the token, and the token was issued by that issuer. The signing key, the issuer, and the attribution are all linked — there is no gap where "a valid key exists somewhere" but cannot be connected to the specific token. The proof chains through three axioms: signature verification implies a signing key exists, the key is either authorized or verification fails, and an authorized key identifies its issuer.
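The chain can be sketched as a key-id lookup against a trust store. The function, the trust-store shape, and the key id are illustrative; real deployments would resolve keys via the issuer's published key set rather than an in-memory dict:

```python
# Hypothetical trust store: key exclusivity means each key id maps to exactly
# one authorized issuer, so attribution is unambiguous.
TRUST_STORE = {
    "key-2026-03": "https://attestation.example",
}

def attribute_issuer(header_kid: str, signature_valid: bool):
    """Return the issuer a verified token is attributable to, or None."""
    if not signature_valid:            # no attribution without a sound signature (IA1)
        return None
    issuer = TRUST_STORE.get(header_kid)
    if issuer is None:                 # unauthorized keys fail verification (IA3)
        return None
    return issuer                      # the authorized key identifies its issuer (IA2)

print(attribute_issuer("key-2026-03", True))  # attributed to the issuer URL
print(attribute_issuer("rogue-key", True))    # None: key not in trust store
```

The point of the theorem is that all three links hold at once: the gap where "a valid key exists somewhere" but cannot be tied to this token and this issuer is closed.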

Reference integrity. The integration architecture carries compact claims inside tokens, not full evidence bundles. The evidence_ref field points to the stored bundle; the bundle_digest field commits to its content at issuance time. The substitution detection theorem proves that if the bundle retrieved at audit time differs from the bundle that was stored at issuance time, the digest check against the token's committed value fails. The proof uses a time-indexed storage model — the bundle at reference r at time t — which distinguishes issuance-time content from audit-time content. An atemporal model would collapse this distinction and make the theorem trivially true. The time-indexed model makes it operationally meaningful: the theorem captures the real scenario where an attacker modifies the stored evidence after the token was issued. The converse is also proved: if the digest check passes at audit time, the bundle at audit time is the bundle at issuance time. The store is intact. This is the direction that matters most to the relying party.
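The time-indexed check can be sketched directly: the store's content at a reference can change between issuance and audit, and the committed digest is what detects it. Store layout and field names are illustrative:

```python
import hashlib
import json

store = {}  # reference -> bundle content, mutable over time (the time index)

def bundle_digest(bundle: dict) -> str:
    """SHA-256 over a canonical serialization of the bundle (assumes RI1)."""
    return hashlib.sha256(json.dumps(bundle, sort_keys=True).encode()).hexdigest()

# Issuance time: store the bundle and commit its digest into the token (RI2).
store["bundle/123"] = {"fingerprint_mean": 0.134, "weights_hash": "aa11"}
committed = bundle_digest(store["bundle/123"])

# An attacker modifies the stored evidence after the token was issued.
store["bundle/123"]["weights_hash"] = "ee99"

# Audit time: recompute the digest over the bundle retrieved now.
audit_ok = bundle_digest(store["bundle/123"]) == committed
print(audit_ok)  # False: substitution detected by digest mismatch
```

The converse direction is the one the relying party leans on: when the recomputed digest matches the committed value, the audit-time bundle is the issuance-time bundle.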

7. Assumption Closure

Every theorem in this paper rests on named axioms. The following table connects each axiom to the integration layer, states what would violate it, and identifies the security consequence of the violation.

Axiom | Integration Control | Falsification Witness | Consequence If Violated
Binding enforcement (NA1) | JWT audience validation per RFC 7519 | Verifier skips aud check | Tokens accepted in wrong context; claim migration succeeds
Verifier soundness (NA2) | JWT signature verification per RFC 7515 | Verifier accepts unsigned tokens | Attacker forges model-identity claims
Derivation ≠ binding (NA3) | bind_root ties attestation to session | Model-identity claim treated as bearer credential | Attacker extracts claim, pastes into different token
Binding exclusivity (NA4) | Single-audience model-identity tokens | Multi-audience attestation tokens issued | Same token valid in multiple contexts; migration trivial
Attestation → binding (NA5) | Fresh measurement + bind_root + context-consistent aud/cnf | Attestation decoupled from context binding | Fresh attestation fails to establish verifiable context binding
Freshness < expiry (NB1) | Token TTL can exceed evidence freshness | Measurement done once, tokens refreshed on rolling schedule | Stale evidence hidden behind valid token
Signature soundness (IA1) | JWT signature checking per RFC 7515 | Verifier returns true without checking | Accepts unsigned tokens
Key-to-issuer binding (IA2) | JWKS serves issuer-specific key sets | Keys shared across issuers | Cannot attribute token to specific issuer
Key exclusivity (IA3) | Trust store rotation + revocation | Compromised key remains in trust store | Attacker's tokens pass verification
Collision resistance (RI1) | SHA-256 for bundle digests | SHA-256 collision (~2^128 operations) | Substituted bundle passes digest check
Issuance-time binding (RI2) | Hash-then-sign issuance discipline | Digest computed from wrong bundle | Token's committed digest mismatches actual evidence

No assumption in this paper is hidden. No trust boundary is left implicit. The reader who disagrees with an axiom can name the specific axiom, identify the specific integration control, and state the specific consequence of its violation.

8. What This Proves, What It Does Not, and Why

This paper formally verifies the composition layer where model-identity evidence enters the enterprise authorization stack. It does not formally verify the entire stack. The JWT signature algorithm is not re-proved here — it is traced to RFC 7515. The TEE attestation chain is not re-proved — it is traced to Intel TDX and NVIDIA Confidential Computing validation. The structural fingerprint's unforgeability is not re-proved — it is traced to prior impossibility results in the research series. The audience-binding mechanism is not re-proved — it is implemented by standard JWT processing.

What this paper does prove is that the new properties introduced by the composition — the ones that OAuth, SPIFFE, and SCIM were not designed to provide — are formally sound under explicitly stated assumptions. Non-separability. Temporal binding. Issuer authenticity for the attestation layer. Reference integrity across time. The remaining properties are either traced (connected to an existing standard or prior proof), implemented (using a standard mechanism), or design-constrained (enforced by the token schema). Every one is named. In the present suite, the property table has zero OPEN rows.

This boundary is not an apology. It is a design choice. Reproving SHA-256 collision resistance or JWT signature verification would not strengthen the paper — those are established results with their own verification histories. What strengthens the paper is proving the properties that did not exist before this composition was built.

9. Engineering Consequence

The formal results in this paper are not abstract guarantees. They impose concrete design obligations on any system that carries model-identity claims inside enterprise authorization tokens.

If you omit context binding — if the model-identity claim is a detachable bearer credential rather than a context-bound attestation — then claims can migrate across authorization boundaries without re-measurement. The non-separability theorem proves this is not a theoretical risk but a structural property of the composition: derivation without re-attestation cannot create genuine binding.

If you omit evidence freshness checking — if your verification procedure checks only token expiry — then stale measurement evidence can pass verification indefinitely. The temporal binding necessity theorem proves this is not a hygiene concern but a formal vulnerability class: any verifier at least as permissive as expiry-only checking inherits it.

If you leave issuer attribution implicit — if your system accepts signed tokens without connecting the signing key to an authorized issuer — then the audit trail breaks. The issuer authenticity theorem proves that the chain from signature to key to issuer must be complete, not just present.

If you leave bundle integrity unverified — if your system trusts the evidence reference without checking the digest — then the off-token evidence can be modified after issuance without detection. The substitution detection theorem proves that the time-indexed digest check is the mechanism that closes this gap.

These are not recommendations. They are consequences of the formal model. A system that violates any of them inherits the specific vulnerability that the corresponding theorem characterizes.
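The four obligations can be gathered into one relying-party verification sequence. The sketch below is illustrative: the helper signature, token fields, and trust-store shape are assumptions made for this example, with each step corresponding to one theorem:

```python
import hashlib
import json

def verify(token, now, expected_aud, trust_store, retrieve_bundle):
    """Run the four theorem-mandated checks in order; return a decision string."""
    # 1. Issuer authenticity: signature key must attribute to the claimed issuer.
    issuer = trust_store.get(token["kid"])
    if issuer is None or issuer != token["iss"]:
        return "reject: issuer attribution broken"
    # 2. Non-separability: the claim must be bound to this authorization context.
    if token["aud"] != expected_aud:
        return "reject: claim not bound to this context"
    # 3. Temporal binding: check token expiry AND evidence freshness.
    mi = token["model_identity"]
    if not (now < token["exp"] and now < mi["evidence_exp"]):
        return "reject: token expired or evidence stale"
    # 4. Reference integrity: recompute the bundle digest at audit time.
    bundle = retrieve_bundle(mi["evidence_ref"])
    digest = hashlib.sha256(json.dumps(bundle, sort_keys=True).encode()).hexdigest()
    if digest != mi["bundle_digest"]:
        return "reject: evidence bundle modified after issuance"
    return "accept"

bundle = {"weights_hash": "aa11"}
bd = hashlib.sha256(json.dumps(bundle, sort_keys=True).encode()).hexdigest()
token = {"kid": "key-1", "iss": "https://attestation.example", "aud": "rp",
         "exp": 1000,
         "model_identity": {"evidence_exp": 500, "evidence_ref": "r1",
                            "bundle_digest": bd}}
trust = {"key-1": "https://attestation.example"}

print(verify(token, 100, "rp", trust, lambda ref: bundle))  # accept
print(verify(token, 600, "rp", trust, lambda ref: bundle))  # reject: evidence stale
```

Omitting any one check reintroduces exactly the vulnerability its theorem characterizes; the order shown is a design choice, not a requirement of the formal model.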

Acknowledgments

Portions of this research were developed in collaboration with AI systems that served as assistants for formal verification sketching, adversarial review, and manuscript preparation. All scientific claims, experimental designs, measurements, formal proofs, and editorial decisions remain the sole responsibility of the author.

Patent Disclosure

The composition architecture described in this work operates within the scope of U.S. Provisional Patent Applications 63/982,893 (weights-based identity verification, filed February 13, 2026), 63/990,487 (API-based endpoint verification, filed February 25, 2026), 63/996,680 (privacy-preserving model identity verification, filed March 4, 2026), and 64/003,244 (identity-conditioned inference verification, filed March 12, 2026). All four provisional patents are assigned to Fall Risk AI, LLC.

Supplementary Material

This paper is accompanied by three Coq proof files — ComposableIdentity.v, IssuerAuthenticity.v, and ReferenceIntegrity.v — that formally verify the four composition properties described in §§4–6: non-separability, temporal binding necessity, issuer authenticity, and reference integrity. Together the files prove thirteen theorems from eleven named axioms, each paired with a concrete falsification witness and an integration control. No file contains unresolved obligations (Admitted), and all three compile cleanly under the Rocq Prover 9.1.1 (the current release of the Coq proof assistant, compiled with OCaml 5.4.0). They are available as supplementary files alongside this paper on Zenodo.

References


1. Coslett, A. R. (2026). The δ-Gene: Inference-Time Physical Unclonable Functions from Architecture-Invariant Output Geometry. Zenodo. DOI: 10.5281/zenodo.18704275

2. Coslett, A. R. (2026). Template-Based Endpoint Verification via Logprob Order-Statistic Geometry. Zenodo. DOI: 10.5281/zenodo.18776711

3. Coslett, A. R. (2026). The Geometry of Model Theft. Zenodo. DOI: 10.5281/zenodo.18818608

4. Coslett, A. R. (2026). Provenance Generalization and Verification Scaling for Neural Network Forensics. Zenodo. DOI: 10.5281/zenodo.18872071

5. Coslett, A. R. (2026). Beneath the Character: The Structural Identity of Neural Networks. Zenodo. DOI: 10.5281/zenodo.18907292

6. Coslett, A. R. (2026). Which Model Is Running? Zenodo. DOI: 10.5281/zenodo.19008116

7. Coslett, A. R. (2026). The Deformation Laws of Neural Identity. Zenodo. DOI: 10.5281/zenodo.19055966

8. Coslett, A. R. (2026). What Counts as Proof? Admissible Evidence for Neural Network Identity Claims. Zenodo. DOI: 10.5281/zenodo.19058540

9. NIST NCCoE. (2026). Accelerating the Adoption of Software and AI Agent Identity and Authorization. Concept Paper, February 2026.

10. RFC 9068. JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens.

11. RFC 7515. JSON Web Signature (JWS).

12. RFC 7800. Proof-of-Possession Key Semantics for JSON Web Tokens (JWTs).

13. SPIFFE JWT-SVID Specification. https://spiffe.io/docs/latest/spiffe-specs/jwt-svid/

14. RFC 7519. JSON Web Token (JWT).

15. Anthropic. (2026). Detecting and Preventing Distillation Attacks. Blog post, February 23, 2026. https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks

Cite this paper

A. R. Coslett, "Composable Model Identity: Formal Hardening of Structural Attestations in the Enterprise Identity Stack," Paper IX, Fall Risk AI, LLC, March 2026. DOI: 10.5281/zenodo.19099911