Abstract
Neural networks deployed behind APIs or in cloud infrastructure are often verifiable only as black boxes. zkML systems have made substantial progress on computational integrity: proving that a committed model produced a claimed output honestly. But those proofs begin from a weight commitment, and a weight commitment is not a model identity. A prover can commit to arbitrary weights, execute them honestly, and still produce a cryptographically valid proof of that computation. We present an identity-first verification framework for the missing layer beneath computational integrity. The framework composes four levels. Two are inherited: structurally attestable model fingerprints via the IT-PUF protocol [1], formally verified in Coq and validated across 23 models with zero false acceptances, and hardware-attested binding from fingerprinted identity to model weights through a trusted execution environment [4].
Two are new: a hybrid verifier-checkable computation path through a complete Transformer decoder layer, combining zero-knowledge circuit proofs with deterministic verifier-side checks under incrementally verifiable computation, and output binding from the verified computation to an observable token logit. On a tested micro-model, a one-step recurrence experiment found costs consistent with linear layer scaling: the dominant sub-computation of a second decoder layer matched the first in constraint count and proof size, and layer-boundary normalization acted as a measured scale reset. An accidental rescaling error then compressed the fingerprint observable to roughly 1.5 bits of dynamic range, yet the structural fingerprint retained 0.98 rank correlation with its reference. This suggests that the identity observable may depend more on relational geometry than on activation magnitude. Existing zkML systems address the computation question. This work advances the missing identity layer beneath it. Throughout the paper, formally proved results, empirical validation, and single measured observations are distinguished.
§1. Introduction
§1.1. The Verification Gap
When an enterprise deploys an AI model through a cloud provider, or when a regulator evaluates a model's outputs for compliance, or when a consumer interacts with a chatbot that claims to be a specific system — in each case, the same question arises: which model is actually running? The question is not academic. In February 2026, Anthropic publicly disclosed that three AI laboratories had conducted industrial-scale knowledge distillation campaigns against Claude, generating over 16 million exchanges through approximately 24,000 fraudulent accounts [6]. Opaque or automatic backend model changes in API-served AI systems are well documented: major providers use mutable model identifiers that silently update to point to new model snapshots [21], and independent measurement has shown that the behavior of the "same" LLM service can change substantially over short periods without disclosure [22]. Covert model substitution — where a provider claims to serve a specific target model but may substitute a smaller, quantized, or entirely different model — has been explicitly formalized as a security and audit threat [23]. Governance frameworks are moving toward requiring the traceability infrastructure that model identity verification would support. The EU AI Act requires versioned technical documentation, automatic logging, and identification records for high-risk AI systems (Articles 11–12, Annex IV) and registration with "unambiguous reference allowing the identification and traceability of the AI system" (Article 49, Annex VIII) [24].
The NIST Generative AI Profile recommends contracts specifying content provenance expectations, monitoring of third-party adherence to provenance standards, and detection of unauthorized changes [25]. These frameworks do not yet require cryptographic verification of the exact model serving each request — but they establish the regulatory direction toward model traceability, and the gap between documentary provenance and cryptographic verification is the gap this paper addresses. Zero-knowledge machine learning systems have closed part of this gap. Lagrange's DeepProve [7], EZKL [8], Giza [9], and Modulus [10] have demonstrated that neural network inference can be verified in zero knowledge — enabling a prover to convince a verifier that a committed model produced a specific output without revealing the model's weights. This addresses computational integrity: the guarantee that the computation was performed honestly on the committed weights. But computational integrity begins from a weight commitment. And a weight commitment is not a model identity. A weight commitment proves: "computation used weights \(W\)." It does not prove: "weights \(W\) belong to the model trained by organization \(X\) on dataset \(Y\) with provenance \(Z\)." The gap is exploitable. An adversary can commit to arbitrary weights, prove honest computation on those weights, and produce a valid zkML proof — valid in the cryptographic sense, but meaningless in the trust sense, because the model behind the commitment is not the model the verifier believes it to be. This paper addresses the layer beneath computational integrity: structural identity.
§1.2. The Identity-First Thesis
We propose that trustworthy zero-knowledge machine learning requires two complementary layers, not one:

1. Structural identity. A proof that the model is the specific model it claims to be — not merely that some set of weights was used, but that those weights constitute the model with a specific, measurable, unforgeable fingerprint.
2. Computational integrity. A proof that the identified model computed honestly on a given input.

Neither layer alone is sufficient. Computational integrity without identity is verification of an unknown. Identity without computational integrity is attestation without accountability. Together, they answer: "which specific model computed this, and did it compute honestly?" Mapping structural identity to organizational ownership (whose model is it?) requires an external registry or certificate chain — a deployment concern, not a cryptographic one. To our knowledge, the structural identity layer does not yet exist in current zkML systems (see Table 1 in §2.1 for a property comparison of surveyed systems). This paper advances it.
§1.3. How This Paper Extends Papers 1–5
This work is the sixth in a series that developed the scientific and engineering foundations for neural network structural identity. Papers 1–3 [1, 2, 3] established the core science: a structural fingerprint observable (the δ-gene), a challenge-response authentication protocol (the IT-PUF), API endpoint verification, distillation forensics, and adversarial erasure resilience. The mathematical foundation is formally verified in Coq — 352 theorems across 17 proof files, zero Admitted [PROVEN]. The empirical foundation spans 23 models with zero false acceptances across 1,012 pairwise comparisons [VALIDATED]. (Throughout this paper, [PROVEN] denotes machine-checked theorems in Coq; [VALIDATED] denotes empirical laws confirmed across multiple experiments; [MEASURED] denotes single observations.) Paper 4 [4] defined the three-tier zero-knowledge attestation architecture and the Trust Paradox. The first two tiers are validated; the third was proposed as an open problem. Paper 5 [5] sharpened the philosophical implications, establishing the Two-Layer Identity (structural vs. functional) and its connection to the philosophical discourse on AI selfhood. Paper 6 carries structural identity into verifier-checkable computation. Where Papers 1–5 answered "can we measure and attest model identity?", this paper answers "can we verify computation under an identity-attested model through a decoder-layer slice, and connect the result to an observable output?"
§1.4. Contributions
This paper makes three architectural contributions and one methodological finding:

1. We demonstrate a hybrid verifier-checkable computation path through one full Transformer decoder layer of a micro-model: attention, normalization, feed-forward network, and residual connections. The architecture combines zero-knowledge circuit proofs for arithmetic operations with deterministic verifier-side checks for nonlinear operations, composed through cryptographic commitment bindings [VALIDATED: 124 negative tests, 0 failures].
2. We report a one-step recurrence experiment showing that the dominant sub-computation of a second decoder layer can be verified using the same architecture at the same cost, with layer-boundary normalization acting as a measured precision reset [MEASURED].
3. We extend the verified computation path to observable output through a token logit projection that connects the identity-attested, computation-verified hidden state to a specific claimed token — with explicit disclosure that this proves a claimed-token logit, not a full-vocabulary argmax [MEASURED].
4. Methodological contribution. During fixed-point implementation, an accidental rescaling error compressed the structural fingerprint to approximately 1.5 bits of dynamic range, yet the fingerprint maintained 0.98 rank correlation with the reference — consistent with the hypothesis that structural identity depends on relational geometry rather than activation magnitude. The error was localized through a trace-equivalence harness that caught what parametric sweeps could not [MEASURED].

This paper does not claim full-model zero-knowledge inference at frontier scale. It claims an identity-first verification architecture demonstrated on a micro-model decoder layer, with measured recurrence behavior supporting linear scaling.
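The relational-geometry hypothesis in the methodological finding can be illustrated with a toy sketch. This is not a reproduction of the measured 0.98; the 64-dimensional random vector and the 3-level quantizer below are illustrative stand-ins for the fingerprint observable. The sketch shows the two relevant facts: Spearman rank correlation is exactly invariant under any strictly monotone compression of magnitudes, and it degrades only partially under coarse (~1.5-bit) quantization.

```python
import numpy as np

def midranks(x):
    """Ranks with ties assigned the average rank of their tie group."""
    order = np.argsort(x, kind="stable")
    sx = x[order]
    r = np.empty(len(x))
    i = 0
    while i < len(x):
        j = i
        while j < len(x) and sx[j] == sx[i]:
            j += 1
        r[order[i:j]] = (i + j - 1) / 2.0  # average rank over the tie group
        i = j
    return r

def rank_corr(a, b):
    """Spearman rank correlation: Pearson correlation of the midrank vectors."""
    ra, rb = midranks(a), midranks(b)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

rng = np.random.default_rng(0)
fingerprint = rng.normal(size=64)  # stand-in for a 64-dim fingerprint vector

# A strictly monotone compression of magnitudes leaves rank structure intact.
assert abs(rank_corr(fingerprint, np.cbrt(fingerprint)) - 1.0) < 1e-12

# Coarse quantization to 3 levels (~1.5 bits) degrades ranks only partially.
lo, hi = fingerprint.min(), fingerprint.max()
quantized = np.round((fingerprint - lo) / (hi - lo) * 2)  # levels {0, 1, 2}
rho = rank_corr(fingerprint, quantized)
```

The monotone case is exact by construction; the quantized case lands well above chance but below 1.0, which is the qualitative behavior the finding describes.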
§1.5. Roadmap
Section 2 reviews the zkML landscape and model identity approaches. Section 3 formalizes the identity gap and defines the threat model. Section 4 presents the four-level framework. Section 5 describes the verification architecture. Section 6 reports experimental results. Section 7 discusses composition with existing zkML systems and why structural identity is a distinct primitive. Section 8 states limitations. Section 9 concludes.
§2. Background and Related Work
§2.1. Zero-Knowledge Machine Learning
The past three years have seen substantial progress in proving neural network inference in zero knowledge. The core pattern is: commit to model weights, execute the forward pass inside an arithmetic circuit or virtual machine, and produce a succinct proof that the output was honestly computed from the committed weights. Lagrange's DeepProve [7] has demonstrated full-model inference proofs at production scale. EZKL [8] compiles model definitions directly to proving circuits. Giza [9] and Modulus [10] provide complementary approaches to proof generation and model serving under zero-knowledge guarantees. These systems address computational integrity: the assurance that the computation was performed honestly. This is necessary, hard-won, and valuable. Our argument is not that these systems are insufficient — it is that they solve a different layer of the trust problem. What these systems have in common — and what this paper addresses — is that they begin from a weight commitment.
The prover commits to a set of weights, then proves honest computation on those weights. The verifier checks the proof and accepts the output. At no point does the protocol establish which model the weights belong to. The commitment proves the bytes were used. It does not prove the bytes are the bytes the verifier intended to verify.

Table 1. Property comparison of zkML and model-identity systems. "Artifact binding" means the proof is tied to a fixed compiled model/circuit. "Model identity (protocol-level)" means the protocol authenticates the committed weights as a specific registered model with measured false-acceptance rates, independent of the prover. "TEE binding" means the proof composes with hardware attestation. "Identity-bound output" means the verified output traces back through an authenticated model identity, not just a committed artifact.
| System | Artifact binding | Model identity (protocol-level) | TEE binding | Identity-bound output | Setup |
|---|---|---|---|---|---|
| DeepProve [7] | ✓ | — | — | — | Not confirmed from public docs |
| EZKL [8] | ✓ | — | — | Verified output, not identity-bound | KZG (trusted setup) or IPA (transparent) |
| Giza/Orion [9] | ✓ | — | — | — | STARKs (transparent) |
| Giza/LuminAIR [9] | ✓ | — | — | — | STARKs (transparent) |
| Modulus [10] | ✓ | Application-layer only | — | Application-layer only | On-chain verification |
| zkLLM [19] | ✓ | — | — | Verified output, not identity-bound | No trusted setup (Pedersen/Hyrax) |
| ZKML [20] | ✓ | — | Composition noted, not native | Verified output, not identity-bound | KZG or IPA (configurable) |
| This work | ✓ | ✓ (IT-PUF, [PROVEN]) | ✓ (TEE-attested) | ✓ (token logit, [MEASURED]) | IVC (Nova, IPA-based) + TEE |
All surveyed systems prove correct execution of a supplied model artifact. To our knowledge, none authenticates the committed weights as a specific identified model through a protocol with measured false-acceptance rates and formal spoofing impossibility, binds that identity to TEE attestation, and extends the result to an identity-bound output claim. This paper addresses that combination. The comparison reflects publicly documented properties as of early 2026; undisclosed internal capabilities may exist.
§2.2. Model Identity in Practice
The question "which model is running?" has been addressed through several mechanisms, each with known limitations. A preliminary distinction is useful: provenance is a historical and legal concept (where did this model come from?), while structural identity is a measurable and cryptographic one (is this model the specific mathematical object it claims to be?). Provenance may require documentary evidence, chain-of-custody records, or legal discovery. Structural identity requires only the model itself. Model registries and hashing. A model registry records a cryptographic hash of the model weights at deployment time. This is necessary infrastructure, but a hash is an integrity check, not an identity proof. It answers "are these the same bytes?" — not "whose bytes are these?" A stolen copy produces the same hash as the original. Watermarking. Model watermarking embeds a detectable signal during training [11, 12]. The signal can be extracted from the model's outputs to verify provenance.
Watermarking is intentional: the signal must be designed, embedded, and maintained through subsequent fine-tuning. It is fragile to certain training modifications and requires cooperation from the original trainer. Behavioral fingerprinting. Prior work has explored using a model's output statistics or response to crafted inputs to distinguish models. Cao et al. [13] fingerprint classifiers by extracting data points near the decision boundary via adversarial perturbation. Lukas et al. [14] generate "conferrable" adversarial examples that reliably transfer misclassification behavior between models, enabling black-box copy detection. Chen et al. [15] provide a testing framework (DeepJudge) that probes output behavior through carefully crafted queries. Guan et al. [16] fingerprint models by measuring sample-wise correlation between output logits on selected inputs. To our knowledge, these approaches have not yet been composed with formal authentication protocols, measured false-acceptance regimes, spoofing impossibility theorems, and zero-knowledge verification stacks of the kind studied here.
§2.3. The IT-PUF Framework
The research described in this paper builds on the Inference-Time Physical Unclonable Function (IT-PUF) framework developed in Papers 1–3 [1, 2, 3]. The analogy to a hardware PUF [17] is deliberate: in hardware, manufacturing variation produces a unique challenge-response behavior; here, training produces a weight configuration whose output-layer geometry yields a unique structural response. The core observable is the δ-gene: an order-statistic fingerprint derived from pre-softmax logit gaps. It is not a watermark inserted during training. It arises from the model's learned weight geometry and the softmax bottleneck. Prior work established four properties of this observable that matter for the present paper. Unforgeability. No model can match another model's structural fingerprint without exceeding a computable divergence budget — a claim machine-checked in Coq and supported empirically by the three-phase spoofing pattern observed in Paper 1 [PROVEN + VALIDATED]. Universality. The normalized third logit gap converges to a Gumbel-class prediction across architectures, parameter counts, and training recipes [VALIDATED: 23 models, 16 families]. Resilience. The fingerprint survives quantization, partially transfers through distillation, and resists adversarial erasure. In Paper 3 [3], a controlled study over 54 checkpoints and 13 attack variants found 1.9% coefficient of variation in the normalized third logit gap, with white-box erasure no more effective than passive continued training [VALIDATED]. Authentication. Across 1,012 pairwise comparisons spanning 23 models, the IT-PUF protocol produced zero false acceptances [VALIDATED]. These inherited results are the scientific substrate for the present paper: they are what make it meaningful to bind a computation proof not just to weights, but to an identified model.
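As a schematic of the kind of observable involved (the exact δ-gene construction is defined in Paper 1 [1] and is not reproduced here), the sketch below computes a normalized gap between consecutive ranked pre-softmax logits. The toy logit vector, the top-(k+2) window, and the normalization are illustrative assumptions; the sketch does exhibit the stability property named in the text, since a normalized gap ratio is invariant to logit shift and positive rescaling (temperature acts as a positive scale).

```python
import numpy as np

def logit_gap_observable(logits, k=3):
    """Illustrative order-statistic observable: the k-th gap between
    consecutive ranked logits, normalized by the spread of the top region.
    (Schematic only; the real delta-gene construction is in Paper 1.)"""
    top = np.sort(logits)[::-1][: k + 2]    # top-(k+2) ranked logits
    gaps = top[:-1] - top[1:]               # consecutive order-statistic gaps
    return float(gaps[k - 1] / gaps.sum())  # normalized k-th gap

rng = np.random.default_rng(1)
logits = rng.normal(size=50_000)  # stand-in for a vocabulary-sized logit vector
delta3 = logit_gap_observable(logits, k=3)

# Invariant under shift and positive scaling of the logits.
delta3_rescaled = logit_gap_observable(2.0 * logits + 5.0, k=3)
```

Because only ratios of ranked gaps enter, the observable survives exactly the deployment-configuration changes (temperature, logit offsets) that the prior papers report it as stable under.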
§2.4. Trusted Execution Environments
Trusted execution environments (TEEs) provide hardware-enforced isolation for sensitive computation. Intel TDX, NVIDIA Confidential Computing, and AMD SEV create enclaves where code and data are protected from the host operating system and hypervisor. The attestation semantics are standardized in the IETF RATS architecture [26]: an attester produces evidence about its measured state, a verifier appraises that evidence against policy, and a relying party trusts the resulting attestation results according to its own policy. The Entity Attestation Token (EAT) standard [27] defines token-level claims for freshness binding, entity identification, and nested sub-attestations. Intel TDX provides caller-controlled fields in its attestation report that can carry nonces, public keys, or hashes of larger data structures [28], enabling application-specific bindings between the TEE attestation and higher-level protocol state. The present work uses standard EAT and TDX mechanisms; specific binding field assignments are not disclosed. A critical limitation applies to all attestation: it binds signed claims about measured state and freshness at attestation time. It does not guarantee future behavior, and state can change immediately after evidence is generated [26, 29].
In our framework, the TEE attestation proves what was measured when the measurement ran, not that the same model will be serving afterward. Continuous assurance requires re-attestation — a protocol concern, not a cryptographic one. In the context of model identity, a TEE serves as a trust anchor: the structural fingerprint measurement runs inside the enclave, and the TEE attestation binds the measurement to the specific hardware, software, and weight configuration. Paper 4 [4] validated this approach on NVIDIA H100 GPUs with Intel TDX, demonstrating that the confidential computing enclave is transparent to the IT-PUF measurement — the fingerprint measured inside the enclave was identical to the fingerprint measured outside across 1,536 measurements spanning 6 models and 4 architecture families (\(L^2\) distance = 0.0) [VALIDATED]. TEEs provide hardware trust, not mathematical trust. The attestation depends on the integrity of the hardware and firmware, not on a mathematical proof. This is a deliberate architectural choice — it enables practical deployment while the pure-mathematical alternative (full zero-knowledge extraction of the structural fingerprint) remains under development at substantially higher computational cost.
§3. Problem Statement
§3.1. The Identity Gap
We define the identity gap in zkML as the distance between two statements:

Statement A (what zkML proves today): "The computation on input \(x\) using committed weights \(W\) produced output \(y\), and this computation was performed honestly."

Statement B (what trust requires): "The computation on input \(x\) used the specific model with structural fingerprint \(\tau\) — a model whose identity can be cryptographically verified — and the computation was performed honestly."

Statement A is a proof about bytes. Statement B is a proof about identity. The gap between them is the identity gap. (Mapping structural identity \(\tau\) to organizational ownership — "this model belongs to organization \(X\)" — requires an external identity registry. This paper addresses the cryptographic identity layer, not the registry.) The gap is exploitable. This is not a pathological cryptographic edge case; it is the operational trust gap that arises whenever computation is verified without first binding model identity. Consider an adversary who:

1. Obtains or trains a model \(M'\) with different properties than the contracted model \(M\).
2. Commits to \(M'\)'s weights.
3. Proves honest computation of \(M'\) on the verifier's input.
4. Presents the proof as evidence that \(M\) produced the output.

Every step in this attack produces a valid cryptographic artifact. The weight commitment is valid for \(M'\). The computation proof is valid for \(M'\). The zero-knowledge property ensures the verifier cannot inspect the weights to detect the substitution. The attack succeeds because the zkML protocol verifies computation, not identity.
§3.2. The Trust Paradox
The identity gap interacts with a second problem formalized in Paper 4 [4]: the Trust Paradox. A victim of model theft cannot prove that the adversary's model was derived from the victim's weights — because producing such proof requires disclosing the very weights at issue, which may themselves be proprietary. Conversely, an accused party cannot prove innocence without disclosing their training data and methodology, which are also proprietary. Zero-knowledge identity proof offers a path toward resolving the paradox: a model owner could prove that their model's structural fingerprint matches a published anchor without revealing the fingerprint vector itself. Paper 4 defined the architecture for this; the present paper advances its computational foundation.
§3.3. Threat Model
The framework addresses three categories of threat: Model substitution. An operator silently replaces the contracted model with a different model — cheaper, less capable, or differently aligned — while continuing to claim the original model is serving. Distillation without attribution. An adversary trains a student model on the outputs of a teacher model, acquiring capabilities without licensing the weights. The structural and functional provenance traces established in Papers 1–3 detect this when the victim has access to either the weights or the API of the suspected student [VALIDATED]. Weight theft and repackaging. An adversary obtains the weights of a model through theft or unauthorized access and deploys them under a different identity. The structural fingerprint detects identity equivalence — that the deployed weights are the same mathematical object as the registered model — regardless of how they are repackaged [PROVEN: unforgeability theorem, Coq]. Establishing that possession is unauthorized requires external evidence (licensing records, access logs) beyond the scope of this system. The framework does not address threats outside the scope of the IT-PUF threat model assumptions (ZTA1–ZTA6, defined in [4, §5.3]). In particular, the forwarding attack (routing queries to the target model in real time) and the float-spoofing attack (fabricating measurement values) are excluded by the threat model, not prevented by the protocol. These exclusions are explicit, not hidden.
§3.4. What This Paper Does and Does Not Address
This paper addresses:

- The construction of a verifier-checkable computation path from identity-attested weights to observable output.
- The scaling behavior of this construction across layers.
- The composition of structural identity with computational integrity.

This paper does not address:

- Full-model inference verification at production scale. All results are on a micro-model with 64-dimensional embeddings and approximately 147,000 parameters. Production-scale verification is an engineering extrapolation supported by measured linear scaling, not a demonstrated capability.
- Model safety, alignment, or quality. Structural identity proves which model is running. Whether the model is good is a different question.
- Privacy-preserving inference. The framework verifies computation under known weights; it does not hide the input or output from the verifier.
§4. The Identity-First Framework
The framework composes four levels. Each level produces a cryptographic artifact that the next level consumes.
§4.1. Level 1: Structural Identity
The structural fingerprint is the foundation. It answers: is this model the model it claims to be? The IT-PUF protocol [1] extracts a fingerprint from stable geometric statistics of the model's output competition structure across a set of challenge prompts. The fingerprint is not embedded during training and is not a watermark. It arises from the weight geometry and the softmax bottleneck: the pattern of probability-mass competition among ranked vocabulary entries encodes the model's architectural identity in a way that is stable across inputs, temperatures, and deployment configurations. The formal foundation is inherited from Papers 1–3: the unforgeability theorem, verified in Coq, establishes that no model can match another model's fingerprint within a given tolerance without exceeding a computable divergence budget — a result that holds at all tested model scales [PROVEN]. The empirical foundation spans 23 models, 16 vendor families, and 3 architecture types (standard Transformer, parallel Transformer, and state-space models), with zero false acceptances across 1,012 pairwise comparisons [VALIDATED].
§4.2. Level 2: Weight Binding
The structural fingerprint is a measurement. To trust the measurement, you must trust the measurer. Level 2 removes this trust requirement by placing the measurement inside a trusted execution environment. The IT-PUF measurement engine runs inside an NVIDIA H100 Confidential Computing enclave with Intel TDX attestation. The enclave provides hardware-enforced isolation: the measurement code and model weights are protected from the host operating system and hypervisor. The TEE attestation binds the measurement result — the structural fingerprint — to the specific weights and software configuration that produced it. A binding digest links the attested measurement, the weight identity, and freshness material across the attestation chain. Any modification to the weights, the measurement code, or the attestation chain invalidates the binding. As noted in §2.4, this binding is valid at attestation time; continuous assurance requires re-attestation [26]. Critically, the confidential computing enclave is transparent to the fingerprint observable: in the experiments reported in [4], the fingerprint measured inside the enclave was identical to the fingerprint measured outside across all 64 dimensions and 6 tested models, with \(L^2\) distance of exactly 0.0 and maximum absolute element-wise difference of 0.0 [VALIDATED: 1,536 measurements across 4 architecture families]. This is hardware trust, not mathematical trust. The attestation depends on the integrity of the hardware and firmware. The pure-mathematical alternative (Level 3 of Paper 4's architecture) would eliminate this dependency at substantially higher computational cost; the present paper advances its computational foundation.
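A minimal sketch of what such a binding digest could look like, assuming SHA-256 and a length-prefixed field layout. The text deliberately does not disclose the real field assignments, so every name and label below is hypothetical:

```python
import hashlib
import os

def binding_digest(fingerprint: bytes, weight_hash: bytes, nonce: bytes) -> bytes:
    """Hypothetical binding digest linking the attested fingerprint
    measurement, the weight identity, and verifier-supplied freshness
    material. Length-prefixed fields prevent ambiguous concatenation."""
    h = hashlib.sha256()
    for label, part in ((b"fp", fingerprint), (b"wt", weight_hash), (b"nn", nonce)):
        h.update(label + len(part).to_bytes(4, "big") + part)
    return h.digest()

nonce = os.urandom(16)  # freshness: prevents replay of an old attestation
d_good = binding_digest(b"fingerprint-vector", b"weight-commitment", nonce)
d_bad = binding_digest(b"fingerprint-vector", b"tampered-weights!", nonce)
```

In a TDX deployment, a digest like this could ride in a caller-controlled attestation report field [28]; any change to the weights, the measurement, or the freshness material changes the digest and invalidates the binding, which is the property the level requires.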
§4.3. Level 3: Computation Verification
Levels 1 and 2 establish which model is running. Level 3 establishes that the model computed honestly. The verified computation path covers one complete Transformer decoder layer: the full attention mechanism (input projections, score computation, softmax normalization, value aggregation, output projection), residual connections, layer normalization, and the full feed-forward network (gated projections, nonlinear activation, down projection). The layer is decomposed into sub-computations, each verified independently and composed through cryptographic commitment bindings. Two verification mechanisms are used: Proof-backed subcomputations. Operations involving linear transformations — matrix multiplications followed by fixed-point rescaling — are verified through incrementally verifiable computation (IVC). The IVC scheme produces a single constant-size proof covering the full token sequence, regardless of sequence length.
Deterministic verifier-side checks. Operations involving nonlinearities — functions requiring transcendental computation (exponentials, square roots) that would be expensive to encode in arithmetic circuits — are verified through native re-derivation. The verifier receives the committed inputs, re-executes the operation deterministically, and verifies that the output matches the prover's claim. These checks add zero proving overhead. They are trusted computation: the verifier performs the same computation the prover claims. Under the stated trust assumption — that the verifier's execution environment is not compromised — this provides an equivalent functional check. If the verifier environment cannot be trusted, verifier-side checks provide weaker assurance than zero-knowledge proofs; this is a documented and intentional trade-off. The two mechanisms are composed through commitment bindings: each sub-computation commits to its output via a cryptographic hash chain, and the subsequent sub-computation verifies the commitment before consuming the input. The full-layer composition check verifies all inter-subcomputation bindings in a single pass. On the tested micro-model (§6), the verification covers approximately 296,000 total constraints across all proof-backed subcomputations and proves in approximately 76 seconds on consumer hardware. All 124 negative tests were rejected [VALIDATED].
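The composition of the two mechanisms can be sketched as follows. The fixed-point scale, the hash-chain layout, and the softmax step are illustrative stand-ins, not the paper's actual circuit or commitment formats; the point is the pattern the text describes: each sub-computation commits to its output via a hash chain, and the verifier both checks the binding and natively re-derives the nonlinear step.

```python
import hashlib
import numpy as np

SCALE = 1 << 16  # illustrative fixed-point scale

def commit(prev: bytes, arr: np.ndarray) -> bytes:
    """Hash-chain commitment binding one sub-computation's output to the next."""
    return hashlib.sha256(prev + arr.astype(np.int64).tobytes()).digest()

def softmax_fixed(x_fp: np.ndarray) -> np.ndarray:
    """Deterministic fixed-point softmax the verifier re-derives natively."""
    x = x_fp.astype(np.float64) / SCALE
    e = np.exp(x - x.max())
    return np.round(e / e.sum() * SCALE).astype(np.int64)

# Prover side: run the nonlinear sub-computation, commit input and output.
scores_fp = np.array([3, -1, 2, 0], dtype=np.int64) * SCALE
probs_fp = softmax_fixed(scores_fp)
c_in = commit(b"genesis", scores_fp)
c_out = commit(c_in, probs_fp)

# Verifier side: check the commitment binding, then re-derive and compare.
assert commit(b"genesis", scores_fp) == c_in
assert commit(c_in, softmax_fixed(scores_fp)) == c_out
```

As the text states, the re-derivation is trusted computation: it is only as strong as the verifier's own execution environment, in contrast to the proof-backed linear sub-computations.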
§4.4. Level 4: Output Binding
Level 3 produces a verified hidden state at the output of the decoder layer. Level 4 connects this hidden state to an observable claim. A token logit projection takes the verified hidden state at the final token position, computes the dot product with one row of the language model's output head (corresponding to a claimed token), and verifies the resulting logit. A comparison gate checks whether the claimed token's logit exceeds the runner-up's logit. This is a deterministic verifier-side check, not a zero-knowledge proof. Its scope is explicitly limited: it proves the logit for a claimed token under the verified computation. It does not prove that the claimed token is the vocabulary argmax — a full-vocabulary argmax proof would require evaluating all vocabulary rows, which is feasible but not implemented. The comparison gate against a single runner-up is a practical proxy, not a mathematical guarantee of argmax.
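A sketch of the Level 4 check under toy dimensions (the vocabulary size, model width, and all variable names are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def claimed_token_check(h, W_out, claimed, runner_up):
    """Level 4 sketch: project the verified hidden state through ONE row of
    the output head, then compare against a single runner-up. This checks a
    claimed-token logit; it is not a full-vocabulary argmax proof."""
    logit_claimed = float(h @ W_out[claimed])
    logit_runner = float(h @ W_out[runner_up])
    return logit_claimed, logit_claimed > logit_runner

rng = np.random.default_rng(2)
h = rng.normal(size=64)             # verified hidden state at the final position
W_out = rng.normal(size=(100, 64))  # toy output head: vocab 100 x width 64

logits = W_out @ h                  # full projection, shown only for comparison
claimed = int(np.argmax(logits))
runner_up = int(np.argsort(logits)[-2])
ell, gate_passes = claimed_token_check(h, W_out, claimed, runner_up)
```

Only two rows of the output head enter the check itself; a true argmax proof would have to touch every vocabulary row, which the text notes is feasible but not implemented.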
§4.5. Composition
The four levels compose into a single trust chain:

1. Identity (Level 1): The model has structural fingerprint \(\tau\), unforgeable [PROVEN — inherited from Papers 1–3].
2. Binding (Level 2): Fingerprint \(\tau\) is bound to weights \(W\) via TEE attestation and a binding digest linking measurement, weight identity, and freshness [VALIDATED — inherited from Paper 4].
3. Computation (Level 3): Weights \(W\) on input \(x\) produced hidden state \(h\) through a decoder-layer slice, verified under the hybrid architecture [VALIDATED — new in this paper].
4. Output (Level 4): Hidden state \(h\) yields logit \(\ell\) for claimed token \(t\) [MEASURED — new in this paper].

Removing any level breaks the full chain-of-trust claim established by this paper. Without Level 1, you do not know which model is running. Without Level 2, you cannot trust the fingerprint measurement. Without Level 3, the output could be fabricated. Without Level 4, the verified computation has no observable consequence. (Note: removing Level 4 still leaves a valid identity-bound computation proof; what breaks is the connection to observable output.) The composition is the contribution. The zkML systems surveyed in §2.1 (Table 1) do not include a structural identity prerequisite layer beneath computational integrity in their publicly documented architectures. The framework presented here advances that layer and composes with existing computational integrity systems at the weight commitment boundary: an identity proof binds \(\tau\) to \(W\); a computation proof (ours or any zkML system's) verifies inference under \(W\).
§4.6. Inherited Results and Trust Assumptions
This paper builds on results established in the prior series. The following table makes the dependency chain explicit, now that all four levels have been introduced:
| Result | Source | Status | How it enters |
|---|---|---|---|
| Structural fingerprint uniqueness and stability | Papers 1–3, Coq (NoSpoofing.v: 51 theorems) | [PROVEN] — inherited | The identity observable in Level 1 |
| Spoofing impossibility (three-phase structure) | Paper 1 §6, NoSpoofing.v; Paper 2, APINoSpoofing.v (41 theorems) | [PROVEN] — inherited | The unforgeability claim |
| Adversarial resilience of δ_norm (54 checkpoints, CV 1.9%) | Paper 3 §6 | [VALIDATED] — inherited | Resilience in §2.3; 1.5-bit interpretation in §5.5, §6.5 |
| Zero false acceptances (1,012 comparisons, 23 models) | Paper 1 | [VALIDATED] — inherited | Authentication performance |
| TEE-transparent measurement (\(L^2 = 0.0\), 1,536 measurements) | Paper 4 | [VALIDATED] — inherited | Level 2 weight binding |
| Three-Layer Security Hierarchy | Paper 3 §7.1 | [VALIDATED] — inherited | Compositional architecture |
| Committed distance proof (Tier 1 ZK) | Paper 4 | [VALIDATED] — inherited | Architectural predecessor to Level 3 |
| Hybrid decoder-layer verification | This paper | [VALIDATED] — new | Level 3 |
| One-step recurrence | This paper | [MEASURED] — new | Scaling observation |
| Token logit projection | This paper | [MEASURED] — new | Level 4 |
| Precision-collapse resilience | This paper | [MEASURED] — new (accidental) | Extends Paper 3's evidence into extreme quantization |
Trust assumptions:

- TEE integrity. The TEE-backed path trusts Intel TDX and NVIDIA CC hardware/firmware integrity. The pure-ZK path eliminates this assumption at higher cost.
- Attestation is time-bound. TEE attestation binds state at attestation time only [26, 29]. Continuous assurance requires re-attestation.
- Verifier-side check integrity. Deterministic verifier-side checks trust the verifier's execution environment.
- External identity registry. Structural identity proves which mathematical object is running. Mapping to organizational ownership requires an external registry not specified here.
§5. Architecture
This section describes the verification architecture in technical detail. The design principles are stated; the specific per-operation decomposition — which Transformer operations map to which verification mechanism — is not disclosed, as it constitutes operational architecture.
§5.1. Composition Mechanics
Section 4.3 described the two verification mechanisms (proof-backed subcomputations and deterministic verifier-side checks). This section addresses how they compose. Sub-computations are linked through a binding map: each sub-computation commits to its output, and the subsequent sub-computation verifies the commitment before consuming the input. The binding map specifies, for each adjacent pair, which output commitment must match which input commitment. The final composition check verifies all bindings in one pass. This creates a chain of verified handoffs from the layer input to the layer output without requiring a monolithic proof.
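The final composition check can be sketched as a single pass over the binding map. This is a minimal illustration under stated assumptions: SHA-256 commitments over serialized byte strings stand in for the system's actual commitment scheme, and all names are hypothetical.

```python
import hashlib

def commit(data: bytes) -> bytes:
    # Hash commitment to a sub-computation's serialized input or output
    # (a stand-in for the system's actual commitment scheme).
    return hashlib.sha256(data).digest()

def check_bindings(subcomputations, binding_map):
    """Final composition check (sketch): for each adjacent pair in the
    binding map, the producer's output commitment must equal the consumer's
    input commitment. `subcomputations` maps a name to its
    (input_bytes, output_bytes) pair."""
    for producer, consumer in binding_map:
        out_commitment = commit(subcomputations[producer][1])
        in_commitment = commit(subcomputations[consumer][0])
        if out_commitment != in_commitment:
            return False  # a broken handoff anywhere fails the whole chain
    return True
```

A chain of verified handoffs falls out directly: if every adjacent pair matches, the layer output is linked back to the layer input without a monolithic proof.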
§5.2. The Bridge-vs-Circuit Decision
The choice between circuit proof and verifier-side check follows a principle, not a heuristic.

Use a circuit proof when the prover supplies witness values (intermediate computation results) that the verifier cannot re-derive from public inputs alone. This includes linear transformations, where the circuit enforces the arithmetic relationship between inputs, weights, and outputs under fixed-point precision constraints.

Use a verifier-side check when the computation can be deterministically re-derived from committed inputs. If the verifier can take the input commitment, re-execute the function, and confirm that the output matches, then under the trust model adopted here (verifier environment is not compromised), a circuit proof would add proving cost without improving the target guarantee. Outside this trust model, circuit proofs do add security benefit by removing the dependency on verifier-side integrity.

The scope-honest rule: verifier-side checks are always disclosed as trusted computation. They are not zero-knowledge proofs. If an adversary controls the verifier's execution environment, verifier-side checks can be corrupted. This is a documented trust assumption, not a hidden weakness.
§5.3. Fixed-Point Arithmetic
The entire computation path operates in fixed-point integer arithmetic with a model-specific scale parameter. Arithmetic soundness is enforced through fixed-point quantization with integer division and range-checked remainders. The constraint structure ensures that each rescaling operation is uniquely invertible and bounded, including for signed quantities. These arithmetic constraints are the dominant cost of the proof-backed subcomputations. The commitment hashing (binding intermediate values into the IVC state chain) is the secondary cost. The nonlinear operations handled by verifier-side checks contribute zero constraint overhead.
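The rescaling constraint can be illustrated in miniature. The sketch below models the range-checked remainder outside a circuit; the function name and the reliance on Python's floor-division semantics are assumptions of this illustration, not the paper's constraint system.

```python
def rescale(x: int, num: int, den: int):
    """Fixed-point rescale with a range-checked remainder (sketch).
    Returns (q, r) such that x*num == q*den + r with 0 <= r < den. The
    range check makes the decomposition unique and invertible, including
    for signed x (Python's // floors toward negative infinity, so the
    remainder stays in [0, den) even for negative products)."""
    prod = x * num
    q, r = divmod(prod, den)    # floor division: 0 <= r < den for den > 0
    assert 0 <= r < den         # models the in-circuit range check
    assert prod == q * den + r  # models the in-circuit product constraint
    return q, r
```

Without the range check on `r`, many `(q, r)` pairs would satisfy the product constraint for the same input, which is exactly the multiple-representation ambiguity that would undermine arithmetic soundness.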
§5.4. IVC Design Principles
Implementing incrementally verifiable computation for neural network inference surfaced design constraints that we state here as principles for practitioners:

1. Uniform step shape. Every step of the folded computation must synthesize an identical constraint system. Any step-dependent variation — in constraint count or coefficient structure — causes verification failure, because the folding scheme assumes uniform structure across all steps.
2. Decomposition uniqueness. The fixed-point encoding must admit exactly one valid decomposition per output element. Multiple valid representations for the same value undermine arithmetic soundness.
3. Terminal-state binding. The verifier must check the IVC's final state against independently computed expected values. Without this check, the prover can claim correct execution of any computation and the proof will verify — the IVC guarantees consistent folding, not correct initialization.
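Terminal-state binding can be illustrated with a toy hash-chain stand-in for the folded state. This is not Nova's folding scheme — real folding operates over committed constraint-system instances — but it shows why the verifier must recompute the terminal state independently. All names here are hypothetical.

```python
import hashlib

def fold(state: bytes, step_output: bytes) -> bytes:
    # Toy stand-in for one folding step: bind the step output into the
    # running state. A hash chain only illustrates the binding structure.
    return hashlib.sha256(state + step_output).digest()

def prove_run(initial: bytes, step_outputs) -> bytes:
    # Prover's side: fold every step and claim the terminal state.
    state = initial
    for out in step_outputs:
        state = fold(state, out)
    return state

def verify_terminal(initial: bytes, expected_outputs, claimed: bytes) -> bool:
    """Terminal-state binding (sketch): the verifier recomputes the expected
    terminal state from independently known values and compares. Without
    this check, any internally consistent chain would verify regardless of
    what computation it started from."""
    state = initial
    for out in expected_outputs:
        state = fold(state, out)
    return state == claimed
```

The failure mode the check prevents: a prover who folds a different initialization produces a perfectly consistent chain whose terminal state simply does not match the verifier's independent expectation.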
§5.5. The Trace-Equivalence Discipline
During fixed-point implementation, the fingerprint extraction registered an error that substantially exceeded the cryptographic separation threshold. The error was completely invariant to bit-width: sweeping internal precision across a range produced the identical error floor. This initially appeared to be a fundamental arithmetic limit. A trace-equivalence harness was constructed to compare every intermediate value between the floating-point reference and the fixed-point implementation, operation by operation, step by step. The harness localized the error to a single rescaling mismatch — a structural bug, not an arithmetic limitation. Once corrected, the extraction passed its separation gate across all tested models [MEASURED]. Methodological lesson. Apparent precision failure in fixed-point zero-knowledge implementations can be confounded by implementation error. The trace-equivalence harness caught what parametric sweeps could not, because the error was structural (wrong formula) rather than parametric (insufficient bits). Without this discipline, the conclusion would have been "fixed-point extraction is fundamentally insufficient" — a false scientific conclusion from a coding bug. The accidental stress condition created by the bug — and what it revealed about the fingerprint observable's resilience — is reported in §6.5.
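A trace-equivalence harness of the kind described can be sketched in a few lines. The structure below (paired operation-labeled traces, dequantize-then-compare) is an assumption about one reasonable shape for such a harness, not the paper's actual tooling.

```python
def first_divergence(ref_trace, fxp_trace, scale, tol):
    """Trace-equivalence harness (sketch): compare every intermediate value
    between the floating-point reference and the fixed-point implementation,
    operation by operation, and report the first op that diverges. Each
    trace is a list of (op_name, values) in execution order."""
    for (op_ref, v_ref), (op_fxp, v_fxp) in zip(ref_trace, fxp_trace):
        assert op_ref == op_fxp, "traces must cover the same operations"
        for a, b in zip(v_ref, v_fxp):
            # Dequantize the fixed-point value before comparing.
            if abs(a - b / scale) > tol:
                return op_ref  # localizes a structural bug to a single op
    return None  # traces agree within tolerance at every operation
```

This is why the harness catches what parametric sweeps cannot: a wrong rescaling formula diverges at the same named operation regardless of bit-width, whereas a genuine precision shortfall shrinks as `scale` grows.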
§6. Experimental Results
All results in this section were obtained on a micro-model: a standard Transformer decoder with 64-dimensional embeddings, 2 layers, 4 attention heads, a feed-forward intermediate dimension of 256, and a vocabulary of 256 tokens — approximately 147,000 total parameters. The model was initialized with a fixed random seed and is not pre-trained. The purpose is architectural validation, not language modeling performance.
§6.1. Layer 0 Verification
One complete decoder layer was verified end-to-end under the TEE-backed architecture (weights baked as circuit constants, provenance attested by TEE).
| Metric | Value |
|---|---|
| Total constraints (all proof-backed subcomputations) | ~296,000 |
| Total proving time (sequential, consumer hardware) | ~76 s |
| Total verification time | < 1 s |
| Total proof size (aggregate compressed) | ~85 KB |
| Constraint cost split | Attention path ~40%, feed-forward path ~60% |
| Verifier-side checks | Zero proving overhead |
| Composition bindings verified | All |
| Negative tests (total / failures) | 124 / 0 |
The proof-backed subcomputations use incrementally verifiable computation (Nova [18]) with IPA-based proof compression. Each sub-computation produces a single compressed proof covering the full token sequence. The verifier-side checks contribute no constraints and execute in sub-second aggregate time. The feed-forward path dominates constraint cost (~60%) due to its wider intermediate dimension (4× the embedding dimension). The attention path accounts for the remaining ~40%.
§6.2. Recurrence
The critical question for scaling is whether the layer construction recurs — whether a second decoder layer can be verified using the same architecture, at the same cost, without a new arithmetic obstacle. An inter-layer normalization bridge — a verifier-side check that re-derives the normalization operation connecting the output of Layer 0 to the input of Layer 1 — was implemented and verified with 3 negative tests [MEASURED]. The bridge confirmed that the normalization operation resets the activation scale to the model's quantization scale at every layer boundary, preventing systematic scale growth across layers. A second decoder layer was then verified using the same proof-backed subcomputation architecture with independently initialized weights. The recurrence measurement compares the largest projection sub-computation between Layer 0 and Layer 1; full-layer totals are projected from the single-layer baseline rather than independently measured for Layer 1.
| Metric | Layer 0 | Layer 1 | Ratio |
|---|---|---|---|
| Constraints (projection sub-computation) | Baseline | Within 1% of baseline | ~1.0× |
| Proof size | Baseline | Identical | 1.0× |
| Precision budget | Baseline | Unchanged | Identical |
The constraint difference (within 1%) is implementation variance from different weight matrices, not a structural change. The fold-time ratio showed minor variation attributable to system noise, not a systematic cost difference. The proof size was byte-identical.

Scale reset. The normalization operation at each layer boundary returns the activation root-mean-square to the quantization scale. This prevents the arithmetic drift that would otherwise accumulate across layers in fixed-point computation. The precision budget required for the fixed-point arithmetic remained unchanged across the tested layer boundary.

Scaling projection. Under the current orchestrator and micro-model configuration, the sequential wall-clock cost is approximately \(L \times 76\) seconds for \(L\) layers on consumer hardware. This is a current-system estimate based on measured single-layer recurrence, not a platform law. Parallelism across independent sub-computations within each layer could reduce wall-clock time further; this is not yet implemented.
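The bridge's scale-reset behavior can be illustrated with an RMS-style normalization sketch. The function below is a hypothetical verifier-side re-derivation — dequantize, normalize to unit root-mean-square, requantize — and its name and parameters are assumptions of this illustration.

```python
import numpy as np

def rms_bridge(h_quant: np.ndarray, scale: int, eps: float = 1e-6) -> np.ndarray:
    """Verifier-side normalization bridge (sketch). Whatever scale drift
    accumulated inside the layer, the output re-enters the next layer at
    the model's known quantization scale."""
    h = h_quant.astype(np.float64) / scale            # dequantize
    rms = np.sqrt(np.mean(h * h) + eps)               # measured activation RMS
    h_norm = h / rms                                  # reset to unit RMS
    return np.round(h_norm * scale).astype(np.int64)  # requantize
```

For any nonzero input, the dequantized output has root-mean-square approximately 1 regardless of the input's magnitude — the scale-reset property that prevents systematic drift from compounding across layer boundaries.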
§6.3. Token Logit Projection
The Layer 0 hidden state at the final token position was projected through one row of the language model's output head, corresponding to a claimed token. The result is a verified logit for the claimed token, with a comparison against one runner-up.
| Metric | Value |
|---|---|
| Projection mechanism | Verifier-side dot product |
| Computation time | < 10 ms |
| Negative tests (total / failures) | 6 / 0 |
The comparison gate confirmed that the claimed token's logit exceeded the runner-up's logit. This is a practical proxy, not a full-vocabulary argmax proof [MEASURED]. We are not aware of a prior system that connects a verifier-checkable token logit to a structural-identity-attested model. This result is intended as a legible boundary artifact — connecting verified computation to an observable claim — not as a full-vocabulary selection proof.
§6.4. Architecture Comparison
Two architecture paths were evaluated:
| Path | Weight handling | Proving cost | Trust assumption |
|---|---|---|---|
| TEE-backed | Weights baked as circuit constants | Baseline | Weight provenance from TEE attestation |
| pure-ZK | Weights carried in IVC state | ~43× baseline | None beyond cryptographic soundness |
Both paths aim to establish the same functional claim: "these weights, on this input, produced this output." They differ in how weight authenticity is established. The TEE-backed path relies on the TEE to attest weight provenance and achieves sub-second fold time per sub-computation step. The pure-ZK path authenticates weights cryptographically within the IVC state, at approximately 43× the proving cost of the TEE-backed baseline. Intermediate configurations (partially state-carried weights) were not tested [MEASURED].
§6.5. Precision-Collapse Event
Not every important result in this paper was planned. During implementation, an error in the fixed-point rescaling path created an accidental stress condition that at first looked like a genuine arithmetic limit. The extraction failed its separation gate, and the failure did not improve under ordinary precision sweeps. For a time, the natural conclusion was that the fingerprint observable itself might be too fragile for fixed-point verification. That conclusion would have been wrong. A trace-equivalence harness comparing the floating-point and fixed-point paths step by step localized the problem to a single rescaling mismatch. Once corrected, the extraction path passed cleanly across all tested micro-models.
| Condition | τ error | Rank correlation (per-model) | Separation gate |
|---|---|---|---|
| Accidental stress regime (~1.5-bit dynamic range) | Exceeds threshold | 0.98 Spearman | FAIL |
| Corrected extraction path | < 5 × 10⁻⁴ (worst case) | > 0.999 | PASS |
After correction, all independently initialized micro-models passed the cryptographic separation gate. The per-model rank correlation between fixed-point and floating-point fingerprints exceeded 0.999 in every case. Across models, pairwise distance ordering was perfectly preserved: both Spearman and Kendall rank correlations equal to 1.000 across all pairwise comparisons [MEASURED].

The more suggestive observation is the pre-correction one. Even when the accidental rescaling collapse compressed the downstream signal to roughly 1.5 bits of dynamic range, the structural fingerprint retained 0.98 rank correlation with its reference. We do not present this as a validated robustness law; it is a single accidental event. But it is consistent with a stronger interpretation of the observable. The collapse degraded magnitude severely enough to break the gate, yet much of the relational structure survived. That is exactly what one would expect if the fingerprint depends more on order and gap geometry than on absolute activation scale.

The event carries two lessons. Methodologically, it shows that trace-equivalence discipline is necessary in fixed-point ZK implementations, because implementation mismatch can masquerade as a fundamental arithmetic barrier. Scientifically, it provides an extreme stress observation that extends the perturbation evidence from Paper 3 [3] — where δ_norm maintained 1.9% CV across 54 adversarial checkpoints — into a quantization regime not studied there. The result does not prove the relational-geometry hypothesis. It does, however, make that hypothesis substantially more concrete: the observable remained structurally legible even after the surrounding computation had been driven into a regime severe enough to destroy its cryptographic separability.
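The relational-geometry interpretation can be made concrete with a toy experiment: monotonically collapse a random signal to three levels (roughly 1.5 bits) and measure how much rank structure survives. This illustration is hypothetical and does not reproduce the paper's 0.98 measurement; it only shows that rank correlation tolerates magnitude collapse far better than the magnitudes themselves do.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation with average ranks for ties (sketch)."""
    def ranks(x):
        order = np.argsort(x, kind="stable")
        r = np.empty(len(x), dtype=np.float64)
        i = 0
        while i < len(x):
            j = i
            # Equal values are adjacent in sorted order; give the tie
            # group its average rank.
            while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
                j += 1
            r[order[i:j + 1]] = 0.5 * (i + j) + 1.0
            i = j + 1
        return r
    ra, rb = ranks(np.asarray(a)), ranks(np.asarray(b))
    return float(np.corrcoef(ra, rb)[0, 1])

# Toy stress condition (not the paper's measurement): collapse a uniform
# signal to 3 levels, i.e. about 1.5 bits of dynamic range.
rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
x_collapsed = np.floor(x * 3)  # monotone collapse to 3 levels
rho = spearman(x, x_collapsed)  # high despite the magnitude collapse
```

The magnitudes are nearly destroyed (only three distinct values remain), yet the rank correlation stays well above what magnitude-based comparison would suggest — the qualitative pattern the accidental event exhibited.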
§7. Discussion
§7.1. What an Accepted Run Establishes
An accepting run of this framework supports a chain of claims, but not every link in that chain rests on the same kind of assurance. The strongest parts are the mathematical ones. The structural-identity layer relies on the inherited IT-PUF result that a model's fingerprint is unforgeable within the stated divergence budget, and the proof-backed parts of the computation layer rely on the soundness of the underlying IVC proof system. For those components, acceptance means the relevant claim follows from cryptographic proof rather than from trust in the prover. Other links in the chain are operational rather than purely mathematical. Binding a measured fingerprint to a particular set of weights depends on TEE attestation, so that part of the claim is only as strong as the integrity of the confidential-computing hardware and firmware. Likewise, the nonlinear operations handled by verifier-side re-derivation are correct only if the verifier's own execution environment is behaving honestly. The same conditional trust applies to the token-logit projection in the current implementation. The framework therefore provides a mixed trust chain: some claims are backed by proof, some by hardware attestation, and some by verifier-side recomputation.
It is also important to separate structural identity from organizational ownership. The framework can support the statement "this is the model registered under fingerprint \(\tau\)," but it does not by itself establish "this model belongs to company \(X\)" or "this deployment is authorized." Those stronger claims require an external registry or certificate chain that maps \(\tau\) to a legal or organizational identity. The protocol identifies a mathematical object; ownership and authorization sit one layer above it.

The hybrid design is therefore deliberate, not incidental. Operations belong in proof when the verifier cannot reconstruct them from committed inputs without trusting the prover. Operations can remain as verifier-side checks when deterministic recomputation provides the intended assurance under the stated trust model at much lower cost. A deployment that cannot trust the verifier environment should move more of those checks into proof, accepting the corresponding cost increase. The present system is one explicit point in that trust–cost design space: identity-bound computation with clear boundaries on where proof ends and operational trust begins.
§7.2. Composition with Existing zkML Systems
The identity-first framework composes with existing zkML systems at a precise boundary: the weight commitment. A computational-integrity proof establishes that a committed artifact executed honestly on a given input. An identity proof establishes that the committed weights correspond to a specific model with structural fingerprint \(\tau\). The two statements are complementary but not interchangeable. One is a statement about correct execution; the other is a statement about which model is being executed. This distinction matters because a weight commitment authenticates bytes only relative to itself. A zkML proof can show that weights \(W\) were used honestly, but it does not establish that \(W\) is the model the verifier intended to verify. That is the identity gap formalized in §3.1: computation may be valid while the underlying model is substituted, repackaged, or otherwise not the one the verifier believes it to be. The identity layer closes that gap by binding the committed weights to a model-level structural anchor rather than treating the commitment as self-authenticating. Composed together, the two layers support a stronger statement than either provides alone: not merely "some committed model computed honestly," but "the model bound to structural fingerprint \(\tau\) computed honestly on this input." The present work is neither a replacement for zkML inference systems nor a competing proof backend. It is a prerequisite layer that gives computational-integrity proofs a model-level referent.
§7.3. Why Identity-First Is Not a zkML Optimization
The contribution of this paper is not a cheaper proof of the same statement. It is an architecture for proving a different statement. A zkML optimization improves the cost, latency, or scalability of proving that a committed model executed correctly. Structural identity does not operate on that axis. It does not reduce arithmetic cost, compress witness size, or simplify circuit construction. It answers a question that computational-integrity systems leave open: what model do the committed weights belong to? That question cannot be recovered from prover efficiency alone. A faster circuit is still a proof about honest execution of a supplied artifact. Even a zero-cost zkML prover would not establish that the artifact is the particular trained model whose identity matters for audit, attribution, or substitution resistance. The missing primitive is not computational speed — it is model authentication. This is why structural identity cannot be obtained by reconfiguring an existing zkML stack. Adding it requires a protocol for measuring model-specific structure, a security claim that the measurement is unforgeable within a stated tolerance, and a binding mechanism that attaches the resulting identity claim to the weights used in computation. Those are not optimizations of computational integrity. They are additional protocol commitments with different assumptions, different failure modes, and different semantics. Computational integrity tells you the math was done honestly. Structural identity tells you whose mathematical object the computation operated on. The first is about faithful execution. The second is about authenticated model referent. Deployment-sensitive settings may require both, but they are not variants of the same primitive.
§7.4. Layer-Boundary Normalization as Scale Reset
Fixed-point arithmetic drifts across layers. Each matrix multiplication and rescaling step introduces rounding; over dozens of layers, small biases compound until they exhaust the bit-width budget. This is the arithmetic obstacle to deep-network verification. The recurrence experiment (§6.2) suggests that modern Transformer architectures may already contain the mechanism that makes this tractable. The normalization operation at each layer boundary resets the activation root-mean-square to the quantization scale. Regardless of how much drift accumulated during the layer's computation, the normalized output begins the next layer at a known scale. In our experiments, the bit-width required to represent matrix multiplication outputs was identical between Layer 0 and Layer 1. Normalization layers are not merely a training technique — they may be the computational stability mechanism that enables fixed-point verification at depth.
§7.5. The Cost of Identity
The current implementation is not optimized for real-time inference. On the tested micro-model, the framework proves one decoder layer in approximately 76 seconds on consumer hardware. For a model with \(L\) layers, the sequential cost is approximately \(L \times 76\) seconds under the current micro-model configuration. In high-assurance contexts — where knowing which model computed the output is a prerequisite for trusting the output — this is the cost of that knowledge. The cost has natural reduction paths: independent sub-computations within each layer can be parallelized; the pure-ZK path can be optimized through alternative proof systems; and the micro-model used here (64-dimensional embeddings) is far smaller than production models, where the constraint-per-parameter ratio may differ. Within the tested regime, these are engineering directions, not architectural barriers.
§8. Limitations
1. Micro-model only. All experimental results are on a model with 64-dimensional embeddings, 2 decoder layers, and approximately 147,000 parameters. This is sufficient for architectural validation but does not demonstrate production-scale feasibility. The measured cost consistency across layers supports extrapolation, but extrapolation is not demonstration. Production-scale verification (embedding dimensions of 4,096 or larger, 32 or more layers) is not demonstrated here; extending the architecture beyond the tested regime remains future engineering work.
2. Single-layer proof, measured recurrence. One complete decoder layer is verified end-to-end. A second layer is verified through the same architecture with measured cost consistency. Full multi-layer composition — chaining verified layers into a complete model proof — is not demonstrated.
3. Trust-cost tradeoff. The framework offers two architecture paths: the TEE-backed path provides hardware-attested verification at baseline cost; the pure-ZK path eliminates hardware trust assumptions at approximately 43× the computational cost in the tested configuration. This is a deployment choice, not a technical limitation — but operators must select the trust model appropriate to their assurance requirements.
4. Verifier-side checks are trusted computation. Certain nonlinear operations are verified through deterministic re-derivation rather than zero-knowledge proofs. Under the stated trust assumption (verifier execution environment is not compromised), this provides an equivalent functional check; when that assumption does not hold, these checks provide weaker assurance than zero-knowledge proofs.
5. Token projection does not prove argmax. The output binding proves the logit for a single claimed token and compares it against one runner-up. It does not prove that the claimed token maximizes the logit across the full vocabulary.
§9. Conclusion
This paper has advanced the missing layer beneath computational integrity: structural identity. We presented an identity-first verification framework that composes four levels: structural fingerprinting (inherited from Papers 1–3, [PROVEN] + [VALIDATED]), hardware-attested weight binding (inherited from Paper 4, [VALIDATED]), hybrid verifier-checkable computation through a decoder-layer slice (new, [VALIDATED]), and output binding to a claimed token logit (new, [MEASURED]). Taken together, these levels answer three questions that matter in sequence: which model is this, did it compute honestly, and what output did it produce? Existing zkML systems address the computation question. This work advances the missing identity layer beneath it. That distinction is the central point of the paper. Computational integrity and structural identity are complementary, not interchangeable. A proof that committed weights executed correctly does not by itself establish what model those weights are supposed to represent. The contribution of this framework is to bind computation claims to an identified model-level referent, so that "this computation was correct" is attached to "this was the model that computed." The hybrid structure follows directly from that goal. Arithmetic relations are placed in proof where proof is the right instrument; nonlinear operations remain verifier-checkable where deterministic re-derivation provides the intended assurance at lower cost; and TEE attestation is used where binding identity to weights is the relevant requirement.
This is not an incomplete version of end-to-end zkML, but a deliberate allocation of proof, attestation, and verifier trust across distinct roles in the verification chain. The experiments show that this architecture is not merely conceptual. A one-step recurrence test indicates that the design extends beyond a single layer: on the tested micro-model, constraint counts, proof sizes, and precision budgets remained stable across the boundary, with normalization acting as a measured scale reset. The present implementation therefore establishes a concrete verifier-checkable path from structural identity to a claimed token logit, while also identifying the boundary at which further circuitization remains future work. Perhaps the most suggestive result is the accidental one. A rescaling error compressed the structural fingerprint to roughly 1.5 bits of dynamic range, yet the fingerprint persisted at 0.98 rank correlation. The collapse broke cryptographic separability, but it did not erase the observable's underlying order structure. This suggests that structural identity may reside not in activation magnitude, but in the relational geometry of the weight topology itself. If that observation holds more generally, then structural identity is not merely useful for verification — it is a property of the model that survives distortions which degrade much of the computation around it. This paper establishes the first verification framework in which that possibility can be tested directly.
References
[1] A. R. Coslett, "The δ-Gene: Neural Network Identity via Inference-Time Physical Unclonable Functions," 2026. DOI: 10.5281/zenodo.18704275
[2] A. R. Coslett, "Template-Based Endpoint Verification via Logprob Order-Statistic Geometry," 2026. DOI: 10.5281/zenodo.18776711
[3] A. R. Coslett, "The Geometry of Model Theft: Distillation Forensics and the Three-Layer Security Hierarchy," 2026. DOI: 10.5281/zenodo.18818608
[4] A. R. Coslett, "Provenance Generalization and Verification Scaling for Neural Network Forensics," 2026. DOI: 10.5281/zenodo.18872071
[5] A. R. Coslett, "Beneath the Character: The Structural Identity of Neural Networks," 2026. DOI: 10.5281/zenodo.18907293
[6] Anthropic, "Detecting and Preventing Distillation Attacks," Blog post, February 23, 2026. https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
[7] Lagrange Labs, "DeepProve," Open-source zkML proof system, 2025. https://github.com/Lagrange-Labs/deep-prove
[8] Zkonduit, "ezkl: Easy Zero-Knowledge Inference," Open-source framework, 2023. https://github.com/zkonduit/ezkl
[9] Giza, "Verifiable ML Inference Platform," 2024. https://www.gizatech.xyz
[10] Modulus Labs, "The Cost of Intelligence: Proving Machine Learning Inference with Zero-Knowledge," Whitepaper, 2023.
[11] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers, T. Goldstein, "A Watermark for Large Language Models," ICML 2023. arXiv:2301.10226
[12] X. Zhao, Y.-X. Wang, L. Li, "Protecting Language Generation Models via Invisible Watermarking," ICML 2023. arXiv:2302.03162
[13] X. Cao, J. Jia, N. Z. Gong, "IPGuard: Protecting Intellectual Property of Deep Neural Networks via Fingerprinting the Classification Boundary," AsiaCCS 2021.
[14] N. Lukas, Y. Zhang, F. Kerschbaum, "Deep Neural Network Fingerprinting by Conferrable Adversarial Examples," arXiv:1912.00888, 2019.
[15] J. Chen, J. Wang et al., "Copy, Right? A Testing Framework for Copyright Protection of Deep Learning Models," IEEE Symposium on Security and Privacy (S&P), pp. 824–841, 2022.
[16] J. Guan, J. Liang, R. He, "Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks," NeurIPS 2022.
[17] R. Pappu, B. Recht, J. Taylor, N. Gershenfeld, "Physical One-Way Functions," Science 297(5589), 2002.
[18] A. Kothapalli, S. Setty, I. Tzialla, "Nova: Recursive Zero-Knowledge Arguments from Folding Schemes," CRYPTO 2022.
[19] Z. Sun, J. Yu, M. Mirkin, H. Zhang, "zkLLM: Zero Knowledge Proofs for Large Language Models," ACM CCS 2024.
[20] D. Kang, T. Hashimoto, I. Stoica, Y. Sun, "Scaling up Trustless DNN Inference with Zero-Knowledge Proofs," EuroSys 2024.
[21] OpenAI, "API Changelog," 2025. https://developers.openai.com/api/docs/changelog (Documents mutable model slug updates for gpt-5.2-chat-latest and others.)
[22] L. Chen, M. Zaharia, J. Zou, "How is ChatGPT's behavior changing over time?" arXiv:2307.09009, 2023.
[23] W. Cai, T. Shi, X. Zhao, and D. Song, "Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs," arXiv:2504.04715, 2025.
[24] European Parliament and Council, "Regulation (EU) 2024/1689 (AI Act)," Official Journal of the European Union, 2024. Articles 11, 12, 49; Annexes IV, VIII.
[25] NIST, "Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile," NIST AI 600-1, 2024.
[26] H. Birkholz, D. Thaler, M. Richardson, N. Smith, W. Pan, "Remote ATtestation procedureS (RATS) Architecture," RFC 9334, IETF, 2023.
[27] L. Lundblade, G. Mandyam, J. O'Donoghue, C. Wallace, "The Entity Attestation Token (EAT)," RFC 9711, IETF, 2025.
[28] Intel, "Intel Trust Domain Extensions (Intel TDX) Module Base Architecture Specification," 2024. (REPORTDATA, MRTD, RTMR semantics.)
[29] A. Francillon, Q. Nguyen, K. B. Rasmussen, G. Tsudik, "A Minimalist Approach to Remote Attestation," DATE 2014.
Acknowledgments
Portions of this research were developed in collaboration with AI systems that served as assistants for formal-verification sketching, adversarial review, and manuscript preparation. All scientific claims, formal proofs, and editorial decisions remain the sole responsibility of the author.
Patent Disclosure
The structural fingerprint measurement methodology described in this work is the subject of U.S. Provisional Patent Application 63/982,893 (weights-based identity verification, filed February 13, 2026). The API-based endpoint verification methodology is the subject of U.S. Provisional Patent Application 63/990,487 (filed February 25, 2026). The zero-knowledge attestation architecture is the subject of U.S. Provisional Patent Application 63/996,680 (privacy-preserving model identity verification, filed March 4, 2026). The identity-conditioned inference verification architecture, hybrid proof-and-bridge decomposition, selective output verification, and evidence bundle system described in this paper are the subject of U.S. Provisional Patent Application 64/003,244 (filed March 12, 2026).
Supplementary Material
All Coq proof files referenced in this paper are available in the supplementary archive on Zenodo (DOI: 10.5281/zenodo.19008116).