Runtime model identity, artifact identity, and signed verification infrastructure.
Each paper opened a question the previous one couldn't answer. Together they trace the same
line: a neural network's structural identity is mathematically distinct from its outputs, its weights' bytes,
and its agent credentials — and that distinction is measurable, formally verifiable, and operationally useful.
13 research papers · 4 technical notes · 4 patents · 0 retracted · All open-access on Zenodo
Reading paths
Four entry points through the corpus, by audience and purpose. Each path is three
works long.
Structural Identity. These papers
define and test the measurement primitive behind Fall Risk: a structural fingerprint observed during ordinary
computation. Start here for the scientific basis of runtime model identity.
Endpoint / API. When the model is
behind an API and weights are out of reach, the same identity can still be measured through public logprob
endpoints. These papers cover the API-side observable, its formal security properties, and its limits.
Distillation & Provenance.
Distilled models inherit some of their teacher's behavior but not its structural identity. These papers measure,
falsify, and bound what survives passive and adversarial copying — across families, scales, and training
recipes.
Governance & Compliance. How model
identity becomes admissible evidence: regulatory mappings (EU AI Act, NIST), enterprise IAM composition, and the
threat-model gap that current agent-identity standards leave open.
Artifact Identity. Verifying what is on
disk, before runtime. The boundary between artifact identity (what Trustfall Lite verifies) and runtime identity
(what Trustfall Deep verifies) is part of the claim hygiene.
Technical Notes. Operational notes
published alongside the research series: the agent-vs-model identity distinction, a gap-invariance proof for API
measurement, and a measured-substitution scenario against a live agent.
Core research
13 papers · publication order
Each paper extends a previous question. The natural reading path is in publication order —
the program's questions unfolded that way for a reason.
Neural networks have a structural fingerprint — the third pre-softmax logit gap —
that is invariant to temperature, architecture-stable across six families, and unforgeable under any
adversarial KL budget.
When the model is behind an API and the weights are out of reach, the same
identity can be measured through public logprob endpoints using PPP-residualized order-statistic geometry.
Distilled models inherit a measurable trace of their teacher; passive fine-tuning
erases the trace faster than adversarial erasure does, and same-family spoofing is geometrically
anti-aligned.
Provenance detection generalizes across teachers, students, and training protocols
— but the cosine alignment diagnostic is mandatory; scalar distance alone produces wrong answers.
An AI system's structural identity is mathematically distinct from its behavioral
character. Two models can produce identical outputs while having different identities, and the same model
produces wildly different characters under different prompts.
Inference verification proofs are only as trustworthy as the binding between the
proof and the model that actually ran. Hybrid proof-and-bridge attestation closes the gap.
Identity evidence comes in distinct classes — artifact, structural, provenance,
behavior — and substituting one for another is formally insufficient. A theorem makes the constraint legally
citable.
Structural model identity composes cleanly with JWT, SPIFFE, and existing
enterprise identity primitives. Four formal composition properties make it stack-safe.
Two models trained on identical data and architecture produce different structural
identities. Endpoint statistics cannot recover the formative trajectory.
Structural identity verification scales to 70B+ parameters. When a frontier
vendor's model lineage is disputed, only runtime measurement settles it — disclosure after the fact does
not.
Reasoning distillation produces family-dependent structural and functional
responses. Mistral, Llama, and Qwen react differently — and the structural layer can decouple entirely from
the functional layer.
Public toolchains strip a model's safety constraints while preserving observable
behavior. The structural fingerprint changes anyway — and we can detect it.
Authorizing an agent is not the same as verifying which neural network produced
its response. The 2026 identity-management products solve the first problem and assume the second is already solved.
Order-statistic gaps are invariant to log-softmax, temperature, and constant
shifts. The endpoint-verification protocol's robustness is provable, not just empirical.
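The invariance claim can be illustrated with a toy calculation. This is a hedged sketch, not the paper's protocol, and the function names are illustrative: gaps between sorted logits are unchanged by any constant shift, including the logsumexp subtraction that turns logits into log-softmax values, and temperature scaling rescales every gap by the same factor, so gap ratios survive.

```python
import math

def sorted_gaps(logits):
    """Differences between consecutive order statistics (descending)."""
    s = sorted(logits, reverse=True)
    return [a - b for a, b in zip(s, s[1:])]

logits = [3.5, 2.0, 0.2, -1.0]

# Constant shift (e.g. the logsumexp subtraction in log-softmax): gaps unchanged.
lse = math.log(sum(math.exp(x) for x in logits))
log_softmax = [x - lse for x in logits]
assert all(abs(g1 - g2) < 1e-9
           for g1, g2 in zip(sorted_gaps(logits), sorted_gaps(log_softmax)))

# Temperature scaling divides every gap by T, so gap *ratios* are preserved.
T = 0.7
scaled = [x / T for x in logits]
g, gT = sorted_gaps(logits), sorted_gaps(scaled)
ratios = [a / b for a, b in zip(g, g[1:])]
ratios_T = [a / b for a, b in zip(gT, gT[1:])]
assert all(abs(r1 - r2) < 1e-9 for r1, r2 in zip(ratios, ratios_T))
```

Because an API's logprobs are just logits after a constant logsumexp shift, any gap-based observable computed from them agrees with the raw pre-softmax gaps, which is the intuition behind the robustness claim.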
Three substitution scenarios run against a live gateway with valid agent
credentials. Three detected. Zero false accepts. Warm-path latency under seven seconds.
Trustfall Lite verifies whether a local artifact's bytes match a signed enrollment
record. It does not — and cannot — verify what runs at inference time. The boundary is the product.
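What byte-level artifact verification involves can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Trustfall Lite's implementation: the record format, the HMAC-based signature, and all names here are hypothetical.

```python
import hashlib
import hmac
import json

def enroll(artifact_bytes: bytes, signing_key: bytes) -> dict:
    """Produce a signed enrollment record for an artifact's bytes."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    record = {"alg": "sha256", "digest": digest}
    payload = json.dumps(record, sort_keys=True).encode()
    record["sig"] = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return record

def verify(artifact_bytes: bytes, record: dict, signing_key: bytes) -> bool:
    """Check the record's signature, then the bytes against the signed digest."""
    body = {k: v for k, v in record.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(record["sig"], expected):
        return False
    return hashlib.sha256(artifact_bytes).hexdigest() == record["digest"]

key = b"demo-key"            # hypothetical enrollment key
weights = b"\x00weights-on-disk\x01"
rec = enroll(weights, key)
assert verify(weights, rec, key)             # bytes match the signed record
assert not verify(weights + b"x", rec, key)  # any byte change is detected
```

Note what a matching digest cannot show: it says nothing about which weights are actually loaded at inference time, which is exactly the artifact-versus-runtime boundary the text draws.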