Add v2.3 styled trace paper with layered stack diagram

Commit `eb5697e53a` (parent `535392653c`)
2 changed files with 282 additions and 0 deletions

`IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.3_STYLED.md` (normal file, 281 additions)
@@ -0,0 +1,281 @@

# IF.EMOTION TRACE PROTOCOL v2.3: AUDITABLE DEBUGGING (WITHOUT WISHFUL THINKING)

**Subject:** End-to-End Traceability, Completeness Witnessing, and PQ-Anchored Evidence Binding
**Protocol:** IF.TTT (Traceable, Transparent, Trustworthy)
**Version:** 2.3 (Diagrams + Layered Stack)
**Date:** 2025-12-21
**Status:** AUDIT REQUIRED
**Citation:** `if://whitepaper/emotion/trace-protocol/v2.3`

---

## 0) Layered Stack (Where Guarantees Live)

```mermaid
flowchart TB
  U[User Browser] -->|HTTPS| E[Edge: TLS + Routing]
  E -->|/oauth2/*| O[oauth2-proxy]
  E -->|/api/*| N[nginx]
  N --> B[Backend Witness Boundary]

  B --> R[Retrieval / RAG]
  B --> P[Prompt Build]
  B --> M[Model Inference]
  B --> X[Postprocess: citations + trace footer]

  B --> T1["REQ_SEEN ledger<br/>(hourly JSONL)"]
  B --> T2["Trace events<br/>(hash chain JSONL)"]
  B --> T3["Signed summary<br/>(output hash + head attestation)"]

  T1 --> H["Signed Merkle head<br/>(per hour)"]
  T2 --> S["Trace head<br/>(event_hash)"]

  H --> BUNDLE["Evidence bundle<br/>(tar.gz + manifest)"]
  S --> BUNDLE
  T3 --> BUNDLE

  BUNDLE --> REG["IF.TTT registry<br/>(PQ-hybrid anchor)"]
  BUNDLE --> MIRROR["Public static mirror<br/>/static/hosted"]
```

**Interpretation:** integrity starts at the backend witness boundary. Completeness (REQ_SEEN) is only meaningful at and after that boundary until edge witnessing is implemented.

---

## 1) What This Protocol Actually Guarantees

This system does not try to “prove the model is true.” That is not a meaningful claim for probabilistic generation.

This system proves something narrower and more valuable:

1) what the system received (as a commitment),
2) what the system did (trace event chain),
3) what the system returned (output hash),
4) what evidence it claims to have used (retrieval IDs + citation handles),
5) that the resulting artifacts are tamper-evident and portable for external review.

If a claim cannot be bound to an artifact, it does not exist.

---

## 2) The Trace ID Contract (Non-Negotiable)

Every request to `/api/chat/completions` receives a Trace ID. This includes denials.

**Surfaces:**

- **Header:** `X-IF-Emotion-Trace: <uuid>`
- **Header:** `X-IF-Emotion-Trace-Sig: <sig>` (app-level integrity)
- **User output:** final line `Trace: <uuid>`

The Trace ID is the support ticket, the incident handle, and the audit join key.

---

## 3) Completeness: REQ_SEEN Witness Ledger (And Its Real Boundary)

Integrity alone is easy. Completeness is where systems lie.

### What REQ_SEEN does (v2.x)

REQ_SEEN records every request attempt that reaches the backend witness boundary as a privacy-preserving commitment:

- `user_text_sha256`, `user_len`, decision/reason, and `leaf_hash`

It writes:

- Hour ledger: `/opt/if-emotion/data/req_seen/<YYYYMMDDTHH>.jsonl`
- Signed Merkle head: `/opt/if-emotion/data/req_seen/heads/<YYYYMMDDTHH>.json`
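
The commitment described above can be sketched as follows. This is a minimal illustration, not the deployed code: the field names `trace_id`, `decision`, and `reason`, the choice to count `user_len` in characters, the UTC hour bucket, and the rule that `leaf_hash` covers every other field are all assumptions made for the example.

```python
import hashlib
import json
from datetime import datetime, timezone


def canonical_bytes(obj) -> bytes:
    # Stable JSON: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def req_seen_record(trace_id: str, user_text: str, decision: str, reason: str) -> dict:
    """Build a commitment that never stores the raw user text."""
    record = {
        "trace_id": trace_id,  # hypothetical field name
        "user_text_sha256": hashlib.sha256(user_text.encode("utf-8")).hexdigest(),
        "user_len": len(user_text),  # ASSUMPTION: length in characters, not bytes
        "decision": decision,
        "reason": reason,
    }
    # ASSUMPTION: leaf_hash is the SHA-256 of the canonical bytes of all other fields.
    record["leaf_hash"] = hashlib.sha256(canonical_bytes(record)).hexdigest()
    return record


def hour_ledger_name(now: datetime) -> str:
    # Follows the <YYYYMMDDTHH>.jsonl pattern; using UTC is an assumption.
    return now.astimezone(timezone.utc).strftime("%Y%m%dT%H") + ".jsonl"
```

Because only the digest and length are stored, an auditor holding the original text can confirm it matches the ledger entry, while the ledger itself reveals nothing readable.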

### The boundary (explicit)

REQ_SEEN completeness is only valid for requests that reach the backend process. Requests blocked before the backend are out of scope until the witness is moved to the edge proxy.

This is not a weakness in wording. It is a hard boundary condition.

---

## 4) Merkle Proofs: Roots Are Not Enough

A signed Merkle root helps, but roots alone do not give efficient proofs.

v2.x adds inclusion proofs for REQ_SEEN:

- A specific trace can be proven to exist in an hourly ledger with an O(log n) Merkle path.
- The proof is generated from the ledger and verified against the signed head.

Verification tooling is provided via `iftrace.py` (see Section 10).

---

## 5) Trace Events: Hash Chain + Immediate Head Attestation

Trace events are stored as a hash chain in:

- `/opt/if-emotion/data/trace_events.jsonl`

Each event includes:

- `prev_hash` pointer
- `event_hash` computed as `sha256(prev_hash || canonical_json(event_without_event_hash))`

This detects deletion or modification of interior events. It does not prevent a malicious deployment from not emitting events. That is addressed as a limitation and a roadmap item (Section 11).
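
The chaining rule can be sketched in a few lines. Several details here are assumptions for illustration only: the all-zero genesis value, the reading of `||` as concatenating the hex digest (as ASCII bytes) with the canonical JSON bytes, and the `type` field used in the usage example.

```python
import hashlib
import json

GENESIS = "0" * 64  # ASSUMPTION: prev_hash used by the first event


def canonical_bytes(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def event_hash(prev_hash: str, event: dict) -> str:
    # Hash everything except the event_hash field itself.
    body = {k: v for k, v in event.items() if k != "event_hash"}
    # ASSUMPTION: "||" is byte concatenation of the hex digest and the JSON bytes.
    return hashlib.sha256(prev_hash.encode("ascii") + canonical_bytes(body)).hexdigest()


def append_event(chain: list, payload: dict) -> dict:
    prev = chain[-1]["event_hash"] if chain else GENESIS
    event = dict(payload, prev_hash=prev)
    event["event_hash"] = event_hash(prev, event)
    chain.append(event)
    return event


def verify_chain(chain: list) -> bool:
    prev = GENESIS
    for event in chain:
        if event.get("prev_hash") != prev:
            return False  # a link was removed or reordered
        if event.get("event_hash") != event_hash(prev, event):
            return False  # the event body was modified
        prev = event["event_hash"]
    return True
```

Editing or removing an interior event breaks every later link, so `verify_chain` returns `False`. Dropping events from the tail leaves a shorter chain that still verifies, which is why the head is attested separately.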

The trace head is also attested in the signed completion record with an app-level Ed25519 signature so integrity can be verified immediately.

---

## 6) Canonicalization: What We Hash Must Be Stable

Cryptographic systems die by “almost the same bytes.”

v2.x mandates canonical JSON bytes for hashing/signing:

- Primary: `canonicaljson.encode_canonical_json(obj)`
- Fallback: stable JSON serialization (`sort_keys`, fixed separators, UTF-8)

If two environments hash different bytes for “the same object,” you have no protocol.
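
A possible shape for the primary/fallback rule is sketched below. The helper names `canonical_bytes` and `commitment` are hypothetical, and the exact fallback options (`ensure_ascii=False`, `allow_nan=False`) are assumptions chosen to track the primary encoder; whether the two paths agree byte-for-byte on every input should be confirmed with cross-environment test vectors rather than taken from this sketch.

```python
import hashlib
import json

try:
    # Primary encoder, used when the canonicaljson package is installed.
    from canonicaljson import encode_canonical_json
except ImportError:
    encode_canonical_json = None


def canonical_bytes(obj) -> bytes:
    if encode_canonical_json is not None:
        return encode_canonical_json(obj)
    # Fallback: sorted keys, fixed separators, UTF-8, no NaN/Infinity.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False).encode("utf-8")


def commitment(obj) -> str:
    """SHA-256 over the canonical bytes, as a hex string."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()
```

With this in place, `commitment({"a": 1, "b": 2})` and `commitment({"b": 2, "a": 1})` yield the same digest, since key order no longer affects the bytes being hashed.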

---

## 7) Key Management (POC-Grade Today, Audit-Grade Tomorrow)

### Current state

- App Ed25519 signing key:
  - Private: `/opt/if-emotion/data/trace_ed25519.key` (0600)
  - Public: `/opt/if-emotion/data/trace_ed25519.pub` (shipped in bundles)
  - Key ID: `ed25519-app-v1`

The key is generated by libsodium-backed primitives and stored on disk with file permissions. This is acceptable for a POC, not for external certification.

### Rotation and compromise (required discipline)

- If the key is compromised: rotate immediately, bump `key_id`, and mark all traces from that time window as `trust_tier=degraded` unless independently anchored.
- Old signatures remain verifiable with the historic public keys; key history must be preserved.

### Certification path

- Move keys to HSM/TPM or threshold signing.
- Bind deploy attestations (image digest + config hash) to IF.TTT.

---

## 8) Post-Quantum: What Is PQ Today (And What Isn’t)

### What is PQ-anchored today

Evidence bundles are PQ-hybrid signed when registered into IF.TTT. The IF.TTT registry record includes:

- `pq_status: hybrid-fips204`
- `pq_algo: ML-DSA-87`

This is the PQ anchoring layer.

### What is not PQ today

Hot-path app signatures (Ed25519) are not post-quantum. That is a conscious trade:

- Ed25519 provides immediate integrity at low latency.
- PQ signing occurs at registration time in IF.TTT.

The correct claim is “PQ-anchored at registry time,” not “PQ everywhere.”

---

## 9) IF.story: Readability Without Evidence Drift

IF.story is a deterministic narrative projection of `trace_events.jsonl`.

It is not evidence. It is an index.

Each IF.story line includes the `event_hash` anchor, and auditors should verify those anchors against the raw JSONL.

---

## 10) Verifier Tooling (Independent Checks, Not Operator Vibes)

The bundle is designed to be verified with a single command and then deep-audited selectively.

Verifier:

- `iftrace.py verify <tar.gz> --expected-sha256 <sha>`

Merkle inclusion proof (REQ_SEEN):

- `iftrace.py prove-inclusion --ledger <req_seen_hour.jsonl> --head <req_seen_head.json> --trace-id <uuid>`
- `iftrace.py verify-inclusion <proof.json>`

Checksum rules (important):

- `sha256s.txt` intentionally excludes itself and `manifest.json` to avoid self-referential checksum traps.
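
The exclusion rule can be illustrated with a small checker run over an extracted bundle directory. This is a sketch under stated assumptions, not what `iftrace.py verify` does internally: it assumes `sha256s.txt` uses the `sha256sum` line format (`<hex digest>  <relative path>`) and it treats any file that is neither listed nor excluded as a problem, which is a policy chosen for the example.

```python
import hashlib
from pathlib import Path

# Files the checksum list deliberately does not cover.
EXCLUDED = {"sha256s.txt", "manifest.json"}


def sha256_file(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_bundle_dir(root: Path) -> list:
    """Return a list of problems; an empty list means every listed file matched."""
    problems = []
    listed = set()
    # ASSUMPTION: sha256sum-style lines, "<hex>  <relative path>".
    for line in (root / "sha256s.txt").read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with "*"
        listed.add(name)
        target = root / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_file(target) != expected.lower():
            problems.append(f"mismatch: {name}")
    # ASSUMPTION (policy): files that are neither listed nor excluded are flagged.
    for path in root.rglob("*"):
        rel = path.relative_to(root).as_posix()
        if path.is_file() and rel not in listed and rel not in EXCLUDED:
            problems.append(f"unlisted: {rel}")
    return problems
```

Excluding `sha256s.txt` and `manifest.json` avoids a file having to contain its own digest; their integrity then rests on the tarball hash supplied via `--expected-sha256`.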

---

## 11) Threat Model and Limitations (Explicit)

### A) Truncation and external anchoring

Hash chains detect edits. They do not prevent truncation unless head hashes are anchored externally or independently cached.

Current mitigation:

- IF.TTT registration anchors the tarball hash into a separate chain.

Remaining requirement for certification:

- scheduled external anchoring of IF.TTT head hashes to a public append-only log.
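
The truncation gap is easy to demonstrate. In the toy chain below (the `payload` field, the all-zero genesis value, and the helper names are all hypothetical), a chain with its tail removed still passes an internal consistency check and is only rejected once its head is compared with a head hash recorded somewhere the operator cannot rewrite.

```python
import hashlib
import json

GENESIS = "0" * 64  # ASSUMPTION: starting value for the first link


def link(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def build(payloads: list) -> list:
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = link(prev, payload)
        chain.append({"payload": payload, "event_hash": prev})
    return chain


def internally_consistent(chain: list) -> bool:
    prev = GENESIS
    for event in chain:
        prev = link(prev, event["payload"])
        if prev != event["event_hash"]:
            return False
    return True


def matches_anchor(chain: list, anchored_head: str) -> bool:
    # The anchored head is assumed to come from an independent, append-only source.
    return (internally_consistent(chain)
            and bool(chain)
            and chain[-1]["event_hash"] == anchored_head)


full = build([{"n": i} for i in range(4)])
anchored = full[-1]["event_hash"]
truncated = full[:2]  # the last two events silently dropped
```

Here `internally_consistent(truncated)` is `True` while `matches_anchor(truncated, anchored)` is `False`: the external head is what turns silent truncation into a detectable event.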

### B) Clock integrity

Timestamps are derived from system clocks and are not trusted for cryptographic time.

Ordering is guaranteed by hash chain indices and hash pointers, not by wall-clock truth.

Certification path:

- introduce time witnesses or external timestamping for head hashes.

### C) Code integrity

Hash chains detect post-hoc tampering. They do not prevent a modified binary from choosing not to record.

Certification path:

- signed deploy attestations (image digest + config hash) bound into IF.TTT
- optional remote attestation

---

## 12) Reference Proof Run (v2.1) — Public Transport Verified

Trace ID:

- `016cca78-6f9d-4ffe-aec0-99792d383ca1`

Preferred download URL (static mirror for external reviewers):

- `https://infrafabric.io/static/hosted/emo_trace_payload_016cca78-6f9d-4ffe-aec0-99792d383ca1.tar.gz`

Alternate download URL (Forgejo raw; may intermittently return `415 Unsupported Media Type` depending on client/proxy behavior):

- `https://git.infrafabric.io/danny/hosted/raw/branch/main/emo_trace_payload_016cca78-6f9d-4ffe-aec0-99792d383ca1.tar.gz`

Tarball SHA256:

- `7101ff9c38fc759a66157f6a6ab9c0936af547d0ec77a51b5d05db07069966c8`

Transport-level verification (what is now publicly provable):

- HTTP status: `200`
- `Content-Type`: `application/gzip`
- SHA256 matches the claim above (download integrity)

One-line verifier (shell):

- `curl -fsSL https://infrafabric.io/static/hosted/emo_trace_payload_016cca78-6f9d-4ffe-aec0-99792d383ca1.tar.gz | sha256sum`

IF.TTT citation handle for the tarball (PQ hybrid signed):

- `if://citation/c24fe95e-226c-4efc-ba22-5ddcc37ff7d2/v1`

@@ -0,0 +1 @@
cbd1423bc728576ef639a6c823a57878c7af56faa60112f226e9f652c910c3c5 /root/tmp/hosted_repo_update/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.3_STYLED.md