diff --git a/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md b/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md
new file mode 100644
index 0000000..a2c29d7
--- /dev/null
+++ b/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md
@@ -0,0 +1,291 @@
+# IF.EMOTION TRACE PROTOCOL v2.4: AUDITABLE DEBUGGING (WITHOUT WISHFUL THINKING)
+
+**Subject:** End-to-End Traceability, Completeness Witnessing, and PQ-Anchored Evidence Binding
+**Protocol:** IF.TTT (Traceable, Transparent, Trustworthy)
+**Version:** 2.4 (Diagrams + Transport Hardening)
+**Date:** 2025-12-21
+**Status:** AUDIT REQUIRED
+**Citation:** `if://whitepaper/emotion/trace-protocol/v2.4`
+
+---
+
+## 0) Layered Stack (Where Guarantees Live)
+
+```mermaid
+flowchart TB
+  U[User Browser] -->|HTTPS| E[Edge: TLS + Routing]
+  E -->|/oauth2/*| O[oauth2-proxy]
+  E -->|/api/*| N[nginx]
+  N --> B[Backend Witness Boundary]
+
+  B --> R[Retrieval / RAG]
+  B --> P[Prompt Build]
+  B --> M[Model Inference]
+  B --> X[Postprocess: citations + trace footer]
+
+  B --> T1[REQ_SEEN ledger\n(hourly JSONL)]
+  B --> T2[Trace events\n(hash chain JSONL)]
+  B --> T3[Signed summary\n(output hash + head attestation)]
+
+  T1 --> H[Signed Merkle head\n(per hour)]
+  T2 --> S[Trace head\n(event_hash)]
+
+  H --> BUNDLE[Evidence bundle\n(tar.gz + manifest)]
+  S --> BUNDLE
+  T3 --> BUNDLE
+
+  BUNDLE --> REG[IF.TTT registry\n(PQ-hybrid anchor)]
+  BUNDLE --> MIRROR[Public static mirror\n/ static / hosted]
+```
+
+**Interpretation:** integrity starts at the backend witness boundary. Completeness (REQ_SEEN) is only meaningful at and after that boundary until edge witnessing is implemented.
+
+---
+
+## 1) What This Protocol Actually Guarantees
+
+This system does not try to “prove the model is true.” That is not a meaningful claim for probabilistic generation.
+
+This system proves something narrower and more valuable:
+
+1) what the system received (as a commitment),
+2) what the system did (trace event chain),
+3) what the system returned (output hash),
+4) what evidence it claims to have used (retrieval IDs + citation handles),
+5) that the resulting artifacts are tamper-evident and portable for external review.
+
+If a claim cannot be bound to an artifact, it does not exist.
+
+---
+
+## 2) The Trace ID Contract (Non-Negotiable)
+
+Every request to `/api/chat/completions` receives a Trace ID. This includes denials.
+
+**Surfaces:**
+
+- **Header:** `X-IF-Emotion-Trace: `
+- **Header:** `X-IF-Emotion-Trace-Sig: ` (app-level integrity)
+- **User output:** final line `Trace: `
+
+The Trace ID is the support ticket, the incident handle, and the audit join key.
+
+---
+
+## 3) Completeness: REQ_SEEN Witness Ledger (And Its Real Boundary)
+
+Integrity alone is easy. Completeness is where systems lie.
+
+### What REQ_SEEN does (v2.x)
+
+REQ_SEEN records every request attempt that reaches the backend witness boundary as a privacy-preserving commitment:
+
+- `user_text_sha256`, `user_len`, decision/reason, and `leaf_hash`
+
+It writes:
+
+- Hour ledger: `/opt/if-emotion/data/req_seen/.jsonl`
+- Signed Merkle head: `/opt/if-emotion/data/req_seen/heads/.json`
+
+### The boundary (explicit)
+
+REQ_SEEN completeness is only valid for requests that reach the backend process. Requests blocked before the backend are out of scope until the witness is moved to the edge proxy.
+
+This is not a weakness in wording. It is a hard boundary condition.
+
+---
+
+## 4) Merkle Proofs: Roots Are Not Enough
+
+A signed Merkle root helps, but roots alone do not give efficient proofs.
+
+v2.x adds inclusion proofs for REQ_SEEN:
+
+- A specific trace can be proven to exist in an hourly ledger with an O(log n) Merkle path.
+- The proof is generated from the ledger and verified against the signed head.
+
+Verification tooling is provided via `iftrace.py` (see Section 10).
+
+---
+
+## 5) Trace Events: Hash Chain + Immediate Head Attestation
+
+Trace events are stored as a hash chain in:
+
+- `/opt/if-emotion/data/trace_events.jsonl`
+
+Each event includes:
+
+- `prev_hash` pointer
+- `event_hash` computed as `sha256(prev_hash || canonical_json(event_without_event_hash))`
+
+This detects deletion or modification of interior events. It does not prevent a malicious deployment from not emitting events. That is addressed as a limitation and a roadmap item (Section 11).
+
+The trace head is also attested in the signed completion record with an app-level Ed25519 signature so integrity can be verified immediately.
+
+---
+
+## 6) Canonicalization: What We Hash Must Be Stable
+
+Cryptographic systems die by “almost the same bytes.”
+
+v2.x mandates canonical JSON bytes for hashing/signing:
+
+- Primary: `canonicaljson.encode_canonical_json(obj)`
+- Fallback: stable JSON serialization (`sort_keys`, fixed separators, UTF-8)
+
+If two environments hash different bytes for “the same object,” you have no protocol.
+
+---
+
+## 7) Key Management (POC-Grade Today, Audit-Grade Tomorrow)
+
+### Current state
+
+- App Ed25519 signing key:
+  - Private: `/opt/if-emotion/data/trace_ed25519.key` (0600)
+  - Public: `/opt/if-emotion/data/trace_ed25519.pub` (shipped in bundles)
+  - Key ID: `ed25519-app-v1`
+
+The key is generated by libsodium-backed primitives and stored on disk with file permissions. This is acceptable for a POC, not for external certification.
+
+### Rotation and compromise (required discipline)
+
+- If the key is compromised: rotate immediately, bump `key_id`, and mark all traces from that time window as `trust_tier=degraded` unless independently anchored.
+- Old signatures remain verifiable with the historic public keys; key history must be preserved.
+
+### Certification path
+
+- Move keys to HSM/TPM or threshold signing.
+- Bind deploy attestations (image digest + config hash) to IF.TTT.
+
+---
+
+## 8) Post-Quantum: What Is PQ Today (And What Isn’t)
+
+### What is PQ-anchored today
+
+Evidence bundles are PQ-hybrid signed when registered into IF.TTT. The IF.TTT registry record includes:
+
+- `pq_status: hybrid-fips204`
+- `pq_algo: ML-DSA-87`
+
+This is the PQ anchoring layer.
+
+### What is not PQ today
+
+Hot-path app signatures (Ed25519) are not post-quantum. That is a conscious trade:
+
+- Ed25519 provides immediate integrity at low latency.
+- PQ signing occurs at registration time in IF.TTT.
+
+The correct claim is “PQ-anchored at registry time,” not “PQ everywhere.”
+
+---
+
+## 9) IF.story: Readability Without Evidence Drift
+
+IF.story is a deterministic narrative projection of `trace_events.jsonl`.
+
+It is not evidence. It is an index.
+
+Each IF.story line includes the `event_hash` anchor, and auditors should verify those anchors against the raw JSONL.
+
+---
+
+## 10) Verifier Tooling (Independent Checks, Not Operator Vibes)
+
+The bundle is designed to be verified with a single command and then deep-audited selectively.
+
+Verifier:
+
+- `iftrace.py verify --expected-sha256 `
+
+Merkle inclusion proof (REQ_SEEN):
+
+- `iftrace.py prove-inclusion --ledger --head --trace-id `
+- `iftrace.py verify-inclusion `
+
+Checksum rules (important):
+
+- `sha256s.txt` intentionally excludes itself and `manifest.json` to avoid self-referential checksum traps.
+
+---
+
+## 11) Threat Model and Limitations (Explicit)
+
+### A) Truncation and external anchoring
+
+Hash chains detect edits. They do not prevent truncation unless head hashes are anchored externally or independently cached.
+
+Current mitigation:
+
+- IF.TTT registration anchors the tarball hash into a separate chain.
+
+Remaining requirement for certification:
+
+- scheduled external anchoring of IF.TTT head hashes to a public append-only log.
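Reviewer sketch (not part of the added file): the truncation limitation in 11A can be demonstrated with a small hash chain built from the Section 5 formula. This is a minimal sketch, assuming the Section 6 fallback serialization (`sort_keys`, fixed separators, UTF-8), a hex-string `prev_hash` concatenated as UTF-8 bytes, and an all-zero genesis value. The names `GENESIS`, `append_event`, and `verify_chain` are hypothetical and do not come from `iftrace.py`.

```python
import hashlib
import json

GENESIS = "0" * 64  # ASSUMPTION: prev_hash used by the first event


def canonical(obj) -> bytes:
    """Fallback canonical JSON: sorted keys, fixed separators, UTF-8."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def event_hash(prev_hash: str, event: dict) -> str:
    """sha256(prev_hash || canonical_json(event_without_event_hash))."""
    body = {k: v for k, v in event.items() if k != "event_hash"}
    return hashlib.sha256(prev_hash.encode("utf-8") + canonical(body)).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Link a new event to the current head and append it."""
    prev = chain[-1]["event_hash"] if chain else GENESIS
    event = dict(payload, prev_hash=prev)
    event["event_hash"] = event_hash(prev, event)
    chain.append(event)


def verify_chain(chain: list, anchored_head=None) -> bool:
    """Check every link; optionally require the head to match an anchor."""
    prev = GENESIS
    for event in chain:
        if event.get("prev_hash") != prev:
            return False
        if event.get("event_hash") != event_hash(prev, event):
            return False
        prev = event["event_hash"]
    return anchored_head is None or prev == anchored_head


if __name__ == "__main__":
    chain = []
    for n in range(3):
        append_event(chain, {"type": "step", "n": n})
    head = chain[-1]["event_hash"]

    truncated = chain[:-1]
    print(verify_chain(truncated))        # internally consistent chain
    print(verify_chain(truncated, head))  # anchored head exposes the cut
```

A truncated chain is still a valid chain on its own terms; only a head hash held somewhere the operator cannot rewrite turns the missing tail into a detectable mismatch.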
+
+### B) Clock integrity
+
+Timestamps are derived from system clocks and are not trusted for cryptographic time.
+
+Ordering is guaranteed by hash chain indices and hash pointers, not by wall-clock truth.
+
+Certification path:
+
+- introduce time witnesses or external timestamping for head hashes.
+
+### C) Code integrity
+
+Hash chains detect post-hoc tampering. They do not prevent a modified binary from choosing not to record.
+
+Certification path:
+
+- signed deploy attestations (image digest + config hash) bound into IF.TTT
+- optional remote attestation
+
+---
+
+## 12) Reference Proof Run (v2.1) — Public Transport Verified
+
+Trace ID:
+
+- `016cca78-6f9d-4ffe-aec0-99792d383ca1`
+
+Preferred download URL (static mirror for external reviewers):
+
+- `https://infrafabric.io/static/hosted/emo_trace_payload_016cca78-6f9d-4ffe-aec0-99792d383ca1.tar.gz`
+
+Static mirror directory (for discovery if a link is mistyped):
+
+- `https://infrafabric.io/static/hosted/`
+
+Alternate download URL (Forgejo raw; may intermittently return `415 Unsupported Media Type` depending on client/proxy behavior):
+
+- `https://git.infrafabric.io/danny/hosted/raw/branch/main/emo_trace_payload_016cca78-6f9d-4ffe-aec0-99792d383ca1.tar.gz`
+
+Tarball SHA256:
+
+- `7101ff9c38fc759a66157f6a6ab9c0936af547d0ec77a51b5d05db07069966c8`
+
+Transport-level verification (what is now publicly provable):
+
+- HTTP status: `200`
+- `Content-Type`: `application/gzip`
+- SHA256 matches the claim above (download integrity)
+
+One-line verifier (shell):
+
+- `curl -fsSL https://infrafabric.io/static/hosted/emo_trace_payload_016cca78-6f9d-4ffe-aec0-99792d383ca1.tar.gz | sha256sum`
+
+Expected output:
+
+- `7101ff9c38fc759a66157f6a6ab9c0936af547d0ec77a51b5d05db07069966c8 -`
+
+If you see `415 Unsupported Media Type` while using the static mirror URL, treat it as a client/proxy copy-paste error and retry with a literal paste from this document. The static mirror should return `200` or `404`, not `415`.
+
+IF.TTT citation handle for the tarball (PQ hybrid signed):
+
+- `if://citation/c24fe95e-226c-4efc-ba22-5ddcc37ff7d2/v1`
+
diff --git a/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md.sha256 b/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md.sha256
new file mode 100644
index 0000000..bb3af33
--- /dev/null
+++ b/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md.sha256
@@ -0,0 +1 @@
+d87fbef381ee2e71dca4451d0eeb0c569728cf33f3e9089740589da9983a8cd3 /root/tmp/hosted_repo_update/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.4_STYLED.md