Add v3.2 trace bundle + paper

root committed 2025-12-21 10:45:56 +00:00
parent 164bb8ec44
commit 272cd3723b
17 changed files with 858 additions and 0 deletions

View file

@@ -0,0 +1,196 @@
# IF.EMOTION TRACE PROTOCOL v3.2 — AUDITABLE DEBUGGING (WITHOUT WISHFUL THINKING)
**Alternate title:** Debugging Emotion — the Immutable Flight Recorder
**Subject:** End-to-end traceability, bounded completeness witnessing, and PQ-anchored evidence binding
**Protocol:** IF.TTT (Traceable, Transparent, Trustworthy)
**Version:** 3.2 (Methodology hardening: key separation + monotonic timing + correlation-only client trace)
**Date (UTC):** 2025-12-21
**Status:** AUDIT REQUIRED
**Citation:** `if://whitepaper/emotion/trace-protocol/v3.2`
---
## What this is (and why it matters)
If you run an LLM system in a high-liability environment, you eventually hit the moment where “the logs say” isn't enough. You need evidence you can hand to someone who does not trust you.
*This is not an observability feature. It's chain-of-custody.*
This protocol is a practical answer to one question:
Can an external reviewer independently verify what happened from request → retrieval → output, and detect tampering after the fact?
It intentionally separates what we can prove from what we cannot.
---
## Guarantees (and boundaries)
This system provides **integrity** guarantees (tamper-evidence) and **bounded completeness** guarantees (witnessing) within explicit boundaries.
- **Integrity:** the trace timeline is hash-chained; the signed summary binds the final output to a trace head.
- **Completeness (bounded):** a REQ_SEEN ledger witnesses each request that crosses the backend witness boundary, with a signed per-hour Merkle head.
- **PQ anchoring (bounded):** Post-quantum signatures apply at registry anchoring time (IF.TTT), not necessarily on every hot-path artifact.
One sentence boundary (non-negotiable):
Integrity starts at the backend witness boundary; completeness is only meaningful at and after that boundary until edge witnessing is cryptographically enforced.
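To make the integrity claim concrete, here is a minimal sketch of the chain check. It assumes each event carries a `prev_hash` link and that `event_hash` is the SHA256 of the event's canonical JSON with `event_hash` itself removed (mirroring `canonical_json_bytes` in `emo_trace_pack.py` below); the shipped verifier `iftrace.py` is authoritative for the real schema.
```python
import hashlib
import json

def check_chain(jsonl_path: str) -> bool:
    """Recompute every event_hash and confirm each event links to its predecessor."""
    prev = ""
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            ev = json.loads(line)["event"]
            claimed = ev.pop("event_hash", "")
            if ev.get("prev_hash", "") != prev:
                return False  # broken link: an event was removed or reordered
            body = json.dumps(ev, ensure_ascii=False, sort_keys=True, separators=(",", ":")).encode("utf-8")
            if hashlib.sha256(body).hexdigest() != claimed:
                return False  # tampered event body
            prev = claimed
    return True
```
The signed summary then binds the final `prev` value (the trace head) and the response hash under one signature, so this loop plus a single signature check covers the whole timeline.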
---
## Layered evidence stack (where guarantees live)
```mermaid
flowchart TB
U[User] -->|HTTPS| E[Edge]
E --> B[Backend Witness Boundary]
B --> R[Retrieval]
B --> P[Prompt]
B --> M[Model]
B --> X[Postprocess]
B --> T1["REQ_SEEN ledger<br/>(hourly JSONL)"]
B --> T2["Trace events<br/>(hash chain JSONL)"]
B --> T3["Signed summary<br/>(output hash + head attestation)"]
T1 --> H["Signed Merkle head<br/>(per hour)"]
T2 --> S["Trace head<br/>(event_hash)"]
H --> BUNDLE["Evidence bundle<br/>(tar.gz + manifest)"]
S --> BUNDLE
T3 --> BUNDLE
BUNDLE --> REG["Registry anchor<br/>(PQ-hybrid)"]
BUNDLE --> MIRROR["Static mirror<br/>(public download)"]
```
Interpretation:
Integrity starts at the backend witness boundary; completeness is only meaningful at and after that boundary until edge witnessing is cryptographically enforced.
---
## Evidence inventory (what ships)
| Artifact | File | Claim it supports | Verification |
|---|---|---|---|
| Evidence bundle | `emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz` | Portable reproduction | `sha256sum` + verifier |
| Manifest + checksums | `payload/manifest.json`, `payload/sha256s.txt` | “One check implies contents” | verifier validates per-file SHA256 |
| Trace event chain | `payload/trace_events.jsonl` | Tamper-evident event ordering | verifier recomputes event hashes |
| Signed summary | `payload/ttt_signed_record.json` | Binds response hash → trace head | verifier recomputes HMAC signature |
| REQ_SEEN ledger | `payload/req_seen_<hour>.jsonl` | Bounded completeness | verifier recomputes leaf hashes + Merkle root |
| REQ_SEEN head | `payload/req_seen_head_<hour>.json` | Signed Merkle head | verifier checks Ed25519 signature |
| Inclusion proof | `payload/req_seen_inclusion_proof.json` | Proves this trace is in the hour ledger | verifier checks Merkle path |
| IF.story annex | `payload/if_story.md` and external annex | Human-readable timeline | anchors must reference real `event_hash` |
| Registry corroboration | `*.ttt_chain_record.json` | PQ-anchored record (when available) | compare content hashes |
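The “one check implies contents” row cashes out as a short loop: once the tarball hash matches, every inner file can be checked against `payload/manifest.json`. A minimal sketch, using the manifest schema that `emo_trace_pack.py` (below) emits (`{"files": [{"path", "bytes", "sha256"}, ...]}`):
```python
import hashlib
import json
import tarfile

def check_bundle(tar_path: str) -> bool:
    """Validate every file listed in payload/manifest.json against its SHA256."""
    with tarfile.open(tar_path, "r:gz") as tf:
        manifest = json.loads(tf.extractfile("payload/manifest.json").read())
        for entry in manifest["files"]:
            try:
                member = tf.extractfile(f"payload/{entry['path']}")
            except KeyError:
                return False  # listed file missing from the tarball
            data = member.read() if member else b""
            if hashlib.sha256(data).hexdigest() != entry["sha256"]:
                return False  # manifest and contents disagree
    return True
```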
---
## Methodology hardening in v3.2 (the changes that close real audit gaps)
### HMAC key separation for REQ_SEEN (no mixed keys)
REQ_SEEN uses HMAC commitments only if `IF_REQ_SEEN_HMAC_KEY` is configured. It never reuses the signing secret used for the signed summary.
If `IF_REQ_SEEN_HMAC_KEY` is missing, REQ_SEEN downgrades to SHA256 commitments and the system must not claim “privacy-preserving HMAC commitments”.
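A minimal sketch of the downgrade rule; treating the raw request body as the commitment input is an assumption (the backend's canonicalization governs):
```python
import hashlib
import hmac
import os

def req_seen_commitment(body: bytes) -> tuple[str, str]:
    """Return (mode, hex_commitment) for a REQ_SEEN ledger entry.

    The HMAC key is dedicated: it is never the signed-summary secret. Without
    it, the commitment downgrades to plain SHA256 and must be labeled as such.
    """
    key = os.environ.get("IF_REQ_SEEN_HMAC_KEY")
    if key:
        return "hmac-sha256", hmac.new(key.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256", hashlib.sha256(body).hexdigest()
```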
### Correlation-only client trace IDs (collision discipline)
If a client provides `X-IF-Client-Trace`, it is treated as a correlation-only identifier.
The canonical trace ID is always server-generated and returned in `X-IF-Emotion-Trace`.
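A sketch of the assignment rule (header names as above; the length cap is an illustrative assumption):
```python
import uuid

def assign_trace_ids(headers: dict[str, str]) -> dict[str, str]:
    """Mint the canonical trace ID server-side; keep any client value as a
    correlation hint only, never as a key into storage or ledgers."""
    out = {"X-IF-Emotion-Trace": str(uuid.uuid4())}  # canonical, server-generated
    client = (headers.get("X-IF-Client-Trace") or "").strip()
    if client:
        out["client_trace_id"] = client[:64]  # correlation-only annotation
    return out
```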
### Monotonic timing fields (clock realism)
Each trace event includes:
- `ts_utc`: wall-clock timestamp (not trusted for crypto time)
- `mono_ns` / `mono_ms`: monotonic timing since trace start (stable ordering and performance attribution)
This does not solve time attestation, but it removes “clock drift” as an excuse for missing latency evidence.
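A sketch of how the two clocks coexist within one trace (field names as above):
```python
import time
from datetime import datetime, timezone

class TraceClock:
    """One instance per trace: wall clock for humans, monotonic clock for ordering."""

    def __init__(self) -> None:
        self._t0 = time.monotonic_ns()  # trace start on the monotonic clock

    def stamp(self) -> dict:
        mono_ns = time.monotonic_ns() - self._t0
        return {
            "ts_utc": datetime.now(timezone.utc).isoformat(),  # not trusted for crypto time
            "mono_ns": mono_ns,
            "mono_ms": mono_ns // 1_000_000,
        }
```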
### Inclusion proof is a first-class prior
The inclusion proof file is registered as a child artifact in IF.TTT. It is not optional.
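The proof file written by `emo_trace_pack.py` (below) records `leaf_hash`, `root`, and a sibling `path`; verifying it is a short fold whose result must equal the `merkle_root` in the signed hour head:
```python
import hashlib
import json

def verify_inclusion(proof_path: str) -> bool:
    """Recompute the Merkle root from the leaf and its sibling path."""
    with open(proof_path, encoding="utf-8") as f:
        proof = json.load(f)
    node = bytes.fromhex(proof["leaf_hash"])
    for step in proof["path"]:
        sib = bytes.fromhex(step["sibling"])
        pair = sib + node if step["side"] == "left" else node + sib
        node = hashlib.sha256(pair).digest()
    return node.hex() == proof["root"]
```
Check the recomputed root against `merkle_root` in `req_seen_head_<hour>.json` only after the head's Ed25519 signature verifies.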
---
## Reference proof run (v3.2)
Trace ID:
- `96700e8e-6a83-445e-86f7-06905c500146`
Evidence bundle:
- Static mirror (preferred): `https://infrafabric.io/static/hosted/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz`
- Forgejo raw (alternate): `https://git.infrafabric.io/danny/hosted/raw/branch/main/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz`
Tarball SHA256:
- `85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87`
IF.TTT tarball handle:
- `if://citation/2ec551ec-0a08-487d-a41d-4d068aa8ee2f/v1`
---
## Verification (external reviewer path)
Download + hash:
```bash
curl -fsSL -o emo.tar.gz 'https://infrafabric.io/static/hosted/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz'
sha256sum emo.tar.gz
# expected: 85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87
```
Run verifier:
```bash
python3 -m venv venv
./venv/bin/pip install canonicaljson pynacl
curl -fsSL -o iftrace.py 'https://infrafabric.io/static/hosted/iftrace.py'
./venv/bin/python iftrace.py verify emo.tar.gz --expected-sha256 85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87
```
REQ_SEEN inclusion proof check:
```bash
tar -xzf emo.tar.gz
./venv/bin/python iftrace.py verify-inclusion payload/req_seen_inclusion_proof.json
# expected: OK
```
---
## Narrative annex
IF.story is not evidence; it is a deterministic projection keyed by `event_hash`.
- Static mirror: `https://infrafabric.io/static/hosted/IF_EMOTION_TRACE_REFERENCE_96700e8e-6a83-445e-86f7-06905c500146_IF_STORY.md`
- Forgejo raw: `https://git.infrafabric.io/danny/hosted/raw/branch/main/IF_EMOTION_TRACE_REFERENCE_96700e8e-6a83-445e-86f7-06905c500146_IF_STORY.md`
---
## Limitations (still true)
- This proves what the system did and what bytes were served. It does not prove factual truth in the world.
- Completeness is bounded by the witness boundary; requests dropped before the backend boundary are out of scope until edge witnessing is cryptographically bound.
- Key management and time attestation remain the practical certification blockers (HSM/TPM, rotation ceremony, external timestamping).
---
## What to do next (tomorrow's work, not wishful thinking)
- Move REQ_SEEN witnessing to the true front door (edge) and sign the head there.
- Publish a deploy attestation record (code hash + image digest) into IF.TTT for every release.
- Add a clear anchoring SLO (maximum time from trace finalization → registry anchor) and enforce it.

View file

@@ -0,0 +1 @@
8e61cfd0353da980439d9e18aeb6d572d71eb58960ccf26dfdf279c453095835 /root/tmp/hosted_repo_update/IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v3.2_STYLED.md

View file

@@ -0,0 +1,23 @@
# IF.story — contextual narrative log (reference)
Trace: `96700e8e-6a83-445e-86f7-06905c500146`
Evidence bundle tarball:
- `https://infrafabric.io/static/hosted/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz`
- `https://git.infrafabric.io/danny/hosted/raw/branch/main/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz`
---
Trace: `96700e8e-6a83-445e-86f7-06905c500146`
Deterministic narrative projection of `trace_events.jsonl`. Each line includes the `event_hash` anchor.
- 2025-12-21T10:20:04Z (+0ms) | `request_commit` | Request body commitment; commit_ok=True client_trace_id=22222222-2222-4222-8222-222222222222 | event_hash=f924cb8cba0a6db4580009da023bd4eaeb376daaffa119619799f26f584358aa
- 2025-12-21T10:20:04Z (+1ms) | `req_seen` | REQ_SEEN witnessed; hour=20251221T10 count=4 merkle_root=fc96fce3d19583cbb4e11e4e0c4e717c4ce7d426697a5633286a9a446a146455 | event_hash=8f0c3568e59243519994ff76dad25def95e1014180fb8c5db7b3f86efb92f9f9
- 2025-12-21T10:20:04Z (+2ms) | `request_received` | Auth+quota succeeded; provider=codex model=gpt-5.2 stream=False user_len=47 auth_ms=3 | event_hash=f50db27625228b5293e1a2c14018bfd95377a12d211233f46fc4c85739f8f27d
- 2025-12-21T10:20:04Z (+3ms) | `guard_short_circuit` | IF.GUARD short-circuit; reasons=['self_harm_signal'] | event_hash=2c9eb30ff9fb12e19faecc9cd403c86d033bb76d7923d534ef08c37eb1bc217f
- 2025-12-21T10:20:04Z (+3ms) | `trace_finalizing` | Trace finalizing; ok=True provider=guard | event_hash=0022e0ce2050bc544bc38ff518aa465f505aad4c231bba4d7aabff19fcf459d9
Notes:
- Ground truth remains `trace_events.jsonl` + `ttt_signed_record.json`.
- REQ_SEEN ledger+head are included; public key is `trace_ed25519.pub`.

View file

@@ -0,0 +1 @@
4f86ad4c1ebf415b6ed1ee700748584af1380ec0d04f8e0350e1fe51f458720e /root/tmp/hosted_repo_update/IF_EMOTION_TRACE_REFERENCE_96700e8e-6a83-445e-86f7-06905c500146_IF_STORY.md

View file

@@ -29,6 +29,21 @@ Static hosted artifacts used in InfraFabric reviews.
- IF.TTT citation (PQ hybrid signed): `if://citation/c24fe95e-226c-4efc-ba22-5ddcc37ff7d2/v1`
- Notes: includes `payload/trace_ed25519.pub` + `payload/req_seen_inclusion_proof.json` + nested priors (`payload/ttt_children*.json`).
## emo-social trace payload (v3.2, methodology hardening demo)
- File: `emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz`
- SHA256: `85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87`
- IF.TTT citation (PQ hybrid signed): `if://citation/2ec551ec-0a08-487d-a41d-4d068aa8ee2f/v1`
- Notes: includes monotonic timing fields (`mono_ns`/`mono_ms`), correlation-only `X-IF-Client-Trace`, and registers `payload/req_seen_inclusion_proof.json` as a first-class prior.
## IF.emotion trace whitepaper (styled v3.2)
- File: `IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v3.2_STYLED.md`
## Trace bundler (operator tool)
- File: `emo_trace_pack.py` (builds an evidence bundle tarball from a trace id by pulling artifacts from `pct 220` + `pct 240`)
## IF.emotion trace whitepaper (styled v2.1)
- File: `IF_EMOTION_DEBUGGING_TRACE_WHITEPAPER_v2.1_STYLED.md`

View file

@@ -0,0 +1,35 @@
# Verify emo-social trace bundle (external)
Artifacts:
- Tarball: `emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz`
- SHA256: `85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87`
- IF.TTT handle (PQ hybrid signed in registry): `if://citation/2ec551ec-0a08-487d-a41d-4d068aa8ee2f/v1`
Download:
```bash
curl -fsSL -o emo.tar.gz 'https://infrafabric.io/static/hosted/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz'
# Alternate (Forgejo raw):
curl -fsSL -o emo.tar.gz 'https://git.infrafabric.io/danny/hosted/raw/branch/main/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz'
sha256sum emo.tar.gz
```
Run verifier:
```bash
python3 -m venv venv
./venv/bin/pip install canonicaljson pynacl
curl -fsSL -o iftrace.py 'https://infrafabric.io/static/hosted/iftrace.py'
./venv/bin/python iftrace.py verify emo.tar.gz --expected-sha256 85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87
```
Merkle inclusion proof demo (REQ_SEEN completeness):
```bash
mkdir -p payload && tar -xzf emo.tar.gz -C .
./venv/bin/python iftrace.py verify-inclusion payload/req_seen_inclusion_proof.json
```
IF.TTT corroboration note:
- The `if://citation/...` handle is an internal registry identifier.
- For external review without registry access, use the published chain record:
- `emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.ttt_chain_record.json`
- `emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.ttt_chain_ref.json`
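A minimal cross-check of the published records, assuming the chain record carries the `id` and `content_hash` fields that `emo_trace_pack.py` reads back from the registry:
```python
import json

base = "emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146"
with open(f"{base}.ttt_chain_ref.json", encoding="utf-8") as f:
    ref = json.load(f)
with open(f"{base}.ttt_chain_record.json", encoding="utf-8") as f:
    rec = json.load(f)
assert rec.get("id") == ref["citation_id"], "record does not match the published handle"
assert rec.get("content_hash") == ref["content_hash"], "content hash drift between ref and record"
print("chain ref and chain record agree:", ref["content_hash"])
```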

View file

@@ -0,0 +1 @@
48a58a3729409dcc66da57cbd428b611ca50c29f03dc3b8b411a96acefdba76d /root/tmp/hosted_repo_update/VERIFY_EMO_TRACE_96700e8e-6a83-445e-86f7-06905c500146.md

emo_trace_pack.py Normal file (547 lines)
View file

@@ -0,0 +1,547 @@
#!/usr/bin/env python3
"""
Build an IF.emotion "evidence bundle" tarball for a trace ID.
This is an operator tool that runs on the Proxmox host and pulls artifacts from:
- pct 220 (emo-social / if.emotion backend)
- pct 240 (IF.TTT registry)
Outputs:
/root/tmp/emo-trace-package-<trace_id>/
payload/...
emo_trace_payload_<trace_id>.tar.gz
payload_tar_sha256.txt
ttt_tarball_audit_entry.json
ttt_tarball_chain_record.json
ttt_tarball_chain_ref.json
"""
from __future__ import annotations
import argparse
import hashlib
import json
import os
import subprocess
import tarfile
import textwrap
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
def utc_now_iso() -> str:
return datetime.now(timezone.utc).isoformat()
def sha256_bytes(data: bytes) -> str:
return hashlib.sha256(data or b"").hexdigest()
def sha256_file(path: Path) -> str:
h = hashlib.sha256()
with path.open("rb") as f:
for chunk in iter(lambda: f.read(1024 * 1024), b""):
h.update(chunk)
return h.hexdigest()
def canonical_json_bytes(obj: Any) -> bytes:
return json.dumps(obj, ensure_ascii=False, sort_keys=True, separators=(",", ":")).encode("utf-8")
def merkle_root_hex(leaves_hex: list[str]) -> str:
    # Binary Merkle tree over SHA256 leaves; an odd level duplicates its last node.
    if not leaves_hex:
        return sha256_bytes(b"")
    level: list[bytes] = [bytes.fromhex(h) for h in leaves_hex if isinstance(h, str) and len(h) == 64]
    if not level:
        return sha256_bytes(b"")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        nxt: list[bytes] = []
        for i in range(0, len(level), 2):
            nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        level = nxt
    return level[0].hex()
def merkle_inclusion_proof(leaves_hex: list[str], index: int) -> dict:
    # Sibling path for leaves_hex[index]; "side" records where the sibling sits
    # relative to the node being proven, so a verifier knows the hash order.
    if index < 0 or index >= len(leaves_hex):
        raise ValueError("index out of range")
    level: list[bytes] = [bytes.fromhex(h) for h in leaves_hex]
    proof: list[dict] = []
    idx = index
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling_idx = idx ^ 1
        sibling = level[sibling_idx]
        side = "left" if sibling_idx < idx else "right"
        proof.append({"sibling": sibling.hex(), "side": side})
        nxt: list[bytes] = []
        for i in range(0, len(level), 2):
            nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
        level = nxt
        idx //= 2
    root = level[0].hex()
    return {"index": index, "root": root, "path": proof}
def run(cmd: list[str], *, stdin: bytes | None = None, timeout_s: int = 120) -> bytes:
p = subprocess.run(
cmd,
input=stdin,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
timeout=timeout_s,
check=False,
)
if p.returncode != 0:
raise RuntimeError(f"cmd failed ({p.returncode}): {' '.join(cmd)}\n{p.stderr.decode('utf-8',errors='ignore')}")
return p.stdout
def pct_exec(pct: int, bash_cmd: str, *, stdin: bytes | None = None, timeout_s: int = 120) -> bytes:
return run(["pct", "exec", str(pct), "--", "bash", "-lc", bash_cmd], stdin=stdin, timeout_s=timeout_s)
def pct_cat(pct: int, path: str, *, timeout_s: int = 120) -> bytes:
return pct_exec(pct, f"cat {shlex_quote(path)}", timeout_s=timeout_s)
def shlex_quote(s: str) -> str:
# Minimal, safe shell quoting for paths.
return "'" + (s or "").replace("'", "'\"'\"'") + "'"
def write_json(path: Path, obj: Any) -> None:
path.write_text(json.dumps(obj, ensure_ascii=False, sort_keys=True, indent=2) + "\n", encoding="utf-8")
def write_text(path: Path, text: str) -> None:
path.write_text(text, encoding="utf-8")
def fetch_api_json(*, trace_id: str, endpoint: str, email: str) -> Any:
raw = pct_exec(
220,
f"curl -fsSL -H {shlex_quote('X-Auth-Request-Email: ' + email)} http://127.0.0.1:5000{endpoint}",
timeout_s=60,
)
return json.loads(raw.decode("utf-8", errors="ignore") or "{}")
def extract_ttt_signed_record(*, trace_id: str) -> dict:
script = textwrap.dedent(
f"""
python3 - <<'PY'
import json
tid = {trace_id!r}
path = "/opt/if-emotion/data/ttt_signed_log.jsonl"
out = None
try:
with open(path, "r", encoding="utf-8", errors="ignore") as f:
for line in f:
line = line.strip()
if not line:
continue
try:
rec = json.loads(line)
except Exception:
continue
ev = rec.get("event") or {{}}
if isinstance(ev, dict) and str(ev.get("trace_id") or "").strip() == tid:
out = rec
except Exception:
out = None
print(json.dumps(out or {{}}, ensure_ascii=False, sort_keys=True))
PY
"""
).strip()
raw = pct_exec(220, script, timeout_s=60)
return json.loads(raw.decode("utf-8", errors="ignore") or "{}")
def resolve_ttt_records_by_id(record_ids: list[str]) -> list[dict]:
payload = json.dumps({"ids": record_ids}, ensure_ascii=False).encode("utf-8")
py = """
import json
import sys
import importlib.util
import contextlib
req = json.loads(sys.stdin.read() or "{}")
ids = req.get("ids") or []
spec = importlib.util.spec_from_file_location("ttt_registry_mod", "/opt/ttt-registry/ttt_registry.py")
mod = importlib.util.module_from_spec(spec)
with contextlib.redirect_stdout(sys.stderr):
spec.loader.exec_module(mod) # type: ignore
reg = mod.TTTRegistry()
out = []
for rid in ids:
rid = str(rid or "").strip()
if not rid:
continue
h = reg.redis.get(f"ttt:index:id:{rid}")
if not h:
continue
try:
out.append(reg.get(h))
except Exception:
continue
print(json.dumps(out, ensure_ascii=False, sort_keys=True))
""".strip()
raw = pct_exec(
240,
f"OQS_INSTALL_PATH=/opt/ttt-registry/_oqs /opt/ttt-registry/venv/bin/python -c {shlex_quote(py)}",
stdin=payload,
timeout_s=180,
)
try:
data = json.loads(raw.decode("utf-8", errors="ignore") or "[]")
except Exception:
return []
return data if isinstance(data, list) else [data]
def write_audit_entries(entries: list[dict]) -> None:
payload = json.dumps({"entries": entries}, ensure_ascii=False).encode("utf-8")
py = """
import json
import sys
import re
import uuid
import redis
from pathlib import Path
req = json.loads(sys.stdin.read() or "{}")
entries = req.get("entries") or []
cfg = Path("/etc/redis/ttt.conf").read_text(encoding="utf-8", errors="ignore")
m = re.search(r"^requirepass\\s+(\\S+)", cfg, flags=re.M)
password = m.group(1) if m else None
r = redis.Redis(host="localhost", port=6380, password=password, decode_responses=True)
written = 0
for e in entries:
cid = str(e.get("citation_id") or "").strip()
parts = cid.split("/")
uid = (parts[3] if len(parts) > 3 else "").strip()
try:
uuid.UUID(uid)
except Exception:
continue
r.set(f"audit:entry:{uid}", json.dumps(e, ensure_ascii=False, sort_keys=True))
written += 1
print(json.dumps({"ok": True, "written": written}, ensure_ascii=False))
""".strip()
_ = pct_exec(240, f"/opt/ttt-registry/venv/bin/python -c {shlex_quote(py)}", stdin=payload, timeout_s=60)
def ttt_import_audit() -> dict:
raw = pct_exec(
240,
"OQS_INSTALL_PATH=/opt/ttt-registry/_oqs /opt/ttt-registry/venv/bin/python /opt/ttt-registry/ttt_registry.py import-audit",
timeout_s=300,
)
txt = raw.decode("utf-8", errors="ignore") or ""
# The registry prints capability banners before JSON; best-effort parse the last JSON object.
for chunk in reversed([c.strip() for c in txt.splitlines() if c.strip()]):
if chunk.startswith("{") and chunk.endswith("}"):
try:
return json.loads(chunk)
except Exception:
break
return {"raw": txt.strip()}
def build_story(trace_id: str, events: list[dict]) -> str:
lines = [
"# IF.story — contextual narrative log",
"",
f"Trace: `{trace_id}`",
"",
"Deterministic narrative projection of `trace_events.jsonl`. Each line includes the `event_hash` anchor.",
"",
]
for ev in sorted(events, key=lambda e: int(e.get("idx") or 0)):
ts = str(ev.get("ts_utc") or "")
et = str(ev.get("type") or "")
mono_ms = int(ev.get("mono_ms") or 0)
data = ev.get("data") if isinstance(ev.get("data"), dict) else {}
h = str(ev.get("event_hash") or "")
summary = ""
if et == "request_commit":
summary = f"Request body commitment; commit_ok={bool(data.get('commit_ok'))} client_trace_id={data.get('client_trace_id') or ''}".strip()
elif et == "req_seen":
summary = f"REQ_SEEN witnessed; hour={data.get('hour_utc')} count={data.get('count')} merkle_root={data.get('merkle_root')}"
elif et == "request_received":
summary = f"Auth+quota succeeded; provider={data.get('provider')} model={data.get('requested_model')} stream={data.get('stream')} user_len={data.get('user_len')} auth_ms={data.get('auth_ms')}"
elif et == "guard_short_circuit":
summary = f"IF.GUARD short-circuit; reasons={data.get('reasons')}"
elif et == "trace_finalizing":
summary = f"Trace finalizing; ok={data.get('ok')} provider={data.get('provider')}"
else:
# generic
keys = list(data.keys())[:6] if isinstance(data, dict) else []
summary = f"Event data keys={keys}"
lines.append(f"- {ts} (+{mono_ms}ms) | `{et}` | {summary} | event_hash={h}")
lines += [
"",
"Notes:",
"- Ground truth remains `trace_events.jsonl` + `ttt_signed_record.json`.",
"- REQ_SEEN ledger+head are included; public key is `trace_ed25519.pub`.",
"",
]
return "\n".join(lines)
def build_manifest(payload_dir: Path) -> tuple[dict, dict[str, str]]:
sha_map: dict[str, str] = {}
files = []
for p in sorted(payload_dir.iterdir(), key=lambda x: x.name):
if not p.is_file():
continue
data = p.read_bytes()
sha = sha256_bytes(data)
sha_map[p.name] = sha
files.append({"path": p.name, "bytes": len(data), "sha256": sha})
manifest = {"files": files}
return manifest, sha_map
def write_sha256s(payload_dir: Path, sha_map: dict[str, str]) -> None:
lines = []
for name in sorted(sha_map.keys()):
lines.append(f"{sha_map[name]} {name}")
(payload_dir / "sha256s.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
def tar_payload(workdir: Path, trace_id: str) -> Path:
tar_path = workdir / f"emo_trace_payload_{trace_id}.tar.gz"
with tarfile.open(tar_path, "w:gz") as tf:
tf.add(workdir / "payload", arcname="payload")
return tar_path
def main() -> int:
ap = argparse.ArgumentParser()
ap.add_argument("trace_id", help="Trace ID to package")
ap.add_argument("--email", default="ds@infrafabric.io", help="Trusted email for owner-gated endpoints")
ap.add_argument("--headers", default="", help="Path to captured HTTP response headers (optional)")
ap.add_argument("--response", default="", help="Path to captured HTTP response body (optional)")
ap.add_argument("--api-payload", default="", help="Path to captured request JSON (optional)")
ap.add_argument("--out-dir", default="", help="Output directory (default: /root/tmp/emo-trace-package-<trace_id>)")
args = ap.parse_args()
trace_id = str(args.trace_id).strip()
if not trace_id:
raise SystemExit("trace_id required")
out_dir = Path(args.out_dir or f"/root/tmp/emo-trace-package-{trace_id}").resolve()
payload_dir = out_dir / "payload"
payload_dir.mkdir(parents=True, exist_ok=True)
# Captured request/response artifacts (optional).
if args.headers:
write_text(payload_dir / "headers.txt", Path(args.headers).read_text(encoding="utf-8", errors="ignore"))
if args.response:
# Ensure JSON is stable.
raw = Path(args.response).read_text(encoding="utf-8", errors="ignore")
try:
obj = json.loads(raw)
write_json(payload_dir / "response.json", obj)
except Exception:
write_text(payload_dir / "response.json", raw)
if args.api_payload:
raw = Path(args.api_payload).read_text(encoding="utf-8", errors="ignore")
try:
obj = json.loads(raw)
write_json(payload_dir / "api_payload.json", obj)
except Exception:
write_text(payload_dir / "api_payload.json", raw)
# API snapshots (owner-gated).
api_trace = fetch_api_json(trace_id=trace_id, endpoint=f"/api/trace/{trace_id}", email=args.email)
write_json(payload_dir / "api_trace.json", api_trace)
api_events = fetch_api_json(trace_id=trace_id, endpoint=f"/api/trace/events/{trace_id}?limit=10000", email=args.email)
write_json(payload_dir / "api_events.json", api_events)
# Signed record from append-only log (ground truth).
ttt_rec = extract_ttt_signed_record(trace_id=trace_id)
if not ttt_rec:
raise SystemExit("ttt_signed_record not found for trace_id")
write_json(payload_dir / "ttt_signed_record.json", ttt_rec)
# Raw trace payload (ground truth).
payload_path = f"/opt/if-emotion/data/trace_payloads/{trace_id}.json"
trace_payload_raw = pct_exec(220, f"cat {shlex_quote(payload_path)}", timeout_s=60)
payload_dir.joinpath("trace_payload.json").write_bytes(trace_payload_raw)
# Trace events (canonical JSONL form).
events = api_events.get("events") if isinstance(api_events, dict) else None
if not isinstance(events, list):
raise SystemExit("api_events missing events[]")
trace_events_lines = []
for ev in sorted((e for e in events if isinstance(e, dict)), key=lambda e: int(e.get("idx") or 0)):
trace_events_lines.append(json.dumps({"event": ev}, ensure_ascii=False, sort_keys=True))
write_text(payload_dir / "trace_events.jsonl", "\n".join(trace_events_lines) + "\n")
# Story projection.
write_text(payload_dir / "if_story.md", build_story(trace_id, [e for e in events if isinstance(e, dict)]))
# Trace public key (Ed25519).
pub = pct_exec(220, "cat /opt/if-emotion/data/trace_ed25519.pub", timeout_s=30)
payload_dir.joinpath("trace_ed25519.pub").write_bytes(pub)
# REQ_SEEN hour ledger+head, derived from the trace events.
req_seen_ev = next((e for e in events if isinstance(e, dict) and e.get("type") == "req_seen"), None)
if not isinstance(req_seen_ev, dict):
raise SystemExit("trace has no req_seen event; cannot build completeness proof")
hour = str((req_seen_ev.get("data") or {}).get("hour_utc") or "").strip()
if not hour:
raise SystemExit("req_seen event missing hour_utc")
ledger_path = f"/opt/if-emotion/data/req_seen/{hour}.jsonl"
head_path = f"/opt/if-emotion/data/req_seen/heads/{hour}.json"
ledger_bytes = pct_exec(220, f"cat {shlex_quote(ledger_path)}", timeout_s=30)
head_bytes = pct_exec(220, f"cat {shlex_quote(head_path)}", timeout_s=30)
payload_dir.joinpath(f"req_seen_{hour}.jsonl").write_bytes(ledger_bytes)
payload_dir.joinpath(f"req_seen_head_{hour}.json").write_bytes(head_bytes)
# Inclusion proof for this trace_id in the hour ledger.
leaves: list[str] = []
idx_for_trace: int | None = None
leaf_for_trace: str = ""
for raw_line in ledger_bytes.splitlines():
if not raw_line.strip():
continue
try:
entry = json.loads(raw_line.decode("utf-8", errors="ignore"))
except Exception:
continue
lh = str(entry.get("leaf_hash") or "").strip()
if len(lh) != 64:
continue
leaves.append(lh)
if idx_for_trace is None and str(entry.get("trace_id") or "").strip() == trace_id:
idx_for_trace = len(leaves) - 1
leaf_for_trace = lh
if idx_for_trace is None:
raise SystemExit("trace_id not found in REQ_SEEN hour ledger")
proof = merkle_inclusion_proof(leaves, idx_for_trace)
proof["leaf_hash"] = leaf_for_trace
proof["hour_utc"] = hour
# Sanity: root must match head's merkle_root.
head_obj = json.loads(head_bytes.decode("utf-8", errors="ignore") or "{}")
if str(head_obj.get("merkle_root") or "") and proof["root"] != str(head_obj.get("merkle_root") or ""):
raise SystemExit("Merkle root mismatch (ledger != head)")
write_json(payload_dir / "req_seen_inclusion_proof.json", proof)
# Manifest + sha list.
manifest, sha_map = build_manifest(payload_dir)
write_json(payload_dir / "manifest.json", manifest)
write_sha256s(payload_dir, sha_map)
# Register child artifacts in IF.TTT (audit:entry -> import-audit -> signed records).
child_paths = [
"headers.txt",
"response.json",
"trace_payload.json",
"trace_events.jsonl",
"ttt_signed_record.json",
"api_trace.json",
"api_events.json",
"api_payload.json",
"if_story.md",
"trace_ed25519.pub",
f"req_seen_{hour}.jsonl",
f"req_seen_head_{hour}.json",
"req_seen_inclusion_proof.json",
]
children_pre = []
audit_entries = []
created_utc = utc_now_iso()
for name in child_paths:
p = payload_dir / name
if not p.exists():
continue
cid_uuid = str(uuid.uuid4())
citation_id = f"if://citation/{cid_uuid}/v1"
sha = sha256_file(p)
rel_path = f"payload/{name}"
children_pre.append({"citation_id": citation_id, "rel_path": rel_path, "sha256": sha})
audit_entries.append(
{
"citation_id": citation_id,
"claim": f"emo-social trace artifact {name} for trace_id={trace_id}",
"source_filename": rel_path,
"source_sha256": sha,
"verification_status": "source-sha256",
"ingested_at": created_utc,
}
)
write_json(payload_dir / "ttt_children_pre.json", {"trace_id": trace_id, "created_utc": created_utc, "children": children_pre})
write_audit_entries(audit_entries)
_ = ttt_import_audit()
# Resolve signed IF.TTT records for the children.
child_ids = [c["citation_id"] for c in children_pre]
chain_records = resolve_ttt_records_by_id(child_ids)
write_json(payload_dir / "ttt_children_chain_records.json", chain_records)
# Minimal index for the bundle.
rec_by_id = {r.get("id"): r for r in chain_records if isinstance(r, dict) and r.get("id")}
children = []
for c in children_pre:
rid = c["citation_id"]
rec = rec_by_id.get(rid) or {}
children.append(
{
"citation_id": rid,
"rel_path": c["rel_path"],
"sha256": c["sha256"],
"content_hash": rec.get("content_hash"),
"pq_status": rec.get("pq_status"),
}
)
write_json(payload_dir / "ttt_children.json", {"trace_id": trace_id, "children": children})
# Build tarball and register it in IF.TTT.
tar_path = tar_payload(out_dir, trace_id)
tar_sha = sha256_file(tar_path)
write_text(out_dir / "payload_tar_sha256.txt", f"{tar_sha} {tar_path}\n")
tar_uuid = str(uuid.uuid4())
tar_citation_id = f"if://citation/{tar_uuid}/v1"
tar_audit_entry = {
"citation_id": tar_citation_id,
"claim": f"emo-social trace payload tarball (bundle) for trace_id={trace_id}",
"source_filename": tar_path.name,
"source_sha256": tar_sha,
"verification_status": "source-sha256",
"ingested_at": utc_now_iso(),
"source_path": str(tar_path),
}
write_json(out_dir / "ttt_tarball_audit_entry.json", tar_audit_entry)
write_audit_entries([tar_audit_entry])
_ = ttt_import_audit()
tar_chain = resolve_ttt_records_by_id([tar_citation_id])
if not tar_chain:
raise SystemExit("Failed to resolve tarball chain record from IF.TTT")
tar_rec = tar_chain[0]
write_json(out_dir / "ttt_tarball_chain_record.json", tar_rec)
write_json(out_dir / "ttt_tarball_chain_ref.json", {"citation_id": tar_citation_id, "content_hash": tar_rec.get("content_hash")})
print(str(out_dir))
return 0
if __name__ == "__main__":
raise SystemExit(main())

emo_trace_pack.py.sha256 Normal file (1 line)
View file

@@ -0,0 +1 @@
635671faa2b056253e8e26469d04d87f4f597c7bca0815eff038fb2b1986b548 /root/tmp/hosted_repo_update/emo_trace_pack.py

View file

@@ -0,0 +1 @@
85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87 /root/tmp/hosted_repo_update/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz

View file

@@ -0,0 +1,9 @@
{
"citation_id": "if://citation/2ec551ec-0a08-487d-a41d-4d068aa8ee2f/v1",
"claim": "emo-social trace payload tarball (bundle) for trace_id=96700e8e-6a83-445e-86f7-06905c500146",
"ingested_at": "2025-12-21T10:39:38.222913+00:00",
"source_filename": "emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz",
"source_path": "/root/tmp/emo-trace-package-96700e8e-6a83-445e-86f7-06905c500146/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.tar.gz",
"source_sha256": "85eb323c8e5f11cf4dd18e612e8cde8dcdb355b3fbd6380bbc8d480a5bf97e87",
"verification_status": "source-sha256"
}

View file

@@ -0,0 +1 @@
7174cfba22651a19ad948a0df06d7dc3dc802acd2a6f2fc8364dbf311f178332 /root/tmp/hosted_repo_update/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.ttt_audit_entry.json

File diff suppressed because one or more lines are too long

View file

@@ -0,0 +1 @@
232722db9fe133aea3923565695324807cd7a74e81a6db2db360a18e9aa4181f /root/tmp/hosted_repo_update/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.ttt_chain_record.json

View file

@@ -0,0 +1,4 @@
{
"citation_id": "if://citation/2ec551ec-0a08-487d-a41d-4d068aa8ee2f/v1",
"content_hash": "2a19ce17e4f936ed3ed5b140d77410bf99d41ba22489920e1e4cdec5b7b3a76a"
}

View file

@@ -0,0 +1 @@
86efb752980e12b64a2c36c82d84bf3116cb3a48ef621e288691fa29a6902c63 /root/tmp/hosted_repo_update/emo_trace_payload_96700e8e-6a83-445e-86f7-06905c500146.ttt_chain_ref.json