Add explicit gaps next to Dave-proof link
parent 5fccd0ed09
commit fc5d6bbc4d
3 changed files with 24 additions and 0 deletions
@@ -73,6 +73,12 @@ Live user diagnostics pages remain OAuth‑protected for privacy. For external r
- Verifier (single file): https://infrafabric.io/static/hosted/iftrace.py
- Model-tier invariants (Dave-proof run, 15 traces): https://infrafabric.io/static/hosted/EMO_DAVE_PROOF_MODEL_COMPARE_20251222T164352Z.md

**Not proven yet (explicit gaps this portfolio does not hide):**
- Clinical efficacy / outcomes (requires clinician-led study design + evaluation)
- Completeness beyond the backend witness boundary (edge-level witnessing)
- Key custody + rotation + compromise response (audit-grade key management)
- Public append-only transparency log / external anchoring SLOs

The evidence index links per‑trace pages and the underlying downloadable bundles + SHA256 sidecars.

**Example traces (public bundles):**

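The hunk above mentions downloadable bundles with SHA256 sidecars. As a rough illustration of what checking one such pair could look like, here is a minimal sketch. It assumes the sidecar is a text file whose first whitespace-separated token is the hex digest (the `sha256sum` convention); the actual sidecar layout is not shown in this commit, and the function names are hypothetical. The linked `iftrace.py` remains the authoritative verifier.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(bundle: Path, sidecar: Path) -> bool:
    """Compare a bundle's digest with the one recorded in its sidecar.

    Assumption: the sidecar's first whitespace-separated token is the
    hex digest, as written by `sha256sum`. This is a guess at the format.
    """
    tokens = sidecar.read_text(encoding="utf-8").split()
    if not tokens:
        return False  # an empty sidecar cannot confirm anything
    return sha256_of_file(bundle) == tokens[0].lower()
```

A matching digest only shows that the downloaded bytes equal the bytes that were hashed; it says nothing about the gaps listed above, such as key custody or external anchoring.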
@@ -148,6 +154,8 @@ When either filter changes the final user-visible output, the trace records `bef
- It proves the *stack* can enforce specific invariants (language + formatting) across these model tiers for these prompts, with auditable corrections when needed.
- It does not prove the models are equivalent on clinical judgment, crisis handling, or long‑horizon reasoning. Those require separate validation and are intentionally not claimed here.

**Early-stage signal (bounded):** for these enforced invariants, we observed no tier-dependent failures across the three tested model tiers in this run.

**Economic implication (bounded claim):** once these invariants are enforceable by the stack, model choice becomes a routing problem (default smaller, escalate when TRIAGE demands). Any claimed cost multipliers depend on provider pricing and are not asserted here.

---

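The "routing problem" described in the hunk above (default to the smaller model, escalate when TRIAGE demands) can be sketched as follows. This is only an illustration under stated assumptions: the triage levels, the tier names, and the one-to-one mapping between them are hypothetical and do not come from the commit, which says only that three model tiers were tested.

```python
from enum import IntEnum


class Triage(IntEnum):
    """Hypothetical triage levels; the real TRIAGE signal is not shown here."""
    ROUTINE = 0
    ELEVATED = 1
    CRITICAL = 2


# Hypothetical tier names, ordered from smallest to largest.
TIERS = ("small", "medium", "large")


def choose_tier(triage: Triage) -> str:
    """Default to the smallest tier and escalate one tier per triage level."""
    index = min(int(triage), len(TIERS) - 1)  # clamp to the largest tier
    return TIERS[index]
```

With this mapping, `choose_tier(Triage.ROUTINE)` returns `"small"` and `choose_tier(Triage.CRITICAL)` returns `"large"`. Whether escalation should be this direct, or should also depend on the invariant corrections recorded in a trace, is a design question the commit leaves open.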
@@ -83,6 +83,12 @@ Live user diagnostics pages remain OAuth‑protected for privacy. For external r
- Verifier (single file): https://infrafabric.io/static/hosted/iftrace.py
- Model-tier invariants (Dave-proof run, 15 traces): https://infrafabric.io/static/hosted/EMO_DAVE_PROOF_MODEL_COMPARE_20251222T164352Z.md

**Not proven yet (explicit gaps this portfolio does not hide):**
- Clinical efficacy / outcomes (requires clinician-led study design + evaluation)
- Completeness beyond the backend witness boundary (edge-level witnessing)
- Key custody + rotation + compromise response (audit-grade key management)
- Public append-only transparency log / external anchoring SLOs

The evidence index links per‑trace pages and the underlying downloadable bundles + SHA256 sidecars.

**Example traces (public bundles):**

@@ -158,6 +164,8 @@ When either filter changes the final user-visible output, the trace records `bef
- It proves the *stack* can enforce specific invariants (language + formatting) across these model tiers for these prompts, with auditable corrections when needed.
- It does not prove the models are equivalent on clinical judgment, crisis handling, or long‑horizon reasoning. Those require separate validation and are intentionally not claimed here.

**Early-stage signal (bounded):** for these enforced invariants, we observed no tier-dependent failures across the three tested model tiers in this run.

**Economic implication (bounded claim):** once these invariants are enforceable by the stack, model choice becomes a routing problem (default smaller, escalate when TRIAGE demands). Any claimed cost multipliers depend on provider pricing and are not asserted here.

---

@@ -83,6 +83,12 @@ Live user diagnostics pages remain OAuth‑protected for privacy. For external r
- Verifier (single file): https://infrafabric.io/static/hosted/iftrace.py
- Model-tier invariants (Dave-proof run, 15 traces): https://infrafabric.io/static/hosted/EMO_DAVE_PROOF_MODEL_COMPARE_20251222T164352Z.md

**Not proven yet (explicit gaps this portfolio does not hide):**
- Clinical efficacy / outcomes (requires clinician-led study design + evaluation)
- Completeness beyond the backend witness boundary (edge-level witnessing)
- Key custody + rotation + compromise response (audit-grade key management)
- Public append-only transparency log / external anchoring SLOs

The evidence index links per‑trace pages and the underlying downloadable bundles + SHA256 sidecars.

**Example traces (public bundles):**

@@ -158,6 +164,8 @@ When either filter changes the final user-visible output, the trace records `bef
- It proves the *stack* can enforce specific invariants (language + formatting) across these model tiers for these prompts, with auditable corrections when needed.
- It does not prove the models are equivalent on clinical judgment, crisis handling, or long‑horizon reasoning. Those require separate validation and are intentionally not claimed here.

**Early-stage signal (bounded):** for these enforced invariants, we observed no tier-dependent failures across the three tested model tiers in this run.

**Economic implication (bounded claim):** once these invariants are enforceable by the stack, model choice becomes a routing problem (default smaller, escalate when TRIAGE demands). Any claimed cost multipliers depend on provider pricing and are not asserted here.

---