diff --git a/DANNY_STOCKER_INFRAFABRIC_DOSSIER.md b/DANNY_STOCKER_INFRAFABRIC_DOSSIER.md
index 8c0af0c..7d5b475 100644
--- a/DANNY_STOCKER_INFRAFABRIC_DOSSIER.md
+++ b/DANNY_STOCKER_INFRAFABRIC_DOSSIER.md
@@ -73,6 +73,12 @@ Live user diagnostics pages remain OAuth‑protected for privacy. For external r
 - Verifier (single file): https://infrafabric.io/static/hosted/iftrace.py
 - Model-tier invariants (Dave-proof run, 15 traces): https://infrafabric.io/static/hosted/EMO_DAVE_PROOF_MODEL_COMPARE_20251222T164352Z.md
 
+**Not proven yet (explicit gaps this portfolio does not hide):**
+- Clinical efficacy / outcomes (requires clinician-led study design + evaluation)
+- Completeness beyond the backend witness boundary (edge-level witnessing)
+- Key custody + rotation + compromise response (audit-grade key management)
+- Public append-only transparency log / external anchoring SLOs
+
 The evidence index links per‑trace pages and the underlying downloadable bundles + SHA256 sidecars.
 
 **Example traces (public bundles):**
@@ -148,6 +154,8 @@ When either filter changes the final user-visible output, the trace records `bef
 - It proves the *stack* can enforce specific invariants (language + formatting) across these model tiers for these prompts, with auditable corrections when needed.
 - It does not prove the models are equivalent on clinical judgment, crisis handling, or long‑horizon reasoning. Those require separate validation and are intentionally not claimed here.
 
+**Early-stage signal (bounded):** for these enforced invariants, we observed no tier-dependent failures across the three tested model tiers in this run.
+
 **Economic implication (bounded claim):** once these invariants are enforceable by the stack, model choice becomes a routing problem (default smaller, escalate when TRIAGE demands). Any claimed cost multipliers depend on provider pricing and are not asserted here.
 
 ---
diff --git a/DANNY_STOCKER_INFRAFABRIC_DOSSIER_DATA_DRIVEN_EDITION_FULL.md b/DANNY_STOCKER_INFRAFABRIC_DOSSIER_DATA_DRIVEN_EDITION_FULL.md
index 2c2f13f..142ff84 100644
--- a/DANNY_STOCKER_INFRAFABRIC_DOSSIER_DATA_DRIVEN_EDITION_FULL.md
+++ b/DANNY_STOCKER_INFRAFABRIC_DOSSIER_DATA_DRIVEN_EDITION_FULL.md
@@ -83,6 +83,12 @@ Live user diagnostics pages remain OAuth‑protected for privacy. For external r
 - Verifier (single file): https://infrafabric.io/static/hosted/iftrace.py
 - Model-tier invariants (Dave-proof run, 15 traces): https://infrafabric.io/static/hosted/EMO_DAVE_PROOF_MODEL_COMPARE_20251222T164352Z.md
 
+**Not proven yet (explicit gaps this portfolio does not hide):**
+- Clinical efficacy / outcomes (requires clinician-led study design + evaluation)
+- Completeness beyond the backend witness boundary (edge-level witnessing)
+- Key custody + rotation + compromise response (audit-grade key management)
+- Public append-only transparency log / external anchoring SLOs
+
 The evidence index links per‑trace pages and the underlying downloadable bundles + SHA256 sidecars.
 
 **Example traces (public bundles):**
@@ -158,6 +164,8 @@ When either filter changes the final user-visible output, the trace records `bef
 - It proves the *stack* can enforce specific invariants (language + formatting) across these model tiers for these prompts, with auditable corrections when needed.
 - It does not prove the models are equivalent on clinical judgment, crisis handling, or long‑horizon reasoning. Those require separate validation and are intentionally not claimed here.
 
+**Early-stage signal (bounded):** for these enforced invariants, we observed no tier-dependent failures across the three tested model tiers in this run.
+
 **Economic implication (bounded claim):** once these invariants are enforceable by the stack, model choice becomes a routing problem (default smaller, escalate when TRIAGE demands). Any claimed cost multipliers depend on provider pricing and are not asserted here.
 
 ---
diff --git a/DANNY_STOCKER_INFRAFABRIC_DOSSIER_SUBMISSION_EDITION_FULL.md b/DANNY_STOCKER_INFRAFABRIC_DOSSIER_SUBMISSION_EDITION_FULL.md
index 979cdaf..caff93c 100644
--- a/DANNY_STOCKER_INFRAFABRIC_DOSSIER_SUBMISSION_EDITION_FULL.md
+++ b/DANNY_STOCKER_INFRAFABRIC_DOSSIER_SUBMISSION_EDITION_FULL.md
@@ -83,6 +83,12 @@ Live user diagnostics pages remain OAuth‑protected for privacy. For external r
 - Verifier (single file): https://infrafabric.io/static/hosted/iftrace.py
 - Model-tier invariants (Dave-proof run, 15 traces): https://infrafabric.io/static/hosted/EMO_DAVE_PROOF_MODEL_COMPARE_20251222T164352Z.md
 
+**Not proven yet (explicit gaps this portfolio does not hide):**
+- Clinical efficacy / outcomes (requires clinician-led study design + evaluation)
+- Completeness beyond the backend witness boundary (edge-level witnessing)
+- Key custody + rotation + compromise response (audit-grade key management)
+- Public append-only transparency log / external anchoring SLOs
+
 The evidence index links per‑trace pages and the underlying downloadable bundles + SHA256 sidecars.
 
 **Example traces (public bundles):**
@@ -158,6 +164,8 @@ When either filter changes the final user-visible output, the trace records `bef
 - It proves the *stack* can enforce specific invariants (language + formatting) across these model tiers for these prompts, with auditable corrections when needed.
 - It does not prove the models are equivalent on clinical judgment, crisis handling, or long‑horizon reasoning. Those require separate validation and are intentionally not claimed here.
 
+**Early-stage signal (bounded):** for these enforced invariants, we observed no tier-dependent failures across the three tested model tiers in this run.
+
 **Economic implication (bounded claim):** once these invariants are enforceable by the stack, model choice becomes a routing problem (default smaller, escalate when TRIAGE demands). Any claimed cost multipliers depend on provider pricing and are not asserted here.
 
 ---