re-voice/docs/FEEDBACK_WEEK_2025-12-27.md


Week Feedback Summary (LLM Panel) — 2025-12-27

Source: internal CSV export (@ShadowRT-LLM-Feedback)

This is a synthesis of cross-model feedback (Grok, Gemini 1.5 Pro/Flash, GPT-5.2) on the Mon-Sun TV-week stress test packs. It is intended to drive patches to the generator and the bible without widening scope.

Themes (cross-day)

  • P0: Ensure every dossier has usable “body” sections (some HTML→MD sources collapsed into “cover + inferred mermaids only”, losing mirror integrity and Action Pack utility).
  • P0: Control Card / header hygiene: extracted headings sometimes become paragraph-length; this breaks scanability and Jira/backlog export.
  • P0: Edition isolation: Action Pack logic can “bleed” across domains (e.g., SaaS controls reused for hardware tokens) unless gates/owners/evidence are domain-aware.
  • P1: Mirror payload completeness: tables/licensing tiers and high-signal numeric claims should be preserved and turned into enforceable questions/gates, not summarized away.
  • P1: Operational concreteness: “telemetry” and “machine-checkable prerequisites” land well, but reviewers want minimum schemas (event type, freshness window, owner) to reduce hand-waving.
  • P2: Prioritization: add lightweight severity ranking so “all Dave Factors” don't read as equally critical.
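
The “minimum schema” reviewers asked for under operational concreteness could be as small as three fields. The sketch below is illustrative only: the class name `TelemetryPrerequisite`, the method `is_fresh`, and the example event name are assumptions, not part of the generator.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TelemetryPrerequisite:
    """Hypothetical minimum schema for one machine-checkable prerequisite."""

    event_type: str              # illustrative, e.g. "edr.agent.heartbeat"
    freshness_window: timedelta  # how recent the latest event must be
    owner: str                   # team or role accountable for the signal

    def is_fresh(self, last_seen: datetime, now: datetime) -> bool:
        """Return True when the latest event falls inside the freshness window."""
        return now - last_seen <= self.freshness_window


prereq = TelemetryPrerequisite(
    event_type="edr.agent.heartbeat",
    freshness_window=timedelta(minutes=15),
    owner="secops",
)
now = datetime(2025, 12, 27, 12, 0, tzinfo=timezone.utc)
print(prereq.is_fresh(now - timedelta(minutes=10), now))
```

Keeping the owner on the same record as the freshness window means a stale signal always has someone to route to.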

Day-specific P0s (from structured reviewer notes)

  • MON (Enterprise / Microsoft Defender page mirror): missing Action Pack and missing Dave blocks; licensing tier/table not mirrored; turn “3 minute” claims into enforceable gates.
  • TUE (Cloud / Aqua SaaS): paragraph blobs leaked into Control Card titles; add hard character limits and summarization.
  • WED (Endpoint / SentinelOne): headings conflated with descriptions; enforce short headings; critique “AI analyst” as black-box evidence.
  • THU (COMSEC-ish / YubiKey FIPS brief): control logic looked SaaS-shaped; require hardware lifecycle / chain-of-custody controls.
  • FRI (Startup / Torq page mirror): Action Pack dropout; require stronger scrutiny when sources claim autonomy/agentic behavior.
  • SAT (Recap): ensure recap output includes a “what to steal” meta action pack (policy templates).
  • SUN (Deep dive / NIST SP 800-207 mirror): reduce abstractness by translating prose into “policy-as-code” style gates.
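
One way the THU requirement could be met is by selecting a gate from cues in the source text rather than reusing SaaS-shaped controls. This is a rough sketch and not the generator's code: the `Gate` fields, the regular expressions, and the owner, stop-condition, and evidence strings are all placeholder assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gate:
    """Hypothetical Action Pack gate record."""

    name: str
    owner: str
    stop_condition: str
    evidence: Tuple[str, ...]


# Assumed cue patterns: a standards cue AND a device cue must both appear.
STANDARD_CUES = re.compile(r"\b(FIPS|PIV|FIDO2?)\b", re.IGNORECASE)
DEVICE_CUES = re.compile(r"\b(tokens?|hardware)\b", re.IGNORECASE)


def hardware_gate(source_text: str) -> Optional[Gate]:
    """Return a hardware/identity gate only when both cue families are present."""
    if STANDARD_CUES.search(source_text) and DEVICE_CUES.search(source_text):
        return Gate(
            name="Hardware / identity",
            owner="hardware-lifecycle-owner",  # placeholder owner
            stop_condition="No chain-of-custody record for issued tokens",
            evidence=("token inventory export", "chain-of-custody log"),
        )
    return None
```

Requiring both cue families is a deliberate choice here: a SaaS page that only mentions FIPS validation would not pick up hardware lifecycle controls.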

Implemented fixes (generator + lint)

Implemented in re-voice/src/revoice/generate.py and re-voice/src/revoice/lint.py:

  • Robust section extraction fallback for HTML→MD / weakly structured sources:
    • Markdown heading parsing fallback.
    • Last-resort “cover + body” shape, so sections[1:] is never empty.
  • Action Pack title hygiene:
    • New _compact_title() used for Control Card headings and backlog items to avoid paragraph-length titles.
  • Hardware-aware gating:
    • New Action Pack gate: Hardware / identity with owner/stop condition/evidence artifacts when the source contains FIPS/PIV/FIDO + token/hardware cues.
  • Lint exemption for Action Pack boilerplate:
    • Ignore repeated “- Acceptance:” lines so the Action Pack backlog doesn't fail _lint_repeated_lines.
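
The fixes above can be approximated as follows. This is a minimal sketch under stated assumptions, not the code in generate.py or lint.py: the function names `split_sections`, `compact_title`, and `repeated_lines`, the 80-character cap, and the repeat threshold of 3 are all hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple


def split_sections(markdown: str) -> List[Tuple[str, str]]:
    """Return (heading, body) pairs; index 0 is always the cover."""
    sections, heading, buf = [], "cover", []
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if match:
            sections.append((heading, "\n".join(buf).strip()))
            heading, buf = match.group(1), []
        else:
            buf.append(line)
    sections.append((heading, "\n".join(buf).strip()))
    if len(sections) == 1:
        # Last resort: no headings found, so force a "cover + body" shape.
        cover, _, body = sections[0][1].partition("\n\n")
        sections = [("cover", cover.strip()), ("body", body.strip() or cover.strip())]
    return sections


def compact_title(text: str, max_chars: int = 80) -> str:
    """Collapse whitespace, keep the first sentence, and cap the length."""
    flat = " ".join(text.split())
    first = re.split(r"(?<=[.!?])\s+", flat, maxsplit=1)[0]
    if len(first) <= max_chars:
        return first
    cut = first[: max_chars - 3].rsplit(" ", 1)[0]
    return cut.rstrip(" ,;:-") + "..."


def repeated_lines(
    text: str,
    min_repeats: int = 3,
    exempt_prefixes: Tuple[str, ...] = ("- Acceptance:",),
) -> List[str]:
    """Return lines repeated at least min_repeats times, skipping exempt boilerplate."""
    counts = Counter(
        stripped
        for stripped in (line.strip() for line in text.splitlines())
        if stripped and not stripped.startswith(exempt_prefixes)
    )
    return [line for line, n in counts.items() if n >= min_repeats]
```

With this shape, a headingless source still yields two sections, so `sections[1:]` has at least one entry to mirror.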

Remaining backlog (proposed next patches)

  • Add recap_mode to generate a meta “What to steal” action pack from Mon-Fri without requiring the source to include it.
  • Add government_standard_mode translation table (standard prose → gates/owners/evidence), with explicit tagging as operationalization (not new source claims).
  • Add high-signal table retention rule to the extractor for common PDF table layouts (licensing tiers, side-by-side comparisons).
  • Add lightweight severity ranking (P0/P1/P2 per section) without changing mirror order.
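
The severity-ranking item could be handled by tagging sections and building a separate sorted view, which leaves the mirror order alone. The sketch below is one possible shape: the `Section` type and `priority_view` function are illustrative names, not existing generator APIs.

```python
from dataclasses import dataclass
from typing import List

SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2}


@dataclass(frozen=True)
class Section:
    """Hypothetical mirrored section with a severity tag."""

    title: str
    severity: str  # one of "P0", "P1", "P2"


def priority_view(sections: List[Section]) -> List[Section]:
    """Return a severity-sorted copy; the input list keeps its mirror order."""
    # sorted() is stable, so sections within one tier stay in mirror order.
    return sorted(sections, key=lambda s: SEVERITY_ORDER[s.severity])


mirror = [Section("Licensing tiers", "P1"), Section("Action Pack", "P0"), Section("Recap", "P1")]
print([s.title for s in priority_view(mirror)])
```

Because the ranking is a derived view, the mirrored document itself never needs reordering.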