re-voice/docs/FEEDBACK_WEEK_2025-12-27.md

# Week Feedback Summary (LLM Panel) — 2025-12-27
Source: internal CSV export (`@ShadowRT-LLM-Feedback`)
This is a synthesis of cross-model feedback (Grok, Gemini 1.5 Pro/Flash, GPT-5.2) on the **Mon-Sun TV-week stress test** packs. It is intended to drive patches to the generator + bible without widening scope.
## Themes (cross-day)
- **P0: Ensure every dossier has usable “body” sections** (some HTML→MD sources collapsed into “cover + inferred mermaids only”, losing mirror integrity and Action Pack utility).
- **P0: Control Card / header hygiene**: extracted headings sometimes become paragraph-length; this breaks scanability and Jira/backlog export.
- **P0: Edition isolation**: Action Pack logic can “bleed” across domains (e.g., SaaS controls reused for hardware tokens) unless gates/owners/evidence are domain-aware.
- **P1: Mirror payload completeness**: tables/licensing tiers and high-signal numeric claims should be preserved and turned into enforceable questions/gates, not summarized away.
- **P1: Operational concreteness**: “telemetry” and “machine-checkable prerequisites” land well, but reviewers want minimum schemas (event type, freshness window, owner) to reduce hand-waving.
- **P2: Prioritization**: add lightweight severity ranking so “all Dave Factors” don't read as equally critical.
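
The "minimum schema" request in the P1 operational-concreteness theme could look roughly like the sketch below. It is illustrative only: the class name `TelemetryPrerequisite`, the method `is_satisfied`, the event name, and the 15-minute window are assumptions, not part of the generator.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TelemetryPrerequisite:
    """Hypothetical minimum schema for one machine-checkable prerequisite."""

    event_type: str              # illustrative name, e.g. "edr.agent.heartbeat"
    freshness_window: timedelta  # the newest event must be at most this old
    owner: str                   # team or role accountable for the signal

    def is_satisfied(self, last_seen: datetime, now: datetime) -> bool:
        """Return True if the newest event falls inside the freshness window."""
        age = now - last_seen
        return timedelta(0) <= age <= self.freshness_window


# Example values are made up for illustration.
prereq = TelemetryPrerequisite(
    event_type="edr.agent.heartbeat",
    freshness_window=timedelta(minutes=15),
    owner="SecOps",
)
now = datetime(2025, 12, 27, 12, 0, tzinfo=timezone.utc)
fresh = prereq.is_satisfied(now - timedelta(minutes=5), now)
stale = prereq.is_satisfied(now - timedelta(hours=2), now)
```

Keeping the three fields mandatory means a prerequisite cannot be written down without naming what is measured, how recent it must be, and who answers for it.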
## Day-specific P0s (from structured reviewer notes)
- **MON (Enterprise / Microsoft Defender page mirror)**: missing Action Pack and missing Dave blocks; licensing tier/table not mirrored; turn “3 minute” claims into enforceable gates.
- **TUE (Cloud / Aqua SaaS)**: paragraph blobs leaked into Control Card titles; add hard character limits and summarization.
- **WED (Endpoint / SentinelOne)**: headings conflated with descriptions; enforce short headings; critique “AI analyst” as black box evidence.
- **THU (COMSEC-ish / YubiKey FIPS brief)**: control logic looked SaaS-shaped; require hardware lifecycle / chain-of-custody controls.
- **FRI (Startup / Torq page mirror)**: Action Pack dropout; require stronger scrutiny when sources claim autonomy/agentic behavior.
- **SAT (Recap)**: ensure recap output includes a “what to steal” meta action pack (policy templates).
- **SUN (Deep dive / NIST SP 800-207 mirror)**: reduce abstractness by translating prose into “policy-as-code” style gates.
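
One possible way to turn duration claims such as “3 minute” into enforceable gates is sketched below. The names `DurationGate` and `gates_from_text`, the regex, and the sample sentence are all assumptions made for illustration; the generator may handle this differently.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches claims such as "3 minute", "3-minute", "15 minutes", "2 hours".
_DURATION_RE = re.compile(
    r"\b(\d+(?:\.\d+)?)[\s-]*(second|minute|hour|day)s?\b", re.IGNORECASE
)
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


@dataclass(frozen=True)
class DurationGate:
    """Hypothetical gate derived from a numeric duration claim in the source."""

    claim: str          # the matched text, kept verbatim for the mirror
    max_seconds: float  # threshold a measurement must not exceed

    def question(self) -> str:
        """Phrase the claim as a question a reviewer must answer with evidence."""
        return f"What evidence shows '{self.claim}' holds in our environment?"

    def passes(self, measured_seconds: float) -> bool:
        """Return True if the measured duration is within the claimed bound."""
        return measured_seconds <= self.max_seconds


def gates_from_text(text: str) -> List[DurationGate]:
    """Create one gate per duration claim found in the text."""
    gates = []
    for match in _DURATION_RE.finditer(text):
        value = float(match.group(1))
        unit = match.group(2).lower()
        gates.append(DurationGate(match.group(0), value * _UNIT_SECONDS[unit]))
    return gates


# Sample sentence is invented, not quoted from any reviewed source.
gates = gates_from_text("Threats are contained in 3 minutes.")
```

The gate keeps the claim text verbatim, so the mirror stays intact and the operationalized threshold is an addition rather than a rewrite of the source.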
## Implemented fixes (generator + lint)
Implemented in `re-voice/src/revoice/generate.py` and `re-voice/src/revoice/lint.py`:
- **Robust section extraction fallback** for HTML→MD / weakly structured sources:
  - Markdown heading parsing fallback.
  - Last-resort “cover + body” shape, so `sections[1:]` is never empty.
- **Action Pack title hygiene**:
  - New `_compact_title()` used for Control Card headings and backlog items to avoid paragraph-length titles.
- **Hardware-aware gating**:
  - New Action Pack gate: `Hardware / identity` with owner/stop condition/evidence artifacts when the source contains FIPS/PIV/FIDO + token/hardware cues.
- **Lint exemption for Action Pack boilerplate**:
  - Ignore repeated `- Acceptance:` lines so the Action Pack backlog doesn't fail `_lint_repeated_lines`.
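
For readers without the repo at hand, the sketch below shows roughly what three of these fixes might look like. It is not the code in `generate.py` or `lint.py`: the function names, the 80-character limit, the cue lists, and the repeat threshold are all assumptions.

```python
import re
from collections import Counter
from typing import List

MAX_TITLE_CHARS = 80  # assumed limit; the real value is not stated here


def compact_title(text: str, limit: int = MAX_TITLE_CHARS) -> str:
    """Collapse whitespace, keep the first sentence, truncate at a word boundary."""
    flat = " ".join(text.split())
    first = re.split(r"(?<=[.!?])\s+", flat, maxsplit=1)[0]
    if len(first) <= limit:
        return first
    cut = first[: limit - 3].rsplit(" ", 1)[0]
    return cut.rstrip(" ,;:.") + "..."


# Assumed cue lists: a standard cue AND a device cue must both appear.
_STANDARD_CUES = re.compile(r"\b(FIPS|PIV|FIDO2?)\b", re.IGNORECASE)
_DEVICE_CUES = re.compile(r"\b(token|hardware|security key|smart ?card)s?\b", re.IGNORECASE)


def needs_hardware_gate(source_text: str) -> bool:
    """Return True when the source looks like a hardware / identity brief."""
    has_standard = _STANDARD_CUES.search(source_text) is not None
    has_device = _DEVICE_CUES.search(source_text) is not None
    return has_standard and has_device


def repeated_lines(lines: List[str], min_repeats: int = 3) -> List[str]:
    """Report lines repeated too often, skipping Action Pack '- Acceptance:' boilerplate."""
    counts = Counter(
        line.strip()
        for line in lines
        if line.strip() and not line.strip().startswith("- Acceptance:")
    )
    return [line for line, n in counts.items() if n >= min_repeats]


long_heading = "Runtime protection. " + "This paragraph leaked into the title. " * 5
short = compact_title(long_heading)
flagged = repeated_lines(
    ["- Acceptance: evidence attached"] * 4 + ["Same filler sentence."] * 3 + ["unique"]
)
```

Requiring both a standard cue and a device cue is a deliberate choice in this sketch: a SaaS page that merely mentions FIPS compliance should not trigger the hardware gate.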
## Remaining backlog (proposed next patches)
- Add **recap_mode** to generate a meta “What to steal” action pack from Mon-Fri without requiring the source to include it.
- Add **government_standard_mode** translation table (standard prose → gates/owners/evidence), with explicit tagging as operationalization (not new source claims).
- Add **high-signal table retention** rule to the extractor for common PDF table layouts (licensing tiers, side-by-side comparisons).
- Add **lightweight severity ranking** (P0/P1/P2 per section) without changing mirror order.
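
As a starting point for the severity-ranking item, the sketch below tags sections in place and builds a separate triage view, so mirror order is untouched. Everything in it is hypothetical: the `Section` class, the keyword rules, and the default of P2 are placeholders, not a proposed final design.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Section:
    title: str
    severity: Optional[str] = None  # "P0", "P1", or "P2" once ranked


# Placeholder keyword rules, checked in order; the first match wins.
_RULES: Sequence[Tuple[str, Tuple[str, ...]]] = (
    ("P0", ("missing", "bleed", "dropout")),
    ("P1", ("table", "schema")),
)


def rank_sections(sections: List[Section], default: str = "P2") -> List[Section]:
    """Tag each section with a severity in place; list order is not changed."""
    for section in sections:
        lowered = section.title.lower()
        section.severity = next(
            (sev for sev, words in _RULES if any(w in lowered for w in words)),
            default,
        )
    return sections


def triage_view(sections: List[Section]) -> List[Section]:
    """Return a new list sorted by severity; sorted() is stable, so ties keep mirror order."""
    return sorted(sections, key=lambda s: s.severity or "P2")


sections = rank_sections(
    [Section("Licensing table"), Section("Missing Action Pack"), Section("Overview")]
)
mirror_order = [s.title for s in sections]
triage_order = [s.title for s in triage_view(sections)]
```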