# Week Feedback Summary (LLM Panel) — 2025-12-27

Source: internal CSV export (`@ShadowRT-LLM-Feedback`)

This is a synthesis of cross-model feedback (Grok, Gemini 1.5 Pro/Flash, GPT-5.2) over the **Mon–Sun TV-week stress test** packs. It is intended to drive patches to the generator + bible without widening scope.

## Themes (cross-day)

- **P0: Ensure every dossier has usable “body” sections** (some HTML→MD sources collapsed into “cover + inferred mermaids only”, losing mirror integrity and Action Pack utility).
- **P0: Control Card / header hygiene**: extracted headings sometimes become paragraph-length; this breaks scannability and Jira/backlog export.
- **P0: Edition isolation**: Action Pack logic can “bleed” across domains (e.g., SaaS controls reused for hardware tokens) unless gates/owners/evidence are domain-aware.
- **P1: Mirror payload completeness**: tables/licensing tiers and high-signal numeric claims should be preserved and turned into enforceable questions/gates, not summarized away.
- **P1: Operational concreteness**: “telemetry” and “machine-checkable prerequisites” land well, but reviewers want minimum schemas (event type, freshness window, owner) to reduce hand-waving.
- **P2: Prioritization**: add lightweight severity ranking so “all Dave Factors” don’t read equally critical.

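
The “minimum schema” reviewers ask for under operational concreteness could look roughly like the sketch below. It is illustrative only: the `TelemetryRequirement` class, its field names, the `is_fresh` helper, and the sample values are all hypothetical and are not taken from the generator.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TelemetryRequirement:
    """Hypothetical minimum schema for one telemetry prerequisite."""

    event_type: str              # illustrative name, e.g. "auth.login.failed"
    freshness_window: timedelta  # maximum acceptable age of the newest event
    owner: str                   # team or role accountable for the signal

    def is_fresh(self, last_seen: datetime, now: datetime) -> bool:
        """Return True if the newest event falls inside the freshness window."""
        return now - last_seen <= self.freshness_window


# Assumed example values, chosen only to exercise the check.
req = TelemetryRequirement(
    event_type="auth.login.failed",
    freshness_window=timedelta(minutes=15),
    owner="secops",
)
now = datetime(2025, 12, 27, 12, 0, tzinfo=timezone.utc)
print(req.is_fresh(now - timedelta(minutes=5), now))  # True
print(req.is_fresh(now - timedelta(hours=2), now))    # False
```

Keeping the three fields mandatory means a prerequisite without an owner or a freshness window cannot be constructed at all, which is one way to make it “machine-checkable”.
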
## Day-specific P0s (from structured reviewer notes)

- **MON (Enterprise / Microsoft Defender page mirror)**: missing Action Pack and missing Dave blocks; licensing tier/table not mirrored; turn “3 minute” claims into enforceable gates.
- **TUE (Cloud / Aqua SaaS)**: paragraph blobs leaked into Control Card titles; add hard character limits and summarization.
- **WED (Endpoint / SentinelOne)**: headings conflated with descriptions; enforce short headings; critique the “AI analyst” as black-box evidence.
- **THU (COMSEC-ish / YubiKey FIPS brief)**: control logic looked SaaS-shaped; require hardware lifecycle / chain-of-custody controls.
- **FRI (Startup / Torq page mirror)**: Action Pack dropout; require stronger scrutiny when sources claim autonomy/agentic behavior.
- **SAT (Recap)**: ensure recap output includes a “what to steal” meta action pack (policy templates).
- **SUN (Deep dive / NIST SP 800-207 mirror)**: reduce abstractness by translating prose into “policy-as-code” style gates.

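
One possible reading of “turn ‘3 minute’ claims into enforceable gates” is sketched below: find simple duration claims in the mirrored text and emit a gate question for each. The function name `claims_to_gates`, the regex, and the gate wording are assumptions made for illustration, not the generator's behavior.

```python
import re

# Matches simple duration claims such as "3 minute", "15 minutes", "2-hour".
_DURATION = re.compile(r"\b(\d+)[\s-]*(second|minute|hour|day)s?\b", re.IGNORECASE)


def claims_to_gates(text: str) -> list[str]:
    """Turn each duration claim in `text` into a gate question (hypothetical format)."""
    gates = []
    for match in _DURATION.finditer(text):
        amount, unit = match.group(1), match.group(2).lower()
        gates.append(
            f"Gate: what evidence shows this completes within {amount} {unit}(s), "
            "and who owns the measurement?"
        )
    return gates


# Made-up sentence standing in for a mirrored source claim.
sample = "The page claims a response within 3 minutes."
print(claims_to_gates(sample))
```

A real extractor would need to handle ranges, percentages, and spelled-out numbers; this only shows the claim-to-question shape.
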
## Implemented fixes (generator + lint)

Implemented in `re-voice/src/revoice/generate.py` and `re-voice/src/revoice/lint.py`:

- **Robust section extraction fallback** for HTML→MD / weakly structured sources:
  - Markdown heading parsing fallback.
  - Last-resort “cover + body” shape, so `sections[1:]` is never empty.
- **Action Pack title hygiene**:
  - New `_compact_title()` used for Control Card headings and backlog items to avoid paragraph-length titles.
- **Hardware-aware gating**:
  - New Action Pack gate: `Hardware / identity` with owner/stop condition/evidence artifacts when the source contains FIPS/PIV/FIDO + token/hardware cues.
- **Lint exemption for Action Pack boilerplate**:
  - Ignore repeated `- Acceptance:` lines so Action Pack backlog doesn’t fail `_lint_repeated_lines`.

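
The first two fixes can be pictured with the sketch below. It is not the code in `generate.py`: the names `compact_title` and `extract_sections`, the 80-character limit, and the dict shape of a section are assumptions chosen to illustrate a heading-parsing pass with a last-resort “cover + body” split.

```python
import re

_HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def compact_title(text: str, max_len: int = 80) -> str:
    """Collapse whitespace, keep the first sentence, and cap the length."""
    text = " ".join(text.split())
    first = re.split(r"(?<=[.!?])\s", text, maxsplit=1)[0]
    if len(first) <= max_len:
        return first
    # Cut at a word boundary and mark the truncation.
    return first[: max_len - 3].rsplit(" ", 1)[0] + "..."


def extract_sections(markdown: str) -> list[dict]:
    """Split on Markdown headings; fall back to a 'cover + body' shape."""
    sections, current = [], {"title": "Cover", "body": []}
    for line in markdown.splitlines():
        m = _HEADING.match(line)
        if m:
            sections.append(current)
            current = {"title": compact_title(m.group(2)), "body": []}
        else:
            current["body"].append(line)
    sections.append(current)
    if len(sections) == 1:
        # No headings found: first non-empty line is the cover, the rest is the body.
        lines = [ln for ln in markdown.splitlines() if ln.strip()]
        sections = [
            {"title": "Cover", "body": lines[:1]},
            {"title": "Body", "body": lines[1:]},
        ]
    return sections
```

With this shape, a source with no headings still yields two sections, so anything iterating over `sections[1:]` always has at least one entry to work on.
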
## Remaining backlog (proposed next patches)

- Add **recap_mode** to generate a meta “What to steal” action pack from Mon–Fri without requiring the source to include it.
- Add **government_standard_mode** translation table (standard prose → gates/owners/evidence), with explicit tagging as operationalization (not new source claims).
- Add **high-signal table retention** rule to the extractor for common PDF table layouts (licensing tiers, side-by-side comparisons).
- Add **lightweight severity ranking** (P0/P1/P2 per section) without changing mirror order.
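
For the severity-ranking item, one way to rank sections without touching mirror order is to build a separate index and leave the section list alone. The sketch below assumes each section is a dict with a `title` and an optional `severity` key; that shape, the `severity_index` name, and the P2 default are hypothetical.

```python
SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2}


def severity_index(sections: list[dict]) -> list[tuple[str, str]]:
    """Return (severity, title) pairs sorted by severity; `sections` is not reordered."""
    ranked = sorted(
        sections,
        key=lambda s: SEVERITY_ORDER.get(s.get("severity", "P2"), 2),
    )
    return [(s.get("severity", "P2"), s["title"]) for s in ranked]


# Illustrative sections in mirror order; the titles are made up.
sections = [
    {"title": "Licensing", "severity": "P1"},
    {"title": "Identity", "severity": "P0"},
    {"title": "Overview"},  # no severity given, treated as P2
]
print(severity_index(sections))
```

Because `sorted` is stable and returns a new list, sections with equal severity keep their mirror order in the index and the original list is unchanged.
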