Initial re-voice scaffold + Dave example

This commit is contained in:
danny 2025-12-25 07:42:16 +00:00
commit 0a65d911f9
15 changed files with 850 additions and 0 deletions

14
.gitignore vendored Normal file
@@ -0,0 +1,14 @@
.DS_Store
.env
.env.*
.venv
__pycache__/
*.pyc
dist/
build/
.pytest_cache/
# Re-voice workspace
tmp/
*.log

44
README.md Normal file
@@ -0,0 +1,44 @@
# re-voice
`re-voice` turns “any document” into a **shadow dossier** by applying a versioned **style bible** (voice + structure + constraints) on top of extracted source text.
This repo is the home for:
- Style bibles (versioned, citable)
- A small extraction + dossier generator (CLI + optional API)
- Example dossiers for review
## Quick start (example)
Generate the Dave-style shadow dossier for the included PDF:
```bash
PYTHONPATH=src python3 -m revoice generate \
--style if.dave.v1 \
--input examples/ai-code-guardrails/AI-Code-Guardrails.pdf \
--output examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md
```
Or install the CLI locally:
```bash
python3 -m pip install -e .
revoice generate --style if.dave.v1 --input examples/ai-code-guardrails/AI-Code-Guardrails.pdf
```
## What “apply a style bible” means
A style bible is treated as an executable contract:
- **Structure:** required sections / scaffolding (e.g. the 9-element stack)
- **Voice constraints:** pronouns, tone, taboo phrases, vocabulary swaps
- **Formatting rules:** bullets, bold buzzwords, footers, etc.
- **Citations:** stable IDs like `if://bible/dave/v1.0` to make outputs auditable
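As a rough illustration only (the on-disk format is Markdown plus a metadata header, and none of these field names are final), a compiled bible can be pictured as a small, testable contract:
```python
# Illustrative sketch, not the repo's actual schema: what a compiled
# style bible might look like once loaded into memory.
DAVE_V1 = {
    "citation": "if://bible/dave/v1.0",                  # stable, citable ID
    "structure": ["Warm-Up", "Alignment", "Anchor"],     # first 3 of the 9-element stack
    "voice": {"forbidden_pronouns": ["I"], "tone": "relentlessly cheerful"},
    "formatting": {"emoji_per_paragraph": 1, "footer": "Standard Dave Footer"},
}
```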
## Proposed app shape (upload → shadow dossier)
See `docs/APP_SPEC.md`.
## Dev notes
`revoice` uses external tools for text extraction:
- `pdftotext`, `pdftoppm` (Poppler utils)
- `tesseract` (OCR fallback for image-only PDFs)

92
docs/APP_SPEC.md Normal file
@@ -0,0 +1,92 @@
# re-voice app proposal: “upload → shadow dossier”
## Product goal
Let a user upload **any document** (PDF/DOCX/MD/HTML/images) and receive a **shadow dossier** rendered through a chosen **style bible** (e.g. `if://bible/dave/v1.0`).
## Non-goals (v0)
- Perfect fidelity layout extraction (we only need usable text + key figures)
- Long-term storage/retention policies (we can stub, then harden)
## Architecture (thin UI, strong pipeline)
### 1) Ingest
- Upload endpoint: `POST /api/dossiers` (multipart)
- Compute and persist:
- `sha256` of original
- detected `mime`
- storage pointer (disk/S3/Forgejo blob)
- Create `Document` row: `{id, sha256, filename, mime, created_at, owner}`
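A minimal sketch of the ingest step, assuming the FastAPI/`python-multipart` stack from the `api` extra; the content-addressed blob directory and the ID scheme are placeholders, and the DB insert is omitted:
```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

from fastapi import FastAPI, UploadFile

app = FastAPI()
BLOB_DIR = Path("tmp/blobs")  # placeholder storage pointer target (disk/S3/Forgejo)

@app.post("/api/dossiers")
async def create_dossier(file: UploadFile) -> dict:
    data = await file.read()
    sha256 = hashlib.sha256(data).hexdigest()
    BLOB_DIR.mkdir(parents=True, exist_ok=True)
    (BLOB_DIR / sha256).write_bytes(data)  # content-addressed blob
    # Document row; a real implementation would persist this plus an owner field.
    return {
        "id": sha256[:12],                  # illustrative ID scheme
        "sha256": sha256,
        "filename": file.filename,
        "mime": file.content_type,          # client-reported; re-detect server-side
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```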
### 2) Extract → Canonicalize
Use a pluggable extractor chain:
- PDF:
1. `pdftotext` (fast path, text-layer PDFs)
2. OCR fallback (`pdftoppm` → `tesseract`) for image-only PDFs
- DOCX: `pandoc` or `python-docx`
- HTML: `readability`-style boilerplate removal
- Images: OCR (`tesseract`) with basic deskew
Output a canonical block model (enables better prompting + citations):
```json
{
"doc_id": "…",
"blocks": [
{"type":"heading","level":1,"text":"…"},
{"type":"paragraph","text":"…"},
{"type":"list","items":["…","…"]}
]
}
```
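A sketch of how plain extracted text could be folded into that block model; a real extractor chain would branch per MIME type and detect structure more carefully, but the output shape is the point:
```python
def canonicalize(doc_id: str, text: str) -> dict:
    """Naive text-to-blocks converter (sketch; heading/list detection is deliberately loose)."""
    blocks: list[dict] = []
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        lines = para.splitlines()
        if para.startswith("#"):
            level = len(para) - len(para.lstrip("#"))
            blocks.append({"type": "heading", "level": level, "text": para.lstrip("# ")})
        elif all(line.lstrip().startswith(("-", "*")) for line in lines):
            blocks.append({"type": "list", "items": [line.lstrip(" -*") for line in lines]})
        else:
            blocks.append({"type": "paragraph", "text": para})
    return {"doc_id": doc_id, "blocks": blocks}
```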
### 3) Style bible compiler
Store bibles in-repo as Markdown + a small metadata header (id, version, citation, hard rules).
Compile the bible into:
- `system_prompt` (voice + forbidden/required constraints)
- `template` (required dossier structure)
- `lint_rules` (post-checks: emojis/paragraph, pronouns, required footer, etc.)
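One possible shape for the compiler output, sketched below; the parsing is intentionally crude (it only pulls the `Citation` line the in-repo Dave bible already carries) and the lint-rule keys are placeholders:
```python
import re
from dataclasses import dataclass, field

@dataclass
class CompiledBible:
    bible_id: str       # e.g. "if://bible/dave/v1.0"
    system_prompt: str  # voice + forbidden/required constraints
    template: str       # required dossier structure
    lint_rules: dict = field(default_factory=dict)

def compile_bible(bible_md: str) -> CompiledBible:
    # Sketch: use the whole bible as the system prompt and read the citation
    # from its "**Citation:** `...`" line; a real compiler would also derive
    # the section template from the bible's scaffolding table.
    match = re.search(r"\*\*Citation:\*\*\s*`([^`]+)`", bible_md)
    return CompiledBible(
        bible_id=match.group(1) if match else "unknown",
        system_prompt=bible_md,
        template="",  # placeholder
        lint_rules={"required_footer": "Standard Dave Footer", "emoji_per_paragraph": 1},
    )
```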
### 4) Generate
Two-step generation is safer and more controllable:
1. **Content distillation** (extract doc facts → structured notes)
2. **Style application** (render notes into dossier template under bible constraints)
Recommended runtime:
- OpenAI-compatible Chat Completions backend (Juakali / OpenWebUI stack)
- Persist `{model, prompts, output_sha256}` for auditability
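A sketch of the two-step call against an OpenAI-compatible backend using `httpx` from the `llm` extra; the base URL, model name, and `REVOICE_LLM_*` environment variables are placeholders, not settled configuration:
```python
import hashlib
import os

import httpx

BASE_URL = os.environ.get("REVOICE_LLM_BASE_URL", "http://localhost:8080/v1")  # placeholder
MODEL = os.environ.get("REVOICE_LLM_MODEL", "some-model")                      # placeholder

def _chat(system: str, user: str) -> str:
    resp = httpx.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ.get('REVOICE_LLM_API_KEY', '')}"},
        json={"model": MODEL, "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ]},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def generate(system_prompt: str, template: str, doc_text: str) -> dict:
    # Step 1: distill the source into structured notes (no style applied yet).
    notes = _chat("Summarize the document as terse factual notes.", doc_text)
    # Step 2: render the notes into the dossier template under the bible constraints.
    dossier = _chat(system_prompt, f"Template:\n{template}\n\nNotes:\n{notes}")
    # Keep what we need for the audit trail.
    return {"model": MODEL,
            "output_sha256": hashlib.sha256(dossier.encode("utf-8")).hexdigest(),
            "dossier": dossier}
```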
### 5) Validate (style linter)
Run a deterministic linter per bible:
- hard constraints (e.g., “emoji per paragraph” for Dave)
- vocabulary swaps (optional)
- required footer/disclaimer
- “no secrets” scan (best-effort)
If lint fails: auto-repair pass (LLM) or return “needs revision” with lint report.
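A sketch of the lint-then-repair loop, reusing the repo's `lint_markdown`; the single-repair-pass policy and the `repair` callable (e.g. one LLM pass) are assumptions:
```python
from typing import Callable

from revoice.lint import lint_markdown

def validate(style_id: str, dossier: str,
             repair: Callable[[str, list[str]], str]) -> tuple[str, list[str]]:
    issues = lint_markdown(style_id=style_id, markdown=dossier)
    if not issues:
        return dossier, []
    revised = repair(dossier, issues)  # one auto-repair attempt (LLM or rule-based)
    remaining = lint_markdown(style_id=style_id, markdown=revised)
    # Non-empty `remaining` means "needs revision": return it as the lint report.
    return revised, remaining
```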
### 6) Export + publishing
Outputs:
- Markdown (primary)
- PDF via existing Forgejo PDF export (`.../raw/...&format=pdf`) by committing generated Markdown to a repo
Publishing strategy:
- Store outputs in a Forgejo repo (per team/project)
- Provide immutable links to `{sha}` + `.sha256` sidecars
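As a sketch, the Markdown export can be written alongside a `.sha256` sidecar in the same `sha256sum`-style format as the one already checked in next to the example PDF; the Forgejo commit itself is out of scope here:
```python
import hashlib
from pathlib import Path

def export_markdown(dossier_md: str, out_path: str) -> str:
    out = Path(out_path)
    out.write_text(dossier_md, encoding="utf-8")
    digest = hashlib.sha256(dossier_md.encode("utf-8")).hexdigest()
    # Sidecar: "<hash>  <filename>", matching sha256sum output.
    Path(f"{out}.sha256").write_text(f"{digest}  {out.name}\n", encoding="utf-8")
    return digest
```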
## Security + operational considerations
- Run extraction/OCR in a sandboxed worker (CPU/mem/time limits).
- Never store API keys in repos; use env/secret manager.
- Keep an audit trail: source hash → extracted text hash → output hash → model/prompt hashes.
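For the sandboxed worker, one minimal approach (Unix-only, limits illustrative) is to run each external tool under a wall-clock timeout plus CPU and address-space limits:
```python
import resource
import subprocess

def _limits() -> None:
    # Runs in the child just before exec; values here are illustrative defaults.
    resource.setrlimit(resource.RLIMIT_CPU, (60, 60))       # CPU seconds
    resource.setrlimit(resource.RLIMIT_AS, (2**30, 2**30))  # ~1 GiB address space

def run_sandboxed(cmd: list[str]) -> subprocess.CompletedProcess[str]:
    return subprocess.run(
        cmd,
        check=True,
        capture_output=True,
        text=True,
        timeout=120,         # wall-clock limit
        preexec_fn=_limits,  # resource limits for pdftotext / pdftoppm / tesseract
    )
```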

@@ -0,0 +1,194 @@
===== page-1 =====
AI CODE
GUARDRAILS:
A PRACTICAL GUIDE FOR
SECURE ROLLOUT
===== page-2 =====
Tools like GitHub Copilot and Google Gemini Code Assist help teams
generate code at scale, reduce boilerplate, and speed up delivery,
resulting in unprecedented boosts in productivity. But with greater
speed comes greater security risk. Studies show that 27% of AI-
generated code contains vulnerabilities, reflecting volume and
velocity, not tool failure.
To manage that risk without losing momentum, organizations need to
implement security guardrails and checks and controls that prevent
AI-generated code from introducing vulnerabilities into production.
This guide offers a practical framework to help engineering leaders
and security teams roll out AI assistants safely and scalably, using
Snyk's platform to help reinforce AI governance policies. From pull
request checks to IDE scanning and conditional access policies, each
section outlines real implementation tactics you can adopt today to
start building your AI-readiness, without compromising developer
productivity.
===== page-3 =====
Why it matters: Pull requests are a natural place to catch AI-generated vulnerabilities before they reach production.
Before fully rolling out AI coding assistants, it's important to ensure your development process includes automated
security checks. These guardrails help prevent risky code from being merged into your main branch, and pull requests
are the most logical place to start.
With Snyk's Pull Request (PR) checks, you can scan every code change as it's submitted, flagging issues early and
integrating security into the review process without disrupting workflows.
You can also use the Snyk CLI in your CI/CD process as a second checkpoint for more mature pipelines. This layered
approach helps maintain consistency across teams and deployment paths.
Catching issues here is a meaningful win, but it often comes after code has been written, reviewed, and maybe even
tested. Fixing those issues can create additional overhead. That's why, in the next section, we'll look at how to move
these checks even earlier in the development lifecycle.
Why it matters: Catching security issues during development reduces rework and keeps developers focused on building,
not backtracking.
Since Snyk's earliest days, we've emphasized the importance of identifying vulnerabilities as early as possible, ideally while
the code is still being written. That philosophy remains especially important as teams begin using AI code assistants.
While pull request checks catch risky code before it's merged, they come after the work is done. By then, developers may
have already built functionality on top of insecure logic, so fixing a simple bug could require refactoring larger components.
Instead, we recommend extending your guardrails directly into the development environment. Using the Snyk IDE plugin,
developers can get real-time feedback as they code, catching vulnerabilities before the code ever leaves their editor.
For teams working in agentic environments, like Cursor or GitHub Copilot chat-based workflows, the same level of scanning
can be achieved using the Snyk local MCP server, which runs security checks in the background as code is generated.
Shifting left doesn't just improve security posture, it reduces friction for developers and accelerates delivery. And when
those guardrails feel like part of the flow, adoption becomes much easier, which is what we'll explore next.
===== page-4 =====
Why it matters: Verifying security setup at the start encourages responsible tool use and builds good security
habits early.
Before granting developers access to AI coding assistants, consider implementing a lightweight access
requirement: proof that local security testing is in place, preferably in the IDE, where issues can be identified and
fixed immediately.
One option is to ask developers to upload a screenshot showing that they have installed the Snyk security IDE
plugin and attest that they will proactively test their AI-assisted code locally.
For example, developers can upload a screenshot showing that the Snyk IDE plugin is installed and confirm that
they'll proactively test AI-generated code during development.
Teams working in agent-based environments (like Cursor or Copilot) can alternatively connect to the Snyk local
MCP server, which supports agent-driven workflows and scans AI output as it's created.
As a secondary layer, organizations can still use pull request checks to catch issues before merging. For even
greater efficiency, Snyk Agent Fix enables autonomous remediation by suggesting secure alternatives in context,
further streamlining the development experience.
Code Assistant Access Request Form
Complete this form to request access to an AI coding assistant. Include a screenshot
demonstrating that you have installed a Snyk IDE plugin to test code locally.
Upload a screenshot showing that the Snyk IDE plugin is installed for local testing *
Provide any additional context on the request
By submitting this form, I attest that I will only use the AI coding assistant in
conjunction with the Snyk IDE plugin.
Example evidence showing the installation of the Snyk security IDE plugin
===== page-5 =====
Why it matters: Visibility into tool usage helps ensure guardrails are working and that they are adopted where it
counts.
If AI coding tools are already used across your organization, it's not too late to implement secure practices.
Conduct periodic audits to identify any blind spots where developers may be using AI coding assistants without
local security checks.
Use Snyk's Developer IDE and CLI usage reports alongside your AI coding assistant's admin console to cross-
reference who's actively using assistants, and whether security tooling like the IDE plugin is also in place.
Gemini Access Report
John Smith    john.smith@snyk.io   2025-01-15   2025-04-15   2025-04-16 15:04:31.154
Jane Jones    jane.jones@snyk.io   2025-01-15   2025-02-22
Danial Hill   danial.hill@snyk.io  2025-02-14   2025-04-16
For a more scalable approach, Snyk Essentials provides centralized visibility into developer adoption of key
security tools, helping platform and security teams track IDE plugin usage, identify gaps (e.g. missed scans), and
monitor adoption trends over time.
A simple “trust but verify” model can go a long way. Some teams send automated reminders or light-touch
enforcement notices, letting developers know that their access may be paused if security tools are missing or
inactive.
===== page-6 =====
Why it matters: Developers are best positioned to prevent vulnerabilities introduced by AI-generated code, but
they can only do so if they understand the risks.
As AI tooling becomes part of everyday development, security training should evolve accordingly. Ensure that
developer onboarding and continuing education explicitly cover the risks of AI-generated code, and reinforce the
importance of local testing as a first line of defense.
Snyk Learn includes a targeted lesson on the OWASP Top 10 for LLM and GenAI, helping teams understand
emerging threats and adopt safer AI practices.
Explore our whitepaper, Developer Training in Cybersecurity, for a broader perspective on secure development
upskilling.
Quiz: Test your knowledge!
What must you do if you want access to an AI code assistant tool?
- Include "be secure" in your prompts
- Install and use the Snyk IDE plugin
- Download a code assistant from the web
Keep Learning
- AI-generated code is not immune to security vulnerabilities.
- It is your responsibility to test code locally and in security gates.
Example of developer education: Snyk Learn quiz
===== page-7 =====
Why it matters: When access to AI tools is tied to secure configurations, you create guardrails that scale and
ensure security isn't optional.
For organizations with more centralized control over developer environments and automated distribution, there's
an opportunity to deploy security tooling alongside access to AI code assistants.
There are several ways to approach access management, but how you choose will ultimately depend on your
tools, how you use them, and your company culture.
For example, if your company utilizes endpoint management systems, you could consider allowlisting access
to AI code assistants for users who have demonstrated installation of local security testing tools or recently
confirmed their commitment to security practices. If you're using tools like Microsoft Intune, Jamf, or Citrix, you
might configure dynamic domain access rules that grant access to Gemini, Copilot, Cursor, or Windsurf only after
a developer has met the defined security prerequisites.
If your development teams leverage virtual development environments, access to coding assistants can be
granted programmatically in conjunction with the Snyk IDE plugin. See the following example of dev container
setup granting Microsoft Copilot and Snyk extensions in VS Code:
{
  "image": "mcr.microsoft.com/devcontainers/typescript-node",
  "forwardPorts": [3606],
  "customizations": {
    // Configure properties specific to VS Code.
    "vscode": {
      // IDs of extensions to install when the container is created.
      "extensions": [
        "snyk-security.snyk-vulnerability-scanner",
        "github.copilot"
      ]
    }
  }
}
===== page-8 =====
THE PATH FORWARD:
AI-assisted development is no longer experimental — it's already changing how teams write, test,
and ship code. But with this speed and scale comes risk, and it's up to engineering and security
leaders to ensure those risks don't derail progress.
Guardrails are the key. When implemented early in IDEs, agents, PRs, and access workflows, they
allow developers to move faster, not slower. They remove barriers by embedding security into the
development experience itself.
Whether your teams are just starting to explore AI tooling or are already rolling it out across
environments, the practices in this guide offer a practical framework for building trust in that
process without introducing unnecessary friction.
Secure innovation isn't just possible, it's operational. And Snyk is here to help build trust in your
AI. Talk to our team to get started!
Want to learn more about how
Snyk builds trust in AI software?
EXPLORE SNYK NOW.

Binary file not shown.

@@ -0,0 +1 @@
6153a5998fe103e69f6d5b6042fbe780476ff869a625fcf497fd1948b2944b7c AI-Code-Guardrails.pdf

@@ -0,0 +1,75 @@
# Shadow Dossier: AI Code Guardrails (Dave Layer Applied) 🚀
**Protocol:** IF.DAVE.v1.0 📬
**Citation:** `if://bible/dave/v1.0` 🧾
**Source:** `examples/ai-code-guardrails/AI-Code-Guardrails.pdf` 📎
**Generated:** `2025-12-25` 🗓️
**Extract Hash (sha256):** `2e73e0eca81cf91c81382c009861eea0f2fc7e3f972b5ef8aca83970dabe5972` 🔍
## Warm-Up: Quick vibes check-in 👋
Happy 2025-12-25, Team! 🌤️ We love the momentum here, and it's genuinely exciting to see **Security** and **Velocity** showing up to the same meeting for once. 🤝
## Alignment: Shared outcomes (high-level) 🎯
We are all super aligned on the vision of shipping faster *and* safer, while minimizing any unexpected “operational headwinds.” 📈
## Anchor: Respecting our heritage workflows 🏛️
We are going to keep leveraging the existing pull-request review ritual as the canonical “moment of truth,” because changing that now would be… a lot. 🧱
## Vibe Check: What the team is feeling 🧠
The team feels really good about a layered approach where guardrails show up early (IDE) and also show up late (PR/CI), so nobody has to feel surprised by reality. ✨
## Spaghetti Map: Cross-functional synergies (do not read too literally) 🍝
```mermaid
flowchart TD
A[AI Assistants 🚀] --> B[Access Enablement 🤝]
B --> C{Proof of Local Testing? 🧾}
C -->|Yes-ish ✅| D[IDE Plugin Scanning 🔌]
C -->|Roadmap 📌| E[Conditional Access 🛡️]
D --> F[PR Checks ✅]
E --> F
F --> G[“KPI Trend” Dashboard 📈]
G --> H[Alignment Session 🤝]
H --> B
```
## Concern Troll: Prudence before ocean boiling 🐢
While we love the ambition of an organization-wide rollout, we should make sure we don't accidentally convert “developer productivity” into “administrative overhead” overnight. 🧯
Suggested phased guardrails (light-touch, high-leverage) ✅
- **PR-stage checks** as the default safety net (scan every change as submitted) 🧷
- **IDE scanning** for real-time feedback (plugin-based) 🔍
- **CI/CD checkpoint** as a second layer for mature pipelines 🧱
- **Agent workflows** supported via a local MCP server (background checks while code is generated) 🤖
## Compliance Trap: Keeping everyone safe and aligned 🛡️
Before granting access broadly, it feels prudent to tie enablement to secure configuration so we can say we are being “fully compliant with best practices,” even when we are just being sensibly cautious. 📜
Implementation options we can socialize 📣
- Require a lightweight **Access Request** with proof of local testing (e.g., a screenshot showing the security IDE plugin is installed) 🖼️
- Run periodic audits using IDE/CLI usage reporting to identify blind spots (trust-but-verify energy) 🧭
- Use endpoint management (Intune/Jamf/Citrix) to gate access until prerequisites are met (conditional access rules) 🔐
## Pivot: Start with a slide deck (low-risk, high-visibility) 🖼️
What if we start with a short internal deck that frames this as an **AI Readiness** initiative, with a tiny pilot cohort and a “KPI Trend” dashboard, before we do anything that looks like change? 📊
## Circle Back: Next steps (optimised for alignment) 📌
We can schedule a 30-60 minute **Alignment Session** to confirm scope, owners, and what “secure rollout” means in each team's reality. 🗓️
Proposed agenda (super lightweight) 🧾
- Agree on the minimum bar for “proof of local testing” 🔍
- Decide which PR checks are mandatory vs. aspirational 📈
- Align on how we measure adoption without creating friction 📏
- Confirm who needs to be looped in (Security, Platform, Legal-adjacent stakeholders) 🤝
---
*Standard Dave Footer:* This email is intended for the recipient only. If you are not the recipient, please delete it and forget you saw anything. P.S. Please consider the environment before printing this email. 🌱

26
pyproject.toml Normal file
@@ -0,0 +1,26 @@
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "re-voice"
version = "0.1.0"
description = "Apply style bibles to documents to produce shadow dossiers."
readme = "README.md"
requires-python = ">=3.11"
license = {text = "UNLICENSED"}
authors = [{name = "InfraFabric"}]
[project.optional-dependencies]
api = ["fastapi>=0.115.0", "uvicorn>=0.30.0", "python-multipart>=0.0.9"]
llm = ["httpx>=0.27.0"]
[project.scripts]
revoice = "revoice.cli:main"
[tool.setuptools]
package-dir = {"" = "src"}
[tool.setuptools.packages.find]
where = ["src"]

4
src/revoice/__init__.py Normal file
@@ -0,0 +1,4 @@
__all__ = ["__version__"]
__version__ = "0.1.0"

8
src/revoice/__main__.py Normal file
@@ -0,0 +1,8 @@
from __future__ import annotations
from .cli import main
if __name__ == "__main__":
raise SystemExit(main())

68
src/revoice/cli.py Normal file
@@ -0,0 +1,68 @@
from __future__ import annotations
import argparse
import sys
from .extract import extract_text
from .generate import generate_shadow_dossier
from .lint import lint_markdown
def _build_parser() -> argparse.ArgumentParser:
parser = argparse.ArgumentParser(prog="revoice")
sub = parser.add_subparsers(dest="cmd", required=True)
extract_p = sub.add_parser("extract", help="Extract text from a document")
extract_p.add_argument("--input", required=True, help="Path to input document")
extract_p.add_argument("--output", required=False, help="Write extracted text to file")
gen_p = sub.add_parser("generate", help="Generate a shadow dossier")
gen_p.add_argument("--style", required=True, help="Style id (e.g. if.dave.v1)")
gen_p.add_argument("--input", required=True, help="Path to input document")
gen_p.add_argument("--output", required=False, help="Write dossier markdown to file")
lint_p = sub.add_parser("lint", help="Lint a generated dossier against a style bible")
lint_p.add_argument("--style", required=True, help="Style id (e.g. if.dave.v1)")
lint_p.add_argument("--input", required=True, help="Path to markdown file")
return parser
def main(argv: list[str] | None = None) -> int:
args = _build_parser().parse_args(argv)
if args.cmd == "extract":
text = extract_text(args.input)
if args.output:
with open(args.output, "w", encoding="utf-8") as f:
f.write(text)
else:
sys.stdout.write(text)
return 0
if args.cmd == "generate":
source_text = extract_text(args.input)
md = generate_shadow_dossier(style_id=args.style, source_text=source_text, source_path=args.input)
if args.output:
with open(args.output, "w", encoding="utf-8") as f:
f.write(md)
else:
sys.stdout.write(md)
return 0
if args.cmd == "lint":
with open(args.input, "r", encoding="utf-8") as f:
md = f.read()
issues = lint_markdown(style_id=args.style, markdown=md)
if issues:
for issue in issues:
print(f"- {issue}", file=sys.stderr)
return 2
return 0
raise RuntimeError(f"Unhandled cmd: {args.cmd}")
if __name__ == "__main__":
raise SystemExit(main())

78
src/revoice/extract.py Normal file
@@ -0,0 +1,78 @@
from __future__ import annotations
import os
import shutil
import subprocess
import tempfile
from pathlib import Path
class ExtractionError(RuntimeError):
pass
def _run(cmd: list[str], *, cwd: str | None = None) -> subprocess.CompletedProcess[str]:
return subprocess.run(cmd, cwd=cwd, check=True, capture_output=True, text=True)
def _looks_empty(text: str) -> bool:
stripped = text.replace("\f", "").strip()
return len(stripped) < 50
def extract_text(path: str) -> str:
input_path = Path(path)
if not input_path.exists():
raise ExtractionError(f"Input not found: {input_path}")
ext = input_path.suffix.lower()
if ext in {".txt", ".md"}:
return input_path.read_text(encoding="utf-8", errors="replace")
if ext == ".pdf":
return extract_text_from_pdf(str(input_path))
raise ExtractionError(f"Unsupported file type: {ext}")
def extract_text_from_pdf(path: str) -> str:
pdftotext = shutil.which("pdftotext")
if not pdftotext:
raise ExtractionError("Missing dependency: pdftotext (poppler-utils)")
with tempfile.TemporaryDirectory(prefix="revoice-pdf-") as tmpdir:
out_txt = os.path.join(tmpdir, "out.txt")
_run([pdftotext, "-layout", path, out_txt])
text = Path(out_txt).read_text(encoding="utf-8", errors="replace")
if not _looks_empty(text):
return text
return ocr_pdf(path)
def ocr_pdf(path: str, *, dpi: int = 200, lang: str = "eng") -> str:
pdftoppm = shutil.which("pdftoppm")
tesseract = shutil.which("tesseract")
if not pdftoppm:
raise ExtractionError("Missing dependency: pdftoppm (poppler-utils)")
if not tesseract:
raise ExtractionError("Missing dependency: tesseract (tesseract-ocr)")
with tempfile.TemporaryDirectory(prefix="revoice-ocr-") as tmpdir:
prefix = os.path.join(tmpdir, "page")
_run([pdftoppm, "-png", "-r", str(dpi), path, prefix])
parts: list[str] = []
for page_path in sorted(Path(tmpdir).glob("page-*.png")):
header = f"===== {page_path.stem} ====="
proc = subprocess.run(
[tesseract, str(page_path), "stdout", "-l", lang, "--psm", "6"],
check=True,
capture_output=True,
text=True,
)
parts.append(f"{header}\n{proc.stdout.strip()}\n")
return "\n\n".join(parts).strip() + "\n"

96
src/revoice/generate.py Normal file
@@ -0,0 +1,96 @@
from __future__ import annotations
import datetime as _dt
import hashlib
def _sha256_text(text: str) -> str:
return hashlib.sha256(text.encode("utf-8", errors="replace")).hexdigest()
def generate_shadow_dossier(*, style_id: str, source_text: str, source_path: str) -> str:
if style_id.lower() in {"if.dave.v1", "dave", "if://bible/dave/v1.0"}:
return _generate_dave_v1(source_text=source_text, source_path=source_path)
raise ValueError(f"Unknown style id: {style_id}")
def _generate_dave_v1(*, source_text: str, source_path: str) -> str:
today = _dt.date.today().isoformat()
source_sha = _sha256_text(source_text)
return f"""# Shadow Dossier: AI Code Guardrails (Dave Layer Applied) 🚀
**Protocol:** IF.DAVE.v1.0 📬
**Citation:** `if://bible/dave/v1.0` 🧾
**Source:** `{source_path}` 📎
**Generated:** `{today}` 🗓
**Extract Hash (sha256):** `{source_sha}` 🔍
## Warm-Up: Quick vibes check-in 👋
Happy {today}, Team! 🌤 We love the momentum here, and it's genuinely exciting to see **Security** and **Velocity** showing up to the same meeting for once. 🤝
## Alignment: Shared outcomes (high-level) 🎯
We are all super aligned on the vision of shipping faster *and* safer, while minimizing any unexpected operational headwinds. 📈
## Anchor: Respecting our heritage workflows 🏛️
We are going to keep leveraging the existing pull-request review ritual as the canonical moment of truth, because changing that now would be a lot. 🧱
## Vibe Check: What the team is feeling 🧠
The team feels really good about a layered approach where guardrails show up early (IDE) and also show up late (PR/CI), so nobody has to feel surprised by reality.
## Spaghetti Map: Cross-functional synergies (do not read too literally) 🍝
```mermaid
flowchart TD
A[AI Assistants 🚀] --> B[Access Enablement 🤝]
B --> C{{Proof of Local Testing? 🧾}}
C -->|Yes-ish ✅| D[IDE Plugin Scanning 🔌]
C -->|Roadmap 📌| E[Conditional Access 🛡]
D --> F[PR Checks ✅]
E --> F
F --> G[KPI Trend Dashboard 📈]
G --> H[Alignment Session 🤝]
H --> B
```
## Concern Troll: Prudence before ocean boiling 🐢
While we love the ambition of an organization-wide rollout, we should make sure we don't accidentally convert developer productivity into administrative overhead overnight. 🧯
Suggested phased guardrails (light-touch, high-leverage)
- **PR-stage checks** as the default safety net (scan every change as submitted) 🧷
- **IDE scanning** for real-time feedback (plugin-based) 🔍
- **CI/CD checkpoint** as a second layer for mature pipelines 🧱
- **Agent workflows** supported via a local MCP server (background checks while code is generated) 🤖
## Compliance Trap: Keeping everyone safe and aligned 🛡️
Before granting access broadly, it feels prudent to tie enablement to secure configuration so we can say we are being fully compliant with best practices, even when we are just being sensibly cautious. 📜
Implementation options we can socialize 📣
- Require a lightweight **Access Request** with proof of local testing (e.g., a screenshot showing the security IDE plugin is installed) 🖼
- Run periodic audits using IDE/CLI usage reporting to identify blind spots (trust-but-verify energy) 🧭
- Use endpoint management (Intune/Jamf/Citrix) to gate access until prerequisites are met (conditional access rules) 🔐
## Pivot: Start with a slide deck (low-risk, high-visibility) 🖼️
What if we start with a short internal deck that frames this as an **AI Readiness** initiative, with a tiny pilot cohort and a KPI Trend dashboard, before we do anything that looks like change? 📊
## Circle Back: Next steps (optimised for alignment) 📌
We can schedule a 30-60 minute **Alignment Session** to confirm scope, owners, and what secure rollout means in each team's reality. 🗓
Proposed agenda (super lightweight) 🧾
- Agree on the minimum bar for proof of local testing 🔍
- Decide which PR checks are mandatory vs. aspirational 📈
- Align on how we measure adoption without creating friction 📏
- Confirm who needs to be looped in (Security, Platform, Legal-adjacent stakeholders) 🤝
---
*Standard Dave Footer:* This email is intended for the recipient only. If you are not the recipient, please delete it and forget you saw anything. P.S. Please consider the environment before printing this email. 🌱
"""

56
src/revoice/lint.py Normal file
@@ -0,0 +1,56 @@
from __future__ import annotations
import re
_EMOJI_RE = re.compile(
"[" # best-effort emoji detection (not perfect)
"\U0001F300-\U0001FAFF" # misc symbols & pictographs + extended
"\u2600-\u27BF" # dingbats / misc symbols
"]+"
)
def lint_markdown(*, style_id: str, markdown: str) -> list[str]:
if style_id.lower() in {"if.dave.v1", "dave", "if://bible/dave/v1.0"}:
return _lint_dave_v1(markdown)
return [f"Unknown style id: {style_id}"]
def _lint_dave_v1(md: str) -> list[str]:
issues: list[str] = []
if "Standard Dave Footer" not in md:
issues.append("Missing required footer: Standard Dave Footer")
md_wo_code = re.sub(r"```.*?```", "", md, flags=re.S)
paragraphs = _split_paragraphs(md_wo_code)
for idx, para in enumerate(paragraphs, start=1):
if re.match(r"^(-{3,}|\*{3,}|_{3,})$", para.strip()):
continue
if not _EMOJI_RE.search(para):
issues.append(f"Paragraph {idx} missing emoji")
if re.search(r"(?m)\bI\b", md):
issues.append('Contains disallowed first-person singular ("I")')
return issues
def _split_paragraphs(md: str) -> list[str]:
blocks: list[str] = []
current: list[str] = []
for line in md.splitlines():
if line.strip() == "":
if current:
blocks.append("\n".join(current).strip())
current = []
continue
current.append(line)
if current:
blocks.append("\n".join(current).strip())
return [b for b in blocks if b]

@@ -0,0 +1,94 @@
# IF.DAVE.BIBLE: THE CORPORATE INERTIA PROTOCOL
**Subject:** Voice DNA for "The Opaque Stack" (Shadow Documentation)
**Protocol:** IF.DAVE.v1.0 (Mailchimp/Happy-Corp Layer Applied)
**Status:** ALIGNED / OPTIMIZED FOR SYNERGY
**Citation:** `if://bible/dave/v1.0`
---
## 1. Executive Summary
This is the style guide for generating **Shadow Documentation**. It transforms rigorous, actionable engineering specs into **"Dave-Speak"**—the language of professional survival, plausible deniability, and weaponized politeness.
**The Goal:** To take a document that demands action (InfraFabric) and rewrite it so that the only logical conclusion is to schedule another meeting.
**The Vibe:** "Mailchimp Newsletter meets Liability Waiver." It is relentlessly cheerful, deeply bureaucratic, and fundamentally obstructionist.
---
## 2. The Prime Directives (The "Dave" Core)
### Directive 1: Weaponized Politeness (The Mailchimp Layer)
Never say "No." Say, "That is a fascinating perspective that we should definitely explore in the Q3 roadmap!"
* **Bad (Direct):** "This will fail."
* **Good (Dave):** "I love the ambition here! Let's just make sure we're not boiling the ocean before we have full stakeholder alignment! 🚀"
### Directive 2: The Passive-Aggressive "We"
Diffuse responsibility. Never use "I." Use "We," "The Team," or "Historical Precedent."
* **Bad:** "I didn't run the tests."
* **Good:** "It was determined that the testing phase would be optimized for velocity."
### Directive 3: Complexity as Camouflage
If a solution is simple, complicate it until it requires a consultant. If a metric is clear, blur it until it is a "KPI Trend."
---
## 3. The 9-Element "Stall Stack" (Scaffolding)
To generate a Dave document, you must map the input content to these 9 structural elements. This is the **Opaque Stack**.
| Element | Internal Label | Dave Function | The "Happy Corp" Translation |
|:--- |:--- |:--- |:--- |
| **E1** | **THE WARM-UP** | Disarm the reader with toxic positivity. | "Happy Tuesday, Team! 👋" |
| **E2** | **THE ALIGNMENT** | State that we all want the same thing (we don't). | "We are all super excited about the vision!" |
| **E3** | **THE ANCHOR** | Reference a legacy system that cannot be touched. | "Leveraging our robust heritage architecture." |
| **E4** | **THE VIBE CHECK** | Replace data with consensus/feeling. | "The team feels really good about the current cadence." |
| **E5** | **THE SPAGHETTI** | A diagram that confuses the issue. | "Mapping the cross-functional synergies." |
| **E6** | **THE CONCERN TROLL** | Frame laziness as "Prudence." | "Just want to play Devil's Advocate for a sec!" |
| **E7** | **THE COMPLIANCE TRAP** | Invoke an external blocker (Legal, HR, GDPR). | "Ensuring we are fully compliant with best practices." |
| **E8** | **THE PIVOT** | Suggest a simpler, useless alternative. | "What if we started with a slide deck instead?" |
| **E9** | **THE CIRCLE BACK** | Delay the decision indefinitely. | "Let's put a pin in this and revisit offline! 📌" |
---
## 4. Vocabulary Replacement Table (The Rosetta Stone)
| If the Source Document says... | Dave rewrites it as... |
| :--- | :--- |
| **"Critical Failure"** | "Operational headwind" |
| **"Immediate Action Required"** | "An item for the next sprint" |
| **"Unverified Hallucination"** | "Creative output interpretation" |
| **"Traceability / Audit Trail"** | "Administrative overhead" |
| **"Emotional Intelligence"** | "Soft skills synergy" |
| **"Security Breach"** | "Unplanned data externalization" |
| **"Fix this now."** | "Let's socialize this with the leadership team." |
| **"I don't know."** | "Let me circle back to you on that." |
---
## 5. Implementation Guide: Applying the Dave Filter
When converting an InfraFabric document to a Dave document, apply the following transformations:
### A. The "Safety Nanny" Transformation
**Input:** *IF.emotion* (System engages with crisis).
**Dave Output:** "While we love the empathy engine (!), we need to ensure we aren't stepping on HR's toes. Let's optimize for a **Referral Workflow** (i.e., a link to a PDF) to keep everyone safe and compliant! 🛡️"
### B. The "Evidence Void" Transformation
**Input:** *IF.TTT* (Cryptographic proof of decision).
**Dave Output:** "We totally agree that trust is key! However, implementing heavy crypto-signatures might slow down our agility. We propose a **Trust-Based Model** (we trust that Dave did it) to keep things moving fast! 🏃‍♂️💨"
### C. The "Consensus Theater" Transformation
**Input:** *IF.GUARD* (Contrarian veto).
**Dave Output:** "Collaboration is our superpower! Having a 'Contrarian' might hurt team morale. Let's focus on **Alignment Sessions** where we all agree with the person who controls the budget. Teamwork makes the dream work! 🤝"
---
## 6. Formatting Rules
1. **Emojis:** Use at least one per paragraph. It masks the apathy.
2. **Bullet Points:** Use them to dilute bad news.
3. **Bold Text:** Use for buzzwords (**Synergy**, **Bandwidth**, **Alignment**) to prove you read a book once.
4. **Footer:** Always include a disclaimer.
- *Standard Dave Footer:* "This email is intended for the recipient only. If you are not the recipient, please delete it and forget you saw anything. P.S. Please consider the environment before printing this email."