3.2 KiB
re-voice app proposal: “upload → shadow dossier”
Product goal
Let a user upload any document (PDF/DOCX/MD/HTML/images) and receive a shadow dossier rendered through a chosen style bible (e.g. if://bible/dave/v1.0).
Non-goals (v0)
- Perfect fidelity layout extraction (we only need usable text + key figures)
- Long-term storage/retention policies (we can stub, then harden)
Architecture (thin UI, strong pipeline)
1) Ingest
- Upload endpoint:
POST /api/dossiers(multipart) - Compute and persist:
sha256of original- detected
mime - storage pointer (disk/S3/Forgejo blob)
- Create
Documentrow:{id, sha256, filename, mime, created_at, owner}
2) Extract → Canonicalize
Use a pluggable extractor chain:
- PDF:
pdftotext(fast path, text-layer PDFs)- OCR fallback (
pdftoppm→tesseract) for image-only PDFs
- DOCX:
pandocorpython-docx - HTML:
readability-style boilerplate removal - Images: OCR (
tesseract) with basic deskew
Output a canonical block model (enables better prompting + citations):
{
"doc_id": "…",
"blocks": [
{"type":"heading","level":1,"text":"…"},
{"type":"paragraph","text":"…"},
{"type":"list","items":["…","…"]}
]
}
3) Style bible compiler
Store bibles in-repo as Markdown + a small metadata header (id, version, citation, hard rules).
Compile the bible into:
system_prompt(voice + forbidden/required constraints)template(required dossier structure)lint_rules(post-checks: emojis/paragraph, pronouns, required footer, etc.)
4) Generate
Two-step generation is safer and more controllable:
- Content distillation (extract doc facts → structured notes)
- Style application (render notes into dossier template under bible constraints)
Recommended runtime:
- OpenAI-compatible Chat Completions backend (Juakali / OpenWebUI stack)
- Persist
{model, prompts, output_sha256}for auditability
5) Validate (style linter)
Run a deterministic linter per bible:
- hard constraints (e.g., “emoji per paragraph” for Dave)
- vocabulary swaps (optional)
- required footer/disclaimer
- “no secrets” scan (best-effort)
If lint fails: auto-repair pass (LLM) or return “needs revision” with lint report.
5b) Mermaid preflight (PDF export reliability)
If the output includes Mermaid diagrams, run a preflight pass before PDF export:
- auto-heal Mermaid blocks (quote labels, normalize headers, balance
subgraph/end) - validate Mermaid rendering in the same runtime used by the PDF exporter
In re-voice, this is exposed as:
revoice preflight --style <style> --input <output.md> --source <source-doc>
6) Export + publishing
Outputs:
- Markdown (primary)
- PDF via existing Forgejo PDF export (
.../raw/...&format=pdf) by committing generated Markdown to a repo
Publishing strategy:
- Store outputs in a Forgejo repo (per team/project)
- Provide immutable links to
{sha}+.sha256sidecars
Security + operational considerations
- Run extraction/OCR in a sandboxed worker (CPU/mem/time limits).
- Never store API keys in repos; use env/secret manager.
- Keep an audit trail: source hash → extracted text hash → output hash → model/prompt hashes.