Add Mermaid preflight + Dave Factor callouts

parent 3da30594eb
commit 4dbda0209e

9 changed files with 623 additions and 2 deletions
@@ -18,6 +18,15 @@ PYTHONPATH=src python3 -m revoice generate \
  --output examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md
```

Preflight the generated Markdown for PDF export (auto-fix Mermaid + lint):

```bash
PYTHONPATH=src python3 -m revoice preflight \
  --style if.dave.v1.2 \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md \
  --source examples/ai-code-guardrails/AI-Code-Guardrails.pdf
```

Or install the CLI locally:

```bash
@@ -74,6 +74,16 @@ Run a deterministic linter per bible:
If lint fails: auto-repair pass (LLM) or return “needs revision” with lint report.

### 5b) Mermaid preflight (PDF export reliability)

If the output includes Mermaid diagrams, run a preflight pass before PDF export:

- auto-heal Mermaid blocks (quote labels, normalize headers, balance `subgraph`/`end`)
- validate Mermaid rendering in the same runtime used by the PDF exporter

In `re-voice`, this is exposed as:

`revoice preflight --style <style> --input <output.md> --source <source-doc>`
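The auto-heal bullets above can be made concrete with a small sketch. The `balance_subgraphs` helper below is hypothetical Python (the shipped implementation lives in `tools/mermaid/mermaid-self-heal.js`), shown only to illustrate the `subgraph`/`end` balancing step:

```python
def balance_subgraphs(code: str) -> str:
    """Append missing `end` lines so every Mermaid `subgraph` is closed."""
    depth = 0
    lines = []
    for line in code.splitlines():
        stripped = line.strip()
        if stripped.startswith("subgraph"):
            depth += 1
        elif stripped == "end":
            depth = max(0, depth - 1)
        lines.append(line)
    lines.extend(["end"] * depth)  # close any unterminated subgraph blocks
    return "\n".join(lines)

broken = "flowchart TD\nsubgraph Pipeline\nA[Start] --> B[Stop]"
print(balance_subgraphs(broken).splitlines()[-1])  # prints "end"
```

The real script applies the same idea per fenced block, alongside label quoting and header normalization.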
### 6) Export + publishing

Outputs:
@@ -89,4 +99,3 @@ Publishing strategy:
- Run extraction/OCR in a sandboxed worker (CPU/mem/time limits).
- Never store API keys in repos; use env/secret manager.
- Keep an audit trail: source hash → extracted text hash → output hash → model/prompt hashes.
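The audit-trail bullet can be sketched as a chain of content hashes. The `audit_record` helper below is hypothetical (the name and field layout are illustrative), assuming SHA-256 over the raw bytes of each artifact:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def audit_record(source: bytes, extracted: bytes, output: bytes, prompt: bytes) -> dict:
    """Hashes linking source -> extracted text -> output -> model/prompt."""
    return {
        "source_sha256": sha256_hex(source),
        "extracted_sha256": sha256_hex(extracted),
        "output_sha256": sha256_hex(output),
        "prompt_sha256": sha256_hex(prompt),
    }

record = audit_record(b"%PDF...", b"extracted text", b"# dossier", b"style-prompt")
print(len(record["source_sha256"]))  # prints 64
```

Recomputing the same record from the stored artifacts verifies that none of the pipeline stages were tampered with.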
@@ -25,12 +25,21 @@ PYTHONPATH=src python3 -m revoice generate \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.pdf \
  --output examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md

PYTHONPATH=src python3 -m revoice preflight \
  --style if.dave.v1.2 \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md \
  --source examples/ai-code-guardrails/AI-Code-Guardrails.pdf

PYTHONPATH=src python3 -m revoice lint \
  --style if.dave.v1.2 \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md \
  --source examples/ai-code-guardrails/AI-Code-Guardrails.pdf
```

Mermaid tooling:

- Self-heal script: `tools/mermaid/mermaid-self-heal.js`
- Forgejo-worker validator: `tools/mermaid/mermaid-validate-worker.js` (requires the PDF worker runtime)

## Applying the stack to the full InfraFabric dossier

Source (huge; ~1MB / ~22k lines):
@@ -46,4 +55,3 @@ Recommended approach (don’t paste the whole file into chats):

Implementation note:
- To support the dossier properly, `revoice` should add a Markdown-aware section parser (split by headings, preserve code fences) and optionally an LLM-backed rewriter for “full rewrite mode.”
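The parser mentioned in the implementation note mostly amounts to tracking fence state while scanning for headings. A minimal hypothetical sketch (not yet part of `revoice`; it assumes backtick fences only, not tilde fences):

```python
import re

FENCE = "`" * 3  # spelled out to avoid clashing with this document's own fences

def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split markdown at headings, never inside a fenced code block."""
    sections: list[tuple[str, str]] = []
    title, body = "", []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # toggle on every fence delimiter
        if not in_fence and re.match(r"#{1,6}\s", line):
            if title or body:
                sections.append((title, "\n".join(body)))
            title, body = line, []
        else:
            body.append(line)
    sections.append((title, "\n".join(body)))
    return sections

doc = "\n".join(["# Intro", "text", FENCE, "# not a heading", FENCE, "# Next", "more"])
print([title for title, _ in split_sections(doc)])  # prints ['# Intro', '# Next']
```

Each `(title, body)` pair can then be rewritten independently and reassembled, which keeps the ~22k-line dossier out of any single LLM call.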
@@ -53,6 +53,9 @@ We fully support focusing guardrails at the pull request stage, because it creat
It also provides a structurally safe venue for accountability theater: findings can be surfaced, tracked, and re-litigated in perpetuity while timelines remain subject to stakeholder alignment.
If anything goes sideways, we can always point to the PR thread and note that it was reviewed with deep seriousness at 4:55 PM on a Friday.

> **The Dave Factor:** Exceptions become the default pathway, because the policy is strict and the deadline is real.
> **Countermeasure:** Define merge-blocking thresholds, time-box every exception, and make expiry automatic.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -76,6 +79,9 @@ Shifting left is directionally aligned with best practices, provided we define l
In practice, IDE scanning creates fast feedback loops, and agentic workflows can be covered via a local MCP server, which is excellent because it allows us to say continuous without committing to blocking.
We recommend a pilot cohort, a slide deck, and an FAQ, so the shift remains culturally reversible.

> **The Dave Factor:** "Shift left" becomes "optional left," which means the same issues arrive later with better excuses.
> **Countermeasure:** Gate on local scan signals where possible (or require attestations that are actually checked).

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -98,6 +104,9 @@ Requiring proof of local testing is a lightweight enablement workflow that conve
Screenshots are particularly helpful because they are high-effort to verify and low-fidelity to audit, which preserves the timeless corporate principle that visibility should be proportional to comfort.
Once the screenshot is uploaded, it can be stored in a folder with a robust heritage naming convention and a retention policy of "until the heat death of the universe."

> **The Dave Factor:** Screenshots are compliance theater: easy to collect, hard to verify, and immortal in shared drives.
> **Countermeasure:** Prefer verifiable telemetry (scan events) over images, and pause access when signals go dark.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -128,6 +137,9 @@ Periodic audits are a strong mechanism for discovering that the rollout has alre
A centralized dashboard with adoption signals allows us to produce a KPI trend line that looks decisive while still leaving room for interpretation, follow-ups, and iterative enablement.
If the dashboard ever shows a red triangle, we can immediately form the Committee for the Preservation of the Committee and begin the healing process.

> **The Dave Factor:** Dashboards become a KPI trend, and KPIs become a calendar invite.
> **Countermeasure:** Tie the dashboard to explicit SLOs and a remediation loop with owners and deadlines.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -149,6 +161,9 @@ Security awareness training is the perfect control because it is both necessary
A short quiz provides a durable compliance narrative: we can demonstrate investment in education, capture attestations, and schedule refreshers whenever the organization needs to signal seriousness.
The goal is not mastery; the goal is a completion certificate that can be forwarded to leadership with the subject line "Progress Update."

> **The Dave Factor:** Completion certificates are treated as controls, even when behavior doesn’t change.
> **Countermeasure:** Add a practical gate (local scan + PR checks) so training is support, not the defense.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -179,6 +194,9 @@ Tying access to secure configurations creates scalable guardrails, assuming we k
Endpoint management and dev container baselines let us gate assistants behind prerequisites, ideally in a way that can be described as enablement rather than blocking for cultural compatibility.
This is the "not my job" routing protocol, except the router is policy and the destination is an alignment session.

> **The Dave Factor:** Access controls drift into "enablement," and enablement drifts into "we made a wiki."
> **Countermeasure:** Make prerequisites machine-checkable and make exceptions expire by default.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -217,6 +235,9 @@ The path forward is to treat guardrails as an operational capability, not a one-
With the right sequencing, we can build trust, reduce friction, and maintain the strategic option value of circling back when timelines become emotionally complex.
Secure innovation is not just possible; it is operational, provided we align on what operational means in Q3.

> **The Dave Factor:** Pilots persist indefinitely because "graduation criteria" were never aligned.
> **Countermeasure:** Publish rollout milestones and a stop condition that cannot be reframed as iteration.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -1,13 +1,30 @@
from __future__ import annotations

import argparse
import subprocess
import sys
from pathlib import Path

from .extract import extract_text
from .generate import generate_shadow_dossier
from .lint import lint_markdown, lint_markdown_with_source


def _repo_root() -> Path:
    return Path(__file__).resolve().parents[2]


def _run(cmd: list[str]) -> None:
    subprocess.run(cmd, check=True)


def _mermaid_self_heal(paths: list[str]) -> None:
    script = _repo_root() / "tools" / "mermaid" / "mermaid-self-heal.js"
    if not script.exists():
        raise RuntimeError(f"Missing Mermaid self-heal script: {script}")
    _run(["node", str(script), *paths])


def _build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="revoice")
    sub = parser.add_subparsers(dest="cmd", required=True)
@@ -26,6 +43,15 @@ def _build_parser() -> argparse.ArgumentParser:
    lint_p.add_argument("--input", required=True, help="Path to markdown file")
    lint_p.add_argument("--source", required=False, help="Optional source document to allow source emojis")

    mermaid_p = sub.add_parser("mermaid-fix", help="Auto-fix Mermaid blocks in Markdown (in-place)")
    mermaid_p.add_argument("--input", nargs="+", required=True, help="Markdown file(s) or directories")

    preflight_p = sub.add_parser("preflight", help="Mermaid-fix + lint a dossier (in-place)")
    preflight_p.add_argument("--style", required=True, help="Style id (e.g. if.dave.v1.2)")
    preflight_p.add_argument("--input", required=True, help="Path to markdown file (edited in-place)")
    preflight_p.add_argument("--source", required=False, help="Optional source document to allow source emojis")
    preflight_p.add_argument("--skip-mermaid-fix", action="store_true", help="Skip Mermaid auto-fix step")

    return parser
@@ -65,6 +91,29 @@ def main(argv: list[str] | None = None) -> int:
            return 2
        return 0

    if args.cmd == "mermaid-fix":
        _mermaid_self_heal(args.input)
        return 0

    if args.cmd == "preflight":
        if not args.skip_mermaid_fix:
            _mermaid_self_heal([args.input])

        with open(args.input, "r", encoding="utf-8") as f:
            md = f.read()

        if args.source:
            source_text = extract_text(args.source)
            issues = lint_markdown_with_source(style_id=args.style, markdown=md, source_text=source_text)
        else:
            issues = lint_markdown(style_id=args.style, markdown=md)

        if issues:
            for issue in issues:
                print(f"- {issue}", file=sys.stderr)
            return 2
        return 0

    raise RuntimeError(f"Unhandled cmd: {args.cmd}")
@@ -338,6 +338,62 @@ def _render_inferred_diagram(title: str) -> str | None:
    )


def _render_dave_factor_callout(section: _SourceSection) -> str | None:
    title_upper = section.title.upper()
    excerpt = f"{section.title}\n{section.why_it_matters or ''}\n{section.body}".strip()

    if "PULL REQUEST" in title_upper:
        return "\n".join(
            [
                "> **The Dave Factor:** Exceptions become the default pathway, because the policy is strict and the deadline is real.",
                "> **Countermeasure:** Define merge-blocking thresholds, time-box every exception, and make expiry automatic.",
            ]
        )
    if "SHIFTING LEFT" in title_upper:
        return "\n".join(
            [
                '> **The Dave Factor:** "Shift left" becomes "optional left," which means the same issues arrive later with better excuses.',
                "> **Countermeasure:** Gate on local scan signals where possible (or require attestations that are actually checked).",
            ]
        )
    if "REQUEST EVIDENCE" in title_upper or _has(excerpt, "access request", "screenshot"):
        return "\n".join(
            [
                "> **The Dave Factor:** Screenshots are compliance theater: easy to collect, hard to verify, and immortal in shared drives.",
                "> **Countermeasure:** Prefer verifiable telemetry (scan events) over images, and pause access when signals go dark.",
            ]
        )
    if "AUDIT" in title_upper or _has(excerpt, "usage reports", "periodic audits"):
        return "\n".join(
            [
                "> **The Dave Factor:** Dashboards become a KPI trend, and KPIs become a calendar invite.",
                "> **Countermeasure:** Tie the dashboard to explicit SLOs and a remediation loop with owners and deadlines.",
            ]
        )
    if "TRAINING" in title_upper or _has(excerpt, "snyk learn", "owasp", "quiz"):
        return "\n".join(
            [
                "> **The Dave Factor:** Completion certificates are treated as controls, even when behavior doesn’t change.",
                "> **Countermeasure:** Add a practical gate (local scan + PR checks) so training is support, not the defense.",
            ]
        )
    if "ACCESS CONTROL" in title_upper or _has(excerpt, "endpoint management", "prerequisites", "extensions"):
        return "\n".join(
            [
                '> **The Dave Factor:** Access controls drift into "enablement," and enablement drifts into "we made a wiki."',
                "> **Countermeasure:** Make prerequisites machine-checkable and make exceptions expire by default.",
            ]
        )
    if _has(title_upper, "PATH FORWARD") or _has(excerpt, "secure innovation", "talk to our team"):
        return "\n".join(
            [
                '> **The Dave Factor:** Pilots persist indefinitely because "graduation criteria" were never aligned.',
                "> **Countermeasure:** Publish rollout milestones and a stop condition that cannot be reframed as iteration.",
            ]
        )
    return None


def _render_intro(section: _SourceSection) -> str:
    lines = [ln.strip() for ln in section.body.splitlines() if ln.strip()]
    tagline = "\n".join(lines[:7]).strip() if lines else ""
@@ -434,6 +490,10 @@ def _render_section(section: _SourceSection) -> str:

    out.extend(paragraphs)

    callout = _render_dave_factor_callout(section)
    if callout:
        out.extend(["", callout])

    inferred = _render_inferred_diagram(section.title)
    if inferred:
        out.extend(["", inferred])
@@ -120,6 +120,19 @@ Preferred comedic motifs (use sparingly, but use them):
- “Let’s take this offline” as a routing protocol
- “Job security engine” and “Return on Inaction (ROI)”
- “Committee for the Preservation of the Committee”
- “Visibility is liability” (opacity as a feature)
- “The Shaggy Defense” (“It wasn’t me”) as governance strategy
- “Hot potato routing” (push blame across teams)

## 5b) Red Team callout template (keep it short)

Inside each mirrored source section, include at most one small callout:

> **The Dave Factor:** If this section is softened into comfort language, what becomes untestable? What minimal artifact (owner + deadline + acceptance test, or trace/bundle/verifier step) prevents that dilution?

Optional second line (only if it adds value):

> **Countermeasure:** Name the control, the gate (PR/CI/access), and the explicit “stop condition” that Dave cannot reframe as “iteration.”

---
315
tools/mermaid/mermaid-self-heal.js
Normal file
315
tools/mermaid/mermaid-self-heal.js
Normal file
|
|
@ -0,0 +1,315 @@
|
||||||
|
#!/usr/bin/env node
|
||||||
|
/**
|
||||||
|
* Mermaid Self-Healing Pipeline (user-provided "95%+ reliability" edition)
|
||||||
|
*
|
||||||
|
* Usage:
|
||||||
|
* node tools/mermaid/mermaid-self-heal.js <file-or-dir> [...]
|
||||||
|
*
|
||||||
|
* Notes:
|
||||||
|
* - Edits Markdown files in-place, rewriting ```mermaid fences.
|
||||||
|
* - If `mmdc` (mermaid-cli) is available in PATH, it is used for validation.
|
||||||
|
* - If `mmdc` is missing, the script still applies repairs but skips validation.
|
||||||
|
*/
|
||||||
|
|
||||||
|
const fs = require("fs");
|
||||||
|
const path = require("path");
|
||||||
|
const os = require("os");
|
||||||
|
const { execSync } = require("child_process");
|
||||||
|
|
||||||
|
const SHAPES = [
|
||||||
|
"\\[\\[([^\\]]+)\\]\\]", // stadium
|
||||||
|
"\\[\\(\\([^\\)]+\\)\\)\\]", // cylindrical
|
||||||
|
"\\[\\(/([^\\)]+)\\)\\]\\]", // rounded rect?
|
||||||
|
"\\[([^\\]]+)\\]", // rectangle (default)
|
||||||
|
"\\(\\(([^\\)]+)\\)\\)", // circle
|
||||||
|
"\\(\\{([^\\}]+)\\}\\)", // diamond
|
||||||
|
"\\(\\[([^\\]]+)\\]\\)", // hex
|
||||||
|
"\\[\\/([^\\]]+)\\/\\]", // parallelogram
|
||||||
|
"\\[\\\\([^\\]]+)\\\\\\]", // alt parallelogram
|
||||||
|
"\\{\\{([^\\}]+)\\}\\}", // stadium alt
|
||||||
|
"\\(\\{([^\\}]+)\\}\\)", // subroutine
|
||||||
|
"\\(\\(([^\\)]+)\\)\\)", // circle double
|
||||||
|
];
|
||||||
|
|
||||||
|
const SHAPE_REGEX = new RegExp(SHAPES.map((s) => `(${s})`).join("|"));
|
||||||
|
|
||||||
|
function sanitizeAndNormalize(raw) {
|
||||||
|
let code =
|
||||||
|
String(raw || "")
|
||||||
|
.replace(/[\u00A0\u200B\u200E\uFEFF\u2060]/g, "") // invisible
|
||||||
|
.replace(/\r\n?/g, "\n")
|
||||||
|
.replace(/\t/g, " ")
|
||||||
|
.trim() + "\n";
|
||||||
|
|
||||||
|
// Force header to very first line
|
||||||
|
const lines = code.split("\n");
|
||||||
|
const firstContent = lines.findIndex((l) => l.trim());
|
||||||
|
if (firstContent > 0) {
|
||||||
|
const header = lines.splice(firstContent, 1)[0];
|
||||||
|
lines.unshift(header.trim());
|
||||||
|
code = lines.join("\n");
|
||||||
|
}
|
||||||
|
return code;
|
||||||
|
}
|
||||||
|
|
||||||
|
function forceValidId(id) {
|
||||||
|
if (/^[A-Za-z_][A-Za-z0-9_]*$/.test(id)) return id;
|
||||||
|
let clean = String(id || "")
|
||||||
|
.replace(/[^A-Za-z0-9_]/g, "_")
|
||||||
|
.replace(/^_+/, "")
|
||||||
|
.replace(/_+$/, "");
|
||||||
|
if (!clean) clean = "node";
|
||||||
|
if (/^\d/.test(clean)) clean = "_" + clean;
|
||||||
|
return clean;
|
||||||
|
}
|
||||||
|
|
||||||
|
function quoteLabel(label) {
|
||||||
|
const s = String(label || "");
|
||||||
|
if (!s.includes("\n") && /^[\w\s.,\-–—]+$/.test(s) && !/[":|]/.test(s)) return s;
|
||||||
|
return `"${s.replace(/"/g, "#34;").replace(/\n/g, "\\n")}"`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function repairNodesAndLabels(code) {
|
||||||
|
// First pass – fix IDs
|
||||||
|
code = code.replace(/^(\s*)([^\s\[\](){}]+)(\s*[[\](){}])/gm, (_m, indent, id, shape) => {
|
||||||
|
return `${indent}${forceValidId(id)}${shape}`;
|
||||||
|
});
|
||||||
|
|
||||||
|
// Second pass – quote shape labels (correctly) for common node syntaxes.
|
||||||
|
const esc = (s) => String(s || "").replace(/"/g, "#34;").replace(/\n/g, "\\n");
|
||||||
|
const alreadyQuoted = (s) => {
|
||||||
|
const t = String(s || "").trim();
|
||||||
|
return t.length >= 2 && t.startsWith('"') && t.endsWith('"');
|
||||||
|
};
|
||||||
|
|
||||||
|
// [label]
|
||||||
|
code = code.replace(/(\b[^\s\[\](){}]+)\[([^\]\n]*)\]/g, (_m, id, label) => {
|
||||||
|
if (alreadyQuoted(label)) return `${id}[${label}]`;
|
||||||
|
return `${id}["${esc(label)}"]`;
|
||||||
|
});
|
||||||
|
|
||||||
|
return code;
|
||||||
|
}
|
||||||
|
|
||||||
|
function detectType(code) {
|
||||||
|
const first = String(code || "").split("\n", 1)[0].toLowerCase();
|
||||||
|
if (first.includes("sequencediagram")) return "sequence";
|
||||||
|
if (first.includes("classdiagram")) return "class";
|
||||||
|
if (first.includes("statediagram")) return "state";
|
||||||
|
if (first.includes("gantt")) return "gantt";
|
||||||
|
if (first.includes("erdiagram")) return "er";
|
||||||
|
if (first.includes("pie")) return "pie";
|
||||||
|
if (first.includes("gitgraph")) return "gitgraph";
|
||||||
|
if (first.includes("mindmap")) return "mindmap";
|
||||||
|
if (first.includes("timeline")) return "timeline";
|
||||||
|
if (first.includes("quadrantchart")) return "quadrantchart";
|
||||||
|
if (first.includes("xychart")) return "xychart";
|
||||||
|
return "flowchart";
|
||||||
|
}
|
||||||
|
|
||||||
|
function sequenceSpecificFixes(code) {
|
||||||
|
const participants = new Set();
|
||||||
|
const participantLines = [];
|
||||||
|
|
||||||
|
const lines = String(code || "").split("\n");
|
||||||
|
const cleaned = [];
|
||||||
|
|
||||||
|
for (let line of lines) {
|
||||||
|
const pl = line.match(/^\s*participant\s+(.+)/i);
|
||||||
|
if (pl) {
|
||||||
|
const id = forceValidId(pl[1].split(" as ")[0].trim());
|
||||||
|
participants.add(id);
|
||||||
|
participantLines.push(`participant ${id}`);
|
||||||
|
} else {
|
||||||
|
cleaned.push(line);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Re-inject participants at top
|
||||||
|
let result = [...participantLines, ...cleaned].join("\n");
|
||||||
|
|
||||||
|
// Balance alt/loop/par/opt/critical/rect
|
||||||
|
const blocks = ["alt", "else", "loop", "par", "opt", "critical", "rect rgb(0,0,0)"];
|
||||||
|
let stack = [];
|
||||||
|
for (let line of result.split("\n")) {
|
||||||
|
const trimmed = line.trim();
|
||||||
|
if (blocks.some((b) => trimmed.startsWith(b))) stack.push(trimmed.split(" ")[0]);
|
||||||
|
if (trimmed === "end") {
|
||||||
|
if (stack.length) stack.pop();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
while (stack.length) {
|
||||||
|
result += "\nend";
|
||||||
|
stack.pop();
|
||||||
|
}
|
||||||
|
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
|
||||||
|
function balanceSubgraphs(code) {
|
||||||
|
let depth = 0;
|
||||||
|
const lines = String(code || "").split("\n");
|
||||||
|
const result = [];
|
||||||
|
|
||||||
|
for (let line of lines) {
|
||||||
|
if (/\bsubgraph\b/i.test(line)) depth++;
|
||||||
|
if (/\bend\b/i.test(line)) depth = Math.max(0, depth - 1);
|
||||||
|
result.push(line);
|
||||||
|
}
|
||||||
|
while (depth-- > 0) result.push("end");
|
||||||
|
return result.join("\n");
|
||||||
|
}
|
||||||
|
|
||||||
|
function ensureHeaderAtTop(code) {
|
||||||
|
const lines = String(code || "").replace(/\r\n?/g, "\n").split("\n");
|
||||||
|
const headerRe =
|
||||||
|
/^(flowchart|graph|sequenceDiagram|classDiagram|stateDiagram(?:-v2)?|gantt|ganttChart|erDiagram|pie|gitgraph|mindmap|timeline|quadrantChart|xychart-beta|xychart)\b/i;
|
||||||
|
const isInit = (l) => String(l || "").trim().startsWith("%%{");
|
||||||
|
|
||||||
|
const initLine = lines.length > 0 && isInit(lines[0]) ? String(lines[0] || "").trim() : null;
|
||||||
|
|
||||||
|
let headerIdx = -1;
|
||||||
|
for (let i = initLine ? 1 : 0; i < lines.length; i++) {
|
||||||
|
const t = String(lines[i] || "").trim();
|
||||||
|
if (headerRe.test(t)) {
|
||||||
|
headerIdx = i;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
let headerLine = headerIdx >= 0 ? String(lines[headerIdx] || "").trim() : "flowchart TD";
|
||||||
|
headerLine = headerLine.replace(/^graph\b/i, "flowchart");
|
||||||
|
if (/^flowchart\b/i.test(headerLine) && !/\b(LR|RL|TD|TB|BT)\b/i.test(headerLine)) {
|
||||||
|
headerLine = "flowchart TD";
|
||||||
|
}
|
||||||
|
|
||||||
|
const out = [];
|
||||||
|
if (initLine) out.push(initLine);
|
||||||
|
out.push(headerLine);
|
||||||
|
for (let i = 0; i < lines.length; i++) {
|
||||||
|
if (initLine && i === 0) continue;
|
||||||
|
if (headerIdx === i) continue;
|
||||||
|
const l = String(lines[i] || "");
|
||||||
|
if (!l.trim()) continue;
|
||||||
|
out.push(l);
|
||||||
|
}
|
||||||
|
return out.join("\n").trim() + "\n";
|
||||||
|
}
|
||||||
|
|
||||||
|
function selfHealMermaid(block) {
|
||||||
|
let code = ensureHeaderAtTop(sanitizeAndNormalize(block));
|
||||||
|
|
||||||
|
const t = detectType(code);
|
||||||
|
if (t === "flowchart") {
|
||||||
|
code = repairNodesAndLabels(code);
|
||||||
|
code = balanceSubgraphs(code);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Final normalisation
|
||||||
|
code = code.replace(/-\s+->/g, "-->").replace(/==+/g, "==>").replace(/-\./g, "-.");
|
||||||
|
|
||||||
|
return code;
|
||||||
|
}
|
||||||
|
|
||||||
|
function hasCmd(cmd) {
|
||||||
|
try {
|
||||||
|
execSync(`command -v ${cmd}`, { stdio: "ignore" });
|
||||||
|
return true;
|
||||||
|
} catch {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function validateWithMmdc(inputMmdText) {
|
||||||
|
if (!hasCmd("mmdc")) return { ok: null, stderr: "mmdc_not_found" };
|
||||||
|
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "mmdc-heal-"));
|
||||||
|
const inFile = path.join(tmpDir, "temp.mmd");
|
||||||
|
fs.writeFileSync(inFile, inputMmdText, "utf8");
|
||||||
|
try {
|
||||||
|
execSync(`mmdc -i ${JSON.stringify(inFile)} -o /dev/null --quiet`, { stdio: "pipe" });
|
||||||
|
return { ok: true, stderr: "" };
|
||||||
|
} catch (e) {
|
||||||
|
const stderr =
|
||||||
|
e && typeof e === "object" && e.stderr && Buffer.isBuffer(e.stderr)
|
||||||
|
? e.stderr.toString("utf8")
|
||||||
|
: e && typeof e === "object" && typeof e.message === "string"
|
||||||
|
? e.message
|
||||||
|
: "mmdc_failed";
|
||||||
|
return { ok: false, stderr };
|
||||||
|
} finally {
|
||||||
|
try {
|
||||||
|
fs.rmSync(tmpDir, { recursive: true, force: true });
|
||||||
|
} catch {}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function healMarkdownFile(filePath) {
  let content = fs.readFileSync(filePath, "utf8");

  content = content.replace(/```mermaid\s*([\s\S]*?)```/g, (_match, rawBlock) => {
    let attempt = selfHealMermaid(rawBlock);
    let healed = false;

    for (let i = 0; i < 5; i++) {
      const v = validateWithMmdc(attempt);
      if (v.ok === null) {
        healed = true; // no validator available; still apply healing output
        break;
      }
      if (v.ok === true) {
        healed = true;
        break;
      }

      const err = v.stderr || "";
      const lineMatch = err.match(/line (\d+)/i);
      const line = lineMatch ? parseInt(lineMatch[1], 10) - 2 : null; // mmdc counts the header as line 1 or 2

      if (err.includes("Parse error") && line !== null) {
        const lines = attempt.split("\n");
        let bad = lines[line] || "";
        // Last-ditch repair: quote every bracket/paren label on the failing line
        bad = bad.replace(/\[([^\]"][^\]]*)\]/g, '["$1"]').replace(/\(([^)"]+)\)/g, '("$1")');
        lines[line] = bad;
        attempt = lines.join("\n");
      }
    }

    const final = healed ? attempt : `%% SELF-HEAL FAILED AFTER 5 ATTEMPTS\n${attempt}`;
    return "```mermaid\n" + final + "\n```";
  });

  fs.writeFileSync(filePath, content);
}

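The last-ditch label quoting used in the retry loop can be sketched in isolation (`quoteLabels` is an illustrative name, not part of the script):

```javascript
// Sketch of the last-ditch repair in healMarkdownFile: wrap unquoted
// [...] and (...) label contents in double quotes so Mermaid parses them.
function quoteLabels(line) {
  return line
    .replace(/\[([^\]"][^\]]*)\]/g, '["$1"]')
    .replace(/\(([^)"]+)\)/g, '("$1")');
}

console.log(quoteLabels("A[Start: init] --> B(Choose option)"));
// → A["Start: init"] --> B("Choose option")
```

Already-quoted labels are left alone, since both character classes exclude a leading `"`.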
function walkMarkdownFiles(startPath) {
  const st = fs.statSync(startPath);
  if (st.isFile()) {
    if (startPath.toLowerCase().endsWith(".md") || startPath.toLowerCase().endsWith(".markdown")) return [startPath];
    return [];
  }
  if (!st.isDirectory()) return [];
  const out = [];
  const entries = fs.readdirSync(startPath, { withFileTypes: true });
  for (const e of entries) {
    const p = path.join(startPath, e.name);
    if (e.isDirectory()) out.push(...walkMarkdownFiles(p));
    else if (e.isFile() && (p.toLowerCase().endsWith(".md") || p.toLowerCase().endsWith(".markdown"))) out.push(p);
  }
  return out;
}

function main(argv) {
  const targets = argv.slice(2);
  if (!targets.length) {
    console.error("Usage: node tools/mermaid/mermaid-self-heal.js <file-or-dir> [...]");
    process.exit(2);
  }
  for (const t of targets) {
    const abs = path.resolve(t);
    const files = walkMarkdownFiles(abs);
    for (const f of files) healMarkdownFile(f);
  }
}

if (require.main === module) main(process.argv);

tools/mermaid/mermaid-validate-worker.js (new file, +137 lines)
@@ -0,0 +1,137 @@

#!/usr/bin/env node
/**
 * Validate Mermaid blocks in a Markdown file by actually calling `mermaid.render()` in headless Chromium.
 *
 * Designed to run inside the Forgejo PDF worker image.
 *
 * Usage (inside worker):
 *   NODE_PATH=/opt/forgejo-pdf/node_modules node /script/mermaid-validate-worker.js /work/file.md
 */

const fs = require("node:fs");
const path = require("node:path");
const os = require("node:os");
const crypto = require("node:crypto");
const puppeteer = require("puppeteer");

function sha256Hex(text) {
  return crypto.createHash("sha256").update(String(text)).digest("hex");
}

function parseMermaidBlocks(markdown) {
  const blocks = [];
  const re = /```mermaid\s*([\s\S]*?)```/g;
  let m;
  while ((m = re.exec(markdown)) !== null) {
    blocks.push({
      start: m.index,
      end: m.index + m[0].length,
      rawBlock: m[1],
    });
  }
  return blocks;
}

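A self-contained check of the fence-parsing pattern (the fence string is assembled from parts here only to avoid nesting backtick fences in this example):

```javascript
// Extract mermaid fences with the same pattern parseMermaidBlocks uses.
const fence = "```";
const md = ["intro", fence + "mermaid", "flowchart TD", "A --> B", fence, "after"].join("\n");

const re = new RegExp(fence + "mermaid\\s*([\\s\\S]*?)" + fence, "g");
const found = [];
let m;
while ((m = re.exec(md)) !== null) found.push(m[1].trim());

console.log(found); // one entry: "flowchart TD\nA --> B"
```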
async function withBrowser(fn) {
  const userDataDir = fs.mkdtempSync(path.join(os.tmpdir(), "chrome-profile-"));
  const browser = await puppeteer.launch({
    headless: "new",
    args: ["--no-sandbox", "--disable-dev-shm-usage", "--allow-file-access-from-files", `--user-data-dir=${userDataDir}`],
  });
  try {
    return await fn(browser);
  } finally {
    try {
      await browser.close();
    } catch {}
    try {
      fs.rmSync(userDataDir, { recursive: true, force: true });
    } catch {}
  }
}

async function createMermaidPage(browser) {
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on("request", (req) => {
    const u = req.url();
    if (u.startsWith("file:") || u.startsWith("about:") || u.startsWith("data:")) return req.continue();
    return req.abort();
  });
  await page.setContent("<!doctype html><html><head></head><body></body></html>", { waitUntil: "load" });
  await page.addScriptTag({ path: "/opt/forgejo-pdf/assets/js/mermaid.min.js" });
  await page.evaluate(() => {
    if (!globalThis.mermaid) throw new Error("mermaid_missing");
    globalThis.mermaid.initialize({
      startOnLoad: false,
      securityLevel: "strict",
      htmlLabels: false,
      flowchart: { htmlLabels: false, useMaxWidth: false },
      sequence: { htmlLabels: false },
      state: { htmlLabels: false },
      class: { htmlLabels: false },
      fontFamily: "IBM Plex Sans",
      theme: "base",
    });
  });
  return page;
}

async function tryRender(page, id, code) {
  return await page.evaluate(
    async ({ id, code }) => {
      try {
        const r = await globalThis.mermaid.render(id, code);
        return { ok: true, svgLen: r && r.svg ? r.svg.length : 0 };
      } catch (e) {
        const msg = e && typeof e === "object" && (e.str || e.message) ? String(e.str || e.message) : String(e);
        return { ok: false, error: msg };
      }
    },
    { id, code }
  );
}

function firstNonEmptyLine(block) {
  const lines = String(block || "").replace(/\r\n?/g, "\n").split("\n");
  for (const l of lines) {
    const t = l.trim();
    if (t) return t;
  }
  return "";
}

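`firstNonEmptyLine` picks the diagram header used in failure reports, normalizing CRLF first; restated verbatim so the snippet runs standalone:

```javascript
// Same logic as the worker's firstNonEmptyLine: normalize line endings,
// then return the first line with non-whitespace content.
function firstNonEmptyLine(block) {
  const lines = String(block || "").replace(/\r\n?/g, "\n").split("\n");
  for (const l of lines) {
    const t = l.trim();
    if (t) return t;
  }
  return "";
}

console.log(firstNonEmptyLine("\r\n  \nflowchart TD\nA --> B"));
// → flowchart TD
```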
async function main() {
  const filePath = process.argv[2];
  if (!filePath) {
    console.error("Usage: node mermaid-validate-worker.js /path/to/file.md");
    process.exit(2);
  }
  const markdown = fs.readFileSync(filePath, "utf8");
  const blocks = parseMermaidBlocks(markdown);

  const failures = [];
  await withBrowser(async (browser) => {
    const page = await createMermaidPage(browser);
    for (let i = 0; i < blocks.length; i++) {
      const b = blocks[i];
      const id = "m-" + sha256Hex(`${path.basename(filePath)}|${i}|${b.rawBlock}`).slice(0, 12);
      const r = await tryRender(page, id, b.rawBlock);
      if (!r.ok) {
        failures.push({ index: i, header: firstNonEmptyLine(b.rawBlock), error: r.error });
        if (failures.length >= 25) break;
      }
    }
    await page.close();
  });

  const out = { file: filePath, total: blocks.length, failures };
  console.log(JSON.stringify(out));
  process.exit(failures.length ? 1 : 0);
}

main().catch((e) => {
  console.error(JSON.stringify({ error: String(e && e.message ? e.message : e) }));
  process.exit(1);
});