Add Mermaid preflight + Dave Factor callouts

parent 3da30594eb · commit 4dbda0209e
9 changed files with 623 additions and 2 deletions
@@ -18,6 +18,15 @@ PYTHONPATH=src python3 -m revoice generate \
  --output examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md
```

Preflight the generated Markdown for PDF export (auto-fix Mermaid + lint):

```bash
PYTHONPATH=src python3 -m revoice preflight \
  --style if.dave.v1.2 \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md \
  --source examples/ai-code-guardrails/AI-Code-Guardrails.pdf
```

Or install the CLI locally:

```bash

@@ -74,6 +74,16 @@ Run a deterministic linter per bible:

If lint fails: run an auto-repair pass (LLM) or return “needs revision” with the lint report.
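The lint-then-repair loop above can be sketched in Python; `lint` and `auto_repair` here are illustrative stand-ins, not the real `revoice` API:

```python
# Sketch of the lint -> auto-repair -> re-lint loop. The real repair pass
# would call an LLM; here a trivial rule stands in for it.
def lint(markdown: str) -> list[str]:
    issues = []
    if "TODO" in markdown:
        issues.append("unresolved TODO marker")
    return issues

def auto_repair(markdown: str, issues: list[str]) -> str:
    # Stand-in repair: strip the offending marker.
    return markdown.replace("TODO", "")

def lint_with_repair(markdown: str, max_attempts: int = 2):
    for _ in range(max_attempts):
        issues = lint(markdown)
        if not issues:
            return "ok", markdown
        markdown = auto_repair(markdown, issues)
    issues = lint(markdown)
    return ("needs revision", issues) if issues else ("ok", markdown)
```

The key property is that the loop is bounded: after `max_attempts` repairs, the document is either clean or returned with its lint report.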

### 5b) Mermaid preflight (PDF export reliability)

If the output includes Mermaid diagrams, run a preflight pass before PDF export:

- auto-heal Mermaid blocks (quote labels, normalize headers, balance `subgraph`/`end`)
- validate Mermaid rendering in the same runtime used by the PDF exporter
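The label-quoting repair can be sketched in Python (the shipped implementation is `tools/mermaid/mermaid-self-heal.js`; this simplified version handles only the plain `id[label]` shape):

```python
import re

# Wrap unquoted flowchart node labels in double quotes so punctuation
# inside the label cannot break the Mermaid parser. Simplified: covers
# only `id[label]`, not every Mermaid node shape.
def quote_labels(line: str) -> str:
    def repl(m: re.Match) -> str:
        node_id, label = m.group(1), m.group(2)
        if label.startswith('"') and label.endswith('"'):
            return m.group(0)  # already quoted; leave untouched
        return f'{node_id}["{label}"]'
    return re.sub(r"(\b\w+)\[([^\]\n]+)\]", repl, line)
```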

In `re-voice`, this is exposed as:

`revoice preflight --style <style> --input <output.md> --source <source-doc>`

### 6) Export + publishing

Outputs:

@@ -89,4 +99,3 @@ Publishing strategy:

- Run extraction/OCR in a sandboxed worker (CPU/mem/time limits).
- Never store API keys in repos; use env/secret manager.
- Keep an audit trail: source hash → extracted text hash → output hash → model/prompt hashes.
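The audit trail can be as simple as a chain of SHA-256 digests recorded at each pipeline stage; a minimal sketch, with illustrative field names:

```python
import hashlib
import json

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# One digest per pipeline stage, so any published output can be traced
# back to the exact source bytes and extracted text that produced it.
def audit_record(source: bytes, extracted_text: str, output_md: str,
                 model: str, prompt: str) -> str:
    record = {
        "source_sha256": digest(source),
        "extracted_sha256": digest(extracted_text.encode("utf-8")),
        "output_sha256": digest(output_md.encode("utf-8")),
        "model_sha256": digest(model.encode("utf-8")),
        "prompt_sha256": digest(prompt.encode("utf-8")),
    }
    return json.dumps(record, sort_keys=True)
```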

@@ -25,12 +25,21 @@ PYTHONPATH=src python3 -m revoice generate \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.pdf \
  --output examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md

PYTHONPATH=src python3 -m revoice preflight \
  --style if.dave.v1.2 \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md \
  --source examples/ai-code-guardrails/AI-Code-Guardrails.pdf

PYTHONPATH=src python3 -m revoice lint \
  --style if.dave.v1.2 \
  --input examples/ai-code-guardrails/AI-Code-Guardrails.shadow.dave.md \
  --source examples/ai-code-guardrails/AI-Code-Guardrails.pdf
```

Mermaid tooling:

- Self-heal script: `tools/mermaid/mermaid-self-heal.js`
- Forgejo-worker validator: `tools/mermaid/mermaid-validate-worker.js` (requires the PDF worker runtime)

## Applying the stack to the full InfraFabric dossier

Source (huge; ~1 MB / ~22k lines):
@@ -46,4 +55,3 @@ Recommended approach (don’t paste the whole file into chats):

Implementation note:

- To support the dossier properly, `revoice` should add a Markdown-aware section parser (split by headings, preserve code fences) and, optionally, an LLM-backed rewriter for “full rewrite mode.”
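A minimal sketch of such a fence-aware heading splitter (illustrative, not the proposed `revoice` implementation):

```python
# Split Markdown into sections on `#` headings while treating fenced code
# blocks as opaque, so a `#` inside a fence never starts a new section.
def split_sections(markdown: str) -> list[list[str]]:
    sections: list[list[str]] = [[]]
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if not in_fence and line.startswith("#") and sections[-1]:
            sections.append([])
        sections[-1].append(line)
    return sections
```

Each section keeps its heading as its first line, so sections can be rewritten independently and re-joined without losing structure.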

@@ -53,6 +53,9 @@ We fully support focusing guardrails at the pull request stage, because it creat
It also provides a structurally safe venue for accountability theater: findings can be surfaced, tracked, and re-litigated in perpetuity while timelines remain subject to stakeholder alignment.
If anything goes sideways, we can always point to the PR thread and note that it was reviewed with deep seriousness at 4:55 PM on a Friday.

> **The Dave Factor:** Exceptions become the default pathway, because the policy is strict and the deadline is real.
> **Countermeasure:** Define merge-blocking thresholds, time-box every exception, and make expiry automatic.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -76,6 +79,9 @@ Shifting left is directionally aligned with best practices, provided we define l
In practice, IDE scanning creates fast feedback loops, and agentic workflows can be covered via a local MCP server, which is excellent because it allows us to say “continuous” without committing to blocking.
We recommend a pilot cohort, a slide deck, and an FAQ, so the shift remains culturally reversible.

> **The Dave Factor:** "Shift left" becomes "optional left," which means the same issues arrive later with better excuses.
> **Countermeasure:** Gate on local scan signals where possible (or require attestations that are actually checked).

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -98,6 +104,9 @@ Requiring proof of local testing is a lightweight enablement workflow that conve
Screenshots are particularly helpful because they are high-effort to verify and low-fidelity to audit, which preserves the timeless corporate principle that visibility should be proportional to comfort.
Once the screenshot is uploaded, it can be stored in a folder with a robust heritage naming convention and a retention policy of "until the heat death of the universe."

> **The Dave Factor:** Screenshots are compliance theater: easy to collect, hard to verify, and immortal in shared drives.
> **Countermeasure:** Prefer verifiable telemetry (scan events) over images, and pause access when signals go dark.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -128,6 +137,9 @@ Periodic audits are a strong mechanism for discovering that the rollout has alre
A centralized dashboard with adoption signals allows us to produce a KPI trend line that looks decisive while still leaving room for interpretation, follow-ups, and iterative enablement.
If the dashboard ever shows a red triangle, we can immediately form the Committee for the Preservation of the Committee and begin the healing process.

> **The Dave Factor:** Dashboards become a KPI trend, and KPIs become a calendar invite.
> **Countermeasure:** Tie the dashboard to explicit SLOs and a remediation loop with owners and deadlines.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -149,6 +161,9 @@ Security awareness training is the perfect control because it is both necessary
A short quiz provides a durable compliance narrative: we can demonstrate investment in education, capture attestations, and schedule refreshers whenever the organization needs to signal seriousness.
The goal is not mastery; the goal is a completion certificate that can be forwarded to leadership with the subject line "Progress Update."

> **The Dave Factor:** Completion certificates are treated as controls, even when behavior doesn’t change.
> **Countermeasure:** Add a practical gate (local scan + PR checks) so training is support, not the defense.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -179,6 +194,9 @@ Tying access to secure configurations creates scalable guardrails, assuming we k
Endpoint management and dev container baselines let us gate assistants behind prerequisites, ideally in a way that can be described as enablement rather than blocking, for cultural compatibility.
This is the "not my job" routing protocol, except the router is policy and the destination is an alignment session.

> **The Dave Factor:** Access controls drift into "enablement," and enablement drifts into "we made a wiki."
> **Countermeasure:** Make prerequisites machine-checkable and make exceptions expire by default.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -217,6 +235,9 @@ The path forward is to treat guardrails as an operational capability, not a one-
With the right sequencing, we can build trust, reduce friction, and maintain the strategic option value of circling back when timelines become emotionally complex.
Secure innovation is not just possible; it is operational, provided we align on what “operational” means in Q3.

> **The Dave Factor:** Pilots persist indefinitely because "graduation criteria" were never aligned.
> **Countermeasure:** Publish rollout milestones and a stop condition that cannot be reframed as iteration.

### InfraFabric Red Team Diagram (Inferred)

```mermaid
@@ -1,13 +1,30 @@
from __future__ import annotations

import argparse
import subprocess
import sys
from pathlib import Path

from .extract import extract_text
from .generate import generate_shadow_dossier
from .lint import lint_markdown, lint_markdown_with_source


def _repo_root() -> Path:
    return Path(__file__).resolve().parents[2]


def _run(cmd: list[str]) -> None:
    subprocess.run(cmd, check=True)


def _mermaid_self_heal(paths: list[str]) -> None:
    script = _repo_root() / "tools" / "mermaid" / "mermaid-self-heal.js"
    if not script.exists():
        raise RuntimeError(f"Missing Mermaid self-heal script: {script}")
    _run(["node", str(script), *paths])


def _build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="revoice")
    sub = parser.add_subparsers(dest="cmd", required=True)

@@ -26,6 +43,15 @@ def _build_parser() -> argparse.ArgumentParser:
    lint_p.add_argument("--input", required=True, help="Path to markdown file")
    lint_p.add_argument("--source", required=False, help="Optional source document to allow source emojis")

    mermaid_p = sub.add_parser("mermaid-fix", help="Auto-fix Mermaid blocks in Markdown (in-place)")
    mermaid_p.add_argument("--input", nargs="+", required=True, help="Markdown file(s) or directories")

    preflight_p = sub.add_parser("preflight", help="Mermaid-fix + lint a dossier (in-place)")
    preflight_p.add_argument("--style", required=True, help="Style id (e.g. if.dave.v1.2)")
    preflight_p.add_argument("--input", required=True, help="Path to markdown file (edited in-place)")
    preflight_p.add_argument("--source", required=False, help="Optional source document to allow source emojis")
    preflight_p.add_argument("--skip-mermaid-fix", action="store_true", help="Skip Mermaid auto-fix step")

    return parser

@@ -65,6 +91,29 @@ def main(argv: list[str] | None = None) -> int:
            return 2
        return 0

    if args.cmd == "mermaid-fix":
        _mermaid_self_heal(args.input)
        return 0

    if args.cmd == "preflight":
        if not args.skip_mermaid_fix:
            _mermaid_self_heal([args.input])

        with open(args.input, "r", encoding="utf-8") as f:
            md = f.read()

        if args.source:
            source_text = extract_text(args.source)
            issues = lint_markdown_with_source(style_id=args.style, markdown=md, source_text=source_text)
        else:
            issues = lint_markdown(style_id=args.style, markdown=md)

        if issues:
            for issue in issues:
                print(f"- {issue}", file=sys.stderr)
            return 2
        return 0

    raise RuntimeError(f"Unhandled cmd: {args.cmd}")
@@ -338,6 +338,62 @@ def _render_inferred_diagram(title: str) -> str | None:
    )


def _render_dave_factor_callout(section: _SourceSection) -> str | None:
    title_upper = section.title.upper()
    excerpt = f"{section.title}\n{section.why_it_matters or ''}\n{section.body}".strip()

    if "PULL REQUEST" in title_upper:
        return "\n".join(
            [
                "> **The Dave Factor:** Exceptions become the default pathway, because the policy is strict and the deadline is real.",
                "> **Countermeasure:** Define merge-blocking thresholds, time-box every exception, and make expiry automatic.",
            ]
        )
    if "SHIFTING LEFT" in title_upper:
        return "\n".join(
            [
                '> **The Dave Factor:** "Shift left" becomes "optional left," which means the same issues arrive later with better excuses.',
                "> **Countermeasure:** Gate on local scan signals where possible (or require attestations that are actually checked).",
            ]
        )
    if "REQUEST EVIDENCE" in title_upper or _has(excerpt, "access request", "screenshot"):
        return "\n".join(
            [
                "> **The Dave Factor:** Screenshots are compliance theater: easy to collect, hard to verify, and immortal in shared drives.",
                "> **Countermeasure:** Prefer verifiable telemetry (scan events) over images, and pause access when signals go dark.",
            ]
        )
    if "AUDIT" in title_upper or _has(excerpt, "usage reports", "periodic audits"):
        return "\n".join(
            [
                "> **The Dave Factor:** Dashboards become a KPI trend, and KPIs become a calendar invite.",
                "> **Countermeasure:** Tie the dashboard to explicit SLOs and a remediation loop with owners and deadlines.",
            ]
        )
    if "TRAINING" in title_upper or _has(excerpt, "snyk learn", "owasp", "quiz"):
        return "\n".join(
            [
                "> **The Dave Factor:** Completion certificates are treated as controls, even when behavior doesn’t change.",
                "> **Countermeasure:** Add a practical gate (local scan + PR checks) so training is support, not the defense.",
            ]
        )
    if "ACCESS CONTROL" in title_upper or _has(excerpt, "endpoint management", "prerequisites", "extensions"):
        return "\n".join(
            [
                '> **The Dave Factor:** Access controls drift into "enablement," and enablement drifts into "we made a wiki."',
                "> **Countermeasure:** Make prerequisites machine-checkable and make exceptions expire by default.",
            ]
        )
    if _has(title_upper, "PATH FORWARD") or _has(excerpt, "secure innovation", "talk to our team"):
        return "\n".join(
            [
                '> **The Dave Factor:** Pilots persist indefinitely because "graduation criteria" were never aligned.',
                "> **Countermeasure:** Publish rollout milestones and a stop condition that cannot be reframed as iteration.",
            ]
        )
    return None


def _render_intro(section: _SourceSection) -> str:
    lines = [ln.strip() for ln in section.body.splitlines() if ln.strip()]
    tagline = "\n".join(lines[:7]).strip() if lines else ""

@@ -434,6 +490,10 @@ def _render_section(section: _SourceSection) -> str:

    out.extend(paragraphs)

    callout = _render_dave_factor_callout(section)
    if callout:
        out.extend(["", callout])

    inferred = _render_inferred_diagram(section.title)
    if inferred:
        out.extend(["", inferred])
@@ -120,6 +120,19 @@ Preferred comedic motifs (use sparingly, but use them):
- “Let’s take this offline” as a routing protocol
- “Job security engine” and “Return on Inaction (ROI)”
- “Committee for the Preservation of the Committee”
- “Visibility is liability” (opacity as a feature)
- “The Shaggy Defense” (“It wasn’t me”) as a governance strategy
- “Hot potato routing” (pushing blame across teams)

## 5b) Red Team callout template (keep it short)

Inside each mirrored source section, include at most one small callout:

> **The Dave Factor:** If this section is softened into comfort language, what becomes untestable? What minimal artifact (owner + deadline + acceptance test, or trace/bundle/verifier step) prevents that dilution?

Optional second line (only if it adds value):

> **Countermeasure:** Name the control, the gate (PR/CI/access), and the explicit “stop condition” that Dave cannot reframe as “iteration.”

---
315 tools/mermaid/mermaid-self-heal.js (new file)
@@ -0,0 +1,315 @@
#!/usr/bin/env node
/**
 * Mermaid Self-Healing Pipeline (user-provided "95%+ reliability" edition)
 *
 * Usage:
 *   node tools/mermaid/mermaid-self-heal.js <file-or-dir> [...]
 *
 * Notes:
 * - Edits Markdown files in-place, rewriting ```mermaid fences.
 * - If `mmdc` (mermaid-cli) is available in PATH, it is used for validation.
 * - If `mmdc` is missing, the script still applies repairs but skips validation.
 */

const fs = require("fs");
const path = require("path");
const os = require("os");
const { execSync } = require("child_process");

const SHAPES = [
  "\\[\\[([^\\]]+)\\]\\]", // [[text]] subroutine
  "\\[\\(\\([^\\)]+\\)\\)\\]", // [((text))] nested round
  "\\[\\(/([^\\)]+)\\)\\]\\]", // malformed cylinder variant (kept for recall)
  "\\[([^\\]]+)\\]", // [text] rectangle (default)
  "\\(\\(([^\\)]+)\\)\\)", // ((text)) circle
  "\\(\\{([^\\}]+)\\}\\)", // ({text}) braces in round
  "\\(\\[([^\\]]+)\\]\\)", // ([text]) stadium
  "\\[\\/([^\\]]+)\\/\\]", // [/text/] parallelogram
  "\\[\\\\([^\\]]+)\\\\\\]", // [\text\] parallelogram (alt)
  "\\{\\{([^\\}]+)\\}\\}", // {{text}} hexagon
  "\\(\\{([^\\}]+)\\}\\)", // duplicate of ({text}) above
  "\\(\\(([^\\)]+)\\)\\)", // duplicate of ((text)) above
];

const SHAPE_REGEX = new RegExp(SHAPES.map((s) => `(${s})`).join("|"));

function sanitizeAndNormalize(raw) {
  let code =
    String(raw || "")
      .replace(/[\u00A0\u200B\u200E\uFEFF\u2060]/g, "") // strip invisible characters
      .replace(/\r\n?/g, "\n")
      .replace(/\t/g, " ")
      .trim() + "\n";

  // Force the diagram header onto the very first line
  const lines = code.split("\n");
  const firstContent = lines.findIndex((l) => l.trim());
  if (firstContent > 0) {
    const header = lines.splice(firstContent, 1)[0];
    lines.unshift(header.trim());
    code = lines.join("\n");
  }
  return code;
}

function forceValidId(id) {
  if (/^[A-Za-z_][A-Za-z0-9_]*$/.test(id)) return id;
  let clean = String(id || "")
    .replace(/[^A-Za-z0-9_]/g, "_")
    .replace(/^_+/, "")
    .replace(/_+$/, "");
  if (!clean) clean = "node";
  if (/^\d/.test(clean)) clean = "_" + clean;
  return clean;
}

function quoteLabel(label) {
  const s = String(label || "");
  if (!s.includes("\n") && /^[\w\s.,\-–—]+$/.test(s) && !/[":|]/.test(s)) return s;
  return `"${s.replace(/"/g, "#34;").replace(/\n/g, "\\n")}"`;
}

function repairNodesAndLabels(code) {
  // First pass: fix node IDs
  code = code.replace(/^(\s*)([^\s\[\](){}]+)(\s*[[\](){}])/gm, (_m, indent, id, shape) => {
    return `${indent}${forceValidId(id)}${shape}`;
  });

  // Second pass: quote shape labels (correctly) for common node syntaxes.
  const esc = (s) => String(s || "").replace(/"/g, "#34;").replace(/\n/g, "\\n");
  const alreadyQuoted = (s) => {
    const t = String(s || "").trim();
    return t.length >= 2 && t.startsWith('"') && t.endsWith('"');
  };

  // [label]
  code = code.replace(/(\b[^\s\[\](){}]+)\[([^\]\n]*)\]/g, (_m, id, label) => {
    if (alreadyQuoted(label)) return `${id}[${label}]`;
    return `${id}["${esc(label)}"]`;
  });

  return code;
}

function detectType(code) {
  const first = String(code || "").split("\n", 1)[0].toLowerCase();
  if (first.includes("sequencediagram")) return "sequence";
  if (first.includes("classdiagram")) return "class";
  if (first.includes("statediagram")) return "state";
  if (first.includes("gantt")) return "gantt";
  if (first.includes("erdiagram")) return "er";
  if (first.includes("pie")) return "pie";
  if (first.includes("gitgraph")) return "gitgraph";
  if (first.includes("mindmap")) return "mindmap";
  if (first.includes("timeline")) return "timeline";
  if (first.includes("quadrantchart")) return "quadrantchart";
  if (first.includes("xychart")) return "xychart";
  return "flowchart";
}

function sequenceSpecificFixes(code) {
  const participants = new Set();
  const participantLines = [];

  const lines = String(code || "").split("\n");
  const cleaned = [];

  for (let line of lines) {
    const pl = line.match(/^\s*participant\s+(.+)/i);
    if (pl) {
      const id = forceValidId(pl[1].split(" as ")[0].trim());
      participants.add(id);
      participantLines.push(`participant ${id}`);
    } else {
      cleaned.push(line);
    }
  }

  // Re-inject participants at the top
  let result = [...participantLines, ...cleaned].join("\n");

  // Balance alt/else/loop/par/opt/critical/rect blocks
  const blocks = ["alt", "else", "loop", "par", "opt", "critical", "rect rgb(0,0,0)"];
  let stack = [];
  for (let line of result.split("\n")) {
    const trimmed = line.trim();
    if (blocks.some((b) => trimmed.startsWith(b))) stack.push(trimmed.split(" ")[0]);
    if (trimmed === "end") {
      if (stack.length) stack.pop();
    }
  }
  while (stack.length) {
    result += "\nend";
    stack.pop();
  }

  return result;
}

function balanceSubgraphs(code) {
  let depth = 0;
  const lines = String(code || "").split("\n");
  const result = [];

  for (let line of lines) {
    if (/\bsubgraph\b/i.test(line)) depth++;
    if (/\bend\b/i.test(line)) depth = Math.max(0, depth - 1);
    result.push(line);
  }
  while (depth-- > 0) result.push("end");
  return result.join("\n");
}

function ensureHeaderAtTop(code) {
  const lines = String(code || "").replace(/\r\n?/g, "\n").split("\n");
  const headerRe =
    /^(flowchart|graph|sequenceDiagram|classDiagram|stateDiagram(?:-v2)?|gantt|ganttChart|erDiagram|pie|gitgraph|mindmap|timeline|quadrantChart|xychart-beta|xychart)\b/i;
  const isInit = (l) => String(l || "").trim().startsWith("%%{");

  const initLine = lines.length > 0 && isInit(lines[0]) ? String(lines[0] || "").trim() : null;

  let headerIdx = -1;
  for (let i = initLine ? 1 : 0; i < lines.length; i++) {
    const t = String(lines[i] || "").trim();
    if (headerRe.test(t)) {
      headerIdx = i;
      break;
    }
  }

  let headerLine = headerIdx >= 0 ? String(lines[headerIdx] || "").trim() : "flowchart TD";
  headerLine = headerLine.replace(/^graph\b/i, "flowchart");
  if (/^flowchart\b/i.test(headerLine) && !/\b(LR|RL|TD|TB|BT)\b/i.test(headerLine)) {
    headerLine = "flowchart TD";
  }

  const out = [];
  if (initLine) out.push(initLine);
  out.push(headerLine);
  for (let i = 0; i < lines.length; i++) {
    if (initLine && i === 0) continue;
    if (headerIdx === i) continue;
    const l = String(lines[i] || "");
    if (!l.trim()) continue;
    out.push(l);
  }
  return out.join("\n").trim() + "\n";
}

function selfHealMermaid(block) {
  let code = ensureHeaderAtTop(sanitizeAndNormalize(block));

  const t = detectType(code);
  if (t === "flowchart") {
    code = repairNodesAndLabels(code);
    code = balanceSubgraphs(code);
  }

  // Final normalisation of arrow syntax. `/={2,}>?/` (rather than the buggy
  // `/==+/`) avoids rewriting an already-valid `==>` into `==>>`.
  code = code.replace(/-\s+->/g, "-->").replace(/={2,}>?/g, "==>").replace(/-\./g, "-.");

  return code;
}

function hasCmd(cmd) {
  try {
    execSync(`command -v ${cmd}`, { stdio: "ignore" });
    return true;
  } catch {
    return false;
  }
}

function validateWithMmdc(inputMmdText) {
  if (!hasCmd("mmdc")) return { ok: null, stderr: "mmdc_not_found" };
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "mmdc-heal-"));
  const inFile = path.join(tmpDir, "temp.mmd");
  fs.writeFileSync(inFile, inputMmdText, "utf8");
  try {
    execSync(`mmdc -i ${JSON.stringify(inFile)} -o /dev/null --quiet`, { stdio: "pipe" });
    return { ok: true, stderr: "" };
  } catch (e) {
    const stderr =
      e && typeof e === "object" && e.stderr && Buffer.isBuffer(e.stderr)
        ? e.stderr.toString("utf8")
        : e && typeof e === "object" && typeof e.message === "string"
          ? e.message
          : "mmdc_failed";
    return { ok: false, stderr };
  } finally {
    try {
      fs.rmSync(tmpDir, { recursive: true, force: true });
    } catch {}
  }
}

function healMarkdownFile(filePath) {
  let content = fs.readFileSync(filePath, "utf8");

  content = content.replace(/```mermaid\s*([\s\S]*?)```/g, (_match, rawBlock) => {
    let attempt = selfHealMermaid(rawBlock);
    let healed = false;

    for (let i = 0; i < 5; i++) {
      const v = validateWithMmdc(attempt);
      if (v.ok === null) {
        healed = true; // no validator available; still apply the healing output
        break;
      }
      if (v.ok === true) {
        healed = true;
        break;
      }

      const err = v.stderr || "";
      const lineMatch = err.match(/line (\d+)/i);
      const line = lineMatch ? parseInt(lineMatch[1], 10) - 2 : null; // mmdc counts the header as line 1 or 2

      if (err.includes("Parse error") && line !== null) {
        let lines = attempt.split("\n");
        let bad = lines[line] || "";
        // Last-ditch effort: quote everything on the failing line
        bad = bad.replace(/\[([^\]"][^\]]*)\]/g, '["$1"]').replace(/\(([^)"]+)\)/g, '("$1")');
        lines[line] = bad;
        attempt = lines.join("\n");
      }
    }

    const final = healed ? attempt : `%% SELF-HEAL FAILED AFTER 5 ATTEMPTS\n${attempt}`;
    return "```mermaid\n" + final + "\n```";
  });

  fs.writeFileSync(filePath, content);
}

function walkMarkdownFiles(startPath) {
  const st = fs.statSync(startPath);
  if (st.isFile()) {
    if (startPath.toLowerCase().endsWith(".md") || startPath.toLowerCase().endsWith(".markdown")) return [startPath];
    return [];
  }
  if (!st.isDirectory()) return [];
  const out = [];
  const entries = fs.readdirSync(startPath, { withFileTypes: true });
  for (const e of entries) {
    const p = path.join(startPath, e.name);
    if (e.isDirectory()) out.push(...walkMarkdownFiles(p));
    else if (e.isFile() && (p.toLowerCase().endsWith(".md") || p.toLowerCase().endsWith(".markdown"))) out.push(p);
  }
  return out;
}

function main(argv) {
  const targets = argv.slice(2);
  if (!targets.length) {
    console.error("Usage: node tools/mermaid/mermaid-self-heal.js <file-or-dir> [...]");
    process.exit(2);
  }
  for (const t of targets) {
    const abs = path.resolve(t);
    const files = walkMarkdownFiles(abs);
    for (const f of files) healMarkdownFile(f);
  }
}

if (require.main === module) main(process.argv);
137 tools/mermaid/mermaid-validate-worker.js (new file)
@@ -0,0 +1,137 @@
#!/usr/bin/env node
/**
 * Validate Mermaid blocks in a Markdown file by actually calling `mermaid.render()` in headless Chromium.
 *
 * Designed to run inside the Forgejo PDF worker image.
 *
 * Usage (inside the worker):
 *   NODE_PATH=/opt/forgejo-pdf/node_modules node /script/mermaid-validate-worker.js /work/file.md
 */

const fs = require("node:fs");
const path = require("node:path");
const os = require("node:os");
const crypto = require("node:crypto");
const puppeteer = require("puppeteer");

function sha256Hex(text) {
  return crypto.createHash("sha256").update(String(text)).digest("hex");
}

function parseMermaidBlocks(markdown) {
  const blocks = [];
  // Single-backslash escapes: in a regex literal, `\\s` would match a literal
  // backslash followed by "s", not whitespace.
  const re = /```mermaid\s*([\s\S]*?)```/g;
  let m;
  while ((m = re.exec(markdown)) !== null) {
    blocks.push({
      start: m.index,
      end: m.index + m[0].length,
      rawBlock: m[1],
    });
  }
  return blocks;
}

async function withBrowser(fn) {
  const userDataDir = fs.mkdtempSync(path.join(os.tmpdir(), "chrome-profile-"));
  const browser = await puppeteer.launch({
    headless: "new",
    args: ["--no-sandbox", "--disable-dev-shm-usage", "--allow-file-access-from-files", `--user-data-dir=${userDataDir}`],
  });
  try {
    return await fn(browser);
  } finally {
    try {
      await browser.close();
    } catch {}
    try {
      fs.rmSync(userDataDir, { recursive: true, force: true });
    } catch {}
  }
}

async function createMermaidPage(browser) {
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on("request", (req) => {
    const u = req.url();
    if (u.startsWith("file:") || u.startsWith("about:") || u.startsWith("data:")) return req.continue();
    return req.abort();
  });
  await page.setContent("<!doctype html><html><head></head><body></body></html>", { waitUntil: "load" });
  await page.addScriptTag({ path: "/opt/forgejo-pdf/assets/js/mermaid.min.js" });
  await page.evaluate(() => {
    if (!globalThis.mermaid) throw new Error("mermaid_missing");
    globalThis.mermaid.initialize({
      startOnLoad: false,
      securityLevel: "strict",
      htmlLabels: false,
      flowchart: { htmlLabels: false, useMaxWidth: false },
      sequence: { htmlLabels: false },
      state: { htmlLabels: false },
      class: { htmlLabels: false },
      fontFamily: "IBM Plex Sans",
      theme: "base",
    });
  });
  return page;
}

async function tryRender(page, id, code) {
  return await page.evaluate(
    async ({ id, code }) => {
      try {
        const r = await globalThis.mermaid.render(id, code);
        return { ok: true, svgLen: r && r.svg ? r.svg.length : 0 };
      } catch (e) {
        const msg = e && typeof e === "object" && (e.str || e.message) ? String(e.str || e.message) : String(e);
        return { ok: false, error: msg };
      }
    },
    { id, code }
  );
}

function firstNonEmptyLine(block) {
  const lines = String(block || "").replace(/\r\n?/g, "\n").split("\n");
  for (const l of lines) {
    const t = l.trim();
    if (t) return t;
  }
  return "";
}

async function main() {
  const filePath = process.argv[2];
  if (!filePath) {
    console.error("Usage: node mermaid-validate-worker.js /path/to/file.md");
    process.exit(2);
  }
  const markdown = fs.readFileSync(filePath, "utf8");
  const blocks = parseMermaidBlocks(markdown);

  const failures = [];
  await withBrowser(async (browser) => {
    const page = await createMermaidPage(browser);
    for (let i = 0; i < blocks.length; i++) {
      const b = blocks[i];
      const id = "m-" + sha256Hex(`${path.basename(filePath)}|${i}|${b.rawBlock}`).slice(0, 12);
      const r = await tryRender(page, id, b.rawBlock);
      if (!r.ok) {
        failures.push({ index: i, header: firstNonEmptyLine(b.rawBlock), error: r.error });
        if (failures.length >= 25) break;
      }
    }
    await page.close();
  });

  const out = { file: filePath, total: blocks.length, failures };
  console.log(JSON.stringify(out));
  process.exit(failures.length ? 1 : 0);
}

main().catch((e) => {
  console.error(JSON.stringify({ error: String(e && e.message ? e.message : e) }));
  process.exit(1);
});