# IF.TYPESET (iftypeset): zero-context explainer

This document is meant to let a fresh session understand `if.typeset` without any chat history.

## Public surfaces

Product landing page:

- https://infrafabric.io/if/typeset/

Forgejo repo:

- https://git.infrafabric.io/dannystocker/iftypeset
- https://git.infrafabric.io/dannystocker/iftypeset.git

## What it is (black/white)

Verified:

- A spec-driven **Document CI** pipeline: Markdown → deterministic HTML/CSS → PDF render → post-render QA → reports + exit codes.
- A machine-readable **rule registry** (rules are paraphrased + pointer refs; no book text).
- Default PDF renderer is **Playwright**; Chromium CLI rendering is **banned**.

Not promised:

- Not a fact checker, citation verifier, or “trust” oracle.
- Not a geometry-perfect PDF validator (PDF QA is heuristic).
- PDF determinism is “given pinned engine + fonts + locale”, not “byte-identical everywhere”.

## Mental model (spec → pipeline)

1) **Spec** defines rules, profiles, QA thresholds, and coverage metadata.
2) **Profiles** map typographic intent into deterministic render tokens (A4, margins, font stacks, hyphenation, etc.).
3) **Lint** checks Markdown text/structure and emits deterministic diagnostics.
4) **Render HTML** produces deterministic HTML+CSS.
5) **Render PDF** produces a PDF (engine/fonts dependent); renderer metadata is logged.
6) **QA** scans HTML (and PDF when present) for layout risks and enforces numeric gates.
7) **Report** emits coverage summaries + an HTML index for review.

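As a rough illustration of the mental model, the stages can be written down as an ordered plan. This is a minimal sketch, not the project's implementation: the subcommand names are the ones listed under Commands in this document, while the pairing of the spec/profile stages with `validate-spec`, the `PIPELINE` table, and the `plan` helper are assumptions made purely for illustration.

```python
# Hypothetical mapping from mental-model stages to the documented subcommands.
# Pairing "spec + profiles" with `validate-spec` is an assumption, not a
# statement about how `iftypeset run` is implemented.
PIPELINE = [
    ("spec + profiles", "validate-spec"),
    ("lint", "lint"),
    ("render HTML", "render-html"),
    ("render PDF", "render-pdf"),
    ("QA", "qa"),
    ("report", "report"),
]


def plan(skip_pdf: bool = False) -> list[str]:
    """Return subcommand names in pipeline order.

    `skip_pdf` mirrors the intent of the documented `--skip-pdf` flag:
    the PDF step is dropped and QA then works from HTML only.
    """
    return [cmd for _, cmd in PIPELINE if not (skip_pdf and cmd == "render-pdf")]


print(plan())
print(plan(skip_pdf=True))
```

The point of the sketch is only the ordering: everything up to and including the HTML render is described as deterministic, and the PDF step is the one stage that depends on the engine and fonts.
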
## Where truth lives in the repo

- CLI contract: `app/CLI_SPEC.md`
- Rendering pipeline overview: `docs/17-rendering-pipeline.md`
- Renderer strategy / determinism knobs: `docs/04-renderer-strategy.md`
- Pinned runtime (Docker): `docs/15-docker.md`
- Rule schema: `spec/schema/rule.schema.json`
- CLI implementation: `src/iftypeset/cli.py`
- Rendering + fonts contract: `src/iftypeset/rendering.py`
- QA thresholds: `spec/quality_gates.yaml`
- Profiles: `spec/profiles/*.yaml`
- Rules: `spec/rules/**.ndjson`

## Rule registry (what gets enforced)

Rules live as NDJSON records under `spec/rules/**.ndjson` and are schema-validated by `spec/schema/rule.schema.json`.

Key fields:

- `id`: stable identifier (`CMOS.*`, `BRING.*`, or `HOUSE.*`)
- `source_refs`: **pointers only** (e.g., `CMOS18 §13.93 p828 (scan p850)`), never quotes
- `category`: taxonomy bucket (links, citations, layout, etc.)
- `severity`: `must` / `should` / `warn`
- `applies_to`: `md` / `html` / `pdf` / `all`
- `enforcement`: `lint` / `typeset` / `postrender` / `manual`
- `autofix`: `none|rewrite|reflow|suggest` (+ `autofix_notes`)
- `tags`, `keywords`, `dependencies`, `exceptions`, `status`

Outcome:

- Automated rules generate deterministic diagnostics.
- Non-automatable rules (tagged `manual_checklist=true`) appear in the manual checklist artifacts.

## Profiles (typographic intent → render tokens)

Profiles are YAML under `spec/profiles/*.yaml`. They set:

- page size/orientation/margins
- font stacks + size/line-height
- hyphenation policy
- code/table/list overflow policies
- running head (header/footer) template
- locale defaults and severity overrides

Example:

- `spec/profiles/audit_report.yaml` is tuned for readable A4 review PDFs (12pt baseline, wide margins) and enables strict fonts (`fonts.require_primary: true`).

## QA gates (numeric thresholds)

QA produces incidents (e.g., overflow risk, stranded headings, wrap hazards) and enforces numeric thresholds from `spec/quality_gates.yaml`.

- `--strict` selects stricter thresholds for CI release gating.
- QA is deterministic for the same inputs, but it is heuristic by design.

## Determinism + render logs (PDF specifics)

- HTML+CSS are deterministic for the same input + profile.
- PDF output depends on renderer + fonts; `render-log.json` records enough context to audit variance:
  - engine name + version
  - warnings/errors
  - font policy
  - requested primary fonts and what fontconfig matched (`fc-match`)
  - embedded PDF fonts (`pdffonts`) when available

## Commands (flow, artifacts, exit codes)

The main CI-friendly entrypoint is:

```bash
iftypeset run --input --out out --profile
```

### `validate-spec`

Purpose: validate YAML/JSON spec and rule batches.

- Output: `out/spec-validation.json`
- Exit: `0` ok, `2` config/schema error

### `lint`

Purpose: parse Markdown and emit deterministic diagnostics + manual checklist.

Outputs (typical):

- `out/lint-report.json`
- `out/manual-checklist.md` and `out/manual-checklist.json`
- `out/degraded-mode-report.json` (only when degraded triggers)

Exit: `0` ok, `1` fail threshold, `2` config error

### `render-html`

Purpose: deterministic HTML+CSS for a profile.

- Outputs: `out/render.html`, `out/render.css`, `out/typeset-report.json`
- Exit: `0` ok, `1` degraded without `--degraded-ok`, `2` config error

### `render-pdf`

Purpose: render PDF from deterministic HTML (default engine: Playwright).

- Outputs: `out/render-log.json` plus `out/render.pdf` on success
- Exit: `0` ok, `2` config error, `3` renderer/tool error

### `qa`

Purpose: post-render QA + gate evaluation (HTML-first; uses PDF when present).

- Outputs: `out/layout-report.json`, `out/qa-report.json`
- Exit: `0` pass, `1` fail, `2` config error / missing HTML

### `report`

Purpose: coverage report + trust contract + HTML index.

- Outputs: `out/coverage-report.json`, `out/coverage-summary.md`, `out/trust-contract.md`, `out/report/index.html`
- Exit: `0` ok, `1` coverage floor violated, `2` config error

### `doctor`

Purpose: environment + determinism diagnostics (renderer, Poppler tools, fonts, locale).

- Outputs: `out/doctor.json`, `out/doctor.md`
- Exit: `0` ok, `2` config error

### `bundle`

Purpose: portable external review tarball + sha256 manifest.

- Outputs: `out/iftypeset-bundle.tar.gz`, `out/bundle-manifest.json`
- Exit: `0` ok

## Important flags (what they do)

Shared:

- `--spec`: spec root (default `spec`)
- `--config`: optional `iftypeset.yaml` path
- `--out`: output directory (default `out`)
- `--profile`: profile id
- `--strict`: strict QA/report thresholds
- `--degraded-ok`: don’t fail degraded mode (where supported)

Rendering:

- `--engine `: PDF engine preference (`auto` → `playwright`; Chromium CLI is banned)
- `--self-contained`: embed local images as data URIs (useful for review bundles)

Fonts (prevents “12pt but looks tiny” drift due to fallback x-heights):

- `--font-dir `: extra font directory (repeatable)
- `--strict-fonts`: fail if primary fonts aren’t available or the PDF embeds fallback fonts
- Profile knob: `fonts.require_primary: true`

Lint fixing:

- `--fix`, `--fix-mode `, `--lint-fixed`

Run control:

- `--skip-pdf`: don’t attempt PDF rendering
- `--require-pdf`: make PDF rendering failure fail the run

## Corporate font workflow (no fallbacks)

1) Put `.ttf`/`.otf` files in a directory you can mount (e.g., `./fonts/`).
2) Set the profile font family to the real font family name.
3) Render with strict enforcement:

```bash
iftypeset run --input --out out --profile audit_report --font-dir ./fonts --strict-fonts
```

If it fails, inspect `out/render-log.json` (font matches + embedded fonts).

## Quickstart

```bash
cd /root/ai-workspace/iftypeset
python3 -m venv .venv && . .venv/bin/activate
python -m pip install -r requirements.txt
python -m pip install -e .
iftypeset run --input fixtures/sample.md --out out --profile web_pdf --degraded-ok
iftypeset report --spec spec --out out
```
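
The per-command exit codes listed under Commands lend themselves to a small CI wrapper. The following is a minimal sketch under stated assumptions: `EXIT_MEANINGS`, `describe_exit`, and `run_report` are hypothetical helper names, the summary strings merely condense the meanings documented per subcommand, and only the `report` invocation shown in the Quickstart is used.

```python
import subprocess
import sys

# Condensed from the per-subcommand exit codes in this document.
# The wording of each summary is illustrative, not taken from the tool.
EXIT_MEANINGS = {
    0: "ok",
    1: "failure (threshold, gate, coverage floor, or degraded mode)",
    2: "config/schema error",
    3: "renderer/tool error",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into a short human-readable summary."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")


def run_report(spec: str = "spec", out: str = "out") -> int:
    """Run `iftypeset report` with the flags shown in the Quickstart.

    Requires `iftypeset` on PATH; returns the process exit code unchanged
    so a CI job can propagate it.
    """
    result = subprocess.run(["iftypeset", "report", "--spec", spec, "--out", out])
    print(f"report: {describe_exit(result.returncode)}", file=sys.stderr)
    return result.returncode
```

A CI job could call `sys.exit(run_report())` so that the documented code reaches the runner unchanged while the log still carries a readable summary.
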