docs: add zero-context explainer

codex 2026-01-08 20:49:59 +00:00
parent e92f1c3b93
commit f8808cecd1
2 changed files with 226 additions and 0 deletions


@@ -50,6 +50,7 @@ If you need to create a new restore point:
- Status and counts: `STATUS.md`
- Project status (narrative): `docs/09-project-status.md`
- Zero-context explainer: `docs/20-iftypeset-zero-context-explainer.md`
- Architecture: `app/ARCHITECTURE.md`
- CLI contract: `app/CLI_SPEC.md`
- External evaluation prompt: `docs/05-external-evaluation-prompt.md`


@@ -0,0 +1,225 @@
# IF.TYPESET (iftypeset) — zero-context explainer
This document is meant to let a fresh session understand `if.typeset` without any chat history.
## Public surfaces
Product landing page:
https://infrafabric.io/if/typeset/
Forgejo repo:
https://git.infrafabric.io/dannystocker/iftypeset
https://git.infrafabric.io/dannystocker/iftypeset.git
## What it is (black/white)
Verified:
- A spec-driven **Document CI** pipeline: Markdown → deterministic HTML/CSS → PDF render → post-render QA → reports + exit codes.
- A machine-readable **rule registry** (rules are paraphrased + pointer refs; no book text).
- Default PDF renderer is **Playwright**; Chromium CLI rendering is **banned**.
Not promised:
- Not a fact checker, citation verifier, or “trust” oracle.
- Not a geometry-perfect PDF validator (PDF QA is heuristic).
- PDF determinism is “given pinned engine + fonts + locale”, not “byte-identical everywhere”.
## Mental model (spec → pipeline)
1) **Spec** defines rules, profiles, QA thresholds, and coverage metadata.
2) **Profiles** map typographic intent into deterministic render tokens (A4, margins, font stacks, hyphenation, etc.).
3) **Lint** checks Markdown text/structure and emits deterministic diagnostics.
4) **Render HTML** produces deterministic HTML+CSS.
5) **Render PDF** produces a PDF (engine/fonts dependent); renderer metadata is logged.
6) **QA** scans HTML (and PDF when present) for layout risks and enforces numeric gates.
7) **Report** emits coverage summaries + an HTML index for review.
## Where truth lives in the repo
- CLI contract: `app/CLI_SPEC.md`
- Rendering pipeline overview: `docs/17-rendering-pipeline.md`
- Renderer strategy / determinism knobs: `docs/04-renderer-strategy.md`
- Pinned runtime (Docker): `docs/15-docker.md`
- Rule schema: `spec/schema/rule.schema.json`
- CLI implementation: `src/iftypeset/cli.py`
- Rendering + fonts contract: `src/iftypeset/rendering.py`
- QA thresholds: `spec/quality_gates.yaml`
- Profiles: `spec/profiles/*.yaml`
- Rules: `spec/rules/**.ndjson`
## Rule registry (what gets enforced)
Rules live as NDJSON records under `spec/rules/**.ndjson` and are schema-validated by `spec/schema/rule.schema.json`.
Key fields:
- `id`: stable identifier (`CMOS.*`, `BRING.*`, or `HOUSE.*`)
- `source_refs`: **pointers only** (e.g., `CMOS18 §13.93 p828 (scan p850)`), never quotes
- `category`: taxonomy bucket (links, citations, layout, etc.)
- `severity`: `must` / `should` / `warn`
- `applies_to`: `md` / `html` / `pdf` / `all`
- `enforcement`: `lint` / `typeset` / `postrender` / `manual`
- `autofix`: `none|rewrite|reflow|suggest` (+ `autofix_notes`)
- `tags`, `keywords`, `dependencies`, `exceptions`, `status`
Outcome:
- Automated rules generate deterministic diagnostics.
- Non-automatable rules (tagged `manual_checklist=true`) appear in the manual checklist artifacts.
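To make the field list concrete, here is a minimal sketch of one rule record and a light structural check. The record values (the id `HOUSE.EXAMPLE.001`, the `source_refs` pointer, and the assumption that `applies_to` and `autofix` are single strings) are invented for illustration. The check is only a stand-in for the real validation against `spec/schema/rule.schema.json`, which may require more fields or different shapes.

```python
import json

# Allowed values copied from the field list above; the real schema may be stricter.
ALLOWED = {
    "severity": {"must", "should", "warn"},
    "applies_to": {"md", "html", "pdf", "all"},
    "enforcement": {"lint", "typeset", "postrender", "manual"},
    "autofix": {"none", "rewrite", "reflow", "suggest"},
}
ID_PREFIXES = ("CMOS.", "BRING.", "HOUSE.")


def check_rule(record: dict) -> list[str]:
    """Return a list of problems found in one rule record (empty means ok)."""
    problems = []
    if not str(record.get("id", "")).startswith(ID_PREFIXES):
        problems.append("id must start with CMOS., BRING., or HOUSE.")
    for field, allowed in ALLOWED.items():
        # Assumes each of these fields holds a single string value.
        if record.get(field) not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    return problems


# Hypothetical NDJSON line: one JSON object per line, pointer-only source_refs.
line = (
    '{"id": "HOUSE.EXAMPLE.001", "source_refs": ["HOUSE §1"], '
    '"category": "links", "severity": "should", "applies_to": "md", '
    '"enforcement": "lint", "autofix": "suggest"}'
)
record = json.loads(line)
print(check_rule(record))  # []
```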
## Profiles (typographic intent → render tokens)
Profiles are YAML under `spec/profiles/*.yaml`. They set:
- page size/orientation/margins
- font stacks + size/line-height
- hyphenation policy
- code/table/list overflow policies
- running head (header/footer) template
- locale defaults and severity overrides
Example:
- `spec/profiles/audit_report.yaml` is tuned for readable A4 review PDFs (12pt baseline, wide margins) and enables strict fonts (`fonts.require_primary: true`).
## QA gates (numeric thresholds)
QA produces incidents (e.g., overflow risk, stranded headings, wrap hazards) and enforces numeric thresholds from `spec/quality_gates.yaml`.
- `--strict` selects stricter thresholds for CI release gating.
- QA is deterministic for the same inputs, but it is heuristic by design.
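The gate logic can be pictured as a comparison of incident counts against per-gate limits. The sketch below is an assumption about the general shape only: the gate names and numbers are hypothetical, and the real keys, values, and strict variants live in `spec/quality_gates.yaml`.

```python
def evaluate_gates(incident_counts: dict[str, int], max_allowed: dict[str, int]) -> list[str]:
    """Return the names of gates whose incident count exceeds its limit."""
    return sorted(
        name
        for name, limit in max_allowed.items()
        if incident_counts.get(name, 0) > limit
    )


# Hypothetical thresholds for illustration; not the values shipped in the spec.
default_gates = {"overflow_risk": 2, "stranded_headings": 1}
strict_gates = {"overflow_risk": 0, "stranded_headings": 0}

counts = {"overflow_risk": 1, "stranded_headings": 0}
failed_default = evaluate_gates(counts, default_gates)
failed_strict = evaluate_gates(counts, strict_gates)
print(failed_default)  # []
print(failed_strict)   # ['overflow_risk']

# Mirrors the documented `qa` exit codes: 0 pass, 1 fail.
exit_code = 1 if failed_strict else 0
```

The same document can therefore pass a default run and fail a `--strict` run without any change to the incidents themselves.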
## Determinism + render logs (PDF specifics)
- HTML+CSS are deterministic for the same input + profile.
- PDF output depends on renderer + fonts; `render-log.json` records enough context to audit variance:
  - engine name + version
  - warnings/errors
  - font policy
  - requested primary fonts and what fontconfig matched (`fc-match`)
  - embedded PDF fonts (`pdffonts`) when available
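One way to use the render log for auditing variance is to compare the logs from two environments and list what differs. The sketch below is generic: the key names and values in the sample dictionaries are hypothetical, since the actual layout of `render-log.json` is not described here.

```python
def diff_logs(a: dict, b: dict) -> dict:
    """Map each top-level key whose value differs to its (a, b) value pair."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical log contents; in practice these would come from
# json.load() on two render-log.json files.
local_log = {"engine": "playwright", "engine_version": "1.0.0", "embedded_fonts": ["ExampleSans"]}
ci_log = {"engine": "playwright", "engine_version": "1.0.0", "embedded_fonts": ["FallbackSans"]}

differences = diff_logs(local_log, ci_log)
print(differences)  # {'embedded_fonts': (['ExampleSans'], ['FallbackSans'])}
```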
## Commands (flow, artifacts, exit codes)
The main CI-friendly entrypoint is:
```bash
iftypeset run --input <path.md> --out out --profile <profile_id>
```
### `validate-spec`
Purpose: validate YAML/JSON spec and rule batches.
- Output: `out/spec-validation.json`
- Exit: `0` ok, `2` config/schema error
### `lint`
Purpose: parse Markdown and emit deterministic diagnostics + manual checklist.
Outputs (typical):
- `out/lint-report.json`
- `out/manual-checklist.md` and `out/manual-checklist.json`
- `out/degraded-mode-report.json` (only when degraded triggers)
Exit: `0` ok, `1` fail threshold, `2` config error
### `render-html`
Purpose: deterministic HTML+CSS for a profile.
- Outputs: `out/render.html`, `out/render.css`, `out/typeset-report.json`
- Exit: `0` ok, `1` degraded without `--degraded-ok`, `2` config error
### `render-pdf`
Purpose: render PDF from deterministic HTML (default engine: Playwright).
- Outputs: `out/render-log.json` plus `out/render.pdf` on success
- Exit: `0` ok, `2` config error, `3` renderer/tool error
### `qa`
Purpose: post-render QA + gate evaluation (HTML-first; uses PDF when present).
- Outputs: `out/layout-report.json`, `out/qa-report.json`
- Exit: `0` pass, `1` fail, `2` config error / missing HTML
### `report`
Purpose: coverage report + trust contract + HTML index.
- Outputs: `out/coverage-report.json`, `out/coverage-summary.md`, `out/trust-contract.md`, `out/report/index.html`
- Exit: `0` ok, `1` coverage floor violated, `2` config error
### `doctor`
Purpose: environment + determinism diagnostics (renderer, Poppler tools, fonts, locale).
- Outputs: `out/doctor.json`, `out/doctor.md`
- Exit: `0` ok, `2` config error
### `bundle`
Purpose: portable external review tarball + sha256 manifest.
- Outputs: `out/iftypeset-bundle.tar.gz`, `out/bundle-manifest.json`
- Exit: `0` ok
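Taken together, the per-command exit codes above can be collected into one lookup table, which is handy when a CI wrapper needs to turn a status code into a readable message. This is a sketch built only from the codes listed in this document; the helper name `explain_exit` is hypothetical and not part of the CLI.

```python
# Exit codes as documented per command in this explainer.
EXIT_CODES = {
    "validate-spec": {0: "ok", 2: "config/schema error"},
    "lint": {0: "ok", 1: "fail threshold", 2: "config error"},
    "render-html": {0: "ok", 1: "degraded without --degraded-ok", 2: "config error"},
    "render-pdf": {0: "ok", 2: "config error", 3: "renderer/tool error"},
    "qa": {0: "pass", 1: "fail", 2: "config error / missing HTML"},
    "report": {0: "ok", 1: "coverage floor violated", 2: "config error"},
    "doctor": {0: "ok", 2: "config error"},
    "bundle": {0: "ok"},
}


def explain_exit(command: str, code: int) -> str:
    """Translate a documented exit code into its meaning."""
    return EXIT_CODES.get(command, {}).get(code, "undocumented exit code")


print(explain_exit("render-pdf", 3))  # renderer/tool error
print(explain_exit("qa", 1))          # fail
```

Across commands the pattern is consistent: `1` signals a gate or threshold failure, `2` a configuration problem, and `3` is reserved for renderer or tool errors.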
## Important flags (what they do)
Shared:
- `--spec`: spec root (default `spec`)
- `--config`: optional `iftypeset.yaml` path
- `--out`: output directory (default `out`)
- `--profile`: profile id
- `--strict`: strict QA/report thresholds
- `--degraded-ok`: don't fail on degraded mode (where supported)
Rendering:
- `--engine <auto|playwright|wkhtmltopdf|weasyprint>`: PDF engine preference (`auto` → `playwright`; Chromium CLI is banned)
- `--self-contained`: embed local images as data URIs (useful for review bundles)
Fonts (prevents “12pt but looks tiny” drift due to fallback x-heights):
- `--font-dir <dir>`: extra font directory (repeatable)
- `--strict-fonts`: fail if primary fonts aren't available or the PDF embeds fallback fonts
- Profile knob: `fonts.require_primary: true`
Lint fixing:
- `--fix`, `--fix-mode <suggest|rewrite>`, `--lint-fixed`
Run control:
- `--skip-pdf`: don't attempt PDF rendering
- `--require-pdf`: make PDF rendering failure fail the run
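When a script drives the CLI, the flags above compose into a plain argument list. The sketch below assembles one for `iftypeset run` using only flags named in this document. The helper `build_run_argv` is hypothetical, and it assumes `run` accepts every flag shown, which should be checked against `app/CLI_SPEC.md`.

```python
import shlex


def build_run_argv(
    input_path: str,
    profile: str,
    out: str = "out",
    font_dirs: tuple[str, ...] = (),
    strict: bool = False,
    strict_fonts: bool = False,
    degraded_ok: bool = False,
    require_pdf: bool = False,
) -> list[str]:
    """Build an argv list for `iftypeset run` from documented flags."""
    argv = ["iftypeset", "run", "--input", input_path, "--out", out, "--profile", profile]
    for font_dir in font_dirs:
        argv += ["--font-dir", font_dir]  # documented as repeatable
    if strict:
        argv.append("--strict")
    if strict_fonts:
        argv.append("--strict-fonts")
    if degraded_ok:
        argv.append("--degraded-ok")
    if require_pdf:
        argv.append("--require-pdf")
    return argv


argv = build_run_argv("doc.md", "audit_report", font_dirs=("./fonts",), strict_fonts=True)
print(shlex.join(argv))
# iftypeset run --input doc.md --out out --profile audit_report --font-dir ./fonts --strict-fonts
```

The resulting list can be passed directly to `subprocess.run(argv)`, which avoids shell quoting issues with paths that contain spaces.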
## Corporate font workflow (no fallbacks)
1) Put `.ttf`/`.otf` files in a directory you can mount (e.g., `./fonts/`).
2) Set the profile font family to the real font family name.
3) Render with strict enforcement:
```bash
iftypeset run --input <doc.md> --out out --profile audit_report --font-dir ./fonts --strict-fonts
```
If it fails, inspect `out/render-log.json` (font matches + embedded fonts).
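Before a strict run, it can save a round trip to confirm that the font directory actually contains font files. The pre-flight check below is a sketch: it looks only at the top level of the directory, and whether `iftypeset` itself searches subdirectories is not stated here. The file names in the demo are placeholders.

```python
import tempfile
from pathlib import Path

FONT_SUFFIXES = {".ttf", ".otf"}


def list_font_files(font_dir: str) -> list[str]:
    """Return sorted names of .ttf/.otf files directly inside font_dir."""
    root = Path(font_dir)
    if not root.is_dir():
        return []
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in FONT_SUFFIXES
    )


# Demo against a throwaway directory with one font file and one non-font file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Example-Regular.ttf").touch()
    (Path(tmp) / "README.txt").touch()
    found = list_font_files(tmp)

print(found)  # ['Example-Regular.ttf']
```

An empty result means `--strict-fonts` is likely to fail, since there is nothing in the directory for the renderer to match.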
## Quickstart
```bash
cd /root/ai-workspace/iftypeset
python3 -m venv .venv && . .venv/bin/activate
python -m pip install -r requirements.txt
python -m pip install -e .
iftypeset run --input fixtures/sample.md --out out --profile web_pdf --degraded-ok
iftypeset report --spec spec --out out
```