# IF.TYPESET (iftypeset) — zero-context explainer

This document lets a fresh session understand `if.typeset` without any chat history.

## Public surfaces

- Product landing page: https://infrafabric.io/if/typeset/
- Forgejo repo: https://git.infrafabric.io/dannystocker/iftypeset
- Git clone URL: https://git.infrafabric.io/dannystocker/iftypeset.git

## What it is (black/white)

Verified:

- A spec-driven **Document CI** pipeline: Markdown → deterministic HTML/CSS → PDF render → post-render QA → reports + exit codes.
- A machine-readable **rule registry** (rules are paraphrased + pointer refs; no book text).
- Default PDF renderer is **Playwright**; Chromium CLI rendering is **banned**.

Not promised:

- Not a fact checker, citation verifier, or “trust” oracle.
- Not a geometry-perfect PDF validator (PDF QA is heuristic).
- Not byte-identical PDFs everywhere: PDF determinism holds only given a pinned engine, fonts, and locale.

## Mental model (spec → pipeline)

1) **Spec** defines rules, profiles, QA thresholds, and coverage metadata.
2) **Profiles** map typographic intent into deterministic render tokens (A4, margins, font stacks, hyphenation, etc.).
3) **Lint** checks Markdown text/structure and emits deterministic diagnostics.
4) **Render HTML** produces deterministic HTML+CSS.
5) **Render PDF** produces a PDF (engine/fonts dependent); renderer metadata is logged.
6) **QA** scans HTML (and PDF when present) for layout risks and enforces numeric gates.
7) **Report** emits coverage summaries + an HTML index for review.

## Where truth lives in the repo

- CLI contract: `app/CLI_SPEC.md`
- Rendering pipeline overview: `docs/17-rendering-pipeline.md`
- Renderer strategy / determinism knobs: `docs/04-renderer-strategy.md`
- Pinned runtime (Docker): `docs/15-docker.md`
- Rule schema: `spec/schema/rule.schema.json`
- CLI implementation: `src/iftypeset/cli.py`
- Rendering + fonts contract: `src/iftypeset/rendering.py`
- QA thresholds: `spec/quality_gates.yaml`
- Profiles: `spec/profiles/*.yaml`
- Rules: `spec/rules/**.ndjson`

## Rule registry (what gets enforced)

Rules live as NDJSON records under `spec/rules/**.ndjson` and are schema-validated by `spec/schema/rule.schema.json`.

Key fields:

- `id`: stable identifier (`CMOS.*`, `BRING.*`, or `HOUSE.*`)
- `source_refs`: **pointers only** (e.g., `CMOS18 §13.93 p828 (scan p850)`), never quotes
- `category`: taxonomy bucket (links, citations, layout, etc.)
- `severity`: `must` / `should` / `warn`
- `applies_to`: `md` / `html` / `pdf` / `all`
- `enforcement`: `lint` / `typeset` / `postrender` / `manual`
- `autofix`: `none` / `rewrite` / `reflow` / `suggest` (+ `autofix_notes`)
- `tags`, `keywords`, `dependencies`, `exceptions`, `status`

Outcome:

- Automated rules generate deterministic diagnostics.
- Non-automatable rules (tagged `manual_checklist=true`) appear in the manual checklist artifacts.
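
A minimal sketch of checking one NDJSON rule record against the enumerated fields listed above. The sample record and the `check_rule` helper are hypothetical; the authoritative validation is `spec/schema/rule.schema.json`, which may impose more constraints than shown here.

```python
import json

# Allowed values as listed under "Key fields"; the real schema may be stricter.
ALLOWED = {
    "severity": {"must", "should", "warn"},
    "applies_to": {"md", "html", "pdf", "all"},
    "enforcement": {"lint", "typeset", "postrender", "manual"},
    "autofix": {"none", "rewrite", "reflow", "suggest"},
}
ID_PREFIXES = ("CMOS.", "BRING.", "HOUSE.")


def check_rule(line: str) -> list[str]:
    """Return a list of problems found in one NDJSON rule record."""
    rule = json.loads(line)
    problems = []
    if not str(rule.get("id", "")).startswith(ID_PREFIXES):
        problems.append("id: unexpected prefix")
    for field, allowed in ALLOWED.items():
        value = rule.get(field)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field}: {value!r} not allowed")
    return problems


# Hypothetical record, for illustration only (not taken from the registry).
sample = json.dumps({
    "id": "HOUSE.EXAMPLE.001",
    "source_refs": ["HOUSE §1 (illustrative pointer)"],
    "category": "links",
    "severity": "should",
    "applies_to": "md",
    "enforcement": "lint",
    "autofix": "suggest",
})
print(check_rule(sample))  # prints []
```

Each line of a `.ndjson` file would be passed to `check_rule` separately, since NDJSON stores one JSON object per line.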

## Profiles (typographic intent → render tokens)

Profiles are YAML under `spec/profiles/*.yaml`. They set:

- page size/orientation/margins
- font stacks + size/line-height
- hyphenation policy
- code/table/list overflow policies
- running head (header/footer) template
- locale defaults and severity overrides

Example:

- `spec/profiles/audit_report.yaml` is tuned for readable A4 review PDFs (12pt baseline, wide margins) and enables strict fonts (`fonts.require_primary: true`).

## QA gates (numeric thresholds)

QA produces incidents (e.g., overflow risk, stranded headings, wrap hazards) and enforces numeric thresholds from `spec/quality_gates.yaml`.

- `--strict` selects stricter thresholds for CI release gating.
- QA is deterministic for the same inputs, but it is heuristic by design.
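
A rough sketch of how numeric gating could work: count incidents per kind and compare against per-kind limits. The threshold names, values, and the `evaluate_gates` helper are assumptions made for illustration; the real keys and numbers are defined in `spec/quality_gates.yaml`.

```python
from collections import Counter

# Hypothetical thresholds: maximum allowed incidents per kind.
# The real keys and values live in spec/quality_gates.yaml.
GATES = {
    "default": {"overflow_risk": 2, "stranded_heading": 1, "wrap_hazard": 5},
    "strict": {"overflow_risk": 0, "stranded_heading": 0, "wrap_hazard": 1},
}


def evaluate_gates(incidents: list[str], strict: bool = False) -> tuple[int, dict[str, int]]:
    """Return (exit_code, violations); violations maps kind -> count above the limit."""
    limits = GATES["strict" if strict else "default"]
    counts = Counter(incidents)
    violations = {
        kind: counts[kind] - limit
        for kind, limit in limits.items()
        if counts[kind] > limit
    }
    # Exit codes follow the documented `qa` convention: 0 pass, 1 fail.
    return (1 if violations else 0), violations


incidents = ["overflow_risk", "wrap_hazard", "wrap_hazard"]
print(evaluate_gates(incidents))               # (0, {})
print(evaluate_gates(incidents, strict=True))  # (1, {'overflow_risk': 1, 'wrap_hazard': 1})
```

The same incident list passes the default gate and fails the strict one, which is the behavior `--strict` is described as selecting.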

## Determinism + render logs (PDF specifics)

- HTML+CSS are deterministic for the same input + profile.
- PDF output depends on renderer + fonts; `render-log.json` records enough context to audit variance:
  - engine name + version
  - warnings/errors
  - font policy
  - requested primary fonts and what fontconfig matched (`fc-match`)
  - embedded PDF fonts (`pdffonts`) when available
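
One way to audit variance between two environments is to diff their render logs. The sketch below compares top-level keys only and does not depend on the log's field names; the sample dictionaries use hypothetical keys, since the actual layout of `render-log.json` is not specified here.

```python
import json
from pathlib import Path


def load_log(path: str) -> dict:
    """Load a render-log.json file as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def diff_logs(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every top-level key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical log contents; the real field names may differ.
local = {"engine": "playwright", "engine_version": "1.0", "fonts_matched": ["Example Serif"]}
ci = {"engine": "playwright", "engine_version": "1.1", "fonts_matched": ["Example Serif"]}
print(diff_logs(local, ci))  # {'engine_version': ('1.0', '1.1')}
```

In practice the two inputs would come from `load_log("out/render-log.json")` on each machine.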

## Commands (flow, artifacts, exit codes)

The main CI-friendly entrypoint is:

```bash
iftypeset run --input <path.md> --out out --profile <profile_id>
```

### `validate-spec`

Purpose: validate YAML/JSON spec and rule batches.

- Output: `out/spec-validation.json`
- Exit: `0` ok, `2` config/schema error

### `lint`

Purpose: parse Markdown and emit deterministic diagnostics + manual checklist.

Outputs (typical):

- `out/lint-report.json`
- `out/manual-checklist.md` and `out/manual-checklist.json`
- `out/degraded-mode-report.json` (only when degraded triggers)

Exit: `0` ok, `1` fail threshold, `2` config error

### `render-html`

Purpose: deterministic HTML+CSS for a profile.

- Outputs: `out/render.html`, `out/render.css`, `out/typeset-report.json`
- Exit: `0` ok, `1` degraded without `--degraded-ok`, `2` config error

### `render-pdf`

Purpose: render PDF from deterministic HTML (default engine: Playwright).

- Outputs: `out/render-log.json` plus `out/render.pdf` on success
- Exit: `0` ok, `2` config error, `3` renderer/tool error

### `qa`

Purpose: post-render QA + gate evaluation (HTML-first; uses PDF when present).

- Outputs: `out/layout-report.json`, `out/qa-report.json`
- Exit: `0` pass, `1` fail, `2` config error / missing HTML

### `report`

Purpose: coverage report + trust contract + HTML index.

- Outputs: `out/coverage-report.json`, `out/coverage-summary.md`, `out/trust-contract.md`, `out/report/index.html`
- Exit: `0` ok, `1` coverage floor violated, `2` config error

### `doctor`

Purpose: environment + determinism diagnostics (renderer, Poppler tools, fonts, locale).

- Outputs: `out/doctor.json`, `out/doctor.md`
- Exit: `0` ok, `2` config error

### `bundle`

Purpose: portable external review tarball + sha256 manifest.

- Outputs: `out/iftypeset-bundle.tar.gz`, `out/bundle-manifest.json`
- Exit: `0` ok
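
The exit codes documented for the subcommands above can be collected into one lookup table, which is handy when a CI wrapper needs to turn a return code into a message. The table is transcribed from this document; the `explain_exit` helper is hypothetical.

```python
# Exit codes per subcommand, transcribed from the sections above.
EXIT_CODES = {
    "validate-spec": {0: "ok", 2: "config/schema error"},
    "lint": {0: "ok", 1: "fail threshold", 2: "config error"},
    "render-html": {0: "ok", 1: "degraded without --degraded-ok", 2: "config error"},
    "render-pdf": {0: "ok", 2: "config error", 3: "renderer/tool error"},
    "qa": {0: "pass", 1: "fail", 2: "config error / missing HTML"},
    "report": {0: "ok", 1: "coverage floor violated", 2: "config error"},
    "doctor": {0: "ok", 2: "config error"},
    "bundle": {0: "ok"},
}


def explain_exit(command: str, code: int) -> str:
    """Describe an exit code; codes not listed above are reported as undocumented."""
    meaning = EXIT_CODES.get(command, {}).get(code, "undocumented")
    return f"{command} exited {code}: {meaning}"


print(explain_exit("render-pdf", 3))  # render-pdf exited 3: renderer/tool error
```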

## Important flags (what they do)

Shared:

- `--spec`: spec root (default `spec`)
- `--config`: optional `iftypeset.yaml` path
- `--out`: output directory (default `out`)
- `--profile`: profile id
- `--strict`: strict QA/report thresholds
- `--degraded-ok`: don’t fail when degraded mode triggers (where supported)

Rendering:

- `--engine <auto|playwright|wkhtmltopdf|weasyprint>`: PDF engine preference (`auto` → `playwright`; Chromium CLI is banned)
- `--self-contained`: embed local images as data URIs (useful for review bundles)

Fonts (these prevent “12pt but looks tiny” drift caused by fallback fonts with different x-heights):

- `--font-dir <dir>`: extra font directory (repeatable)
- `--strict-fonts`: fail if primary fonts aren’t available or the PDF embeds fallback fonts
- Profile knob: `fonts.require_primary: true`

Lint fixing:

- `--fix`, `--fix-mode <suggest|rewrite>`, `--lint-fixed`

Run control:

- `--skip-pdf`: don’t attempt PDF rendering
- `--require-pdf`: make PDF rendering failure fail the run
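
As an illustration of how these flags combine, here is a small sketch that assembles an `iftypeset run` command line from Python. The `build_run_argv` helper is hypothetical, and it assumes `run` accepts `--engine` alongside the flags shown in this document's own examples.

```python
import shlex
from typing import Optional, Sequence


def build_run_argv(
    input_md: str,
    profile: str,
    out_dir: str = "out",
    font_dirs: Sequence[str] = (),
    strict_fonts: bool = False,
    engine: Optional[str] = None,
    require_pdf: bool = False,
) -> list[str]:
    """Assemble an argv list for `iftypeset run` using flags documented above."""
    argv = ["iftypeset", "run", "--input", input_md, "--out", out_dir, "--profile", profile]
    if engine is not None:
        argv += ["--engine", engine]  # assumption: `run` forwards --engine
    for font_dir in font_dirs:
        argv += ["--font-dir", font_dir]  # --font-dir is repeatable
    if strict_fonts:
        argv.append("--strict-fonts")
    if require_pdf:
        argv.append("--require-pdf")
    return argv


argv = build_run_argv("doc.md", "audit_report", font_dirs=["./fonts"], strict_fonts=True)
print(shlex.join(argv))
# iftypeset run --input doc.md --out out --profile audit_report --font-dir ./fonts --strict-fonts
```

The resulting list can be passed to `subprocess.run(argv)`; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.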

## Corporate font workflow (no fallbacks)

1) Put `.ttf`/`.otf` files in a directory you can mount (e.g., `./fonts/`).
2) Set the profile font family to the real font family name.
3) Render with strict enforcement:

```bash
iftypeset run --input <doc.md> --out out --profile audit_report --font-dir ./fonts --strict-fonts
```

If it fails, inspect `out/render-log.json` (font matches + embedded fonts).
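
Before a strict run, it can help to confirm the font directory actually contains font files. The following is a hedged pre-flight sketch; `list_font_files` is a hypothetical helper and is not part of `iftypeset`. It checks only file extensions, not whether the family name matches the profile.

```python
import tempfile
from pathlib import Path

FONT_SUFFIXES = {".ttf", ".otf"}


def list_font_files(font_dir: str) -> list[str]:
    """Return sorted names of .ttf/.otf files directly inside font_dir."""
    root = Path(font_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in FONT_SUFFIXES
    )


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Example-Regular.ttf", "Example-Bold.OTF", "README.txt"):
        (Path(tmp) / name).write_bytes(b"")
    found = list_font_files(tmp)
    print(found)  # ['Example-Bold.OTF', 'Example-Regular.ttf']
```

An empty result means `--strict-fonts` has nothing to match in that directory.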

## Quickstart

```bash
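# Path from the original environment; change to your own checkout.
cd /root/ai-workspace/iftypeset
# Create and activate a virtualenv, then install dependencies and the package.
python3 -m venv .venv && . .venv/bin/activate
python -m pip install -r requirements.txt
python -m pip install -e .

# Run the pipeline on the sample fixture (tolerating degraded mode), then build the report.
iftypeset run --input fixtures/sample.md --out out --profile web_pdf --degraded-ok
iftypeset report --spec spec --out out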
```