iftypeset/docs/11-project-summary.md
codex e92f1c3b93
iftypeset: document CI pipeline + Playwright + font contract
2026-01-08 18:10:41 +00:00

iftypeset — Project Summary (downloadable)

What this is

iftypeset is a deterministic document-quality runtime for turning Markdown into:

  • render.html + render.css (stable, shareable)
  • render.pdf (stable given pinned engine + fonts)
  • quality artifacts (lint-report.json, layout-report.json, qa-report.json, coverage-report.json)

The differentiator is measurement + enforceability:

  • A machine-readable rule registry (spec/rules/**.ndjson) derived from Chicago/Bringhurst as paraphrases only, with pointer refs (no book text).
  • Profiles (spec/profiles/*.yaml) that map typographic intent into render tokens.
  • QA gates (spec/quality_gates.yaml) that fail builds when layout degrades.
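
As a rough illustration of what a QA gate amounts to, the sketch below compares measured incident counts against per-metric maximums and lists the gates that fail. The metric names (`widows`, `orphans`, `overfull_tokens`) and the flat "maximum allowed" shape are assumptions made for this example; the actual schema of spec/quality_gates.yaml is not shown in this summary.

```python
from __future__ import annotations

# Hypothetical gate evaluation. Metric names and the threshold shape are
# illustrative assumptions, not the real spec/quality_gates.yaml schema.


def failed_gates(metrics: dict[str, int], max_allowed: dict[str, int]) -> list[str]:
    """Return the sorted names of metrics whose measured value exceeds its limit."""
    failures = []
    for name, limit in max_allowed.items():
        # A metric missing from the report is treated as zero incidents.
        if metrics.get(name, 0) > limit:
            failures.append(name)
    return sorted(failures)


gates = {"widows": 0, "orphans": 0, "overfull_tokens": 2}
measured = {"widows": 1, "orphans": 0, "overfull_tokens": 2}

# A non-empty result is what a CI wrapper would turn into a non-zero exit code.
print(failed_gates(measured, gates))
```

Treating a missing metric as zero is a design choice for the sketch; a stricter gate could instead fail when an expected metric is absent from the report.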

Positioning (honest): this is CI for document quality, not “we perfectly implement Chicago/Bringhurst.”

What works today (verified)

Verified locally via ./scripts/ci.sh (spec validate + report + tests):

  • CLI pipeline: validate-spec, report, lint, render-html, render-pdf, qa, emit-css
  • Deterministic lint (with safe rewrite mode) + manual checklist emission
  • Deterministic HTML output; PDF rendering works when an engine is available (Playwright is the default)
  • QA analyzer:
    • HTML heuristics: bare URL/DOI/email wrap, overfull tokens, table/code overflow (profile-aware)
    • PDF heuristics: widows/orphans via Poppler text extraction (pdftotext -layout)
  • Session resilience:
    • ./scripts/audit.sh (truth snapshot)
    • ./scripts/state.sh (writes docs/SESSION_STATE.md)
    • ./scripts/checkpoint.sh "note" (portable restore tarball + entry in docs/CHECKPOINTS.md)
  • “Trust contract” artifact emitted by report:
    • out/trust-contract.md and out/trust-contract.json
    • makes explicit which rules are enforced automatically, which are manual, and where QA has known limitations
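
To make the widow/orphan heuristic more concrete, here is a deliberately naive sketch that works on text in the shape `pdftotext -layout` produces, where pages are separated by form-feed characters. It is not the project's analyzer: the function names, the blank-line paragraph assumption, and the "no terminal punctuation means the paragraph continues" rule are all assumptions for illustration.

```python
from __future__ import annotations

# Characters that suggest a paragraph has ended (assumption for this sketch).
TERMINAL = (".", "!", "?", ":", '"', "\u201d")


def split_blocks(page: str) -> list[list[str]]:
    """Group a page's non-empty lines into blocks separated by blank lines."""
    blocks: list[list[str]] = []
    current: list[str] = []
    for line in page.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def find_widows_orphans(text: str) -> list[tuple[int, str]]:
    """Return (page_number, kind) pairs; page numbers are 1-indexed."""
    pages = [split_blocks(p) for p in text.split("\f")]
    findings = []
    for i, blocks in enumerate(pages):
        if not blocks:
            continue  # blank page or the empty chunk after a trailing form feed
        prev_page = pages[i - 1] if i > 0 else []
        next_page = pages[i + 1] if i + 1 < len(pages) else []
        # Orphan: a single line at the bottom of the page that visibly continues.
        last = blocks[-1]
        if next_page and len(last) == 1 and not last[0].endswith(TERMINAL):
            findings.append((i + 1, "orphan"))
        # Widow: a single line at the top that finishes the previous page's paragraph.
        first = blocks[0]
        if prev_page and len(first) == 1 and not prev_page[-1][-1].endswith(TERMINAL):
            findings.append((i + 1, "widow"))
    return findings
```

Running headers, footers, and page numbers would confuse this version, and a heading left alone at the bottom of a page is reported as an orphan; a real detector needs to filter those cases.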

Rule registry snapshot (real counts)

From out/coverage-report.json (generated by report):

  • Total rules: 524
  • Enforcement split: manual 379, typeset 62, lint 70, postrender 13
  • Editorial category is no longer empty: 45 house-pointer editorial rules (manual-first)
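
A quick sanity check on these figures, using only the counts listed above: the four enforcement buckets should add up to the total, and the shares show how much of the registry is manual. The variable and function names are made up for this sketch, and it does not read out/coverage-report.json, whose structure is not described here.

```python
from __future__ import annotations

# Counts copied from the snapshot above.
TOTAL_RULES = 524
SPLIT = {"manual": 379, "typeset": 62, "lint": 70, "postrender": 13}


def shares(split: dict[str, int]) -> dict[str, float]:
    """Return each enforcement kind's share of all rules, in percent (one decimal)."""
    total = sum(split.values())
    return {kind: round(100 * count / total, 1) for kind, count in split.items()}


# The four buckets should account for every rule exactly once.
assert sum(SPLIT.values()) == TOTAL_RULES

# Rules with some form of automated enforcement: typeset + lint + postrender.
automated = TOTAL_RULES - SPLIT["manual"]
print(shares(SPLIT), automated)
```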

Interpretation:

  • The registry is intentionally broader than enforcement.
  • Manual rules are not a weakness if the system is explicit and generates checklists deterministically.

How to run (quick start)

From repo root:

./scripts/ci.sh

Minimal manual run:

PYTHONPATH=src python3 -m iftypeset.cli validate-spec --spec spec --build-indexes
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-html --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-pdf  --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli qa --out out --profile web_pdf
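
The same sequence can be driven from Python when a script needs to stop at the first failing step. This is a minimal sketch, not part of the repo: `build_commands` and `run_all` are hypothetical helper names, while the subcommands and flags are copied from the commands above.

```python
from __future__ import annotations

import os
import subprocess


def build_commands(input_md: str, out_dir: str, profile: str) -> list[list[str]]:
    """Build the argv lists for the manual run, in the order shown above."""
    base = ["python3", "-m", "iftypeset.cli"]
    io_args = ["--input", input_md, "--out", out_dir, "--profile", profile]
    return [
        base + ["validate-spec", "--spec", "spec", "--build-indexes"],
        base + ["lint"] + io_args,
        base + ["render-html"] + io_args,
        base + ["render-pdf"] + io_args,
        base + ["qa", "--out", out_dir, "--profile", profile],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each step from the repo root; stop at the first non-zero exit."""
    env = dict(os.environ, PYTHONPATH="src")
    for cmd in commands:
        # check=True raises subprocess.CalledProcessError on failure.
        subprocess.run(cmd, check=True, env=env)
```

From the repo root, `run_all(build_commands("fixtures/sample.md", "out", "web_pdf"))` would mirror the manual run; whether each step returns a non-zero exit code on findings is something to confirm against app/CLI_SPEC.md.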

“Don't lose work” (session resets)

Chat logs are not durable. The repo is.

  • Resume: ./scripts/resume.sh
  • Snapshot: ./scripts/audit.sh
  • Pasteable state: ./scripts/state.sh (updates docs/SESSION_STATE.md)
  • Restore point: ./scripts/checkpoint.sh "short note"

If a session “looks rolled back”, it's usually because work wasn't committed yet. Check:

  • git status --porcelain
  • out/checkpoints/iftypeset_checkpoint_*.tar.gz
  • docs/CHECKPOINTS.md
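
The first two checks can be scripted. The sketch below is an illustration rather than anything shipped in scripts/: the helper names are hypothetical, and the parser handles only the simple two-character status form of `git status --porcelain`, not quoted paths or rename entries.

```python
from __future__ import annotations

import glob
import subprocess


def parse_porcelain(output: str) -> list[tuple[str, str]]:
    """Split `git status --porcelain` (v1) output into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if len(line) > 3:
            # Two status characters, one space, then the path.
            entries.append((line[:2], line[3:]))
    return entries


def uncommitted_entries() -> list[tuple[str, str]]:
    """Ask git for uncommitted changes in the current repository."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)


def checkpoint_archives() -> list[str]:
    """List checkpoint tarballs under out/checkpoints/, sorted by name."""
    return sorted(glob.glob("out/checkpoints/iftypeset_checkpoint_*.tar.gz"))
```

An empty list from `uncommitted_entries()` together with a non-empty `checkpoint_archives()` suggests the work is either committed or recoverable from a checkpoint.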

What remains (next work, prioritized)

1) Deepen PDF-aware QA (beyond widows/orphans)

High leverage detectors:

  • stranded headings / keep-with-next violations (PDF-aware)
  • more reliable overflow/clipping detection (tables/code, page bounds)

Goal: make qa failures correlate strongly with “this PDF looks broken.”
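
One way a stranded-heading detector could start, sketched under strong assumptions: the heading strings are known from the Markdown source, the PDF text has form-feed page separators as `pdftotext` emits, and each heading appears on a single line exactly as written. The function name and approach are hypothetical, not the planned implementation.

```python
from __future__ import annotations


def stranded_headings(pdf_text: str, headings: list[str]) -> list[tuple[int, str]]:
    """Flag headings that are the last non-empty line on a page (1-indexed pages)."""
    wanted = {h.strip() for h in headings}
    findings = []
    for number, page in enumerate(pdf_text.split("\f"), start=1):
        lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
        # A heading at the very bottom has no body text kept with it.
        if lines and lines[-1] in wanted:
            findings.append((number, lines[-1]))
    return findings
```

Footers and page numbers would normally sit below the heading and hide it from this check, so a usable version has to strip them first or look at the last few lines instead of only the final one.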

2) Increase implemented enforcement where it buys down real pain

Focus areas:

  • citations normalization + i18n/locale checks where deterministic
  • link/DOI/email wrapping policies (reduce broken PDFs)
  • table + code overflow strategies across profiles

Rule of thumb: implement what repeatedly causes layout incidents in fixtures.
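
For the wrapping policies in particular, a first-pass check might look like the following: find whitespace-delimited tokens longer than a character budget and label them as URL, DOI, email, or other. The 40-character default, the regular expressions, and the function names are assumptions for this sketch; the project's profile-aware heuristics may work differently.

```python
from __future__ import annotations

import re

# Simplified patterns, good enough for triage but not for validation.
URL_RE = re.compile(r"^https?://", re.IGNORECASE)
DOI_RE = re.compile(r"^(doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify(token: str) -> str:
    """Label a token as 'url', 'doi', 'email', or 'other'."""
    if URL_RE.match(token):
        return "url"
    if DOI_RE.match(token):
        return "doi"
    if EMAIL_RE.match(token):
        return "email"
    return "other"


def overfull_tokens(text: str, max_chars: int = 40) -> list[tuple[str, str]]:
    """Return (kind, token) for every unbroken token longer than max_chars."""
    findings = []
    for token in text.split():
        if len(token) > max_chars:
            findings.append((classify(token), token))
    return findings
```

A character budget is only a proxy for line measure; a profile-aware check would derive the limit from the profile's font size and column width.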

3) Forgejo integration (export worker)

Target: use iftypeset as Forgejo's typeset/export worker:

  • apply profiles + CSS to exported docs
  • emit QA artifacts as export attachments
  • treat “PDF export quality” as CI, not a best-effort feature

4) Expand rule batches responsibly

Continue batches where they matter for real docs:

  • figures, frontmatter/backmatter, references, long code blocks, complex tables
  • keep paraphrase + pointer discipline; no book text in repo

Canonical docs (start here)

  • README.md
  • STATUS.md
  • docs/06-project-overview.md
  • docs/07-session-resilience.md
  • docs/08-handoff.md
  • docs/09-project-status.md
  • docs/10-project-brief.md
  • docs/05-external-evaluation-prompt.md
  • app/ARCHITECTURE.md
  • app/CLI_SPEC.md