# `iftypeset`: Project Status (2026-01-03)

## What This Project Is

`iftypeset` is a thin, deterministic **typeset runtime** for turning Markdown into:

- Stable, shareable HTML (`render-html`)
- A PDF (`render-pdf`)
- Machine-readable quality reports (`lint`, `qa`, `report`)

It is paired with a **machine-readable rule registry** derived from:

- Chicago Manual of Style (CMOS 18)
- Bringhurst (*Elements of Typographic Style*)

Important: rule records are **paraphrases only** with **pointer refs** (e.g., `CMOS18 §6.2 p377 (scan p10)`). This repo must not contain book text.

## What Works Today

Verified on `master` via `./scripts/ci.sh` (spec validate + report + unit tests):

- End-to-end CLI: `validate-spec`, `report`, `lint`, `render-html`, `render-pdf`, `qa`, `emit-css`
- Deterministic lint engine with safe rewrite mode (`lint --fix --fix-mode rewrite`)
- Deterministic HTML rendering; PDF rendering works when an engine is available (Playwright is the default)
- QA analyzer (HTML + PDF heuristics) with incident details:
  - long/bare URL/DOI/email wrap incidents
  - overfull token detection
  - table/code overflow incidents (profile-aware thresholds)
  - PDF-aware widow/orphan heuristics via Poppler text extraction (`pdftotext -layout`)
  - PDF-aware runt final page detection (short last page heuristics)
- CI wiring for Forgejo: `.forgejo/workflows/ci.yml`
- Session resilience tooling:
  - `./scripts/audit.sh` prints a compact truth snapshot (git + coverage + checkpoints)
  - `./scripts/checkpoint.sh "note"` creates a portable restore tarball recorded in `docs/CHECKPOINTS.md`

## Rule Registry Snapshot (Real Counts)

From `out/coverage-report.json` (generated by `PYTHONPATH=src python3 -m iftypeset.cli report --spec spec --out out`):

- **Total rules:** 524
- **Enforcement split:** manual 379, typeset 62, lint 70, postrender 13
- **Severity split:** must 37, should 470, warn 17

Category counts:

- editorial 45
- citations 61
- numbers 62
- punctuation 55
- layout 46
- headings 32
- tables 23
- links 21
- i18n 27
- abbreviations 27
- code 28
- accessibility 22
- frontmatter 20
- backmatter 18
- figures 22
- typography 15

## Coverage Map Summary (Sections)

Generated by `python3 tools/coverage_summary.py --coverage-dir spec/coverage --out-json out/coverage-summary.json --out-md out/coverage-summary.md`:

- BRING: 64 sections, 284 rules (all partial)
- CMOS18: 176 sections, 550 rules (all partial)
- Total unique rule_ids across coverage maps: 834

Interpretation:

- The registry is intentionally larger than the enforcement surface.
- Many rules remain `manual_checklist=true` by design until we have deterministic enforcement for them.

## What This Is *Not* Yet

- A full “publication-grade PDF QA” system. PDF-aware checks exist, but are heuristic (text extraction based) and limited in scope.
- A complete automated implementation of Chicago/Bringhurst. The registry is pointer-backed; enforcement is incremental and explicit.

## How To Run (Quick)

```bash
cd /root/ai-workspace/iftypeset
```

Validate + rebuild indexes:

```bash
PYTHONPATH=src python3 -m iftypeset.cli validate-spec --spec spec --build-indexes
```

Generate coverage report:

```bash
PYTHONPATH=src python3 -m iftypeset.cli report --spec spec --out out --build-indexes
```

Lint (and optionally autofix):

```bash
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf --fix --fix-mode rewrite
```

Render HTML + PDF:

```bash
PYTHONPATH=src python3 -m iftypeset.cli render-html --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-pdf --input fixtures/sample.md --out out --profile web_pdf
```

Run QA:

```bash
PYTHONPATH=src python3 -m iftypeset.cli qa --out out --profile web_pdf
```

Sanity check:

```bash
./scripts/ci.sh
```

## Don’t Lose Work (Session Resets)

Chat logs are not durable. The repo is.

- Snapshot: `./scripts/audit.sh`
- Restore point: `./scripts/checkpoint.sh "short note"`
- Checkpoint index: `docs/CHECKPOINTS.md`

## What Remains (Prioritized)

1) **Improve post-render QA beyond current heuristics**
   - PDF-aware stranded headings / keep-with-next violations
   - more reliable overflow/clipping detection when a renderer is pinned
2) **Increase *implemented* rule coverage where it matters**
   - citations normalization / author-date variants where feasible
   - i18n/locale-driven checks (without pretending perfect automation)
   - link/DOI wrapping policies (reduce broken PDFs)
3) **Forgejo integration**
   - use `iftypeset` as the Forgejo typeset/export worker
   - emit QA artifacts as export attachments
4) **Continue adding rule batches**
   - prioritize what breaks real documents: figures, references, complex tables, long code blocks
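
Item 1 above lists stranded-heading detection as open work. As a rough illustration of what such a check could look like on top of the `pdftotext -layout` extraction already used for the widow/orphan heuristics, here is a minimal sketch. Nothing in it is `iftypeset` code: the function name `find_stranded_headings`, the heading pattern, and the handling of bare page-number footers are all hypothetical choices. The only external fact it relies on is that `pdftotext` separates pages with a form feed character.

```python
import re

# Hypothetical heuristic (an assumption, not iftypeset's rule): a line "looks
# like a heading" if it is short, optionally starts with a section number,
# begins with a capital letter, and contains no sentence punctuation.
HEADING_RE = re.compile(r"(\d+(\.\d+)*\s+)?[A-Z][^.!?:;,]{0,58}")


def find_stranded_headings(layout_text: str) -> list[tuple[int, str]]:
    """Return (page_number, line) for each page whose last content line looks like a heading."""
    incidents = []
    # pdftotext ends each page with a form feed, so splitting on it yields pages.
    for page_number, page in enumerate(layout_text.split("\f"), start=1):
        lines = [line.strip() for line in page.splitlines() if line.strip()]
        # Ignore trailing bare page numbers so a footer does not hide a heading.
        while lines and lines[-1].isdigit():
            lines.pop()
        if not lines:
            continue
        last = lines[-1]
        if HEADING_RE.fullmatch(last):
            incidents.append((page_number, last))
    return incidents


sample = (
    "Intro paragraph ends here.\n\n2.1 Methods\n\f"
    "Body text continues on the next page.\n\n3\n\f"
)
print(find_stranded_headings(sample))
```

To try it on a real render, pass in the text that `pdftotext -layout <your.pdf> -` writes to stdout. Expect false positives (for example, a short unpunctuated line that happens to end a page), which is one reason a check like this would belong with the heuristic incidents rather than hard failures.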