iftypeset — Project Status (2026-01-03)
What This Project Is
iftypeset is a thin, deterministic typeset runtime for turning Markdown into:
- Stable, shareable HTML (render-html)
- A PDF (render-pdf)
- Machine-readable quality reports (lint, qa, report)
It is paired with a machine-readable rule registry derived from:
- Chicago Manual of Style (CMOS 18)
- Bringhurst (Elements of Typographic Style)
Important: rule records are paraphrases only, each with a pointer ref (e.g., CMOS18 §6.2 p377 (scan p10)). This repo must not contain book text.
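As an illustration of how a pointer ref like the one above could be handled by tooling, here is a minimal sketch of a parser. The pattern is inferred only from the single example in this document; the names PointerRef and parse_pointer_ref are hypothetical and are not part of iftypeset.

```python
import re
from typing import NamedTuple, Optional


class PointerRef(NamedTuple):
    """Hypothetical container for a parsed pointer ref (not an iftypeset type)."""
    source: str
    section: str
    page: int
    scan_page: Optional[int]


# Assumed shape: "<SOURCE> §<section> p<page> (scan p<scan>)", with the
# "(scan p..)" part treated as optional. Inferred from one example only.
_POINTER_RE = re.compile(
    r"^(?P<source>\S+) §(?P<section>\d+(?:\.\d+)*) p(?P<page>\d+)"
    r"(?: \(scan p(?P<scan>\d+)\))?$"
)


def parse_pointer_ref(text: str) -> PointerRef:
    """Parse a pointer ref string, raising ValueError if it does not match."""
    match = _POINTER_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized pointer ref: {text!r}")
    scan = match.group("scan")
    return PointerRef(
        source=match.group("source"),
        section=match.group("section"),
        page=int(match.group("page")),
        scan_page=int(scan) if scan is not None else None,
    )


print(parse_pointer_ref("CMOS18 §6.2 p377 (scan p10)"))
```

A parser along these lines would let a validation step reject records whose refs are malformed, without ever needing the book text itself.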
What Works Today
Verified on master via ./scripts/ci.sh (spec validate + report + unit tests):
- End-to-end CLI: validate-spec, report, lint, render-html, render-pdf, qa, emit-css
- Deterministic lint engine with safe rewrite mode (lint --fix --fix-mode rewrite)
- Deterministic HTML rendering; PDF rendering works when an engine is available (Playwright is the default)
- QA analyzer (HTML + PDF heuristics) with incident details:
  - long/bare URL/DOI/email wrap incidents
  - overfull token detection
  - table/code overflow incidents (profile-aware thresholds)
  - PDF-aware widow/orphan heuristics via Poppler text extraction (pdftotext -layout)
  - PDF-aware runt final page detection (short last page heuristics)
- CI wiring for Forgejo: .forgejo/workflows/ci.yml
- Session resilience tooling:
  - ./scripts/audit.sh prints a compact truth snapshot (git + coverage + checkpoints)
  - ./scripts/checkpoint.sh "note" creates a portable restore tarball recorded in docs/CHECKPOINTS.md
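To make the "runt final page" heuristic mentioned above more concrete, here is a minimal sketch of how such a check could work on pdftotext -layout output. It is not the project's implementation: the function names and the min_lines threshold are assumptions for illustration. It relies only on the fact that pdftotext separates pages with a form feed character.

```python
from typing import List


def split_pages(layout_text: str) -> List[str]:
    """Split `pdftotext -layout` output into pages on form feed characters."""
    pages = layout_text.split("\f")
    # The output usually ends with a form feed, leaving an empty tail; drop it.
    while pages and not pages[-1].strip():
        pages.pop()
    return pages


def count_text_lines(page: str) -> int:
    """Count lines on a page that contain any non-whitespace text."""
    return sum(1 for line in page.splitlines() if line.strip())


def is_runt_final_page(layout_text: str, min_lines: int = 5) -> bool:
    """Flag a document whose last page has fewer than `min_lines` text lines.

    `min_lines` is a hypothetical threshold; a real profile would tune it.
    """
    pages = split_pages(layout_text)
    if len(pages) < 2:
        # A single-page document cannot have a runt *final* page.
        return False
    return count_text_lines(pages[-1]) < min_lines


sample = "line 1\nline 2\nline 3\nline 4\nline 5\nline 6\n\fone stray line\n\f"
print(is_runt_final_page(sample))
```

Because the check works on extracted text only, it cannot see images or vertical whitespace, which is one reason checks of this kind stay heuristic.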
Rule Registry Snapshot (Real Counts)
From out/coverage-report.json (generated by PYTHONPATH=src python3 -m iftypeset.cli report --spec spec --out out):
- Total rules: 524
- Enforcement split: manual 379, typeset 62, lint 70, postrender 13
- Severity split: must 37, should 470, warn 17
Category counts:
- editorial 45
- citations 61
- numbers 62
- punctuation 55
- layout 46
- headings 32
- tables 23
- links 21
- i18n 27
- abbreviations 27
- code 28
- accessibility 22
- frontmatter 20
- backmatter 18
- figures 22
- typography 15
Coverage Map Summary (Sections)
Generated by python3 tools/coverage_summary.py --coverage-dir spec/coverage --out-json out/coverage-summary.json --out-md out/coverage-summary.md:
- BRING: 64 sections, 284 rules (all partial)
- CMOS18: 176 sections, 550 rules (all partial)
- Total unique rule_ids across coverage maps: 834
Interpretation:
- The registry is intentionally larger than the enforcement surface.
- Many rules remain manual_checklist=true by design until we have deterministic enforcement for them.
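The counts quoted in this status report can be cross-checked with a few lines of arithmetic. The sketch below simply re-enters the figures from this document by hand (it does not read out/coverage-report.json, whose schema is not shown here) and confirms that each split adds up to its stated total.

```python
# Figures copied by hand from this status report.
total_rules = 524
enforcement = {"manual": 379, "typeset": 62, "lint": 70, "postrender": 13}
severity = {"must": 37, "should": 470, "warn": 17}
categories = {
    "editorial": 45, "citations": 61, "numbers": 62, "punctuation": 55,
    "layout": 46, "headings": 32, "tables": 23, "links": 21,
    "i18n": 27, "abbreviations": 27, "code": 28, "accessibility": 22,
    "frontmatter": 20, "backmatter": 18, "figures": 22,
    # The category counts only reach 524 with the typography entry included.
    "typography": 15,
}
coverage_maps = {"BRING": 284, "CMOS18": 550}
coverage_total = 834

checks = {
    "enforcement split": sum(enforcement.values()) == total_rules,
    "severity split": sum(severity.values()) == total_rules,
    "category counts": sum(categories.values()) == total_rules,
    "coverage maps": sum(coverage_maps.values()) == coverage_total,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

A check like this is cheap to rerun whenever the numbers in the report are refreshed.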
What This Is Not Yet
- A full “publication-grade PDF QA” system. PDF-aware checks exist, but are heuristic (text extraction based) and limited in scope.
- A complete automated implementation of Chicago/Bringhurst. The registry is pointer-backed; enforcement is incremental and explicit.
How To Run (Quick)
cd /root/ai-workspace/iftypeset
Validate + rebuild indexes:
PYTHONPATH=src python3 -m iftypeset.cli validate-spec --spec spec --build-indexes
Generate coverage report:
PYTHONPATH=src python3 -m iftypeset.cli report --spec spec --out out --build-indexes
Lint (and optionally autofix):
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf --fix --fix-mode rewrite
Render HTML + PDF:
PYTHONPATH=src python3 -m iftypeset.cli render-html --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-pdf --input fixtures/sample.md --out out --profile web_pdf
Run QA:
PYTHONPATH=src python3 -m iftypeset.cli qa --out out --profile web_pdf
Sanity check:
./scripts/ci.sh
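The commands above can also be chained from a small driver script. The following is a sketch only: build_commands and run_pipeline are hypothetical helpers, not part of iftypeset, and they use just the subcommands and flags shown in this section. It assumes it is run from the repository root so that PYTHONPATH=src resolves.

```python
import os
import subprocess
from typing import List


def build_commands(input_path: str, out_dir: str, profile: str,
                   spec_dir: str = "spec") -> List[List[str]]:
    """Return the quick-run steps from this section as argv lists, in order."""
    base = ["python3", "-m", "iftypeset.cli"]
    doc_args = ["--input", input_path, "--out", out_dir, "--profile", profile]
    return [
        base + ["validate-spec", "--spec", spec_dir, "--build-indexes"],
        base + ["lint"] + doc_args,
        base + ["render-html"] + doc_args,
        base + ["render-pdf"] + doc_args,
        # As in the section above, qa takes no --input; it reads from --out.
        base + ["qa", "--out", out_dir, "--profile", profile],
    ]


def run_pipeline(input_path: str, out_dir: str, profile: str) -> None:
    """Run each step in order and stop at the first failing command."""
    env = dict(os.environ, PYTHONPATH="src")  # mirrors the PYTHONPATH=src prefix
    for cmd in build_commands(input_path, out_dir, profile):
        subprocess.run(cmd, check=True, env=env)


# Dry run: print the commands instead of executing them.
for cmd in build_commands("fixtures/sample.md", "out", "web_pdf"):
    print("PYTHONPATH=src " + " ".join(cmd))
```

Calling run_pipeline("fixtures/sample.md", "out", "web_pdf") would execute the steps for real; check=True makes a failing step raise subprocess.CalledProcessError so later steps never run against stale output.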
Don’t Lose Work (Session Resets)
Chat logs are not durable. The repo is.
- Snapshot: ./scripts/audit.sh
- Restore point: ./scripts/checkpoint.sh "short note"
- Checkpoint index: docs/CHECKPOINTS.md
What Remains (Prioritized)
- Improve post-render QA beyond current heuristics
  - PDF-aware stranded headings / keep-with-next violations
  - more reliable overflow/clipping detection when a renderer is pinned
- Increase implemented rule coverage where it matters
  - citations normalization / author-date variants where feasible
  - i18n/locale-driven checks (without pretending perfect automation)
  - link/DOI wrapping policies (reduce broken PDFs)
- Forgejo integration
  - use iftypeset as the Forgejo typeset/export worker
  - emit QA artifacts as export attachments
- Continue adding rule batches
  - prioritize what breaks real documents: figures, references, complex tables, long code blocks