iftypeset — Project Summary (downloadable)
What this is
iftypeset is a deterministic document-quality runtime for turning Markdown into:
- render.html + render.css (stable, shareable)
- render.pdf (stable given pinned engine + fonts)
- quality artifacts (lint-report.json, layout-report.json, qa-report.json, coverage-report.json)
The differentiator is measurement + enforceability:
- A machine‑readable rule registry (spec/rules/**.ndjson) derived from Chicago/Bringhurst as paraphrases only, with pointer refs (no book text).
- Profiles (spec/profiles/*.yaml) that map typographic intent into render tokens.
- QA gates (spec/quality_gates.yaml) that fail builds when layout degrades.
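To make the gate idea concrete, here is a minimal Python sketch of a gate check. It assumes a gate is simply a metric name mapped to a maximum allowed value; the metric names and limits are hypothetical and do not reflect the actual schema of spec/quality_gates.yaml.

```python
def failed_gates(metrics, gates):
    """Return the sorted names of gates whose measured value exceeds its limit.

    `metrics` and `gates` are plain dicts keyed by metric name. A metric
    missing from `metrics` is treated as 0 (an assumption of this sketch).
    """
    return sorted(
        name for name, limit in gates.items()
        if metrics.get(name, 0) > limit
    )


# Hypothetical gate limits and measurements, for illustration only.
example_gates = {"widows": 0, "overfull_tokens": 2}
example_metrics = {"widows": 1, "overfull_tokens": 2}
example_failures = failed_gates(example_metrics, example_gates)
```

A build script could exit non-zero whenever the returned list is non-empty, which is the "fail builds when layout degrades" behavior described above.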
Positioning (honest): this is CI for document quality, not “we perfectly implement Chicago/Bringhurst.”
What works today (verified)
Verified locally via ./scripts/ci.sh (spec validate + report + tests):
- CLI pipeline: validate-spec, report, lint, render-html, render-pdf, qa, emit-css
- Deterministic lint (with safe rewrite mode) + manual checklist emission
- Deterministic HTML output; PDF rendering works when an engine is available (Playwright is the default)
- QA analyzer:
  - HTML heuristics: bare URL/DOI/email wrap, overfull tokens, table/code overflow (profile-aware)
  - PDF heuristics: widows/orphans via Poppler text extraction (pdftotext -layout)
- Session resilience:
  - ./scripts/audit.sh (truth snapshot)
  - ./scripts/state.sh (writes docs/SESSION_STATE.md)
  - ./scripts/checkpoint.sh "note" (portable restore tarball + entry in docs/CHECKPOINTS.md)
- “Trust contract” artifact emitted by report:
  - out/trust-contract.md and out/trust-contract.json
  - makes enforcement vs manual + QA limitations explicit
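The widow/orphan heuristic listed above can be sketched roughly as follows. This is not iftypeset's analyzer; it is a simplified illustration that relies only on two facts about pdftotext: `-layout` preserves the physical layout, and pages are separated by form feed characters. All function names are hypothetical, and the "one-line block" rule is deliberately crude (headings, running headers, and page numbers would all be false positives).

```python
import subprocess


def extract_pages(pdf_path):
    """Run Poppler's pdftotext in layout mode and return one string per page."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" writes text to stdout
        capture_output=True, text=True, check=True,
    )
    return split_pages(result.stdout)


def split_pages(text):
    """Split pdftotext output on form feeds, dropping a trailing empty page."""
    pages = text.split("\f")
    if pages and not pages[-1].strip():
        pages.pop()
    return pages


def edge_block(lines):
    """Return the first run of consecutive non-blank lines in `lines`."""
    block = []
    for line in lines:
        if line.strip():
            block.append(line)
        elif block:
            break
    return block


def find_widow_orphan_candidates(pages):
    """Flag pages that begin (widow) or end (orphan) with a one-line block."""
    findings = []
    for index, page in enumerate(pages):
        lines = page.splitlines()
        if index > 0 and len(edge_block(lines)) == 1:
            findings.append((index + 1, "widow"))
        if index < len(pages) - 1 and len(edge_block(reversed(lines))) == 1:
            findings.append((index + 1, "orphan"))
    return findings
```

Page numbers in the result are 1-based. A production detector would need to know where paragraphs actually start and end, which plain text extraction only approximates.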
Rule registry snapshot (real counts)
From out/coverage-report.json (generated by report):
- Total rules: 524
- Enforcement split: manual 379, typeset 62, lint 70, postrender 13
- Editorial category is no longer empty: 45 house-pointer editorial rules (manual-first)
Interpretation:
- The registry is intentionally broader than enforcement.
- Manual rules are not a weakness if the system is explicit and generates checklists deterministically.
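As a quick sanity check on the snapshot, the enforcement split can be summed and turned into shares. The sketch below hard-codes the counts quoted above rather than reading out/coverage-report.json, because the report's JSON structure is not shown here.

```python
# Counts copied from the snapshot above.
TOTAL_RULES = 524
ENFORCEMENT_SPLIT = {"manual": 379, "typeset": 62, "lint": 70, "postrender": 13}


def enforcement_shares(split):
    """Return each enforcement class as a percentage of all rules (1 decimal)."""
    total = sum(split.values())
    return {name: round(100 * count / total, 1) for name, count in split.items()}


split_is_consistent = sum(ENFORCEMENT_SPLIT.values()) == TOTAL_RULES
shares = enforcement_shares(ENFORCEMENT_SPLIT)
```

The four classes add up to 524, so the split is internally consistent. Manual rules make up about 72.3% of the registry, and the remaining classes (typeset, lint, postrender) together cover 145 rules.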
How to run (quick start)
From repo root:
./scripts/ci.sh
Minimal manual run:
PYTHONPATH=src python3 -m iftypeset.cli validate-spec --spec spec --build-indexes
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-html --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-pdf --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli qa --out out --profile web_pdf
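The same sequence can be scripted. The following is a minimal sketch that replays exactly the commands above from Python and stops at the first failing step; it assumes each subcommand signals failure with a non-zero exit code, which the commands above do not confirm. The helper names are hypothetical.

```python
import os
import subprocess

# Subcommands and arguments copied from the manual run above.
STEPS = [
    ["validate-spec", "--spec", "spec", "--build-indexes"],
    ["lint", "--input", "fixtures/sample.md", "--out", "out", "--profile", "web_pdf"],
    ["render-html", "--input", "fixtures/sample.md", "--out", "out", "--profile", "web_pdf"],
    ["render-pdf", "--input", "fixtures/sample.md", "--out", "out", "--profile", "web_pdf"],
    ["qa", "--out", "out", "--profile", "web_pdf"],
]


def build_commands(python="python3"):
    """Return the full argv list for each pipeline step."""
    return [[python, "-m", "iftypeset.cli", *step] for step in STEPS]


def run_pipeline(repo_root="."):
    """Run every step from the repo root with PYTHONPATH=src."""
    env = dict(os.environ, PYTHONPATH="src")
    for command in build_commands():
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, cwd=repo_root, env=env, check=True)
```

Calling run_pipeline() from the repo root is equivalent to typing the five commands in order.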
“Don’t lose work” (session resets)
Chat logs are not durable. The repo is.
- Resume: ./scripts/resume.sh
- Snapshot: ./scripts/audit.sh
- Pasteable state: ./scripts/state.sh (updates docs/SESSION_STATE.md)
- Restore point: ./scripts/checkpoint.sh "short note"
If a session “looks rolled back”, it’s usually because work wasn’t committed yet. Check:
- git status --porcelain
- out/checkpoints/iftypeset_checkpoint_*.tar.gz
- docs/CHECKPOINTS.md
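The first two checks can be automated. This is a small sketch, not one of the repo's scripts: it parses the default (v1) output of git status --porcelain, where each line is a two-character status, a space, and a path, and it picks the most recently modified checkpoint tarball. The function names are hypothetical.

```python
import glob
import os
import subprocess


def parse_porcelain(output):
    """Return the path field of each line of `git status --porcelain` output."""
    return [line[3:] for line in output.splitlines() if line.strip()]


def uncommitted_paths(repo="."):
    """List paths with uncommitted or untracked changes in `repo`."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)


def latest_checkpoint(repo="."):
    """Return the newest checkpoint tarball by modification time, or None."""
    pattern = os.path.join(repo, "out", "checkpoints", "iftypeset_checkpoint_*.tar.gz")
    matches = sorted(glob.glob(pattern), key=os.path.getmtime)
    return matches[-1] if matches else None
```

A non-empty result from uncommitted_paths() means the "rolled back" work is probably still sitting in the working tree.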
What remains (next work, prioritized)
1) Deepen PDF-aware QA (beyond widows/orphans)
High leverage detectors:
- stranded headings / keep-with-next violations (PDF-aware)
- more reliable overflow/clipping detection (tables/code, page bounds)
Goal: make qa failures correlate strongly with “this PDF looks broken.”
2) Increase implemented enforcement where it buys down real pain
Focus areas:
- citations normalization + i18n/locale checks where deterministic
- link/DOI/email wrapping policies (reduce broken PDFs)
- table + code overflow strategies across profiles
Rule of thumb: implement what repeatedly causes layout incidents in fixtures.
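As an illustration of the kind of deterministic check this could mean, here is a minimal sketch that flags unbreakable tokens (such as long URLs or DOIs) likely to overflow a line. The character limit is a hypothetical stand-in; a real check would depend on the profile's measure and font metrics, and this is not how iftypeset's existing heuristic is implemented.

```python
def overfull_tokens(text, max_chars=40):
    """Return whitespace-delimited tokens longer than `max_chars`.

    `max_chars` is an assumed threshold chosen for illustration only.
    """
    return [token for token in text.split() if len(token) > max_chars]


sample_text = "See https://example.org/" + "a" * 40 + " for details."
flagged = overfull_tokens(sample_text)
```

Counting how often each fixture trips a check like this is one way to decide which wrapping policy to implement first.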
3) Forgejo integration (export worker)
Target: use iftypeset as Forgejo’s typeset/export worker:
- apply profiles + CSS to exported docs
- emit QA artifacts as export attachments
- treat “PDF export quality” as CI, not a best-effort feature
4) Expand rule batches responsibly
Continue batches where they matter for real docs:
- figures, frontmatter/backmatter, references, long code blocks, complex tables
- keep paraphrase + pointer discipline; no book text in repo
Canonical docs (start here)
- README.md
- STATUS.md
- docs/06-project-overview.md
- docs/07-session-resilience.md
- docs/08-handoff.md
- docs/09-project-status.md
- docs/10-project-brief.md
- docs/05-external-evaluation-prompt.md
- app/ARCHITECTURE.md
- app/CLI_SPEC.md