# `iftypeset` — Project Status (2026-01-03)

## What This Project Is

`iftypeset` is a thin, deterministic **typeset runtime** for turning Markdown into:

- Stable, shareable HTML (`render-html`)
- A PDF (`render-pdf`)
- Machine-readable quality reports (`lint`, `qa`, `report`)

It is paired with a **machine-readable rule registry** derived from:

- Chicago Manual of Style (CMOS 18)
- Bringhurst (*Elements of Typographic Style*)

Important: rule records contain **paraphrases only**, each with a **pointer ref** (e.g., `CMOS18 §6.2 p377 (scan p10)`). This repo must not contain book text.
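For tooling that needs to consume pointer refs, a minimal parsing sketch might look like the following. It assumes every ref follows the shape of the single example above (`SOURCE §section pPAGE (scan pN)`, with the scan part optional); the real registry may use other shapes, and `PointerRef` and `parse_pointer_ref` are hypothetical names, not part of `iftypeset`.

```python
import re
from typing import NamedTuple, Optional


class PointerRef(NamedTuple):
    """Hypothetical structured form of a pointer ref."""
    source: str               # e.g. "CMOS18"
    section: str              # e.g. "6.2"
    page: int                 # printed page number
    scan_page: Optional[int]  # page in the scan, if given


# Assumed grammar, inferred only from the example `CMOS18 §6.2 p377 (scan p10)`.
_POINTER_RE = re.compile(
    r"^(?P<source>[A-Z0-9]+)\s+§(?P<section>[\d.]+)\s+p(?P<page>\d+)"
    r"(?:\s+\(scan p(?P<scan>\d+)\))?$"
)


def parse_pointer_ref(text: str) -> Optional[PointerRef]:
    """Return a PointerRef, or None when the text does not match the assumed shape."""
    match = _POINTER_RE.match(text.strip())
    if match is None:
        return None
    scan = match.group("scan")
    return PointerRef(
        source=match.group("source"),
        section=match.group("section"),
        page=int(match.group("page")),
        scan_page=int(scan) if scan is not None else None,
    )
```

Returning `None` instead of raising keeps the sketch usable as a filter over mixed input; a validator would more likely want to report the offending ref.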

## What Works Today

Verified on `master` via `./scripts/ci.sh` (spec validate + report + unit tests):

- End-to-end CLI: `validate-spec`, `report`, `lint`, `render-html`, `render-pdf`, `qa`, `emit-css`
- Deterministic lint engine with safe rewrite mode (`lint --fix --fix-mode rewrite`)
- Deterministic HTML rendering; PDF rendering works when an engine is available (Playwright is the default)
- QA analyzer (HTML + PDF heuristics) with incident details:
  - long/bare URL/DOI/email wrap incidents
  - overfull token detection
  - table/code overflow incidents (profile-aware thresholds)
  - PDF-aware widow/orphan heuristics via Poppler text extraction (`pdftotext -layout`)
  - PDF-aware runt final page detection (short last page heuristics)
- CI wiring for Forgejo: `.forgejo/workflows/ci.yml`
- Session resilience tooling:
  - `./scripts/audit.sh` prints a compact truth snapshot (git + coverage + checkpoints)
  - `./scripts/checkpoint.sh "note"` creates a portable restore tarball recorded in `docs/CHECKPOINTS.md`
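To make the "runt final page" idea from the QA list concrete, here is a rough sketch of how such a heuristic could be built on `pdftotext -layout` output. This is not the `iftypeset` implementation: the function names and the `min_lines` threshold are assumptions chosen for illustration, and the real analyzer uses profile-aware thresholds.

```python
import subprocess
from typing import List


def extract_pages(pdf_path: str) -> List[str]:
    """Extract per-page text with Poppler; pdftotext separates pages with form feeds."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" writes the text to stdout
        check=True,
        capture_output=True,
        text=True,
    )
    pages = result.stdout.split("\f")
    # The output ends with a form feed, so the last piece is usually empty.
    if pages and not pages[-1].strip():
        pages.pop()
    return pages


def count_text_lines(page: str) -> int:
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in page.splitlines() if line.strip())


def is_runt_final_page(pages: List[str], min_lines: int = 5) -> bool:
    """Flag a multi-page document whose last page holds only a few lines.

    `min_lines` is a hypothetical threshold, not a value taken from iftypeset.
    """
    if len(pages) < 2:
        return False  # a single short page is not a runt
    last = count_text_lines(pages[-1])
    return 0 < last < min_lines
```

Keeping the detection logic separate from the `pdftotext` call means the heuristic can be unit-tested on plain strings without Poppler installed.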

## Rule Registry Snapshot (Real Counts)

From `out/coverage-report.json` (generated by `PYTHONPATH=src python3 -m iftypeset.cli report --spec spec --out out`):

- **Total rules:** 524
- **Enforcement split:** manual 379, typeset 62, lint 70, postrender 13
- **Severity split:** must 37, should 470, warn 17
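As a quick consistency check, each split should add up to the total rule count. The sketch below hardcodes the figures quoted in this document rather than reading `out/coverage-report.json`, since the JSON field names are not shown here; `check_split` is a hypothetical helper.

```python
from typing import Dict

TOTAL_RULES = 524

enforcement: Dict[str, int] = {"manual": 379, "typeset": 62, "lint": 70, "postrender": 13}
severity: Dict[str, int] = {"must": 37, "should": 470, "warn": 17}
# Category counts as listed in this status document.
categories: Dict[str, int] = {
    "editorial": 45, "citations": 61, "numbers": 62, "punctuation": 55,
    "layout": 46, "headings": 32, "tables": 23, "links": 21,
    "i18n": 27, "abbreviations": 27, "code": 28, "accessibility": 22,
    "frontmatter": 20, "backmatter": 18, "figures": 22, "typography": 15,
}


def check_split(counts: Dict[str, int], total: int = TOTAL_RULES) -> bool:
    """Return True when the per-bucket counts add up to the total."""
    return sum(counts.values()) == total


for name, counts in [("enforcement", enforcement),
                     ("severity", severity),
                     ("categories", categories)]:
    status = "ok" if check_split(counts) else "MISMATCH"
    print(f"{name}: {sum(counts.values())} ({status})")
```

A mismatch in any split usually means a bucket was dropped or miscopied when the snapshot was transcribed.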

Category counts:

- editorial 45
- citations 61
- numbers 62
- punctuation 55
- layout 46
- headings 32
- tables 23
- links 21
- i18n 27
- abbreviations 27
- code 28
- accessibility 22
- frontmatter 20
- backmatter 18
- figures 22
- typography 15

## Coverage Map Summary (Sections)

Generated by `python3 tools/coverage_summary.py --coverage-dir spec/coverage --out-json out/coverage-summary.json --out-md out/coverage-summary.md`:

- BRING: 64 sections, 284 rules (all partial)
- CMOS18: 176 sections, 550 rules (all partial)
- Total unique rule_ids across coverage maps: 834

Interpretation:

- The registry is intentionally larger than the enforcement surface.
- Many rules remain `manual_checklist=true` by design until we have deterministic enforcement for them.

## What This Is *Not* Yet

- A full “publication-grade PDF QA” system. PDF-aware checks exist, but they are heuristic (based on text extraction) and limited in scope.
- A complete automated implementation of Chicago/Bringhurst. The registry is pointer-backed; enforcement is incremental and explicit.

## How To Run (Quick)

```bash
cd /root/ai-workspace/iftypeset  # adjust to wherever the repo is checked out
```

Validate + rebuild indexes:

```bash
PYTHONPATH=src python3 -m iftypeset.cli validate-spec --spec spec --build-indexes
```

Generate coverage report:

```bash
PYTHONPATH=src python3 -m iftypeset.cli report --spec spec --out out --build-indexes
```

Lint (and optionally autofix):

```bash
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli lint --input fixtures/sample.md --out out --profile web_pdf --fix --fix-mode rewrite
```

Render HTML + PDF:

```bash
PYTHONPATH=src python3 -m iftypeset.cli render-html --input fixtures/sample.md --out out --profile web_pdf
PYTHONPATH=src python3 -m iftypeset.cli render-pdf  --input fixtures/sample.md --out out --profile web_pdf
```

Run QA:

```bash
PYTHONPATH=src python3 -m iftypeset.cli qa --out out --profile web_pdf
```

Sanity check:

```bash
./scripts/ci.sh
```
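The individual commands above can also be chained from a small driver. The following is a sketch only: it reuses the subcommands and flags shown in this section, but it assumes that each subcommand signals failure through a non-zero exit code, which this document does not state. `build_commands` and `run_pipeline` are hypothetical names.

```python
import os
import subprocess
import sys
from typing import List


def build_commands(input_md: str, out_dir: str = "out",
                   profile: str = "web_pdf") -> List[List[str]]:
    """Build the argv lists for validate-spec, lint, render-html, render-pdf, qa."""
    base = [sys.executable, "-m", "iftypeset.cli"]
    io_args = ["--input", input_md, "--out", out_dir, "--profile", profile]
    return [
        base + ["validate-spec", "--spec", "spec", "--build-indexes"],
        base + ["lint"] + io_args,
        base + ["render-html"] + io_args,
        base + ["render-pdf"] + io_args,
        base + ["qa", "--out", out_dir, "--profile", profile],
    ]


def run_pipeline(input_md: str) -> int:
    """Run each step in order; stop at the first non-zero exit code (assumed to mean failure)."""
    env = dict(os.environ, PYTHONPATH="src")  # same PYTHONPATH as the shell examples
    for cmd in build_commands(input_md):
        print("running:", " ".join(cmd))
        code = subprocess.run(cmd, env=env).returncode
        if code != 0:
            return code
    return 0
```

Run from the repo root, `run_pipeline("fixtures/sample.md")` would mirror the shell sequence above; `./scripts/ci.sh` remains the authoritative sanity check.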

## Don’t Lose Work (Session Resets)

Chat logs are not durable. The repo is.

- Snapshot: `./scripts/audit.sh`
- Restore point: `./scripts/checkpoint.sh "short note"`
- Checkpoint index: `docs/CHECKPOINTS.md`

## What Remains (Prioritized)

1) **Improve post-render QA beyond current heuristics**
   - PDF-aware stranded headings / keep-with-next violations
   - more reliable overflow/clipping detection when a renderer is pinned

2) **Increase *implemented* rule coverage where it matters**
   - citations normalization / author-date variants where feasible
   - i18n/locale-driven checks (without pretending perfect automation)
   - link/DOI wrapping policies (reduce broken PDFs)

3) **Forgejo integration**
   - use `iftypeset` as the Forgejo typeset/export worker
   - emit QA artifacts as export attachments

4) **Continue adding rule batches**
   - prioritize what breaks real documents: figures, references, complex tables, long code blocks