# iftypeset tools (ephemeral extraction helpers)

These helpers exist to support **pointer-based** rule creation from purchased reference PDFs
without storing or committing copyrighted text.

**Rules of engagement (non-negotiable):**
- Do **not** check in OCR output or full extracted book text.
- Use these tools to locate **where** guidance lives (section/page) and to inform **paraphrased** rule records.
- `source_refs` in `spec/rules/**.ndjson` must be pointers (e.g., `CMOS18 §6 p377`), not quotes.
- Keep any OCR artifacts **ephemeral** (prefer `/tmp`, delete images after OCR).

Tools in this folder may print short snippets to stdout for operator convenience.
That is okay for local use; do not redirect that output into committed files.

## coverage_summary.py

Deterministic summary of `spec/coverage/*.json` (sections + unique rule IDs).
Writes JSON + Markdown summaries to `out/coverage-summary.json` and
`out/coverage-summary.md`.

## coverage_ocr_audit.py

Audit helper that compares OCR-detected **section numbers only** to a coverage map
within a scan-page range, and performs pointer sanity checks (scan page + printed page).