# iftypeset tools (ephemeral extraction helpers)

These helpers exist to support **pointer-based** rule creation from purchased reference PDFs without storing or committing copyrighted text.

**Rules of engagement (non-negotiable):**

- Do **not** check in OCR output or full extracted book text.
- Use these tools to locate **where** guidance lives (section/page) and to inform **paraphrased** rule records.
- `source_refs` in `spec/rules/**.ndjson` must be pointers (e.g., `CMOS18 §6 p377`), not quotes.
- Keep any OCR artifacts **ephemeral** (prefer `/tmp`, delete images after OCR).

Tools in this folder may print short snippets to stdout for operator convenience. That is okay for local use; do not redirect that output into committed files.

## coverage_summary.py

Deterministic summary of `spec/coverage/*.json` (sections + unique rule IDs). Writes JSON + Markdown summaries to `out/coverage-summary.json` and `out/coverage-summary.md`.

## coverage_ocr_audit.py

Audit helper that compares OCR-detected **section numbers only** to a coverage map within a scan-page range, and performs pointer sanity checks (scan page + printed page).
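
As a rough illustration of the section-number comparison that `coverage_ocr_audit.py` is described as doing, the sketch below diffs OCR-detected section numbers against a set of covered sections within a scan-page range. It is not the script's actual code: the function names, the `(scan_page, section)` tuple shape, and the assumption that section numbers are dotted numeric strings such as `6.10` are all hypothetical. Only section numbers are handled, never extracted text.

```python
from typing import Iterable


def section_key(section: str) -> tuple:
    """Sort key so that '6.10' orders after '6.2' (assumes dotted numeric sections)."""
    return tuple(int(part) for part in section.split("."))


def audit_sections(
    ocr_hits: Iterable[tuple],
    covered_sections: Iterable[str],
    first_page: int,
    last_page: int,
) -> dict:
    """Compare OCR-detected section numbers to a coverage list.

    ocr_hits is a hypothetical list of (scan_page, section_number) pairs.
    Only hits whose scan page falls inside [first_page, last_page] are considered.
    """
    seen = {sec for page, sec in ocr_hits if first_page <= page <= last_page}
    covered = set(covered_sections)
    return {
        # Sections the OCR pass found that the coverage map does not list.
        "missing_from_coverage": sorted(seen - covered, key=section_key),
        # Sections the coverage map lists that the OCR pass did not find.
        "not_seen_in_ocr": sorted(covered - seen, key=section_key),
    }


if __name__ == "__main__":
    hits = [(400, "6.1"), (401, "6.2"), (402, "6.10"), (500, "7.1")]
    print(audit_sections(hits, ["6.2", "6.3"], first_page=400, last_page=410))
```

Working on sets of section numbers keeps the audit output free of book text, in line with the rules of engagement for this folder.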