Codex Max Workload (High-Leverage) — Forgejo PDF Integration + PDF QA Expansion
This is the next big acceleration chunk for iftypeset.
Goal: make iftypeset visibly valuable in the real world by wiring it into the existing Forgejo PDF export worker, then deepen PDF QA so we catch real pagination failures (beyond widows/orphans).
Non-negotiables
- Do not OCR/transcribe Chicago/Bringhurst into the repo(s). Paraphrases + pointers only.
- Keep dependencies minimal. Prefer stdlib + existing tooling already present in the environment.
- Keep outputs deterministic. If a dependency/version makes determinism impossible, document it in a “trust contract” note instead of pretending.
Target repos
- iftypeset: /root/ai-workspace/iftypeset
- forgejo-pdf: /root/ai-workspace/forgejo-pdf
Part A — Integrate iftypeset CSS into Forgejo PDF export (primary goal)
A1) Understand current Forgejo PDF pipeline
In /root/ai-workspace/forgejo-pdf, identify:
- where Markdown is converted to HTML
- where CSS is selected/loaded (currently basic.css, professional.css, etc.)
- where PDF is rendered (Chromium / Playwright / Puppeteer / wkhtmltopdf / etc.)
- how configuration is passed (env vars, config file, CLI args)
Write a short “current pipeline” note in forgejo-pdf/worker/pdf/README.md (or existing docs file) so we can reason about changes later.
A2) Add a new theme option: iftypeset-web_pdf
We want Forgejo to be able to select:
- basic
- professional
- iftypeset-web_pdf (new)
Implementation constraints:
- The CSS file must ship with the worker (no network fetch at runtime).
- The CSS must load after any base CSS so it can override.
- The result should still render correctly even if some fonts are missing (font stack fallbacks).
A3) Generate and vendor the CSS from iftypeset
In iftypeset the command exists:
iftypeset emit-css --spec spec --profile web_pdf --out out-css
Pick one approach:
- Vendored CSS in forgejo-pdf (recommended for now):
  - Generate out-css/iftypeset_web_pdf.css
  - Copy into forgejo-pdf/worker/pdf/assets/css/iftypeset-web_pdf.css
  - Add a small “refresh script” in forgejo-pdf/scripts/update_iftypeset_css.sh that regenerates and copies the file.
  - Document the source-of-truth as iftypeset + the refresh script.
- Build-time generation (optional later):
  - During forgejo-pdf build, call iftypeset emit-css and bundle the output.
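The refresh script is specified as shell; the same two steps (regenerate, then copy) are sketched below in Python only so they can be exercised with a stub runner. The command and both file paths come from this brief. Running the command from the iftypeset repo root, and the function name and its `run` parameter, are assumptions made for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def refresh_css(iftypeset_root, forgejo_pdf_root, run=subprocess.run):
    """Regenerate the web_pdf CSS and copy it into the forgejo-pdf worker assets.

    `run` is injectable so the copy logic can be tested without the real CLI.
    Assumption: `--out out-css` is resolved relative to the iftypeset repo root.
    """
    iftypeset_root = Path(iftypeset_root)
    forgejo_pdf_root = Path(forgejo_pdf_root)

    # Command taken verbatim from the brief.
    run(
        ["iftypeset", "emit-css", "--spec", "spec",
         "--profile", "web_pdf", "--out", "out-css"],
        cwd=iftypeset_root,
        check=True,
    )

    src = iftypeset_root / "out-css" / "iftypeset_web_pdf.css"
    dst = (forgejo_pdf_root / "worker" / "pdf" / "assets" / "css"
           / "iftypeset-web_pdf.css")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return dst
```

Note that the generated file uses an underscore (iftypeset_web_pdf.css) while the vendored file uses a hyphen (iftypeset-web_pdf.css), so the copy step is also a rename.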
A4) Wire it into the render code path
Update the Forgejo PDF worker so that selecting the new theme:
- includes the CSS file in the HTML head
- produces PDFs/HTML without breaking existing themes
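A minimal sketch of the theme wiring, assuming the worker builds its HTML head from a list of stylesheet names. The worker's actual language and structure are not known here, so the mapping, the function name, and the choice of basic.css as the base for the new theme are all hypothetical. The point it illustrates is the ordering constraint from A2: the iftypeset CSS is emitted last so it can override the base.

```python
from html import escape

# Hypothetical theme table. Theme names come from the brief; which base
# stylesheet the new theme builds on is an assumption.
THEME_STYLESHEETS = {
    "basic": ["basic.css"],
    "professional": ["professional.css"],
    "iftypeset-web_pdf": ["basic.css", "iftypeset-web_pdf.css"],
}


def stylesheet_links(theme, css_dir="assets/css"):
    """Return <link> tags for a theme, in load order (base first, overrides last)."""
    try:
        sheets = THEME_STYLESHEETS[theme]
    except KeyError:
        raise ValueError(f"unknown theme: {theme!r}") from None

    tags = []
    for name in sheets:
        href = escape(f"{css_dir}/{name}", quote=True)
        tags.append(f'<link rel="stylesheet" href="{href}">')
    return "\n".join(tags)
```

Existing themes keep their single stylesheet, so selecting basic or professional produces the same head as before.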
A5) Tests + validation
Add a minimal smoke test in forgejo-pdf that:
- renders a small known fixture Markdown
- verifies that the resulting HTML references iftypeset-web_pdf.css
- (optional) verifies PDF generation completes
Avoid checking in large PDF binaries. Prefer small text-based assertions (e.g., HTML contains a <link> to the CSS).
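The text-based assertion could look like the following sketch, which uses only the standard library to collect stylesheet links from rendered HTML. How the test obtains the HTML from the worker is left open; the helper names are hypothetical.

```python
from html.parser import HTMLParser


class StylesheetCollector(HTMLParser):
    """Collect href values of <link rel="stylesheet"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel:
            self.hrefs.append(attr.get("href") or "")


def referenced_stylesheets(html_text):
    collector = StylesheetCollector()
    collector.feed(html_text)
    return collector.hrefs


def assert_theme_css_linked(html_text, css_name="iftypeset-web_pdf.css"):
    """Fail unless some stylesheet link ends in the given file name."""
    hrefs = referenced_stylesheets(html_text)
    assert any(h.rsplit("/", 1)[-1] == css_name for h in hrefs), hrefs
```

Parsing the tags rather than searching for the file name as a substring avoids a false pass when the name only appears in body text.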
A6) Documentation
Update both repos:
- iftypeset/forgejo/README.md: “how to enable iftypeset-web_pdf theme in forgejo-pdf”
- forgejo-pdf docs: new theme option + how to refresh CSS
Part B — Expand PDF QA beyond widows/orphans (secondary goal)
We already have PDF-aware widows/orphans via Poppler pdftotext -layout.
Add 1–2 additional PDF incident kinds without adding heavy dependencies:
- Stranded headings (most valuable)
  - detect headings at end of a page with insufficient following content (heuristic)
- Overfull/clipping approximation
  - flag suspicious long lines that exceed measure, using pdftotext -layout and profile measure targets as heuristics
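One way the stranded-heading heuristic might work, as a sketch rather than the intended design: pdftotext separates pages with a form feed, so the text can be split per page and each known heading checked for how many non-blank lines follow it on that page. Passing the heading strings in from the source Markdown, the function name, the incident fields, and the default threshold are all assumptions.

```python
def find_stranded_headings(layout_text, headings, min_following=2):
    """Flag known headings followed by too few non-blank lines on their page.

    layout_text: output of `pdftotext -layout` (pages separated by form feed).
    headings:    heading strings, assumed to come from the source Markdown.
    """
    known = {h.strip() for h in headings}
    incidents = []
    for page_no, page in enumerate(layout_text.split("\f"), start=1):
        lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
        for idx, line in enumerate(lines):
            following = len(lines) - idx - 1
            if line in known and following < min_following:
                incidents.append({
                    "kind": "stranded_heading",  # hypothetical incident kind
                    "page": page_no,
                    "snippet": line,
                    "following_lines": following,
                })
    return incidents
```

Running footers and page numbers count as following lines in this sketch, so a real implementation would likely need to filter them or raise the threshold.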
Requirements:
- Emit incidents with enough context to be actionable (page number, snippet, rule/tag).
- Update spec/quality_gates.yaml metrics if needed (or map into existing ones).
- Add fixtures + tests in iftypeset/tests/ for the new incidents.
- Update STATUS.md and docs/06-project-overview.md if behavior changes.
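To illustrate what an actionable incident might carry, here is a sketch of the overfull-line approximation that records page number, line number, snippet, and a rule tag. Expressing the profile measure as a character count, the `slack` allowance, the rule tag string, and the field names are assumptions; the real incident schema in iftypeset may differ.

```python
def find_overfull_lines(layout_text, max_chars, slack=0,
                        rule="pdf.overfull_line"):
    """Flag lines whose stripped length exceeds the measure target.

    layout_text: output of `pdftotext -layout` (pages separated by form feed).
    max_chars:   measure target in characters (assumed unit).
    slack:       extra characters tolerated before a line is flagged.
    rule:        hypothetical rule/tag attached to each incident.
    """
    limit = max_chars + slack
    incidents = []
    for page_no, page in enumerate(layout_text.split("\f"), start=1):
        for line_no, raw in enumerate(page.splitlines(), start=1):
            # -layout pads with spaces to preserve position; measure the text only.
            text = raw.strip()
            if len(text) > limit:
                incidents.append({
                    "kind": "overfull_line",
                    "rule": rule,
                    "page": page_no,
                    "line": line_no,
                    "length": len(text),
                    "limit": limit,
                    "snippet": text[:60],
                })
    return incidents
```

Character counts are only a proxy for rendered width, so this can miss clipping in wide glyph runs and over-report on tables or code blocks; treating the result as a warning rather than a hard gate may be the safer default.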
Completion checklist (what “done” looks like)
- Forgejo PDF worker can render using iftypeset-web_pdf theme.
- A refresh script exists to sync CSS from iftypeset.
- Both repos have docs explaining how to use it.
- CI stays green in both repos (or a clear note exists if one repo lacks CI).
- iftypeset PDF QA reports at least one new PDF incident kind beyond widows/orphans.
After each major chunk
- Run: bash /root/ai-workspace/iftypeset/scripts/ci.sh
- create an iftypeset checkpoint: bash /root/ai-workspace/iftypeset/scripts/checkpoint.sh "note"
- and if you touch forgejo-pdf, do the equivalent if that repo has checkpoint tooling (otherwise document git rev-parse HEAD).