# Multi-renderer Strategy (HTML→PDF adapters)

We should not bet the product on a single PDF engine. `iftypeset` should be **renderer-agnostic**: the “meaning” is in the rule registry + profiles + QA gates; the PDF renderer is an interchangeable adapter.

## Principles

- **Determinism first**: the adapter must emit `render-log.json` with engine name + version + key options.
- **No-network capable**: engines must run offline in CI where possible (for example, inside a container started with `--network=none`).
- **Graceful degradation**: if no PDF engine exists, HTML artifacts + HTML-based QA must still run.
- **Capability disclosure**: if a gate can’t be measured with an engine, report it explicitly (don’t silently pass).

## Adapter interface (contract)

All PDF engines implement the same interface:

```python
from typing import Protocol


class PdfEngine(Protocol):
    name: str

    def is_available(self) -> bool: ...
    def version(self) -> str: ...
    def render(
        self, *, html_path: str, css_path: str, assets_dir: str | None,
        out_pdf: str, options: dict,
    ) -> dict:
        """Returns a structured log: timings, warnings, engine opts, feature flags."""
        ...
```
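As an illustration of the contract, here is a minimal sketch of what a WeasyPrint adapter might look like. The class name `WeasyPrintEngine` and the keys of the returned log are assumptions made for this example, not a settled schema; the WeasyPrint calls are imported lazily so that `is_available()` works even when the package is not installed.

```python
import importlib.util
import time
from typing import Optional


class WeasyPrintEngine:
    """Hypothetical adapter that structurally satisfies the PdfEngine protocol."""

    name = "weasyprint"

    def is_available(self) -> bool:
        # Only checks that the package can be located; it is not imported here.
        return importlib.util.find_spec("weasyprint") is not None

    def version(self) -> str:
        import weasyprint  # lazy import: optional dependency

        return weasyprint.__version__

    def render(self, *, html_path: str, css_path: str, assets_dir: Optional[str],
               out_pdf: str, options: dict) -> dict:
        import weasyprint  # lazy import: optional dependency

        started = time.perf_counter()
        # base_url controls how relative asset URLs are resolved;
        # None falls back to the location of the HTML file.
        html = weasyprint.HTML(filename=html_path, base_url=assets_dir)
        html.write_pdf(out_pdf, stylesheets=[weasyprint.CSS(filename=css_path)])
        return {
            # Log keys below are illustrative assumptions.
            "engine": self.name,
            "version": self.version(),
            "options": dict(options),
            "seconds": round(time.perf_counter() - started, 3),
            "warnings": [],
        }
```

Keeping the import inside the methods means a missing engine degrades to "not available" instead of failing at module import time.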

The CLI should support:

- `--engine auto|playwright|weasyprint|prince|antenna|vivliostyle|wkhtmltopdf`
- `--engine-opts <json>`
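The two flags could be wired up roughly as follows. This is a sketch using `argparse`; the default of `auto`, the program name, and the rule that `--engine-opts` must decode to a JSON object are assumptions for illustration.

```python
import argparse
import json

ENGINES = ["auto", "playwright", "weasyprint", "prince",
           "antenna", "vivliostyle", "wkhtmltopdf"]


def parse_engine_opts(raw: str) -> dict:
    """Decode --engine-opts and insist on a JSON object (assumed rule)."""
    try:
        opts = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise argparse.ArgumentTypeError(f"invalid JSON: {exc}")
    if not isinstance(opts, dict):
        raise argparse.ArgumentTypeError("expected a JSON object")
    return opts


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="iftypeset")
    parser.add_argument("--engine", choices=ENGINES, default="auto")
    parser.add_argument("--engine-opts", type=parse_engine_opts, default={})
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.engine, args.engine_opts)
```

Validating the JSON at parse time means a malformed option string fails before any rendering work starts.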

## “Majors” to target (pragmatic)

### Tier 1 (easy to run, common)

1) **Playwright (browser-backed PDF)**
- Chromium's print-to-PDF, driven via Playwright (preferred).
- Pros: ubiquitous, good HTML/CSS coverage, easy containerization.
- Cons: paged-media features vary; footnotes/running headers are limited unless carefully built.

2) **WeasyPrint**
- Pros: pure Python workflow, good paged-media support, easy CI story.
- Cons: CSS compatibility differs; some complex layouts may need workarounds.

### Tier 2 (best print fidelity; commercial)

3) **PrinceXML**
- Pros: excellent paged media, footnotes, running headers, print-quality output.
- Cons: license cost; needs binary distribution policy.

4) **Antenna House Formatter**
- Pros: top-tier print fidelity; standards publishing; robust PDF/A options.
- Cons: license + operational complexity.

### Tier 3 (useful but limited)

5) **Vivliostyle / Paged.js**
- Pros: strong paged-media model in the web ecosystem.
- Cons: heavier runtime; often “HTML+JS render” rather than simple CLI.

6) **wkhtmltopdf**
- Pros: simple deploy story in legacy environments.
- Cons: outdated rendering model; limited CSS; not ideal for high-quality output.

## Capability matrix (what we care about)

We should encode an engine capability report (per run) for:

- paged media (margins, page size, running headers)
- hyphenation support + dictionaries
- font embedding/subsetting
- link handling (wrap/break strategy)
- footnotes (if we later support them)
- PDF/A options (later)

This capability map feeds QA:

- If the engine can’t support a gate (e.g., true widow/orphan detection on PDF), QA should:
  - run the best available approximation, and
  - mark the gate as `skipped` with a reason, not `passed`.
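A minimal sketch of how the capability map might drive gate status. The names `GateResult` and `evaluate_gate`, the capability keys, and the choice to record the approximation's verdict inside the skip reason are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    gate: str
    status: str  # "passed", "failed", or "skipped"
    reason: str = ""


def evaluate_gate(gate, required, capabilities, measure, approximate=None):
    """Run a gate only if the engine reports every required capability."""
    missing = sorted(cap for cap in required if not capabilities.get(cap, False))
    if not missing:
        return GateResult(gate, "passed" if measure() else "failed")
    # The engine cannot measure this gate: never report it as passed.
    reason = "engine lacks: " + ", ".join(missing)
    if approximate is not None:
        verdict = "ok" if approximate() else "issues found"
        reason += f"; approximation: {verdict}"
    return GateResult(gate, "skipped", reason)


if __name__ == "__main__":
    caps = {"paged_media": True, "pdf_text_geometry": False}  # hypothetical keys
    print(evaluate_gate("widows_orphans", ["pdf_text_geometry"], caps,
                        measure=lambda: True, approximate=lambda: True))
```

An unknown capability is treated the same as an unsupported one, so a gate cannot pass by omission.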

## Determinism knobs (must record)

For every PDF render, write `out/render-log.json` including:

- engine name + version
- invocation args
- environment hints (OS, locale)
- “self-contained” mode on/off
- fonts policy + resolution (requested primary fonts, what fontconfig matched, and what fonts were embedded in the PDF)
- any warnings from the engine
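The fields above could be serialized along these lines. This is a sketch only: the JSON key names, the shape of the `fonts` mapping, and the use of `LC_ALL`/`LANG` as the locale hint are assumptions, not a fixed schema.

```python
import json
import os
import platform
from pathlib import Path


def write_render_log(out_dir, *, engine, version, args, self_contained,
                     fonts, warnings):
    """Write out_dir/render-log.json and return its path."""
    log = {
        "engine": {"name": engine, "version": version},
        "args": list(args),
        "environment": {
            "os": platform.platform(),
            # Locale hint taken from the environment; may be null.
            "locale": os.environ.get("LC_ALL") or os.environ.get("LANG"),
        },
        "self_contained": bool(self_contained),
        # Expected shape (assumed): requested / matched / embedded font lists.
        "fonts": fonts,
        "warnings": list(warnings),
    }
    path = Path(out_dir) / "render-log.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys keeps the file byte-stable for identical inputs.
    path.write_text(json.dumps(log, indent=2, sort_keys=True) + "\n",
                    encoding="utf-8")
    return path
```

Sorting keys and ending with a newline keeps diffs between two runs limited to values that actually changed.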

If the engine is a browser:

- fix viewport
- disable external requests
- pin print settings (margins, background graphics, scaling)
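For the browser case, the three items above might translate into something like the following sketch using Playwright's sync API. The concrete viewport, page format, and margin values are placeholders, and the request filter (allowing only `file:`, `data:`, and `about:` URLs) is one assumed way to disable external requests.

```python
from pathlib import Path

# Placeholder values: pin whatever the profile actually requires.
VIEWPORT = {"width": 1280, "height": 720}
PDF_OPTIONS = {
    "format": "A4",
    "margin": {"top": "20mm", "right": "20mm", "bottom": "20mm", "left": "20mm"},
    "print_background": True,
    "scale": 1.0,
    "prefer_css_page_size": True,
}


def allow_request(url: str) -> bool:
    """Permit only local/inline resources; everything else is blocked."""
    return url.startswith(("file:", "data:", "about:"))


def render_with_chromium(html_path: str, out_pdf: str) -> dict:
    from playwright.sync_api import sync_playwright  # optional dependency

    def guard(route):
        if allow_request(route.request.url):
            route.continue_()
        else:
            route.abort()

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context(viewport=VIEWPORT)
        context.route("**/*", guard)  # applies to every page in the context
        page = context.new_page()
        page.goto(Path(html_path).resolve().as_uri())
        page.pdf(path=str(out_pdf), **PDF_OPTIONS)
        browser.close()
    # Return the pinned settings so they can be copied into the render log.
    return {"viewport": VIEWPORT, "pdf_options": PDF_OPTIONS}
```

Request filtering inside the browser complements, and does not replace, disabling the network at the container level.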

## Security model

- Assume untrusted Markdown input (CI context). Mitigations:
  - never execute embedded JS during HTML render (or use a hardened renderer container)
  - disable network
  - restrict filesystem access (mount only `out/` and input)
- If using headless browsers, treat them as an attack surface; run in locked-down containers.
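One way to express these mitigations is to build the container invocation in a single place. The sketch below assumes Docker as the runtime; the image name, the in-container mount points, and the trailing render command are hypothetical, and a headless browser may need further runtime-specific settings beyond what is shown.

```python
from pathlib import Path


def container_argv(image, input_dir, out_dir, command):
    """Build a `docker run` argv with no network and minimal mounts."""
    src = Path(input_dir).resolve()
    dst = Path(out_dir).resolve()
    return [
        "docker", "run", "--rm",
        "--network=none",            # no network access at all
        "--read-only",               # root filesystem is read-only
        "--tmpfs", "/tmp",           # scratch space for the engine
        "-v", f"{src}:/work/in:ro",  # input mounted read-only
        "-v", f"{dst}:/work/out",    # only out/ is writable
        image, *command,
    ]


if __name__ == "__main__":
    # Image name and command are illustrative placeholders.
    print(" ".join(container_argv("example/renderer:pinned", "docs", "out",
                                  ["render", "/work/in/index.html"])))
```

Returning an argv list instead of a shell string avoids quoting problems when paths come from untrusted input.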

## Recommended v0.1 path (fastest)

1) Implement adapters for:
   - Playwright (auto-detect)
   - WeasyPrint (if installed)
2) Keep Prince/AH as optional adapters (stub + docs) until needed.
3) Use QA gates as the real value:
   - link wrap, code/table overflow, stranded headings (HTML and PDF when possible)

This keeps delivery fast while preserving “compatible with the majors”.
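A sketch of how `--engine auto` could resolve for v0.1, including the HTML-only fallback. The function names, the preference order (Playwright before WeasyPrint), and the choice to raise on an explicitly requested but missing engine are assumptions.

```python
import importlib.util


def detect_engines() -> list:
    """Return the tier-1 engines whose Python packages can be located.

    Note: finding the playwright package does not prove that a browser
    binary has been installed; a real adapter should check that too.
    """
    return [name for name in ("playwright", "weasyprint")
            if importlib.util.find_spec(name) is not None]


def select_engine(requested, available, preference=("playwright", "weasyprint")):
    """Pick an engine name, or None to fall back to HTML-only output and QA."""
    if requested != "auto":
        if requested not in available:
            raise LookupError(f"engine not available: {requested}")
        return requested
    for name in preference:
        if name in available:
            return name
    return None  # graceful degradation: no PDF, HTML gates still run


if __name__ == "__main__":
    chosen = select_engine("auto", detect_engines())
    print(chosen or "no PDF engine found; running HTML-only QA")
```

Passing the availability list in as an argument keeps the selection logic testable without any engine installed.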

## Future: “Engine parity” testing

Once adapters exist, add an integration job that renders the same fixtures through two engines (when available) and compares:

- gate metrics (should be within thresholds)
- file size ranges
- major layout regressions (e.g., table clipping incidents)

We don’t need pixel-perfect equivalence; we need “quality gates still pass”.
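The metric comparison could be as simple as the sketch below. The gate names and threshold values are made up for the example, and treating a metric missing from either engine as a reported problem (instead of a silent pass) is an assumption in line with the capability-disclosure principle.

```python
def compare_gate_metrics(a: dict, b: dict, thresholds: dict) -> list:
    """Return human-readable parity problems between two engines' metrics."""
    problems = []
    for gate, limit in sorted(thresholds.items()):
        if gate not in a or gate not in b:
            problems.append(f"{gate}: not measured by both engines")
            continue
        delta = abs(a[gate] - b[gate])
        if delta > limit:
            problems.append(f"{gate}: delta {delta} exceeds {limit}")
    return problems


if __name__ == "__main__":
    # Hypothetical incident counts per gate for two engines.
    engine_a = {"link_wrap": 3, "table_overflow": 0}
    engine_b = {"link_wrap": 3, "table_overflow": 2}
    print(compare_gate_metrics(engine_a, engine_b,
                               {"link_wrap": 0, "table_overflow": 1}))
```

An empty list means the two renders agree within tolerance, which matches the goal of passing gates over pixel equivalence.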