Session Resilience (avoid “lost work” when chats reset)

Codex/chat sessions can lose conversation context when a connection drops. The filesystem does not: the durable source of truth is the repo.

This project adds a few boring mechanisms to make resuming work deterministic.

What to trust

Repo state: README.md, STATUS.md, docs/ are canonical.
CI: ./scripts/ci.sh is the fastest sanity check.
Artifacts: out/ contains the latest reports from CI runs.

Quick resume checklist (30 seconds)

From the repo root:

./scripts/resume.sh (recommended)
./scripts/audit.sh (raw)
./scripts/state.sh (writes docs/SESSION_STATE.md for pasteable state)
./scripts/ci.sh

If the audit output and the CI run both look sane, you’re back.

Multi-session coordination (avoid Codex waiting)

When running multiple Codex sessions in parallel, coordinate through the repo:

Task queue: docs/13-task-board.md
Session context: docs/SESSION_STATE.md
Durable restore points: docs/CHECKPOINTS.md + out/checkpoints/

Workflow:

Pick a task from docs/13-task-board.md and claim it (owner + working_set).
Stay inside your working_set to avoid conflicts.
Run ./scripts/ci.sh.
Create a restore point: ./scripts/checkpoint.sh "task <id>: <note>"
Mark the task done and paste the checkpoint reference.

If a change must block others (shared files), create a temporary lock file:

docs/LOCK_<area>.md
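One way to create such a lock without two sessions claiming it at once is to rely on the shell's noclobber option, which makes a redirect fail if the file already exists. This is a minimal sketch: `acquire_lock` and `release_lock` are hypothetical helpers, not project scripts, and the `owner:`/`created:` lines are an assumed file format. It only guards sessions that share the same working tree.

```shell
#!/bin/sh
# Hypothetical helpers for docs/LOCK_<area>.md (not part of the project's scripts).
# The lock file contents below are an assumed format.

acquire_lock() {
    area="$1"
    owner="$2"
    lock_file="docs/LOCK_${area}.md"
    # With noclobber set, '>' refuses to overwrite an existing file,
    # so only the first session to get here creates the lock.
    if ( set -o noclobber
         printf 'owner: %s\ncreated: %s\n' "$owner" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$lock_file"
       ) 2>/dev/null; then
        echo "locked ${area}"
    else
        echo "already locked: ${lock_file}" >&2
        return 1
    fi
}

release_lock() {
    # Remove the lock once the blocking change is finished.
    rm -f "docs/LOCK_${1}.md"
}
```

From the repo root, `acquire_lock renderer session-a` would create docs/LOCK_renderer.md, and `release_lock renderer` removes it; the area and owner names here are placeholders.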

Coordination contract (short)

Use this when handing off across sessions:

Claim the task in docs/13-task-board.md.
Declare the working set in docs/SESSION_STATE.md.
Lock shared areas via docs/LOCK_<area>.md if needed.
Checkpoint after changes (./scripts/checkpoint.sh "note").
Close the task and paste the checkpoint reference.

“It looks rolled back” (most common disconnect trap)

If you resume a session and it looks like files “disappeared”, it’s almost always one of:

You’re looking at a clean git checkout somewhere else (remote machine / new container) and the prior work was never committed.
You’re looking at a stale copy outside the repo (see warning below).
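
To tell these cases apart quickly, it can help to print which repo root the shell is actually in and how many paths are uncommitted. A minimal sketch using only standard git commands; `describe_checkout` is a hypothetical helper, not one of the project's scripts:

```shell
#!/bin/sh
# Hypothetical helper: report the current repo root and pending changes.

describe_checkout() {
    # Fails if the current directory is not inside any git work tree.
    root=$(git rev-parse --show-toplevel 2>/dev/null) || {
        echo "not inside a git repo: $(pwd)"
        return 1
    }
    # Each line of --porcelain output is one modified or untracked path.
    pending=$(git status --porcelain | wc -l | tr -d ' ')
    echo "repo root: $root"
    echo "uncommitted or untracked paths: $pending"
}
```

An unexpected repo root suggests a different checkout or a stale copy; the expected root with a count of zero suggests the earlier work was never saved in this checkout.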

What to do:

Run ./scripts/audit.sh and inspect the git status --porcelain output.
Check out/checkpoints/ for the latest iftypeset_checkpoint_*.tar.gz.
If you need to move the work to a new machine/session, copy the latest checkpoint tarball and extract it:
- tar -xzf out/checkpoints/iftypeset_checkpoint_<timestamp>.tar.gz -C <new_dir>

Important: checkpoints snapshot the repo tree (including untracked files). They are the durable “chat-proof” handoff mechanism even when you don’t want to commit yet.
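
The lookup and extraction steps above can be sketched as two small functions. This is a hedged example, not the project's tooling: `latest_checkpoint` and `restore_checkpoint` are hypothetical names, and the sketch assumes the `<timestamp>` part of the file name sorts chronologically when sorted as text.

```shell
#!/bin/sh
# Hypothetical helpers for working with checkpoint tarballs.

latest_checkpoint() {
    dir="${1:-out/checkpoints}"
    latest=""
    # Globs expand in sorted order, so the last match is the newest,
    # ASSUMING the timestamp in the file name sorts chronologically.
    for f in "$dir"/iftypeset_checkpoint_*.tar.gz; do
        [ -e "$f" ] && latest="$f"
    done
    # Returns non-zero (and prints nothing) when no checkpoint exists.
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

restore_checkpoint() {
    tarball="$1"
    dest="$2"
    # Extract into a fresh directory rather than over an existing tree.
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}
```

For example, `restore_checkpoint "$(latest_checkpoint)" <new_dir>` extracts the newest tarball found under out/checkpoints/.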

Create a checkpoint (2 minutes)

When you finish a meaningful chunk of work (new rule batches, QA changes, renderer changes), run:

./scripts/checkpoint.sh "what changed"

This:

runs CI and stores the CI JSON in out/checkpoints/
creates a compressed snapshot tarball in out/checkpoints/
appends a new entry to docs/CHECKPOINTS.md with the snapshot hash
writes docs/SESSION_STATE.md so the snapshot includes a pasteable “resume context”

This gives you a portable restore point even if the chat transcript is gone.

Best practices

Push to a remote early (Forgejo/GitHub). A remote is the best anti-loss mechanism.
Treat STATUS.md as the “1-page truth” for what exists and what’s next.
Don’t rely on chat logs for state; copy any critical decisions into docs/.

Stale copies outside the repo (common “it looks rolled back” trap)

If you have multiple status documents on disk, prefer the one inside the repo:

✅ canonical: ai-workspace/iftypeset/docs/09-project-status.md
⚠️ non-canonical copies (e.g., /root/docs/09-project-status.md) can drift and misreport counts.
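
A quick way to see whether a stray copy has drifted is a byte-for-byte comparison against the canonical file. A minimal sketch; `check_drift` is a hypothetical helper, not one of the project's scripts:

```shell
#!/bin/sh
# Hypothetical helper: compare a canonical file with a possibly stale copy.

check_drift() {
    canonical="$1"
    copy="$2"
    if [ ! -f "$copy" ]; then
        echo "no copy at $copy"
        return 0
    fi
    # cmp -s is silent and exits 0 only when the files are identical.
    if cmp -s "$canonical" "$copy"; then
        echo "in sync: $copy"
    else
        echo "DRIFT: $copy differs from $canonical"
        return 1
    fi
}
```

For the two paths above, that would be `check_drift ai-workspace/iftypeset/docs/09-project-status.md /root/docs/09-project-status.md`. If it reports drift, trust the copy inside the repo.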
