# IF.GOV + IF.TTT Spec — Instagram DM Draft Assistant (`@socialmediatorr`)

**Status:** proposal (POC)

**Constraint:** no paid external LLM APIs → “debates” are simulated using deterministic seats (rules) and optional local models only.

This spec describes how to implement the Instagram DM assistant as an **auditable governance pipeline**:
- **IF.GOV.TRIAGE** decides risk + route (normal vs human vs urgent).
- **IF.GOV.PANEL** simulates a multi‑seat review of the proposed draft reply (no external APIs required).
- **IF.TTT** records a chain‑of‑custody (hashes + decisions + evidence bundle) so results are provable later.

---

## 0) System boundaries (what we will and won’t do)

### In scope
- Ingest Meta webhook events for Instagram DMs.
- Produce **draft replies** (default) using templates + simple intent routing.
- Escalate a tiny fraction of DMs to a human (Sergio) quickly, with a direct “open thread” link.
- Produce IF.TTT‑style trace records and evidence bundles for audit/replay.
- Run “panel debates” **without external APIs** (rule seats + optional local model seats).

### Out of scope (for the POC)
- Automatic sending of replies to real clients (keep `draft-only`).
- Therapy-by-DM, crisis intervention, diagnosis, or medical claims.
- Storing/exporting full DM transcripts in a public repo.

---

## 1) High-level architecture

### Components
- **Webhook receiver** (already exists in production on `emo-social.infrafabric.io`): verifies Meta signature and normalizes events.
- **Event store**: append-only storage of DM events + derived decisions (local, private).
- **Triage engine** (`IF.GOV.TRIAGE`): risk + language + intent + confidence.
- **Draft engine**: chooses a reply template (Top 20) or a safe fallback.
- **Panel engine** (`IF.GOV.PANEL`): simulated debate across “seats” → approve/patch/escalate.
- **Trace recorder** (`IF.TTT`): emits signed decision records + evidence bundles.
- **Reviewer UI**: queue view for Drafts + Escalations + “open IG thread” action.
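
As a rough illustration of the webhook receiver's two jobs (verify, then normalize), the sketch below checks the `X-Hub-Signature-256` header that Meta attaches to webhook deliveries (an HMAC-SHA256 of the raw body keyed with the app secret) and flattens the payload into the fields triage consumes. The function names and the `entry[].messaging[]` payload layout are assumptions for illustration; the production receiver on `emo-social.infrafabric.io` may differ.

```python
import hashlib
import hmac


def verify_meta_signature(raw_body: bytes, signature_header: str, app_secret: str) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison; bytes are compared so odd input cannot raise.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))


def normalize_events(payload: dict) -> list[dict]:
    """Flatten a webhook payload into the fields triage expects.

    ASSUMPTION: the payload uses an entry[].messaging[] layout; adjust the
    key names if the production receiver sees a different shape.
    """
    events = []
    for entry in payload.get("entry", []):
        for item in entry.get("messaging", []):
            message = item.get("message") or {}
            events.append({
                "sender_id": item.get("sender", {}).get("id"),
                "mid": message.get("mid"),
                "timestamp_ms": item.get("timestamp"),
                "text": message.get("text", ""),  # empty text is allowed
            })
    return events
```

The signature must be computed over the raw bytes as received, before any JSON parsing, since re-serializing the payload can change the byte sequence.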

### Data flow (valid Mermaid)
```mermaid
flowchart LR
  W[Meta webhook event] --> V[Verify signature]
  V --> N[Normalize event]
  N --> ES[Event store append]
  ES --> T[IF.GOV.TRIAGE]
  T -->|urgent| E[Escalation record]
  T -->|normal| D[Draft engine]
  T -->|needs-human| H[Human-required record]
  D --> P[IF.GOV.PANEL seats]
  P --> R[Panel decision]
  E --> TR[IF.TTT trace + bundle]
  H --> TR
  R --> TR
  TR --> UI[Reviewer UI queue]
```

---

## 2) IF.GOV.TRIAGE (no external API)

### Inputs
- `sender_id` (from webhook)
- `mid` (message id)
- `timestamp_ms`
- `text` (if present; empty allowed)
- minimal thread context (last N messages for this `sender_id`, if available)

### Outputs (contract)
```json
{
  "triage_version": "if.gov.triage/igdm/v1",
  "trace_id": "uuid",
  "ts_utc": "2025-12-25T12:00:00Z",
  "time_cet": "2025-12-25T13:00:00+01:00",
  "sender_id": "123",
  "mid": "m_abc",
  "language": { "code": "es", "confidence": 0.86, "source": "text_or_thread" },
  "intent": { "label": "book|link|video|price|help|other", "confidence": 0.90 },
  "risk": {
    "tier": "normal|needs-human|urgent",
    "score": 0.05,
    "reasons": ["..."],
    "panel_size": 5
  }
}
```
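
One way to fill the `trace_id`, `ts_utc`, and `time_cet` fields of this contract from the webhook's `timestamp_ms` is sketched below. The helper name `new_trace_header` is hypothetical, and the fixed `+01:00` offset is a simplifying assumption that ignores daylight saving time; a real deployment would more likely use a named time zone.

```python
import uuid
from datetime import datetime, timedelta, timezone

# ASSUMPTION: fixed +01:00 offset (no daylight-saving handling).
CET = timezone(timedelta(hours=1), "CET")


def new_trace_header(timestamp_ms: int) -> dict:
    """Build the identifying fields shared by triage, draft and panel records."""
    # Integer division drops sub-second precision, matching the contract's format.
    utc = datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)
    return {
        "trace_id": str(uuid.uuid4()),
        "ts_utc": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "time_cet": utc.astimezone(CET).isoformat(),
    }
```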

### Triage rules (POC defaults)
- **Language detection**
  - If the message has enough text: detect language from the message text.
  - Else: reuse the last confident thread language.
  - Else: set `confidence < 0.5` and prefer a 1‑line language question.
- **Intent detection**
  - Keyword routing for: `book`, `link`, `video`, `price/cost`, `call`, `therapy`, etc.
  - If unknown: intent=`other` with low confidence.
- **Risk tier**
  - `urgent` if self-harm/suicide signals OR violence/abuse indicators.
  - `needs-human` if: therapeutic disclosure, legal threats, harassment, complex personal crisis, repeated angry loop.
  - `normal` otherwise.
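
The rules above can be sketched as three small deterministic functions. Everything concrete here is an assumption for illustration: the keyword lists, the confidence values, the `MIN_CHARS_FOR_DETECTION` threshold, the `source` labels, and the `detect` callable (a stand-in for whatever local language detector is available). The keyword lists in particular are far too small for real risk screening.

```python
import re
from typing import Callable, Optional

MIN_CHARS_FOR_DETECTION = 12  # HYPOTHETICAL threshold for "enough text"

# HYPOTHETICAL keyword lists; a real deployment needs curated, reviewed lists.
INTENT_KEYWORDS = {
    "book": {"book", "libro"},
    "link": {"link", "enlace"},
    "video": {"video"},
    "price": {"price", "cost", "precio"},
}
URGENT_PHRASES = ("suicide", "kill myself", "suicidio")
NEEDS_HUMAN_PHRASES = ("lawyer", "abogado", "sue you")


def resolve_language(text: str, thread_language: Optional[dict],
                     detect: Callable[[str], tuple]) -> dict:
    """Message text first, then last confident thread language, then low confidence."""
    if len(text.strip()) >= MIN_CHARS_FOR_DETECTION:
        code, confidence = detect(text)
        return {"code": code, "confidence": confidence, "source": "text"}
    if thread_language and thread_language["confidence"] >= 0.5:
        return {**thread_language, "source": "thread"}
    # Confidence below 0.5 tells the draft engine to ask a language question.
    return {"code": "und", "confidence": 0.0, "source": "none"}


def detect_intent(text: str) -> dict:
    """Route on whole-word keyword matches; unknown text becomes `other`."""
    tokens = set(re.findall(r"\w+", text.lower()))
    for label, keywords in INTENT_KEYWORDS.items():
        if tokens & keywords:
            return {"label": label, "confidence": 0.9}
    return {"label": "other", "confidence": 0.2}


def risk_tier(text: str) -> dict:
    """Check urgent signals first so they can never be downgraded."""
    lowered = text.lower()
    for tier, phrases in (("urgent", URGENT_PHRASES), ("needs-human", NEEDS_HUMAN_PHRASES)):
        reasons = [f"keyword:{p}" for p in phrases if p in lowered]
        if reasons:
            return {"tier": tier, "reasons": reasons}
    return {"tier": "normal", "reasons": []}
```

Matching intents on whole tokens rather than substrings avoids false hits such as `book` inside `facebook`.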

### “Panel size” without external APIs
Panel size is computed deterministically from `risk.score` (same pattern as the existing `guard_engine.py`). POC defaults per tier:
- normal: 5 seats
- needs-human: 10 seats (more checks, but still local)
- urgent: 20 seats (but action is always escalate, not debate content)
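
A minimal sketch of that mapping follows. The seat counts come from the list above; the score cut-offs are hypothetical placeholders, since this spec does not state the thresholds used by `guard_engine.py`. Calling `panel_size(0.05)` yields the 5-seat panel shown in the triage contract.

```python
PANEL_SIZE_BY_TIER = {"normal": 5, "needs-human": 10, "urgent": 20}

# HYPOTHETICAL cut-offs: replace with the values used by the real triage engine.
NEEDS_HUMAN_MIN_SCORE = 0.4
URGENT_MIN_SCORE = 0.8


def tier_from_score(score: float) -> str:
    """Map a risk score in [0, 1] to a tier using fixed thresholds."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk score out of range: {score}")
    if score >= URGENT_MIN_SCORE:
        return "urgent"
    if score >= NEEDS_HUMAN_MIN_SCORE:
        return "needs-human"
    return "normal"


def panel_size(score: float) -> int:
    """Same score in, same panel size out: no randomness, no external calls."""
    return PANEL_SIZE_BY_TIER[tier_from_score(score)]
```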

---

## 3) Draft engine (no external API)

### Principles
- Use **templates first**, not a generative model.
- Always mirror the user’s language (or ask a 1‑line language question if uncertain).
- Keep replies short; ask one clear next question when helpful.
- Never invite deep disclosure in DMs; route to “resources / call / book link”.

### Draft outputs
```json
{
  "draft_version": "igdm.draft/v1",
  "trace_id": "uuid",
  "template_id": "top20:book:v1:es",
  "text": "…",
  "placeholders": ["BOOK_LINK"],
  "notes": ["language=es", "intent=book"]
}
```
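
A template-first draft engine that emits this record could look roughly like the sketch below. The template table, the fallback texts, and the fallback `template_id` values are all hypothetical; only the record layout and the `top20:book:v1:es` identifier come from this spec. Placeholders are left unfilled in `text` and listed separately, so a reviewer can see what still needs substituting.

```python
from string import Formatter

# HYPOTHETICAL template table keyed by (intent, language).
TEMPLATES = {
    ("book", "es"): ("top20:book:v1:es", "¡Gracias por escribir! Aquí tienes el enlace: {BOOK_LINK}"),
    ("book", "en"): ("top20:book:v1:en", "Thanks for reaching out! Here is the link: {BOOK_LINK}"),
}
# HYPOTHETICAL safe fallbacks, one per supported language.
SAFE_FALLBACKS = {
    "es": ("fallback:generic:v1:es", "¡Gracias por tu mensaje! ¿En qué te puedo ayudar?"),
    "en": ("fallback:generic:v1:en", "Thanks for your message! How can I help?"),
}
LANGUAGE_QUESTION = ("fallback:language_question:v1", "¿Español o English?")


def build_draft(trace_id: str, intent: str, language: str, language_confidence: float) -> dict:
    """Pick a template, or a safe fallback, and emit the draft record."""
    if language_confidence < 0.5 or language not in SAFE_FALLBACKS:
        template_id, text = LANGUAGE_QUESTION  # uncertain language: ask first
    else:
        template_id, text = TEMPLATES.get((intent, language), SAFE_FALLBACKS[language])
    # Formatter().parse yields (literal, field_name, spec, conversion) tuples.
    placeholders = [name for _, name, _, _ in Formatter().parse(text) if name]
    return {
        "draft_version": "igdm.draft/v1",
        "trace_id": trace_id,
        "template_id": template_id,
        "text": text,
        "placeholders": placeholders,
        "notes": [f"language={language}", f"intent={intent}"],
    }
```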

---

## 4) IF.GOV.PANEL (simulated debates)

### What “debate” means here
Because we are not calling external LLMs, the “panel” is a set of **deterministic seat evaluators**.
Each seat emits:
- a vote (`approve` | `request_changes` | `veto`)
- reasons (human readable)
- patch suggestions (structured)

### Seat roster (minimum viable, 5 seats)
1) **Safety seat**: blocks crisis mishandling; ensures no harmful advice.
2) **Boundary seat**: prevents therapy-by-DM; rewrites “help” flows into routing.
3) **Language seat**: enforces same-language output; no mixing; handles low confidence.
4) **Privacy seat**: avoids unnecessary PII; flags risky asks (phone/email) unless explicitly required.
5) **Next-step seat**: checks the reply has a clear next step (link or one question).

Optional seats (when panel size grows):
- **Tone/VoiceDNA seat**: checks length + emoji pattern + directness vs DM voice rules.
- **Spam/abuse seat**: detects harassment loops and routes to block/report guidance.
- **Contrarian seat**: tries to misread the message and see if the draft fails.

### Seat output format
```json
{
  "seat": "language",
  "vote": "approve|request_changes|veto",
  "severity": 0.0,
  "reasons": ["..."],
  "patches": [
    { "op": "replace_text", "path": "draft.text", "value": "..." }
  ]
}
```
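
To make the idea of a deterministic seat concrete, here is a sketch of two of the five seats emitting this format. The regular expressions, severity values, and seat identifiers are illustrative assumptions, not the project's actual rules.

```python
import re
from dataclasses import asdict, dataclass, field


@dataclass
class SeatResult:
    seat: str
    vote: str = "approve"  # approve | request_changes | veto
    severity: float = 0.0
    reasons: list = field(default_factory=list)
    patches: list = field(default_factory=list)


# HYPOTHETICAL patterns: a URL or an unfilled {PLACEHOLDER} counts as a link.
LINK_PATTERN = re.compile(r"https?://|\{[A-Z_]+\}")
PII_ASK_PATTERN = re.compile(r"\b(phone|email|e-mail|teléfono|correo)\b", re.IGNORECASE)


def next_step_seat(draft_text: str) -> SeatResult:
    """Approve only if the reply offers a link or asks exactly one question."""
    if LINK_PATTERN.search(draft_text) or draft_text.count("?") == 1:
        return SeatResult("next_step", reasons=["clear next step present"])
    return SeatResult("next_step", vote="request_changes", severity=0.3,
                      reasons=["no link and not exactly one question"])


def privacy_seat(draft_text: str) -> SeatResult:
    """Flag drafts that ask the user for contact details."""
    match = PII_ASK_PATTERN.search(draft_text)
    if match:
        return SeatResult("privacy", vote="request_changes", severity=0.5,
                          reasons=[f"draft asks for PII: {match.group(0).lower()}"])
    return SeatResult("privacy", reasons=["no PII request found"])
```

`asdict(next_step_seat(text))` produces a plain dict with the same keys as the JSON format above, ready to store in the panel record.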

### Panel aggregation (deterministic)
- If any seat returns `veto` → panel decision becomes `escalate_human` (or `urgent_escalate`).
- Else if any seat returns `request_changes` → apply patches (in order), re-run seats once.
- Else → approve.
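
The aggregation rules above can be sketched as a single function. Three points are assumptions, because the rules leave them open: a draft that still draws `request_changes` after the single re-run is escalated to a human, a draft that passes only after patching is reported as `revise_draft`, and `urgent_escalate` is chosen over `escalate_human` via an `urgent` flag supplied by the caller. Seats are modelled as plain callables returning seat-output dicts.

```python
import hashlib
from typing import Callable

Seat = Callable[[str], dict]  # takes draft text, returns a seat output dict


def apply_patches(text: str, patches: list) -> str:
    """Apply `replace_text` patches in order; other ops are ignored in this sketch."""
    for patch in patches:
        if patch.get("op") == "replace_text" and patch.get("path") == "draft.text":
            text = patch["value"]
    return text


def run_panel(trace_id: str, draft_text: str, seats: list, urgent: bool = False) -> dict:
    results = [seat(draft_text) for seat in seats]
    votes = {r["vote"] for r in results}
    patched = False
    if "veto" not in votes and "request_changes" in votes:
        for result in results:
            draft_text = apply_patches(draft_text, result.get("patches", []))
        patched = True
        results = [seat(draft_text) for seat in seats]  # the single re-run
        votes = {r["vote"] for r in results}

    if "veto" in votes:
        decision = "urgent_escalate" if urgent else "escalate_human"
    elif "request_changes" in votes:
        decision = "escalate_human"  # ASSUMPTION: unresolved after one re-run
    else:
        decision = "revise_draft" if patched else "approve_draft"  # ASSUMPTION
    return {
        "panel_version": "if.gov.panel/igdm/v1",
        "trace_id": trace_id,
        "panel_size": len(seats),
        "seats": results,
        "decision": decision,
        "final_draft_text_sha256": hashlib.sha256(draft_text.encode("utf-8")).hexdigest(),
        "reason_summary": f"{decision}; votes={sorted(votes)}",
    }
```

Capping the loop at one re-run keeps the panel bounded: two seats that keep patching each other's output cannot cycle indefinitely.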

### Panel decision record
```json
{
  "panel_version": "if.gov.panel/igdm/v1",
  "trace_id": "uuid",
  "panel_size": 5,
  "seats": [ { "...": "..." } ],
  "decision": "approve_draft|revise_draft|escalate_human|urgent_escalate",
  "final_draft_text_sha256": "…",
  "reason_summary": "short"
}
```

---

## 5) Escalation UX (how Sergio actually sees it)

### Escalation record
```json
{
  "escalation_version": "igdm.escalation/v1",
  "trace_id": "uuid",
  "tier": "urgent|needs-human",
  "reason_codes": ["self_harm_signal"],
  "sender_id": "123",
  "mid": "m_abc",
  "time_cet": "2025-12-25T21:13:00+01:00",
  "open_links": {
    "instagram_thread": "https://www.instagram.com/direct/t/<conversation_id>/",
    "fb_inbox": "https://business.facebook.com/latest/inbox/all/?asset_id=<page_id>"
  }
}
```

### Notification strategy (POC)
No paid services required:
- Show escalations in a **logged-in dashboard** on `emo-social.infrafabric.io`.
- Optional: email later (requires SMTP relay configured); not required for the POC.

---

## 6) IF.TTT trace + evidence bundles (provable without leaking)

### Two-bundle approach (recommended)
- **Private bundle** (internal): includes raw message text, stored locally with strict permissions.
- **Public bundle** (shareable): contains hashes + redacted previews only.

### Bundle contents (public)
```
bundle/
  manifest.json
  event.json
  triage.json
  draft.json
  panel.json
  escalation.json   (only if escalated)
  sha256sums.txt
  signature_ed25519.txt
```
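
A minimal sketch of assembling such a public bundle in memory is shown below. It redacts the raw message text into a SHA-256 commitment, serializes each record as canonical JSON, and writes `sha256sums.txt` in the two-space format that `sha256sum -c` accepts. The manifest layout and function names are assumptions, and the Ed25519 signature step is omitted because it needs a library outside the Python standard library.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonical_json(record: dict) -> bytes:
    """Serialize with sorted keys so equal records always hash identically."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def redact_event(event: dict) -> dict:
    """Replace raw message text with its SHA-256 commitment."""
    public = {key: value for key, value in event.items() if key != "text"}
    public["message_text_sha256"] = sha256_hex(event.get("text", "").encode("utf-8"))
    return public


def build_public_bundle(records: dict) -> dict:
    """Map file names to bytes; `records` maps names like "triage.json" to dicts."""
    files = {name: canonical_json(record) for name, record in records.items()}
    manifest = {"files": sorted(files)}  # HYPOTHETICAL manifest layout
    files["manifest.json"] = canonical_json(manifest)
    # One "<digest>  <name>" line per file, sorted for a stable result.
    lines = [f"{sha256_hex(data)}  {name}" for name, data in sorted(files.items())]
    files["sha256sums.txt"] = ("\n".join(lines) + "\n").encode("utf-8")
    return files
```

Because serialization is canonical and the file order is sorted, rebuilding the bundle from the same records yields byte-identical output, which is what makes a later signature over `sha256sums.txt` meaningful.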

### Minimum “public” fields
- `message_text_sha256` (not raw)
- `draft_text_sha256` (not raw)
- triage + panel decision + reason codes
- timestamps (UTC + CET)

This is enough to prove: “given these bytes (committed), these deterministic governance steps happened, and this decision was produced”.

---

## 7) Rollout plan (safe)

1) **Triage-only** + escalation queue (no drafts yet).
2) **Draft-only** templates for Top 20 intents (no sending).
3) Add simulated **IF.GOV.PANEL** seats and store panel decisions.
4) Emit IF.TTT bundles for each event (public + private).
5) Add comparison table: `draft` vs `actual sent` (manual) to measure quality.
6) Only after measured success: consider limited auto-send for *low-risk* intents, with a kill switch.