danny 98150f5631 IGDM shadow: add Business Login connect flow

2025-12-29 17:36:39 +00:00


IF.GOV + IF.TTT Spec — Instagram DM Draft Assistant (`@socialmediatorr`)

Status: proposal (POC)
Constraint: no paid external LLM APIs → “debates” are simulated using deterministic seats (rules) and optional local models only.

This spec describes how to implement the Instagram DM assistant as an auditable governance pipeline:

IF.GOV.TRIAGE decides risk + route (normal vs human vs urgent).
IF.GOV.PANEL simulates a multi‑seat review of the proposed draft reply (no external APIs required).
IF.TTT records a chain‑of‑custody (hashes + decisions + evidence bundle) so results are provable later.

0) System boundaries (what we will and won’t do)

In scope

Ingest Meta webhook events for Instagram DMs.
Produce draft replies (default) using templates + simple intent routing.
Escalate a tiny fraction of DMs to a human (Sergio) quickly, with a direct “open thread” link.
Produce IF.TTT‑style trace records and evidence bundles for audit/replay.
Run “panel debates” without external APIs (rule seats + optional local model seats).

Out of scope (for the POC)

Automatic sending of replies to real clients (keep draft-only).
Therapy-by-DM, crisis intervention, diagnosis, or medical claims.
Storing/exporting full DM transcripts in a public repo.

1) High-level architecture

Components

Webhook receiver (already exists in production on emo-social.infrafabric.io): verifies Meta signature and normalizes events.
Event store: append-only storage of DM events + derived decisions (local, private).
Triage engine (IF.GOV.TRIAGE): risk + language + intent + confidence.
Draft engine: chooses a reply template (Top 20) or a safe fallback.
Panel engine (IF.GOV.PANEL): simulated debate across “seats” → approve/patch/escalate.
Trace recorder (IF.TTT): emits signed decision records + evidence bundles.
Reviewer UI: queue view for Drafts + Escalations + “open IG thread” action.
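
As a rough illustration of the signature check done by the webhook receiver, the sketch below assumes Meta's `X-Hub-Signature-256` header, which carries `sha256=` followed by an HMAC-SHA256 of the raw request body keyed with the app secret. The function name and parameters are hypothetical and are not taken from the production receiver.

```python
import hashlib
import hmac


def verify_meta_signature(app_secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Return True if the signature header matches the raw request body.

    Hypothetical helper: the header format ("sha256=<hex digest>") is assumed.
    """
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison on bytes avoids timing side channels.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check must run on the exact bytes received, before any JSON parsing or re-serialization, otherwise the digest will not match.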

Data flow (valid Mermaid)

flowchart LR
  W[Meta webhook event] --> V[Verify signature]
  V --> N[Normalize event]
  N --> ES[Event store append]
  ES --> T[IF.GOV.TRIAGE]
  T -->|urgent| E[Escalation record]
  T -->|normal| D[Draft engine]
  T -->|needs-human| H[Human-required record]
  D --> P[IF.GOV.PANEL seats]
  P --> R[Panel decision]
  E --> TR[IF.TTT trace + bundle]
  H --> TR
  R --> TR
  TR --> UI[Reviewer UI queue]

2) IF.GOV.TRIAGE (no external API)

Inputs

sender_id (from webhook)
mid (message id)
timestamp_ms
text (if present; empty allowed)
minimal thread context (last N messages for this sender_id, if available)

Outputs (contract)

{
  "triage_version": "if.gov.triage/igdm/v1",
  "trace_id": "uuid",
  "ts_utc": "2025-12-25T12:00:00Z",
  "time_cet": "2025-12-25T13:00:00+01:00",
  "sender_id": "123",
  "mid": "m_abc",
  "language": { "code": "es", "confidence": 0.86, "source": "text_or_thread" },
  "intent": { "label": "book|link|video|price|help|other", "confidence": 0.90 },
  "risk": {
    "tier": "normal|needs-human|urgent",
    "score": 0.05,
    "reasons": ["..."],
    "panel_size": 5
  }
}

Triage rules (POC defaults)

Language detection
- If message has enough text: detect language from message text.
- Else: reuse last confident thread language.
- Else: set confidence below 0.5 and prefer a one-line question asking which language the user wants.
Intent detection
- Keyword routing for: book, link, video, price/cost, call, therapy, etc.
- If unknown: intent=other with low confidence.
Risk tier
- urgent if self-harm/suicide signals OR violence/abuse indicators.
- needs-human if: therapeutic disclosure, legal threats, harassment, complex personal crisis, repeated angry loop.
- normal otherwise.
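
The triage rules above could be sketched roughly as follows. All keyword patterns, the `MIN_TEXT_CHARS` threshold, the confidence values, and the function names are illustrative assumptions; only the tier names, intent labels, and the `self_harm_signal` reason code come from this spec. The language detector is passed in as a callable so that no particular library is assumed, and `risk.score` is not computed here.

```python
from __future__ import annotations

import re
from typing import Callable, Optional

# Hypothetical patterns for illustration; real lists would be multilingual and human-reviewed.
URGENT_PATTERNS = {
    "self_harm_signal": r"\b(suicid\w*|kill myself|self[- ]harm)\b",
    "violence_signal": r"\b(hits me|abus\w*)\b",
}
NEEDS_HUMAN_PATTERNS = {
    "legal_threat": r"\b(lawyer|lawsuit|sue you)\b",
    "therapeutic_disclosure": r"\b(panic attack|my trauma|depressed)\b",
}
INTENT_KEYWORDS = {
    "book": ["book", "appointment", "reservar"],
    "link": ["link", "enlace"],
    "video": ["video"],
    "price": ["price", "cost", "precio"],
    "help": ["help", "ayuda"],
}
MIN_TEXT_CHARS = 12  # assumed cutoff for "enough text" to detect a language


def classify_risk(text: str) -> dict:
    """Check urgent patterns first, then needs-human, else normal."""
    lowered = text.lower()
    for tier, patterns in (("urgent", URGENT_PATTERNS), ("needs-human", NEEDS_HUMAN_PATTERNS)):
        reasons = [code for code, pat in patterns.items() if re.search(pat, lowered)]
        if reasons:
            return {"tier": tier, "reasons": reasons}
    return {"tier": "normal", "reasons": []}


def detect_intent(text: str) -> dict:
    """Whole-word keyword routing; unknown messages become 'other' with low confidence."""
    lowered = text.lower()
    for label, words in INTENT_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(w)}\b", lowered) for w in words):
            return {"label": label, "confidence": 0.9}
    return {"label": "other", "confidence": 0.3}


def resolve_language(
    text: str,
    detect: Callable[[str], tuple[str, float]],
    thread_language: Optional[tuple[str, float]],
) -> dict:
    """Message text first, then last confident thread language, else low confidence."""
    if len(text.strip()) >= MIN_TEXT_CHARS:
        code, confidence = detect(text)
        return {"code": code, "confidence": confidence, "source": "text"}
    if thread_language is not None:
        code, confidence = thread_language
        return {"code": code, "confidence": confidence, "source": "thread"}
    return {"code": "und", "confidence": 0.0, "source": "none"}
```

Checking the urgent patterns before the needs-human ones means a message that matches both is always routed to the stricter tier.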

“Panel size” without external APIs

Panel size is computed deterministically from the triage risk output (same pattern as the existing guard_engine.py), with one fixed size per risk tier:

normal: 5 seats
needs-human: 10 seats (more checks, but still local)
urgent: 20 seats (but action is always escalate, not debate content)
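
A minimal sketch of that mapping, assuming the tier string from the triage output is the lookup key. The names `PANEL_SIZE_BY_TIER` and `panel_size` are hypothetical and are not taken from guard_engine.py.

```python
# Seat counts per risk tier, as listed in this spec.
PANEL_SIZE_BY_TIER = {
    "normal": 5,
    "needs-human": 10,
    "urgent": 20,
}


def panel_size(tier: str) -> int:
    """Return the seat count for a risk tier; unknown tiers are rejected."""
    try:
        return PANEL_SIZE_BY_TIER[tier]
    except KeyError:
        raise ValueError(f"unknown risk tier: {tier!r}") from None
```

Rejecting unknown tiers rather than defaulting to 5 seats keeps a typo in the triage output from silently shrinking the review.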

3) Draft engine (no external API)

Principles

Use templates first, not a generative model.
Always mirror the user’s language (or ask a 1‑line language question if uncertain).
Keep replies short; ask one clear next question when helpful.
Never invite deep disclosure in DMs; route to “resources / call / book link”.

Draft outputs

{
  "draft_version": "igdm.draft/v1",
  "trace_id": "uuid",
  "template_id": "top20:book:v1:es",
  "text": "…",
  "placeholders": ["BOOK_LINK"],
  "notes": ["language=es", "intent=book"]
}
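
A template-first draft engine along these lines might look like the sketch below. The template texts, the `fallback:*` template ids, and the `choose_draft` function are invented for illustration; only the record fields and the `top20:book:v1:es` id come from this spec.

```python
from __future__ import annotations

import re

# Hypothetical template table keyed by (intent, language code); texts are sample wording only.
TEMPLATES = {
    ("book", "es"): {
        "template_id": "top20:book:v1:es",
        "text": "Puedes reservar aquí: {BOOK_LINK}",
    },
    ("other", "es"): {
        "template_id": "fallback:other:v1:es",
        "text": "Gracias por escribir. ¿En qué te puedo ayudar?",
    },
}
# Used when the language is uncertain (confidence below 0.5).
LANGUAGE_QUESTION = {
    "template_id": "fallback:language_question:v1",
    "text": "¿Español or English?",
}


def choose_draft(trace_id: str, intent: str, language: str, language_confidence: float) -> dict:
    """Pick a template for (intent, language) or fall back to a safe default."""
    if language_confidence < 0.5:
        template = LANGUAGE_QUESTION
    else:
        template = (
            TEMPLATES.get((intent, language))
            or TEMPLATES.get(("other", language))
            or LANGUAGE_QUESTION
        )
    return {
        "draft_version": "igdm.draft/v1",
        "trace_id": trace_id,
        "template_id": template["template_id"],
        "text": template["text"],
        # Placeholders are left unfilled so a reviewer can see what will be substituted.
        "placeholders": re.findall(r"\{([A-Z_]+)\}", template["text"]),
        "notes": [f"language={language}", f"intent={intent}"],
    }
```

For example, `choose_draft("t1", "book", "es", 0.86)` selects `top20:book:v1:es` and reports `BOOK_LINK` as its only placeholder.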

4) IF.GOV.PANEL (simulated debates)

What “debate” means here

Because we are not calling external LLMs, the “panel” is a set of deterministic seat evaluators. Each seat emits:

a vote (approve | request_changes | veto)
reasons (human readable)
patch suggestions (structured)

Seat roster (minimum viable, 5 seats)

Safety seat: blocks crisis mishandling; ensures no harmful advice.
Boundary seat: prevents therapy-by-DM; rewrites “help” flows into routing.
Language seat: enforces same-language output; no mixing; handles low confidence.
Privacy seat: avoids unnecessary PII; flags risky asks (phone/email) unless explicitly required.
Next-step seat: checks the reply has a clear next step (link or one question).

Optional seats (when panel size grows)

Tone/VoiceDNA seat: checks length + emoji pattern + directness vs DM voice rules.
Spam/abuse seat: detects harassment loops and routes to block/report guidance.
Contrarian seat: deliberately misreads the message and checks whether the draft still holds up under that reading.

Seat output format

{
  "seat": "language",
  "vote": "approve|request_changes|veto",
  "severity": 0.0,
  "reasons": ["..."],
  "patches": [
    { "op": "replace_text", "path": "draft.text", "value": "..." }
  ]
}

Panel aggregation (deterministic)

If any seat returns veto → panel decision becomes escalate_human (or urgent_escalate).
Else if any seat returns request_changes → apply patches (in order), re-run seats once.
Else → approve.
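
The aggregation rules can be sketched as a small loop over seat callables. Two points are assumptions not stated in this spec: a draft that passes only after patching is reported as `revise_draft`, and a draft that still receives `request_changes` after the single re-run is escalated to a human. The `run_panel`, `apply_patches`, and `next_step_seat` names are hypothetical.

```python
from __future__ import annotations

import hashlib
from typing import Callable

Seat = Callable[[dict], dict]  # takes a draft dict, returns a seat output dict


def apply_patches(draft: dict, patches: list[dict]) -> dict:
    """Apply replace_text patches in order; other ops are ignored in this sketch."""
    patched = dict(draft)
    for patch in patches:
        if patch.get("op") == "replace_text" and patch.get("path") == "draft.text":
            patched["text"] = patch["value"]
    return patched


def run_panel(trace_id: str, draft: dict, seats: list[Seat], urgent: bool = False) -> dict:
    decision = "escalate_human"  # assumed outcome if changes are still requested after the re-run
    results: list[dict] = []
    for attempt in range(2):  # initial pass plus one re-run
        results = [seat(draft) for seat in seats]
        votes = {r["vote"] for r in results}
        if "veto" in votes:
            decision = "urgent_escalate" if urgent else "escalate_human"
            break
        if "request_changes" not in votes:
            decision = "approve_draft" if attempt == 0 else "revise_draft"
            break
        if attempt == 0:
            patches = [p for r in results for p in r.get("patches", [])]  # seat order
            draft = apply_patches(draft, patches)
    return {
        "panel_version": "if.gov.panel/igdm/v1",
        "trace_id": trace_id,
        "panel_size": len(seats),
        "seats": results,
        "decision": decision,
        "final_draft_text_sha256": hashlib.sha256(draft["text"].encode("utf-8")).hexdigest(),
    }


def next_step_seat(draft: dict) -> dict:
    """Example seat: require a link placeholder or a question in the reply."""
    text = draft["text"]
    if "{" in text or "?" in text:
        return {"seat": "next_step", "vote": "approve", "severity": 0.0, "reasons": [], "patches": []}
    patch = {"op": "replace_text", "path": "draft.text", "value": text + " {BOOK_LINK}"}
    return {"seat": "next_step", "vote": "request_changes", "severity": 0.3,
            "reasons": ["no clear next step"], "patches": [patch]}
```

Because every seat is a pure function of the draft, re-running the panel on the same input reproduces the same decision, which is what makes the recorded hash meaningful.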

Panel decision record

{
  "panel_version": "if.gov.panel/igdm/v1",
  "trace_id": "uuid",
  "panel_size": 5,
  "seats": [ { "...": "..." } ],
  "decision": "approve_draft|revise_draft|escalate_human|urgent_escalate",
  "final_draft_text_sha256": "…",
  "reason_summary": "short"
}

5) Escalation UX (how Sergio actually sees it)

Escalation record

{
  "escalation_version": "igdm.escalation/v1",
  "trace_id": "uuid",
  "tier": "urgent|needs-human",
  "reason_codes": ["self_harm_signal"],
  "sender_id": "123",
  "mid": "m_abc",
  "time_cet": "2025-12-25T21:13:00+01:00",
  "open_links": {
    "instagram_thread": "https://www.instagram.com/direct/t/<conversation_id>/",
    "fb_inbox": "https://business.facebook.com/latest/inbox/all/?asset_id=<page_id>"
  }
}

Notification strategy (POC)

No paid services required:

Show escalations in a logged-in dashboard on emo-social.infrafabric.io.
Optional: email later (requires SMTP relay configured); not required for the POC.

6) IF.TTT trace + evidence bundles (provable without leaking)

Two-bundle approach (recommended)

Private bundle (internal): includes raw message text, stored locally with strict permissions.
Public bundle (shareable): contains hashes + redacted previews only.

Bundle contents (public)

bundle/
  manifest.json
  event.json
  triage.json
  draft.json
  panel.json
  escalation.json   (only if escalated)
  sha256sums.txt
  signature_ed25519.txt

Minimum “public” fields

message_text_sha256 (not raw)
draft_text_sha256 (not raw)
triage + panel decision + reason codes
timestamps (UTC + CET)

This is enough to prove: “given these bytes (committed), these deterministic governance steps happened, and this decision was produced”.
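
As an illustration of the hashing side of the public bundle, the sketch below replaces raw text with its SHA-256 and builds a `sha256sums.txt` body. The canonical JSON encoding (sorted keys, compact separators) and all function names are assumptions; the Ed25519 signature step is left out because it needs a third-party library.

```python
from __future__ import annotations

import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonical_json(obj: dict) -> bytes:
    """Serialize with sorted keys so the same record always yields the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def public_event(event: dict) -> dict:
    """Drop the raw message text and keep only its hash."""
    public = {k: v for k, v in event.items() if k != "text"}
    public["message_text_sha256"] = sha256_hex(event.get("text", "").encode("utf-8"))
    return public


def build_sha256sums(files: dict[str, bytes]) -> str:
    """Produce '<hex digest>  <file name>' lines, sorted by file name."""
    lines = [f"{sha256_hex(data)}  {name}" for name, data in sorted(files.items())]
    return "\n".join(lines) + "\n"
```

A verifier holding the private bundle can recompute `message_text_sha256` from the raw text and compare it with the public record, without the public bundle ever exposing the message.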

7) Rollout plan (safe)

1. Triage-only + escalation queue (no drafts yet).
2. Draft-only templates for Top 20 intents (no sending).
3. Add simulated IF.GOV.PANEL seats and store panel decisions.
4. Emit IF.TTT bundles for each event (public + private).
5. Add a comparison table of each draft vs. the reply actually sent (entered manually) to measure quality.
6. Only after measured success: consider limited auto-send for low-risk intents, with a kill switch.
