
# Emo-Social Insta DM Agent

This project contains the **Emo-Social Instagram DM agent** for `@socialmediatorr`. It is separate from the existing Sergio RAG DB agent.

Includes:
- Meta Graph API exporter (full DM history)
- Instagram “Download your information” importer
- DM analysis pipeline (bot-vs-human, conversions, objections, rescue logic, product eras)
- Helpers for device login + page subscription + sending replies

## Where this runs

Build and run the Docker image in `pct 250` (ai-dev). Keep production (`pct 220`) clean: do not build or experiment there.

## Credentials

Source-of-truth credentials file (on the host):

- `/root/tmp/emo-social-meta-app-creds.txt`

Inside `pct 250`, place the same file at the same path:

- `/root/tmp/emo-social-meta-app-creds.txt`

Required for export:

- `META_PAGE_ID`
- `META_PAGE_ACCESS_TOKEN`
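
A quick pre-flight check can confirm that both keys are present before starting an export. The sketch below assumes the credentials file uses plain `KEY=VALUE` lines, which this README does not state; `parse_creds` and `missing_export_keys` are hypothetical helper names, not part of the package.

```python
# Assumption: the creds file holds KEY=VALUE lines (format not documented here).
REQUIRED_FOR_EXPORT = ("META_PAGE_ID", "META_PAGE_ACCESS_TOKEN")


def parse_creds(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    creds: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        creds[key.strip()] = value.strip().strip('"').strip("'")
    return creds


def missing_export_keys(creds: dict[str, str]) -> list[str]:
    """Return the required export keys that are absent or empty."""
    return [key for key in REQUIRED_FOR_EXPORT if not creds.get(key)]
```

Reading `/root/tmp/emo-social-meta-app-creds.txt`, passing its text to `parse_creds`, and printing only the result of `missing_export_keys` reports key names without ever echoing token values.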

## Build (pct 250)

- `docker build -t emo-social-insta-dm-agent:dev /root/ai-workspace/emo-social-insta-dm-agent`

Note: in this LXC, `docker run` / `podman run` can fail due to AppArmor confinement. Use the **direct Python** commands below unless you change the LXC config to be AppArmor-unconfined.

## Obtain tokens (pct 250)

`meta_device_login` requires `META_CLIENT_TOKEN` (from Meta app dashboard → Settings → Advanced → Client token).
Alternatively, set `META_CLIENT_ACCESS_TOKEN` as `APP_ID|CLIENT_TOKEN` to override `META_APP_ID` for device-login only.
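
The precedence between these variables can be sketched as follows. This is one plausible reading of the rule above, not the script's actual code: `resolve_client_access_token` is a hypothetical name, and the validation is an assumption.

```python
def resolve_client_access_token(env: dict[str, str]) -> str:
    """Return the APP_ID|CLIENT_TOKEN string used for device login.

    Assumed precedence: META_CLIENT_ACCESS_TOKEN wins if set; otherwise the
    value is composed from META_APP_ID and META_CLIENT_TOKEN.
    """
    override = env.get("META_CLIENT_ACCESS_TOKEN", "").strip()
    if override:
        app_part, sep, token_part = override.partition("|")
        if not (app_part and sep and token_part):
            raise ValueError("META_CLIENT_ACCESS_TOKEN must look like APP_ID|CLIENT_TOKEN")
        return override

    app_id = env.get("META_APP_ID", "").strip()
    client_token = env.get("META_CLIENT_TOKEN", "").strip()
    if not app_id or not client_token:
        raise ValueError("set META_APP_ID and META_CLIENT_TOKEN, or META_CLIENT_ACCESS_TOKEN")
    return f"{app_id}|{client_token}"
```

Passing `dict(os.environ)` or the parsed credentials file as `env` keeps the function free of side effects and easy to test.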

If `meta_device_login start` returns OAuth error `(#3) enable "Login from Devices"`, enable it in the *same app id* the script prints:

- `https://developers.facebook.com/apps/<APP_ID>/fb-login/settings/` → set **Login from Devices** = Yes

Start device login (prints a URL + code to enter; no tokens are printed):

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login start`

After authorizing in the browser, poll and write `META_PAGE_ID` + `META_PAGE_ACCESS_TOKEN` into `/root/tmp/emo-social-meta-app-creds.txt`:

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login poll --write-page-token --target-ig-user-id 17841466913731557`

### Fallback: Graph API Explorer → user token → page token

If device-login is blocked, get a **Facebook User access token** via Graph API Explorer and derive the Page token:

- Open: `https://developers.facebook.com/tools/explorer/`
- Select the correct app, then generate a user token with:
  - `pages_show_list,pages_read_engagement,instagram_manage_messages`
- Save the user token to (keep mode `600`):
  - `/root/tmp/meta-user-access-token.txt`
- Then write `META_PAGE_ID` + `META_PAGE_ACCESS_TOKEN` into `/root/tmp/emo-social-meta-app-creds.txt` (no token values printed):
  - `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_page_token_from_user_token --target-ig-user-id 17841466913731557`
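
For orientation, the derivation step can be pictured as selecting one Page from the Graph API `GET /me/accounts` response when it is requested with `fields=id,access_token,instagram_business_account`. Whether the helper uses exactly this call is an assumption; `pick_page_for_ig_user` is a hypothetical name used only for this sketch.

```python
def pick_page_for_ig_user(accounts_payload: dict, target_ig_user_id: str) -> tuple[str, str]:
    """Return (page_id, page_access_token) for the Page linked to the IG user.

    `accounts_payload` is the decoded JSON of a /me/accounts response that
    includes the `instagram_business_account` field for each Page.
    """
    for page in accounts_payload.get("data", []):
        ig_account = page.get("instagram_business_account") or {}
        if str(ig_account.get("id")) == str(target_ig_user_id):
            return str(page["id"]), page["access_token"]
    raise LookupError(f"no Page is linked to IG user {target_ig_user_id}")
```

The two returned values correspond to `META_PAGE_ID` and `META_PAGE_ACCESS_TOKEN`.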

## Export history (pct 250)

Quick token sanity check (does not print token values):

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_token_doctor`

Export to a mounted directory so results persist on disk:

- `docker run --rm -v /root/tmp:/root/tmp emo-social-insta-dm-agent:dev python -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history`
  - (Direct) `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history`

### Workaround: fetch a single thread by `user_id`

If `/conversations` listing times out, you can still fetch a single thread if you know the sender’s `user_id` (from webhook `sender.id`):

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.fetch_thread_messages --user-id <SENDER_ID> --max-pages 2 --out /root/tmp/thread.jsonl`
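
To find a usable `<SENDER_ID>`, the sender ids can be pulled out of a stored webhook payload. The sketch below assumes the standard Messenger-style layout (`entry[].messaging[].sender.id`); `sender_ids` and `own_ids` are hypothetical names for illustration.

```python
def sender_ids(payload: dict, own_ids: frozenset[str] = frozenset()) -> list[str]:
    """Collect unique sender ids from a webhook payload, in first-seen order.

    `own_ids` lists ids belonging to the account itself, so echoes of
    outgoing messages are not mistaken for customers.
    """
    found: list[str] = []
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            sender = str((event.get("sender") or {}).get("id", ""))
            if sender and sender not in own_ids and sender not in found:
                found.append(sender)
    return found
```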

Small test run of the full export (limits the number of conversations and pages):

- `docker run --rm -v /root/tmp:/root/tmp emo-social-insta-dm-agent:dev python -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history --max-conversations 3 --max-pages 2`
  - (Direct) `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history --max-conversations 3 --max-pages 2`

## Full history fallback (Instagram export)

If Meta app review blocks Graph API DM access, export Sergio’s IG data via Instagram “Download your information” and import it:

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.import_instagram_export --input /path/to/instagram-export.zip --out /root/tmp/emo-social-insta-dm-export-history`

## Analyze DM history (Behavioral Cloning / “Biographical Sales”)

This produces the “Sergio persona” artifacts needed for the DM agent:

- Separates frequent `[BOT]` templates vs rare `[MANUAL]` replies (plus `[HYBRID]`).
- Builds **bot fatigue** + **script editorial timeline** charts.
- Extracts **training pairs** (user → Sergio manual reply) from converted threads.
- Generates **rescue playbook** (human saves after silence/negative sentiment).
- Generates **objection handlers** (price/time/trust/stop → best replies).
- Builds a quarterly **eras** CSV (offers/pricing + vocabulary drift).
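
The `[BOT]` versus `[MANUAL]` split can be illustrated with a simple frequency rule: normalize each outgoing message, count repeats, and treat frequently repeated text as a template. This is a minimal sketch of the idea, not the analyzer's actual logic; the normalization steps, the `min_repeats` threshold, and the function names are assumptions, and `[HYBRID]` detection is left out.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, mask URLs and numbers, and collapse whitespace."""
    text = re.sub(r"https?://\S+", "<url>", text.lower())
    text = re.sub(r"\d+", "<num>", text)
    return re.sub(r"\s+", " ", text).strip()


def label_outgoing(messages: list[str], min_repeats: int = 3) -> list[str]:
    """Label each outgoing message [BOT] if its normalized form repeats often."""
    counts = Counter(normalize(m) for m in messages)
    return [
        "[BOT]" if counts[normalize(m)] >= min_repeats else "[MANUAL]"
        for m in messages
    ]
```

Masking URLs and numbers lets a template that differs only by a tracking link or a price still count as the same message.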

Outputs are written with mode `600` and may contain sensitive DM content. Keep them out of git.
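
For scripts that post-process these outputs, the same permission policy can be kept with a small helper. This is a sketch for Unix-like systems; `write_private` is a hypothetical name, not a function from the package.

```python
import os


def write_private(path: str, text: str) -> None:
    """Create or overwrite `path` so that only the owner can read or write it."""
    # The mode passed to os.open applies only when the file is created,
    # and is still subject to the process umask.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        # Enforce 0o600 even if the file already existed with wider permissions.
        os.fchmod(fh.fileno(), 0o600)
        fh.write(text)
```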

### Analyze a raw Instagram export folder (recommended)

Optional: index first (lets you filter recency without scanning every thread):

- `python3 -m sergio_instagram_messaging.index_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-ig-index.jsonl`

Then analyze:

- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-agent-analysis --owner-name "Sergio de Vocht" --index /root/tmp/socialmediatorr-ig-index.jsonl --since-days 180`

### Analyze an imported history dir (messages/*.jsonl)

If you already ran `import_instagram_export` (or have a partial import output), point the analyzer at that directory:

- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /root/tmp/socialmediatorr-ig-export-history --out /root/tmp/socialmediatorr-agent-analysis --owner-name "Sergio de Vocht"`

### Two-stage workflow (verify templates before full run)

Pass 1 generates `top_outgoing_templates.json` + `template_counts.jsonl`:

- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-agent-analysis --stage pass1`

Pass 2 reuses the cache and writes the full deliverables:

- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-agent-analysis --stage pass2 --templates-cache /root/tmp/socialmediatorr-agent-analysis/top_outgoing_templates.json`

### Human-readable report (English)

After analysis, generate a single Markdown report:

- `python3 -m sergio_instagram_messaging.generate_dm_report --analysis-dir /root/tmp/socialmediatorr-agent-analysis`

### Plain-English deep report (Mermaid diagrams)

Generate the deeper “no raw quotes” report directly from an Instagram export folder:

- `python3 -m sergio_instagram_messaging.generate_dm_report_detailed --export-input /path/to/export-root --out /root/tmp/dm_history_report_en_detailed.md`

## VoiceDNA (manual reply style)

`voice_dna/voiceDNA_socialmediatorr_insta_dm.json` is a **safe-to-store** style fingerprint generated from the last 6 months of **manual (non-template) DM replies** (no raw DM quotes are included).

It also encodes a hard rule for the bot:

- Always reply in the **user’s input language** (English / Spanish / French / Catalan), with a short clarification if the user’s message is too short to detect.
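
The shape of this rule (mirror the language, or ask when unsure) can be sketched with a toy detector. Everything here is an assumption made for illustration: the stopword lists are hand-picked and tiny, the `min_words` threshold is arbitrary, and a real agent would likely use a proper language-identification model instead.

```python
import re
from typing import Optional

# Toy stopword sets (illustrative only; not taken from the VoiceDNA file).
# Words shared between these languages are deliberately left out.
STOPWORDS = {
    "en": {"the", "and", "is", "you", "what", "how", "do", "does", "with"},
    "es": {"los", "las", "y", "es", "cómo", "para", "quiero", "estoy", "cuánto"},
    "fr": {"les", "et", "est", "je", "vous", "pour", "avec", "combien", "suis"},
    "ca": {"els", "és", "amb", "què", "vull", "això", "per", "tinc", "estic"},
}


def detect_reply_language(message: str, min_words: int = 3) -> Optional[str]:
    """Return a language code, or None when the bot should ask for clarification."""
    words = re.findall(r"[^\W\d_]+", message.lower())
    if len(words) < min_words:
        return None  # too short to detect reliably
    scores = {
        lang: sum(word in vocab for word in words)
        for lang, vocab in STOPWORDS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, runner_up_score) = ranked[0], ranked[1]
    if best_score == 0 or best_score == runner_up_score:
        return None  # no signal, or a tie between languages
    return best
```

A `None` result maps to the short clarification message; any other result selects the reply language.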

Regenerate from a local Instagram export folder:

- `python3 -m sergio_instagram_messaging.generate_voice_dna --export-input /path/to/export-root --out voice_dna/voiceDNA_socialmediatorr_insta_dm.json --owner-name "Sergio de Vocht" --window-months 6`

## Ready Replies (Top 20)

Ready-made answers in multiple languages for the top 20 question topics (aligned with the VoiceDNA language-mirroring policy):

- `reply_library/top20_ready_answers.json` (programmatic)
- `reply_library/top20_ready_answers.md` (copy/paste)

## Webhooks (new messages → auto-reply)

Setting up Meta webhooks takes two steps:

1) App subscription (`/{app_id}/subscriptions`): sets the callback URL, verify token, and fields.
2) Page subscription (`/{page_id}/subscribed_apps`): attaches the Page/IG inbox to the app.
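
When the app subscription is created, Meta verifies the callback URL by sending a GET request with `hub.mode`, `hub.verify_token`, and `hub.challenge` query parameters and expects the challenge echoed back. The sketch below shows that handshake as a framework-free function; `handle_verification` is a hypothetical name, and how the production endpoint implements this is not shown in this README.

```python
import hmac


def handle_verification(query: dict[str, str], verify_token: str) -> tuple[int, str]:
    """Return (status_code, body) for a webhook verification GET request."""
    mode = query.get("hub.mode")
    token = query.get("hub.verify_token", "")
    challenge = query.get("hub.challenge", "")
    # Constant-time comparison avoids leaking the token through timing.
    token_ok = hmac.compare_digest(token.encode("utf-8"), verify_token.encode("utf-8"))
    if mode == "subscribe" and token_ok:
        return 200, challenge
    return 403, "forbidden"
```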

The production webhook endpoint exists at:

- `https://emo-social.infrafabric.io/meta/webhook`

### Subscribe the Page to the app (required for message delivery)

This requires a Page access token that includes `pages_manage_metadata`.

Run device login with that scope included:

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login start --scope pages_show_list,pages_read_engagement,instagram_manage_messages,pages_manage_metadata`
- After authorizing in the browser, poll and write tokens:
  - `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login poll --write-page-token --target-ig-user-id 17841466913731557`

Then subscribe the Page:

- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_subscribe_page subscribe --fields messages`
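
For reference, the Page subscription is a POST to the Page's `subscribed_apps` edge with a `subscribed_fields` parameter. The sketch below only builds the request and does not send it; the helper name and the `v19.0` API version are assumptions for illustration, and the internals of `meta_subscribe_page` may differ.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_subscribe_request(
    page_id: str,
    page_access_token: str,
    fields: tuple[str, ...] = ("messages",),
    api_version: str = "v19.0",  # assumption: replace with the version your app uses
) -> Request:
    """Build (but do not send) the POST that subscribes a Page to the app."""
    url = f"https://graph.facebook.com/{api_version}/{page_id}/subscribed_apps"
    body = urlencode(
        {"subscribed_fields": ",".join(fields), "access_token": page_access_token}
    ).encode("utf-8")
    return Request(url, data=body, method="POST")
```

Sending the request with `urllib.request.urlopen` requires a Page token that carries `pages_manage_metadata`, as described above.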