emo-social-insta-dm-agent/README.md
2025-12-24 12:18:45 +00:00

# Emo-Social Insta DM Agent
Project for the **Emo-Social Instagram DM agent** for `@socialmediatorr` (distinct from the existing Sergio RAG DB agent).
Includes:
- Meta Graph API exporter (full DM history)
- Instagram “Download your information” importer
- DM analysis pipeline (bot-vs-human, conversions, objections, rescue logic, product eras)
- Helpers for device login + page subscription + sending replies
## Where this runs
Build/run the Docker image in `pct 250` (ai-dev) and keep production (`pct 220`) clean.
## Credentials
Source of truth creds file (host):
- `/root/tmp/emo-social-meta-app-creds.txt`
Inside `pct 250`, place the same file at the same path:
- `/root/tmp/emo-social-meta-app-creds.txt`
Required for export:
- `META_PAGE_ID`
- `META_PAGE_ACCESS_TOKEN`
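The layout of the creds file is not documented here. As a minimal sketch, assuming plain `KEY=VALUE` lines, the check for the two required export keys could look like this. `parse_creds` and `missing_keys` are hypothetical helpers, not modules from this repo.

```python
from typing import Dict, List, Sequence

REQUIRED_FOR_EXPORT = ("META_PAGE_ID", "META_PAGE_ACCESS_TOKEN")


def parse_creds(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines (assumed format); skips blanks and # comments."""
    creds: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        creds[key.strip()] = value.strip().strip('"').strip("'")
    return creds


def missing_keys(creds: Dict[str, str],
                 required: Sequence[str] = REQUIRED_FOR_EXPORT) -> List[str]:
    """Return required keys that are absent or empty; values are never echoed."""
    return [key for key in required if not creds.get(key)]
```

Calling `missing_keys(parse_creds(open(path).read()))` before an export reports which keys still need to be written, without printing any token.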
## Build (pct 250)
- `docker build -t emo-social-insta-dm-agent:dev /root/ai-workspace/emo-social-insta-dm-agent`
Note: in this LXC, `docker run` / `podman run` can fail due to AppArmor confinement. Use the **direct Python** commands below unless you change the LXC config to be AppArmor-unconfined.
## Obtain tokens (pct 250)
`meta_device_login` requires `META_CLIENT_TOKEN` (from Meta app dashboard → Settings → Advanced → Client token).
Alternatively, set `META_CLIENT_ACCESS_TOKEN` as `APP_ID|CLIENT_TOKEN` to override `META_APP_ID` for device-login only.
If `meta_device_login start` returns OAuth error `(#3) enable "Login from Devices"`, enable it in the *same app id* the script prints:
- `https://developers.facebook.com/apps/<APP_ID>/fb-login/settings/` → set **Login from Devices** = Yes
Start device login (prints a URL + code to enter; no tokens are printed):
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login start`
After authorizing in the browser, poll and write `META_PAGE_ID` + `META_PAGE_ACCESS_TOKEN` into `/root/tmp/emo-social-meta-app-creds.txt`:
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login poll --write-page-token --target-ig-user-id 17841466913731557`
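For orientation, the start/poll pair maps onto Facebook Login for Devices, which uses a `device/login` call followed by repeated `device/login_status` calls, both authenticated with `APP_ID|CLIENT_TOKEN`. The sketch below only builds the two requests and sends nothing. The function names are hypothetical, the API version segment is omitted as an assumption, and `meta_device_login` may be implemented differently.

```python
from typing import Iterable, Tuple
from urllib.parse import urlencode

GRAPH = "https://graph.facebook.com"  # version segment omitted (assumption)


def client_access_token(app_id: str, client_token: str) -> str:
    """Compose the APP_ID|CLIENT_TOKEN form described above."""
    return f"{app_id}|{client_token}"


def start_request(app_id: str, client_token: str,
                  scopes: Iterable[str]) -> Tuple[str, str]:
    """Return (url, form body) for the call that yields a user code."""
    body = urlencode({
        "access_token": client_access_token(app_id, client_token),
        "scope": ",".join(scopes),
    })
    return f"{GRAPH}/device/login", body


def poll_request(app_id: str, client_token: str, code: str) -> Tuple[str, str]:
    """Return (url, form body) for polling until the user has authorized."""
    body = urlencode({
        "access_token": client_access_token(app_id, client_token),
        "code": code,
    })
    return f"{GRAPH}/device/login_status", body
```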
### Fallback: Graph API Explorer → user token → page token
If device-login is blocked, get a **Facebook User access token** via Graph API Explorer and derive the Page token:
- Open: `https://developers.facebook.com/tools/explorer/`
- Select the correct app, then generate a user token with:
  - `pages_show_list,pages_read_engagement,instagram_manage_messages`
- Save the user token to (keep mode `600`):
  - `/root/tmp/meta-user-access-token.txt`
- Then write `META_PAGE_ID` + `META_PAGE_ACCESS_TOKEN` into `/root/tmp/emo-social-meta-app-creds.txt` (no token values printed):
  - `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_page_token_from_user_token --target-ig-user-id 17841466913731557`
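Deriving a Page token from a user token typically means listing the user's Pages via `GET /me/accounts` and picking the Page whose linked Instagram account matches the target id. The sketch below shows only the selection step over an already fetched response. `pick_page` is a hypothetical name, and whether `meta_page_token_from_user_token` works this way is an assumption.

```python
from typing import Any, Dict, Tuple

# Fields to request from /me/accounts so each Page carries its token and IG link.
ACCOUNTS_FIELDS = "id,name,access_token,instagram_business_account"


def pick_page(accounts: Dict[str, Any], target_ig_user_id: str) -> Tuple[str, str]:
    """Return (page_id, page_access_token) for the Page linked to the IG user."""
    for page in accounts.get("data", []):
        ig = page.get("instagram_business_account") or {}
        if str(ig.get("id")) == str(target_ig_user_id):
            return page["id"], page["access_token"]
    raise LookupError("no Page is linked to the target Instagram account")
```

Raising instead of returning a default keeps a wrong Page token from being written to the creds file when the target account is not found.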
## Export history (pct 250)
Quick token sanity check (does not print token values):
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_token_doctor`
Export to a mounted directory so results persist on disk:
- `docker run --rm -v /root/tmp:/root/tmp emo-social-insta-dm-agent:dev python -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history`
- (Direct) `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history`
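Graph API list endpoints return results in pages, each carrying a `paging.next` URL while more data remains. As a rough sketch of how an exporter can walk them with a page cap (similar in spirit to the `--max-pages` flag used elsewhere in this README), assuming a caller-supplied `fetch` function that returns the decoded JSON for a URL:

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_items(fetch: Callable[[str], Dict[str, Any]],
               first_url: str,
               max_pages: Optional[int] = None) -> Iterator[Dict[str, Any]]:
    """Yield items from each page, following paging.next until it runs out."""
    url: Optional[str] = first_url
    pages_seen = 0
    while url and (max_pages is None or pages_seen < max_pages):
        payload = fetch(url)  # hypothetical fetcher; does the HTTP call and JSON decode
        yield from payload.get("data", [])
        pages_seen += 1
        url = payload.get("paging", {}).get("next")
```

Injecting `fetch` keeps retries, timeouts and token handling out of the loop, and makes the pagination logic testable without network access.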
### Workaround: fetch a single thread by `user_id`
If the `/conversations` listing times out, you can still fetch a single thread if you know the sender's `user_id` (from the webhook's `sender.id`):
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.fetch_thread_messages --user-id <SENDER_ID> --max-pages 2 --out /root/tmp/thread.jsonl`
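The Graph API conversations endpoint accepts a `user_id` filter alongside `platform=instagram`, which is presumably what makes this workaround possible. A small sketch of the lookup URL follows. `conversation_lookup_url` is a hypothetical name, the version segment is omitted, and the access token is expected to be supplied separately.

```python
from urllib.parse import urlencode


def conversation_lookup_url(page_id: str, sender_id: str,
                            fields: str = "id,updated_time") -> str:
    """Build the URL that looks up the single thread with one Instagram user."""
    query = urlencode({
        "platform": "instagram",
        "user_id": sender_id,  # the sender.id value seen in webhook events
        "fields": fields,
    })
    return f"https://graph.facebook.com/{page_id}/conversations?{query}"
```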
Small test run:
- `docker run --rm -v /root/tmp:/root/tmp emo-social-insta-dm-agent:dev python -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history --max-conversations 3 --max-pages 2`
- (Direct) `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.export_meta_ig_history --out /root/tmp/emo-social-insta-dm-history --max-conversations 3 --max-pages 2`
## Full history fallback (Instagram export)
If Meta app review blocks Graph API DM access, export Sergio's IG data via Instagram “Download your information” and import it:
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.import_instagram_export --input /path/to/instagram-export.zip --out /root/tmp/emo-social-insta-dm-export-history`
## Analyze DM history (Behavioral Cloning / “Biographical Sales”)
This produces the “Sergio persona” artifacts needed for the DM agent:
- Separates frequent `[BOT]` templates vs rare `[MANUAL]` replies (plus `[HYBRID]`).
- Builds **bot fatigue** + **script editorial timeline** charts.
- Extracts **training pairs** (user → Sergio manual reply) from converted threads.
- Generates **rescue playbook** (human saves after silence/negative sentiment).
- Generates **objection handlers** (price/time/trust/stop → best replies).
- Builds a quarterly **eras** CSV (offers/pricing + vocabulary drift).
Outputs are written with mode `600` and may contain sensitive DM content. Keep them out of git.
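One way to get mode `600` reliably is to set the permission bits when the file is created rather than tightening them afterwards. A minimal sketch for Unix-like systems follows. `write_private` is a hypothetical helper and not necessarily how this repo writes its outputs.

```python
import os


def write_private(path: str, text: str) -> None:
    """Write text to path so that only the owner can read or write it."""
    # The mode argument applies when the file is created (subject to umask).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Tighten a pre-existing file before any sensitive content lands in it.
        os.fchmod(fd, 0o600)
    except BaseException:
        os.close(fd)
        raise
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
```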
### Analyze a raw Instagram export folder (recommended)
Optional: index first (lets you filter recency without scanning every thread):
- `python3 -m sergio_instagram_messaging.index_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-ig-index.jsonl`
Then analyze:
- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-agent-analysis --owner-name "Sergio de Vocht" --index /root/tmp/socialmediatorr-ig-index.jsonl --since-days 180`
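The point of the index is that a recency filter such as `--since-days 180` can be answered from one small file instead of opening every thread. The index schema is not shown in this README, so the sketch below assumes one JSON object per line with hypothetical `thread` and `last_message_ts` (epoch seconds) fields.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


def recent_threads(index_lines: Iterable[str], since_days: int,
                   now: Optional[datetime] = None) -> List[str]:
    """Return thread ids whose last message falls within the last since_days days."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=since_days)).timestamp()
    keep: List[str] = []
    for line in index_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Field names are assumptions about the index format.
        if record.get("last_message_ts", 0) >= cutoff:
            keep.append(record["thread"])
    return keep
```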
### Analyze an imported history dir (messages/*.jsonl)
If you already ran `import_instagram_export` (or have a partial import output), point the analyzer at that directory:
- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /root/tmp/socialmediatorr-ig-export-history --out /root/tmp/socialmediatorr-agent-analysis --owner-name "Sergio de Vocht"`
### Two-stage workflow (verify templates before full run)
Pass 1 generates `top_outgoing_templates.json` + `template_counts.jsonl`:
- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-agent-analysis --stage pass1`
Pass 2 reuses the cache and writes the full deliverables:
- `python3 -m sergio_instagram_messaging.analyze_instagram_export --input /path/to/your_instagram_activity --out /root/tmp/socialmediatorr-agent-analysis --stage pass2 --templates-cache /root/tmp/socialmediatorr-agent-analysis/top_outgoing_templates.json`
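Conceptually, pass 1 counts how often each outgoing message recurs so that frequent templates can be reviewed before pass 2 labels everything. The sketch below illustrates that idea with a simple normalizer and a frequency threshold. The normalization rules, the `bot_min` threshold and the function names are assumptions, and the `[HYBRID]` label is not modelled.

```python
import re
from collections import Counter
from typing import Iterable


def normalize(text: str) -> str:
    """Collapse case, URLs, numbers and whitespace so near-identical sends match."""
    text = text.lower()
    text = re.sub(r"https?://\S+", "<url>", text)
    text = re.sub(r"\d+", "<num>", text)
    return re.sub(r"\s+", " ", text).strip()


def count_templates(outgoing: Iterable[str]) -> Counter:
    """Pass 1: frequency of each normalized outgoing message."""
    return Counter(normalize(message) for message in outgoing)


def label(message: str, counts: Counter, bot_min: int = 3) -> str:
    """Pass 2: frequent messages are treated as templates, rare ones as manual."""
    return "[BOT]" if counts[normalize(message)] >= bot_min else "[MANUAL]"
```

Splitting the work this way lets a human inspect `counts.most_common()` and adjust the threshold before any message is labelled.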
### Human-readable report (English)
After analysis, generate a single Markdown report:
- `python3 -m sergio_instagram_messaging.generate_dm_report --analysis-dir /root/tmp/socialmediatorr-agent-analysis`
### Plain-English deep report (Mermaid diagrams)
Generate the deeper “no raw quotes” report directly from an Instagram export folder:
- `python3 -m sergio_instagram_messaging.generate_dm_report_detailed --export-input /path/to/export-root --out /root/tmp/dm_history_report_en_detailed.md`
## VoiceDNA (manual reply style)
`voice_dna/voiceDNA_socialmediatorr_insta_dm.json` is a **safe-to-store** style fingerprint generated from the last 6 months of **manual (non-template) DM replies** (no raw DM quotes are included).
It also encodes a hard rule for the bot:
- Always reply in the **user's input language** (English / Spanish / French / Catalan), with a short clarification if the user's message is too short to detect.
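The rule can be read as a small decision: mirror the detected language when detection is trustworthy, otherwise ask. A sketch under stated assumptions follows. The `detect` callable stands in for whatever language detector the agent uses, `min_words` is an invented threshold, and the language codes are illustrative.

```python
from typing import Callable, Optional, Tuple

SUPPORTED = ("en", "es", "fr", "ca")  # English, Spanish, French, Catalan


def reply_language(message: str,
                   detect: Callable[[str], str],
                   min_words: int = 3) -> Tuple[Optional[str], bool]:
    """Return (language, needs_clarification) for an incoming message."""
    if len(message.split()) < min_words:
        return None, True  # too short to detect: ask the user first
    language = detect(message)
    if language not in SUPPORTED:
        return None, True
    return language, False
```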
Regenerate from a local Instagram export folder:
- `python3 -m sergio_instagram_messaging.generate_voice_dna --export-input /path/to/export-root --out voice_dna/voiceDNA_socialmediatorr_insta_dm.json --owner-name "Sergio de Vocht" --window-months 6`
## Ready Replies (Top 20)
Multi-language ready-made answers for the top 20 question topics (aligned to the VoiceDNA language-mirroring policy):
- `reply_library/top20_ready_answers.json` (programmatic)
- `reply_library/top20_ready_answers.md` (copy/paste)
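The schema of `top20_ready_answers.json` is not shown here. Purely as an illustration, assuming a mapping of topic to per-language answers, a programmatic lookup could be as small as this. `ready_reply` and the sample keys are hypothetical.

```python
from typing import Dict, Optional

# Assumed shape: {"<topic>": {"en": "...", "es": "...", ...}, ...}
Library = Dict[str, Dict[str, str]]


def ready_reply(library: Library, topic: str, language: str) -> Optional[str]:
    """Return the ready answer for a topic in the user's language, if present."""
    return library.get(topic, {}).get(language)
```

Returning `None` rather than falling back to another language keeps the lookup consistent with the language-mirroring policy.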
## Webhooks (new messages → auto-reply)
Setting up Meta webhooks takes two steps:
1) App subscription (`/{app_id}/subscriptions`) — sets callback URL + verify token + fields.
2) Page subscription (`/{page_id}/subscribed_apps`) — attaches the Page/IG inbox to the app.
The production webhook endpoint exists at:
- `https://emo-social.infrafabric.io/meta/webhook`
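On the receiving side, Meta first verifies the callback URL with a GET carrying `hub.mode`, `hub.verify_token` and `hub.challenge`, then signs each delivery with an `X-Hub-Signature-256` header (HMAC-SHA256 of the raw body, keyed with the app secret). The sketch below shows both checks plus pulling `sender.id` out of a messaging event. It is framework-agnostic, the function names are hypothetical, and it says nothing about how the production endpoint is actually written.

```python
import hashlib
import hmac
from typing import Any, Dict, List, Mapping, Optional


def verify_challenge(params: Mapping[str, str], verify_token: str) -> Optional[str]:
    """Return the challenge to echo back, or None if verification fails."""
    supplied = params.get("hub.verify_token", "")
    if params.get("hub.mode") == "subscribe" and hmac.compare_digest(
            supplied.encode("utf-8"), verify_token.encode("utf-8")):
        return params.get("hub.challenge")
    return None


def signature_ok(app_secret: str, raw_body: bytes, header: Optional[str]) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    digest = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    return hmac.compare_digest(expected.encode("utf-8"), (header or "").encode("utf-8"))


def sender_ids(payload: Dict[str, Any]) -> List[str]:
    """Collect sender.id from every messaging event in a webhook payload."""
    return [event["sender"]["id"]
            for entry in payload.get("entry", [])
            for event in entry.get("messaging", [])
            if "sender" in event]
```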
### Subscribe the Page to the app (required for message delivery)
This requires a Page access token that includes `pages_manage_metadata`.
Device login with that scope:
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login start --scope pages_show_list,pages_read_engagement,instagram_manage_messages,pages_manage_metadata`
- After authorizing in the browser, poll and write tokens:
  - `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_device_login poll --write-page-token --target-ig-user-id 17841466913731557`
Then subscribe the Page:
- `cd /root/ai-workspace/emo-social-insta-dm-agent && python3 -m sergio_instagram_messaging.meta_subscribe_page subscribe --fields messages`
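Under the hood, a Page subscription is a POST to `/{page_id}/subscribed_apps` with a `subscribed_fields` list and the Page access token. The sketch below builds that request without sending it. `subscribe_request` is a hypothetical name, the version segment is omitted, and the `meta_subscribe_page` module may do more than this.

```python
from typing import Iterable, Tuple
from urllib.parse import urlencode


def subscribe_request(page_id: str, page_access_token: str,
                      fields: Iterable[str] = ("messages",)) -> Tuple[str, str]:
    """Return (url, form body) for subscribing the Page to the app's webhooks."""
    url = f"https://graph.facebook.com/{page_id}/subscribed_apps"
    body = urlencode({
        "subscribed_fields": ",".join(fields),
        "access_token": page_access_token,  # needs pages_manage_metadata
    })
    return url, body
```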