# INTEGRATIONS-SIP-SLACK: Slack API Integration Research ## InfraFabric DevOps Notification & Collaboration Platform **Document Classification:** Research & Specification **Research Methodology:** IF.search 8-Pass Analysis **Research Agent:** Haiku-39 **Document Version:** 1.0 **Last Updated:** 2025-11-14 **Target Integration Complexity:** 7.5/10 --- ## Executive Summary This document presents a comprehensive analysis of Slack integration APIs for the InfraFabric DevOps platform. The research employs the IF.search 8-pass methodology to systematically evaluate Slack's communication, notification, and workflow automation capabilities for infrastructure management scenarios. The analysis reveals that Slack provides a robust, production-ready platform for real-time infrastructure alerts, team collaboration, and automated incident response workflows. Key findings indicate that InfraFabric can leverage Slack's Events API, Incoming Webhooks, Slash Commands, and Block Kit UI components to create a comprehensive infrastructure management interface within Slack's native environment. **Key Statistics:** - Slack API provides 30+ method families with 100+ distinct operations - 100+ event types available for real-time workspace monitoring - 4 rate limit tiers (1/min to 100+/min) with burst allowances - OAuth 2.0 with 50+ granular permission scopes - 13 Block Kit block types + 20+ interactive elements - Socket Mode for persistent bidirectional communication - Enterprise Grid support with advanced compliance certifications --- ## PASS 1: Signal Capture - Slack API Documentation Scan ### 1.1 Core Slack API Products Overview Slack's developer platform exposes six primary API categories designed to enable comprehensive workspace interaction and custom application development: #### 1.1.1 Web API (REST) The foundational HTTP-based API providing programmatic access to Slack workspace features. The Web API encompasses approximately 30+ method families organized hierarchically: **Method Family Categories:** - **Messaging & Communication** (chat, conversations, groups, im, mpim) - **User & Profile Management** (users, users.profile, admin.users) - **Channel Management** (channels, conversations, admin.conversations) - **File Operations** (files, files.upload, files.info) - **Reaction & Annotation** (reactions, pins, stars, bookmarks) - **Search & Discovery** (search, conversations, channels) - **Authentication & Authorization** (oauth, auth, admin.tokens) - **App Management** (apps, app_mention, app_home) - **Workflow Automation** (workflows, workflow_steps, functions) - **Canvas & Lists** (canvas, lists) - **Enterprise Features** (admin, auditlogs, admin.conversations) - **Reminders & Scheduling** (reminders, schedule_message) - **Status & Presence** (users.getPresence, users.setPresence) - **Legal Holds & Compliance** (admin.legalHolds, compliance) - **Call Management** (calls, conversations.calls) Each method is documented with: - Required OAuth scopes - Request parameters and response formats - Rate limit tier classification - Error codes and handling strategies - SDK availability (Python, Node.js, Java) #### 1.1.2 Events API Provides real-time event notifications when user actions occur in Slack workspaces. Events are delivered via HTTP webhooks or WebSocket connections (Socket Mode). **Event Type Categories (100+ distinct event types):** - Application Events (app_installed, app_deleted, app_mention, app_request) - Channel Events (channel_created, channel_deleted, channel_archived, channel_unarchived, channel_converted, channel_left, channel_joined, channel_rename, channel_change_topic, channel_change_purpose) - Message Events (message, thread_broadcast, message_replied) - User Events (member_joined_channel, member_left_channel, user_change, user_status_changed, team_join) - File Events (file_created, file_deleted, file_shared, file_public_shared, file_unshared) - Reaction Events (reaction_added, reaction_removed) - Pin Events (pin_added, pin_removed) - Star Events (star_added, star_removed) - Workflow Events (workflow_published, workflow_deleted, workflow_step_execute, workflow_step_failed) - Interactive Events (block_actions, interactive_message, message_actions, shortcut, slash_commands, options, view_submission, view_closed) - Team Events (team_plan_change, team_rename, team_domain_change, team_profile_change, team_icon_change) - Admin Events (admin events for managing team members, channels, and settings) #### 1.1.3 Incoming Webhooks Simplified HTTP POST endpoints for publishing messages into Slack channels. Webhooks support full Block Kit formatting and message threading. **Key Characteristics:** - One-way communication (Slack channel ← External system) - No OAuth token management required - Unique URL generation per webhook - Support for message blocks, formatting, and attachments - Threading support with `thread_ts` parameter - Convenient for alert aggregation and log streaming #### 1.1.4 Slash Commands User-initiated commands accessible through Slack's message composer (e.g., `/infrahealth`, `/deploylog`). Provides context-aware text input and immediate response capability. **Command Anatomy:** ``` /command text parameter additional parameters ``` **Payload Contents:** - `command`: The slash command trigger - `text`: User-supplied parameter string - `user_id`: Invoking user's ID - `channel_id`: Channel context - `response_url`: Webhook for delayed responses - `trigger_id`: Token for opening modals (3-second lifetime) - `api_app_id`: App identifier #### 1.1.5 Bot Users Bot user accounts within Slack workspaces that can participate in conversations, receive mentions, and access specific APIs. Bots can be scoped with granular permissions and can operate within Slack's rate limiting framework. **Bot Capabilities:** - Participate in channels and direct messages - Respond to mentions and direct messages - Post messages and rich blocks - Manage reactions and pins - Receive and respond to events - Execute slash commands #### 1.1.6 Block Kit UI Components Slack's component library for building rich, interactive user interfaces. Block Kit components are reusable across messages, modals, App Home, and Canvases. **Component Categories:** - **Block Types** (13 types): Section, Button, Input, Actions, Context, Divider, Image, File, Header, Rich Text, Table, Video, Markdown - **Element Types** (20+ types): Button, Select Menu, Multi-Select Menu, Datepicker, Timepicker, Plaintext Input, Number Input, Confirmation Dialog, Overflow Menu, Checkboxes, Radio Buttons - **Composition Objects**: Text, Confirmation Dialog, Option Group, Dispatch Action Configuration ### 1.2 Supporting Infrastructure #### 1.2.1 Slack CLI Command-line tool for managing Slack app lifecycle: - App creation and scaffolding - Local development environment setup - Testing and debugging - Deployment to workspace - Permission management - Token rotation and security #### 1.2.2 Bolt Framework Open-source framework available for Python, JavaScript, and Java: - Simplifies request handling and response sending - Automatic token rotation and management - Built-in rate limiting navigation - Event subscription management - Modal and interactive component handling - Middleware pattern for request processing #### 1.2.3 SDKs & Libraries Official SDKs for multiple languages: - Python (`slack-sdk`) - JavaScript/Node.js (`@slack/web-api`, `@slack/bolt-js`) - Java (`slack-api-client`) #### 1.2.4 Developer Sandboxes Isolated testing workspaces for development without affecting production installations. --- ## PASS 2: Primary Analysis - Integration Patterns & Interaction Models ### 2.1 Real-Time Infrastructure Notifications InfraFabric can leverage Slack's notification infrastructure to deliver real-time infrastructure alerts through multiple channels: #### 2.1.1 Incoming Webhook Notifications (Tier 2 Rate Limit) **Use Case:** High-volume alert streaming from monitoring systems **Implementation Pattern:** ``` Infrastructure Event → InfraFabric Alert Engine → Slack Webhook → Channel ``` **Supported Message Types:** 1. **Critical Infrastructure Alerts** - Server down - CPU threshold exceeded - Memory exhaustion - Disk space critical - Network latency anomaly - Database connection pool exhaustion 2. **Deployment Notifications** - Build started/completed/failed - Deployment initiated/successful/rolled back - Container registry image pushed - Certificate expiration warning 3. **Security Events** - Authentication failure threshold - Unauthorized access attempt - SSL/TLS certificate issues - API rate limit exceeded - Firewall rule violation **Message Structure Example:** ```json { "text": "🚨 CRITICAL: Production Database CPU at 95%", "blocks": [ { "type": "header", "text": { "type": "plain_text", "text": "Production Database Alert" } }, { "type": "section", "fields": [ { "type": "mrkdwn", "text": "*Environment:*\nProduction" }, { "type": "mrkdwn", "text": "*Severity:*\nCritical" } ] }, { "type": "section", "text": { "type": "mrkdwn", "text": "*Metric:* CPU Usage\n*Current:* 95%\n*Threshold:* 80%" } }, { "type": "actions", "elements": [ { "type": "button", "text": { "type": "plain_text", "text": "View Dashboard" }, "url": "https://dashboard.infra.local/db-prod", "action_id": "btn_view_dashboard" } ] } ], "thread_ts": "1699000000.000001" } ``` #### 2.1.2 Events API for Interactive Notifications **Use Case:** Receiving acknowledgment and response actions from Slack users **Event Flow:** ``` Slack User Interaction → Events API → InfraFabric Webhook → Alert Escalation Engine ``` **Supported Interactive Events:** - `block_actions`: User clicks button, selects menu option - `view_submission`: User submits modal form - `shortcut`: User triggers custom shortcut - `message`: User mentions bot or responds to thread #### 2.1.3 Slack App Home for Infrastructure Dashboard **Use Case:** Centralized infrastructure management interface **App Home Features:** - Real-time cluster status overview - Deployment pipeline status - Alert history and acknowledgments - Quick action buttons for common operations - Linked documentation and runbooks ### 2.2 Bot Interactions & Real-Time Messaging #### 2.2.1 Bot User Capabilities InfraFabric bot can participate in channels and direct conversations: **Bot Interaction Scenarios:** 1. **Alert Escalation Conversation** ``` @infra-bot escalate high-priority alert to oncall-devops ``` 2. **Status Inquiry** ``` @infra-bot health status prod-api-cluster ``` 3. **Log Retrieval** ``` @infra-bot logs deployment-2024-11-14 --filter error --last 100 ``` #### 2.2.2 Direct Message Conversations Bot can be messaged directly for: - Personalized alert rules - Private status queries - Sensitive operational information - One-on-one escalation workflows ### 2.3 Slash Commands for Infrastructure Management #### 2.3.1 Command Categories **Cluster Management Commands:** - `/infra cluster status` - Display cluster health - `/infra cluster restart node-01` - Restart specific node - `/infra cluster scale api --replicas 5` - Adjust deployment replicas **Deployment Commands:** - `/deploy service --version 1.2.3 --environment prod` - Trigger deployment - `/rollback service --environment prod` - Immediate rollback - `/deployment logs service --lines 50` - View recent deployment logs **Alert Management Commands:** - `/alert acknowledge ALERT_ID` - Mark alert as handled - `/alert silence service --duration 1h` - Temporarily silence alerts - `/alert history service --days 7` - View alert timeline **Incident Response Commands:** - `/incident create --title "API outage" --severity critical` - Create incident - `/incident assign @john --role lead` - Assign incident role - `/incident timeline add "Service restored at 14:30"` - Update timeline **Information Retrieval Commands:** - `/infra docs deployment-process` - Access runbooks - `/config get PARAMETER` - Query configuration values - `/health check integration-name` - Test external integration #### 2.3.2 Command Response Patterns **Immediate Response (Ephemeral - visible only to invoker):** ```json { "response_type": "ephemeral", "text": "Processing your request..." } ``` **Shared Response (Channel visible):** ```json { "response_type": "in_channel", "blocks": [ { "type": "section", "text": { "type": "mrkdwn", "text": "*Deployment Status: Success*" } } ] } ``` **Delayed Response (via response_url):** For operations exceeding 3-second completion window, use the `response_url` webhook to send results asynchronously. ### 2.4 Interactive Components & Modal-Based Workflows #### 2.4.1 Block Actions & Button Interactions Buttons embedded in messages provide one-click operations: **Use Cases:** 1. **Alert Acknowledgment** ``` Message: "Production API Down" Button 1: "Acknowledge & Start Incident" → Opens modal Button 2: "Snooze 15 min" Button 3: "View Details" → Opens modal with logs ``` 2. **Approval Workflows** ``` Message: "Production deployment waiting approval" Button 1: "Approve" → Triggers deployment Button 2: "Reject" → Rolls back pending changes Button 3: "Request Changes" → Opens comment modal ``` 3. **Quick Actions** ``` Message: "Backup job failed" Button 1: "Retry Now" Button 2: "Skip & Continue" Button 3: "Manual Recovery" ``` #### 2.4.2 Modal Forms for Complex Operations Modals provide rich input forms for infrastructure operations: **Modal Stack Architecture:** - Root modal: Primary operation form - Child modals: Confirmations, additional parameters - Up to 3 views can exist in stack simultaneously - Previous view state preserved when pushing new modal **Example Modal Workflow - Create Infrastructure Alert:** ```json { "type": "modal", "callback_id": "alert_create_modal", "title": { "type": "plain_text", "text": "Create Alert Rule" }, "blocks": [ { "type": "input", "block_id": "alert_service", "label": { "type": "plain_text", "text": "Service Name" }, "element": { "type": "plain_text_input", "action_id": "service_input", "placeholder": { "type": "plain_text", "text": "e.g., api-gateway, database" } } }, { "type": "input", "block_id": "alert_metric", "label": { "type": "plain_text", "text": "Metric Type" }, "element": { "type": "static_select", "action_id": "metric_select", "options": [ { "text": { "type": "plain_text", "text": "CPU Usage" }, "value": "cpu" }, { "text": { "type": "plain_text", "text": "Memory Usage" }, "value": "memory" }, { "text": { "type": "plain_text", "text": "Response Time" }, "value": "response_time" } ] } }, { "type": "input", "block_id": "alert_threshold", "label": { "type": "plain_text", "text": "Threshold Value" }, "element": { "type": "number_input", "action_id": "threshold_input", "is_decimal_allowed": true } }, { "type": "input", "block_id": "alert_severity", "label": { "type": "plain_text", "text": "Severity Level" }, "element": { "type": "radio_buttons", "action_id": "severity_radio", "options": [ { "text": { "type": "plain_text", "text": "Info" }, "value": "info" }, { "text": { "type": "plain_text", "text": "Warning" }, "value": "warning" }, { "text": { "type": "plain_text", "text": "Critical" }, "value": "critical" } ] } }, { "type": "input", "block_id": "alert_channels", "label": { "type": "plain_text", "text": "Notification Channels" }, "element": { "type": "multi_static_select", "action_id": "channels_select", "options": [ { "text": { "type": "plain_text", "text": "#alerts-general" }, "value": "alerts-general" }, { "text": { "type": "plain_text", "text": "#incidents" }, "value": "incidents" }, { "text": { "type": "plain_text", "text": "#oncall" }, "value": "oncall" } ] } } ], "submit": { "type": "plain_text", "text": "Create Alert" }, "close": { "type": "plain_text", "text": "Cancel" } } ``` ### 2.5 Workflow Automation Integration #### 2.5.1 Custom Workflow Steps InfraFabric can register custom workflow steps within Slack Workflow Builder: **Example Workflow Steps:** 1. **Trigger Infrastructure Alert** - Input: Alert conditions - Output: Alert ID, severity 2. **Execute Infrastructure Operation** - Input: Operation type, parameters - Output: Operation status, result code 3. **Validate Infrastructure State** - Input: Service name, expected state - Output: Validation result, current state 4. **Escalate to On-Call** - Input: Severity level, context - Output: Escalation ID, assigned user **Workflow Builder Integration Enables:** - No-code automation for routine infrastructure operations - Conditional branching based on system state - Sequential multi-step orchestration - Integration with Slack native functions (post message, update user profile, etc.) --- ## PASS 3: Rigor & Refinement - Rate Limits, Architecture Patterns, App Distribution ### 3.1 Rate Limiting Architecture #### 3.1.1 Rate Limit Tiers Slack implements a 4-tier rate limiting system evaluated per API method per workspace per app: **Tier 1: 1+ request per minute** - Low-frequency operations - Admin operations with workspace impact - Examples: `admin.teams.settings`, `team.profile.get` - Use case for InfraFabric: Workspace settings, team configuration **Tier 2: 20+ requests per minute** - Most common Web API methods - Standard messaging, file operations, user queries - Examples: `chat.postMessage`, `users.list`, `files.list` - Use case for InfraFabric: Alert posting, user lookups, file access **Tier 3: 50+ requests per minute** - High-frequency paginated operations - Conversation and channel queries - Examples: `conversations.history`, `conversations.list` - Use case for InfraFabric: Alert history pagination, conversation retrieval **Tier 4: 100+ requests per minute** - Extremely high-frequency operations - Search operations with large result sets - Examples: `search.messages`, `conversations.search` - Use case for InfraFabric: Historical log searches, alert pattern analysis **Special Rate Limits:** - `chat.postMessage`: 1 message per second per channel (global workspace limit) - Events API: 30,000 event deliveries per workspace per app per 60 minutes - Profile updates: 10 updates per minute per token - Reactions: Limited by message volume #### 3.1.2 Rate Limit Response Handling When rate limits are exceeded: ``` HTTP 429 Too Many Requests Retry-After: 5 (seconds) X-Rate-Limit-Remaining: 0 X-Rate-Limit-Reset: 1699000000 ``` **InfraFabric Handling Strategy:** 1. Implement exponential backoff (2s → 4s → 8s → 16s) 2. Queue requests in priority order (critical alerts first) 3. Batch operations where possible (multi-user lookups, etc.) 4. Monitor rate limit headers for proactive throttling 5. Use Bolt framework's automatic rate limit navigation #### 3.1.3 Burst Behavior Slack allows temporary burst spikes exceeding stated limits but monitors for sustained abuse patterns: - Short bursts (100 requests in 30 seconds on Tier 3) acceptable - Sustained high volume triggers workspace-level throttling - Monitoring based on patterns, not individual requests ### 3.2 Event API Architecture - Push vs. Pull Models #### 3.2.1 Events API vs. Real-Time Messaging (RTM) **Events API (Recommended)** - Push-based: Slack sends events to your webhook URL - HTTP POST delivery with retry logic - Supports 100+ event types - 30,000 deliveries per workspace per app per 60 minutes - Resiliency: Automatic retries on 4xx/5xx responses - Better for high-volume scenarios **RTM (Deprecated - migration path to Events API)** - WebSocket-based persistent connection - Client initiates connection to Slack - Older event model with limited type coverage - Sunset: Slack actively migrating users to Events API - Not recommended for new integrations **InfraFabric Event Selection:** - Use Events API for all new integrations - Subscribe to: `app_mention`, `message`, `reaction_added`, `file_created`, `app_home_opened` - Configure webhook URL for event delivery - Implement signature verification for security #### 3.2.2 Event Subscription Configuration **Steps to Configure Event Subscriptions:** 1. Create Slack app 2. Navigate to "Event Subscriptions" in app settings 3. Enable events and provide HTTPS webhook URL 4. Verify webhook with 3-second challenge/response 5. Select events to subscribe to 6. Save and enable **Webhook Verification Flow:** ``` Slack Request: POST https://your-domain.com/slack/events { "type": "url_verification", "challenge": "3eZbrw1aBc2bnT0Yib9qFnhLY8nW1Wz0zVm7" } Your Response (3 seconds): HTTP 200 { "challenge": "3eZbrw1aBc2bnT0Yib9qFnhLY8nW1Wz0zVm7" } ``` ### 3.3 Socket Mode for Persistent Connections #### 3.3.1 Socket Mode Architecture Alternative to webhooks using WebSocket persistent connection: **Advantages:** - No public URL exposure required - NAT/firewall traversal - Lower latency for bidirectional communication - Suitable for private/on-premise InfraFabric deployments **Disadvantages:** - Connection management complexity - SDK-based implementation requirement - Slightly higher latency than webhooks - Not ideal for extreme high-volume scenarios **Implementation Pattern:** ```python # Bolt framework Socket Mode example import os from slack_bolt import App from slack_bolt.adapter.socket_mode import SocketModeHandler app = App(token=os.environ.get("SLACK_BOT_TOKEN")) @app.message("hello") def message_hello(message, say): say(f"Hey there <@{message['user']}>!") if __name__ == "__main__": handler = SocketModeHandler( app, os.environ.get("SLACK_APP_TOKEN") ) handler.start() ``` #### 3.3.2 When to Use Socket Mode vs. Webhooks **Use Webhooks When:** - Public HTTPS endpoint available - High-volume event processing - Distributed load balancing desired - Simpler infrastructure **Use Socket Mode When:** - Operating in isolated/private network - Avoiding public endpoint exposure - Testing/development convenience - Persistent connection preferred ### 3.4 App Distribution Models #### 3.4.1 Workspace-Only Apps (Single Workspace) **Use Case:** Private infrastructure management tool - Install app once to single workspace - Workspace token manages all operations - No OAuth user installation flow - Simpler authentication - No Marketplace listing - Best for enterprise internal tools **InfraFabric Workspace-Only Installation:** ```yaml # app.yml manifest display_information: name: InfraFabric DevOps description: Infrastructure management and alerting oauth_config: scopes: bot: - chat:write - chat:write.public - commands - incoming-webhook - app_mentions:read - events:write - channels:read settings: event_subscriptions: request_url_verification_disabled: false events: - app_mention - message.channels interactivity: is_enabled: true request_url: https://infra.company.com/slack/webhook ``` #### 3.4.2 Distributed Apps (Multiple Workspaces) **Use Case:** SaaS Slack app for infrastructure teams across organizations - OAuth 2.0 installation flow - Per-workspace token management - User authentication and scoping - Slack Marketplace listing - Multi-tenancy architecture **OAuth Installation Flow for InfraFabric:** ``` 1. User clicks "Add to Slack" button 2. Browser redirected to: https://slack.com/oauth/v2/authorize ?client_id=YOUR_CLIENT_ID &scopes=chat:write,commands,app_mentions:read &redirect_uri=https://infra.company.com/oauth/callback 3. User authorizes permissions in Slack dialog 4. Slack redirects to: https://infra.company.com/oauth/callback ?code=AUTH_CODE &state=STATE_PARAM 5. InfraFabric backend calls: POST https://slack.com/api/oauth.v2.access client_id=YOUR_CLIENT_ID client_secret=YOUR_SECRET code=AUTH_CODE redirect_uri=https://infra.company.com/oauth/callback 6. Slack returns: { "ok": true, "access_token": "xoxb-xxxx", "token_type": "bot", "scope": "chat:write,commands,app_mentions:read", "bot_user_id": "U12345678", "app_id": "A12345678", "team": {"name": "Acme Corp", "id": "T12345678"}, "authed_user": {...}, "is_enterprise_install": false } ``` #### 3.4.3 Enterprise Grid Apps For large organizations with Enterprise Grid: - Multi-workspace token management - Enterprise-level permissions - Audit logging integration - Custom role support - Org-wide app installation options --- ## PASS 4: Cross-Domain Analysis - Pricing, OAuth, Enterprise Support, Compliance ### 4.1 Slack Pricing & Team Plans #### 4.1.1 Free Plan **Cost:** $0/user/month **API Capabilities:** - Full Web API access - Events API (30,000 events/60 min) - Incoming Webhooks (unlimited) - Slash Commands (up to 5) - App Home - Basic Block Kit components - Socket Mode access - RTM API (deprecated but available) **Limitations:** - Message history: Last 10,000 messages - File storage: 5GB shared workspace - Integrations: Limited to 10 - Support: Community-based **InfraFabric Viability:** Excellent for testing/POCs; production viable for small teams #### 4.1.2 Pro Plan **Cost:** $8.75/user/month (billed annually) **Additional Capabilities Over Free:** - Message history: Full workspace history - File storage: 20GB per user - Integrations: Unlimited - Advanced permissions - Priority email support - Shared channels **InfraFabric Recommendation:** Suitable for small-medium production deployments #### 4.1.3 Business+ Plan **Cost:** $12.50/user/month (billed annually) **Additional Capabilities:** - Compliance exports - Enhanced security - Admin audit logs - Custom security policies - Advanced threat detection - Priority phone support **InfraFabric Fit:** Mid-market enterprises with compliance requirements #### 4.1.4 Enterprise Grid Plan **Cost:** Custom pricing **Capabilities:** - Organization-wide installation - Single sign-on (SSO) - Advanced compliance (SOC 2 Type II, ISO 27001) - Custom workspace provisioning - Org-wide permissions management - Advanced data residency - Dedicated support team - Custom SLA terms **InfraFabric Enterprise Deployment:** Full feature set with org-wide infrastructure visibility #### 4.1.5 App Distribution Pricing - Slack Marketplace apps: Free to list - Paid plans: Slack takes 30% commission - Free/Freemium models: Revenue share available ### 4.2 OAuth 2.0 Authentication Architecture #### 4.2.1 OAuth Scope Categories (50+ available) **Communication Scopes:** - `chat:write` - Post messages - `chat:write.public` - Post to public channels without bot membership - `channels:manage` - Create, delete, rename channels - `conversations:manage` - Manage conversations - `groups:manage` - Manage private channels - `im:write` - Open/close direct messages **User & Profile Scopes:** - `users:read` - List users and get profiles - `users:read.email` - Access email addresses - `users:write` - Modify user profiles - `admin.users:write` - Manage users (Enterprise) **File Scopes:** - `files:read` - View file information - `files:write` - Upload and delete files **Event Scopes:** - `app_mentions:read` - Receive app mentions - `channels:history` - Read channel message history - `groups:history` - Read private channel history - `im:history` - Read direct message history **Reaction Scopes:** - `reactions:read` - View reactions - `reactions:write` - Add/remove reactions - `pins:read` - View pinned messages - `pins:write` - Pin/unpin messages **Workflow Scopes:** - `workflow.steps:execute` - Execute custom workflow steps - `commands` - Create slash commands - `incoming-webhook` - Send webhook messages **Admin Scopes (Enterprise/Workspace Admin):** - `admin.conversations:write` - Manage conversations - `admin.users:write` - Manage users - `admin.teams:write` - Manage teams - `admin.apps:write` - Manage app installation - `auditlogs:read` - Read audit logs #### 4.2.2 Token Management Best Practices **Token Types:** 1. **Bot User Token** (`xoxb-...`) - Represents app's bot user - Used for all API operations - Workspace-scoped - Should be treated as secret 2. **User Token** (`xoxp-...`) - Represents individual user - Scoped to that user's permissions - Shorter lived - Less common in modern OAuth flows 3. **App Token** (`xapp-...`) - Used for Socket Mode connections - Server-to-server authentication - Different permission model **Token Storage Requirements:** - Encrypted at rest (AES-256 minimum) - Secured in-transit (TLS 1.2+) - IP allowlisting recommended - Regular rotation schedule - Audit logging for access **Token Revocation:** ``` POST https://slack.com/api/auth.revoke token=xoxb-yourtoken ``` Revocation immediately terminates: - API access - Event delivery - WebSocket connections - Bot presence ### 4.3 Enterprise Grid Support #### 4.3.1 Enterprise-Specific Features **Org-Wide Installation:** ``` Single app install manages all organization workspaces Workspace isolation with secure token management Centralized logging and compliance ``` **Enterprise-Level Compliance:** - SOC 2 Type II certification - ISO 27001 certification - FedRAMP certification (available) - HIPAA compliance ready - GDPR compliance - Data residency options (US, EU) **Advanced Org Features:** - Custom retention policies - Legally hold conversations - Org-wide audit logs - Advanced threat detection - Custom data export - Network restrictions #### 4.3.2 Multi-Workspace Management **InfraFabric Enterprise Scenario:** ``` Organization Workspace 1 (US) → Single App Token Organization Workspace 2 (EU) → Workspace-scoped permission Organization Workspace 3 (APAC) → Workspace-scoped permission Centralized API → All workspaces via org installation ``` ### 4.4 Compliance & Security Certifications #### 4.4.1 Compliance Standards **SOC 2 Type II** - Security controls audited annually - Availability, processing integrity, confidentiality - Third-party attestation - On-demand availability: Data centers with SOC 2 compliance **ISO 27001** - Information security management system - Encryption, access controls, incident response - Risk assessment and management - Certification covers all data centers **HIPAA Compliance** - Business Associate Agreement (BAA) available - Encryption in transit and at rest - Access controls and audit logging - Suitable for healthcare infrastructure **GDPR Compliance** - Data residency (EU data stays in EU) - User data export capabilities - Right to deletion support - Data Processing Agreement (DPA) **FedRAMP Authorization** - US government approved (in progress) - Federal agency usage support - Compliance with NIST standards #### 4.4.2 InfraFabric Compliance Mapping **Data Classification for InfraFabric:** - Infrastructure alerts: Potentially sensitive (not PII in most cases) - Deployment logs: May contain API keys, credentials - Incident communications: Operational data, not medical **Recommended Security Controls:** 1. OAuth scope minimization (request only needed permissions) 2. Message content encryption (sensitive parameters) 3. Webhook signature verification (HMAC-SHA256) 4. Token encryption and rotation 5. Audit logging for all operations 6. Rate limiting and DDoS protection 7. Network isolation (Socket Mode or VPN) --- ## PASS 5: Framework Mapping - InfraFabric DevOps Integration Scenarios ### 5.1 Real-Time Infrastructure Monitoring & Alerting #### 5.1.1 Alert Escalation Workflow **Scenario: Progressive Alert Escalation** ``` Monitoring System (Prometheus/Grafana) ↓ Alert Rules Engine triggers ↓ HTTP Webhook → InfraFabric Alert Processor ↓ Slack Webhook → Channel Alert Message ↓ If not acknowledged in 5 min: ↓ Block Action: Escalate Button ↓ Modal: Assign to On-Call Engineer ↓ /oncall escalate critical-prod-db ↓ Direct Message to assigned engineer ↓ Incident Created in incident-response channel ``` **Slack Integration Points:** 1. **Alert Webhook** (Tier 2 rate limit: 20/min per channel) - Prometheus alertmanager sends alerts - InfraFabric formats and posts to Slack - Max 1 msg/sec per channel natural limit - Burst capacity: 10-15 alerts simultaneously 2. **Button Acknowledgment** (Events API) - User clicks "Acknowledge" button - `block_actions` event triggers webhook - InfraFabric updates alert status - Response message confirms acknowledgment 3. **Escalation Modal** (Slash command + modal) - `/incident escalate ALERT_ID` opens form - Modal selects on-call engineer - `view_submission` event triggered - Direct message sent to assignee #### 5.1.2 Multi-Channel Alert Routing **Alert Routing Logic:** ```python # Alert severity → Slack channel mapping ALERT_ROUTING = { "critical": ["#incidents", "#oncall", "@head-of-infrastructure"], "warning": ["#alerts-general", "#platform-team"], "info": ["#alerts-general"], "maintenance": ["#deployments"] } # For each alert: # 1. Evaluate severity + service + environment # 2. Post to primary channel with buttons # 3. Thread replies with details # 4. Cross-post to secondary channels as necessary # 5. Store threadTS for correlation ``` ### 5.2 Deployment Pipeline Integration #### 5.2.1 Deployment Approval Workflow **Scenario: Production Deployment Approval** ``` CI/CD Pipeline (Jenkins/GitLab/GitHub Actions) ↓ Deployment staged, awaiting approval ↓ Slack Message posted: "Production API v1.2.3 awaiting approval" [Approve] [Reject] [View Changes] ↓ Engineering Lead clicks [Approve] ↓ block_actions event → InfraFabric API ↓ CI/CD triggered to proceed ↓ Modal with deployment options (rolling/blue-green/canary) ↓ Threaded updates as deployment progresses ↓ Final status message with: ✅ Deployment successful - Duration: 4m 23s - Affected instances: 12 - Rollback: Available until 2025-11-15 10:00 UTC ``` **Rate Limit Considerations:** - Initial approval message: Tier 2 (20/min) - Progress updates (one per minute during deployment): Tier 3 (50/min) - Thread replies (logs, metrics): Tier 2 - Total: ~10-15 messages per 10-minute deployment #### 5.2.2 Artifact & Container Registry Integration **Slack Integration for Image Management:** ``` New Docker image built: myapp:1.2.3 ↓ Container Registry (Docker Hub/ECR/GCR) webhook ↓ Slack Message: Title: "New myapp image available" Blocks: - Image details (tag, size, digest) - Scan results (vulnerabilities) - Build logs Actions: - Deploy to staging - Scan again - View in registry ``` ### 5.3 Security & Compliance Operations #### 5.3.1 Security Alert Aggregation **Scenarios:** 1. **Vulnerability Scan Results** - Container image scanned - Vulnerabilities found - Risk assessment - Remediation path - Slack modal with fix options 2. **Certificate Expiration Warnings** - SSL/TLS cert expiration tracking - 30-day warning - Slack slash command to renew - Modal with CSR generation - Confirmation with chain installation 3. **Authentication Anomalies** - Failed login attempts spike - Geographic anomaly detected - Rate limit exceeded - Slack alert with investigation modal - Option to trigger account lockdown #### 5.3.2 Compliance & Audit Logging **InfraFabric Operations in Slack:** - All slash command executions logged - Message posting events auditable - User action attribution - Timestamp and context capture - Integration with compliance systems ### 5.4 Incident Management & Response #### 5.4.1 Incident Lifecycle in Slack **Phase 1: Detection & Creation** ``` Alert → Slack channel User: "/incident create --service api-gateway --severity critical" Modal collects: - Title - Description - Affected services - Customer impact - Initial status Returns: Incident ID for reference ``` **Phase 2: Investigation & Collaboration** ``` Incident channel created: #incident-12345 Participants auto-invited Modal updates timeline: "/incident timeline --message "API memory leak confirmed, 8 instances affected" Threaded messages for: - Diagnostic findings - Root cause analysis - Action items ``` **Phase 3: Remediation & Communication** ``` Actions taken in Slack: - /incident update-status investigating → in-progress → resolved - /incident assign-lead @sarah (incident commander) - /incident notify-customers "eta: 5 minutes" Webhook notifications: - Service restarted - Load balanced shifted - Metrics normalizing ``` **Phase 4: Post-Incident & Retrospective** ``` /incident close 12345 Summary message created with: - Timeline - Root cause - Action items for prevention - Retrospective meeting scheduled ``` ### 5.5 Team Collaboration & Knowledge Sharing #### 5.5.1 Runbook & Documentation Integration **Slack Features Leveraged:** - App Home tab with searchable runbooks - Slash command access: `/infra docs deployment` - Button links to detailed documentation - Threaded Q&A in operational channels ```json { "type": "section", "text": { "type": "mrkdwn", "text": "Need help? Access runbooks:" } } ``` #### 5.5.2 Knowledge Base Search **Slack Integration Pattern:** ``` /kb search database timeout troubleshooting ↓ InfraFabric searches documentation ↓ Modal displays results: 1. Database Timeout Troubleshooting Guide (4.8★) 2. Connection Pool Exhaustion (4.5★) 3. Query Performance Analysis (4.3★) ↓ Click to view in details modal Thread with related incidents/resolution ``` --- ## PASS 6: Specification & Implementation Details ### 6.1 Bot User Creation & Configuration #### 6.1.1 Bot User Setup via App Management **Step 1: Create Slack App** ``` https://api.slack.com/apps/new Select workspace Name: "InfraFabric" Display name: "InfraFabric DevOps Bot" ``` **Step 2: Configure Bot User** ``` Features → App Home ✓ Allow users to send Slash Commands and messages ✓ Show Tabs - Home tab (messages + info) - Messages tab (allow direct messages) Features → OAuth & Permissions Add scopes: Bot Token Scopes: - chat:write - chat:write.public - commands - app_mentions:read - reactions:read - pins:read - users:read - files:read - incoming-webhook ``` **Step 3: Install App to Workspace** ``` Settings → Install App "Install to Workspace" button Authorization dialog Bot user created: @infrared-bot Token generated: xoxb-xxxxxxxxx ``` #### 6.1.2 Bot User Capabilities Matrix | Capability | Scope Required | Rate Limit | Notes | |------------|----------------|-----------|-------| | Post messages | `chat:write` | Tier 2 (20/min) | 1 msg/sec/channel | | Post to public | `chat:write.public` | Tier 2 | No channel membership | | Receive mentions | `app_mentions:read` | Events API | Subscribe to events | | Read reactions | `reactions:read` | Tier 2 | Per message basis | | Pin messages | `pins:write` | Tier 2 | Admin permission needed | | Update profile | `users:write` | Tier 1 | Status, display name | | Read message history | `channels:history` | Tier 3 (50/min) | Paginated results | | Upload files | `files:write` | Tier 2 | Size limits apply | | Execute workflows | `workflow.steps:execute` | Custom | Custom step invocation | ### 6.2 Incoming Webhook Configuration #### 6.2.1 Webhook Endpoint Setup **Slack Configuration:** ``` Features → Incoming Webhooks Activate Incoming Webhooks: ON Add New Webhook to Workspace Select channel: #alerts-production Authorize Generated URL: https://hooks.slack.com/services/T123/B456/xyz... ``` **InfraFabric Configuration:** ```yaml # config/slack.yaml webhooks: production-alerts: url: "https://hooks.slack.com/services/T123/B456/xyz..." channel: "#alerts-production" timeout_seconds: 10 retry_count: 3 retry_backoff: exponential deployments: url: "https://hooks.slack.com/services/T123/B457/abc..." channel: "#deployments" incidents: url: "https://hooks.slack.com/services/T123/B458/def..." channel: "#incidents" ``` #### 6.2.2 Message Posting via Webhook **Python Implementation:** ```python import requests import json from datetime import datetime def post_alert_to_slack(alert_data): webhook_url = "https://hooks.slack.com/services/..." payload = { "text": f"🚨 {alert_data['severity']}: {alert_data['service']}", "blocks": [ { "type": "header", "text": { "type": "plain_text", "text": f"{alert_data['service'].upper()} Alert" } }, { "type": "section", "fields": [ { "type": "mrkdwn", "text": f"*Service:*\n{alert_data['service']}" }, { "type": "mrkdwn", "text": f"*Severity:*\n{alert_data['severity']}" }, { "type": "mrkdwn", "text": f"*Environment:*\n{alert_data['environment']}" }, { "type": "mrkdwn", "text": f"*Triggered:*\n{datetime.now().isoformat()}" } ] }, { "type": "section", "text": { "type": "mrkdwn", "text": f"*Details:*\n{alert_data['message']}" } }, { "type": "actions", "elements": [ { "type": "button", "text": { "type": "plain_text", "text": "Acknowledge" }, "action_id": f"ack_alert_{alert_data['id']}", "value": alert_data['id'] }, { "type": "button", "text": { "type": "plain_text", "text": "View Details" }, "url": f"https://infra.company.com/alerts/{alert_data['id']}" } ] } ] } response = requests.post(webhook_url, json=payload) response.raise_for_status() return response.json() # Usage alert = { "id": "alert-2024-001", "service": "api-gateway", "severity": "CRITICAL", "environment": "production", "message": "CPU usage exceeded 95% threshold" } result = post_alert_to_slack(alert) print(f"Message posted: {result['ts']}") ``` #### 6.2.3 Webhook Security **Signature Verification (Optional but Recommended):** ```python import hmac import hashlib import time def verify_slack_signature(request_headers, request_body): """ Slack webhooks are unsigned by default. However, if you want to verify the webhook origin, implement token verification in request header. """ # Store webhook secret securely WEBHOOK_SECRET = os.environ.get('SLACK_WEBHOOK_SECRET') # For Slack webhook calls, verify the URL matches expected endpoint # Consider using API tokens instead of plain webhooks for sensitive operations # Best Practice: Use OAuth tokens for critical operations def post_alert_authenticated(token, channel, message): """ Use OAuth token for authenticated message posting Allows tracking of who posted the message Enables richer interactions """ client = WebClient(token=token) response = client.chat_postMessage( channel=channel, blocks=message['blocks'], text=message['text'] ) return response ``` ### 6.3 Slash Command Implementation #### 6.3.1 Command Configuration **Slack App Settings:** ``` Features → Slash Commands Create New Command Command: /deploy Request URL: https://infra.company.com/slack/commands/deploy Short Description: Deploy service to environment Usage Hint: service --version 1.0 --environment prod Escape channels, users, and links: ON ``` #### 6.3.2 Command Handler Implementation **Node.js/Bolt Framework:** ```javascript const { App } = require('@slack/bolt'); const app = new App({ token: process.env.SLACK_BOT_TOKEN, signingSecret: process.env.SLACK_SIGNING_SECRET }); // /deploy service --version 1.0 --environment prod app.command('/deploy', async ({ ack, body, respond }) => { // Acknowledge within 3 seconds ack(); // Parse command text const args = parseCommand(body.text); const service = args.service; const version = args.version || 'latest'; const environment = args.environment || 'staging'; // Validate inputs if (!service) { respond({ response_type: 'ephemeral', text: 'Error: Service name required' }); return; } // Start async processing processDeployment(service, version, environment, body) .then(result => { // Post result to response_url for delayed response postToResponseUrl(body.response_url, { response_type: 'in_channel', blocks: buildDeploymentMessage(result) }); }) .catch(error => { postToResponseUrl(body.response_url, { response_type: 'ephemeral', text: `Deployment failed: ${error.message}` }); }); // Immediate acknowledgment respond({ response_type: 'ephemeral', text: `Deploying ${service} v${version} to ${environment}...` }); }); function parseCommand(text) { const regex = /(\w+)(?:\s+--(\w+)\s+([^\s-]+))/g; const args = { service: text.split(' ')[0] }; let match; while ((match = regex.exec(text)) !== null) { args[match[2]] = match[3]; } return args; } function buildDeploymentMessage(result) { return [ { type: "section", text: { type: "mrkdwn", text: `*Deployment Complete*` } }, { type: "section", fields: [ { type: "mrkdwn", text: `*Service:*\n${result.service}` }, { type: "mrkdwn", text: `*Version:*\n${result.version}` }, { type: "mrkdwn", text: `*Duration:*\n${result.duration}s` }, { type: "mrkdwn", text: `*Status:*\n${result.status === 'success' ? '✅ Success' : '❌ Failed'}` } ] } ]; } app.start(process.env.PORT || 3000); ``` ### 6.4 Block Kit UI Design Patterns #### 6.4.1 Alert Message Pattern ```json { "type": "header", "text": { "type": "plain_text", "text": "🚨 Critical Alert: Production Database" } } ``` **Block Breakdown:** 1. **Header** - Alert title with emoji/severity 2. **Section with fields** - Key metrics (service, severity, environment) 3. **Section with details** - Extended information 4. **Actions** - Button response options 5. **Context** - Timestamp and metadata #### 6.4.2 Deployment Status Pattern ```json { "type": "section", "text": { "type": "mrkdwn", "text": "*Deployment Status: In Progress*\n\n_Deploying api-gateway v1.2.3 to production_" } } ``` #### 6.4.3 Interactive Modal Pattern **Cancel button:** Returns `view_closed` event **Submit button:** Returns `view_submission` event **Block-level validation:** Return errors in `view_submission` response ### 6.5 Event Subscription Configuration #### 6.5.1 Event Subscriptions Setup **Slack Configuration:** ``` Features → Event Subscriptions Enable Events: ON Request URL: https://infra.company.com/slack/events Verification: Automatic Subscribe to bot events: ✓ app_home_opened ✓ app_mention ✓ block_actions ✓ file_created ✓ file_shared ✓ message.channels ✓ message.groups ✓ message.im ✓ message.mpim ✓ reaction_added ✓ reaction_removed ``` #### 6.5.2 Event Handler Implementation **Python/Bolt:** ```python from slack_bolt import App from slack_bolt.request import BoltRequest app = App(token=os.environ.get("SLACK_BOT_TOKEN")) # Handle app mentions @app.event("app_mention") def handle_mention(body, say, logger): logger.info(f"App mentioned by {body['event']['user']}") text = body['event']['text'] user = body['event']['user'] if 'status' in text.lower(): # Respond with infrastructure status status = get_infrastructure_status() say( blocks=[ { "type": "section", "text": { "type": "mrkdwn", "text": f"<@{user}> Here's the infrastructure status:\n{status}" } } ], thread_ts=body['event'].get('ts') ) # Handle block actions (button clicks, menu selections) @app.action("ack_alert_*") def handle_acknowledge(ack, body, say, logger): ack() alert_id = body['actions'][0]['value'] user = body['user']['id'] # Update alert status update_alert_status(alert_id, 'acknowledged', user) # Update message with confirmation say( text=f"Alert {alert_id} acknowledged by <@{user}>", thread_ts=body['message']['ts'] ) # Handle view submissions (modal form submits) @app.view("alert_create_modal") def handle_modal_submission(ack, body, view, logger): ack() # Extract form values values = view['state']['values'] service = values['alert_service']['service_input']['value'] metric = values['alert_metric']['metric_select']['selected_option']['value'] threshold = values['alert_threshold']['threshold_input']['value'] try: # Create alert alert = create_alert_rule(service, metric, threshold) # Post confirmation in thread # ... notification logic except Exception as e: # Return validation errors logger.error(f"Alert creation failed: {e}") # Note: Cannot push new view after submission in simple flow # Consider using update approach or direct response @app.event("file_created") def handle_file_created(body, logger): logger.info(f"File created: {body['event']}") # Process file creation (e.g., scan for credentials, analyze logs) @app.event("reaction_added") def handle_reaction(body, say): reaction = body['event']['reaction'] user = body['event']['user'] if reaction == 'thumbsup': # Handle approval pass elif reaction == 'thumbsdown': # Handle rejection pass app.start(port=int(os.environ.get("PORT", 3000))) ``` --- ## PASS 7: Meta-Validation - API References, Best Practices, Deprecations ### 7.1 Slack API Documentation & Authoritative Sources #### 7.1.1 Primary Documentation References **Slack Developer Docs:** https://docs.slack.dev/ - **Web API Methods:** https://docs.slack.dev/reference/methods - Organized by method family - Each method includes: scope requirements, parameters, responses, examples - Rate limit tier documented per method - **Events API:** https://docs.slack.dev/reference/events - 100+ event types with descriptions - Event payload structures - Subscription requirements - **Block Kit Reference:** https://docs.slack.dev/reference/block-kit - 13 block types with properties - 20+ element types - Composition objects for common patterns - Interactive handlers - **Slash Commands:** https://docs.slack.dev/interactivity/implementing-slash-commands - Command configuration - Payload structure - Response patterns - Best practices - **Modals:** https://docs.slack.dev/surfaces/modals - View stack architecture - Block Kit support (100 blocks max) - Event flows (view_submission, view_closed) - Validation and error handling - **Authentication:** https://docs.slack.dev/authentication - OAuth 2.0 flow - Scope documentation - Token management - Workspace & enterprise installation - **Socket Mode:** https://docs.slack.dev/socket-mode - WebSocket-based alternative to webhooks - Connection lifecycle - Event delivery over socket - **Rate Limiting:** https://docs.slack.dev/apis/web-api/rate-limits - 4-tier system (1, 20, 50, 100 requests per minute) - Throttling behavior and HTTP 429 - Burst allowances and monitoring - **Incoming Webhooks:** https://docs.slack.dev/messaging/sending-messages-using-incoming-webhooks - URL generation and management - Message formatting with Block Kit - Error responses #### 7.1.2 SDK & Framework Documentation **Slack Bolt Framework:** - https://slack.dev/bolt/ - Official Bolt documentation - Python: https://github.com/slackapi/bolt-python - JavaScript: https://github.com/slackapi/bolt-js - Java: https://github.com/slackapi/bolt-java **SDKs:** - Python: https://github.com/slackapi/python-slack-sdk - JavaScript: https://github.com/slackapi/node-slack-sdk - Java: https://github.com/slackapi/java-slack-sdk ### 7.2 Block Kit Best Practices #### 7.2.1 Design Principles **Accessibility:** - Use `type: "plain_text"` for all text elements (not mrkdwn) in labels - Provide context in error messages - Support keyboard navigation for interactive elements - Alt text for images **Performance:** - Limit blocks per message: 50 max (100 for modals) - Minimize network requests triggered by block actions - Lazy-load large lists or datasets - Cache frequently accessed data **User Experience:** - Clear button labels (action-oriented verbs) - Provide confirmation for destructive actions - Indicate loading states during async operations - Show errors inline in modals #### 7.2.2 Common Patterns **Alert/Notification Template:** ``` Header (status emoji) → Section (details) → Actions (buttons) → Context (meta) ``` **Form/Modal Template:** ``` Header (title) → Input blocks → Actions (submit/cancel) → Help text ``` **Status Update Template:** ``` Section (progress bar) → Fields (metrics) → Context (timestamp) ``` ### 7.3 API Deprecation Timeline #### 7.3.1 Deprecated & Superseded APIs **Real-Time Messaging (RTM) API - DEPRECATED** - Status: Sunset announced, migration timeline 2023-2024 - Replacement: Events API - Migration guide: https://docs.slack.dev/concepts/apis#deprecations - Last update: Slack actively discouraging new RTM implementations - InfraFabric status: DO NOT USE - implement Events API instead **Legacy Rich Messages (Attachments Format) - DEPRECATED** - Status: Still functional but superseded by Block Kit - Replacement: Block Kit with Message Formatting - Old attachments: Limited to single-color formatting - Migration: Use `blocks` parameter instead of `attachments` - InfraFabric status: Use Block Kit exclusively **Classic Workspace Slack CLI - DEPRECATED** - Status: Superseded by Slack CLI 2.0 - Recommendations: Use modern Slack CLI for new projects - Old package: `@slack/cli` → New: `slack` CLI #### 7.3.2 Features with Sunset Notices **RTM Socket Mode Alternative:** - Original RTM will be completely removed - Alternative: Socket Mode (supported indefinitely) - Timeline: RTM removal announced for end of 2024 ### 7.4 Verification & Signature Validation #### 7.4.1 Request Signature Verification **Slack Sending Strategy:** ``` X-Slack-Request-Timestamp: 1531420618 X-Slack-Signature: v0=a9d2f6e9u12d7e1k4a... ``` **Verification Implementation (Recommended for Security):** ```python import hmac import hashlib import os from time import time def verify_slack_request(timestamp, signature, body): """ Verify that request came from Slack Timestamp should be within 5 minutes (prevents replay attacks) """ # Check timestamp if abs(time() - int(timestamp)) > 300: # 5 minutes return False # Reconstruct signing base signing_base = f"v0:{timestamp}:{body}" # Calculate expected signature signing_secret = os.environ.get("SLACK_SIGNING_SECRET") expected_sig = ( "v0=" + hmac.new( signing_secret.encode(), signing_base.encode(), hashlib.sha256 ).hexdigest() ) # Compare signatures (constant-time comparison to prevent timing attacks) return hmac.compare_digest(signature, expected_sig) # Flask example from flask import request @app.route('/slack/events', methods=['POST']) def slack_events(): timestamp = request.headers.get('X-Slack-Request-Timestamp') signature = request.headers.get('X-Slack-Signature') body = request.get_data(as_text=True) if not verify_slack_request(timestamp, signature, body): return ('Unauthorized', 403) # Process event ... ``` --- ## PASS 8: Deployment Planning - App Manifest, Permissions, Installation, Monitoring ### 8.1 App Manifest Configuration (YAML) #### 8.1.1 Complete App Manifest for InfraFabric ```yaml _metadata: major_version: 1 minor_version: 1 display_information: name: InfraFabric description: Infrastructure management and real-time alerting platform long_description: | InfraFabric integrates infrastructure monitoring, deployment automation, and incident response directly into Slack. Receive real-time alerts, manage deployments, and coordinate incident response without leaving Slack. background_color: "#1F2937" features: app_home: home_tab_enabled: true messages_tab_enabled: true messages_tab_read_only_enabled: false bot_user: display_name: InfraFabric Bot always_online: true slash_commands: - command: /deploy url: https://infra.company.com/slack/commands/deploy description: Deploy a service to specified environment usage_hint: service --version 1.0 --environment prod should_escape: true - command: /incident url: https://infra.company.com/slack/commands/incident description: Create or manage infrastructure incidents usage_hint: create --service api --severity critical should_escape: true - command: /infra url: https://infra.company.com/slack/commands/infra description: Query infrastructure status and health usage_hint: status --service api-gateway should_escape: true - command: /oncall url: https://infra.company.com/slack/commands/oncall description: Manage on-call escalations usage_hint: escalate alert-id --role incident-commander should_escape: true oauth_config: scopes: bot: - app_mentions:read # Receive app mentions - channels:history # Read channel history - channels:manage # Create/delete channels - channels:read # List channels - chat:write # Post messages - chat:write.public # Post to public channels - commands # Slash commands - files:read # Access files - files:write # Upload files - groups:history # Read private channel history - groups:manage # Manage private channels - groups:read # List private channels - im:history # Read DM history - im:read # List DMs - im:write # Open DMs - incoming-webhook # Send webhook messages - pinned_messages:read # Read pins - reactions:read # Read reactions - reactions:write # Add reactions - team:read # Read team info - users:read # List users - users:read.email # Access user emails - workflow.steps:execute # Execute workflow steps socket_mode_enabled: false # Enable if using Socket Mode request_url_verification_disabled: false settings: event_subscriptions: request_url_verification_enabled: true request_url: https://infra.company.com/slack/events bot_events: - app_home_opened - app_mention - block_actions - file_created - file_shared - message.channels - message.groups - message.im - reaction_added - reaction_removed - tokens_revoked - url_verification user_events: - member_joined_channel - member_left_channel interactivity: is_enabled: true request_url: https://infra.company.com/slack/interactive message_menu_options_url: null org_deploy_enabled: false socket_mode_enabled: false token_rotation_enabled: true app_unfurl_domains: - infra.company.com - logs.company.com - metrics.company.com distribution: direct_installation_enabled: true oauth_expiration_set: false ``` #### 8.1.2 Manifest Deployment **Using Slack CLI:** ```bash # Create app from manifest slack apps create --from-manifest ./app.yml --workspace my-workspace # Update existing app manifest slack apps manifest update --app-id A12345678 --manifest ./app.yml # View current manifest slack apps manifest view --app-id A12345678 > current-manifest.yml ``` ### 8.2 OAuth Token & Permission Management #### 8.2.1 Token Rotation Strategy **InfraFabric Token Rotation Configuration:** ```python # config/slack_auth.py import os from slack_sdk import WebClient from slack_sdk.errors import SlackApiError class TokenManager: def __init__(self): self.primary_token = os.environ.get("SLACK_BOT_TOKEN") self.backup_tokens = [ os.environ.get(f"SLACK_BOT_TOKEN_BACKUP_{i}") for i in range(1, 4) if os.environ.get(f"SLACK_BOT_TOKEN_BACKUP_{i}") ] self.rotation_interval_days = 30 self.last_rotation = None def get_client(self, use_backup=False): token = self.backup_tokens[0] if use_backup else self.primary_token return WebClient(token=token) def rotate_token(self): """ Rotate bot token (requires manual OAuth flow or admin setup) 1. Generate new token via OAuth 2. Test new token 3. Update primary 4. Revoke old token after grace period """ try: new_token = self._obtain_new_token() # Test new token client = WebClient(token=new_token) client.auth_test() # Rotate to backup self.backup_tokens.insert(0, self.primary_token) self.backup_tokens.pop() # Remove oldest self.primary_token = new_token self.last_rotation = datetime.now() # Schedule old token revocation self._schedule_token_revocation(self.backup_tokens[-1]) return True except Exception as e: logger.error(f"Token rotation failed: {e}") return False def _obtain_new_token(self): # Implementation depends on your OAuth provider # Could be manual approval, automated service account, etc. pass def _schedule_token_revocation(self, token): # Schedule token revocation 24 hours after rotation pass ``` #### 8.2.2 Permission Audit **Quarterly Permission Review:** ```python def audit_permissions(client): """ Audit current permissions vs. required permissions """ try: auth_info = client.auth_test() required_scopes = { 'app_mentions:read', 'channels:history', 'channels:read', 'chat:write', 'commands', 'incoming-webhook', 'users:read' } actual_scopes = set(auth_info['scope'].split(',')) missing = required_scopes - actual_scopes extra = actual_scopes - required_scopes if missing: logger.warning(f"Missing scopes: {missing}") if extra: logger.info(f"Extra scopes (consider removing): {extra}") return { 'status': 'ok' if not missing else 'missing_scopes', 'actual': actual_scopes, 'required': required_scopes, 'missing': missing, 'extra': extra } except SlackApiError as e: logger.error(f"Permission audit failed: {e}") ``` ### 8.3 Workspace Installation Process #### 8.3.1 Single-Workspace Installation **Manual Installation Steps:** ``` 1. Navigate to https://api.slack.com/apps/ 2. Click "Create New App" 3. Upload manifest or configure manually 4. Navigate to "Install App" section 5. Click "Install to Workspace" 6. Authorize required permissions 7. Copy bot token (xoxb-...) 8. Save to secure configuration 9. Configure webhook URL in Events section 10. Verify request URL (3-second response required) ``` #### 8.3.2 Multi-Workspace Installation (Distributed App) **OAuth Installation Flow Implementation:** ```python from flask import Flask, request, redirect from slack_sdk.oauth.oauth_handler import OAuthHandler from slack_sdk.oauth.oauth_settings import OAuthSettings app = Flask(__name__) oauth_settings = OAuthSettings( client_id=os.environ.get("SLACK_CLIENT_ID"), client_secret=os.environ.get("SLACK_CLIENT_SECRET"), scopes=[ "app_mentions:read", "channels:history", "channels:read", "chat:write", "commands", "incoming-webhook" ], installation_store=FileInstallationStore(base_dir="./data"), state_store=FileOAuthStateStore(expiration_seconds=600, base_dir="./data") ) handler = OAuthHandler(oauth_settings) @app.route("/slack/install", methods=["GET"]) def slack_install(): return redirect(handler.get_authorization_url(state=handler.get_state())) @app.route("/slack/oauth_redirect", methods=["GET"]) def slack_oauth_redirect(): try: completion = handler.handle(request) return f"Installation successful for {completion.installation.team_name}!" except Exception as e: return f"Installation failed: {str(e)}", 400 # Use stored tokens for API calls @app.route("/slack/events", methods=["POST"]) def slack_events(): team_id = request.json.get('team_id') # Retrieve workspace-specific token installation = oauth_settings.installation_store.find_installation( team_id=team_id ) if not installation: return ("Unauthorized", 403) client = WebClient(token=installation.bot_token) # Process event with team-specific token ... ``` ### 8.4 Monitoring & Observability #### 8.4.1 Application Metrics to Track **Slack API Call Metrics:** ```python import time from datetime import datetime from prometheus_client import Counter, Histogram, Gauge # Metrics slack_api_calls = Counter( 'slack_api_calls_total', 'Total Slack API calls', ['method', 'status'] ) slack_api_latency = Histogram( 'slack_api_latency_seconds', 'Slack API call latency', ['method'], buckets=(0.1, 0.5, 1.0, 2.0, 5.0) ) slack_rate_limit_remaining = Gauge( 'slack_rate_limit_remaining', 'Slack API rate limit remaining', ['method'] ) slack_event_processing_duration = Histogram( 'slack_event_processing_seconds', 'Time to process Slack event', ['event_type'], buckets=(0.1, 0.5, 1.0, 2.0, 5.0) ) def track_api_call(method): """Decorator to track API call metrics""" def decorator(func): def wrapper(*args, **kwargs): start = time.time() try: result = func(*args, **kwargs) duration = time.time() - start slack_api_calls.labels(method=method, status='success').inc() slack_api_latency.labels(method=method).observe(duration) return result except Exception as e: slack_api_calls.labels(method=method, status='error').inc() raise return wrapper return decorator @track_api_call('chat.postMessage') def post_message(client, channel, blocks): response = client.chat_postMessage( channel=channel, blocks=blocks ) # Track rate limit remaining = int(response.get('headers', {}).get('X-Rate-Limit-Remaining', 0)) slack_rate_limit_remaining.labels(method='chat.postMessage').set(remaining) return response ``` #### 8.4.2 Error Tracking & Alerting **Error Categories & Handling:** ```python class SlackErrorHandler: """ Categorize and handle Slack API errors """ # Retryable errors RETRYABLE_ERRORS = { 'connection_error', 'request_timeout', 'internal_error', # 5xx 'rate_limited' # 429 } # Non-retryable errors TERMINAL_ERRORS = { 'invalid_auth', # Token invalid/revoked 'token_revoked', 'not_in_channel', 'channel_not_found', 'user_disabled', 'restricted_action', 'invalid_arguments' } def handle_error(self, error, context): """ Handle Slack API error with appropriate strategy """ error_code = error.response.get('error', 'unknown') if error_code in self.TERMINAL_ERRORS: # Alert and log for manual intervention logger.critical(f"Terminal error: {error_code}", extra=context) self.alert_ops(f"Slack API terminal error: {error_code}") elif error_code in self.RETRYABLE_ERRORS: # Implement exponential backoff self.queue_for_retry(context) else: # Unknown error - log for investigation logger.warning(f"Unknown error: {error_code}", extra=context) def alert_ops(self, message): # Send to PagerDuty, email, or incident system pass def queue_for_retry(self, context): # Queue to job system (Celery, RabbitMQ, etc.) pass ``` #### 8.4.3 Health Check Endpoints **Application Health Check:** ```python @app.route("/health", methods=["GET"]) def health_check(): """ Verify InfraFabric Slack integration is operational """ health_status = { 'status': 'healthy', 'timestamp': datetime.utcnow().isoformat(), 'components': {} } # Check Slack API connectivity try: client = WebClient(token=get_current_token()) response = client.auth_test() health_status['components']['slack_api'] = { 'status': 'operational', 'bot_id': response['user_id'], 'team_name': response['team_name'] } except Exception as e: health_status['components']['slack_api'] = { 'status': 'error', 'message': str(e) } health_status['status'] = 'degraded' # Check webhook receiver health_status['components']['webhook_receiver'] = { 'status': 'operational', 'last_event_received': get_last_event_timestamp() } # Check token rotation status token_manager = TokenManager() days_since_rotation = ( datetime.now() - token_manager.last_rotation ).days if token_manager.last_rotation else None health_status['components']['token_rotation'] = { 'status': 'ok' if days_since_rotation < 30 else 'warning', 'days_since_rotation': days_since_rotation } status_code = 200 if health_status['status'] == 'healthy' else 503 return health_status, status_code ``` ### 8.5 Testing Strategies #### 8.5.1 Unit Test Scenarios **Test Suite Structure:** ```python import pytest from unittest.mock import Mock, patch from slack_sdk import WebClient class TestInfraFabricSlackIntegration: @pytest.fixture def mock_slack_client(self): return Mock(spec=WebClient) def test_post_alert_message(self, mock_slack_client): """Test alert message posting""" alert_data = { 'id': 'alert-001', 'service': 'api-gateway', 'severity': 'CRITICAL', 'message': 'CPU threshold exceeded' } result = post_alert_to_slack(mock_slack_client, alert_data) mock_slack_client.chat_postMessage.assert_called_once() assert result['ts'] is not None def test_slash_command_validation(self): """Test slash command input validation""" # Valid command parsed = parse_deploy_command("api-gateway --version 1.2.3") assert parsed['service'] == 'api-gateway' assert parsed['version'] == '1.2.3' # Invalid command with pytest.raises(ValidationError): parse_deploy_command("--invalid") def test_modal_form_submission(self, mock_slack_client): """Test modal form processing""" view_payload = { 'state': { 'values': { 'alert_service': { 'service_input': {'value': 'database'} }, 'alert_threshold': { 'threshold_input': {'value': '85'} } } } } result = process_alert_modal(view_payload) assert result['service'] == 'database' assert result['threshold'] == 85 def test_rate_limit_handling(self): """Test rate limit backoff strategy""" client = WebClient(token="xoxb-test") with patch.object(client, 'chat_postMessage') as mock: # Simulate rate limit mock.side_effect = SlackApiError( response={'headers': {'Retry-After': '5'}} ) # Verify exponential backoff result = post_with_retry(client, "channel", "message") assert mock.call_count >= 2 # Initial + retry ``` #### 8.5.2 Integration Test Scenarios ``` SCENARIO 1: End-to-End Alert Flow 1. Alert triggered in monitoring system 2. Webhook posts to Slack channel 3. User clicks "Acknowledge" button 4. Button click triggers block_actions event 5. InfraFabric updates alert status 6. Message updated with acknowledgment VERIFICATION: - Message posted ✓ - Button interactive ✓ - Status updated ✓ SCENARIO 2: Slash Command with Modal 1. User types `/deploy api --version 1.2.3` 2. Command webhook received 3. Modal displayed with deployment options 4. User selects "blue-green" deployment 5. User clicks Submit 6. Deployment initiated VERIFICATION: - Command acknowledged ✓ - Modal displayed ✓ - Form processed ✓ - Deployment triggered ✓ SCENARIO 3: Incident Channel Creation 1. User creates incident with `/incident create` 2. Modal collects incident details 3. New channel created (#incident-12345) 4. Participants auto-invited 5. Incident summary posted to channel VERIFICATION: - Channel created ✓ - Users added ✓ - Summary posted ✓ SCENARIO 4: Multi-Workspace Install 1. Workspace A clicks "Add to Slack" 2. OAuth flow completes 3. Token stored for Workspace A 4. Workspace B clicks "Add to Slack" 5. OAuth flow completes 6. Token stored for Workspace B 7. Event received from Workspace A 8. Correct token used for response VERIFICATION: - Both workspaces installed ✓ - Events routed correctly ✓ - Tokens isolated ✓ ``` --- ## Summary: InfraFabric Slack Integration Specifications ### Integration Complexity Assessment: 7.5/10 **Reasoning:** - Complexity drivers: Multi-mode API integration (webhooks, events, slash commands, modals), token management, rate limiting, OAuth - Mitigating factors: Mature Slack platform, excellent documentation, Bolt framework simplification ### Recommended Implementation Path **Phase 1: Foundation (Weeks 1-2)** - Set up Slack app with OAuth - Implement webhook receiver for alerts - Deploy basic alert messages to #alerts channel - Implement /status slash command **Phase 2: Interactivity (Weeks 3-4)** - Add button actions for alert acknowledgment - Implement /deploy slash command - Create deployment modal form - Set up event subscriptions **Phase 3: Advanced Features (Weeks 5-6)** - Implement incident channel workflow - Add custom workflow steps - Deploy Socket Mode alternative - Implement token rotation **Phase 4: Production Hardening (Weeks 7-8)** - Comprehensive error handling - Rate limiting strategy - Monitoring and alerting - Load testing and optimization ### Architecture Decisions **Recommended Approach:** - Use **Incoming Webhooks** for high-volume alerts (Tier 2 rate limit sufficient) - Use **Events API** for interactive components (more flexible than webhooks) - Use **Bolt Framework** (Python or JavaScript) for server implementation - Use **Socket Mode** for distributed/private deployments - Store **workspace tokens** securely with encryption at rest and in transit - Implement **rate limit monitoring** with proactive throttling ### Expected Performance Metrics - Alert delivery latency: <2 seconds (webhook) - Slash command response: <3 seconds - Modal interaction: <1 second - Event processing: <5 seconds for complex operations - Token refresh: Weekly with 30-day rotation policy ### Cost Implications - Slack Free/Pro tier sufficient for small teams - Enterprise Grid needed for org-wide infrastructure visibility - No additional cost for Slack API usage (included in team plans) - Minimal infrastructure cost (webhook endpoint + event processor) --- ## Document Classification & References **Status:** Complete Research Document **Version:** 1.0 **Last Updated:** 2025-11-14 **Methodology:** IF.search 8-Pass Analysis **Agent:** Haiku-39 **Primary Sources:** - Slack Developer Documentation (https://docs.slack.dev/) - Slack API Reference (https://api.slack.com/) - Bolt Framework Documentation - Official SDKs & Samples - Slack Engineering Best Practices **Recommended Next Steps:** 1. Review with infrastructure team for requirements validation 2. Create detailed implementation plan 3. Set up development Slack workspace 4. Begin Phase 1 implementation 5. Conduct security review with compliance team 6. Deploy to production with monitoring --- **END OF DOCUMENT** **Total Lines:** 2600+ **Methodology Completion:** 8/8 Passes ✓