# Phase 0: Research Technical Foundations

**Feature**: 002-slack-bridge-integration
**Research Date**: 2025-10-22
**Status**: Complete

## Executive Summary

This document consolidates research on five critical technical areas for implementing the Slack↔Matrix bridge using mautrix-slack with Socket Mode on NixOS.

**Key Decisions**:
- ✅ Use Socket Mode (WebSocket) - no public endpoint needed
- ✅ Use App Login (official OAuth) for production stability
- ✅ Require 29 bot scopes + 1 app-level scope (`connections:write`)
- ✅ Use sops-nix flat key structure for Slack credentials
- ✅ Use automatic portal creation (no manual channel mapping)
- ✅ Leverage existing NixOS module, add secrets integration

---

## 1. Slack Socket Mode

### What is Socket Mode?

Socket Mode is Slack's **WebSocket-based protocol** (RFC 6455) that enables real-time event delivery without requiring a public HTTP endpoint.

**Connection Architecture**:
1. Application calls `apps.connections.open` API with app-level token (xapp-)
2. Slack responds with unique WebSocket URL: `wss://wss.slack.com/link/?ticket=...`
3. Application receives events over WebSocket (Events API, interactivity)
4. Application sends responses via standard Web API (HTTPS)

**Key Characteristics**:
- No public endpoint required (ideal for behind-firewall deployments)
- WebSocket URLs rotate dynamically (not static)
- Up to 10 concurrent connections allowed
- Events may be distributed across connections
- Rate limit: **1 WebSocket URL fetch per minute** (critical for reconnection)

### Token Requirements

**Two tokens required**:

| Token Type | Format | Purpose | Scope Required |
|------------|--------|---------|----------------|
| App-Level Token | `xapp-...` | Establish WebSocket connection | `connections:write` |
| Bot Token | `xoxb-...` | Perform API operations | 29+ bot scopes |

**Authentication Flow**:
1. Open Matrix DM with bridge bot (`@slackbot:clarun.xyz`)
2. Send command: `login app`
3. Provide both tokens when prompted
4. Bridge stores credentials in database, establishes Socket Mode connection

### Limitations and Trade-offs

**Technical Constraints**:
- WebSocket connections refresh every few hours (automatic reconnection)
- Backend container recycling causes occasional disconnects
- Rate-limited reconnections (1 request/minute maximum)
- Long-lived stateful connections (challenging to scale horizontally)

**Production Considerations**:
- ❌ Cannot publish to Slack Marketplace (HTTP required)
- ⚠️ Slack recommends HTTP for highest reliability
- ✅ Socket Mode recommended for: development, local testing, behind-firewall environments

**Why Socket Mode for ops-jrz1**:
1. VPS is private infrastructure (no public webhook complexity)
2. Small team use case (2-5 engineers, moderate message volume)
3. Security model favors minimal external exposure
4. Trade-off of slightly lower reliability is acceptable for non-critical team comms

### References

- [Socket Mode overview](https://docs.slack.dev/apis/events-api/using-socket-mode)
- [HTTP vs Socket Mode comparison](https://docs.slack.dev/apis/events-api/comparing-http-socket-mode)
- [mautrix-slack authentication](https://docs.mau.fi/bridges/go/slack/authentication.html)

---

## 2. Slack API Scopes

### Required Bot Token Scopes (29 total)

From [mautrix-slack app manifest](https://github.com/mautrix/slack/blob/main/app-manifest.yaml):

**Message Operations**:
- `chat:write` - Send messages as bot
- `chat:write.public` - Send to public channels without membership
- `chat:write.customize` - Customize bot username/avatar (for ghosting)

**Channel Access** (public channels):
- `channels:read`, `channels:history` - List and view messages
- `channels:write.invites`, `channels:write.topic` - Manage channels

**Private Channels** (groups):
- `groups:read`, `groups:history`, `groups:write`
- `groups:write.invites`, `groups:write.topic`

**Direct Messages**:
- `im:read`, `im:history`, `im:write`, `im:write.topic`
- `mpim:read`, `mpim:history`, `mpim:write`, `mpim:write.topic` (group DMs)

**User & Workspace**:
- `users:read`, `users.profile:read`, `users:read.email`
- `team:read`

**Rich Content**:
- `files:read`, `files:write`
- `reactions:read`, `reactions:write`
- `pins:read`, `pins:write`
- `emoji:read`

### Required App-Level Token Scopes (1 total)

- `connections:write` - Establish Socket Mode WebSocket connections

### Event Subscriptions (46 events)

The bridge subscribes to events including:
- Workspace: `app_uninstalled`, `team_domain_change`
- Channels: `channel_archive`, `channel_created`, `channel_deleted`, `channel_rename`, etc.
- Messages: `message.channels`, `message.groups`, `message.im`, `message.mpim`
- Interactions: `reaction_added`, `reaction_removed`, `pin_added`, `file_shared`, etc.
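
One way to catch a missing scope early is to compare what a bot token was actually granted against the scopes listed in this section. The sketch below assumes the granted scopes are available as a comma-separated string, such as the `x-oauth-scopes` header Slack returns on Web API responses. `REQUIRED_BOT_SCOPES` and `missing_scopes` are illustrative names, not part of mautrix-slack.

```python
# Bot scopes copied from the lists in this section (illustrative constant).
REQUIRED_BOT_SCOPES = frozenset({
    "chat:write", "chat:write.public", "chat:write.customize",
    "channels:read", "channels:history",
    "channels:write.invites", "channels:write.topic",
    "groups:read", "groups:history", "groups:write",
    "groups:write.invites", "groups:write.topic",
    "im:read", "im:history", "im:write", "im:write.topic",
    "mpim:read", "mpim:history", "mpim:write", "mpim:write.topic",
    "users:read", "users.profile:read", "users:read.email",
    "team:read",
    "files:read", "files:write",
    "reactions:read", "reactions:write",
    "pins:read", "pins:write",
    "emoji:read",
})


def missing_scopes(granted_header: str) -> list[str]:
    """Return required scopes absent from a comma-separated scope string."""
    granted = {s.strip() for s in granted_header.split(",") if s.strip()}
    return sorted(REQUIRED_BOT_SCOPES - granted)
```

An empty result means every scope listed here is present; anything else names the scopes to add in the Slack app configuration before reinstalling the app.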

### Security Best Practices

**Principle of Least Privilege**:
- Use all 29 scopes from mautrix-slack manifest (required for full functionality)
- Consider removing `conversations.connect:write` if not using Slack Connect

**Token Storage**:
- ✅ Production: Use sops-nix encrypted secrets
- ✅ Never commit tokens to version control
- ✅ Use 0440 permissions (readable by the service user and group only)

**Monitoring**:
- Enable IP allowlisting for token usage (Slack API feature)
- Monitor token usage via Slack app management dashboard
- Log all API calls for audit purposes

### References

- [Permission Scopes Reference](https://api.slack.com/scopes)
- [mautrix-slack app manifest](https://github.com/mautrix/slack/blob/main/app-manifest.yaml)

---

## 3. mautrix-slack Configuration

### Current Module Structure

**Location**: `/home/dan/proj/ops-jrz1/modules/mautrix-slack.nix`

**Configuration Generation** (two-stage):
1. **Root stage**: Creates directory structure (`/var/lib/mautrix_slack/config`)
2. **User stage**: Generates config from example template using `-e` flag, merges overrides

**Module Architecture**:

```nix
# Key configuration sections exposed:
matrix = {
  homeserverUrl = "http://127.0.0.1:8008";
  serverName = "clarun.xyz";
};

database = {
  type = "postgres";
  uri = "postgresql:///mautrix_slack?host=/run/postgresql";
  maxOpenConnections = 32;
  maxIdleConnections = 4;
};

appservice = {
  hostname = "127.0.0.1";
  port = 29319;
  id = "slack";
  senderLocalpart = "slackbot";
  userPrefix = "slack_";
};

bridge = {
  commandPrefix = "!slack";
  permissions = { "clarun.xyz" = "user"; };
};

encryption = {
  enable = true;    # Allow E2EE
  default = false;  # Don't enable by default
};

logging.level = "info";
```

**Missing from Module Options**:
- Slack-specific configuration (workspace, tokens)
- Socket Mode settings (bot token, app token injection)
- Channel mapping configuration

**Current Issue**: Module configured for "delpadtech" workspace, exits with code 11.
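
Because "exits with code 11" can be read two ways, it may help to pin down whether the process exited with status 11 or was killed by signal 11 (SIGSEGV). The following is a minimal sketch, assuming the return code comes either from Python's `subprocess` (negative values mean a signal) or from a shell (128 plus the signal number). `describe_returncode` is a hypothetical helper, not part of any bridge tooling.

```python
import signal


def _signal_name(number: int) -> str:
    """Map a signal number to its name, falling back to the raw number."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"signal {number}"


def describe_returncode(returncode: int) -> str:
    """Describe a process result using subprocess and shell conventions."""
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return f"killed by {_signal_name(-returncode)}"
    if returncode > 128:
        # Shells report death-by-signal as 128 + signal number.
        return f"killed by {_signal_name(returncode - 128)} (shell convention)"
    return f"exited with status {returncode}"
```

Under these conventions a value of `11` is an ordinary exit status, while `-11` or `139` indicate SIGSEGV. A program can also choose to exit with a status above 128 on its own, so the shell-convention branch is a heuristic rather than proof of a crash.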

### Socket Mode Configuration Requirements

Based on mautrix patterns, Socket Mode credentials are likely configured via:

**Option A: Interactive login** (current mautrix-slack approach)
- No config needed initially
- Bridge prompts for tokens via Matrix chat
- Stores in database after first login

**Option B: Declarative config** (would require module enhancement)

```yaml
slack:
  bot_token: "${BOT_TOKEN}"  # From environment or secrets
  app_token: "${APP_TOKEN}"  # From environment or secrets
```

**Decision**: Use **interactive login** approach (Option A) to avoid module modifications. Tokens provided via `login app` command in Matrix.

### Database Configuration

**Current Setup** (working correctly):

```nix
database = {
  type = "postgres";
  uri = "postgresql:///mautrix_slack?host=/run/postgresql";
};
```

**Provisioning** (from `modules/dev-services.nix`):

```nix
services.postgresql = {
  ensureDatabases = [ "mautrix_slack" ];
  ensureUsers = [{
    name = "mautrix_slack";
    ensureDBOwnership = true;
  }];
};
```

✅ No database configuration issues detected.

### Matrix Homeserver Integration

**Appservice Registration**:
- Generated at: `/var/lib/matrix-appservices/mautrix_slack_registration.yaml`
- Contains: `id`, `url`, `as_token`, `hs_token`, `namespaces`

**Missing Step**: Registration file must be loaded into conduwuit homeserver.

**Required Action**: Add to Matrix server configuration:

```toml
[[appservices]]
registration = "/var/lib/matrix-appservices/mautrix_slack_registration.yaml"
```

### Exit Code 11 Root Cause Analysis

**Exit code 11 vs. SIGSEGV**: This analysis initially equated exit code 11 with SIGSEGV (segmentation fault). SIGSEGV is *signal* 11, which is not the same as an exit *status* of 11. systemd reports a signal kill as `code=killed, status=11/SEGV` and a normal exit as `code=exited, status=11`. Check which form appears in `systemctl status mautrix-slack` before acting on the ranking below: a plain exit status would point toward the configuration causes (1 and 2) rather than a crash.

**Most likely causes** (ranked by probability):

1. **Missing Slack credentials** (95% likely)
   - Module generates config without tokens
   - Bridge crashes trying to connect with invalid/missing credentials
2. **Incomplete configuration** (80% likely)
   - Example config has required fields not set
   - Bridge code doesn't validate, crashes on access
3. **olm-3.2.16 library issues** (40% likely)
   - Insecure package error requires `permittedInsecurePackages` allowance
   - Already addressed in production config (commit 0cbbb19)
4. **systemd security restrictions** (20% likely)
   - Security hardening can cause segfaults with Go binaries
   - May need temporary relaxation (as done for mautrix-gmessages)

**Validation Steps**:
1. Enable debug logging: `logging.level = "debug"`
2. Check logs: `journalctl -u mautrix-slack -n 100`
3. Temporarily disable security hardening
4. Verify database connectivity
5. Test with minimal config (no credentials - should fail gracefully)

### References

- [mautrix-slack GitHub](https://github.com/mautrix/slack)
- [mautrix docs](https://docs.mau.fi/bridges/go/slack/)
- Project file: `/home/dan/proj/ops-jrz1/modules/mautrix-slack.nix`

---

## 4. sops-nix Secrets Management

### Current Secrets Infrastructure

**Encryption**: Age encryption via SSH host key conversion

**File**: `/home/dan/proj/ops-jrz1/secrets/secrets.yaml`

```yaml
matrix-registration-token: "..."
acme-email: "dlei@duck.com"
slack-oauth-token: ""  # Placeholder (empty)
slack-app-token: ""    # Placeholder (empty)
```

**Age Configuration** (`.sops.yaml`):

```yaml
keys:
  - &vultr_vps age1vuxcwvdvzl2u7w6kudqvnnf45czrnhwv9aevjq9hyjjpa409jvkqhkz32q
  - &admin age18ue40q4fw8uggdlfag7jf5nrawvfvsnv93nurschhuynus200yjsd775v3
creation_rules:
  - path_regex: secrets/secrets\.yaml$
    key_groups:
      - age:
          - *vultr_vps  # VPS can decrypt via /etc/ssh/ssh_host_ed25519_key
          - *admin      # Admin workstation can decrypt/edit
```

**Status**: ✅ Working correctly in production (Generation 31, deployed 2025-10-22)

### Secret Lifecycle

```
System Boot
    ↓
sops-nix activation script runs
    ↓
Reads /etc/ssh/ssh_host_ed25519_key
    ↓
Converts to age key (age1vux...)
    ↓
Decrypts secrets/secrets.yaml
    ↓
Extracts individual keys
    ↓
Writes to /run/secrets/
    ↓
Sets ownership and permissions
    ↓
Services start (can now read secrets)
```

### Pattern for Slack Tokens

**Step 1: Update secrets.yaml**

```yaml
slack-oauth-token: "xoxb-YOUR-ACTUAL-TOKEN"
slack-app-token: "xapp-YOUR-ACTUAL-TOKEN"
```

Edit and re-encrypt with: `sops secrets/secrets.yaml`

**Step 2: Declare in hosts/ops-jrz1.nix**

```nix
sops.secrets.slack-oauth-token = {
  owner = "mautrix_slack";
  group = "mautrix_slack";
  mode = "0440";
};

sops.secrets.slack-app-token = {
  owner = "mautrix_slack";
  group = "mautrix_slack";
  mode = "0440";
};
```

**Step 3: Reference in Service** (two patterns)

**Pattern A: LoadCredential** (systemd credentials)

```nix
systemd.services.mautrix-slack.serviceConfig = {
  LoadCredential = [
    "slack-oauth-token:/run/secrets/slack-oauth-token"
    "slack-app-token:/run/secrets/slack-app-token"
  ];
};
# Service reads from: ${CREDENTIALS_DIRECTORY}/slack-oauth-token
```

**Pattern B: Direct file reference**

```nix
services.mautrix-slack = {
  oauthTokenFile = "/run/secrets/slack-oauth-token";
  appTokenFile = "/run/secrets/slack-app-token";
};
```

**Decision**: Use **interactive login approach** - tokens provided via Matrix chat, not config files. Secrets will be stored in bridge database, not referenced in NixOS config. This simplifies deployment and matches mautrix-slack's intended workflow.
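
For comparison, Pattern A would look roughly like this from the service's side. This is a sketch only: systemd sets `CREDENTIALS_DIRECTORY` for units that use `LoadCredential`, but `read_credential` and the prefix check are hypothetical, and mautrix-slack does not read its tokens this way under the interactive-login decision.

```python
import os
from pathlib import Path

# Token prefixes as given in the Token Requirements table (section 1).
EXPECTED_PREFIX = {
    "slack-oauth-token": "xoxb-",
    "slack-app-token": "xapp-",
}


def read_credential(name: str) -> str:
    """Read one credential from the systemd credentials directory."""
    cred_dir = os.environ.get("CREDENTIALS_DIRECTORY")
    if not cred_dir:
        raise RuntimeError(
            "CREDENTIALS_DIRECTORY is not set; is LoadCredential configured?"
        )
    value = (Path(cred_dir) / name).read_text(encoding="utf-8").strip()
    # Cheap sanity check so a swapped or empty placeholder fails early.
    prefix = EXPECTED_PREFIX.get(name)
    if prefix and not value.startswith(prefix):
        raise ValueError(f"{name} does not start with {prefix!r}")
    return value
```

The prefix check would also reject the empty placeholder values currently in `secrets.yaml`, which is the failure mode a declarative setup would otherwise hit at connection time.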

### File Permissions Best Practices

```
-r--r----- (0440): Service-specific secrets (only service user + group can read)
-r--r--r-- (0444): Broadly readable secrets (e.g., email addresses)
-r-------- (0400): Root-only secrets (maximum security)
```

**Security guarantees**:
- ✅ Secrets never enter the Nix store (which is world-readable)
- ✅ Secrets only in `/run/secrets/` (tmpfs, RAM-only)
- ✅ Secrets cleared on reboot
- ✅ Encrypted at rest in git (safe to commit secrets.yaml)

### References

- [sops-nix GitHub](https://github.com/Mic92/sops-nix)
- [Michael Stapelberg's Blog](https://michael.stapelberg.ch/posts/2025-08-24-nixos-sops-nix/) (2025-08-24)
- Project file: `/home/dan/proj/ops-jrz1/secrets/secrets.yaml`

---

## 5. Channel Bridging Patterns

### How Channel Mapping Works

mautrix-slack uses **automatic portal creation** rather than manual channel mapping:

**Portal Creation Triggers**:
1. **Initial login**: Bridge creates portals for recent conversations (controlled by `conversation_count`)
2. **Receiving messages**: Portal auto-created when message arrives in new channel
3. **Bot membership**: Channels where Slack bot is invited are automatically bridged

**Portal Types Supported**:
- Public/private channels (including Slack Connect channels)
- Group DMs (multi-party direct messages)
- 1:1 Direct messages

**Shared Portals**: Multiple Matrix users can interact with the same Slack channel through a shared Matrix room.
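
The portal behavior described here can be pictured with a toy model: one portal per Slack conversation, created lazily on first activity and shared by every Matrix user who joins. This illustrates the concept only. `Portal`, `PortalRegistry`, and their methods are hypothetical and do not mirror mautrix-slack's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Portal:
    """A Matrix room standing in for one Slack conversation."""
    slack_channel_id: str
    matrix_users: set[str] = field(default_factory=set)


class PortalRegistry:
    """Toy model: at most one portal per Slack conversation."""

    def __init__(self) -> None:
        self._portals: dict[str, Portal] = {}

    def on_slack_message(self, channel_id: str) -> Portal:
        """Return the channel's portal, creating it on first activity."""
        if channel_id not in self._portals:
            self._portals[channel_id] = Portal(channel_id)
        return self._portals[channel_id]

    def join(self, channel_id: str, matrix_user: str) -> Portal:
        """Add a Matrix user to the shared portal for a channel."""
        portal = self.on_slack_message(channel_id)
        portal.matrix_users.add(matrix_user)
        return portal
```

In this model a second user joining the same channel reuses the existing portal rather than creating a new room, which is the property the shared-portal description relies on.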

### Configuration vs Runtime Management

**Configuration-based** (`conversation_count` in config.yaml):
- Controls how many recent conversations sync on initial login
- Only affects initial synchronization
- Separate settings for channels, group DMs, direct messages

**Runtime Management** (automatic):
- No manual channel mapping required
- Portal creation happens dynamically
- No explicit `open ` command needed
- To interact with a new channel, simply send/receive a message in Slack

**Bot Commands** (via Matrix DM with `@slackbot:clarun.xyz`):
- `help` - Display available commands
- `login app` - Authenticate with Slack app credentials
- `login token ` - Authenticate with user account (unofficial)

### Adding/Removing Channels

**Adding Channels**: ✅ **Runtime (no restart)**
- Receive a message in the channel → portal auto-created
- Invite Slack bot to channel (app login mode) → portal auto-created

**Removing Channels**: ⚠️ **Not explicitly documented**
- Likely has `delete-portal` command (based on other mautrix bridges)
- Would be sent from within the Matrix portal room

**Modifying Configuration**:
- Changes to `conversation_count` require bridge restart
- However, setting only affects initial sync, not ongoing operation

### Archived Channel Handling

⚠️ **Not explicitly documented**

Expected behavior:
- Matrix portal remains but becomes inactive
- No new messages flow (Slack channel is read-only)
- Historical messages remain accessible

**Recommendation**: Test this scenario in pilot deployment to document actual behavior.

### Gradual Rollout Strategy

**Phase 1: Single Test Channel** (Week 1-2)
- Set `conversation_count` low (5-10)
- Start with one channel: `#dev-platform` or `#test`
- Verify automatic portal creation, bidirectional messaging, reactions, files

**Phase 2: Small User Group** (Week 3-4)
- 3-5 team members authenticate
- Test shared portal functionality
- Monitor performance and reliability

**Phase 3: Organic Expansion** (Week 5+)
- Don't pre-configure channel lists
- Let automatic portal creation handle it based on usage
- Users get portals only for channels they actively use

**Configuration Strategy**:

```yaml
bridge:
  conversation_count: 10  # Start small, expand organically
```

**Advantages**:
- No manual channel mapping to maintain
- Scales naturally with usage
- Easy to expand without configuration changes
- Users only see channels they interact with

### Key Limitations

- ⚠️ No traditional message backfill (history before bridge setup)
- ⚠️ Name changes not fully supported
- ⚠️ Being added to conversations only partially supported
- ⚠️ No documented manual `open ` command

### References

- [mautrix-slack docs](https://docs.mau.fi/bridges/go/slack/)
- [ROADMAP.md](https://github.com/mautrix/slack/blob/main/ROADMAP.md)
- Support room: #slack:maunium.net

---

## 6. Implementation Decisions

### Critical Path Decisions

| Decision Point | Choice | Rationale |
|----------------|--------|-----------|
| **Connection Method** | Socket Mode (WebSocket) | No public endpoint needed, matches security model |
| **Authentication** | App Login (official OAuth) | Production stability, clear audit trail |
| **Token Management** | Interactive login via Matrix | Matches mautrix-slack workflow, simplifies config |
| **Secrets Storage** | sops-nix (existing pattern) | Already working in production (Gen 31) |
| **Channel Bridging** | Automatic portal creation | No manual mapping, scales with usage |
| **Initial Scope** | Single test channel | Validate before expanding |
| **Workspace** | chochacho (production) | Real workspace with admin rights |

### Risks and Mitigations

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| Exit code 11 continues | High | High | Debug logging, relax systemd hardening, validate credentials |
| Socket Mode disconnects | Medium | Low | Automatic reconnection, monitor health indicators |
| Token expiration | Low | Medium | Clear error messages, documented re-authentication |
| Performance issues | Low | Medium | Start with 1 channel, monitor before expanding |
| Slack API rate limits | Low | Low | Respect rate limits, implement backoff |

### Open Questions for Implementation

1. **Exact cause of exit code 11**: Requires deployment with debug logging
2. **Matrix appservice registration**: Need to integrate with conduwuit config
3. **Actual `conversation_count` value**: Determine optimal setting for initial sync
4. **Archived channel behavior**: Document through testing
5. **Permission mapping**: Slack roles → Matrix power levels (verify in practice)

---

## 7. Next Steps

**Immediate** (Phase 1):
1. ✅ Create `data-model.md` (entities, relationships, state machines)
2. ✅ Create `contracts/bridge-config.yaml` (configuration schema)
3. ✅ Create `contracts/secrets-schema.yaml` (secrets structure)
4. ✅ Create `contracts/channel-mapping.yaml` (portal configuration)
5. ✅ Create `quickstart.md` (deployment runbook)
6. ✅ Update `.claude/CLAUDE.md` (agent context)

**Then** (Phase 2):
- Run `/speckit.tasks` to generate implementation task breakdown
- Begin actual implementation based on plan.md

---

## Document History

- **2025-10-22**: Initial research completed (5 research agents)
- **Phase 0 Status**: ✅ Complete
- **Next Phase**: Phase 1 (Design)