ops-jrz1/specs/002-slack-bridge-integration/data-model.md
Dan ca379311b8 Add Slack bridge integration feature specification
Includes spec, plan, research, data model, contracts, and quickstart guide
for mautrix-slack Socket Mode bridge deployment.
2025-10-26 14:36:44 -07:00


# Data Model: Slack↔Matrix Bridge
**Feature**: 002-slack-bridge-integration
**Created**: 2025-10-22
**Status**: Design Complete
## Overview
This document defines the conceptual data model for the mautrix-slack bridge. Since this is infrastructure configuration (not application code), the model focuses on configuration entities, runtime state, and operational data flows.
**Key Insight**: Most data is managed internally by mautrix-slack (PostgreSQL database). Our model focuses on **configuration inputs** and **observable runtime state** relevant to NixOS deployment.
---
## 1. Configuration Entities
### 1.1 Bridge Service
**Description**: The mautrix-slack service instance
**Properties**:
| Property | Type | Source | Description |
|----------|------|--------|-------------|
| `workspace` | string | NixOS config | Slack workspace name ("chochacho") |
| `homeserverUrl` | URL | NixOS config | Matrix homeserver address (http://127.0.0.1:8008) |
| `serverName` | domain | NixOS config | Matrix server domain (clarun.xyz) |
| `databaseUri` | URI | NixOS config | PostgreSQL connection string |
| `port` | integer | NixOS config | Appservice listen port (29319) |
| `commandPrefix` | string | NixOS config | Bridge command prefix ("!slack") |
| `permissions` | map | NixOS config | Domain → permission level mappings |
| `loggingLevel` | enum | NixOS config | Log verbosity (debug/info/warn/error) |
| `conversationCount` | integer | config.yaml | Number of recent chats to sync on login |
**Lifecycle**:
- Created: NixOS configuration deployment
- Modified: Configuration updates → rebuild
- Destroyed: Service disabled in config
**State Transitions**: See section 3.1 (Bridge Service State Machine)
### 1.2 Slack Credentials
**Description**: Authentication tokens for Slack API
**Properties**:
| Property | Type | Source | Description |
|----------|------|--------|-------------|
| `botToken` | secret (xoxb-) | `login app` → bridge DB (optional sops-nix backup) | Slack bot OAuth token |
| `appToken` | secret (xapp-) | `login app` → bridge DB (optional sops-nix backup) | Slack app-level token (Socket Mode) |
| `workspace` | string | Interactive login | Slack workspace identifier (T...) |
**Lifecycle**:
- Created: Slack app configuration → manual token generation
- Stored: Provided via `login app` command → bridge database
- Rotated: Manual token regeneration → re-authentication
- Revoked: Slack app settings or user removes app
**Security Requirements**:
- Tokens never in Nix store (evaluation-time exposure risk)
- Tokens never in config.yaml (file permission risk)
- Tokens stored in bridge PostgreSQL database (encrypted at rest via LUKS)
- Optional: Encrypt in sops-nix for disaster recovery (not used by bridge directly)
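
**Illustrative sketch** (Python, not part of mautrix-slack): the `xoxb-` and `xapp-` prefixes make it cheap to sanity-check a token type or to scrub tokens from text before it reaches a log or a config file. The function names and the character class after the prefix are assumptions made for this example.

```python
import re

# Prefixes come from the credentials table; the allowed characters after
# the prefix are an assumption, not a documented Slack format.
TOKEN_PATTERN = re.compile(r"\b(xoxb|xapp)-[A-Za-z0-9-]+")


def classify_token(token: str) -> str:
    """Return 'bot', 'app', or 'unknown' based on the token prefix."""
    if token.startswith("xoxb-"):
        return "bot"
    if token.startswith("xapp-"):
        return "app"
    return "unknown"


def redact_tokens(text: str) -> str:
    """Replace anything that looks like a Slack token with prefix + marker."""
    return TOKEN_PATTERN.sub(lambda m: f"{m.group(1)}-[REDACTED]", text)
```

A check like `redact_tokens` could run over generated config files in CI to catch a token that was pasted into `config.yaml` by mistake.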
### 1.3 Matrix Appservice Registration
**Description**: Matrix homeserver configuration for bridge integration
**Properties**:
| Property | Type | Source | Description |
|----------|------|--------|-------------|
| `id` | string | Generated | Appservice identifier ("slack") |
| `url` | URL | Generated | Bridge endpoint (http://127.0.0.1:29319) |
| `asToken` | secret | Generated | Appservice → homeserver auth |
| `hsToken` | secret | Generated | Homeserver → appservice auth |
| `senderLocalpart` | string | Generated | Bot user localpart ("slackbot") |
| `usernameTemplate` | string | Generated | Ghost user format ("slack_{{.}}") |
| `namespaces.users` | list | Generated | Reserved user namespaces |
**Lifecycle**:
- Created: First service start (`mautrix-slack -g -r registration.yaml`)
- Modified: Rarely (only on namespace changes)
- Consumed: Loaded by Matrix homeserver (conduwuit)
**File Location**: `/var/lib/matrix-appservices/mautrix_slack_registration.yaml`
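
**Illustrative sketch** (Python): the registration file itself is generated by `mautrix-slack -g`, so nothing below is needed in deployment. It only shows the shape of the data using the standard Matrix appservice registration field names (`as_token`, `hs_token`, `sender_localpart`, `namespaces`). The function name `build_registration` and the token length are assumptions.

```python
import re
import secrets


def build_registration(server_name: str, port: int = 29319) -> dict:
    """Build a dict shaped like a Matrix appservice registration file."""
    return {
        "id": "slack",
        "url": f"http://127.0.0.1:{port}",
        "as_token": secrets.token_urlsafe(32),  # appservice -> homeserver auth
        "hs_token": secrets.token_urlsafe(32),  # homeserver -> appservice auth
        "sender_localpart": "slackbot",
        "rate_limited": False,
        "namespaces": {
            "users": [
                # Exclusive claim on every ghost user MXID for this server.
                {"exclusive": True, "regex": rf"@slack_.*:{re.escape(server_name)}"}
            ]
        },
    }
```

The two tokens are independent random secrets; reusing one value for both would let either side impersonate the other.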
### 1.4 Channel Portal
**Description**: A bridged Slack channel ↔ Matrix room pair
**Properties** (stored in mautrix-slack database):
| Property | Type | Description |
|----------|------|-------------|
| `slackChannelId` | string | Slack channel ID (C...) |
| `matrixRoomId` | string | Matrix room ID (!...clarun.xyz) |
| `channelName` | string | Slack channel name (#dev-platform) |
| `roomAlias` | string | Matrix room alias (#slack_dev-platform:clarun.xyz) |
| `topic` | string | Channel topic/description |
| `members` | list | Slack users in channel |
| `encrypted` | boolean | Whether Matrix room is encrypted |
| `createdAt` | timestamp | Portal creation time |
| `lastActivity` | timestamp | Last message timestamp |
**Lifecycle**: See section 3.3 (Channel Portal State Machine)
**Observable via**:
- Matrix room list (user perspective)
- Bridge database queries (admin perspective)
- Bot command: `!slack status` (if implemented)
---
## 2. Runtime State Entities
### 2.1 Socket Mode Connection
**Description**: WebSocket connection to Slack's real-time messaging service
**Properties**:
| Property | Type | Description |
|----------|------|-------------|
| `websocketUrl` | URL | Dynamic WebSocket URL (wss://wss.slack.com/link/...) |
| `connectionState` | enum | disconnected / connecting / connected / refreshing |
| `connectionId` | string | Unique connection identifier |
| `connectedAt` | timestamp | When connection established |
| `refreshAt` | timestamp | Estimated refresh time (~2-4 hours) |
| `lastHeartbeat` | timestamp | Last ping/pong from Slack |
| `reconnectAttempts` | integer | Consecutive failed reconnection count |
| `rateLimit` | timestamp | Earliest next connection attempt (1/minute limit) |
**State Transitions**: See section 3.2 (Socket Mode Connection State Machine)
**Observable via**:
- Service logs: `journalctl -u mautrix-slack -f`
- Health indicators: Connection status, last successful message timestamp
### 2.2 Ghost User
**Description**: Matrix representation of a Slack user
**Properties**:
| Property | Type | Description |
|----------|------|-------------|
| `matrixUserId` | string | Ghost user MXID (@slack_U123ABC:clarun.xyz) |
| `slackUserId` | string | Slack user ID (U...) |
| `displayName` | string | Synced from Slack profile |
| `avatarUrl` | mxc:// | Synced from Slack avatar |
| `isBot` | boolean | Whether user is a bot account |
| `email` | string | Slack user email (if available) |
| `slackTeam` | string | Workspace identifier |
**Lifecycle**:
- Created: First message from Slack user in bridged channel
- Updated: Slack profile changes → synced to Matrix
- Deactivated: User leaves workspace (profile retained but inactive)
**Namespace**: `@slack_*:clarun.xyz` (reserved via appservice registration)
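
**Illustrative sketch** (Python): mapping between Slack user IDs and ghost MXIDs, assuming the `slack_{{.}}` template is filled with the bare Slack user ID as in the example above. The helper names are hypothetical; the bridge does this internally.

```python
GHOST_PREFIX = "slack_"


def ghost_mxid(slack_user_id: str, server_name: str = "clarun.xyz") -> str:
    """Format the ghost MXID for a Slack user ID."""
    return f"@{GHOST_PREFIX}{slack_user_id}:{server_name}"


def slack_id_from_mxid(mxid: str, server_name: str = "clarun.xyz"):
    """Return the Slack user ID for a ghost MXID, or None for other users."""
    if not mxid.startswith("@"):
        return None
    localpart, _, domain = mxid[1:].partition(":")
    if domain != server_name or not localpart.startswith(GHOST_PREFIX):
        return None
    return localpart[len(GHOST_PREFIX):]
```

The reverse lookup is what lets the bridge tell its own ghosts apart from real Matrix users when an event arrives from the homeserver.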
### 2.3 Message Event
**Description**: A bridged message in transit
**Properties**:
| Property | Type | Description |
|----------|------|-------------|
| `sourceService` | enum | slack / matrix |
| `sourceEventId` | string | Slack ts or Matrix event ID |
| `targetEventId` | string | Event ID in destination service |
| `messageType` | enum | text / image / file / reaction / edit / delete |
| `content` | object | Message payload (text, attachments, etc.) |
| `sender` | string | User ID in source service |
| `channel` | string | Portal ID |
| `timestamp` | timestamp | Message send time |
| `deliveredAt` | timestamp | When relayed to destination |
| `latency` | duration | deliveredAt - timestamp (should be <5s) |
**Lifecycle** (ephemeral):
- Received: Slack WebSocket event or Matrix appservice transaction (`PUT /_matrix/app/v1/transactions/{txnId}`)
- Transformed: Format conversion (Slack JSON ↔ Matrix JSON)
- Sent: Posted to destination API
- Acknowledged: Event ID stored for deduplication
**Observable via**:
- Bridge logs (debug level)
- Health metrics: Message count, delivery latency
- Spec requirement: FR-001/FR-002 (5 second latency SLA)
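
**Illustrative sketch** (Python): how `latency` follows from `timestamp` and `deliveredAt`, and how the 5 second target from FR-001/FR-002 could be checked. The class is a reduced, hypothetical version of the entity above, not a bridge data structure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

LATENCY_SLA = timedelta(seconds=5)  # FR-001 / FR-002


@dataclass
class MessageEvent:
    source_service: str      # "slack" or "matrix"
    source_event_id: str     # Slack ts or Matrix event ID
    timestamp: datetime      # message send time
    delivered_at: datetime   # when relayed to the destination

    @property
    def latency(self) -> timedelta:
        """deliveredAt - timestamp, as defined in the table above."""
        return self.delivered_at - self.timestamp

    def within_sla(self) -> bool:
        return self.latency < LATENCY_SLA
```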
---
## 3. State Machines
### 3.1 Bridge Service State Machine
```
┌─────────────┐
│  Disabled   │  (services.mautrix-slack.enable = false)
└──────┬──────┘
       │ nixos-rebuild switch (enable = true)
       ↓
┌─────────────┐
│  Starting   │  ExecStartPre: Generate config, create registration
└──────┬──────┘
       │ Config valid, database reachable
       ↓
┌───────────────┐
│Unauthenticated│  (service running, waiting for `login app`)
└──────┬────────┘
       │ User sends `login app` command, provides tokens
       ↓
┌─────────────┐
│ Connecting  │  Establishing Socket Mode WebSocket
└──────┬──────┘
       │ WebSocket handshake successful
       ↓
┌─────────────┐
│   Active    │  Normal operation (relaying messages)
└──┬─────┬────┘
   │     │ Connection refresh (every ~2-4 hours)
   │     └──→ Connecting (automatic reconnection)
   │ Configuration error, auth revoked, database failure
   ↓
┌─────────────┐
│   Failed    │  Service exits (systemd restarts after 10s)
└──────┬──────┘
       │ systemd RestartSec expires
       └──→ Starting
```
**Key Observations**:
- **Unauthenticated state is valid**: Service can run without Slack credentials
- **Automatic restart**: systemd handles crash recovery
- **Connection refresh is normal**: Not a failure state, automatic transition
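
**Illustrative sketch** (Python): the diagram above expressed as a transition table, useful for checking that a monitoring script or test never assumes an edge the design does not have. Only the edges drawn in the diagram are encoded; the names `BridgeState`, `TRANSITIONS`, and `advance` are hypothetical.

```python
from enum import Enum


class BridgeState(Enum):
    DISABLED = "disabled"
    STARTING = "starting"
    UNAUTHENTICATED = "unauthenticated"
    CONNECTING = "connecting"
    ACTIVE = "active"
    FAILED = "failed"


# Allowed transitions, copied edge by edge from the state machine diagram.
TRANSITIONS = {
    BridgeState.DISABLED: {BridgeState.STARTING},
    BridgeState.STARTING: {BridgeState.UNAUTHENTICATED},
    BridgeState.UNAUTHENTICATED: {BridgeState.CONNECTING},
    BridgeState.CONNECTING: {BridgeState.ACTIVE},
    BridgeState.ACTIVE: {BridgeState.CONNECTING, BridgeState.FAILED},
    BridgeState.FAILED: {BridgeState.STARTING},
}


def advance(current: BridgeState, target: BridgeState) -> BridgeState:
    """Return target if the diagram allows current -> target, else raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

Note that `ACTIVE -> CONNECTING` is an ordinary edge here, which matches the observation that a connection refresh is not a failure.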
### 3.2 Socket Mode Connection State Machine
```
┌─────────────┐
│Disconnected │  Initial state or after connection loss
└──────┬──────┘
       │ Bridge has valid credentials
       ↓
┌──────────────┐
│Requesting URL│  Call apps.connections.open API
└──────┬───────┘
       │ API returns wss:// URL (rate limit: 1/minute)
       ↓
┌─────────────┐
│ Connecting  │  WebSocket handshake in progress
└──────┬──────┘
       │ Receives "hello" message from Slack
       ↓
┌─────────────┐
│  Connected  │  Receiving events, acknowledging with envelope_id
└──┬───┬───┬──┘
   │   │   │ Slack sends "warning" disconnect (10s notice)
   │   │   └──→ Refreshing
   │   │
   │   │ Network error, timeout, Slack backend restart
   │   └──→ Disconnected (immediate reconnection attempt)
   │ Normal operation continues

┌─────────────┐
│ Refreshing  │  Graceful connection renewal
└──────┬──────┘
       │ Fetch new WebSocket URL
       └──→ Requesting URL
```
**Error Paths**:
- **Rate limited**: Stay in Disconnected, retry after 1 minute
- **Auth invalid**: Bridge service transitions to Failed (section 3.1); requires re-authentication
- **Network partition**: Exponential backoff reconnection attempts
**Health Indicators**:
- `connection_status`: current state name
- `last_successful_message`: timestamp of last event
- `reconnection_attempts`: incremented on failed connections, reset on success
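
**Illustrative sketch** (Python): one way to combine the exponential backoff and the 1/minute rate limit described above into a single retry delay. The base delay, the cap, and the function name are assumptions; the actual reconnection policy is whatever mautrix-slack implements.

```python
RATE_LIMIT_SECONDS = 60.0  # apps.connections.open: 1 request per minute


def next_retry_delay(attempts: int, rate_limited: bool = False,
                     base: float = 1.0, cap: float = 300.0) -> float:
    """Seconds to wait before the next connection attempt.

    attempts is the reconnection_attempts counter: consecutive failures
    so far, reset to 0 after a successful connection.
    """
    delay = min(cap, base * 2 ** attempts)
    if rate_limited:
        # Never retry sooner than the rate limit window allows.
        delay = max(delay, RATE_LIMIT_SECONDS)
    return delay
```

With these defaults the delay doubles from 1 s and levels off at 5 minutes, so a long network partition does not turn into a tight retry loop.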
### 3.3 Channel Portal State Machine
```
┌─────────────┐
│   Pending   │  User receives message in unbridged Slack channel
└──────┬──────┘
       │ Bridge auto-creates portal
       ↓
┌─────────────┐
│  Creating   │  Allocating Matrix room, sending invites
└──────┬──────┘
       │ Room created, Matrix users invited
       ↓
┌─────────────┐
│   Active    │  Relaying messages bidirectionally
└──┬───┬───┬──┘
   │   │   │ Slack channel archived
   │   │   └──→ Archived
   │   │
   │   │ Admin runs delete-portal command (if available)
   │   └──→ Deleting
   │ Normal message relay continues
   (Active - steady state)

┌─────────────┐
│  Archived   │  Slack channel is read-only
└──────┬──────┘
       │ Slack channel unarchived
       └──→ Active

┌─────────────┐
│  Deleting   │  Cleanup: kick users, delete room, remove from DB
└──────┬──────┘
       │ Cleanup complete
       ↓
┌─────────────┐
│   Deleted   │  Portal removed (can be recreated if needed)
└─────────────┘
```
**Key Properties by State**:
- **Pending**: Not yet in bridge database
- **Creating**: Room exists but membership incomplete
- **Active**: `lastActivity` updates on each message
- **Archived**: Read-only, no new messages flow
- **Deleted**: Database record removed, room unlinked
---
## 4. Relationships
### 4.1 Entity Relationship Diagram
```
┌──────────────────┐
│  Bridge Service  │
└────────┬─────────┘
         │ 1
         │ manages
         ↓ N
┌──────────────────┐       ┌──────────────────┐
│ Socket Connection│←──────│ Slack Credentials│
└────────┬─────────┘   1   └──────────────────┘
         │            uses
         │ receives events via
         ↓ N
┌──────────────────┐
│  Channel Portal  │
└────────┬─────────┘
         │ bridges
         ↓ N
┌──────────────────┐       ┌──────────────────┐
│  Message Event   │───────│    Ghost User    │
└──────────────────┘ from  └──────────────────┘
         │ N                        │ N
         │                          │
         │ relays to                │ represents
         ↓ 1                        ↓ 1
┌──────────────────┐       ┌──────────────────┐
│   Matrix Room    │       │    Slack User    │
└──────────────────┘       └──────────────────┘
```
### 4.2 Cardinality Table
| Entity A | Relationship | Entity B | Cardinality | Notes |
|----------|--------------|----------|-------------|-------|
| Bridge Service | manages | Socket Connection | 1:1 | One WebSocket per bridge instance |
| Bridge Service | creates | Channel Portal | 1:N | Multiple channels bridged |
| Socket Connection | uses | Slack Credentials | 1:1 | Credentials shared across portals |
| Channel Portal | contains | Message Event | 1:N | Many messages per channel |
| Channel Portal | links | Matrix Room | 1:1 | Bidirectional mapping |
| Ghost User | sends | Message Event | 1:N | User can send many messages |
| Ghost User | represents | Slack User | 1:1 | One MXID per Slack user per workspace |
| Appservice Registration | reserves | Ghost User namespace | 1:N | All @slack_*:clarun.xyz |
---
## 5. Data Flow Diagrams
### 5.1 Message Flow: Slack → Matrix
```
Slack User
│ Posts message in #dev-platform
Slack API (WebSocket event)
│ message.channels event
mautrix-slack (Socket Mode listener)
│ 1. Acknowledge event (envelope_id)
│ 2. Check portal exists for channel
│ 3. Transform message format
│ 4. Lookup/create ghost user
Matrix Homeserver (client-server API, acting as the ghost user)
│ PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}
Matrix Room (#slack_dev-platform:clarun.xyz)
│ Event appears in room timeline
Matrix Users
│ See message from @slack_john:clarun.xyz
```
**Latency Budget**: <5 seconds (FR-001)
**Failure Modes**:
- Portal doesn't exist → Auto-create, then deliver
- Ghost user doesn't exist → Create, set profile, then deliver
- Matrix homeserver unreachable → Retry with exponential backoff
- Event deduplication → Check Slack `ts` against database, skip if duplicate
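
**Illustrative sketch** (Python): the deduplication check keyed on channel ID plus Slack `ts`. An in-memory set stands in for the bridge database here, and the class name is hypothetical.

```python
class Deduplicator:
    """Remember (channel_id, ts) pairs that have already been relayed."""

    def __init__(self):
        self._seen = set()

    def should_relay(self, channel_id: str, ts: str) -> bool:
        """Return True the first time a message is seen, False afterwards."""
        key = (channel_id, ts)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Keying on the pair rather than on `ts` alone matters because Slack timestamps are only unique within a channel.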
### 5.2 Message Flow: Matrix → Slack
```
Matrix User (@alice:clarun.xyz)
│ Sends message in #slack_dev-platform:clarun.xyz
Matrix Homeserver
│ Appservice transaction to bridge
mautrix-slack (/_matrix/app/v1/transactions)
│ 1. Verify hs_token
│ 2. Lookup portal by room ID
│ 3. Transform message format
│ 4. Determine sender identity
Slack API (chat.postMessage or chat.postEphemeral)
│ POST message to channel via bot token
Slack Channel (#dev-platform)
│ Message appears from bridge bot
│ (with Matrix user's display name if using customization)
Slack Users
│ See message: "Alice (Matrix): Hello from Matrix!"
```
**Latency Budget**: <5 seconds (FR-002)
**Failure Modes**:
- Portal not found → Log error, return 200 OK (avoid retry loop)
- Slack API rate limited → Queue message, retry with backoff
- Bot not in channel → Attempt to join, or return error to Matrix user
- Invalid message format → Send error reply to Matrix user
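
**Illustrative sketch** (Python): steps 1 and 2 of the flow plus the "portal not found" failure mode. It assumes the homeserver presents the `hs_token` as an `Authorization: Bearer` header, as the Matrix application service API specifies; the function name, the event shape, and the return convention are simplifications made for this example.

```python
import hmac


def handle_transaction(auth_header: str, hs_token: str, events: list, portals: dict):
    """Return (http_status, messages_to_post) for one appservice transaction.

    portals maps Matrix room IDs to Slack channel IDs.
    """
    expected = f"Bearer {hs_token}"
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(auth_header.encode(), expected.encode()):
        return 403, []
    to_post = []
    for event in events:
        channel_id = portals.get(event.get("room_id"))
        if channel_id is None:
            # Unknown room: skip it, but still answer 200 below so the
            # homeserver does not resend the same transaction forever.
            continue
        body = event.get("content", {}).get("body", "")
        to_post.append((channel_id, body))
    return 200, to_post
```

Returning 200 even when some events were dropped is the design choice behind "avoid retry loop": the homeserver retries a transaction until it is acknowledged.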
### 5.3 Authentication Flow
```
Admin (NixOS deployment)
│ 1. Deploy configuration (services.mautrix-slack.enable = true)
NixOS Activation
│ 2. Start systemd service
mautrix-slack service
│ 3. Generate config, start service
│ 4. Listen on port 29319
│ 5. State: Unauthenticated
Admin (Matrix client)
│ 6. Open DM with @slackbot:clarun.xyz
│ 7. Send: "login app"
mautrix-slack
│ 8. Prompt: "Please provide bot token"
Admin
│ 9. Send: "xoxb-..."
mautrix-slack
│ 10. Prompt: "Please provide app token"
Admin
│ 11. Send: "xapp-..."
mautrix-slack
│ 12. Store tokens in database
│ 13. Call apps.connections.open
│ 14. Establish WebSocket connection
│ 15. Sync recent conversations (conversation_count)
│ 16. State: Active
Admin
│ 17. Receive success message
│ 18. Invited to bridged channel portals
```
**Security Notes**:
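- Tokens are sent to the bridge bot in a Matrix DM and are protected in transit by TLS (client to homeserver; federation is only involved if the admin's account lives on another server)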
- Tokens stored in PostgreSQL database (LUKS-encrypted filesystem)
- Tokens never logged (mautrix bridges sanitize logs)
- Admin can revoke via Slack app settings
### 5.4 Portal Creation Flow
```
Slack User
│ Sends message in #general (not yet bridged)
Slack API (WebSocket event)
│ message.channels event
mautrix-slack
│ 1. Check database: portal exists for channel_id?
│ 2. Not found → Initiate auto-create
Portal Creation Logic
│ 3. Create Matrix room via homeserver API
│ 4. Set room name, topic, avatar
│ 5. Insert portal record in database
│ 6. Map Slack channel ↔ Matrix room
Membership Sync
│ 7. For each Slack member in channel:
│ - Create/update ghost user
│ - Invite ghost user to Matrix room
Relay Message
│ 8. Transform and send original message
Matrix Users
│ 9. Receive room invitation
│ 10. Join room, see first message
```
**Timing**: Portal creation adds ~2-5 seconds latency to first message
**Failure Recovery**:
- Room creation fails → Retry up to 3 times
- Ghost user creation fails → Skip that user, continue
- Database insert fails → Rollback, log error, retry
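
**Illustrative sketch** (Python): the first two recovery rules as small helpers. `with_retries` retries immediately with no delay between attempts, which is a simplification; both function names are hypothetical.

```python
def with_retries(operation, attempts: int = 3):
    """Call operation() until it succeeds or attempts are exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error


def sync_members(slack_user_ids, invite):
    """Invite each member; return the IDs that failed and were skipped."""
    skipped = []
    for user_id in slack_user_ids:
        try:
            invite(user_id)
        except Exception:
            skipped.append(user_id)  # one bad user must not block the rest
    return skipped
```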
---
## 6. Database Schema (Conceptual)
**Note**: Actual schema managed by mautrix-slack. This is conceptual understanding for operational purposes.
### Key Tables
**`portal`**
```sql
CREATE TABLE portal (
slack_channel_id TEXT PRIMARY KEY, -- C0123ABC
mxid TEXT NOT NULL, -- !xyz:clarun.xyz
name TEXT, -- dev-platform
topic TEXT,
encrypted BOOLEAN DEFAULT FALSE,
in_space BOOLEAN DEFAULT FALSE,
avatar_url TEXT,
name_set BOOLEAN,
topic_set BOOLEAN,
avatar_set BOOLEAN
);
```
**`puppet`** (Ghost Users)
```sql
CREATE TABLE puppet (
slack_user_id TEXT PRIMARY KEY, -- U0123DEF
team_id TEXT, -- T0456GHI
mxid TEXT NOT NULL, -- @slack_john:clarun.xyz
display_name TEXT,
avatar_url TEXT,
name_set BOOLEAN,
avatar_set BOOLEAN,
contact_info_set BOOLEAN,
is_bot BOOLEAN,
custom_mxid TEXT -- For double-puppeting
);
```
**`user`** (Logged-in Matrix users)
```sql
CREATE TABLE "user" (
mxid TEXT PRIMARY KEY, -- @alice:clarun.xyz
slack_user_id TEXT, -- U789JKL (after login)
team_id TEXT, -- T0456GHI
access_token TEXT, -- Encrypted Slack token
management_room TEXT -- DM with bridge bot
);
```
**`message`** (Event mapping for edits/deletes)
```sql
CREATE TABLE message (
slack_ts TEXT, -- 1234567890.123456
slack_channel_id TEXT, -- C0123ABC
mxid TEXT, -- $event_id:clarun.xyz
UNIQUE(slack_ts, slack_channel_id)
);
```
**Queries Used**:
- Message relay: `SELECT mxid FROM portal WHERE slack_channel_id = ?`
- Ghost user lookup: `SELECT mxid FROM puppet WHERE slack_user_id = ?`
- Edit/delete: `SELECT mxid FROM message WHERE slack_ts = ? AND slack_channel_id = ?`
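
**Illustrative sketch** (Python): the three lookups run against a cut-down copy of the conceptual schema. SQLite in memory stands in for the bridge's PostgreSQL database so the example is self-contained; the real schema has more columns, and the `lookup` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE portal (slack_channel_id TEXT PRIMARY KEY, mxid TEXT NOT NULL);
CREATE TABLE puppet (slack_user_id TEXT PRIMARY KEY, mxid TEXT NOT NULL);
CREATE TABLE message (slack_ts TEXT, slack_channel_id TEXT, mxid TEXT,
                      UNIQUE(slack_ts, slack_channel_id));
INSERT INTO portal VALUES ('C0123ABC', '!xyz:clarun.xyz');
INSERT INTO puppet VALUES ('U0123DEF', '@slack_john:clarun.xyz');
INSERT INTO message VALUES ('1234567890.123456', 'C0123ABC', '$event_id:clarun.xyz');
""")


def lookup(sql: str, *params):
    """Run a single-value query and return the value, or None if no row."""
    row = conn.execute(sql, params).fetchone()
    return row[0] if row else None


room = lookup("SELECT mxid FROM portal WHERE slack_channel_id = ?", "C0123ABC")
ghost = lookup("SELECT mxid FROM puppet WHERE slack_user_id = ?", "U0123DEF")
event = lookup(
    "SELECT mxid FROM message WHERE slack_ts = ? AND slack_channel_id = ?",
    "1234567890.123456", "C0123ABC",
)
missing = lookup("SELECT mxid FROM portal WHERE slack_channel_id = ?", "C0000000")
```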
---
## 7. Configuration Data Flow
```
Git Repository (specs/002-slack-bridge-integration/)
│ Contains: spec.md, plan.md, data-model.md
NixOS Configuration (hosts/ops-jrz1.nix)
│ services.mautrix-slack = { ... }
NixOS Evaluation
│ Merges: modules/mautrix-slack.nix options
ExecStartPre (Python script)
│ 1. Generate example config: mautrix-slack -e
│ 2. Merge configOverrides
│ 3. Write: /var/lib/mautrix_slack/config/config.yaml
mautrix-slack service
│ Reads config.yaml on startup
Runtime Behavior
│ Connects to Matrix, Slack, PostgreSQL
```
**Configuration Layers** (lowest to highest precedence; each layer overrides the ones before it):
1. **Hardcoded defaults** (in mautrix-slack binary)
2. **Example config** (generated with `-e` flag)
3. **NixOS module overrides** (`configOverrides` option)
4. **User extraConfig** (`extraConfig` option)
5. **Runtime authentication** (tokens from `login app` command)
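
**Illustrative sketch** (Python): layered configuration as a recursive dictionary merge, assuming later layers override earlier ones key by key. This is not the actual ExecStartPre script; `deep_merge` and `resolve_config` are hypothetical names.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers: dict) -> dict:
    """Fold layers from lowest to highest precedence into one config."""
    result = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result
```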
---
## 8. Observability Data
### Health Indicators (SC-003a)
| Metric | Source | Purpose |
|--------|--------|---------|
| `connection_status` | Service logs | Socket Mode connection state |
| `last_successful_message` | Service logs | Timestamp of last relayed message |
| `error_count` | Service logs | Count of errors since last restart |
| `portal_count` | Database query | Number of active channel portals |
| `ghost_user_count` | Database query | Number of Slack users bridged |
| `message_latency` | Bridge metrics | Time between source → destination (should be <5s) |
### Log Events
**Key log patterns** (from mautrix bridge codebase):
```
INFO [WebSocket] Connected to Slack via Socket Mode
INFO [Portal] Creating portal for channel #dev-platform (C0123ABC)
INFO [Message] Relaying message from Slack to Matrix: {...}
WARN [Connection] WebSocket disconnected, reconnecting in 5s
ERROR [Auth] Invalid bot token, authentication failed
```
**Monitoring Strategy**:
1. Use `journalctl -u mautrix-slack -f` for real-time monitoring
2. Export logs to persistent storage for analysis
3. Alert on `ERROR` level logs
4. Track `last_successful_message` metric (alert if >1 hour stale)
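
**Illustrative sketch** (Python): the staleness rule from step 4. How `last_successful_message` is extracted from the logs is left open; the function only shows the comparison, and its name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=1)


def is_stale(last_successful_message: datetime, now: datetime = None) -> bool:
    """True if no message has been relayed for more than one hour."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_successful_message > STALE_AFTER
```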
---
## 9. Document History
- **2025-10-22**: Initial data model design
- **Phase 1 Status**: ✅ Complete
- **Next**: Create contracts/ directory with schemas