Data Model: Maubot Integration
Feature: 003-maubot-integration Date: 2025-10-26 Status: Phase 1 design
Overview
This document defines the data structures, state machines, and relationships for the maubot integration feature. Since maubot is an infrastructure service (not an application with user-facing data), the focus is on service configuration, runtime state, and operational entities.
Core Entities
1. Maubot Service
Description: The maubot framework service that manages bot instances and provides the web-based management interface.
Attributes:
- homeserver_url: string (URL) - Matrix homeserver endpoint (e.g., http://127.0.0.1:8008)
- server_name: string (domain) - Matrix server domain (e.g., clarun.xyz)
- port: integer - Management interface port (default: 29316)
- database_uri: string - SQLite database path (e.g., sqlite:///var/lib/maubot/bot.db)
- admin_username: string - Admin UI login username
- admin_password_hash: string (secret) - Hashed admin password
- secret_key: string (secret) - Session signing key
- config_path: string (path) - Runtime config location (/var/lib/maubot/config/config.yaml)
Relationships:
- Has many: Bot Instances (1:N)
- Has many: Plugins (1:N)
- Connects to: Matrix Homeserver (1:1)
State Machine: N/A (service-level, managed by systemd)
Validation Rules:
- homeserver_url MUST be IPv4 127.0.0.1:PORT (not localhost - conduwuit compatibility)
- port MUST NOT conflict with existing services (check: 8008 Matrix, 29319 Slack bridge, 3000 Forgejo)
- admin_password_hash MUST be bcrypt with cost >=12
- secret_key MUST be >=32 bytes random
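The service-level rules above could be checked along these lines. This is a minimal sketch, not the module's actual validation: the function name `validate_service_config` and the error strings are hypothetical, and the bcrypt check only reads the cost field out of the standard `$2b$<cost>$...` hash layout.

```python
from urllib.parse import urlparse

# Ports the validation rules list as already taken by other services.
RESERVED_PORTS = {8008: "Matrix", 29319: "Slack bridge", 3000: "Forgejo"}


def validate_service_config(homeserver_url: str, port: int,
                            admin_password_hash: str, secret_key: bytes) -> list[str]:
    """Return a list of rule violations; an empty list means the config passes."""
    errors = []
    parsed = urlparse(homeserver_url)
    # The rule requires the literal IPv4 loopback address, not "localhost".
    if parsed.hostname != "127.0.0.1" or parsed.port is None:
        errors.append("homeserver_url must be 127.0.0.1:PORT")
    if port in RESERVED_PORTS:
        errors.append(f"port {port} conflicts with {RESERVED_PORTS[port]}")
    # bcrypt hashes look like "$2b$12$<salt+digest>"; field 2 is the cost.
    parts = admin_password_hash.split("$")
    if (len(parts) < 4 or not parts[1].startswith("2")
            or not parts[2].isdigit() or int(parts[2]) < 12):
        errors.append("admin_password_hash must be bcrypt with cost >= 12")
    if len(secret_key) < 32:
        errors.append("secret_key must be at least 32 bytes")
    return errors
```

The sketch cannot tell whether `secret_key` is actually random; it only enforces the length floor.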
Storage:
- NixOS module configuration: /home/dan/proj/ops-jrz1/modules/maubot.nix
- Runtime config: /var/lib/maubot/config/config.yaml
- Secrets: /run/secrets/maubot-* (sops-nix decrypted)
2. Bot Instance
Description: Individual bot deployment with specific configuration, Matrix user account, and plugin assignment.
Attributes:
- id: string (slug) - Instance identifier (e.g., instagram-bot-1)
- type: string - Plugin ID (e.g., sna.instagram)
- primary_user: string (MXID) - Matrix user ID (e.g., @instagram-bot:clarun.xyz)
- enabled: boolean - Whether bot is active
- config: object (JSON) - Plugin-specific configuration
  - For Instagram bot: {"enabled": true, "max_file_size": 50000000, "room_subscriptions": ["!roomid1:clarun.xyz"]}
- access_token: string (secret) - Matrix access token (ephemeral, stored in bot DB)
- device_id: string - Matrix device identifier
- database_path: string (optional) - Per-bot database if plugin requires (e.g., /var/lib/maubot/plugins/instagram-bot-1.db)
Relationships:
- Belongs to: Maubot Service (N:1)
- Uses: Plugin (N:1)
- Authenticated as: Matrix User (1:1)
- Subscribed to: Matrix Rooms (N:M via room_subscriptions config)
State Machine:
[created]
    ↓
[configured] ─→ [disabled]
    ↓               ↓
[enabled] ←─────────┘
    ↓
[running] ←→ [stopped]
    ↓
[failed] → [restarting]
States:
- created: Instance exists in maubot DB but not yet configured
- configured: Config provided, Matrix user created, not yet enabled
- enabled: Marked as active in config
- running: Bot process active, connected to Matrix, responding to events
- stopped: Manually stopped via management UI
- failed: Encountered error (logged to maubot service journal)
- restarting: Auto-recovery in progress
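The Bot Instance lifecycle can be expressed as a transition table. The sketch below encodes only the edges drawn in the diagram; `BotState`, `TRANSITIONS`, and `transition` are illustrative names, and the diagram shows no outgoing edge from `restarting`, so none is assumed here.

```python
from enum import Enum


class BotState(Enum):
    CREATED = "created"
    CONFIGURED = "configured"
    DISABLED = "disabled"
    ENABLED = "enabled"
    RUNNING = "running"
    STOPPED = "stopped"
    FAILED = "failed"
    RESTARTING = "restarting"


# Allowed edges, as read from the state machine diagram.
TRANSITIONS = {
    BotState.CREATED: {BotState.CONFIGURED},
    BotState.CONFIGURED: {BotState.ENABLED, BotState.DISABLED},
    BotState.DISABLED: {BotState.ENABLED},
    BotState.ENABLED: {BotState.RUNNING},
    BotState.RUNNING: {BotState.STOPPED, BotState.FAILED},
    BotState.STOPPED: {BotState.RUNNING},
    BotState.FAILED: {BotState.RESTARTING},
    BotState.RESTARTING: set(),  # no outgoing edge is drawn in the diagram
}


def transition(current: BotState, target: BotState) -> BotState:
    """Return the new state, or raise ValueError for an edge the diagram lacks."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```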
Validation Rules:
- primary_user MUST match pattern @[a-z0-9-]+:clarun.xyz
- type MUST reference an uploaded Plugin
- config.room_subscriptions MUST be array of valid Matrix room IDs (format: !...clarun.xyz)
- enabled=true requires access_token to be set (bot authenticated)
Storage:
- Instance metadata: Maubot SQLite DB (/var/lib/maubot/bot.db table: instance)
- Access tokens: Maubot SQLite DB (encrypted at rest)
- Plugin config: Maubot SQLite DB (JSON blob)
3. Plugin
Description: Packaged bot functionality (.mbp file) containing code, metadata, and dependencies.
Attributes:
- id: string - Plugin identifier (e.g., sna.instagram)
- version: string (semver) - Plugin version (e.g., 1.0.0)
- main_class: string - Python class name (e.g., InstagramBot)
- modules: array[string] - Python module list (e.g., ["instagram_bot"])
- dependencies: array[string] - Python package dependencies (e.g., ["yt-dlp>=2023.1.6", "aiohttp"])
- database: boolean - Whether plugin requires dedicated database
- config_schema: object (JSON Schema) - Plugin configuration validation schema
- upload_path: string (path) - Storage location (e.g., /var/lib/maubot/plugins/sna.instagram-v1.0.0.mbp)
Relationships:
- Belongs to: Maubot Service (N:1)
- Used by: Bot Instances (1:N)
State Machine:
[uploaded]
↓
[validated] ─→ [rejected] (invalid metadata)
↓
[loaded] ←→ [disabled]
↓
[active] (used by >=1 running instance)
↓
[trashed] → [deleted]
Validation Rules:
- id MUST match pattern [a-z][a-z0-9._-]+
- version MUST be valid semver
- main_class MUST exist in provided modules
- .mbp file MUST be valid zip containing maubot.yaml + Python files
- dependencies MUST be available in nixpkgs (e.g., yt-dlp is available, instaloader is not)
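Three of the Plugin rules (id pattern, version format, archive contents) can be checked with the standard library alone. This is a sketch under assumptions: `validate_plugin_archive` is a hypothetical helper, the semver regex is a simplified approximation rather than the full grammar, and the `main_class` and nixpkgs checks are left out because they need the plugin code and the Nix environment.

```python
import io
import re
import zipfile

PLUGIN_ID_RE = re.compile(r"[a-z][a-z0-9._-]+")
# Simplified semver: MAJOR.MINOR.PATCH with optional pre-release and build parts.
SEMVER_RE = re.compile(r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?")


def validate_plugin_archive(data: bytes, plugin_id: str, version: str) -> list[str]:
    """Return rule violations for an uploaded .mbp archive (empty list = valid)."""
    errors = []
    if not PLUGIN_ID_RE.fullmatch(plugin_id):
        errors.append("id does not match [a-z][a-z0-9._-]+")
    if not SEMVER_RE.fullmatch(version):
        errors.append("version is not valid semver")
    if not zipfile.is_zipfile(io.BytesIO(data)):
        errors.append(".mbp file is not a valid zip archive")
        return errors  # cannot inspect contents of a non-zip file
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
    if "maubot.yaml" not in names:
        errors.append("archive is missing maubot.yaml")
    if not any(name.endswith(".py") for name in names):
        errors.append("archive contains no Python files")
    return errors
```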
Storage:
- Active plugins: /var/lib/maubot/plugins/
- Trashed plugins: /var/lib/maubot/trash/
- Metadata: Maubot SQLite DB (table: plugin)
4. Bot Configuration
Description: Settings specific to bot instance including Matrix credentials, plugin settings, and room subscriptions.
Attributes:
- instance_id: string (foreign key) - References Bot Instance
- room_subscriptions: array[string] - List of Matrix room IDs where bot is active
  - Example: ["!abc123:clarun.xyz", "!def456:clarun.xyz"]
- command_prefix: string (optional) - Bot command trigger (e.g., !instagram, !ig)
- enabled_features: object - Feature flags for plugin
  - For Instagram bot: {"auto_fetch": true, "rate_limiting": true, "caching": false}
- rate_limit_config: object - Rate limiting parameters
  - Example: {"max_requests_per_minute": 10, "burst_size": 3, "backoff_seconds": 30}
- error_notification_level: string (enum) - Minimum severity for admin notifications
  - Values: DEBUG, INFO, WARN, ERROR, CRITICAL
  - Default: ERROR (per spec FR-013)
Relationships:
- Belongs to: Bot Instance (1:1)
- References: Matrix Rooms (N:M via room_subscriptions)
Validation Rules:
- room_subscriptions items MUST be valid Matrix room IDs
- command_prefix MUST NOT conflict with other bots (user responsibility)
- error_notification_level MUST be one of valid enum values
- rate_limit_config.max_requests_per_minute MUST be >0 and <=60
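A possible shape for the Bot Configuration checks is shown below. It is a sketch: `validate_bot_config` is a hypothetical name, the room ID regex is the `^!.+:.+$` pattern used elsewhere in this document, and the `command_prefix` rule is skipped because the document leaves it to the user.

```python
import re

ROOM_ID_RE = re.compile(r"^!.+:.+$")
NOTIFICATION_LEVELS = ("DEBUG", "INFO", "WARN", "ERROR", "CRITICAL")


def validate_bot_config(config: dict) -> list[str]:
    """Return rule violations for a bot config dict (empty list = valid)."""
    errors = []
    for room in config.get("room_subscriptions", []):
        if not ROOM_ID_RE.match(room):
            errors.append(f"invalid room ID: {room!r}")
    # ERROR is the documented default when the key is absent.
    level = config.get("error_notification_level", "ERROR")
    if level not in NOTIFICATION_LEVELS:
        errors.append(f"unknown error_notification_level: {level!r}")
    rpm = config.get("rate_limit_config", {}).get("max_requests_per_minute")
    if rpm is not None and not (0 < rpm <= 60):
        errors.append("max_requests_per_minute must be >0 and <=60")
    return errors
```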
Storage:
- Stored in Bot Instance config JSON blob
- Editable via:
- Maubot web UI (management interface)
- Direct config file edit + bot restart (per FR-010)
5. Admin Notification
Description: ERROR and CRITICAL level bot notifications sent to Matrix homeserver admin room (shared with other platform notifications).
Attributes:
- timestamp: datetime (ISO 8601) - When notification was generated
- source_instance: string - Bot instance ID that triggered notification
- severity: string (enum) - Log level (ERROR or CRITICAL)
- message: string - Human-readable error description
- context: object (JSON) - Additional metadata
- room_id: string (optional) - Matrix room where error occurred
- event_id: string (optional) - Matrix event that triggered error
- exception_type: string (optional) - Python exception class
- stack_trace: string (optional) - Abbreviated stack trace (last 10 lines)
Relationships:
- Triggered by: Bot Instance (N:1)
- Sent to: Matrix Admin Room (N:1, shared room: defined in ops-jrz1 config)
State Machine: N/A (notifications are fire-and-forget events)
Validation Rules:
- severity MUST be ERROR or CRITICAL (DEBUG/INFO/WARN go to logs only per FR-013)
- message MUST be non-empty
- Matrix admin room MUST exist and bot MUST have send permission
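One way to model the notification entity and its first two rules is a small dataclass. The class and helper names are illustrative, not taken from the plugin, and the admin room permission check is omitted because it needs a live Matrix client.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

NOTIFY_LEVELS = {"ERROR", "CRITICAL"}  # lower severities go to logs only


@dataclass
class AdminNotification:
    source_instance: str
    severity: str
    message: str
    context: dict = field(default_factory=dict)
    # ISO 8601 timestamp in UTC, generated when the notification is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def __post_init__(self) -> None:
        if self.severity not in NOTIFY_LEVELS:
            raise ValueError("severity must be ERROR or CRITICAL")
        if not self.message:
            raise ValueError("message must be non-empty")


def abbreviate_trace(trace: str, max_lines: int = 10) -> str:
    """Keep only the last max_lines lines of a stack trace."""
    return "\n".join(trace.splitlines()[-max_lines:])
```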
Storage:
- Not persisted (real-time notification)
- Logged to systemd journal: journalctl -u maubot.service
- Visible in maubot management dashboard (recent notifications)
6. Bot Database
Description: Per-instance isolated SQLite database for plugin state and data persistence.
Attributes:
- instance_id: string (foreign key) - References Bot Instance
- database_path: string (path) - SQLite file location (e.g., /var/lib/maubot/plugins/instagram-bot-1.db)
- schema_version: integer - Plugin-defined schema version
- size_bytes: integer - Database file size
- last_accessed: datetime - Last read/write timestamp
Relationships:
- Belongs to: Bot Instance (1:1, optional - only if plugin requires DB)
- Managed by: Plugin code (plugin-defined schema)
State Machine:
[initialized] (schema created)
↓
[active] (read/write operations)
↓
[migrating] (schema upgrade in progress)
↓
[active]
↓
[archived] (bot deleted, DB preserved)
Validation Rules:
- database_path MUST be within /var/lib/maubot/plugins/ directory
- Schema migrations MUST be handled by plugin code (not maubot framework)
- Database MUST be owned by maubot user/group
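The path containment rule can be checked lexically, as sketched below. `is_valid_database_path` is a hypothetical helper; it works on the path string only and does not resolve symlinks or check ownership, which would need filesystem access.

```python
from pathlib import PurePosixPath

PLUGIN_DB_DIR = PurePosixPath("/var/lib/maubot/plugins")


def is_valid_database_path(path: str) -> bool:
    """True if path is absolute, has no '..' parts, and sits under PLUGIN_DB_DIR."""
    candidate = PurePosixPath(path)
    # Pure paths keep ".." as a literal part, so traversal attempts are visible.
    if not candidate.is_absolute() or ".." in candidate.parts:
        return False
    return PLUGIN_DB_DIR in candidate.parents
```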
Storage:
- Location: /var/lib/maubot/plugins/<instance-id>.db
- Backup: Manual (part of /var/lib/maubot/ directory backup)
Relationships Diagram
┌─────────────────────┐
│  Matrix Homeserver  │
│    (conduwuit)      │
└──────────┬──────────┘
           │ authenticates
           │
┌──────────▼──────────┐
│   Maubot Service    │
│  ┌──────────────┐   │
│  │ Admin UI     │   │ ← admin login (sops-nix secrets)
│  │ :29316       │   │
│  └──────────────┘   │
│                     │
│  manages ↓          │
│                     │
│  ┌──────────────┐   │
│  │ Bot Instance │───┼──→ uses Plugin (.mbp)
│  │ (instagram)  │   │
│  └───┬──────────┘   │
│      │ has config   │
│      ↓              │
│  ┌──────────────┐   │
│  │ Bot Config   │   │
│  │ - rooms[]    │   │
│  │ - settings   │   │
│  └──────────────┘   │
│                     │
│  stores ↓           │
│                     │
│  ┌──────────────┐   │
│  │ Bot Database │   │ (optional, plugin-specific)
│  │ (SQLite)     │   │
│  └──────────────┘   │
└─────────────────────┘
           │ sends notifications
           ↓
┌─────────────────────┐
│  Matrix Admin Room  │ (shared with platform)
└─────────────────────┘
Configuration File Structures
Maubot Service Config
File: /var/lib/maubot/config/config.yaml
Structure:
database: "sqlite:///var/lib/maubot/bot.db"
server:
  hostname: 0.0.0.0
  port: 29316
admins:
  admin: <INJECTED_FROM_CREDENTIALS_DIRECTORY>  # Replaced at runtime
homeservers:
  clarun.xyz:
    url: http://127.0.0.1:8008
    secret: <INJECTED_REGISTRATION_TOKEN>  # Optional, for auto-registration
logging:
  level: INFO
  handlers:
    - type: journal  # Log to systemd journal
api_features:
  login: true
  plugin: true
  plugin_upload: true
  instance: true
  instance_database: true
  log: true
Generation:
- Maubot example config generated via maubot -c config.yaml -e
- Python script merges NixOS module overrides
- Secrets injected from $CREDENTIALS_DIRECTORY (systemd LoadCredential)
- Final config written to /var/lib/maubot/config/config.yaml
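The merge and injection steps could look roughly like this. It is a sketch of the idea, not the actual generation script: `deep_merge`, `read_credential`, and `build_config` are hypothetical names, the credential file name passed in is an assumption, and YAML parsing and writing are left out so the example stays within the standard library.

```python
import os
from pathlib import Path


def deep_merge(base: dict, overrides: dict) -> dict:
    """Recursively overlay overrides on base without mutating either dict."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def read_credential(name: str) -> str:
    """Read one secret that systemd LoadCredential= placed in $CREDENTIALS_DIRECTORY."""
    directory = Path(os.environ["CREDENTIALS_DIRECTORY"])
    return (directory / name).read_text().strip()


def build_config(example: dict, overrides: dict,
                 admin_user: str, credential_name: str) -> dict:
    """Merge module overrides into the example config, then inject the admin secret."""
    config = deep_merge(example, overrides)
    # Build a new admins mapping so the example config is left untouched.
    config["admins"] = {**config.get("admins", {}),
                        admin_user: read_credential(credential_name)}
    return config
```

Injecting the secret last keeps it out of the Nix store; only the runtime file under /var/lib/maubot/config/ ever contains it.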
Bot Instance Config
Stored in: Maubot SQLite DB (not file-based)
Access methods:
- Maubot web UI (http://localhost:29316/_matrix/maubot)
- Direct database edit (advanced, not recommended)
- File-based config edit + restart (for room subscriptions per FR-010)
Example config (Instagram bot):
{
  "enabled": true,
  "max_file_size": 50000000,
  "room_subscriptions": [
    "!abc123def:clarun.xyz",
    "!xyz789ghi:clarun.xyz"
  ],
  "rate_limiting": {
    "enabled": true,
    "max_requests_per_minute": 10,
    "backoff_seconds": 30
  },
  "error_notification_level": "ERROR"
}
Plugin Metadata
File: maubot.yaml (inside .mbp archive)
Structure:
id: sna.instagram
version: 1.0.0
license: MIT
modules:
  - instagram_bot
main_class: InstagramBot
database: false  # Plugin doesn't use dedicated DB
config: true  # Plugin accepts configuration
config_schema:
  type: object
  properties:
    enabled:
      type: boolean
      default: true
    max_file_size:
      type: integer
      default: 50000000
    room_subscriptions:
      type: array
      items:
        type: string
        pattern: "^!.+:.+$"
dependencies:
  - yt-dlp>=2023.1.6
  - aiohttp
  - pillow
State Persistence
Service State
Location: /var/lib/maubot/bot.db (SQLite)
Tables:
- instance - Bot instance metadata
- plugin - Uploaded plugin metadata
- client - Matrix client credentials (access tokens)
- log - Recent bot activity logs
Backup strategy:
- Included in /var/lib/maubot/ directory backup
- Rollback via NixOS generations (service config)
- Database can be wiped and rebuilt from scratch (bot re-registration required)
Runtime State
Location: Memory (maubot service process)
Contents:
- Active bot instances (Python objects)
- Matrix client connections (aiohttp sessions)
- Event handlers (registered callbacks)
- Plugin instances (loaded Python classes)
Recovery:
- Automatic on maubot service restart
- Bot instances reconnect to Matrix
- Plugin state reloaded from DB (if applicable)
Security Model
Secrets Hierarchy
- Service-level secrets (sops-nix encrypted):
  - maubot-admin-password - Management UI login
  - maubot-secret-key - Session signing
  - matrix-registration-token - Bot user creation (reused from Matrix homeserver)
- Bot-level secrets (stored in maubot DB):
  - Matrix access tokens (per bot instance)
  - Matrix device IDs
  - Plugin-specific credentials (if any)
- Runtime secrets (ephemeral):
  - Active session tokens (management UI)
  - Matrix sync tokens (E2EE keys if enabled)
Permissions
File permissions:
/var/lib/maubot/ → drwxr-x--- maubot:maubot
/var/lib/maubot/config/ → drwx------ maubot:maubot
/var/lib/maubot/config/config.yaml → -rw------- maubot:maubot (contains secrets)
/var/lib/maubot/bot.db → -rw-r----- maubot:maubot
/var/lib/maubot/plugins/ → drwxr-xr-x maubot:maubot
/run/secrets/maubot-* → -r-------- maubot:maubot (0400)
Network access:
- Management interface: localhost:29316 only (SSH tunnel required for remote access per spec)
- Matrix homeserver: localhost:8008 (IPv4, conduwuit compatibility)
- No external network access (except Matrix federation via homeserver)
Operational Entities
Health Check State
Attributes:
- last_check_timestamp: datetime
- service_status: enum (healthy, degraded, failed)
- maubot_version_endpoint: boolean - /maubot/v1/version accessible
- active_instances_count: integer
- failed_instances: array[string] - Instance IDs with errors
- last_successful_message_timestamp: datetime (per bot instance)
Storage: Systemd timer state + systemd journal logs
Health indicators (per spec SC-003):
- Service responds to HTTP health check (curl to version endpoint)
- Active instances count matches enabled instances count
- No ERROR/CRITICAL logs in last 5 minutes
- All enabled bots have recent Matrix sync activity (<10 minutes)
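The four indicators can be folded into a single `service_status` value. The mapping below is an assumption made for illustration (an unreachable version endpoint means `failed`, any other missed indicator means `degraded`); the spec may define the thresholds differently, and `HealthState` and `evaluate` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class HealthState:
    version_endpoint_ok: bool       # HTTP check against the version endpoint
    active_instances_count: int
    enabled_instances_count: int
    recent_error_logs: int          # ERROR/CRITICAL entries in the last 5 minutes
    last_sync: dict[str, datetime] = field(default_factory=dict)  # per instance


def evaluate(state: HealthState, now: datetime,
             max_sync_age: timedelta = timedelta(minutes=10)) -> str:
    """Map the health indicators to healthy / degraded / failed."""
    if not state.version_endpoint_ok:
        return "failed"
    # Instances whose last Matrix sync is 10 minutes old or more count as stale.
    stale = [name for name, seen in state.last_sync.items()
             if now - seen >= max_sync_age]
    if (state.active_instances_count != state.enabled_instances_count
            or state.recent_error_logs > 0 or stale):
        return "degraded"
    return "healthy"
```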
Data Flow Diagrams
Instagram URL Processing Flow
1. User posts Instagram URL in Matrix room
↓
2. Matrix homeserver distributes event to all clients
↓
3. Bot instance receives event (if subscribed to that room)
↓
4. Plugin regex matches Instagram URL pattern
↓
5. Plugin calls yt-dlp extraction (async thread pool)
↓
6. yt-dlp downloads media to temporary directory
↓
7. Plugin uploads media to Matrix homeserver
↓
8. Plugin sends Matrix message event with media attachment
↓
9. Cleanup temporary files
↓
10. Log extraction success/failure (severity-based notification if ERROR/CRITICAL)
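Steps 4 through 9 of this flow can be sketched as a single function. Everything here is illustrative: the URL regex is a guess (the plugin's real pattern is not given in this document), `extract` and `upload` are injected stand-ins for the yt-dlp call and the Matrix media upload, and the code is synchronous for brevity where the real plugin would be async.

```python
import re
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical pattern for post, reel, and tv links.
INSTAGRAM_URL_RE = re.compile(
    r"https?://(?:www\.)?instagram\.com/(?:p|reel|tv)/[\w-]+/?")


def process_message(body: str,
                    extract: Callable[[str, Path], Path],
                    upload: Callable[[Path], str]) -> list[str]:
    """Match Instagram URLs, fetch media to a temp dir, upload, then clean up."""
    uploaded = []
    for match in INSTAGRAM_URL_RE.finditer(body):
        # TemporaryDirectory deletes the download on exit, even if upload raises.
        with tempfile.TemporaryDirectory() as tmp:
            media_path = extract(match.group(0), Path(tmp))
            uploaded.append(upload(media_path))
    return uploaded
```

Scoping the temporary directory to one URL keeps peak disk usage near the size of a single download.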
Bot Registration Flow
1. Admin accesses maubot web UI via SSH tunnel
↓
2. Create new bot client (provide Matrix user ID)
↓
3. Maubot attempts registration via conduwuit registration token
↓
4. If successful: Access token stored in maubot DB
↓
5. Create bot instance (select plugin, provide config)
↓
6. Bot connects to Matrix homeserver
↓
7. Bot joins configured rooms (from room_subscriptions)
↓
8. Bot starts listening for events
Validation Rules Summary
Configuration Validation
- All Matrix room IDs MUST match pattern !.+:.+
- Homeserver URL MUST be http://127.0.0.1:PORT (IPv4, not localhost)
- Admin password MUST meet minimum strength (length >=16, bcrypt cost >=12)
- Plugin IDs MUST be globally unique within maubot instance
- File paths MUST be absolute and within permitted directories
Runtime Validation
- Bot instances CANNOT start without valid Matrix access token
- Room subscriptions MUST reference existing rooms (checked at runtime, logged if invalid)
- Plugin dependencies MUST be available in NixOS environment
- Rate limiting MUST be enforced before external API calls (Instagram)
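Enforcing the rate limit before each external call could be done with a token bucket driven by the `max_requests_per_minute` and `burst_size` parameters. This is one possible technique, not necessarily what the plugin implements; `TokenBucket` is a hypothetical class, and `backoff_seconds` is not modelled here.

```python
import time


class TokenBucket:
    """Allow bursts up to burst_size, refilling at max_requests_per_minute."""

    def __init__(self, max_requests_per_minute: int, burst_size: int,
                 clock=time.monotonic):
        self.rate = max_requests_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst_size)
        self.tokens = float(burst_size)
        self.clock = clock
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; call this before every external request."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The injectable `clock` exists only so the limiter can be tested without sleeping.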
Security Validation
- Secrets MUST NEVER appear in logs or config files (placeholders only)
- Management interface MUST bind localhost only (0.0.0.0 for within-container, but not exposed externally)
- Database files MUST have restrictive permissions (0600 or 0640)
- ERROR/CRITICAL notifications MUST include sanitized context (no credentials in stack traces)
Migration Strategy
From ops-base to ops-jrz1
Data migration: Not required (fresh deployment)
Configuration migration:
- Extract maubot.nix module from ops-base
- Adapt namespace: services.matrix-vm.maubot → services.maubot
- Update homeserver URL: continuwuity → conduwuit
- Remove registration_secrets (not supported by conduwuit)
- Add registration token configuration
Plugin migration:
- Copy Instagram bot .mbp file from ops-base: /home/dan/proj/sna/sna-instagram-bot.mbp
- Upload to ops-jrz1 maubot via web UI or API
- Create bot instance with room subscriptions
- Test content fetching in designated rooms
No database migration needed (SQLite DB created fresh on ops-jrz1)
Capacity Planning
Single Instagram Bot Instance
Estimated resource usage:
- Memory: ~100MB (maubot service + bot instance + yt-dlp subprocess)
- Disk:
- Maubot DB: <10MB (metadata only)
- Plugins: ~1MB per .mbp file
- Temporary files: Up to 50MB (during media download, auto-cleanup)
- CPU: Burst during media extraction (yt-dlp), idle otherwise
- Network: <1GB/day (assuming <20 Instagram fetches/day at ~50MB each)
Scale validation (per SC-002):
- Maubot service supports 3+ concurrent instances without degradation
- Each additional bot: ~50MB memory, minimal CPU/network impact
- Shared resources: Maubot DB (SQLite supports concurrent reads), management UI
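The estimates above reduce to simple arithmetic, shown here as a quick check. The function names are illustrative, and the defaults are the approximate figures from this section, not measurements.

```python
def daily_network_mb(fetches_per_day: int, mb_per_fetch: int) -> int:
    """Upper bound on daily transfer if every fetch hits the size cap."""
    return fetches_per_day * mb_per_fetch


def estimated_memory_mb(instances: int, base_mb: int = 100,
                        per_extra_mb: int = 50) -> int:
    """~100MB for the service plus first bot, ~50MB for each additional bot."""
    return base_mb + per_extra_mb * max(instances - 1, 0)
```

At 20 fetches of 50MB the bound is 1000MB, so staying under 20 fetches per day keeps traffic below roughly 1GB; three instances come to about 200MB of memory.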
Status: Data model complete. Ready for quickstart.md generation.