Research Findings: Maubot Integration

Feature: 003-maubot-integration
Date: 2025-10-26
Status: Phase 0 complete

Overview

This research resolves the technical unknowns in extracting the maubot module from ops-base and deploying it to ops-jrz1 with Instagram bot functionality.


Decision 1: Maubot-Conduwuit Compatibility

Decision

Yes. Maubot is compatible with conduwuit, provided the bot registration method is changed (see "Key Finding" below).

Rationale

  • ops-base successfully runs maubot 0.5.2+ on continuwuity (conduwuit fork) at matrix.talu.uno
  • Over 10 production maubot instances confirmed working with conduwuit
  • Maubot uses standard Matrix Client-Server API (homeserver-agnostic)
  • ops-jrz1 conduwuit (0.5.0-rc.8) supports all required Matrix APIs

Key Finding: Registration Method Differs

ops-base pattern (continuwuity):

registration_secrets:
  matrix.talu.uno:
    url: http://127.0.0.1:6167
    secret: REPLACE_REGISTRATION_SECRET  # Shared secret registration

ops-jrz1 requirement (conduwuit):

  • Unlike Synapse, conduwuit does NOT support registration_shared_secret
  • Must use registration tokens or admin room commands for bot user creation

Registration Token Method (simpler, more secure):

  1. Configure conduwuit with registration token (from sops-nix)
  2. During bot client creation in maubot web UI, provide registration token
  3. Bot registers via standard Matrix client registration API
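
Step 3 can be sketched in Python as follows. This is a minimal sketch, assuming conduwuit follows the user-interactive authentication flow with the m.login.registration_token stage from the Matrix Client-Server specification. The helper name build_register_body and the example credentials are hypothetical, and no request is actually sent.

```python
import json
from typing import Optional


def build_register_body(username: str, password: str, token: str,
                        session: Optional[str] = None) -> dict:
    """Build the JSON body for POST /_matrix/client/v3/register.

    The first request carries no auth section; the homeserver is expected
    to answer 401 with a session ID. The second request echoes that
    session together with the registration token.
    """
    body = {"username": username, "password": password}
    if session is not None:
        body["auth"] = {
            "type": "m.login.registration_token",
            "token": token,
            "session": session,
        }
    return body


# Second-round body, once the homeserver has handed out a session ID.
print(json.dumps(
    build_register_body("maubot-bot-1", "example-password",
                        "example-token", session="abc123"),
    indent=2,
))
```

In practice the maubot web UI performs this exchange itself; the sketch only shows what the token-based registration looks like on the wire.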

Alternative: Admin Room Commands:

!admin users create-user maubot-bot-1
# Returns generated password

Integration Pattern

  • Remove registration_secrets section from maubot config
  • Remove registrationSecretFile option from NixOS module
  • Document registration token workflow in quickstart.md

Compatibility Notes

  • Database: SQLite works (no changes needed)
  • Network: Use IPv4 127.0.0.1:8008 (not localhost, which can resolve to the IPv6 address ::1; conduwuit binds IPv4 only)
  • Encryption: maubot 0.5.2+ supports E2EE with conduwuit
  • Appservice: Maubot bots are regular users, not appservice users (no appservice registration needed)

Known Issues (Resolved)

  • maubot < 0.5.2 had bug causing excessive key uploads (fixed in 0.5.2+)
  • Use latest stable maubot from nixpkgs

References

  • ops-base maubot.nix (387 lines)
  • ops-base maubot-deployment-instructions.md
  • ops-base conduwuit admin room discovery worklog

Decision 2: Instagram Content Fetching

Decision

Use yt-dlp (primary) for Instagram content extraction

Rationale

  • ops-base Instagram bot uses yt-dlp >=2023.1.6 (available in nixpkgs)
  • Proven working implementation at /home/dan/proj/sna/instagram_bot.py
  • Packaged as sna-instagram-bot.mbp and deployed successfully
  • The source bot had an instaloader fallback, but instaloader is not in nixpkgs, so production ran in yt-dlp-only mode

Implementation Pattern

Extraction Architecture:

from maubot import Plugin, MessageEvent
from maubot.handlers import event
from mautrix.types import EventType


class InstagramBot(Plugin):  # Inherits from maubot.Plugin

    @event.on(EventType.ROOM_MESSAGE)
    async def handle_message(self, evt: MessageEvent) -> None:
        # 1. Detect Instagram URLs via regex
        # 2. Extract content with yt-dlp (async thread pool)
        # 3. Upload media to Matrix homeserver
        # 4. Send to room with metadata (caption, uploader, dimensions)
        ...
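
Steps 1 and 2 of the handler might look roughly like this. This is a sketch only: the regular expression is an assumed pattern covering post, reel, IGTV and story links, not the one used in instagram_bot.py, and run_blocking is a hypothetical helper showing how a blocking extractor call can be pushed onto the default thread pool.

```python
import asyncio
import re
from typing import Any, Callable, List

# Assumed pattern: /p/ (posts), /reel/ or /reels/, /tv/ (IGTV),
# and /stories/<user>/<id>.
INSTAGRAM_URL = re.compile(
    r"https?://(?:www\.)?instagram\.com/"
    r"(?:p|reels?|tv|stories/[A-Za-z0-9_.]+)/[A-Za-z0-9_-]+/?"
)


def find_instagram_urls(body: str) -> List[str]:
    """Return every Instagram URL found in a message body, in order."""
    return INSTAGRAM_URL.findall(body)


async def run_blocking(func: Callable[..., Any], *args: Any) -> Any:
    """Run a blocking callable in the event loop's default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, func, *args)


message = "look at https://www.instagram.com/p/Cabc-123/ please"
print(find_instagram_urls(message))
```

Keeping the extractor call off the event loop matters because a slow download would otherwise stall every other handler in the same maubot process.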

Content Types Supported:

  • Posts (images)
  • Reels (videos)
  • IGTV (videos)
  • Stories (if publicly accessible)

File Handling:

  • Temporary directory for downloads (auto-cleanup)
  • Max file size: 50MB (configurable)
  • Supported formats: mp4, jpg, jpeg, png, webp
  • MIME type detection for proper Matrix msgtype
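
The file-handling rules above can be sketched as a small filter. This is illustrative rather than the bot's actual code: the classify helper and the explicit extension table are assumptions, chosen so that the mapping from file type to Matrix msgtype is visible in one place.

```python
import os
import tempfile
from typing import Optional, Tuple

MAX_FILE_SIZE = 50 * 1024 * 1024  # 50MB default, meant to be configurable

# Extension -> (MIME type, Matrix msgtype)
SUPPORTED = {
    ".mp4": ("video/mp4", "m.video"),
    ".jpg": ("image/jpeg", "m.image"),
    ".jpeg": ("image/jpeg", "m.image"),
    ".png": ("image/png", "m.image"),
    ".webp": ("image/webp", "m.image"),
}


def classify(path: str, max_size: int = MAX_FILE_SIZE) -> Optional[Tuple[str, str]]:
    """Return (mime, msgtype) for an acceptable file, or None to skip it."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED or os.path.getsize(path) > max_size:
        return None
    return SUPPORTED[ext]


# Downloads live in a temporary directory that is removed on exit.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "clip.MP4")
    with open(sample, "wb") as fh:
        fh.write(b"\x00" * 16)
    print(classify(sample))              # ('video/mp4', 'm.video')
    print(classify(sample, max_size=8))  # None
```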

Metadata Extraction:

  • Title, description, uploader
  • Dimensions (width x height)
  • Duration (for videos)
  • Posted as separate text message after media
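
A sketch of how the follow-up text message could be assembled from a yt-dlp info dictionary. The field names (title, description, uploader, width, height, duration) are standard yt-dlp info keys; the format_metadata helper and the exact message layout are assumptions.

```python
from typing import Any, Dict


def format_metadata(info: Dict[str, Any]) -> str:
    """Render the available metadata fields as a plain-text message."""
    lines = []
    if info.get("title"):
        lines.append(info["title"])
    if info.get("description"):
        lines.append(info["description"])
    if info.get("uploader"):
        lines.append(f"Uploader: {info['uploader']}")
    width, height = info.get("width"), info.get("height")
    if width and height:
        lines.append(f"Dimensions: {width} x {height}")
    if info.get("duration"):
        # Duration is only present for videos; round down to whole seconds.
        lines.append(f"Duration: {int(info['duration'])}s")
    return "\n".join(lines)


print(format_metadata({
    "title": "Sunset",
    "uploader": "alice",
    "width": 1080,
    "height": 1920,
    "duration": 12.4,
}))
```

Missing fields are skipped rather than rendered as empty labels, since image posts carry no duration and some posts have no caption.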

Rate Limiting Strategy

Current State: No rate limiting implemented in ops-base bot

Risks:

  • Burst of URLs in high-traffic room could trigger Instagram rate limits
  • No request tracking, queuing, or throttling
  • Extraction failures are logged, but there is no retry logic

Recommendations for 003-maubot-integration:

  1. Add per-room request tracking
  2. Implement exponential backoff on extraction failures
  3. Queue URLs and process with delays (e.g., 5 seconds between requests)
  4. Add configuration for max requests/minute
  5. Monitor extraction failure rates as health indicator
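
Recommendations 1 to 3 could be combined in a small per-room throttle, sketched below. Nothing like this exists in the ops-base bot; the RoomThrottle class, its parameter names and the 300-second cap are hypothetical. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from typing import Callable, Dict


class RoomThrottle:
    """Tracks the earliest time each room may issue its next request."""

    def __init__(self, min_interval: float = 5.0, max_backoff: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval
        self.max_backoff = max_backoff
        self.clock = clock
        self._next_allowed: Dict[str, float] = {}
        self._failures: Dict[str, int] = {}

    def wait_time(self, room_id: str) -> float:
        """Seconds the caller should sleep before extracting for this room."""
        return max(0.0, self._next_allowed.get(room_id, 0.0) - self.clock())

    def record(self, room_id: str, success: bool) -> None:
        """Update the room's schedule after an extraction attempt."""
        if success:
            self._failures[room_id] = 0
            delay = self.min_interval
        else:
            self._failures[room_id] = self._failures.get(room_id, 0) + 1
            # With the defaults: 10s, 20s, 40s, ... per consecutive
            # failure, capped at max_backoff.
            delay = min(self.max_backoff,
                        self.min_interval * 2 ** self._failures[room_id])
        self._next_allowed[room_id] = self.clock() + delay
```

A handler would await asyncio.sleep(throttle.wait_time(room_id)) before each extraction and call record() afterwards. The consecutive-failure counter also gives a simple input for the failure-rate monitoring in recommendation 5.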

Known Limitations

  1. Instagram API changes: yt-dlp must be updated whenever Instagram changes its interface
  2. Private content: Cannot access private posts/stories (public only)
  3. Rate limiting exposure: Heavy usage may cause temporary failures
  4. No retry logic: Failed extractions not queued for later attempt
  5. File size limits: 50MB default limit (configurable); the Matrix homeserver may enforce its own, separate upload limit
  6. No caching: Frequently shared URLs re-extracted every time

Plugin Packaging

Format: .mbp archive (zip file)

Structure:

sna-instagram-bot.mbp:
  instagram_bot.py   (11,643 bytes)
  maubot.yaml        (plugin metadata)
  README.md          (documentation)

Metadata (maubot.yaml):

id: sna.instagram
version: 1.0.0
main_class: InstagramBot
modules: [instagram_bot]

Creation:

cd /path/to/plugin
zip -r instagram-bot.mbp instagram_bot.py maubot.yaml README.md
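
The same archive can be built from Python with the standard zipfile module, which may be convenient in a deployment script. This is a sketch equivalent to the zip command above; build_mbp is a hypothetical helper name.

```python
import zipfile
from pathlib import Path
from typing import Iterable

PLUGIN_FILES = ("instagram_bot.py", "maubot.yaml", "README.md")


def build_mbp(plugin_dir: Path, output: Path,
              files: Iterable[str] = PLUGIN_FILES) -> Path:
    """Zip the listed plugin files into an .mbp archive and return its path."""
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in files:
            # arcname keeps the files at the archive root, as maubot expects
            # maubot.yaml to be at the top level of the archive.
            archive.write(plugin_dir / name, arcname=name)
    return output
```

For example, build_mbp(Path("/path/to/plugin"), Path("instagram-bot.mbp")) produces the same three-file archive as the shell command.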

Deployment Methods:

  1. API upload (automated):

    curl -X POST \
      -H "Authorization: Bearer $TOKEN" \
      -F "file=@instagram-bot.mbp" \
      "http://localhost:29316/_matrix/maubot/v1/plugins/upload"
    
  2. Web UI (manual): Upload via http://localhost:29316/_matrix/maubot (SSH tunnel)

Source Files to Adapt

  • Plugin source: /home/dan/proj/sna/instagram_bot.py
  • Plugin package: /home/dan/proj/sna/sna-instagram-bot.mbp
  • Deployment scripts: /home/dan/proj/ops-base/scripts/*instagram-bot.sh

Alternatives Considered

instaloader:

  • Rejected: Not available in nixpkgs
  • ops-base bot had fallback support, but unused in production

Official Instagram API:

  • Rejected: Requires Facebook developer approval (per spec clarifications)
  • Community scraping approach acceptable for internal team use

Decision 3: NixOS Module Adaptation Strategy

Decision

Two-layer module pattern matching mautrix-slack architecture

Rationale

  • ops-jrz1 established pattern with mautrix-slack module
  • Low-level module (services.maubot) provides full configuration surface
  • High-level wrapper (services.dev-platform.maubot) simplifies common usage
  • Consistent with existing infrastructure patterns

Source Pattern: ops-base maubot.nix

Module namespace: services.matrix-vm.maubot

Key characteristics:

  • Runtime config generation with placeholder substitution
  • systemd LoadCredential for secrets injection
  • Python script in ExecStartPre replaces placeholders
  • SQLite database at /var/lib/maubot/bot.db
  • Timer-based health monitoring (5min check + 10min auto-restart)
  • Config template at /etc/maubot/config.yaml → runtime config at /run/maubot/config.yaml

Secrets pattern:

LoadCredential = [
  "admin-password:${cfg.adminPasswordFile}"
  "secret-key:${cfg.secretKeyFile}"
  "registration-secret:${cfg.registrationSecretFile}"  # REMOVE for conduwuit
];

Target Pattern: ops-jrz1 Services

mautrix-slack.nix pattern:

  • Module namespace: services.mautrix-slack (low-level)
  • Wrapper: services.dev-platform.slackBridge in modules/dev-services.nix
  • Config: Example config generation + YAML merging via Python
  • Database: PostgreSQL via unix socket
  • Secrets: No LoadCredential (tokens from interactive login)
  • State: /var/lib/mautrix_slack/config/config.yaml (within StateDirectory)

Adaptation decisions:

Aspect          | ops-base                            | ops-jrz1 target
----------------|-------------------------------------|--------------------------------------------------------------
Namespace       | services.matrix-vm.maubot           | services.maubot + services.dev-platform.maubot
Config location | /run/maubot/config.yaml             | /var/lib/maubot/config/config.yaml
Config approach | Template substitution               | Example config + YAML merge + secret substitution
Secrets         | LoadCredential + Python replacement | LoadCredential + Python replacement (retain ops-base pattern)
Database        | SQLite /var/lib/maubot/bot.db       | SQLite (same path)
Logs            | File + journal                      | Journal only (StandardOutput)
State           | Manual StateDirectory + tmpfiles    | StateDirectory = "maubot" (systemd managed)
Health checks   | Timer-based (5min + 10min)          | Retain ops-base pattern
User/group      | maubot:maubot                       | maubot:maubot + matrix-appservices supplementary

Configuration Generation Hybrid Approach

Recommendation: Combine mautrix-slack example config pattern with ops-base secrets injection

Steps:

  1. Run maubot -c config.yaml -e to generate example config (ensures structure completeness)
  2. Python script merges structured overrides (like mautrix-slack)
  3. Write config with placeholders to StateDirectory
  4. A second script step reads the secrets from $CREDENTIALS_DIRECTORY and replaces the placeholders
  5. Final config written with proper permissions (0600)

Why hybrid:

  • Example config ensures YAML structure stays valid across maubot versions
  • LoadCredential provides better security than storing secrets in Nix store
  • Proven pattern from both source (ops-base) and target (mautrix-slack)
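
The merge in step 2 amounts to a recursive dictionary merge over the parsed YAML. A minimal sketch follows, assuming the example config and the overrides have already been loaded into plain dicts; the deep_merge name and the sample keys are illustrative, not taken from the mautrix-slack module.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with overrides applied, recursing into dicts."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


example = {"server": {"hostname": "0.0.0.0", "port": 29316},
           "database": "sqlite:maubot.db"}
overrides = {"server": {"hostname": "127.0.0.1"},
             "database": "sqlite:/var/lib/maubot/bot.db"}
print(deep_merge(example, overrides))
```

Keys that the overrides do not mention (here server.port) survive from the example config, which is what keeps the generated file structurally complete across maubot versions.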

Database Decision

Recommendation: SQLite (match ops-base)

Rationale:

  • Maubot workload is lightweight (bot state, plugin configs)
  • ops-base SQLite deployment proven stable
  • Simpler backup/restore (single file)
  • Isolation from shared PostgreSQL (Forgejo, mautrix-slack use it)
  • Less complex dependency chain
  • Adequate for small team usage (<10 bot instances)

Path: /var/lib/maubot/bot.db

Future: Support PostgreSQL via config option if scaling needs emerge

Secrets Management

Recommendation: Retain ops-base LoadCredential pattern

Secrets required:

# In secrets/secrets.yaml (add)
maubot-admin-password: "..."     # Admin UI login
maubot-secret-key: "..."         # Session signing key
# matrix-registration-token: "..." # Already exists, reuse for bot user creation

systemd configuration:

LoadCredential = [
  "admin-password:/run/secrets/maubot-admin-password"
  "secret-key:/run/secrets/maubot-secret-key"
  "registration-token:/run/secrets/matrix-registration-token"  # Reused
];

Substitution in ExecStartPre (Python script):

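import os
from pathlib import Path

# Read the secret from $CREDENTIALS_DIRECTORY (populated by LoadCredential)
admin_pw = Path(os.environ['CREDENTIALS_DIRECTORY'], 'admin-password').read_text().strip()
# Replace placeholders in the config text (config: str, loaded earlier in the script)
config = config.replace('REPLACE_ADMIN_PASSWORD', admin_pw)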
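
A slightly fuller sketch of the substitution step, covering several placeholders and the 0600 write. REPLACE_ADMIN_PASSWORD comes from the snippet above; the REPLACE_SECRET_KEY placeholder and the helper names render_config and write_private are assumptions.

```python
import os
from pathlib import Path
from typing import Dict

# Placeholder in the config template -> credential file name.
# REPLACE_SECRET_KEY is an assumed placeholder name.
PLACEHOLDERS = {
    "REPLACE_ADMIN_PASSWORD": "admin-password",
    "REPLACE_SECRET_KEY": "secret-key",
}


def render_config(template: str, cred_dir: Path,
                  placeholders: Dict[str, str] = PLACEHOLDERS) -> str:
    """Replace each placeholder with the content of its credential file."""
    for placeholder, cred_name in placeholders.items():
        secret = (cred_dir / cred_name).read_text().strip()
        template = template.replace(placeholder, secret)
    return template


def write_private(path: Path, content: str) -> None:
    """Write content to path so that only the owner can read or write it."""
    # Create with mode 0600 from the start so the secret is never
    # briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(content)
    os.chmod(path, 0o600)  # also tighten a pre-existing file
```

In the unit, cred_dir would be Path(os.environ["CREDENTIALS_DIRECTORY"]) and the output path the runtime config inside the StateDirectory.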

Why not mautrix-slack pattern:

  • mautrix-slack gets tokens via interactive login (no pre-provisioning needed)
  • Maubot requires secrets before service starts (admin UI, signing key)
  • LoadCredential keeps secrets out of Nix store and config files

Health Monitoring

Recommendation: Retain ops-base timer-based pattern

Implementation:

  • maubot-health.service (oneshot): Curl to http://localhost:29316/_matrix/maubot/v1/version every 5 minutes
  • maubot-health-restart.service (oneshot): Check for failed health checks, restart if needed (every 10 minutes)
  • systemd.timers for scheduling
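
The two oneshot units can be thought of as a probe plus a restart decision. The sketch below expresses that logic in Python for clarity; the ops-base units use curl, and the threshold of two consecutive failures is an assumption derived from the 5-minute and 10-minute intervals. The names probe and should_restart are hypothetical.

```python
import urllib.request
from typing import List

HEALTH_URL = "http://localhost:29316/_matrix/maubot/v1/version"


def probe(url: str = HEALTH_URL, timeout: float = 10.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


def should_restart(recent_results: List[bool], threshold: int = 2) -> bool:
    """Restart only when the last `threshold` probes all failed."""
    if len(recent_results) < threshold:
        return False
    return not any(recent_results[-threshold:])
```

Requiring consecutive failures avoids restarting maubot over a single slow response, at the cost of a longer worst-case recovery time.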

Why retain:

  • Maubot provides explicit health endpoint (unlike mautrix-slack)
  • ops-base pattern proven reliable
  • mautrix-slack has no health monitoring (only log-based Socket Mode checks)
  • Valuable for production stability (auto-recovery)

Directory Structure

Target layout:

/var/lib/maubot/
  ├── config/
  │   └── config.yaml          # Generated runtime config
  ├── plugins/                  # Plugin storage (.mbp files)
  ├── trash/                    # Deleted plugins
  └── bot.db                    # SQLite database

Changes from ops-base:

  • Config in StateDirectory (not /run/maubot/)
  • Logs via journal (remove /var/log/maubot/)
  • Use StateDirectory = "maubot" (systemd automatic management)

Security Hardening

Apply from mautrix-slack:

  • StateDirectory = "maubot"
  • StateDirectoryMode = "0750"
  • PrivateTmp = true
  • ProtectSystem = "strict"
  • ReadWritePaths = [ cfg.dataDir ]
  • MemoryMax = "512M" (match ops-base)
  • Standard systemd hardening flags

Remove from ops-base:

  • RuntimeDirectory (use StateDirectory)
  • LogsDirectory (use journal)
  • Manual tmpfiles rules

Integration Points

hosts/ops-jrz1.nix additions:

sops.secrets.maubot-admin-password = { mode = "0400"; };
sops.secrets.maubot-secret-key = { mode = "0400"; };

services.dev-platform.maubot = {
  enable = true;
  port = 29316;  # Management interface
};

modules/dev-services.nix additions:

options.services.dev-platform.maubot = {
  enable = mkOption { type = types.bool; default = false; };
  port = mkOption { type = types.port; default = 29316; };
};

config = mkIf cfg.maubot.enable {
  services.maubot = {
    enable = true;
    homeserverUrl = "http://127.0.0.1:${toString cfg.matrix.port}";
    serverName = cfg.matrix.serverName;
    port = cfg.maubot.port;
    # ... map other options
  };
};

Alternatives Considered

Pure mautrix-slack pattern:

  • Rejected: Would require removing LoadCredential and storing secrets in config
  • Less secure (secrets in Nix store or config files)
  • More code rewrite from proven ops-base pattern

Keep ops-base pattern exactly:

  • Rejected: Inconsistent with ops-jrz1 conventions
  • Manual directory management instead of StateDirectory
  • File-based logging instead of journal
  • Less integration with dev-platform namespace

Technical Context Summary

  • Language/Version: Python 3.11 (maubot runtime)
  • Primary Dependencies: maubot 0.5.2+, yt-dlp >=2023.1.6, aiohttp, SQLite
  • Storage: SQLite at /var/lib/maubot/bot.db
  • Testing: Manual QA (automated tests future enhancement)
  • Target Platform: NixOS 24.05+ on ops-jrz1 VPS (45.77.205.49)
  • Project Type: Infrastructure service (NixOS module)
  • Performance Goals: <5 second Instagram content fetch (per SC-001), 99% uptime over 7 days (per SC-003)
  • Constraints: localhost-only management interface (SSH tunnel required), single Instagram bot instance initially
  • Scale/Scope: 1 Instagram bot instance MVP, architecture validated for 3 concurrent instances (SC-002)


Platform Vision Alignment

Core Philosophy Adherence

Build It Right Over Time:

  • Extract proven maubot module from ops-base (avoid reinvention)
  • Declarative NixOS module pattern
  • Self-documenting via quickstart.md and inline comments
  • Sustainable pattern (matches existing mautrix-slack infrastructure)

Presentable State First:

  • Working Instagram bot demonstrates value immediately
  • Clear documentation (research.md, quickstart.md, contracts/)
  • Professional deployment pattern (consistent with mautrix-slack)

Architecture Principles

Communication Layer:

  • Maubot extends Matrix functionality (bot framework)
  • Instagram bot brings external content into Matrix (enriches communication)
  • Aligns with Matrix-centric hub architecture

Deployment Philosophy:

  • NixOS-Native pattern (module + sops-nix secrets)
  • Declarative and reproducible
  • Built-in rollback (NixOS generations)
  • Clear separation: infrastructure (maubot service) vs application (Instagram plugin)

Sustainability:

  • Small team focus (single bot instance initially, validate 3-instance capability)
  • Quality over speed (comprehensive research before implementation)
  • Proven patterns (extract from ops-base, not experimental)

Risk Assessment

Low Risk

  • SQLite database (proven, simple)
  • LoadCredential secrets (ops-base pattern working)
  • Health monitoring (non-intrusive timers)
  • StateDirectory approach (standard systemd)

Medium Risk

  • conduwuit compatibility (ops-base uses continuwuity fork)
    • Mitigation: Early testing of bot registration and Matrix connection
  • Two-layer module pattern (new for maubot, proven with mautrix-slack)
    • Mitigation: Follow exact mautrix-slack pattern
  • Instagram scraping stability (yt-dlp depends on Instagram not changing)
    • Mitigation: yt-dlp actively maintained, ops-base deployment proven

Requires Testing

  • Registration token workflow with conduwuit (different from ops-base shared secret)
  • Management interface localhost binding (security requirement)
  • Instagram content fetching with current yt-dlp version
  • Bot response in designated rooms only (room-based activation per FR-006)
  • Auto-recovery after homeserver restart (SC-004)

Next Steps

Phase 1: Design & Contracts

  1. Generate data-model.md with entities:
    • Maubot Service, Bot Instance, Plugin, Bot Configuration, Admin Notification, Bot Database
  2. Generate contracts/ with configuration schemas (if applicable)
  3. Generate quickstart.md with deployment runbook including:
    • Registration token setup
    • Bot creation workflow
    • Room subscription configuration
    • Admin room access procedure
  4. Update AGENTS.md with maubot, yt-dlp context

Phase 2: Implementation Planning

  1. Extract maubot.nix from ops-base to ops-jrz1
  2. Adapt namespace and configuration patterns
  3. Add sops secrets declarations
  4. Create dev-platform wrapper in dev-services.nix
  5. Test service startup and conduwuit connection
  6. Deploy Instagram plugin
  7. Validate SC-001 through SC-008

References

Source Files Analyzed

  • /home/dan/proj/ops-base/vm-configs/modules/maubot.nix (387 lines)
  • /home/dan/proj/ops-base/vm-configs/modules/continuwuity.nix (413 lines)
  • /home/dan/proj/ops-base/docs/maubot-deployment-instructions.md
  • /home/dan/proj/ops-base/docs/continuwuit-appservice-registration-guide.md
  • /home/dan/proj/ops-jrz1/modules/mautrix-slack.nix (current)
  • /home/dan/proj/ops-jrz1/modules/dev-services.nix (current)
  • /home/dan/proj/ops-jrz1/docs/platform-vision.md (architecture principles)
  • /home/dan/proj/sna/instagram_bot.py (11,643 bytes)
  • /home/dan/proj/sna/sna-instagram-bot.mbp (packaged plugin)


Status: Research complete. All technical unknowns resolved. Ready for Phase 1 design.