486 lines
15 KiB
Markdown
486 lines
15 KiB
Markdown
# Maubot Sync Not Delivering Events to Instagram Bot
|
|
|
|
**Status:** OPEN
|
|
**Severity:** HIGH
|
|
**Date Opened:** 2025-11-01
|
|
**Reporter:** Claude (session continuation)
|
|
**Affected Component:** maubot Matrix client sync loop
|
|
**Related Worklog:** [docs/worklogs/2025-11-01-maubot-instagram-bot-deployment-and-sync-investigation.org](../worklogs/2025-11-01-maubot-instagram-bot-deployment-and-sync-investigation.org)
|
|
|
|
## Problem Statement
|
|
|
|
The Instagram bot plugin deployed to maubot appears correctly configured at all layers (service, client, instance, room membership) but is not receiving Matrix events. Messages sent to the bot's configured room (`!dT40EUcemb8e6bPiig:clarun.xyz`) do not trigger the plugin's event handler, and the client's `next_batch` token remains stuck at 40267 despite multiple messages sent.
|
|
|
|
## Symptoms
|
|
|
|
1. **No event handler execution**: Plugin's `handle_message()` method never called
|
|
2. **Stuck next_batch**: Client sync token remains at 40267 across multiple checks
|
|
3. **Silent failure**: No errors in maubot service logs
|
|
4. **Successful initialization**: Plugin logs "SocialMediaBot initialized" on startup
|
|
5. **Correct configuration**: All database verification queries show expected state
|
|
|
|
## Environment
|
|
|
|
- **Server:** clarun.xyz (45.77.205.49)
|
|
- **OS:** NixOS (unstable)
|
|
- **Matrix Homeserver:** conduwuit (http://127.0.0.1:8008)
|
|
- **Maubot Version:** (from nixpkgs-unstable)
|
|
- **Bot User:** @instagram-bot:clarun.xyz
|
|
- **Test Room:** !dT40EUcemb8e6bPiig:clarun.xyz
|
|
- **Plugin:** sna-instagram-bot v1.0.0 (6946 bytes)
|
|
- **Database:** /var/lib/maubot/bot.db (SQLite)
|
|
|
|
## Evidence Collected
|
|
|
|
### Database State Verification
|
|
|
|
#### Client Configuration
|
|
```sql
|
|
SELECT id, homeserver, user_id, sync, autojoin, enabled,
|
|
length(access_token) as token_len, next_batch
|
|
FROM client WHERE id = 'instagram-bot';
|
|
```
|
|
|
|
**Result:**
|
|
- `id`: instagram-bot
|
|
- `homeserver`: http://127.0.0.1:8008
|
|
- `user_id`: @instagram-bot:clarun.xyz
|
|
- `sync`: 1 (enabled)
|
|
- `autojoin`: 1 (enabled)
|
|
- `enabled`: 1 (enabled)
|
|
- `access_token`: 32 bytes (valid)
|
|
- `next_batch`: 40267 (NOT UPDATING)
|
|
|
|
**Verified twice with 5-second gap:** next_batch remained at 40267.
|
|
|
|
#### Instance Configuration
|
|
```sql
|
|
SELECT id, enabled, config FROM instance WHERE id = 'instagram-bot-1';
|
|
```
|
|
|
|
**Result:**
|
|
- `id`: instagram-bot-1
|
|
- `enabled`: 1
|
|
- `config`: `{"enabled": true, "max_file_size": 50000000, "allowed_rooms": ["!dT40EUcemb8e6bPiig:clarun.xyz"]}`
|
|
|
|
Configuration correct with DM room ID in allowed_rooms list.
|
|
|
|
#### Room Membership
|
|
```sql
|
|
SELECT * FROM room_state WHERE room_id = '!dT40EUcemb8e6bPiig:clarun.xyz';
|
|
```
|
|
|
|
**Result:** Bot confirmed in room.
|
|
|
|
### Service Logs
|
|
|
|
#### Maubot Service Status
|
|
```bash
|
|
ssh root@45.77.205.49 'systemctl status maubot'
|
|
```
|
|
- Service: active (running)
|
|
- No errors reported
|
|
|
|
#### Startup Logs
|
|
```bash
|
|
ssh root@45.77.205.49 'journalctl -u maubot -n 50'
|
|
```
|
|
|
|
**Observed:**
|
|
- "SocialMediaBot initialized" message appears
|
|
- No initialization errors
|
|
- No configuration errors
|
|
- No authentication errors
|
|
|
|
#### Live Monitoring During Message Test
|
|
```bash
|
|
ssh root@45.77.205.49 'journalctl -u maubot -f'
|
|
```
|
|
|
|
**Test:** User sends message to room `!dT40EUcemb8e6bPiig:clarun.xyz`
|
|
|
|
**Result:** No log entries appear. Expected "Received message in room..." (INFO level) did not occur.
|
|
|
|
### Homeserver Verification
|
|
|
|
#### Conduwuit Accessibility
|
|
```bash
|
|
ssh root@45.77.205.49 'curl -s http://127.0.0.1:8008/_matrix/client/versions | jq .'
|
|
```
|
|
|
|
**Result:** Successfully returns Matrix client API versions. Homeserver responsive.
|
|
|
|
#### Other Matrix Functionality
|
|
- Slack bridge (mautrix-slack) functioning normally
|
|
- Bidirectional message flow working
|
|
- ~50 portal rooms syncing successfully
|
|
- Element client can send/receive messages normally
|
|
|
|
**Conclusion:** Conduwuit homeserver operational; issue specific to maubot client.
|
|
|
|
### Web UI Verification
|
|
|
|
**Access:** SSH tunnel to http://localhost:29316
|
|
|
|
**Observations:**
|
|
- Bot instance shows as "started"
|
|
- Client shows as enabled
|
|
- Configuration visible and correct
|
|
- No error indicators in UI
|
|
|
|
### Code Verification
|
|
|
|
#### Event Handler Registration
|
|
```python
|
|
@event.on(EventType.ROOM_MESSAGE)
|
|
async def handle_message(self, event: MessageEvent) -> None:
|
|
self.log.info(f"Received message in room {event.room_id}")
|
|
# ... rest of handler
|
|
```
|
|
|
|
**Analysis:** Decorator syntax correct, matches maubot patterns used in other plugins.
|
|
|
|
#### Config Class Registration
|
|
```python
|
|
@classmethod
|
|
def get_config_class(cls):
|
|
return Config
|
|
```
|
|
|
|
**Analysis:** Present and correct (added during session to fix config loading).
|
|
|
|
## What Has Been Ruled Out
|
|
|
|
### ❌ Configuration Issues
|
|
- [x] Config not loaded (fixed with get_config_class())
|
|
- [x] Null config in event handler (fixed with null checks)
|
|
- [x] Empty allowed_rooms (updated to include test room)
|
|
- [x] Wrong room ID (verified against Element client URL)
|
|
- [x] Bot not in room (database shows room membership)
|
|
|
|
### ❌ Authentication Issues
|
|
- [x] Invalid access token (32 bytes, confirmed in database)
|
|
- [x] Homeserver unreachable (curl test successful)
|
|
- [x] Client disabled (database shows enabled=1)
|
|
- [x] Sync disabled (database shows sync=1)
|
|
|
|
### ❌ Code Issues
|
|
- [x] Plugin crashes on startup (logs show successful init)
|
|
- [x] Event handler syntax error (no Python errors in logs)
|
|
- [x] Missing dependencies (plugin loads successfully)
|
|
- [x] Plugin not deployed (correct file size and timestamp)
|
|
|
|
### ❌ Service Issues
|
|
- [x] Maubot service not running (systemctl shows active)
|
|
- [x] Service crashes on message (no crash logs observed)
|
|
- [x] Database locked (other queries succeed)
|
|
- [x] File permissions (service has correct access)
|
|
|
|
### ❌ Matrix Server Issues
|
|
- [x] Homeserver down (curl test succeeds, Slack bridge works)
|
|
- [x] Room doesn't exist (Element shows room, user can send messages)
|
|
- [x] Encryption preventing delivery (room not encrypted)
|
|
|
|
## Hypotheses
|
|
|
|
### Hypothesis 1: Maubot Client Sync Loop Not Starting
|
|
**Likelihood:** MEDIUM
|
|
|
|
**Evidence For:**
|
|
- next_batch not updating
|
|
- No sync activity in logs
|
|
- No events delivered despite correct config
|
|
|
|
**Evidence Against:**
|
|
- No errors in logs
|
|
- Service shows as running
|
|
- Plugin initialization succeeds
|
|
|
|
**Test:** Check maubot internal state for sync loop status (no known command for this).
|
|
|
|
### Hypothesis 2: Appservice Registration Issue
|
|
**Likelihood:** LOW
|
|
|
|
**Evidence For:**
|
|
- Events might not be routed to this specific bot
|
|
- Other appservices (Slack bridge) working suggests different registration
|
|
|
|
**Evidence Against:**
|
|
- Maubot manages registration automatically
|
|
- No errors about appservice registration
|
|
- Plugin loaded successfully
|
|
|
|
**Test:** Manually verify appservice registration file exists and is correct.
|
|
|
|
### Hypothesis 3: Race Condition in Client Initialization
|
|
**Likelihood:** MEDIUM
|
|
|
|
**Evidence For:**
|
|
- Sync might start before plugin fully ready
|
|
- next_batch stuck at initialization value
|
|
|
|
**Evidence Against:**
|
|
- Multiple service restarts didn't fix
|
|
- Configuration updates didn't trigger re-sync
|
|
|
|
**Test:** Delete client and recreate via web UI (fresh start).
|
|
|
|
### Hypothesis 4: Maubot Bug with Multiple Clients
|
|
**Likelihood:** LOW
|
|
|
|
**Evidence For:**
|
|
- No other maubot bots tested on this server
|
|
- Could be resource contention or state corruption
|
|
|
|
**Evidence Against:**
|
|
- Only one maubot instance running
|
|
- Database not corrupted (queries succeed)
|
|
|
|
**Test:** Create minimal test plugin to verify maubot sync working at all.
|
|
|
|
### Hypothesis 5: Conduwuit Appservice Event Filtering
|
|
**Likelihood:** LOW-MEDIUM
|
|
|
|
**Evidence For:**
|
|
- Events might not be sent to all appservices
|
|
- Could be namespacing or user_id filtering issue
|
|
|
|
**Evidence Against:**
|
|
- Slack bridge receives events fine
|
|
- No errors in conduwuit logs about failed delivery
|
|
|
|
**Test:** Check conduwuit logs for appservice event delivery attempts.
|
|
|
|
### Hypothesis 6: Client Sync Token Corruption
|
|
**Likelihood:** MEDIUM
|
|
|
|
**Evidence For:**
|
|
- next_batch stuck at specific value (40267)
|
|
- Might be invalid token causing sync to fail silently
|
|
|
|
**Evidence Against:**
|
|
- No errors about invalid sync token
|
|
- Value looks reasonable (numeric, not obviously corrupt)
|
|
|
|
**Test:** Reset next_batch to null or empty string, force fresh sync.
|
|
|
|
## Investigation Steps (Ordered by Priority)
|
|
|
|
### Step 1: Test with Minimal Plugin ⚡ HIGH PRIORITY
|
|
**Goal:** Determine if maubot sync works at all on this server.
|
|
|
|
**Actions:**
|
|
1. Create minimal test plugin:
|
|
```python
|
|
from maubot import Plugin
|
|
from maubot.handlers import event
|
|
from mautrix.types import EventType, MessageEvent
|
|
|
|
class TestBot(Plugin):
|
|
@event.on(EventType.ROOM_MESSAGE)
|
|
async def handle(self, evt: MessageEvent) -> None:
|
|
self.log.info(f"TEST BOT GOT MESSAGE: {evt.content.body}")
|
|
await evt.respond("pong")
|
|
```
|
|
|
|
2. Deploy as new instance
|
|
3. Send test message
|
|
4. Check if handler executes
|
|
|
|
**Expected Outcome:**
|
|
- If minimal bot works: Issue specific to Instagram bot plugin
|
|
- If minimal bot fails: Systemic maubot sync issue
|
|
|
|
### Step 2: Reset Client next_batch Token ⚡ HIGH PRIORITY
|
|
**Goal:** Force fresh sync from current state.
|
|
|
|
**Actions:**
|
|
```bash
|
|
ssh root@45.77.205.49 "sqlite3 /var/lib/maubot/bot.db \"UPDATE client SET next_batch = '' WHERE id = 'instagram-bot';\""
|
|
ssh root@45.77.205.49 'systemctl restart maubot'
|
|
```
|
|
|
|
**Expected Outcome:**
|
|
- next_batch should update to new value after sync
|
|
- Events might start delivering
|
|
|
|
**Risk:** Might cause duplicate events if sync was partially working.
|
|
|
|
### Step 3: Recreate Bot Client via Web UI
|
|
**Goal:** Fresh client state, eliminate corruption.
|
|
|
|
**Actions:**
|
|
1. SSH tunnel: `ssh -L 29316:localhost:29316 root@45.77.205.49`
|
|
2. Access http://localhost:29316
|
|
3. Delete existing client "instagram-bot"
|
|
4. Create new client with same credentials
|
|
5. Update instance to use new client
|
|
6. Test message delivery
|
|
|
|
**Expected Outcome:**
|
|
- Fresh client initialization might fix sync startup
|
|
- New access token might help if token was issue
|
|
|
|
### Step 4: Check Conduwuit Logs for Event Delivery
|
|
**Goal:** Verify homeserver attempting to deliver events to maubot appservice.
|
|
|
|
**Actions:**
|
|
```bash
|
|
ssh root@45.77.205.49 'journalctl -u matrix-continuwuity --since "5 minutes ago" | grep -i maubot'
|
|
ssh root@45.77.205.49 'journalctl -u matrix-continuwuity --since "5 minutes ago" | grep -i appservice'
|
|
```
|
|
|
|
**Look for:**
|
|
- POST requests to maubot appservice endpoint
|
|
- Event delivery attempts
|
|
- Errors or timeouts
|
|
|
|
**Expected Outcome:**
|
|
- If delivery attempts present: Maubot not processing them
|
|
- If no delivery attempts: Conduwuit not routing to maubot
|
|
|
|
### Step 5: Verify Appservice Registration
|
|
**Goal:** Ensure maubot properly registered as appservice with conduwuit.
|
|
|
|
**Actions:**
|
|
```bash
|
|
# Check maubot registration file
|
|
ssh root@45.77.205.49 'cat /var/lib/maubot/registration.yaml'
|
|
|
|
# Check conduwuit knows about it
|
|
ssh root@45.77.205.49 'cat /var/lib/matrix-continuwuity/continuwuity.toml' | grep -A 10 appservice
|
|
```
|
|
|
|
**Look for:**
|
|
- Registration file exists with correct namespace
|
|
- Tokens match between maubot and conduwuit config
|
|
- URL accessible from conduwuit
|
|
|
|
### Step 6: Enable Maubot Debug Logging
|
|
**Goal:** Get more visibility into sync loop operation.
|
|
|
|
**Actions:**
|
|
1. Edit maubot config to enable debug logging
|
|
2. Restart service
|
|
3. Monitor logs during test message
|
|
4. Look for sync loop activity
|
|
|
|
**Expected Information:**
|
|
- Sync request/response cycle
|
|
- Event processing pipeline
|
|
- Any silent failures
|
|
|
|
### Step 7: Review Maubot Source Code
|
|
**Goal:** Understand sync loop initialization and failure modes.
|
|
|
|
**Actions:**
|
|
1. Find maubot package in nix store: `nix-store -q --references $(which maubot) | grep maubot`
|
|
2. Read client sync implementation
|
|
3. Look for conditions that prevent sync from starting
|
|
4. Check for silent failure paths
|
|
|
|
**Expected Outcome:**
|
|
- Identify potential bug or misconfiguration
|
|
- Find undocumented requirements
|
|
|
|
### Step 8: Compare with Working Bot
|
|
**Goal:** Identify configuration differences.
|
|
|
|
**Actions:**
|
|
If Slack bridge bot uses maubot (it doesn't, but if another working bot exists):
|
|
1. Compare database client entries
|
|
2. Compare initialization logs
|
|
3. Compare registration files
|
|
4. Identify what working bot does differently
|
|
|
|
## Workarounds
|
|
|
|
### Workaround 1: Direct Matrix Client Library
|
|
**Status:** NOT ATTEMPTED
|
|
|
|
Instead of using maubot, implement bot using matrix-nio or mautrix-python directly.
|
|
|
|
**Pros:**
|
|
- Full control over sync loop
|
|
- Better debugging visibility
|
|
- No maubot abstraction layer
|
|
|
|
**Cons:**
|
|
- More code to write
|
|
- Lose maubot plugin infrastructure
|
|
- Need to implement own config management
|
|
|
|
### Workaround 2: Webhook-Based Bot
|
|
**Status:** NOT ATTEMPTED
|
|
|
|
Use Matrix webhook events instead of client-server sync.
|
|
|
|
**Pros:**
|
|
- No sync loop needed
|
|
- Push-based event delivery
|
|
- Potentially more reliable
|
|
|
|
**Cons:**
|
|
- Requires appservice registration
|
|
- More complex setup
|
|
- Different architecture
|
|
|
|
## Related Issues
|
|
|
|
- None currently tracked (first occurrence of this issue)
|
|
|
|
## Resources
|
|
|
|
### Documentation
|
|
- Matrix Client-Server API: https://spec.matrix.org/latest/client-server-api/
|
|
- Maubot plugin development: https://docs.mau.fi/maubot/dev/getting-started.html
|
|
- mautrix-python sync protocol: https://github.com/mautrix/python
|
|
|
|
### Similar Issues (External)
|
|
- Search maubot GitHub issues for "sync not working"
|
|
- Search Matrix spec issues for client sync failures
|
|
|
|
### Internal References
|
|
- Worklog: [docs/worklogs/2025-11-01-maubot-instagram-bot-deployment-and-sync-investigation.org](../worklogs/2025-11-01-maubot-instagram-bot-deployment-and-sync-investigation.org)
|
|
- Maubot module: [modules/maubot.nix](../../modules/maubot.nix)
|
|
- Plugin source: ~/proj/sna/instagram_bot.py
|
|
- Previous maubot worklog: [docs/worklogs/2025-10-26-maubot-planning-session.org](../worklogs/2025-10-26-maubot-planning-session.org)
|
|
|
|
## Timeline
|
|
|
|
- **2025-11-01 ~12:00**: Deployed Instagram bot plugin to maubot
|
|
- **2025-11-01 ~12:30**: Fixed config loading bugs (get_config_class, null checks)
|
|
- **2025-11-01 ~13:00**: Updated allowed_rooms configuration via database
|
|
- **2025-11-01 ~13:30**: Added INFO logging to plugin
|
|
- **2025-11-01 ~14:00**: Discovered sync not delivering events (next_batch stuck)
|
|
- **2025-11-01 ~15:00**: Verified all configuration via database and web UI
|
|
- **2025-11-01 ~15:30**: Created worklog and this investigation document
|
|
|
|
## Next Actions
|
|
|
|
**Immediate (Today):**
|
|
1. ⚡ Step 1: Deploy minimal test plugin
|
|
2. ⚡ Step 2: Reset next_batch token
|
|
3. ⚡ Step 3: Recreate client via web UI
|
|
|
|
**Short-term (This Week):**
|
|
4. Step 4: Check conduwuit event delivery logs
|
|
5. Step 5: Verify appservice registration
|
|
6. Step 6: Enable maubot debug logging
|
|
|
|
**Long-term (If Unresolved):**
|
|
7. Step 7: Review maubot source code
|
|
8. Step 8: Consider direct Matrix client library implementation
|
|
9. Report bug to maubot project if confirmed
|
|
|
|
## Contact
|
|
|
|
**Assignee:** TBD (user decision on next steps)
|
|
**Reviewer:** TBD
|
|
**Related Stakeholders:** Matrix platform users awaiting Instagram bot functionality
|
|
|
|
---
|
|
|
|
**Last Updated:** 2025-11-01 15:42 PST
|
|
**Next Review:** After attempting Steps 1-3
|