diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index 53d2ae6..dea1d15 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -1,6 +1,6 @@ {"id":"ops-jrz1-00e","title":"Upgrade NixOS from 24.05 to 24.11","description":"Running NixOS 24.05.20241230 (Uakari). Current stable is 24.11. May be missing security patches. Low priority as no known critical CVEs, but should plan upgrade.","status":"open","priority":3,"issue_type":"task","created_at":"2025-12-04T21:03:22.760228514-08:00","updated_at":"2025-12-04T21:04:35.805980055-08:00","comments":[{"id":1,"issue_id":"ops-jrz1-00e","author":"dan","text":"Analysis Findings:\n1. Version Mismatch: Local flake.nix is pinned to 'nixos-24.05', but the dev environment reports '25.11' (Unstable), indicating state divergence.\n2. Upstream Bugs: Blocking issues in mautrix-slack (ops-jrz1-blh) and maubot (sync failure) are present in the current unstable revision (2025-12-02).\n3. Recommendation: Upgrade platform to NixOS 24.11 (Stable) to align environment, ensure stability, and pull fresh upstream fixes.","created_at":"2025-12-08T23:54:57Z"}]} {"id":"ops-jrz1-03o","title":"Upgrade mautrix-slack to v25.11","description":"Upgrade is just flake update + deploy. Current deployed: v0.2.3+dev.unknown (Oct 13). Flake lock: v25.10 (Oct 22). Latest nixpkgs-unstable: v25.11. Run: nix flake update nixpkgs-unstable \u0026\u0026 deploy. May fix edit panic (ops-jrz1-qxr).","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-05T18:24:18.332067067-08:00","updated_at":"2025-12-05T19:07:09.156981447-08:00","closed_at":"2025-12-05T19:07:09.156981447-08:00"} -{"id":"ops-jrz1-1bk","title":"Add CPU watchdog timer","description":"Systemd timer that detects sustained CPU abuse and kills offending user.\n\n## Script: /usr/local/bin/cpu-watchdog\n```bash\n#\\!/usr/bin/env bash\n# Detect sustained CPU abuse, kill after 5 consecutive violations\nTHRESHOLD=180 # 180% CPU (almost 2 cores)\nCOUNTFILE=\"/var/lib/cpu-watchdog\"\nmkdir -p \"$COUNTFILE\"\n\nfor user in $(ls /home); do\n id \"$user\" \u0026\u003e/dev/null || continue\n pct=$(ps -u \"$user\" -o %cpu= 2\u003e/dev/null | awk '{s+=$1}END{print int(s)}')\n pct=${pct:-0}\n \n if [ \"$pct\" -gt \"$THRESHOLD\" ]; then\n count=$(cat \"$COUNTFILE/$user\" 2\u003e/dev/null || echo 0)\n count=$((count + 1))\n echo \"$count\" \u003e \"$COUNTFILE/$user\"\n logger -t cpu-watchdog \"User $user at ${pct}% CPU (strike $count/5)\"\n \n if [ \"$count\" -ge 5 ]; then\n /usr/local/bin/killswitch \"$user\" \"sustained CPU abuse (${pct}%)\"\n rm -f \"$COUNTFILE/$user\"\n fi\n else\n rm -f \"$COUNTFILE/$user\"\n fi\ndone\n```\n\n## Systemd timer\n```nix\nsystemd.services.cpu-watchdog = {\n script = ''/usr/local/bin/cpu-watchdog'';\n serviceConfig.Type = \"oneshot\";\n};\nsystemd.timers.cpu-watchdog = {\n wantedBy = [ \"timers.target\" ];\n timerConfig = {\n OnBootSec = \"1min\";\n OnUnitActiveSec = \"1min\";\n };\n};\n```\n\n## Behavior\n- Runs every minute\n- 5 consecutive minutes at \u003e180% CPU = kill\n- Resets counter if CPU drops below threshold","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-02T20:20:53.246401154-08:00","created_by":"dan","updated_at":"2026-01-02T20:20:53.246401154-08:00","dependencies":[{"issue_id":"ops-jrz1-1bk","depends_on_id":"ops-jrz1-396","type":"blocks","created_at":"2026-01-02T20:21:14.270063028-08:00","created_by":"dan"}]} +{"id":"ops-jrz1-1bk","title":"Add CPU watchdog timer","description":"Systemd timer that detects sustained CPU abuse and kills offending user.\n\n## Script: /usr/local/bin/cpu-watchdog\n```bash\n#\\!/usr/bin/env bash\n# Detect sustained CPU abuse, kill after 5 consecutive violations\nTHRESHOLD=180 # 180% CPU (almost 2 cores)\nCOUNTFILE=\"/var/lib/cpu-watchdog\"\nmkdir -p \"$COUNTFILE\"\n\nfor user in $(ls /home); do\n id \"$user\" \u0026\u003e/dev/null || continue\n pct=$(ps -u \"$user\" -o %cpu= 2\u003e/dev/null | awk '{s+=$1}END{print int(s)}')\n pct=${pct:-0}\n \n if [ \"$pct\" -gt \"$THRESHOLD\" ]; then\n count=$(cat \"$COUNTFILE/$user\" 2\u003e/dev/null || echo 0)\n count=$((count + 1))\n echo \"$count\" \u003e \"$COUNTFILE/$user\"\n logger -t cpu-watchdog \"User $user at ${pct}% CPU (strike $count/5)\"\n \n if [ \"$count\" -ge 5 ]; then\n /usr/local/bin/killswitch \"$user\" \"sustained CPU abuse (${pct}%)\"\n rm -f \"$COUNTFILE/$user\"\n fi\n else\n rm -f \"$COUNTFILE/$user\"\n fi\ndone\n```\n\n## Systemd timer\n```nix\nsystemd.services.cpu-watchdog = {\n script = ''/usr/local/bin/cpu-watchdog'';\n serviceConfig.Type = \"oneshot\";\n};\nsystemd.timers.cpu-watchdog = {\n wantedBy = [ \"timers.target\" ];\n timerConfig = {\n OnBootSec = \"1min\";\n OnUnitActiveSec = \"1min\";\n };\n};\n```\n\n## Behavior\n- Runs every minute\n- 5 consecutive minutes at \u003e180% CPU = kill\n- Resets counter if CPU drops below threshold","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-02T20:20:53.246401154-08:00","created_by":"dan","updated_at":"2026-01-02T21:02:35.469465906-08:00","closed_at":"2026-01-02T21:02:35.469465906-08:00","close_reason":"Closed","dependencies":[{"issue_id":"ops-jrz1-1bk","depends_on_id":"ops-jrz1-396","type":"blocks","created_at":"2026-01-02T20:21:14.270063028-08:00","created_by":"dan"}]} {"id":"ops-jrz1-2bu","title":"Direct Slack bot path for learners","description":"Alternative path: learners write Python bots using slack-bolt, connect directly to Slack via Socket Mode. No Matrix, no bridge.\n\n## Architecture\n```\nlearner code → slack-bolt → Socket Mode WebSocket → Slack API\n```\n\n## Status\n\n**Done:**\n- [x] /etc/slack-learner.env with shared tokens (xoxb-, xapp-)\n- [x] learners group for access control (dantest is member)\n- [x] learner-add.sh adds users to group, sources env in .bashrc\n- [x] Design doc: docs/learner-slack-direct.md\n\n**Not Done:**\n- [ ] Starter template (~/slack-bot-template/)\n- [ ] Process management (systemd user services or supervisor)\n- [ ] #learner-sandbox channel in Slack\n- [ ] End-to-end test with real learner\n\n## Tradeoffs vs Maubot/Matrix (ops-jrz1-2pm)\n- Faster feedback (direct to Slack)\n- Excellent slack-bolt docs\n- But: shared bot identity, manual process management\n\n## Ready to Use NOW\nWorks today with terminal editors (vim/nano):\n```bash\nssh alice@ops-jrz1\npip install slack-bolt\npython bot.py # responds in Slack\n```\n\nVS Code Remote-SSH needs nix-ld deployed first.","status":"open","priority":2,"issue_type":"epic","created_at":"2025-12-29T18:56:10.239324326-05:00","created_by":"dan","updated_at":"2026-01-02T10:04:58.786306917-08:00"} {"id":"ops-jrz1-2pm","title":"Remote dev environment for learners","description":"Set up dev environments for learners to build maubot plugins (Matrix bots that can bridge to Slack).\n\n## Approach\nVS Code Remote-SSH + shared maubot + per-user Unix accounts\n\n## Architecture\n```\nlearner code → maubot → Matrix → mautrix-slack bridge → Slack\n```\n\n## Status\n\n**Done:**\n- [x] learner-add.sh / learner-remove.sh scripts\n- [x] Hello-world plugin template (templates/plugin-skeleton/)\n- [x] Test user `dantest` created with ~/plugins/hello-dantest/\n- [x] Maubot running and healthy\n\n**Not Done:**\n- [ ] nix-ld for VS Code Remote-SSH (config added, not deployed)\n- [ ] Test full VS Code Remote-SSH flow\n- [ ] Test Claude Code extension over Remote-SSH\n- [ ] #learners-sandbox Matrix room\n- [ ] Onboarding doc polish\n\n## Tradeoffs vs Direct Slack (ops-jrz1-2bu)\n- Slower feedback (bridge hop)\n- Sparse maubot docs\n- But: managed process lifecycle, per-bot identity\n\n## Docs\n- docs/learner-onboarding.md\n- docs/learner-admin.md","status":"open","priority":2,"issue_type":"epic","created_at":"2025-12-28T10:13:21.90764918-05:00","created_by":"dan","updated_at":"2026-01-02T10:04:58.472361796-08:00"} {"id":"ops-jrz1-30e","title":"Add gastown (gt) to system packages via flake input","description":"Add gastown CLI (gt) as a system-wide package.\n\n## What is gastown?\nGastown (gt) is the orchestration layer to beads' memory layer. Together they form a workflow for AI-supervised coding.\n\n## Status\n**BLOCKED**: No releases available yet. Repo uses GoReleaser but no tags/releases published.\n- Requires Go 1.24 to build from source\n- Once releases exist, add like beads: flake input + systemPackages\n\n## Source\ngithub:steveyegge/gastown\n\n## Workaround\nUsers can build locally: `go install github.com/steveyegge/gastown@latest`","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-02T16:37:47.093900357-08:00","created_by":"dan","updated_at":"2026-01-02T19:08:28.742017652-08:00","closed_at":"2026-01-02T19:08:28.742017652-08:00","close_reason":"Won't fix - not adding gastown to this server"} @@ -21,7 +21,7 @@ {"id":"ops-jrz1-6of","title":"AI cost/rate limiting per user","description":"One user could drain API credits with runaway script. Need rate limiting per user, either via proxy middleware or opencode config. Track usage.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-05T15:32:30.772304538-08:00","updated_at":"2025-12-05T17:42:42.773613559-08:00","closed_at":"2025-12-05T17:42:42.773613559-08:00","dependencies":[{"issue_id":"ops-jrz1-6of","depends_on_id":"ops-jrz1-3so","type":"parent-child","created_at":"2025-12-05T17:05:47.206816868-08:00","created_by":"daemon","metadata":"{}"},{"issue_id":"ops-jrz1-6of","depends_on_id":"ops-jrz1-wj2","type":"blocks","created_at":"2025-12-05T17:17:38.658742196-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"ops-jrz1-7j4","title":"Git credential strategy for non-programmers","description":"Non-programmers can't manage SSH keys. Pre-configure git-credential-store or provide simple PAT workflow with docs. Store in persistent home with 600 perms.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-05T15:32:19.673999683-08:00","updated_at":"2025-12-05T17:38:54.788694408-08:00","closed_at":"2025-12-05T17:38:54.788694408-08:00","dependencies":[{"issue_id":"ops-jrz1-7j4","depends_on_id":"ops-jrz1-3so","type":"parent-child","created_at":"2025-12-05T17:05:47.139749437-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"ops-jrz1-88o","title":"Implement backup strategy for VPS","description":"No backups configured. Critical data: Matrix DB (622M), PostgreSQL (161M), Forgejo (2.5M), maubot (320K). No recovery path if disk fails. Need automated backups with off-site storage.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-12-04T22:55:25.546850172-08:00","updated_at":"2025-12-05T00:56:27.720623612-08:00","closed_at":"2025-12-05T00:56:27.720623612-08:00"} -{"id":"ops-jrz1-8m7","title":"Add cgroups limits for user slices","description":"Add soft resource limits to prevent one user/agent from crashing server.\n\n## Config\n```nix\nsystemd.slices.\"user\".sliceConfig = {\n MemoryMax = \"80%\";\n TasksMax = 500;\n CPUWeight = 100; # Fair sharing, no hard quota\n};\n```\n\n## Behavior\n- Memory: Users collectively can't exceed 80% RAM\n- Tasks: Max 500 processes per user (prevents fork bombs)\n- CPU: Fair sharing when contended, bursts allowed\n\n## Testing\n- Verify with `systemctl show user-1001.slice`\n- Test fork bomb doesn't crash server","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-02T20:16:22.600133044-08:00","created_by":"dan","updated_at":"2026-01-02T20:16:22.600133044-08:00"} +{"id":"ops-jrz1-8m7","title":"Add cgroups limits for user slices","description":"Add soft resource limits to prevent one user/agent from crashing server.\n\n## Config\n```nix\nsystemd.slices.\"user\".sliceConfig = {\n MemoryMax = \"80%\";\n TasksMax = 500;\n CPUWeight = 100; # Fair sharing, no hard quota\n};\n```\n\n## Behavior\n- Memory: Users collectively can't exceed 80% RAM\n- Tasks: Max 500 processes per user (prevents fork bombs)\n- CPU: Fair sharing when contended, bursts allowed\n\n## Testing\n- Verify with `systemctl show user-1001.slice`\n- Test fork bomb doesn't crash server","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-02T20:16:22.600133044-08:00","created_by":"dan","updated_at":"2026-01-02T21:02:35.455928291-08:00","closed_at":"2026-01-02T21:02:35.455928291-08:00","close_reason":"Closed"} {"id":"ops-jrz1-9gd","title":"Upgrade VPS RAM for dev environments","description":"Current: 2GB. Need 4-8GB for multiple code-server containers. Coordinate with Vultr, plan maintenance window.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-05T17:16:54.267689439-08:00","updated_at":"2025-12-28T00:08:06.748175273-05:00","closed_at":"2025-12-28T00:08:06.748175273-05:00","close_reason":"Browser-based dev environment cancelled","dependencies":[{"issue_id":"ops-jrz1-9gd","depends_on_id":"ops-jrz1-3so","type":"parent-child","created_at":"2025-12-05T17:17:36.331146543-08:00","created_by":"daemon","metadata":"{}"}]} {"id":"ops-jrz1-9pe","title":"Research: System packages for learner accounts","description":"How do dev users get access to toolchains (Go, Node, Rust, etc.)?\n\n## Findings\n\n**Users CAN self-install packages:**\n```bash\nnix profile install nixpkgs#go\nnix profile install nixpkgs#nodejs\nnix profile install nixpkgs#rustc\n```\n\nPackages go to `~/.nix-profile/bin`, already in PATH. Works today.\n\n**Devshells work too:**\n```bash\n# In project with flake.nix\nnix develop\n```\n\n## Options\n\n| Option | Pros | Cons |\n|--------|------|------|\n| **Self-service only** | Minimal config, user learns nix | Cold start friction |\n| **Global defaults** | Zero friction for common tools | Bloats system, version conflicts |\n| **Starter script** | One command setup, customizable | Another thing to maintain |\n| **direnv + devshells** | Per-project envs, reproducible | Needs direnv installed globally |\n\n## Current State\n- `nix profile install` works for users ✅\n- `nix develop` works ✅\n- direnv NOT installed globally\n- Only python3, uv in system packages\n\n## Recommendation\n1. Add `direnv` to global packages (enables per-project devshells)\n2. Document `nix profile install` for quick one-offs\n3. Provide example flake.nix templates for Go, Node, Rust projects\n4. Keep system packages minimal (python3, uv, direnv, git, vim)\n\n## Test Results\n```\n$ nix profile install nixpkgs#go\n$ go version\ngo version go1.22.8 linux/amd64\n```","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-02T12:27:32.894163417-08:00","created_by":"dan","updated_at":"2026-01-02T12:32:32.502649201-08:00","closed_at":"2026-01-02T12:32:32.502649201-08:00","close_reason":"Users can self-install via nix profile. Added direnv globally for devshells."} {"id":"ops-jrz1-9x8","title":"Claude CLI update mechanism","description":"Claude Code CLI is manually installed to /usr/local/bin/claude.\n\n## Current state\n- Installed via: curl -fsSL https://claude.ai/install.sh | bash\n- Copied to /usr/local/bin/claude\n- No automatic updates\n\n## Options\n1. Periodic manual update (run install script again)\n2. Systemd timer to check for updates\n3. Package via nix (would need custom derivation)\n\n## Acceptance criteria\nDocument the update process at minimum.","status":"open","priority":3,"issue_type":"task","created_at":"2026-01-02T16:46:03.908575951-08:00","created_by":"dan","updated_at":"2026-01-02T16:46:03.908575951-08:00"}