Add security posture analysis and fix home dir permissions

- docs/security-posture.md: Threat model, risk assessment, recommendations
- Make home directories private (chmod 700)
- Update learner-add.sh to create private homes
- Closes ops-jrz1-k2a

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 19:14:07 -08:00 · 3b91f37975
parent 219a38b7aa
commit 3b91f37975
3 changed files with 111 additions and 1 deletions
--- a/.beads/issues.jsonl
+++ b/.beads/issues.jsonl
@@ -2,7 +2,7 @@
 {"id":"ops-jrz1-03o","title":"Upgrade mautrix-slack to v25.11","description":"Upgrade is just flake update + deploy. Current deployed: v0.2.3+dev.unknown (Oct 13). Flake lock: v25.10 (Oct 22). Latest nixpkgs-unstable: v25.11. Run: nix flake update nixpkgs-unstable \u0026\u0026 deploy. May fix edit panic (ops-jrz1-qxr).","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-05T18:24:18.332067067-08:00","updated_at":"2025-12-05T19:07:09.156981447-08:00","closed_at":"2025-12-05T19:07:09.156981447-08:00"}
 {"id":"ops-jrz1-2bu","title":"Direct Slack bot path for learners","description":"Alternative path: learners write Python bots using slack-bolt, connect directly to Slack via Socket Mode. No Matrix, no bridge.\n\n## Architecture\n```\nlearner code → slack-bolt → Socket Mode WebSocket → Slack API\n```\n\n## Status\n\n**Done:**\n- [x] /etc/slack-learner.env with shared tokens (xoxb-, xapp-)\n- [x] learners group for access control (dantest is member)\n- [x] learner-add.sh adds users to group, sources env in .bashrc\n- [x] Design doc: docs/learner-slack-direct.md\n\n**Not Done:**\n- [ ] Starter template (~/slack-bot-template/)\n- [ ] Process management (systemd user services or supervisor)\n- [ ] #learner-sandbox channel in Slack\n- [ ] End-to-end test with real learner\n\n## Tradeoffs vs Maubot/Matrix (ops-jrz1-2pm)\n- Faster feedback (direct to Slack)\n- Excellent slack-bolt docs\n- But: shared bot identity, manual process management\n\n## Ready to Use NOW\nWorks today with terminal editors (vim/nano):\n```bash\nssh alice@ops-jrz1\npip install slack-bolt\npython bot.py  # responds in Slack\n```\n\nVS Code Remote-SSH needs nix-ld deployed first.","status":"open","priority":2,"issue_type":"epic","created_at":"2025-12-29T18:56:10.239324326-05:00","created_by":"dan","updated_at":"2026-01-02T10:04:58.786306917-08:00"}
 {"id":"ops-jrz1-2pm","title":"Remote dev environment for learners","description":"Set up dev environments for learners to build maubot plugins (Matrix bots that can bridge to Slack).\n\n## Approach\nVS Code Remote-SSH + shared maubot + per-user Unix accounts\n\n## Architecture\n```\nlearner code → maubot → Matrix → mautrix-slack bridge → Slack\n```\n\n## Status\n\n**Done:**\n- [x] learner-add.sh / learner-remove.sh scripts\n- [x] Hello-world plugin template (templates/plugin-skeleton/)\n- [x] Test user `dantest` created with ~/plugins/hello-dantest/\n- [x] Maubot running and healthy\n\n**Not Done:**\n- [ ] nix-ld for VS Code Remote-SSH (config added, not deployed)\n- [ ] Test full VS Code Remote-SSH flow\n- [ ] Test Claude Code extension over Remote-SSH\n- [ ] #learners-sandbox Matrix room\n- [ ] Onboarding doc polish\n\n## Tradeoffs vs Direct Slack (ops-jrz1-2bu)\n- Slower feedback (bridge hop)\n- Sparse maubot docs\n- But: managed process lifecycle, per-bot identity\n\n## Docs\n- docs/learner-onboarding.md\n- docs/learner-admin.md","status":"open","priority":2,"issue_type":"epic","created_at":"2025-12-28T10:13:21.90764918-05:00","created_by":"dan","updated_at":"2026-01-02T10:04:58.472361796-08:00"}
-{"id":"ops-jrz1-30e","title":"Add gastown (gt) to system packages via flake input","description":"Add gastown CLI (gt) as a system-wide package.\n\n## What is gastown?\nGastown (gt) is the orchestration layer to beads' memory layer. Together they form a workflow for AI-supervised coding.\n\n## Status\n**BLOCKED**: No releases available yet. Repo uses GoReleaser but no tags/releases published.\n- Requires Go 1.24 to build from source\n- Once releases exist, add like beads: flake input + systemPackages\n\n## Source\ngithub:steveyegge/gastown\n\n## Workaround\nUsers can build locally: `go install github.com/steveyegge/gastown@latest`","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-02T16:37:47.093900357-08:00","created_by":"dan","updated_at":"2026-01-02T17:30:30.182568728-08:00"}
+{"id":"ops-jrz1-30e","title":"Add gastown (gt) to system packages via flake input","description":"Add gastown CLI (gt) as a system-wide package.\n\n## What is gastown?\nGastown (gt) is the orchestration layer to beads' memory layer. Together they form a workflow for AI-supervised coding.\n\n## Status\n**BLOCKED**: No releases available yet. Repo uses GoReleaser but no tags/releases published.\n- Requires Go 1.24 to build from source\n- Once releases exist, add like beads: flake input + systemPackages\n\n## Source\ngithub:steveyegge/gastown\n\n## Workaround\nUsers can build locally: `go install github.com/steveyegge/gastown@latest`","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-02T16:37:47.093900357-08:00","created_by":"dan","updated_at":"2026-01-02T19:08:28.742017652-08:00","closed_at":"2026-01-02T19:08:28.742017652-08:00","close_reason":"Won't fix - not adding gastown to this server"}
 {"id":"ops-jrz1-3au","title":"Research: Learner deployment pipeline","description":"How does learner code get to \"prod\" (running services)?\n\n## Current context\n- Maubot: Upload .mbp via web UI\n- Slack bots: Manual `python bot.py` or systemd user service\n\n## Questions\n1. Can learners run persistent services? (`systemctl --user`)\n2. Should they have access to maubot admin UI?\n3. Git-based deploy? Push to trigger reload?\n4. Who can restart what?\n\n## Options\n- **Manual only** - Learner runs in foreground/tmux\n- **User systemd** - `systemctl --user enable mybot`\n- **Supervised** - Central supervisor manages learner procs\n- **GitOps** - Push to deploy (complex)\n\n## Security considerations\n- What if learner bot crashes in loop?\n- Resource limits on user services?\n- Can learner affect other learners' services?","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-02T12:27:34.107447487-08:00","created_by":"dan","updated_at":"2026-01-02T12:27:34.107447487-08:00"}
 {"id":"ops-jrz1-3b1","title":"Research: Agentic coder sandboxing","description":"When Claude Code runs on server, what can it do? Should we limit it?\n\n## Current state\n- No sandbox\n- Claude has full user privileges\n- Can run any command user can run\n\n## Risks\n- `rm -rf ~` (accidental or hallucinated)\n- Network exfiltration\n- Resource exhaustion (fork bomb, disk fill)\n- Credential theft from env/files\n\n## Options\n1. **Trust the agent** - User's problem if Claude breaks things\n2. **Command allowlist** - Only approved commands\n3. **Container sandbox** - Run agent in container\n4. **Snapshot/rollback** - Easy recovery if things break\n5. **Audit logging** - At least know what happened\n\n## Questions\n- What do other agentic coding setups do?\n- Is this overkill for a learning environment?\n- Does Claude Code have built-in safety?","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-02T12:27:33.705283658-08:00","created_by":"dan","updated_at":"2026-01-02T12:27:33.705283658-08:00"}
 {"id":"ops-jrz1-3ca","title":"Persist opencode state/cache across restarts","description":"opencode may store index/cache in ~/.cache or other dirs not covered by current bind mounts. AI context could be lost on container restart. Verify and add mounts.","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-05T15:32:30.90315778-08:00","updated_at":"2025-12-28T00:05:44.753074955-05:00","closed_at":"2025-12-28T00:05:44.753074955-05:00","close_reason":"Parent epic cancelled - browser-based dev approach abandoned","dependencies":[{"issue_id":"ops-jrz1-3ca","depends_on_id":"ops-jrz1-3so","type":"parent-child","created_at":"2025-12-05T17:05:47.247361009-08:00","created_by":"daemon","metadata":"{}"}]}
--- a/docs/security-posture.md
+++ b/docs/security-posture.md
@@ -0,0 +1,107 @@
+# Security Posture Analysis: ops-jrz1 Dev Server
+
+## Executive Summary
+
+ops-jrz1 is a shared development server for learning agentic coding. Current security posture is **acceptable for a trusted learning environment** but has gaps that should be addressed before onboarding untrusted users.
+
+## Threat Model
+
+### Assumed Users
+- **Learners**: Trusted individuals learning agentic coding
+- **Agentic Coders**: AI tools (Claude, opencode, etc.) running with user privileges
+
+### Threat Actors
+1. **Curious learner** - Explores beyond their sandbox (low intent, low skill)
+2. **Compromised AI agent** - Prompt injection or malicious code execution
+3. **Malicious insider** - Intentional abuse (not in scope for learning environment)
+
+### Out of Scope
+- External attackers (network is properly firewalled)
+- Malicious insiders with intent to harm
+
+## Current Security Controls
+
+### ✅ Good
+| Control | Status |
+|---------|--------|
+| Firewall | Only 22, 80, 443 open |
+| Internal services | Bound to 127.0.0.1 only |
+| sudo access | Denied for learner users |
+| sops secrets | Root-only (/run/secrets) |
+| SSH auth | Key-only, no passwords |
+
+### ⚠️ Gaps
+| Gap | Risk | Impact |
+|-----|------|--------|
+| Home dirs world-readable | Users can browse each other's code | Low - learning env |
+| Shared Slack tokens | All users share one bot identity | Medium - actions attributed to same bot |
+| Unrestricted egress | Users/agents can reach any network | Medium - data exfil possible |
+| No resource limits | One user can exhaust CPU/RAM/disk | Medium - DoS other users |
+| Agentic coder privileges | AI runs with full user permissions | Medium - same as human user |
+
+## Asset Inventory
+
+### Secrets at Risk
+| Secret | Location | Access | Risk |
+|--------|----------|--------|------|
+| Slack bot token | /etc/slack-learner.env | learners group | Low - shared intentionally |
+| Slack app token | /etc/slack-learner.env | learners group | Low - shared intentionally |
+| Matrix registration | /run/secrets | root only | Protected |
+| Maubot credentials | /run/secrets | root only | Protected |
+| User SSH keys | ~/.ssh | user only | Protected |
+| User AI auth tokens | ~/.config/* | user only | User responsibility |
+
+### Services
+| Service | Port | Exposure | Notes |
+|---------|------|----------|-------|
+| SSH | 22 | Public | Key auth only |
+| nginx | 80/443 | Public | Reverse proxy |
+| PostgreSQL | 5432 | localhost | Unix socket preferred |
+| Conduwuit | 8008 | localhost | Matrix homeserver |
+| Forgejo | 3000 | localhost | Git repos |
+| mautrix-slack | 29319 | localhost | Bridge |
+| maubot | 29316 | localhost | Bot framework |
+
+## Risk Assessment
+
+### What can a compromised user/agent do?
+1. **Read other users' code** - Home dirs are world-readable
+2. **Use shared Slack bot** - Post messages as the shared bot
+3. **Exhaust resources** - No cgroups/quotas
+4. **Exfiltrate data** - Unrestricted network egress
+5. **Install malware** - In their own profile, persists across sessions
+
+### What can they NOT do?
+1. ❌ Read /run/secrets
+2. ❌ sudo/become root
+3. ❌ Access other users' SSH keys
+4. ❌ Modify system configuration
+5. ❌ Access PostgreSQL directly (unless through app)
+
+## Recommendations
+
+### Priority 1 (Do Now)
+- [ ] **Make home dirs private** - `chmod 700 /home/*`
+- [ ] **Document shared Slack token risk** - Users know bot is shared
+
+### Priority 2 (Before More Users)
+- [ ] **Add resource limits** - cgroups for CPU/memory per user
+- [ ] **Disk quotas** - Prevent one user filling disk
+- [ ] **Consider per-user Slack apps** - If attribution matters
+
+### Priority 3 (Hardening)
+- [ ] **Network egress filtering** - Allowlist or log outbound
+- [ ] **Audit logging** - Track user commands (auditd)
+- [ ] **Separate AI auth storage** - Investigate ~/.config isolation
+
+### Not Recommended
+- ❌ Full sandboxing (VMs, containers per user) - Overkill for learning
+- ❌ Restricting nix profile install - Breaks user autonomy
+
+## Conclusion
+
+For a **trusted learning environment**, current posture is acceptable with minor fixes (home dir permissions). Before scaling to untrusted users, implement Priority 2 controls.
+
+---
+*Generated: 2026-01-03*
+*Issue: ops-jrz1-k2a*
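
The Priority 1 recommendation in this document (`chmod 700 /home/*`) can be checked after the fact. The following is a minimal sketch, not part of this commit, assuming GNU `find` (for the `-perm /077` syntax) and that learner homes sit directly under `/home`. The function name `list_non_private_homes` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical audit helper (not part of this commit).
# Prints every directory directly under the given base path that still
# grants any permission bit to group or other.
list_non_private_homes() {
    local base="${1:-/home}"
    # GNU find: -perm /077 matches when ANY of the group/other bits is set.
    find "$base" -mindepth 1 -maxdepth 1 -type d -perm /077
}
```

Running `list_non_private_homes /home` as root should print nothing once every home directory is mode 700. Any path it prints is still readable, writable, or traversable by someone other than its owner.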
--- a/scripts/learner-add.sh
+++ b/scripts/learner-add.sh
@@ -63,6 +63,9 @@ create_user() {
    # NixOS: don't specify shell (uses default), group is 'users'
    useradd -m -g users "$username"

+    # Make home directory private (not world-readable)
+    chmod 700 "/home/$username"
+
    # Add to learners group for Slack token access
    usermod -aG learners "$username"
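
The `chmod 700` added to `create_user()` only applies to accounts created after this change. Homes that already exist need a one-time pass. The following is a minimal sketch of such a pass, assuming homes live directly under `/home` and GNU coreutils are available. The function name `make_homes_private` is hypothetical and does not appear in `learner-add.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical one-time retrofit (not part of learner-add.sh).
# Applies mode 700 to every real directory directly under the base path.
make_homes_private() {
    local base="${1:-/home}"
    local dir
    for dir in "$base"/*/; do
        dir="${dir%/}"                 # drop the trailing slash from the glob
        [ -d "$dir" ] || continue      # glob matched nothing
        [ -L "$dir" ] && continue      # skip symlinks so chmod never follows them
        chmod 700 "$dir"
    done
}
```

Run as root, `make_homes_private /home` would bring older accounts in line with the mode that new accounts now get. It changes only the top-level directory mode and leaves the files inside untouched.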