Commit graph

29 commits

Author SHA1 Message Date
Dan b62f649a28 Add disaster recovery runbook draft
Documents restore procedures for full server loss, partial restore,
and user data recovery scenarios. Includes verification checklists,
time estimates, and break-glass quick reference.

Also documents known gaps (home dirs, ACME, RocksDB consistency)
that need fixing before the runbook is production-ready.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 14:02:01 -08:00
Dan 026f82e697 Document AI agent sandbox conflicts in server-AGENTS.md
Codex CLI seccomp filters block nix daemon access.
Workaround: disable redundant sandbox since server provides isolation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 23:33:19 -08:00
Dan 99b187fa5a Document security model: simple Unix isolation 2026-01-09 16:31:11 -08:00
Dan f17604f0ad Add Forgejo admin operations doc 2026-01-09 15:09:09 -08:00
Dan 92d7646d52 Migrate Slack tokens to sops-nix, improve egress rate limits
- Remove beads from VPS deployment (kept locally for dev workflow)
- Add slack-bot-token and slack-app-token secrets with devs group access
- Remove dead acme-email secret reference
- Increase egress limits from 30/min to 150/min (burst 60→300)
- Change egress blocking from REJECT to DROP for better app behavior
- Add egress-status script for user self-diagnosis
- Update dev-slack-direct.md with new /run/secrets access patterns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 11:14:19 -08:00
Dan d3151b39ed Add mosh alternative to dev onboarding doc 2026-01-05 17:38:42 -08:00
Dan a25abda825 Add Unix social tools section to dev onboarding doc
Documents who, w, finger, write, wall, ytalk and .plan files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 15:34:55 -08:00
Dan c236deb480 Add zig to AGENTS.md available tools 2026-01-04 16:43:44 -08:00
Dan 1158f3a37b Add bun as preferred JS package manager for faster installs 2026-01-04 13:49:56 -08:00
Dan 74cf842afd Improve dev onboarding: devs group, npm setup, AGENTS.md
- Add users.groups.devs for shared resources
- dev-add: check devs group exists before creating user
- dev-add: use .profile for login shell PATH setup
- dev-add: configure npm prefix and .npm-global directory
- dev-add: create AGENTS.md with friendly capability guide
- Update onboarding message with npm install examples
- Add docs/server-AGENTS.md for reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 17:11:03 -08:00
Dan bd49ea001a Add documentation for adding dev tools
Covers four methods: system-wide, per-user nix profile,
per-project devShell, and external flakes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 11:00:15 -08:00
Dan bc81b4ec15 Rename learner to dev across codebase
- scripts/learner-*.sh → scripts/dev-*.sh
- docs/learner-*.md → docs/dev-*.md
- tests/test-learner-env.sh → tests/test-dev-env.sh
- Update all internal references

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-03 10:42:34 -08:00
Dan 3b91f37975 Add security posture analysis and fix home dir permissions
- docs/security-posture.md: Threat model, risk assessment, recommendations
- Make home directories private (chmod 700)
- Update learner-add.sh to create private homes
- Closes ops-jrz1-k2a

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 19:14:07 -08:00
Dan 1b1a91f9cb Switch to user-managed npm for AI coding tools
- Remove manual /usr/local/bin/claude install
- Remove claude symlink setup from learner-add.sh
- Update onboarding docs with npm install instructions
- Users choose their AI coder: claude, opencode, gemini, codex

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 19:03:35 -08:00
Dan 5732205758 Add onboarding doc for dan user 2026-01-02 15:05:41 -08:00
Dan 0ad7ca7b98 Add direct Slack bot path for learners
- learner-add.sh: add users to learners group, source Slack env
- New design doc comparing direct Slack vs maubot/Matrix approach

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 18:56:03 -05:00
Dan 3d33a45cc9 Add learner dev environment, testing infrastructure, and skills
Learner account management:
- learner-add.sh: create accounts with SSH, plugin skeleton
- learner-remove.sh: remove accounts with optional archive
- plugin-skeleton template: starter maubot plugin

Testing:
- flake.nix: add checks output for pre-deploy validation
- smoke-test.sh: post-deploy service verification

Documentation:
- learner-onboarding.md: VS Code Remote-SSH setup guide
- learner-admin.md: account management procedures

Skills:
- code-review.md: multi-lens code review skill
- orch, worklog: symlinks to shared skills
2025-12-28 22:23:06 -05:00
Dan b40b88bb7f Add docs, ignore local dev config 2025-12-08 16:31:40 -08:00
Dan f25a8b06ef Production hardening and technical debt cleanup
Priority 1 - Production Quality:
- Revert Matrix homeserver log level from debug to info
- Reduces log volume by ~70% (22k+ lines/day to <7k)
- Improves performance and reduces disk usage

Priority 2 - Technical Debt:
- Automate sender_localpart fix in mautrix-slack.nix
- Eliminates manual sed command on fresh deployments
- Fix verified working (tested 2025-10-26)
- Update CLAUDE.md to document automated solution

Priority 3 - Project Hygiene:
- Remove unused mautrix-whatsapp and mautrix-gmessages imports
- Archive old configurations to docs/examples/alternative-deployments/
- Remove stale staging/ directories from 001 extraction workflow
- Update deployment documentation in tasks.md and quickstart.md
- Add deployment status notes to spec files

Files Modified:
- modules/dev-services.nix: log level debug → info
- modules/mautrix-slack.nix: automatic sender_localpart fix
- hosts/ops-jrz1.nix: remove unused bridge imports
- CLAUDE.md: update Known Issues, add Resolved Issues section
- specs/002-*/: add deployment status notes
- configurations/ → docs/examples/alternative-deployments/

Tested and Verified:
- All services running (matrix, bridge, forgejo, postgresql, nginx)
- Bridge authenticated and message flow working
- sender_localpart fix generates correct registration file
2025-10-26 15:59:05 -07:00
Dan 0b1751766b Ignore worklogs directory for security
Worklogs may contain sensitive troubleshooting information, error messages,
tokens, or infrastructure details that should not be in version control.
2025-10-26 14:37:26 -07:00
Dan bce31933ed Add platform vision and spec-kit integration docs 2025-10-26 14:36:52 -07:00
Dan d69f8a4ac8 Add Forgejo repository setup worklog 2025-10-26 14:36:42 -07:00
Dan a00a5fe312 Deploy mautrix-slack bridge with IPv4 networking fixes
Changes:
- Fix nginx proxy_pass directives to use 127.0.0.1 instead of localhost
- Fix bridge homeserverUrl to use explicit IPv4 address
- Enable debug logging on conduwuit
- Add spec-kit framework files to .gitignore
- Document deployment in comprehensive worklog

Resolves connection refused errors from localhost resolving to IPv6 [::1]
while services bind only to IPv4 127.0.0.1. Bridge now fully operational
with bidirectional Slack-Matrix message flow working.
2025-10-26 14:33:00 -07:00
Dan c4a00356fc Add comprehensive security & validation test report for Generation 31
Performed full security audit including:
- Matrix API endpoint validation
- TLS/nginx reverse proxy verification
- sops-nix secrets management testing
- Firewall and network security analysis
- SSH hardening verification
- Database connectivity and permissions
- System integrity and log review

Results: All critical tests PASSED
- Excellent network isolation (Matrix/PostgreSQL localhost-only)
- Proper secrets encryption with sops-nix
- Strong SSH hardening (key-only authentication)
- Valid TLS with HSTS enabled
- Minimal attack surface (only SSH/HTTP/HTTPS exposed)

Known issues documented:
- mautrix-slack exit code 11 (non-critical)
- fail2ban not enabled (optional enhancement)
- Forgejo migrations in progress (temporary)

System validated as PRODUCTION READY.

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 22:25:08 -07:00
Dan 64246a6615 Deploy Generation 31 with sops-nix secrets management
Successfully deployed ops-jrz1 Matrix platform to production VPS using
extracted modules from ops-base. Validated deployment workflow following
ops-base best practices: boot -> reboot -> verify.

Changes:
- Pin sops-nix to June 2024 version for nixpkgs 24.05 compatibility
- Configure sops secrets for Matrix registration token and ACME email
- Add encrypted secrets.yaml (safe to commit, encrypted with age)
- Document deployment process and lessons learned

All services verified running:
- Matrix homeserver (matrix-continuwuity): conduwuit 0.5.0-rc.8
- nginx: Proxying Matrix and Forgejo
- PostgreSQL 15.10: Database services
- Forgejo 7.0.12: Git platform

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 21:32:23 -07:00
Dan dbbe460ad0 Add worklog documenting migration strategy and deployment planning
Documents:
- Server relationship clarification (ops-base → ops-jrz1 same VPS)
- Analysis of 4 migration approaches (in-place, parallel, fresh, dual VPS)
- Comprehensive 7-phase migration plan with rollback procedures
- ops-base repository analysis (vultr-dev config, deployment patterns)
- VM testing options and local validation strategies
- Risk assessment and safety layers (build, VM, test mode, generations)

Planning session: 100 minutes, 0 commits, strategic analysis only
Next steps: Execute migration, VM test, Phase 4 docs, or pause
2025-10-14 21:02:05 -07:00
Dan 9ea22ac5b1 Add worklog documenting Phase 3 module extraction
Documents:
- ops-base structure verification
- Extraction of 8 modules + 2 configurations
- Automated sanitization and validation
- Flake integration with sops-nix and nixpkgs-unstable
- Build validation (nix flake check and build passed)
- 28 tasks complete (T012-T039)

Progress: 39/125 tasks (31.2%), 53.4% of MVP complete
2025-10-13 16:22:41 -07:00
Dan ab5aebb161 Phase 3: Extract and sanitize Matrix platform modules from ops-base
Extracted modules:
- Matrix homeserver (matrix-continuwuity.nix)
- mautrix bridges (slack, whatsapp, gmessages)
- Security modules (fail2ban, ssh-hardening)
- Development services module
- Matrix secrets module

All modules sanitized to remove personal information:
- Domains: example.com, matrix.example.org
- IPs: 10.0.0.x, 203.0.113.10
- Paths: /home/user, /path/to/ops-base
- Emails: admin@example.com

Configuration:
- Updated flake.nix with sops-nix and nixpkgs-unstable
- Updated hosts/ops-jrz1.nix to import all extracted modules
- Added example files (secrets, minimal config)
- Generated flake.lock

Generated with Claude Code - https://claude.com/claude-code
2025-10-13 14:51:14 -07:00
Dan 6a26ca12ca Add worklog documenting Phase 1 & 2 foundation setup
Documents:
- Directory structure and NixOS config skeleton creation
- Sanitization/validation scripts with 22 rules implementation
- Git hooks configuration (pre-commit, pre-push)
- 5 key architectural decisions with rationale
- 7 problems/solutions encountered
- Foundation review results (all checks passed)
- Phase progress: 11/125 tasks complete (8.8%)

29KB uncompressed worklog ready for semantic compression.
2025-10-13 13:42:40 -07:00