Successfully deployed ops-jrz1 Matrix platform to production VPS using extracted modules from ops-base. Validated deployment workflow following ops-base best practices: boot -> reboot -> verify.
Changes:
- Pin sops-nix to June 2024 version for nixpkgs 24.05 compatibility
- Configure sops secrets for Matrix registration token and ACME email
- Add encrypted secrets.yaml (safe to commit, encrypted with age)
- Document deployment process and lessons learned
All services verified running:
- Matrix homeserver (matrix-continuwuity): conduwuit 0.5.0-rc.8
- nginx: Proxying Matrix and Forgejo
- PostgreSQL 15.10: Database services
- Forgejo 7.0.12: Git platform
Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Deployment: Generation 31 - Matrix Platform Migration
Date: 2025-10-22
Status: ✅ SUCCESS
Generation: 31
Deployment Time: ~5 minutes (build + reboot)
Summary
Successfully deployed the ops-jrz1 Matrix platform using modules extracted from ops-base. This deployment established the baseline deployment pattern (boot, reboot, verify) and validated the sops-nix secrets management integration.
Deployment Method
Following ops-base best practices from worklog research:
# 1. Build and install to boot (safe, rollback-friendly)
rsync -avz --exclude '.git' --exclude 'result' /home/dan/proj/ops-jrz1/ root@45.77.205.49:/root/ops-jrz1/
ssh root@45.77.205.49 'cd /root/ops-jrz1 && nixos-rebuild boot --flake .#ops-jrz1'
# 2. Reboot to test
ssh root@45.77.205.49 'reboot'
# 3. Verify services after reboot (verified all running)
ssh root@45.77.205.49 'systemctl status matrix-continuwuity nginx postgresql forgejo'
# 4. Test API endpoints (run on the VPS, since the homeserver listens on 127.0.0.1:8008)
ssh root@45.77.205.49 'curl -s http://127.0.0.1:8008/_matrix/client/versions'
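The reboot in step 2 drops the SSH connection, so step 3 only makes sense once the host is reachable again. A minimal sketch of a polling helper for that gap, assuming a POSIX shell; `retry_until` is a hypothetical helper name, and the attempt count and delay in the usage comment are illustrative values, not taken from this deployment:

```shell
#!/bin/sh
# retry_until ATTEMPTS DELAY_SECONDS COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage (not run here): poll SSH for up to ~5 minutes, then print unit states.
# retry_until 30 10 ssh -o ConnectTimeout=5 root@45.77.205.49 true &&
#     ssh root@45.77.205.49 'systemctl is-active matrix-continuwuity nginx postgresql forgejo'
```

`systemctl is-active` prints one state per unit, which gives a quicker overview than the full `systemctl status` output.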
What Works ✅
Core Infrastructure
- NixOS Generation 31 booted successfully
- sops-nix decrypting secrets correctly using VPS SSH host key
- Age encryption working with key: age1vuxcwvdvzl2u7w6kudqvnnf45czrnhwv9aevjq9hyjjpa409jvkqhkz32q
Services Running
- Matrix Homeserver (matrix-continuwuity): ✅ Running, API responding
  - Version: conduwuit 0.5.0-rc.8
  - Listening on: 127.0.0.1:8008
  - Database: RocksDB schema version 18
  - Registration enabled, federation disabled
- nginx: ✅ Running
  - Proxying to Matrix homeserver
  - ACME certificates configured for clarun.xyz and git.clarun.xyz
  - Note: WebDAV errors expected (legacy feature, can be removed)
- PostgreSQL 15.10: ✅ Running
  - Serving Forgejo database
  - Minor client disconnect logs normal (connection pooling)
- Forgejo 7.0.12: ✅ Running
  - Git service operational
  - Connected to PostgreSQL
  - Available at git.clarun.xyz
Files Successfully Migrated
- .sops.yaml - Encrypted secrets configuration
- secrets/secrets.yaml - Encrypted secrets (committed to git, safe because encrypted)
- All Matrix platform modules from ops-base
Configuration Highlights
sops-nix Setup
Located in hosts/ops-jrz1.nix:26-38:
sops.defaultSopsFile = ../secrets/secrets.yaml;
sops.age.sshKeyPaths = [ "/etc/ssh/ssh_host_ed25519_key" ];
sops.secrets.matrix-registration-token = {
  owner = "continuwuity";
  group = "continuwuity";
  mode = "0440";
};
sops.secrets.acme-email = {
  owner = "root";
  mode = "0444";
};
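With the configuration above, sops-nix decrypts each secret during activation into its default location, `/run/secrets/<name>`. A minimal sketch for confirming on the host that a decrypted file carries the expected mode, assuming GNU `stat` (as shipped on NixOS); `check_mode` is a hypothetical helper, not part of sops-nix:

```shell
#!/bin/sh
# check_mode FILE EXPECTED_OCTAL_MODE
# Hypothetical helper: succeeds only if FILE exists and has exactly the expected mode.
check_mode() {
    file=$1
    expected=$2
    actual=$(stat -c '%a' "$file") || return 1
    if [ "$actual" = "$expected" ]; then
        echo "ok: $file has mode $actual"
    else
        echo "mismatch: $file has mode $actual, expected $expected" >&2
        return 1
    fi
}

# Example usage on the VPS (paths assume the sops-nix default /run/secrets/<name>):
# check_mode /run/secrets/matrix-registration-token 440
# check_mode /run/secrets/acme-email 444
```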
Version Compatibility
Pinned sops-nix to avoid Go version mismatch (flake.nix:9):
sops-nix = {
  url = "github:Mic92/sops-nix/c2ea1186c0cbfa4d06d406ae50f3e4b085ddc9b3"; # June 2024 version
  inputs.nixpkgs.follows = "nixpkgs";
};
Key Lessons from ops-base Research
Deployment Pattern (Recommended)
- nixos-rebuild boot - Install to bootloader, don't activate yet
- Reboot - Test new configuration
- Verify services - Ensure everything works
- nixos-rebuild switch (optional) - Make current profile permanent
Rollback: If anything fails, select the previous generation from the GRUB menu at boot, or run nixos-rebuild switch --rollback on the running system.
Secrets Management
- Encrypted secrets.yaml should be committed to git (it's encrypted with age, safe to track)
- SSH host key converts to age key automatically via ssh-to-age
- Multi-recipient encryption allows both VPS and admin workstation to decrypt
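Multi-recipient encryption is configured in `.sops.yaml`. A minimal sketch that renders such a file for two age recipients, assuming the layout shown in the sops-nix README; `render_sops_yaml` and the `admin`/`host` anchor names are hypothetical, and the actual `.sops.yaml` in this repository may differ:

```shell
#!/bin/sh
# render_sops_yaml ADMIN_AGE_KEY HOST_AGE_KEY
# Hypothetical helper: prints a .sops.yaml that encrypts secrets/*.yaml to both recipients.
render_sops_yaml() {
    cat <<EOF
keys:
  - &admin $1
  - &host $2
creation_rules:
  - path_regex: secrets/[^/]+\\.yaml\$
    key_groups:
      - age:
          - *admin
          - *host
EOF
}

# Example usage (not run here). The host recipient is derived from the SSH host key:
# host_key=$(ssh-keyscan -t ed25519 45.77.205.49 | ssh-to-age)
# render_sops_yaml "age1<admin-public-key>" "$host_key" > .sops.yaml
# sops updatekeys secrets/secrets.yaml
```

Running `sops updatekeys` after changing the recipient list re-encrypts the file's data key for the new set of recipients, so either machine can decrypt it.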
Common Pitfalls Avoided
From 46+ ops-base deployments:
- Exit code 11 ≠ always segfault - Often intentional exit_group(11) from config validation
- SystemCallFilter restrictions - Can block CPU affinity syscalls, needs allowances
- LoadCredential patterns - Use for Python scripts reading secrets from environment
- ACME debugging - Check journalctl -u acme-*, verify DNS, test staging first
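The first pitfall comes down to how an exit status is encoded. A process that calls `exit(11)` reports status 11, whereas shells such as bash and dash report a process killed by SIGSEGV (signal 11) as 128 + 11 = 139. A minimal sketch of telling the two apart from a shell-reported status; `classify_exit` is a hypothetical helper:

```shell
#!/bin/sh
# classify_exit STATUS
# Hypothetical helper: interprets a status as reported in $? by bash or dash.
classify_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    elif [ "$status" -eq 0 ]; then
        echo "success"
    else
        echo "exited with code $status"
    fi
}

classify_exit 11    # prints "exited with code 11" (deliberate exit, not a crash)
classify_exit 139   # prints "killed by signal 11" (SIGSEGV)
```

In `systemctl status` output the same distinction typically shows up as `code=exited, status=11` versus `code=dumped, signal=SEGV`. The heuristic is not airtight, since a program can also exit deliberately with a status above 128.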
Build Statistics
- 285 derivations built
- 378 paths fetched (786.52 MiB download, 3.39 GiB unpacked)
- Boot time: ~30 seconds
- Service startup: All services up within 2 minutes
Next Steps
- Monitor mautrix-slack (currently segfaulting, needs investigation)
- Establish regular deployment workflow (local build + remote deploy)
- Configure remaining Matrix bridges (WhatsApp, Google Messages)
- Set up monitoring/alerting
References
- ops-base worklogs: Reviewed 46+ deployment entries
- sops-nix docs: Age encryption with SSH host keys
- NixOS deployment patterns: boot -> reboot -> switch workflow