Successfully deployed ops-jrz1 Matrix platform to production VPS using extracted modules from ops-base. Validated deployment workflow following ops-base best practices: boot -> reboot -> verify.

Changes:
- Pin sops-nix to June 2024 version for nixpkgs 24.05 compatibility
- Configure sops secrets for Matrix registration token and ACME email
- Add encrypted secrets.yaml (safe to commit, encrypted with age)
- Document deployment process and lessons learned

All services verified running:
- Matrix homeserver (matrix-continuwuity): conduwuit 0.5.0-rc.8
- nginx: Proxying Matrix and Forgejo
- PostgreSQL 15.10: Database services
- Forgejo 7.0.12: Git platform

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

# Deployment: Generation 31 - Matrix Platform Migration
**Date:** 2025-10-22
**Status:** ✅ SUCCESS
**Generation:** 31
**Deployment Time:** ~5 minutes (build + reboot)

## Summary
Successfully deployed the ops-jrz1 Matrix platform using modules extracted from ops-base. This deployment established the baseline deployment pattern and validated the sops-nix secrets management integration.

## Deployment Method
Following ops-base best practices from worklog research:

```bash
# 1. Build and install to boot (safe, rollback-friendly)
rsync -avz --exclude '.git' --exclude 'result' /home/dan/proj/ops-jrz1/ root@45.77.205.49:/root/ops-jrz1/
ssh root@45.77.205.49 'cd /root/ops-jrz1 && nixos-rebuild boot --flake .#ops-jrz1'

# 2. Reboot to test
ssh root@45.77.205.49 'reboot'

# 3. Verify services after reboot (verified all running)
ssh root@45.77.205.49 'systemctl status matrix-continuwuity nginx postgresql forgejo'

# 4. Test API endpoints (the homeserver listens on 127.0.0.1:8008, so query it from the VPS)
ssh root@45.77.205.49 'curl -s http://127.0.0.1:8008/_matrix/client/versions'
```

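Between steps 2 and 3 the VPS is unreachable for a short while, so running the verification straight after `reboot` can fail. A minimal sketch of a retry helper, assuming a POSIX shell on the workstation; `retry_until` and its arguments are hypothetical names, not part of the ops-jrz1 repository:

```bash
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs out.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Only sleep when another attempt is still coming.
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `retry_until 30 5 ssh -o ConnectTimeout=5 root@45.77.205.49 true` would wait up to roughly two and a half minutes for SSH to come back before step 3 runs. The attempt count and delay are illustrative values, not measured ones.
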
## What Works ✅

### Core Infrastructure
- **NixOS Generation 31** booted successfully
- **sops-nix** decrypting secrets correctly using VPS SSH host key
- **Age encryption** working with key: `age1vuxcwvdvzl2u7w6kudqvnnf45czrnhwv9aevjq9hyjjpa409jvkqhkz32q`

### Services Running
- **Matrix Homeserver (matrix-continuwuity):** ✅ Running, API responding
  - Version: conduwuit 0.5.0-rc.8
  - Listening on: 127.0.0.1:8008
  - Database: RocksDB schema version 18
  - Registration enabled, federation disabled

- **nginx:** ✅ Running
  - Proxying to Matrix homeserver
  - ACME certificates configured for clarun.xyz and git.clarun.xyz
  - Note: WebDAV errors expected (legacy feature, can be removed)

- **PostgreSQL 15.10:** ✅ Running
  - Serving Forgejo database
  - Minor client disconnect logs normal (connection pooling)

- **Forgejo 7.0.12:** ✅ Running
  - Git service operational
  - Connected to PostgreSQL
  - Available at git.clarun.xyz

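The per-service check above can be reduced to a single pass/fail result. A minimal sketch, meant to run on the VPS itself; `inactive_units` is a hypothetical helper name, and it relies only on `systemctl is-active --quiet`:

```bash
# inactive_units UNIT...
# Prints each unit that `systemctl is-active` does not report as active.
# Returns 0 when all units are active, 1 otherwise.
inactive_units() {
  status=0
  for unit in "$@"; do
    if ! systemctl is-active --quiet "$unit"; then
      echo "$unit"
      status=1
    fi
  done
  return "$status"
}
```

Calling `inactive_units matrix-continuwuity nginx postgresql forgejo` prints nothing and returns 0 when all four units from this deployment are up. That makes it easier to script than reading `systemctl status` output by eye.
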
### Files Successfully Migrated
- `.sops.yaml` - sops configuration (plaintext; defines which keys can decrypt which files)
- `secrets/secrets.yaml` - Encrypted secrets (committed to git, safe because encrypted)
- All Matrix platform modules from ops-base

## Configuration Highlights

### sops-nix Setup
Located in `hosts/ops-jrz1.nix:26-38`:
```nix
sops.defaultSopsFile = ../secrets/secrets.yaml;
sops.age.sshKeyPaths = [ "/etc/ssh/ssh_host_ed25519_key" ];

sops.secrets.matrix-registration-token = {
  owner = "continuwuity";
  group = "continuwuity";
  mode = "0440";
};

sops.secrets.acme-email = {
  owner = "root";
  mode = "0444";
};
```

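After activation, the declared owner, group, and mode can be checked against the decrypted file. The sketch below assumes the sops-nix default location `/run/secrets/<name>` (no custom `path` is set in the configuration shown) and GNU coreutils `stat`; `secret_matches` is a hypothetical helper name:

```bash
# secret_matches PATH MODE OWNER GROUP
# Succeeds when PATH exists with exactly the given octal mode, owner and group.
# -L follows symlinks so the decrypted file itself is inspected.
secret_matches() {
  actual=$(stat -L -c '%a %U %G' "$1" 2>/dev/null) || return 1
  [ "$actual" = "$2 $3 $4" ]
}
```

On the VPS, `secret_matches /run/secrets/matrix-registration-token 440 continuwuity continuwuity` should succeed if the settings above took effect. Note that `stat` prints the mode without the leading zero.
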
### Version Compatibility
Pinned sops-nix to a June 2024 revision to avoid a Go version mismatch with nixpkgs 24.05 (`flake.nix:9`):
```nix
sops-nix = {
  url = "github:Mic92/sops-nix/c2ea1186c0cbfa4d06d406ae50f3e4b085ddc9b3";  # June 2024 version
  inputs.nixpkgs.follows = "nixpkgs";
};
```

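A pin only helps if the lock file still carries it. As a rough guard, the sketch below greps `flake.lock` for the revision; it assumes the pretty-printed `"rev": "..."` layout that Nix writes, and it matches the revision on any input, not specifically `sops-nix`. `lock_pins_rev` is a hypothetical helper name:

```bash
# lock_pins_rev LOCKFILE REV
# Succeeds when LOCKFILE contains a `"rev": "REV"` entry.
lock_pins_rev() {
  grep -qF "\"rev\": \"$2\"" "$1"
}
```

From the repository root this would be called as `lock_pins_rev flake.lock c2ea1186c0cbfa4d06d406ae50f3e4b085ddc9b3`, for instance before each deployment.
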
## Key Lessons from ops-base Research

### Deployment Pattern (Recommended)
1. **`nixos-rebuild boot`** - Build and install as the default boot entry without activating it on the running system
2. **Reboot** - Boot into the new configuration
3. **Verify services** - Confirm everything works
4. **`nixos-rebuild switch`** (optional) - Re-apply the configuration to the running system; `boot` has already made this generation the default

**Rollback:** If anything fails, select the previous generation from the GRUB menu or run `nixos-rebuild switch --rollback`.

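Before rolling back it helps to know which generation the rollback would land on. A minimal sketch that parses `nix-env --list-generations` output, assuming its usual layout of one generation per line with the active one marked `(current)`; `previous_generation` is a hypothetical helper name:

```bash
# previous_generation
# Reads `nix-env --list-generations` output on stdin and prints the number of
# the generation listed just before the one marked "(current)".
# Prints nothing if the current generation is the first one listed.
previous_generation() {
  awk '
    /\(current\)/ { if (prev != "") print prev; exit }
    { prev = $1 }
  '
}
```

On the VPS this would be used as `nix-env --list-generations --profile /nix/var/nix/profiles/system | previous_generation`.
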
### Secrets Management
- Encrypted `secrets.yaml` **should be committed to git** (it is encrypted with age, so it is safe to track)
- sops-nix derives the age key from the SSH host key automatically; `ssh-to-age` produces the matching public key to list as a recipient
- Multi-recipient encryption allows both the VPS and the admin workstation to decrypt

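The multi-recipient setup lives in `.sops.yaml`. The sketch below renders one in the layout the sops-nix README uses (`keys` anchors plus a `creation_rules` entry). The actual `.sops.yaml` in this repository is not shown here, so the anchor names, the `path_regex`, and the helper name `render_sops_config` are all assumptions:

```bash
# render_sops_config VPS_AGE_KEY ADMIN_AGE_KEY
# Prints a .sops.yaml that encrypts secrets/*.yaml for both recipients.
render_sops_config() {
  cat <<EOF
keys:
  - &vps $1
  - &admin $2
creation_rules:
  - path_regex: secrets/[^/]+\.yaml\$
    key_groups:
      - age:
          - *vps
          - *admin
EOF
}
```

The VPS recipient can be obtained on the host with `ssh-to-age < /etc/ssh/ssh_host_ed25519_key.pub`; the admin recipient is the public key of the workstation's own age identity. Either private key can then decrypt `secrets/secrets.yaml`.
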
### Common Pitfalls Avoided
From 46+ ops-base deployments:

1. **Exit code 11 is not always a segfault** - Often an intentional `exit_group(11)` from config validation
2. **SystemCallFilter restrictions** - Can block CPU affinity syscalls; the unit needs explicit allowances
3. **LoadCredential patterns** - Use for Python scripts that read secrets from the environment
4. **ACME debugging** - Check `journalctl -u 'acme-*'`, verify DNS, and test against staging first

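The first pitfall comes down to telling an exit code from a signal number. In a shell, a process killed by SIGSEGV (signal 11) reports status 139 (128 + 11), while a deliberate `exit_group(11)` reports plain 11. A minimal sketch of that convention; `describe_status` is a hypothetical helper name:

```bash
# describe_status STATUS
# Interprets a shell exit status ($?). By shell convention, values above 128
# mean the process was killed by signal (STATUS - 128); anything else is
# treated as a plain exit code.
describe_status() {
  if [ "$1" -gt 128 ]; then
    echo "killed by signal $(( $1 - 128 ))"
  else
    echo "exited with code $1"
  fi
}
```

`describe_status 11` prints `exited with code 11`, and `describe_status 139` prints `killed by signal 11`. In the systemd journal the same distinction should show up as `code=exited` versus `code=killed` or `code=dumped`, so the `code=` field is worth reading before assuming a crash.
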
## Build Statistics
- **285 derivations built**
- **378 paths fetched** (786.52 MiB download, 3.39 GiB unpacked)
- **Boot time:** ~30 seconds
- **Service startup:** All services up within 2 minutes

## Next Steps
- [ ] Monitor mautrix-slack (currently segfaulting, needs investigation)
- [ ] Establish regular deployment workflow (local build + remote deploy)
- [ ] Configure remaining Matrix bridges (WhatsApp, Google Messages)
- [ ] Set up monitoring/alerting

## References
- ops-base worklogs: Reviewed 46+ deployment entries
- sops-nix docs: Age encryption with SSH host keys
- NixOS deployment patterns: boot -> reboot -> switch workflow