ops-jrz1/specs/001-extract-matrix-platform/spec.md
Dan 894e7241f1 Initialize ops-jrz1 repository with Matrix platform extraction foundation
- Add speckit workflow infrastructure (.claude, .specify)
- Create NixOS configuration skeleton (flake.nix, configuration.nix, hosts/ops-jrz1.nix)
- Add sanitization scripts with 22 rules for personal info removal
- Add validation scripts with gitleaks integration
- Configure git hooks (pre-commit, pre-push) for security validation
- Add project documentation (README, LICENSE)
- Add comprehensive .gitignore for Nix, secrets, staging

Phase 1 and Phase 2 complete. Foundation ready for module extraction from ops-base.
2025-10-13 13:37:17 -07:00

Feature Specification: Extract Matrix Platform Modules for ops-jrz1 Server

Feature Branch: 001-extract-matrix-platform
Created: 2025-10-11
Status: Draft
Input: User description: "Extract Matrix platform modules from ops-base to configure ops-jrz1 dev/test server in this repository"

User Scenarios & Testing (mandatory)

User Story 1 - Deploy Matrix Platform to ops-jrz1 Server (Priority: P1)

Deploy a Matrix homeserver with bridges to the ops-jrz1 dev/test server using extracted and sanitized modules from this repository.

Why this priority: This is the primary deliverable - a working Matrix platform on ops-jrz1 using production-tested patterns from ops-base.

Independent Test: Can be fully tested by: (1) customizing configuration for ops-jrz1, (2) following deployment guide, (3) deploying to server, (4) verifying Matrix responds and user registration works. Delivers a working Matrix homeserver.

Acceptance Scenarios:

  1. Given a developer with a VPS and domain name, When they follow the quick start guide (5 minutes), Then they have a clear understanding of next steps and requirements
  2. Given the developer has customized the example configuration, When they run the deployment command, Then the Matrix homeserver starts successfully and responds to API requests
  3. Given the Matrix homeserver is running, When the developer follows the bridge setup guide, Then they can add Slack/WhatsApp/Google Messages bridges to their deployment

User Story 2 - Extract and Sanitize Modules from ops-base (Priority: P1)

Extract Matrix modules from ops-base, sanitize personal information (domains, IPs, secrets), and place them in this repository for ops-jrz1 server configuration.

Why this priority: This is foundational - must extract and sanitize modules before we can configure and deploy the server.

Independent Test: Can be tested by: (1) running sanitization scripts on ops-base modules, (2) verifying gitleaks finds no secrets, (3) verifying no personal domains/IPs remain, (4) building configurations successfully. Delivers sanitized modules ready for deployment.

Acceptance Scenarios:

  1. Given ops-base contains production modules with personal config, When the sanitization process runs, Then all personal domains (clarun.xyz, talu.uno) are replaced with example.com variants
  2. Given modules contain personal IP addresses, When sanitization completes, Then all IPs are replaced with RFC 1918 private ranges or TEST-NET-3 documentation addresses (203.0.113.0/24, RFC 5737)
  3. Given the sanitized repository, When gitleaks secret scanning runs, Then no secrets, tokens, or sensitive data are detected
  4. Given sanitized files, When nix flake check runs, Then all configurations build successfully with no syntax errors
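
The first two scenarios above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's actual sanitization scripts: the `DOMAIN_MAP` targets (in particular `example.org` for talu.uno), the function names, and the choice to keep the last octet when mapping public addresses into TEST-NET-3 are all assumptions made for the example.

```python
import ipaddress
import re

# Hypothetical mapping: the spec only says "example.com variants".
DOMAIN_MAP = {
    "clarun.xyz": "example.com",
    "talu.uno": "example.org",
}

# Candidate dotted quads; validity is checked with the ipaddress module below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize_domains(text: str) -> str:
    """Replace each personal domain (case-insensitively) with its generic stand-in."""
    for personal, generic in DOMAIN_MAP.items():
        text = re.sub(re.escape(personal), generic, text, flags=re.IGNORECASE)
    return text


def sanitize_ips(text: str) -> str:
    """Map public IPv4 addresses into 203.0.113.0/24; leave non-public ones alone."""

    def replace(match: re.Match) -> str:
        candidate = match.group(0)
        try:
            addr = ipaddress.IPv4Address(candidate)
        except ValueError:
            return candidate  # not a valid address (e.g. 999.1.1.1)
        # is_private also covers the documentation ranges, so text that was
        # already sanitized is returned unchanged on a second pass.
        if addr.is_private or addr.is_loopback or addr.is_unspecified:
            return candidate
        return f"203.0.113.{int(addr) & 0xFF}"  # keep the last octet (assumption)

    return IPV4_RE.sub(replace, text)


def sanitize(text: str) -> str:
    return sanitize_ips(sanitize_domains(text))
```

With these assumed mappings, `sanitize("peer matrix.clarun.xyz 45.77.12.34")` yields `peer matrix.example.com 203.0.113.34`, while an address such as `10.0.0.5` is left as-is. One caveat of this approach is that four-part version strings (for example `1.2.3.4`) look like IPv4 addresses to the regex, so a real rule set would likely need an allowlist.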

User Story 3 - Enable Community Contributions (Priority: P3 - Deferred)

Status: Deferred until public publication. Add governance files (CONTRIBUTING.md, SECURITY.md) and community features when ready to share publicly.

Why deferred: Not needed for internal dev/test server. Will implement before public sharing.

Acceptance Scenarios:

  1. Given a developer wants to contribute, When they read CONTRIBUTING.md, Then they understand the process, testing requirements, and code standards
  2. Given a developer submits a pull request, When they push their branch, Then pre-push hooks validate (nix flake check, gitleaks, build tests) before changes reach remote
  3. Given a valid pull request passes CI, When maintainers review it, Then they can merge or request changes based on clear contribution guidelines

User Story 4 - Sync Improvements from ops-base (Priority: P2)

Create a workflow for syncing future improvements from ops-base into this repository as modules evolve.

Why this priority: Useful for ongoing maintenance as production environment improves, but not critical for initial deployment.

Acceptance Scenarios:

  1. Given improvements exist in ops-base modules, When the maintainer runs the sync workflow script, Then the script identifies changes and generates a sanitized diff for review
  2. Given sanitized changes are ready to apply, When the maintainer applies them to this repository, Then automated tests (nix flake check, gitleaks) verify that no secrets leaked and that builds pass
  3. Given synced changes are validated, When the maintainer commits and pushes, Then git hooks validate successfully and the repository remains secure
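
The "sanitized diff for review" in the first scenario could look roughly like the sketch below: sanitize the ops-base version first, then diff it against the copy in this repository so that personal values never appear in the review output. The replacement table, the `/home/user` target path, and the function names are hypothetical, not part of the actual sync script.

```python
import difflib

# Hypothetical replacement table; the real rule set lives in the sanitization scripts.
PERSONAL_TO_GENERIC = {
    "clarun.xyz": "example.com",
    "/home/dan": "/home/user",
}


def sanitize(text: str) -> str:
    """Apply plain find/replace rules to the upstream text."""
    for personal, generic in PERSONAL_TO_GENERIC.items():
        text = text.replace(personal, generic)
    return text


def sanitized_diff(current_text: str, upstream_text: str, name: str = "module.nix") -> str:
    """Return a unified diff from the local copy to the sanitized upstream copy."""
    sanitized = sanitize(upstream_text)
    diff = difflib.unified_diff(
        current_text.splitlines(keepends=True),
        sanitized.splitlines(keepends=True),
        fromfile=f"ops-jrz1/{name}",
        tofile=f"ops-base/{name} (sanitized)",
    )
    return "".join(diff)
```

An empty result means the upstream module differs from the local copy only in values that sanitization already rewrites, so there is nothing to sync. Sanitizing before diffing (rather than after) is a deliberate choice in this sketch: the diff text itself is then safe to paste into a review.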

User Story 5 - Developer Learns Matrix Bridge Patterns from Documentation (Priority: P3)

A developer wants to understand how to implement specific patterns (like Socket Mode authentication, gmessages-style config generation, or admin room setup) that are documented in the template's pattern guides.

Why this priority: Educational value adds community benefit but is not essential for basic deployment. Nice-to-have for initial launch, can be improved iteratively.

Independent Test: Can be tested by: (1) reading pattern documentation (e.g., Socket Mode setup), (2) following the steps in the guide, (3) verifying the pattern works as documented. Delivers knowledge transfer value.

Acceptance Scenarios:

  1. Given a developer implementing a Slack bridge, When they read docs/bridges/slack-setup.md, Then they understand Socket Mode vs webhooks and have a checklist of the OAuth scopes needed
  2. Given a developer encounters the gmessages config generation pattern, When they read docs/patterns/config-generation.md, Then they understand why runtime regeneration is used and can apply the pattern to other bridges
  3. Given a developer using Continuwuity, When they read docs/patterns/admin-room-setup.md, Then they can successfully register appservices via admin room commands

Edge Cases

  • What happens when a user tries to deploy without setting up sops-nix secrets management first?
  • How does the template handle users on different NixOS versions (stable vs unstable)?
  • What if a user deploys to ARM architecture instead of x86_64?
  • How does sanitization handle new personal references added after initial publication?
  • What happens if upstream bridge packages introduce breaking changes?
  • How does the sync workflow prevent accidental secret leakage during updates?

Requirements (mandatory)

Functional Requirements

Repository Structure & Sanitization:

  • FR-001: Repository MUST contain sanitized NixOS modules for Matrix homeserver (Continuwuity), mautrix bridges (Slack, WhatsApp, Google Messages), and security hardening
  • FR-002: All personal domains MUST be replaced with generic examples or ops-jrz1-specific values (clarun.xyz → example.com, talu.uno → ops-jrz1 domain)
  • FR-003: All personal IP addresses MUST be replaced with RFC 1918 private ranges (10.0.0.x) or ops-jrz1-specific addresses
  • FR-004: All secrets MUST be removed from extracted code and managed via sops-nix
  • FR-005: All personal paths MUST be sanitized (/home/dan → appropriate paths)
  • FR-006: Extracted modules MUST NOT contain encrypted secrets or personal debugging logs from ops-base
  • FR-007: Repository MUST include ops-jrz1 server configuration that builds successfully
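
FR-002 and FR-005 can be checked mechanically after extraction. The following is a small sketch of such a check, assuming the personal domains and the `/home/dan` path named in this spec are the only patterns of interest; the pattern table, function names, and the `.nix`/`.md` suffix filter are illustrative choices rather than the repository's validation scripts.

```python
import re
from pathlib import Path

# Patterns taken from the examples in this spec; a real rule set would be longer.
FORBIDDEN_PATTERNS = {
    "personal domain": re.compile(r"clarun\.xyz|talu\.uno", re.IGNORECASE),
    "personal path": re.compile(r"/home/dan\b"),
}


def find_leftovers(text: str, source: str = "<memory>") -> list:
    """Return (source, line number, label) for every line matching a forbidden pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in FORBIDDEN_PATTERNS.items():
            if pattern.search(line):
                findings.append((source, lineno, label))
    return findings


def scan_tree(root: str, suffixes: tuple = (".nix", ".md")) -> list:
    """Scan matching files under root and collect all findings."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="replace")
            findings.extend(find_leftovers(text, str(path)))
    return findings
```

A check like this complements gitleaks rather than replacing it: gitleaks looks for secrets, while this looks for personal identifiers that are not secret but still must not leave ops-base.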

Documentation:

  • FR-008: Repository MUST include README with architecture overview and deployment guide
  • FR-009: Repository SHOULD include deployment documentation for ops-jrz1 server
  • FR-010: Repository SHOULD include bridge setup notes extracted from worklogs (for reference)
  • FR-011: Repository SHOULD include pattern documentation extracted from worklogs (for reference)
  • FR-012: Repository MUST include secrets-management documentation for sops-nix
  • FR-013: Documentation MUST NOT contain personal infrastructure details from ops-base

Security & Validation:

  • FR-014: Repository MUST pass gitleaks secret scanning with zero findings
  • FR-015: Configuration MUST pass nix flake check validation
  • FR-016: Repository SHOULD include git pre-commit and pre-push hooks for validation
  • FR-017: Deferred - CI workflow not needed for internal dev server
  • FR-018: Deferred - SECURITY.md for future public sharing
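
As a rough sketch of how FR-014 and FR-015 might be wired together (for example from a pre-push hook), the snippet below runs both checks and reports which ones failed. It assumes `nix` and `gitleaks` are on `PATH` and uses the `detect --source` form from gitleaks v8; newer gitleaks releases may prefer different subcommands, so treat the exact command lines as assumptions. The `runner` parameter is a hypothetical hook added here only so the logic can be exercised without the tools installed.

```python
import subprocess

# Assumed command lines; adjust to the tool versions actually in use.
VALIDATION_COMMANDS = [
    ["nix", "flake", "check"],
    ["gitleaks", "detect", "--source", ".", "--no-banner"],
]


def run_validations(repo_dir: str = ".", runner=subprocess.run) -> list:
    """Run each validation command in repo_dir and return the ones that failed."""
    failures = []
    for cmd in VALIDATION_COMMANDS:
        try:
            result = runner(cmd, cwd=repo_dir)
        except FileNotFoundError:
            # Tool not installed: count it as a failure instead of crashing.
            failures.append(" ".join(cmd))
            continue
        if result.returncode != 0:
            failures.append(" ".join(cmd))
    return failures


if __name__ == "__main__":
    failed = run_validations()
    for command in failed:
        print(f"FAILED: {command}")
    raise SystemExit(1 if failed else 0)
```

Both tools signal problems through a non-zero exit status, which is why the sketch only inspects `returncode` and leaves their output on the terminal for the maintainer to read.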

Community & Governance:

  • FR-019: Deferred - CONTRIBUTING.md for future public sharing
  • FR-020: Deferred - Issue templates for future public sharing
  • FR-021: Repository SHOULD include LICENSE file for future sharing
  • FR-022: Deferred - Community features for future public sharing

Sync Workflow:

  • FR-023: Repository SHOULD document workflow for future syncs from ops-base
  • FR-024: Repository SHOULD include helper scripts for identifying and sanitizing changes
  • FR-025: Sync workflow SHOULD include validation steps (build check, gitleaks scan)
  • FR-026: Deferred - Quarterly reminders for ongoing maintenance

Testing & Deployment:

  • FR-027: Configuration MUST be tested by deploying to ops-jrz1 server
  • FR-028: Configuration targets x86_64-linux architecture
  • FR-029: All modules MUST build against pinned nixpkgs version for reproducibility

Key Entities (include if feature involves data)

  • Module: A reusable NixOS module file (e.g., matrix-continuwuity.nix, mautrix-slack.nix) containing service configuration and systemd units
  • Configuration: An example deployment configuration file (e.g., example-vps.nix) that imports modules and sets options
  • Secret: Sensitive data (tokens, passwords, keys) managed via sops-nix encryption, stored in secrets/secrets.yaml
  • Pattern Document: Extracted architectural knowledge from worklogs explaining proven implementation approaches
  • Bridge Setup Guide: Step-by-step documentation for configuring specific Matrix bridges with authentication and registration
  • Sanitization Rule: A find/replace or validation rule that ensures personal information is removed (domain, IP, path, secret patterns)
  • Sync Checkpoint: A record of what changes have been synced from ops-base to template at a specific point in time
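
Two of these entities, Sanitization Rule and Sync Checkpoint, map naturally onto small data records. The sketch below shows one possible shape; every field name, the JSON serialization, and the example values are assumptions for illustration and do not describe an existing file format in this repository.

```python
import json
import re
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class SanitizationRule:
    """A single find/replace rule (hypothetical shape)."""

    name: str
    pattern: str       # regular expression to search for
    replacement: str   # text substituted for each match

    def apply(self, text: str) -> str:
        return re.sub(self.pattern, self.replacement, text)


@dataclass
class SyncCheckpoint:
    """Record of what was synced from ops-base, and when (hypothetical shape)."""

    source_commit: str                  # ops-base commit the sync was taken from
    synced_on: str                      # ISO 8601 date string
    modules: list = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example usage with made-up values.
rule = SanitizationRule("home-path", r"/home/dan\b", "/home/user")
checkpoint = SyncCheckpoint(
    source_commit="<ops-base commit>",
    synced_on="2025-10-13",
    modules=["matrix-continuwuity.nix", "mautrix-slack.nix"],
)
```

Keeping rules as data rather than hard-coded substitutions would make it easier to review what a sync applied, and a checkpoint stored next to the modules gives the next sync a starting point for its diff.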

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: Configuration can be deployed to ops-jrz1 server and Matrix homeserver responds within 30 minutes
  • SC-002: Configuration builds successfully with nix flake check (zero errors)
  • SC-003: gitleaks secret scanning returns zero findings across entire repository
  • SC-004: All target modules extracted and building: matrix-continuwuity, mautrix-slack, mautrix-whatsapp, mautrix-gmessages, security/fail2ban, security/ssh-hardening (6+ modules total)
  • SC-005: Core documentation complete: README, deployment guide, secrets-management docs
  • SC-006: Deferred - Git hooks for future enforcement
  • SC-007: Deferred - Community metrics for future public sharing
  • SC-008: Deferred - Community engagement for future public sharing
  • SC-009: Zero security incidents (no secret leakage from ops-base extraction)
  • SC-010: Deferred - Sync workflow for ongoing maintenance

Assumptions (mandatory)

  • ops-jrz1 is a dev/test server (not production - no 99.9999% uptime requirement)
  • Maintainer has basic familiarity with NixOS and Nix flakes
  • ops-jrz1 server exists or will be provisioned for deployment
  • Domain name and DNS available for ops-jrz1 server
  • SSH access to ops-jrz1 server available
  • ops-base repository contains working Matrix modules to extract
  • ops-base repository may contain worklogs with useful pattern documentation
  • Maintainer has access to ops-base (private) repository
  • gitleaks tool available for secret scanning
  • NixOS provides necessary packages (matrix-continuwuity, mautrix bridges)

Out of Scope (include if needed to clarify boundaries)

  • Public template repository (deferred for future)
  • Community contribution features (deferred for future)
  • Rewriting git history in ops-base
  • Automated sync from ops-base (manual extraction for now)
  • Support for platforms other than NixOS
  • Graphical configuration UI
  • Hosting services or managed deployments
  • Non-Matrix services beyond extracted modules
  • Matrix clients or frontend components
  • Custom bridge development
  • Multi-host deployments (single-host for ops-jrz1)
  • Windows or macOS deployment targets
  • Production monitoring/observability (can be added later)

Dependencies (include if feature relies on external factors)

  • NixOS/nixpkgs: Requires packages for matrix-continuwuity, mautrix-slack, mautrix-whatsapp, mautrix-gmessages, forgejo
  • sops-nix: Required for secrets management, must be compatible with current NixOS version
  • Git hosting platform: Required for hosting repository (Forgejo for development, GitHub/tangl.sh for public publication)
  • gitleaks: Required for automated secret scanning in CI and locally
  • age encryption tool: Required for sops-nix secret encryption
  • ops-base repository: Source of modules and documentation to be extracted
  • Nix flake system: Template uses flakes for dependency management and configuration
  • Let's Encrypt / ACME: Assumed for TLS certificate generation (users must configure)

Open Questions (include if there are unresolved decisions)

None - all critical decisions were resolved in the RFC consensus validation process.