ops-jrz1/specs/001-extract-matrix-platform/spec.md
Dan 894e7241f1 Initialize ops-jrz1 repository with Matrix platform extraction foundation
- Add speckit workflow infrastructure (.claude, .specify)
- Create NixOS configuration skeleton (flake.nix, configuration.nix, hosts/ops-jrz1.nix)
- Add sanitization scripts with 22 rules for personal info removal
- Add validation scripts with gitleaks integration
- Configure git hooks (pre-commit, pre-push) for security validation
- Add project documentation (README, LICENSE)
- Add comprehensive .gitignore for Nix, secrets, staging

Phase 1 and Phase 2 complete. Foundation ready for module extraction from ops-base.
2025-10-13 13:37:17 -07:00

# Feature Specification: Extract Matrix Platform Modules for ops-jrz1 Server
**Feature Branch**: `001-extract-matrix-platform`
**Created**: 2025-10-11
**Status**: Draft
**Input**: User description: "Extract Matrix platform modules from ops-base to configure ops-jrz1 dev/test server in this repository"
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Deploy Matrix Platform to ops-jrz1 Server (Priority: P1)
Deploy Matrix homeserver with bridges to the ops-jrz1 dev/test server using extracted and sanitized modules from this repository.
**Why this priority**: This is the primary deliverable - a working Matrix platform on ops-jrz1 using production-tested patterns from ops-base.
**Independent Test**: Can be fully tested by: (1) customizing configuration for ops-jrz1, (2) following deployment guide, (3) deploying to server, (4) verifying Matrix responds and user registration works. Delivers a working Matrix homeserver.
**Acceptance Scenarios**:
1. **Given** a developer with a VPS and domain name, **When** they follow the quick start guide (5 minutes), **Then** they have a clear understanding of next steps and requirements
2. **Given** the developer has customized the example configuration, **When** they run the deployment command, **Then** the Matrix homeserver starts successfully and responds to API requests
3. **Given** the Matrix homeserver is running, **When** the developer follows the bridge setup guide, **Then** they can add Slack/WhatsApp/Google Messages bridges to their deployment
---
### User Story 2 - Extract and Sanitize Modules from ops-base (Priority: P1)
Extract Matrix modules from ops-base, sanitize personal information (domains, IPs, secrets), and place them in this repository for ops-jrz1 server configuration.
**Why this priority**: This is foundational - must extract and sanitize modules before we can configure and deploy the server.
**Independent Test**: Can be tested by: (1) running sanitization scripts on ops-base modules, (2) verifying gitleaks finds no secrets, (3) verifying no personal domains/IPs remain, (4) building configurations successfully. Delivers sanitized modules ready for deployment.
**Acceptance Scenarios**:
1. **Given** ops-base contains production modules with personal config, **When** sanitization process runs, **Then** all personal domains (clarun.xyz, talu.uno) are replaced with example.com variants or ops-jrz1-specific values (per FR-002)
2. **Given** modules contain personal IP addresses, **When** sanitization completes, **Then** all IPs are replaced with RFC 1918 private ranges or TEST-NET-3 (203.0.113.0/24) documentation addresses
3. **Given** the sanitized repository, **When** gitleaks secret scanning runs, **Then** no secrets, tokens, or sensitive data are detected
4. **Given** sanitized files, **When** `nix flake check` runs, **Then** all configurations build successfully with no syntax errors
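
As an illustration of scenario 1, a domain rule might look roughly like the sketch below. It is not the repository's actual sanitization script: the `DOMAIN_REPLACEMENTS` mapping and the `sanitize_domains` helper are hypothetical names, and `example.org` as the stand-in for talu.uno is an assumption. The pattern only matches at a hostname label boundary, so subdomains such as `matrix.clarun.xyz` are rewritten while unrelated names that merely contain the string are left alone.

```python
import re

# Hypothetical mapping: the spec only requires "example.com variants".
DOMAIN_REPLACEMENTS = {
    "clarun.xyz": "example.com",
    "talu.uno": "example.org",
}


def sanitize_domains(text: str, replacements: dict[str, str] = DOMAIN_REPLACEMENTS) -> str:
    """Replace each personal domain with its placeholder, keeping subdomain prefixes."""
    for personal, placeholder in replacements.items():
        # Require a non-hostname character (or string edge) on both sides of the
        # domain so that e.g. "notclarun.xyz" is not rewritten by accident.
        pattern = re.compile(
            r"(?<![A-Za-z0-9-])" + re.escape(personal) + r"(?![A-Za-z0-9-])",
            re.IGNORECASE,
        )
        text = pattern.sub(placeholder, text)
    return text
```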
---
### User Story 3 - Enable Community Contributions (Priority: P3 - Deferred)
**Status**: Deferred until the repository is shared publicly. At that point, add governance files (CONTRIBUTING.md, SECURITY.md) and community features.
**Why deferred**: Not needed for internal dev/test server. Will implement before public sharing.
**Acceptance Scenarios**:
1. **Given** a developer wants to contribute, **When** they read CONTRIBUTING.md, **Then** they understand the process, testing requirements, and code standards
2. **Given** a developer submits a pull request, **When** they push their branch, **Then** pre-push hooks validate (`nix flake check`, gitleaks, build tests) before changes reach remote
3. **Given** a valid pull request passes CI, **When** maintainers review it, **Then** they can merge or request changes based on clear contribution guidelines
---
### User Story 4 - Sync Improvements from ops-base (Priority: P2)
Create a workflow for syncing future improvements from ops-base into this repository as the modules evolve.
**Why this priority**: Useful for ongoing maintenance as production environment improves, but not critical for initial deployment.
**Acceptance Scenarios**:
1. **Given** improvements exist in ops-base modules, **When** maintainer runs sync workflow script, **Then** the script identifies changes and generates sanitized diff for review
2. **Given** sanitized changes ready to apply, **When** maintainer applies them to template, **Then** automated tests (`nix flake check`, gitleaks) verify no secrets leaked and builds pass
3. **Given** synced changes are validated, **When** maintainer commits and pushes, **Then** git hooks validate successfully and template remains secure
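
The "sanitized diff for review" in scenario 1 could be produced along the lines of the sketch below, assuming the sync script sanitizes the ops-base text before comparing it with the local copy so that personal values never appear in the diff itself. The function name `sanitized_diff`, its parameters, and the file labels are hypothetical; the `sanitize` argument stands for whatever sanitization function the repository provides. An empty string means there is nothing to sync.

```python
import difflib
from typing import Callable


def sanitized_diff(
    current: str,
    upstream: str,
    sanitize: Callable[[str], str],
    name: str = "module.nix",
) -> str:
    """Sanitize the upstream text first, then diff it against the local copy."""
    cleaned = sanitize(upstream)
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        cleaned.splitlines(keepends=True),
        fromfile=f"ops-jrz1/{name}",
        tofile=f"ops-base/{name} (sanitized)",
    )
    # unified_diff yields nothing when both sides are identical.
    return "".join(diff)
```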
---
### User Story 5 - Developer Learns Matrix Bridge Patterns from Documentation (Priority: P3)
A developer wants to understand how to implement specific patterns (like Socket Mode authentication, gmessages-style config generation, or admin room setup) that are documented in the template's pattern guides.
**Why this priority**: Educational value adds community benefit but is not essential for basic deployment. Nice-to-have for initial launch, can be improved iteratively.
**Independent Test**: Can be tested by: (1) reading pattern documentation (e.g., Socket Mode setup), (2) following the steps in the guide, (3) verifying the pattern works as documented. Delivers knowledge transfer value.
**Acceptance Scenarios**:
1. **Given** a developer implementing Slack bridge, **When** they read docs/bridges/slack-setup.md, **Then** they understand Socket Mode vs webhooks and have a checklist of OAuth scopes needed
2. **Given** a developer encounters gmessages config generation pattern, **When** they read docs/patterns/config-generation.md, **Then** they understand why runtime regeneration is used and can apply the pattern to other bridges
3. **Given** a developer using Continuwuity, **When** they read docs/patterns/admin-room-setup.md, **Then** they can successfully register appservices via admin room commands
---
### Edge Cases
- What happens when a user tries to deploy without setting up sops-nix secrets management first?
- How does the template handle users on different NixOS versions (stable vs unstable)?
- What if a user deploys to ARM architecture instead of x86_64?
- How does sanitization handle new personal references added after initial publication?
- What happens if upstream bridge packages introduce breaking changes?
- How does the sync workflow prevent accidental secret leakage during updates?
## Requirements *(mandatory)*
### Functional Requirements
**Repository Structure & Sanitization:**
- **FR-001**: Repository MUST contain sanitized NixOS modules for Matrix homeserver (Continuwuity), mautrix bridges (Slack, WhatsApp, Google Messages), and security hardening
- **FR-002**: All personal domains MUST be replaced with generic examples or ops-jrz1-specific values (clarun.xyz → example.com, talu.uno → ops-jrz1 domain)
- **FR-003**: All personal IP addresses MUST be replaced with RFC 1918 private ranges (10.0.0.x) or ops-jrz1-specific addresses
- **FR-004**: All secrets MUST be removed from extracted code and managed via sops-nix
- **FR-005**: All personal paths MUST be sanitized (/home/dan → appropriate paths)
- **FR-006**: Extracted modules MUST NOT contain encrypted secrets or personal debugging logs from ops-base
- **FR-007**: Repository MUST include ops-jrz1 server configuration that builds successfully
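
A check for FR-003 might be sketched as follows, using Python's standard `ipaddress` module to flag IPv4 literals that fall outside the RFC 1918 and TEST-NET-3 ranges. The helper name `unsanitized_ips` is hypothetical, and treating loopback and `0.0.0.0` bind addresses as harmless is an assumption rather than something the requirements state. Dotted version strings that happen to parse as addresses would also be flagged, so findings still need a manual look.

```python
import ipaddress
import re

# Ranges allowed after sanitization: RFC 1918 plus TEST-NET-3 (RFC 5737).
ALLOWED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "203.0.113.0/24")
]

IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def unsanitized_ips(text: str) -> list[str]:
    """Return IPv4 literals in text that fall outside the allowed ranges."""
    flagged = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not a valid address (e.g. 999.1.1.1), so ignore it
        # Assumption: loopback and 0.0.0.0 bind addresses are not personal data.
        if addr.is_loopback or addr.is_unspecified:
            continue
        if not any(addr in network for network in ALLOWED_NETWORKS):
            flagged.append(candidate)
    return flagged
```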
**Documentation:**
- **FR-008**: Repository MUST include README with architecture overview and deployment guide
- **FR-009**: Repository SHOULD include deployment documentation for ops-jrz1 server
- **FR-010**: Repository SHOULD include bridge setup notes extracted from worklogs (for reference)
- **FR-011**: Repository SHOULD include pattern documentation extracted from worklogs (for reference)
- **FR-012**: Repository MUST include secrets-management documentation for sops-nix
- **FR-013**: Documentation MUST NOT contain personal infrastructure details from ops-base
**Security & Validation:**
- **FR-014**: Repository MUST pass gitleaks secret scanning with zero findings
- **FR-015**: Configuration MUST pass `nix flake check` validation
- **FR-016**: Repository SHOULD include git pre-commit and pre-push hooks for validation
- **FR-017**: Deferred - CI workflow not needed for internal dev server
- **FR-018**: Deferred - SECURITY.md for future public sharing
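
FR-014 through FR-016 could be wired together by a small runner like the sketch below, which a pre-commit or pre-push hook might call. The `CHECKS` table and the `run_checks` helper are hypothetical, and the gitleaks flags shown (`detect --source . --redact`) are an assumption about the installed gitleaks version, so adjust them to match the local tooling. The command runner is injectable so the logic can be exercised without either tool installed.

```python
import subprocess
from typing import Callable, Sequence

# Commands assumed for illustration; a non-zero exit code counts as a failure.
CHECKS: dict[str, list[str]] = {
    "nix flake check": ["nix", "flake", "check"],
    "gitleaks": ["gitleaks", "detect", "--source", ".", "--redact"],
}


def _run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(run: Callable[[Sequence[str]], int] = _run) -> dict[str, bool]:
    """Run every check and report pass/fail by name."""
    return {name: run(cmd) == 0 for name, cmd in CHECKS.items()}


def all_passed(results: dict[str, bool]) -> bool:
    """True only when every check succeeded."""
    return all(results.values())
```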
**Community & Governance:**
- **FR-019**: Deferred - CONTRIBUTING.md for future public sharing
- **FR-020**: Deferred - Issue templates for future public sharing
- **FR-021**: Repository SHOULD include LICENSE file for future sharing
- **FR-022**: Deferred - Community features for future public sharing
**Sync Workflow:**
- **FR-023**: Repository SHOULD document workflow for future syncs from ops-base
- **FR-024**: Repository SHOULD include helper scripts for identifying and sanitizing changes
- **FR-025**: Sync workflow SHOULD include validation steps (build check, gitleaks scan)
- **FR-026**: Deferred - Quarterly reminders for ongoing maintenance
**Testing & Deployment:**
- **FR-027**: Configuration MUST be tested by deploying to ops-jrz1 server
- **FR-028**: Configuration MUST target the x86_64-linux architecture
- **FR-029**: All modules MUST build against pinned nixpkgs version for reproducibility
### Key Entities *(include if feature involves data)*
- **Module**: A reusable NixOS module file (e.g., matrix-continuwuity.nix, mautrix-slack.nix) containing service configuration and systemd units
- **Configuration**: An example deployment configuration file (e.g., example-vps.nix) that imports modules and sets options
- **Secret**: Sensitive data (tokens, passwords, keys) managed via sops-nix encryption, stored in secrets/secrets.yaml
- **Pattern Document**: Extracted architectural knowledge from worklogs explaining proven implementation approaches
- **Bridge Setup Guide**: Step-by-step documentation for configuring specific Matrix bridges with authentication and registration
- **Sanitization Rule**: A find/replace or validation rule that ensures personal information is removed (domain, IP, path, secret patterns)
- **Sync Checkpoint**: A record of what changes have been synced from ops-base to template at a specific point in time
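
To make two of these entities concrete, the sketch below models a Sanitization Rule and a Sync Checkpoint as plain records. All field names (`kind`, `pattern`, `source_commit`, `synced_on`, `modules`) are hypothetical, since the spec does not define a schema, and the literal find/replace in `apply` is a simplification of whatever the real rules do.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class SanitizationRule:
    name: str
    kind: str         # one of "domain", "ip", "path", "secret"
    pattern: str      # literal text to find
    replacement: str  # text that replaces every occurrence

    def apply(self, text: str) -> str:
        """Apply the rule as a plain find/replace."""
        return text.replace(self.pattern, self.replacement)


@dataclass
class SyncCheckpoint:
    source_commit: str  # ops-base commit the sync was taken from
    synced_on: str      # ISO 8601 date of the sync
    modules: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the checkpoint so it can be stored alongside the repository."""
        return json.dumps(asdict(self), sort_keys=True)
```

A rule such as `SanitizationRule("home-dir", "path", "/home/dan", "/home/user")` would then rewrite personal paths, where `/home/user` is only an illustrative replacement.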
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: Configuration can be deployed to the ops-jrz1 server, with the Matrix homeserver responding to API requests, within 30 minutes of starting the deployment
- **SC-002**: Configuration builds successfully with `nix flake check` (zero errors)
- **SC-003**: gitleaks secret scanning returns zero findings across entire repository
- **SC-004**: All target modules extracted and building: matrix-continuwuity, mautrix-slack, mautrix-whatsapp, mautrix-gmessages, security/fail2ban, security/ssh-hardening (6+ modules total)
- **SC-005**: Core documentation complete: README, deployment guide, secrets-management docs
- **SC-006**: Deferred - Git hooks for future enforcement
- **SC-007**: Deferred - Community metrics for future public sharing
- **SC-008**: Deferred - Community engagement for future public sharing
- **SC-009**: Zero security incidents (no secret leakage from ops-base extraction)
- **SC-010**: Deferred - Sync workflow for ongoing maintenance
## Assumptions *(mandatory)*
- ops-jrz1 is a dev/test server (not production - no 99.9999% uptime requirement)
- Maintainer has basic familiarity with NixOS and Nix flakes
- ops-jrz1 server exists or will be provisioned for deployment
- Domain name and DNS available for ops-jrz1 server
- SSH access to ops-jrz1 server available
- ops-base repository contains working Matrix modules to extract
- ops-base repository may contain worklogs with useful pattern documentation
- Maintainer has access to ops-base (private) repository
- gitleaks tool available for secret scanning
- NixOS provides necessary packages (matrix-continuwuity, mautrix bridges)
## Out of Scope *(include if needed to clarify boundaries)*
- Public template repository (deferred for future)
- Community contribution features (deferred for future)
- Rewriting git history in ops-base
- Automated sync from ops-base (manual extraction for now)
- Support for platforms other than NixOS
- Graphical configuration UI
- Hosting services or managed deployments
- Non-Matrix services beyond extracted modules
- Matrix clients or frontend components
- Custom bridge development
- Multi-host deployments (single-host for ops-jrz1)
- Windows or macOS deployment targets
- Production monitoring/observability (can be added later)
## Dependencies *(include if feature relies on external factors)*
- **NixOS/nixpkgs**: Requires packages for matrix-continuwuity, mautrix-slack, mautrix-whatsapp, mautrix-gmessages, forgejo
- **sops-nix**: Required for secrets management, must be compatible with current NixOS version
- **Git hosting platform**: Required for hosting repository (Forgejo for development, GitHub/tangl.sh for public publication)
- **gitleaks**: Required for automated secret scanning locally (git hooks and validation scripts) and, once the deferred CI workflow exists (FR-017), in CI
- **age encryption tool**: Required for sops-nix secret encryption
- **ops-base repository**: Source of modules and documentation to be extracted
- **Nix flake system**: Template uses flakes for dependency management and configuration
- **Let's Encrypt / ACME**: Assumed for TLS certificate generation (users must configure)
## Open Questions *(include if there are unresolved decisions)*
None - all critical decisions were resolved in the RFC consensus validation process.