Multi-lens review skill for operational infrastructure (Nix, shell, Docker, CI/CD). Modeled on code-review with linter-first hybrid architecture.

Phase 1 lenses (core safety):
- secrets: credential exposure, Nix store, Docker layers, CI masking
- shell-safety: shellcheck-backed, temp files, guard snippets
- blast-radius: targeting/scoping, dry-run, rollback
- privilege: least-privilege, containers, systemd sandboxing

Design reviewed via orch consensus (sonar, flash-or, gemini, gpt). Lenses deploy to ~/.config/lenses/ops/ via home-manager.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ops-review Skill Design, Orch Consensus Planning, and Skeleton Implementation
- Session Summary
- Accomplishments
- Key Decisions
- Problems & Solutions
- Technical Details
- Process and Workflow
- Learning and Insights
- Context for Future Work
- Raw Notes
- Session Metrics
Session Summary
Date: 2026-01-01
Focus Area: Designing and implementing the ops-review skill for infrastructure code analysis
Accomplishments
- Explored dotfiles and prox-setup repos to understand actual ops artifact landscape
- Designed ops-review skill with 10 lenses across 3 phases
- Ran orch consensus (sonar, flash-or, gemini, gpt) on initial plan
- Incorporated consensus feedback: linter-first hybrid architecture, crisp lens boundaries
- Created comprehensive plan.md in specs/ops-review/
- Created bd epic (skills-9cu) with 14 child tasks, proper dependency graph
- Built skill skeleton: SKILL.md, README.md, lenses/README.md
- Drafted secrets.md lens with orch consensus review
- Incorporated Nix store exposure, Docker layer persistence, CI masking feedback
- Filed follow-up issue in dotfiles for gitleaks availability (dotfiles-x2m)
- Not yet done: remaining Phase 1 lenses (shell-safety, blast-radius, privilege)
Key Decisions
Decision 1: Linter-first hybrid architecture
- Context: How should ops-review analyze infrastructure code?
- Options considered:
  - Pure LLM analysis - flexible but prone to syntax hallucinations
  - Pure linter - deterministic but misses semantic issues
  - Hybrid: linters first, LLM interprets - best of both
- Rationale: All 4 consensus models agreed LLMs hallucinate syntax but excel at understanding intent. Static tools catch syntax, LLM finds logic bugs.
- Impact: Each lens integrates with specific tools (shellcheck, statix, gitleaks)
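As a rough illustration of the linter-first flow, the sketch below parses linter output and places it ahead of the source in the prompt, so the LLM interprets findings instead of re-deriving syntax problems. The field names follow shellcheck's `-f json` output; `Finding`, `parse_shellcheck`, and `build_prompt` are hypothetical names for this example, not part of the skill.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    tool: str
    file: str
    line: int
    level: str
    message: str


def parse_shellcheck(raw_json: str) -> list[Finding]:
    """Parse the JSON array emitted by `shellcheck -f json` into Finding records."""
    return [
        Finding("shellcheck", item["file"], item["line"], item["level"],
                f"SC{item['code']}: {item['message']}")
        for item in json.loads(raw_json)
    ]


def build_prompt(lens: str, source: str, findings: list[Finding]) -> str:
    """Hypothetical prompt layout: deterministic findings first, source second."""
    lines = [f"Lens: {lens}", "Linter findings (treat as ground truth):"]
    if findings:
        lines += [f"- {f.file}:{f.line} [{f.level}] {f.message}" for f in findings]
    else:
        lines.append("- none")
    lines += ["Source under review:", source]
    return "\n".join(lines)


# Sample linter output, hand-written in shellcheck's JSON shape.
sample = json.dumps([{
    "file": "deploy.sh", "line": 3, "endLine": 3, "column": 8, "endColumn": 15,
    "level": "info", "code": 2086,
    "message": "Double quote to prevent globbing and word splitting.",
}])
findings = parse_shellcheck(sample)
prompt = build_prompt("shell-safety", "rm -rf $TARGET/*", findings)
print(prompt)
```

In a real run the JSON would come from invoking the linter; the sample string only keeps the sketch self-contained.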
Decision 2: 10 lenses across 3 phases
- Context: How many lenses and how to prioritize?
- Initial proposal: 8 lenses
- Consensus feedback: Add privilege (least-privilege) and supply-chain (pinning)
- Phase 1 (quick mode): secrets, shell-safety, blast-radius, privilege
- Phase 2: idempotency, supply-chain, observability
- Phase 3: nix-hygiene, resilience, orchestration
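The phase split can be written down as data. This is a minimal sketch assuming quick mode simply restricts the run to Phase 1; `PHASES` and `select_lenses` are illustrative names, not the skill's actual interface.

```python
# Lens names and phase assignments as listed in Decision 2.
PHASES = {
    1: ["secrets", "shell-safety", "blast-radius", "privilege"],
    2: ["idempotency", "supply-chain", "observability"],
    3: ["nix-hygiene", "resilience", "orchestration"],
}


def select_lenses(quick: bool = False) -> list[str]:
    """Return lenses in phase order; quick mode keeps Phase 1 only."""
    phases = [1] if quick else sorted(PHASES)
    return [lens for phase in phases for lens in PHASES[phase]]


print(select_lenses(quick=True))
print(len(select_lenses()))
```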
Decision 3: Crisp lens boundaries to avoid duplicate findings
- Problem: resilience/blast-radius/idempotency overlap
- Solution: Define ownership table
  - idempotency: safe re-run, convergence, atomic writes
  - resilience: runtime fault tolerance, timeouts, retries
  - blast-radius: change safety, dry-run, rollback
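One way to enforce the ownership table mechanically is to drop any finding reported by a lens that does not own the concern. The sketch below assumes findings are tagged with a concern label taken from the table; the tuple layout and function names are hypothetical.

```python
from typing import Optional

# Ownership table from Decision 3: each concern belongs to exactly one lens.
OWNERSHIP = {
    "idempotency": {"safe re-run", "convergence", "atomic writes"},
    "resilience": {"runtime fault tolerance", "timeouts", "retries"},
    "blast-radius": {"change safety", "dry-run", "rollback"},
}


def owner_of(concern: str) -> Optional[str]:
    """Return the lens that owns a concern, or None if the table does not list it."""
    for lens, concerns in OWNERSHIP.items():
        if concern in concerns:
            return lens
    return None


def keep_owned(findings: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Keep a (lens, concern, message) finding only if the reporting lens owns
    the concern, or if nobody owns it."""
    kept = []
    for lens, concern, message in findings:
        owner = owner_of(concern)
        if owner is None or owner == lens:
            kept.append((lens, concern, message))
    return kept


findings = [
    ("resilience", "retries", "curl call has no retry or timeout"),
    ("blast-radius", "retries", "curl call has no retry or timeout"),
    ("blast-radius", "dry-run", "script lacks a dry-run flag"),
]
deduped = keep_owned(findings)
print(deduped)
```

Here the duplicate `retries` finding from blast-radius is dropped because resilience owns that concern.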
Decision 4: Nix-specific checks as first-class concerns
- Context: Nix has unique security model (world-readable store)
- Insight from consensus: Secrets in .nix strings become readable in /nix/store
- Added to secrets lens: explicit Nix store exposure check
- Remediation: sops-nix/agenix with runtime paths, not embedded strings
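To make the Nix store check concrete, here is a rough heuristic sketch: flag secret-looking attributes assigned a string literal, and skip attributes that point at a runtime file path. The regex, the attribute-name keywords, and the `File`/`Path` suffix exemption are assumptions for illustration, not the lens's actual rules.

```python
import re

# Attribute whose name looks secret-like and whose value is a string literal.
SECRET_ATTR = re.compile(
    r'(?P<attr>[\w.-]*(?:password|secret|token|api[_-]?key)[\w.-]*)'
    r'\s*=\s*"(?P<value>[^"]*)"\s*;',
    re.IGNORECASE,
)


def embedded_secrets(nix_source: str) -> list[tuple[int, str]]:
    """Return (line number, attribute) pairs for literals that would be copied
    into the world-readable /nix/store."""
    hits = []
    for lineno, line in enumerate(nix_source.splitlines(), start=1):
        for match in SECRET_ATTR.finditer(line):
            attr, value = match.group("attr"), match.group("value")
            # Runtime path references (e.g. passwordFile = "/run/secrets/...") are fine.
            if attr.lower().endswith(("file", "path")) or value.startswith("/"):
                continue
            # Empty or interpolated strings are left to the LLM pass.
            if not value or "${" in value:
                continue
            hits.append((lineno, attr))
    return hits


sample = '''{
  services.app.password = "hunter2";
  services.app.passwordFile = "/run/secrets/app-password";
  services.db.tokenFile = config.sops.secrets.db-token.path;
}'''
hits = embedded_secrets(sample)
print(hits)
```

Only the second line of the sample is flagged; the two path-based attributes match the sops-nix/agenix style of remediation.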
Problems & Solutions
| Problem | Solution | Learning |
|---|---|---|
| Initial lens drafts too long (60+ lines) | Reference existing code-review lenses (~45 lines) | Consistent format matters for usability |
| Overlapping lens scopes | Created "Crisp Boundaries" table in plan | Define ownership explicitly upfront |
| What lenses are actually needed? | Explored real repos (dotfiles, prox-setup) | Ground design in actual artifacts |
| False positive risk in secrets lens | Added explicit exemptions (Nix hashes, public keys) | Two-signal rule for generic matches |
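The two-signal rule and the exemptions can be sketched as a small classifier. Everything here is an assumption for illustration: the two signals chosen (secret-like key name, high-entropy value), the entropy threshold, and the single known token format (the AWS access key ID pattern).

```python
import math
import re
from collections import Counter
from typing import Optional

KNOWN_TOKEN = re.compile(r"AKIA[0-9A-Z]{16}")          # AWS access key ID shape
EXEMPT_PREFIXES = ("sha256-", "ssh-ed25519 ", "ssh-rsa ")  # Nix hashes, public keys
EXEMPT_KEYS = {"narhash", "vendorhash"}
SECRET_KEY = re.compile(r"password|secret|token|api[_-]?key", re.IGNORECASE)


def shannon_entropy(value: str) -> float:
    """Bits per character of the value's character distribution."""
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in Counter(value).values())


def classify(key: str, value: str) -> Optional[str]:
    """Return a severity, or None when the pair is exempt or shows no signal."""
    if key.lower() in EXEMPT_KEYS or value.startswith(EXEMPT_PREFIXES):
        return None
    if KNOWN_TOKEN.fullmatch(value):
        return "HIGH"  # a known token format needs no second signal
    signals = 0
    if SECRET_KEY.search(key):
        signals += 1
    if len(value) >= 16 and shannon_entropy(value) > 3.5:
        signals += 1
    if signals >= 2:
        return "MED"
    return "LOW" if signals == 1 else None


print(classify("password", "hunter2"))
print(classify("api_token", "q8Zr2LmX0vTb9KcYw4Hs"))
print(classify("vendorHash", "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="))
print(classify("id", "AKIAIOSFODNN7EXAMPLE"))
```

A generic match with a single signal stays at LOW, which keeps Nix hashes and short placeholder values out of the MED/HIGH findings.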
Technical Details
Code Changes
- Total files created: 5
- Key files:
  - `specs/ops-review/plan.md` (261 lines) - Comprehensive design document
  - `skills/ops-review/SKILL.md` (188 lines) - Agent workflow instructions
  - `skills/ops-review/README.md` (96 lines) - User documentation
  - `skills/ops-review/lenses/README.md` (85 lines) - Lens index
  - `skills/ops-review/lenses/secrets.md` (53 lines) - First lens
Commands Used
# Explored actual infrastructure repos
# (via Task tool with Explore subagent)
# Ran multi-model consensus for plan review
uv run orch consensus "Review this ops-review skill design..." sonar flash-or gemini gpt
# Created bd epic with hierarchical children
bd create "ops-review skill" --type=epic -p 1 --description "..."
bd create "Lens: secrets" --parent skills-9cu -p 1 --deps skills-9cu.1
# Visualized dependency graph
bd graph skills-9cu
# Checked available work
bd ready
Architecture Notes
- Skill follows code-review pattern: lenses as focused prompts
- Lenses deploy to ~/.config/lenses/ops/ via home-manager
- Quick mode (--quick) runs Phase 1 only for CI/pre-commit
- Cross-file awareness via grep-based reference mapping (source, imports)
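The grep-based reference mapping could look roughly like the sketch below. It assumes shell references appear as `source`/`.` at the start of a line and that Nix references are relative path literals; both regexes are simplifications, and the function names are hypothetical.

```python
import re

# `source file` or `. file` at the start of a line, optionally quoted.
SHELL_SOURCE = re.compile(r'^[ \t]*(?:source|\.)[ \t]+["\']?([^\s"\';]+)', re.MULTILINE)
# Relative path literals such as ./hardware.nix or ../common/base.nix.
NIX_PATH = re.compile(r'(?<![\w/.])(\.{1,2}/[\w./+-]+)')


def references(path: str, text: str) -> list[str]:
    """Return the files that `path` appears to pull in, in order of appearance."""
    if path.endswith(".nix"):
        return NIX_PATH.findall(text)
    if path.endswith((".sh", ".bash")):
        return SHELL_SOURCE.findall(text)
    return []


def build_reference_map(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each file name to the references found in its contents."""
    return {path: references(path, text) for path, text in files.items()}


files = {
    "deploy.sh": '#!/bin/sh\n. ./lib/common.sh\nsource "$HOME/.config/env.sh"\necho done\n',
    "host.nix": "{ imports = [ ./hardware.nix ../common/base.nix ]; }\n",
}
ref_map = build_reference_map(files)
print(ref_map)
```

References are returned as written, so variables like `$HOME` are not expanded; resolving them would be a separate step.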
Process and Workflow
What Worked Well
- Exploring real repos first grounded the design in actual needs
- orch consensus with 4 models surfaced gaps (Nix store, Docker layers)
- bd epic with --parent creates clean hierarchical structure
- Dependency graph visualization helped verify task ordering
What Was Challenging
- Balancing lens completeness with ~45 line target format
- Deciding which checks are linter-backed vs LLM-primary
- Managing context across long design session
Learning and Insights
Technical Insights
- Nix store world-readability is a critical security consideration
- Docker ENV/ARG values persist in image layers and build history even if a later instruction unsets or deletes them
- CI masking (::add-mask::) is often overlooked
- shellcheck, statix, gitleaks provide structured JSON output for integration
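For the Docker point, a lens-side pre-check might flag `ENV`/`ARG` instructions whose variable names look secret-like. This is a sketch under simple assumptions: one instruction per line, no quoted values containing spaces, and a hypothetical keyword list.

```python
import re

ENV_OR_ARG = re.compile(r"^\s*(ENV|ARG)\s+(.+)$", re.IGNORECASE)
SECRET_NAME = re.compile(r"password|secret|token|api[_-]?key", re.IGNORECASE)


def dockerfile_secret_vars(dockerfile: str) -> list[tuple[int, str, str]]:
    """Return (line number, instruction, variable) for secret-like ENV/ARG names."""
    hits = []
    for lineno, line in enumerate(dockerfile.splitlines(), start=1):
        match = ENV_OR_ARG.match(line)
        if not match:
            continue
        instruction, tokens = match.group(1).upper(), match.group(2).split()
        if not tokens:
            continue
        if "=" in tokens[0]:
            # Modern form: ENV A=1 B=2
            names = [t.split("=", 1)[0] for t in tokens if "=" in t]
        else:
            # `ARG NAME` or legacy `ENV NAME value`: only the first token is a name.
            names = [tokens[0]]
        hits += [(lineno, instruction, n) for n in names if SECRET_NAME.search(n)]
    return hits


sample = """FROM alpine:3.20
ARG NPM_TOKEN
ENV APP_ENV=prod DB_PASSWORD=hunter2
RUN --mount=type=secret,id=npm_token cat /run/secrets/npm_token
ENV PATH /usr/local/bin:$PATH
"""
hits = dockerfile_secret_vars(sample)
print(hits)
```

The `RUN --mount=type=secret` line in the sample is the BuildKit-style alternative and is deliberately not flagged.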
Process Insights
- orch consensus is valuable for pressure-testing designs
- High temperature for brainstorming, low temperature for analysis decisions
- bd hierarchical children (.1, .2, etc.) work well for epic breakdown
Architectural Insights
- Linter-first hybrid is an emerging pattern (doc-review also uses this)
- Lens boundaries must be explicit to avoid duplicate findings
- Platform-specific remediation matters (sops-nix vs BuildKit secrets)
Context for Future Work
Open Questions
- Should ops-review have its own lens directory or share with code-review?
- How to handle cross-repo awareness (dotfiles uses sops, prox-setup uses passage)?
- Should we run linters in parallel before LLM pass?
Next Steps
- Complete Phase 1 lenses: shell-safety, blast-radius, privilege
- Integration: add to flake.nix, update ai-skills.nix
- Validation: test on dotfiles and prox-setup repos
- Ensure gitleaks available (dotfiles-x2m)
Related Work
- Code Review Skill Creation - Original lens pattern
- Doc-Review Skill Design - Hybrid architecture precedent
- Multi-Lens Code Review Testing - LLM-in-the-loop pattern
Raw Notes
- Dotfiles repo: 100+ Nix modules, 90+ shell scripts, SOPS secrets, Gitea Actions
- Prox-setup repo: 88 Python scripts (Proxmox API), 41 shell scripts, Docker Compose
- Models consulted: sonar, flash-or, gemini, gpt (all 4 supported the design)
- Key insight from GPT: "Require two signals for MED/HIGH when not using known token format"
- All models emphasized: don't flag Nix hashes (sha256-, narHash, vendorHash)
Session Metrics
- Commits made: 0 (work in progress)
- Files created: 5
- Lines added: ~683 (plan.md + skill files + lens)
- bd issues created: 16 (1 epic + 14 children + 1 in dotfiles)
- orch consensus runs: 2