Multi-lens review skill for operational infrastructure (Nix, shell, Docker, CI/CD).
Modeled on code-review with linter-first hybrid architecture.

Phase 1 lenses (core safety):
- secrets: credential exposure, Nix store, Docker layers, CI masking
- shell-safety: shellcheck-backed, temp files, guard snippets
- blast-radius: targeting/scoping, dry-run, rollback
- privilege: least-privilege, containers, systemd sandboxing

Design reviewed via orch consensus (sonar, flash-or, gemini, gpt).
Lenses deploy to ~/.config/lenses/ops/ via home-manager.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

#+TITLE: ops-review Skill Design, Orch Consensus Planning, and Skeleton Implementation
#+DATE: 2026-01-01
#+KEYWORDS: ops-review, skill-design, orch-consensus, lenses, infrastructure-review, nix, shell-safety, secrets
#+COMMITS: 0 (uncommitted work in progress)
#+COMPRESSION_STATUS: uncompressed

* Session Summary
** Date: 2026-01-01
** Focus Area: Designing and implementing the ops-review skill for infrastructure code analysis

* Accomplishments
- [X] Explored dotfiles and prox-setup repos to understand actual ops artifact landscape
- [X] Designed ops-review skill with 10 lenses across 3 phases
- [X] Ran orch consensus (sonar, flash-or, gemini, gpt) on initial plan
- [X] Incorporated consensus feedback: linter-first hybrid architecture, crisp lens boundaries
- [X] Created comprehensive plan.md in specs/ops-review/
- [X] Created bd epic (skills-9cu) with 14 child tasks, proper dependency graph
- [X] Built skill skeleton: SKILL.md, README.md, lenses/README.md
- [X] Drafted secrets.md lens with orch consensus review
- [X] Incorporated Nix store exposure, Docker layer persistence, CI masking feedback
- [X] Filed follow-up issue in dotfiles for gitleaks availability (dotfiles-x2m)
- [ ] Remaining Phase 1 lenses: shell-safety, blast-radius, privilege

* Key Decisions
** Decision 1: Linter-first hybrid architecture
- Context: How should ops-review analyze infrastructure code?
- Options considered:
  1. Pure LLM analysis - flexible but prone to syntax hallucinations
  2. Pure linter - deterministic but misses semantic issues
  3. Hybrid: linters first, LLM interprets - best of both
- Rationale: All 4 consensus models agreed LLMs hallucinate syntax but excel at understanding intent. Static tools catch syntax, LLM finds logic bugs.
- Impact: Each lens integrates with specific tools (shellcheck, statix, gitleaks)

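The linter-first flow in Decision 1 can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names `lint_findings` and `build_review_input` and the prompt wording are assumptions. Only the ordering (linter output first, then LLM interpretation) comes from the decision itself.

#+begin_src bash
# Hypothetical helper: emit linter findings as JSON for a later LLM pass.
lint_findings() {
  local file=$1
  if command -v shellcheck >/dev/null 2>&1; then
    # shellcheck exits non-zero when it reports findings, so ignore its status.
    shellcheck --format=json "$file" || true
  else
    # No linter available: hand the LLM an empty finding list.
    printf '[]\n'
  fi
}

# Hypothetical prompt assembly: deterministic linter facts first, source second.
build_review_input() {
  local file=$1
  printf 'LINTER FINDINGS (authoritative for syntax):\n'
  lint_findings "$file"
  printf '\nSOURCE (review for intent and logic):\n'
  cat "$file"
}
#+end_src

Putting the linter output ahead of the source means the model is asked to interpret findings rather than rediscover syntax problems on its own.
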
** Decision 2: 10 lenses across 3 phases
- Context: How many lenses and how to prioritize?
- Initial proposal: 8 lenses
- Consensus feedback: Add privilege (least-privilege) and supply-chain (pinning)
- Phase 1 (quick mode): secrets, shell-safety, blast-radius, privilege
- Phase 2: idempotency, supply-chain, observability
- Phase 3: nix-hygiene, resilience, orchestration

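The phase split and quick mode can be pictured as a simple lookup. A minimal sketch, assuming hypothetical helper names `lenses_for_phase` and `select_lenses`. Only the lens names, the phase grouping, and the `--quick` flag come from these notes.

#+begin_src bash
# Map a phase number to its lenses (grouping taken from Decision 2).
lenses_for_phase() {
  case "$1" in
    1) echo "secrets shell-safety blast-radius privilege" ;;
    2) echo "idempotency supply-chain observability" ;;
    3) echo "nix-hygiene resilience orchestration" ;;
    *) return 1 ;;
  esac
}

# Hypothetical selector: --quick limits the run to Phase 1.
select_lenses() {
  if [ "${1:-}" = "--quick" ]; then
    lenses_for_phase 1
  else
    echo "$(lenses_for_phase 1) $(lenses_for_phase 2) $(lenses_for_phase 3)"
  fi
}
#+end_src
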
** Decision 3: Crisp lens boundaries to avoid duplicate findings
- Problem: resilience/blast-radius/idempotency overlap
- Solution: Define ownership table
  - idempotency: safe re-run, convergence, atomic writes
  - resilience: runtime fault tolerance, timeouts, retries
  - blast-radius: change safety, dry-run, rollback

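One way to make the ownership table enforceable is a single lookup that every finding passes through, so each concern has exactly one owning lens. This is a sketch under assumptions: the function name `lens_for` and the slug spellings of the concerns are hypothetical, and only the concern-to-lens assignments come from the table above.

#+begin_src bash
# Return the single lens that owns a concern, or fail if nobody owns it.
lens_for() {
  case "$1" in
    safe-rerun|convergence|atomic-writes) echo idempotency ;;
    fault-tolerance|timeouts|retries)     echo resilience ;;
    change-safety|dry-run|rollback)       echo blast-radius ;;
    *) echo unowned; return 1 ;;
  esac
}
#+end_src

A finding raised by a lens that does not own the concern can then be dropped or rerouted instead of being reported twice.
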
** Decision 4: Nix-specific checks as first-class concerns
- Context: Nix has unique security model (world-readable store)
- Insight from consensus: Secrets in .nix strings become readable in /nix/store
- Added to secrets lens: explicit Nix store exposure check
- Remediation: sops-nix/agenix with runtime paths, not embedded strings

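A rough idea of what a Nix store exposure check could look for: secret-like attributes assigned a string literal, since literals are copied into the world-readable store. The sketch below is a heuristic under stated assumptions, not the check in the secrets lens. The function name `scan_nix_literals`, the keyword list, and the `*File`/`*Path` exemption are all illustrative choices.

#+begin_src bash
# Heuristic: flag secret-like attributes assigned a string literal in .nix files.
# Attributes named *File or *Path are skipped on the assumption that they
# point at runtime paths (the sops-nix/agenix pattern) rather than hold secrets.
scan_nix_literals() {
  local root=$1
  grep -rnEi --include='*.nix' \
    '(password|passwd|secret|token|api[_-]?key)[A-Za-z0-9_-]*[[:space:]]*=[[:space:]]*"[^"]+"' \
    "$root" |
    grep -vE '(File|Path)[[:space:]]*=' || true
}
#+end_src

Matches from a scan like this are candidates for the LLM pass to confirm, not findings in their own right.
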
* Problems & Solutions
| Problem | Solution | Learning |
|---------+----------+----------|
| Initial lens drafts too long (60+ lines) | Reference existing code-review lenses (~45 lines) | Consistent format matters for usability |
| Overlapping lens scopes | Created "Crisp Boundaries" table in plan | Define ownership explicitly upfront |
| What lenses are actually needed? | Explored real repos (dotfiles, prox-setup) | Ground design in actual artifacts |
| False positive risk in secrets lens | Added explicit exemptions (Nix hashes, public keys) | Two-signal rule for generic matches |

* Technical Details

** Code Changes
- Total files created: 5
- Key files:
  - `specs/ops-review/plan.md` (261 lines) - Comprehensive design document
  - `skills/ops-review/SKILL.md` (188 lines) - Agent workflow instructions
  - `skills/ops-review/README.md` (96 lines) - User documentation
  - `skills/ops-review/lenses/README.md` (85 lines) - Lens index
  - `skills/ops-review/lenses/secrets.md` (53 lines) - First lens

** Commands Used
#+begin_src bash
# Explored actual infrastructure repos
# (via Task tool with Explore subagent)

# Ran multi-model consensus for plan review
uv run orch consensus "Review this ops-review skill design..." sonar flash-or gemini gpt

# Created bd epic with hierarchical children
bd create "ops-review skill" --type=epic -p 1 --description "..."
bd create "Lens: secrets" --parent skills-9cu -p 1 --deps skills-9cu.1

# Visualized dependency graph
bd graph skills-9cu

# Checked available work
bd ready
#+end_src

** Architecture Notes
- Skill follows code-review pattern: lenses as focused prompts
- Lenses deploy to ~/.config/lenses/ops/ via home-manager
- Quick mode (--quick) runs Phase 1 only for CI/pre-commit
- Cross-file awareness via grep-based reference mapping (source, imports)

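The grep-based reference mapping mentioned above could look roughly like this for shell scripts. A minimal sketch, assuming a hypothetical function name `map_sources` and an invented `file -> target` output format. It only handles `source` and `.` at the start of a line and does not resolve variables or relative paths.

#+begin_src bash
# List "file -> sourced-file" edges for every *.sh under a directory.
map_sources() {
  local root=$1
  grep -rnE --include='*.sh' \
    '^[[:space:]]*(source|\.)[[:space:]]+[^[:space:]]+' "$root" |
    sed -E 's/^([^:]+):[0-9]+:[[:space:]]*(source|\.)[[:space:]]+/\1 -> /'
}
#+end_src

The resulting edge list tells a lens which other files to pull into context when it reviews a script.
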
* Process and Workflow

** What Worked Well
- Exploring real repos first grounded the design in actual needs
- orch consensus with 4 models surfaced gaps (Nix store, Docker layers)
- bd epic with --parent creates clean hierarchical structure
- Dependency graph visualization helped verify task ordering

** What Was Challenging
- Balancing lens completeness with ~45 line target format
- Deciding which checks are linter-backed vs LLM-primary
- Managing context across long design session

* Learning and Insights

** Technical Insights
- Nix store world-readability is a critical security consideration
- Docker ENV/ARG values persist in image history even if removed in a later layer
- CI masking (::add-mask::) is often overlooked
- shellcheck, statix, gitleaks provide structured JSON output for integration

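For the CI masking point, the `::add-mask::` workflow command is written to stdout so the runner redacts the value in later log output. A minimal sketch follows. The wrapper name `mask_secret` and the `fetch_token` placeholder are hypothetical, and whether a given runner (for example Gitea Actions) honors the command should be verified rather than assumed.

#+begin_src bash
# Emit the workflow command that asks the runner to redact a value in logs.
mask_secret() {
  printf '::add-mask::%s\n' "$1"
}

# Hypothetical usage: mask a token obtained at runtime before anything can echo it.
# token=$(fetch_token)   # fetch_token is a placeholder, not a real command
# mask_secret "$token"
#+end_src

Values that a job derives at runtime are the usual gap, because they are never registered as secrets and so are not masked automatically.
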
** Process Insights
- orch consensus is valuable for pressure-testing designs
- High temperature for brainstorming, low temperature for analysis decisions
- bd hierarchical children (.1, .2, etc.) work well for epic breakdown

** Architectural Insights
- Linter-first hybrid is an emerging pattern (doc-review also uses this)
- Lens boundaries must be explicit to avoid duplicate findings
- Platform-specific remediation matters (sops-nix vs BuildKit secrets)

* Context for Future Work

** Open Questions
- Should ops-review have its own lens directory or share with code-review?
- How to handle cross-repo awareness (dotfiles uses sops, prox-setup uses passage)?
- Should we run linters in parallel before LLM pass?

** Next Steps
- Complete Phase 1 lenses: shell-safety, blast-radius, privilege
- Integration: add to flake.nix, update ai-skills.nix
- Validation: test on dotfiles and prox-setup repos
- Ensure gitleaks available (dotfiles-x2m)

** Related Work
- [[file:2025-12-28-code-review-skill-creation-worklog-cleanup.org][Code Review Skill Creation]] - Original lens pattern
- [[file:2025-12-04-doc-review-skill-design.org][Doc-Review Skill Design]] - Hybrid architecture precedent
- [[file:2025-12-26-multi-lens-code-review-workflow-testing.org][Multi-Lens Code Review Testing]] - LLM-in-the-loop pattern

* Raw Notes
- Dotfiles repo: 100+ Nix modules, 90+ shell scripts, SOPS secrets, Gitea Actions
- Prox-setup repo: 88 Python scripts (Proxmox API), 41 shell scripts, Docker Compose
- Models consulted: sonar, flash-or, gemini, gpt (all 4 supported the design)
- Key insight from GPT: "Require two signals for MED/HIGH when not using known token format"
- All models emphasized: don't flag Nix hashes (sha256-, narHash, vendorHash)

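The two-signal rule and the hash exemptions can be combined into one small classifier. This is a sketch under assumptions, not the logic in secrets.md. The function name `classify_candidate`, the choice of signals (secret-like name, long value without whitespace), the 20-character threshold, and the example token prefixes are all illustrative.

#+begin_src bash
# Classify a name/value pair as EXEMPT, HIGH, MED or LOW.
classify_candidate() {
  local name=$1 value=$2 signals=0
  # Exemptions: Nix content hashes and public keys are not secrets.
  case "$name" in
    narHash|vendorHash|hash|sha256|*[Pp]ublic[Kk]ey*) echo EXEMPT; return ;;
  esac
  case "$value" in sha256-*) echo EXEMPT; return ;; esac
  # Known token formats need no second signal (prefixes chosen as examples).
  case "$value" in ghp_*|AKIA*) echo HIGH; return ;; esac
  # Signal 1: the attribute name looks secret-like.
  if printf '%s' "$name" | grep -qiE 'pass|secret|token|key'; then
    signals=$((signals + 1))
  fi
  # Signal 2: the value is long and contains no whitespace.
  if [ "${#value}" -ge 20 ] && ! printf '%s' "$value" | grep -q '[[:space:]]'; then
    signals=$((signals + 1))
  fi
  if [ "$signals" -ge 2 ]; then echo MED; else echo LOW; fi
}
#+end_src

A generic match with only one signal stays at LOW, which is what keeps long URLs and plainly named config strings out of the higher-severity findings.
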
* Session Metrics
- Commits made: 0 (work in progress)
- Files created: 5
- Lines added: ~683 (plan.md + skill files + lens)
- bd issues created: 16 (1 epic + 14 children + 1 in dotfiles)
- orch consensus runs: 2