skills/skills/ops-review/SKILL.md
dan 2d6d13814f feat(ops-review): Add defense-in-depth lens
- Added skills/ops-review/lenses/defense-in-depth.md
- Updated SKILL.md to include the new lens focusing on environment guards and blast-radius safety
2026-01-19 18:05:00 -08:00


| name | description |
|------|-------------|
| ops-review | Run multi-lens ops review on infrastructure files. Analyzes Nix, shell scripts, Docker, and CI/CD files for secrets, shell-safety, blast-radius, privilege, idempotency, defense-in-depth, supply-chain, observability, nix-hygiene, resilience, and orchestration. Interactive - asks before filing issues. |

Ops Review Skill

Run focused infrastructure analysis using multiple review lenses. Uses a linter-first hybrid approach: static tools for syntax, LLM for semantics. Findings are synthesized and presented for your approval before any issues are filed.

When to Use

Invoke this skill when the user says something like the following, or runs the command directly:

  • "Review my infrastructure"
  • "Run ops review on bin/"
  • "Check this script for issues"
  • "Analyze my Nix configs"
  • /ops-review

Arguments

The skill accepts an optional target:

  • /ops-review - Reviews uncommitted changes to ops files (git diff)
  • /ops-review bin/ - Reviews specific directory
  • /ops-review deploy.sh - Reviews specific file
  • /ops-review --quick - Phase 1 lenses only (fast, <30s)
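
As a minimal sketch of how these invocation forms could be told apart, the snippet below uses Python's argparse. The function name parse_invocation is hypothetical and not part of the skill; a target of None stands for the default git-diff behaviour.

```python
import argparse


def parse_invocation(argv):
    """Parse the words that follow /ops-review (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="/ops-review", add_help=False)
    # Optional positional: a file or directory. None means "use the git diff".
    parser.add_argument("target", nargs="?", default=None)
    # --quick restricts the run to the Phase 1 lenses.
    parser.add_argument("--quick", action="store_true")
    return parser.parse_args(argv)


for words in ([], ["bin/"], ["deploy.sh"], ["--quick"]):
    args = parse_invocation(words)
    print(words, "->", args.target, args.quick)
```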

Target Artifacts

| Category | File Patterns |
|----------|---------------|
| Nix/NixOS | *.nix, flake.nix, flake.lock |
| Shell Scripts | *.sh, files with #!/bin/bash shebang |
| Python Automation | *.py in ops contexts (scripts/, setup/, deploy/) |
| Container Configs | Dockerfile, docker-compose.yml, *.dockerfile |
| CI/CD | .github/workflows/*.yml, .gitea/workflows/*.yml |
| Service Configs | *.service, *.timer, systemd units |
| Secrets | .sops.yaml, secrets.yaml, SOPS-encrypted files |
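
One way to apply these patterns is a first-match lookup, sketched below. The category keys, the classify function, and the assumption that paths are repository-relative are all illustrative choices rather than part of the skill. Detection of SOPS-encrypted content is left out because it would need to inspect file contents.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical mapping derived from the Target Artifacts patterns.
# Order matters: the first matching category wins.
CATEGORY_PATTERNS = [
    ("secrets", [".sops.yaml", "secrets.yaml"]),
    ("nix", ["*.nix", "flake.lock"]),
    ("shell", ["*.sh"]),
    ("container", ["Dockerfile", "docker-compose.yml", "*.dockerfile"]),
    ("ci", [".github/workflows/*.yml", ".gitea/workflows/*.yml"]),
    ("systemd", ["*.service", "*.timer"]),
]
OPS_DIRS = {"scripts", "setup", "deploy"}


def classify(path, first_line=""):
    """Return a category name for a repo-relative path, or None."""
    p = PurePosixPath(path)
    for category, patterns in CATEGORY_PATTERNS:
        for pattern in patterns:
            # Patterns containing "/" match the whole path, others the file name.
            target = str(p) if "/" in pattern else p.name
            if fnmatchcase(target, pattern):
                return category
    # Python only counts when it lives under an ops directory.
    if p.suffix == ".py" and OPS_DIRS & set(p.parts[:-1]):
        return "python"
    # Extensionless scripts are recognised by their shebang line.
    if first_line.startswith("#!/bin/bash"):
        return "shell"
    return None


print(classify("bin/deploy.sh"), classify("scripts/backup.py"), classify("src/app.py"))
```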

Architecture: Linter-First Hybrid

Stage 1: Static Tools (fast, deterministic)
├── shellcheck for shell scripts
├── statix + deadnix for Nix
├── hadolint for Dockerfiles
└── yamllint for YAML configs

Stage 2: LLM Analysis (semantic, contextual)
├── Interprets tool output in context
├── Finds logic bugs tools miss
├── Synthesizes cross-file issues
└── Suggests actionable fixes

Available Lenses

Lenses are focused review prompts located in ~/.config/lenses/ops/:

Phase 1: Core Safety (--quick mode)

| Lens | Focus |
|------|-------|
| secrets.md | Hardcoded credentials, SOPS issues, secrets in logs |
| shell-safety.md | set -euo pipefail, quoting, error handling (shellcheck-backed) |
| blast-radius.md | Destructive ops, missing dry-run, no rollback |
| privilege.md | Unnecessary sudo, root containers, chmod 777 |

Phase 2: Reliability

| Lens | Focus |
|------|-------|
| idempotency.md | Safe re-run, existence checks, atomic operations |
| defense-in-depth.md | NEW: Environment guards, path anchoring, and blast-radius safety |
| supply-chain.md | Unpinned versions, missing SRI hashes, action SHAs |
| observability.md | Silent failures, missing health checks, no logging |

Phase 3: Architecture

| Lens | Focus |
|------|-------|
| nix-hygiene.md | Dead code, anti-patterns, module boundaries (statix-backed) |
| resilience.md | Timeouts, retries, graceful shutdown, resource limits |
| orchestration.md | Execution order, prerequisites, implicit coupling |
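
The lens phases can be pictured as a small lookup table. The sketch below is illustrative only: LENS_PHASES and select_lenses are hypothetical names, and the lens lists simply mirror the three phases described in this section.

```python
# Lens names grouped by phase, as listed under Available Lenses.
LENS_PHASES = {
    1: ["secrets", "shell-safety", "blast-radius", "privilege"],
    2: ["idempotency", "defense-in-depth", "supply-chain", "observability"],
    3: ["nix-hygiene", "resilience", "orchestration"],
}


def select_lenses(quick=False):
    """Return lens file names to run; --quick keeps only Phase 1."""
    phases = [1] if quick else sorted(LENS_PHASES)
    return [f"{name}.md" for phase in phases for name in LENS_PHASES[phase]]


print(select_lenses(quick=True))
print(len(select_lenses()))
```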

Workflow

Phase 1: Target Selection

  1. Parse the target argument (default: git diff of uncommitted ops files)
  2. Identify files by category (Nix, shell, Docker, etc.)
  3. Show file list to user for confirmation

Phase 2: Pre-Pass (Static Tools)

Run appropriate linters based on file type:

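# Shell scripts
shellcheck --format=json script.sh

# Nix files
statix check --format=json file.nix
deadnix --output-format=json file.nix

# Dockerfiles
hadolint --format json Dockerfile

# YAML configs (yamllint has no JSON output; "parsable" is its machine-readable format)
yamllint --format parsable config.yml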
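
A pre-pass driver might dispatch these commands by file category, roughly as sketched here. The category keys and both function names are assumptions made for illustration. The sketch skips tools that are not installed and deliberately does not treat a non-zero exit status as an error, since linters commonly exit non-zero when they report findings.

```python
import shutil
import subprocess

# Hypothetical category -> linter command mapping (the file path is appended).
LINTER_COMMANDS = {
    "shell": [["shellcheck", "--format=json"]],
    "nix": [["statix", "check", "--format=json"],
            ["deadnix", "--output-format=json"]],
    "container": [["hadolint", "--format", "json"]],
    "ci": [["yamllint", "--format", "parsable"]],
}


def linter_commands(category, path):
    """Build the argv lists to run for one file."""
    return [cmd + [path] for cmd in LINTER_COMMANDS.get(category, [])]


def run_linters(category, path):
    """Run each available linter and return {tool: stdout or None}."""
    results = {}
    for cmd in linter_commands(category, path):
        tool = cmd[0]
        if shutil.which(tool) is None:
            results[tool] = None  # tool missing: skip instead of failing the review
            continue
        # No check=True: a non-zero exit usually just means findings were reported.
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[tool] = proc.stdout
    return results


print(linter_commands("nix", "flake.nix"))
```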

Phase 3: Lens Execution

For each lens, analyze the target files with tool output in context:

  1. Read the lens prompt from ~/.config/lenses/ops/{lens}.md
  2. Include relevant linter output as evidence
  3. Apply the lens to find semantic issues tools miss
  4. Collect findings in structured format

Finding Format:

[TAG] <severity:HIGH|MED|LOW> <file:line>
Issue: <what's wrong>
Suggest: <how to fix>
Evidence: <why it matters>
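
To make the finding format concrete, here is a small sketch that renders and parses it. The Finding class and parse_finding function are hypothetical; the field names follow the format above, and the Evidence line is treated as optional, which is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    tag: str
    severity: str  # HIGH, MED, or LOW
    file: str
    line: int
    issue: str
    suggest: str
    evidence: str = ""

    def render(self):
        """Format the finding as the text block shown above."""
        rows = [
            f"[{self.tag}] {self.severity} {self.file}:{self.line}",
            f"Issue: {self.issue}",
            f"Suggest: {self.suggest}",
        ]
        if self.evidence:
            rows.append(f"Evidence: {self.evidence}")
        return "\n".join(rows)


HEADER = re.compile(
    r"^\[(?P<tag>[A-Z-]+)\] (?P<severity>HIGH|MED|LOW) (?P<file>.+):(?P<line>\d+)$"
)


def parse_finding(text):
    """Parse one rendered finding back into a Finding."""
    header, *rest = text.strip().splitlines()
    match = HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a finding header: {header!r}")
    fields = {}
    for row in rest:
        # Split on the first ": " only, so fixes containing colons survive.
        key, _, value = row.strip().partition(": ")
        fields[key.lower()] = value
    return Finding(
        tag=match["tag"],
        severity=match["severity"],
        file=match["file"],
        line=int(match["line"]),
        issue=fields.get("issue", ""),
        suggest=fields.get("suggest", ""),
        evidence=fields.get("evidence", ""),
    )


example = Finding("BLAST-RADIUS", "HIGH", "bin/deploy.sh", 78,
                  "rm -rf with variable that could be empty",
                  'Add guard: [ -n "$DIR" ] || exit 1')
print(example.render())
```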

Phase 4: Synthesis

After all lenses complete:

  1. Deduplicate overlapping findings (same issue from multiple lenses)
  2. Group related issues
  3. Rank by severity and confidence
  4. Generate summary report
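
The synthesis steps could look roughly like the sketch below. It makes two simplifying assumptions that the skill does not state: findings pointing at the same file and line are treated as the same issue, and ranking uses severity only (confidence is not modelled). The function names are hypothetical.

```python
from collections import Counter

SEVERITY_ORDER = {"HIGH": 0, "MED": 1, "LOW": 2}


def synthesize(findings):
    """Merge findings that share file:line, then sort by severity."""
    merged = {}
    for finding in findings:
        key = (finding["file"], finding["line"])
        if key not in merged:
            merged[key] = {**finding, "tags": [finding["tag"]]}
            continue
        kept = merged[key]
        if finding["tag"] not in kept["tags"]:
            kept["tags"].append(finding["tag"])
        # Keep the details of whichever duplicate is most severe.
        if SEVERITY_ORDER[finding["severity"]] < SEVERITY_ORDER[kept["severity"]]:
            kept.update(finding)
    return sorted(
        merged.values(),
        key=lambda f: (SEVERITY_ORDER[f["severity"]], f["file"], f["line"]),
    )


def severity_counts(findings):
    """Counts by severity for the executive summary."""
    counts = Counter(f["severity"] for f in findings)
    return {level: counts[level] for level in SEVERITY_ORDER}


raw = [
    {"tag": "SHELL-SAFETY", "severity": "MED", "file": "bin/deploy.sh", "line": 78,
     "issue": "unquoted variable"},
    {"tag": "BLAST-RADIUS", "severity": "HIGH", "file": "bin/deploy.sh", "line": 78,
     "issue": "rm -rf with variable that could be empty"},
    {"tag": "SHELL-SAFETY", "severity": "LOW", "file": "bin/deploy.sh", "line": 12,
     "issue": "missing set -euo pipefail"},
]
for item in synthesize(raw):
    print(item["severity"], item["line"], item["tags"])
```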

Phase 5: Interactive Review

Present findings to user:

  1. Show executive summary (counts by severity)
  2. List top issues with details
  3. Ask: "Which findings should I file as issues?"

User can respond:

  • "File all" - creates beads issues for everything
  • "File HIGH only" - filters by severity
  • "File 1, 3, 5" - specific findings
  • "None" - just keep the report
  • "Let me review first" - show full details
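
As an illustration of how these replies might map to a selection, the sketch below returns the chosen findings, an empty list for "None", and None for anything it does not recognise (such as "Let me review first"), so that nothing is filed without an explicit choice. The function name and the exact matching rules are assumptions.

```python
import re


def select_findings(response, findings):
    """Map a user reply to the findings to file; None means 'ask again'."""
    text = response.strip().lower()
    if text in ("all", "file all"):
        return list(findings)
    if text == "none":
        return []
    by_severity = re.match(r"(?:file )?(high|med|low) only$", text)
    if by_severity:
        level = by_severity.group(1).upper()
        return [f for f in findings if f["severity"] == level]
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    if numbers:
        # Findings are numbered from 1 in the summary; ignore out-of-range picks.
        return [findings[n - 1] for n in numbers if 1 <= n <= len(findings)]
    return None  # unrecognised reply: show details or ask again, never file


sample = [{"id": i, "severity": s} for i, s in enumerate(["HIGH", "HIGH", "MED", "MED", "LOW"], 1)]
print([f["id"] for f in select_findings("File 1, 3, 5", sample)])
```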

Phase 6: Issue Filing (if requested)

For approved findings:

  1. Create beads issues with bd create
  2. Include lens tag, severity, file location
  3. Link related issues if applicable

Output

The skill produces:

  1. Console summary - immediate feedback
  2. Beads issues - if user approves filing

Severity Rubric

| Severity | Criteria |
|----------|----------|
| HIGH | Exploitable vulnerability, data loss risk, will break on next run |
| MED | Reliability issue, tech debt, violation of best practice |
| LOW | Polish, maintainability, defense-in-depth improvement |

Context matters: the same issue may be HIGH in production but LOW in a homelab.

Example Session

User: /ops-review bin/deploy.sh

Agent: I'll review bin/deploy.sh with ops lenses.

[Running shellcheck...]
[Running secrets lens...]
[Running shell-safety lens...]
[Running blast-radius lens...]
[Running privilege lens...]

## Review Summary: bin/deploy.sh

| Severity | Count |
|----------|-------|
| HIGH     | 2     |
| MED      | 3     |
| LOW      | 1     |

### Top Issues

1. [SECRETS] HIGH bin/deploy.sh:45
   Issue: API token passed as command-line argument (visible in process list)
   Suggest: Use environment variable or file with restricted permissions

2. [BLAST-RADIUS] HIGH bin/deploy.sh:78
   Issue: rm -rf with variable that could be empty
   Suggest: Add guard: [ -n "$DIR" ] || exit 1

3. [SHELL-SAFETY] MED bin/deploy.sh:12
   Issue: Missing 'set -euo pipefail'
   Suggest: Add at top of script for fail-fast behavior

Would you like me to file any of these as beads issues?
Options: all, HIGH only, specific numbers (1,2,3), or none

Quick Mode

Use --quick for fast pre-commit checks:

  • Runs only Phase 1 lenses (secrets, shell-safety, blast-radius, privilege)
  • Target: <30 seconds
  • Ideal for CI gates

Cross-File Awareness

Before review, build a reference map:

  • Shell: source, . includes, invoked scripts
  • Nix: imports, flake inputs
  • CI: referenced scripts, env vars, secrets names
  • Compose: service dependencies, volumes, env files
  • systemd: ExecStart targets, dependencies

This enables finding issues in the seams between components.
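
For the shell entry of the reference map, a rough sketch is shown below. It only recognises source and . lines with a literal path argument; the regular expression and function names are illustrative assumptions, and real scripts (for example ones that build include paths dynamically) would need more than this.

```python
import re

# Matches "source path" or ". path" at the start of a line; comments are skipped
# because the line must begin with the keyword after optional indentation.
SOURCE_RE = re.compile(r"^[ \t]*(?:source|\.)[ \t]+([^\s;#]+)", re.MULTILINE)


def shell_includes(script_text):
    """Return the paths a shell script sources, with surrounding quotes removed."""
    return [path.strip("\"'") for path in SOURCE_RE.findall(script_text)]


def build_reference_map(scripts):
    """scripts maps a path to its text; the result maps a path to its includes."""
    return {path: shell_includes(text) for path, text in scripts.items()}


demo = {
    "bin/deploy.sh": 'set -euo pipefail\nsource ./lib/common.sh\n. "$HOME/.env"\n# source ignored.sh\n./run.sh now\n',
}
print(build_reference_map(demo))
```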

Guidelines

  1. Linter-First - Always run static tools before LLM analysis
  2. Evidence Over Opinion - Cite linter output and specific lines
  3. Actionable Suggestions - Every finding needs a clear fix
  4. Respect User Time - Summarize first, details on request
  5. No Spam - Don't file issues without explicit approval
  6. Context Matters - Homelab ≠ production severity

Process Checklist

  1. Parse target (files/directory/diff)
  2. Confirm scope with user if large (>10 files)
  3. Run static tools (shellcheck, statix, etc.)
  4. Build reference map for cross-file awareness
  5. Run each lens, collecting findings
  6. Deduplicate and rank findings
  7. Present summary to user
  8. Ask which findings to file
  9. Create beads issues for approved findings
  10. Report issue IDs created

Integration

  • Lenses: Read from ~/.config/lenses/ops/*.md
  • Issue Tracking: Uses bd create for beads issues
  • Static Tools: shellcheck, statix, deadnix, hadolint, yamllint