skills/docs/worklogs/2026-01-01-ops-review-phase-2-lenses.org
dan fa97fca041 feat: complete ops-review skill with all 10 lenses
Phase 2 lenses (reliability):
- idempotency: safe re-run, atomic ops, convergence
- supply-chain: pinning, provenance, build-time network
- observability: health checks, logging, metrics

Phase 3 lenses (architecture):
- nix-hygiene: statix/deadnix patterns, module design
- resilience: timeouts, retries, resource limits
- orchestration: ordering, dependencies, coupling

All lenses validated via orch consensus (gemini, gpt, flash-or).
Testing delegated to target repos: dotfiles-je5, prox-setup-kqg.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 21:02:39 -08:00


ops-review Phase 2 Lenses: Idempotency, Supply-Chain, Observability

Session Summary

Date: 2026-01-01 (Day 2 of ops-review skill)

Focus Area: Phase 2 lens implementation with orch consensus validation

Accomplishments

  • Created testing bead in dotfiles (dotfiles-je5) with expected findings from smoke test
  • Created testing bead in prox-setup (prox-setup-kqg) for repo team validation
  • Reassigned skills-9cu.14 (prox-setup testing) to prox-setup repo - teams own their own testing
  • Implemented idempotency lens with orch consensus review
  • Implemented supply-chain lens with orch consensus review
  • Implemented observability lens with orch consensus review
  • All three lenses enriched with feedback from gemini, gpt, flash-or
  • Phase 3 lenses remaining: nix-hygiene, resilience, orchestration

Key Decisions

Decision 1: Repo teams own testing beads

  • Context: Originally had skills-9cu.13 and skills-9cu.14 for testing on dotfiles/prox-setup
  • Options considered:

    1. Test in skills repo, document findings
    2. Create beads in target repos, let teams run and validate
  • Rationale: Teams know their repos best, creates accountability, avoids duplicate work
  • Impact: Filed dotfiles-je5 and prox-setup-kqg with expected findings for comparison

Decision 2: Orch consensus for each lens

  • Context: Phase 1 established pattern of using orch consensus for lens validation
  • Rationale: Multiple models catch different edge cases and false positive risks
  • Impact: Each lens enriched with 3-5 additional patterns from consensus

Problems & Solutions

| Problem | Solution | Learning |
|---|---|---|
| gemini/gpt didn't receive file in first orch call | Used pipe instead of command substitution: cat file \| uv run orch consensus "prompt" | Pipe is more reliable than $(cat file) for large content |
| dotfiles dev branch had no upstream | git push --set-upstream origin dev | bd sync assumes upstream exists |
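
The missing-upstream problem can be caught before syncing. The following is a minimal sketch, assuming the remote is named origin; has_upstream and ensure_upstream are hypothetical helper names, not part of git or bd.

```bash
#!/usr/bin/env bash
# Sketch: make sure the current branch tracks a remote branch before syncing.
# Assumption: the remote is called "origin"; helper names are illustrative.

# Succeeds only if the current branch has an upstream configured.
has_upstream() {
  git rev-parse --abbrev-ref --symbolic-full-name '@{u}' >/dev/null 2>&1
}

# Publish the current branch and set its upstream if none is configured yet.
ensure_upstream() {
  if ! has_upstream; then
    git push --set-upstream origin "$(git symbolic-ref --short HEAD)"
  fi
}
```

Calling ensure_upstream before bd sync would make the upstream assumption explicit instead of surfacing it as a failed sync.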

Technical Details

Lens Additions from Orch Consensus

idempotency.md

From consensus feedback:

  • Optimistic locking (ETags, resourceVersion) for read-modify-write races
  • Non-deterministic naming (random suffixes creating duplicates)
  • Delete idempotency (ensure absent pattern)
  • No-op illusion warning (mkdir -p returns 0 for an existing directory even when its mode or owner is wrong; it fails if the path is a regular file)
  • False positive risks section
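
Two of the patterns above, ensure-absent deletes and verifying state rather than trusting an exit code, can be sketched in shell. This is an illustration under assumptions, not the content of idempotency.md; ensure_dir and ensure_absent are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch: converge on a desired state instead of assuming a clean start.
# ensure_dir and ensure_absent are illustrative names, not taken from the lens.

# Ensure a directory exists with the given mode; safe to run repeatedly.
ensure_dir() {
  local path=$1 mode=$2
  if [ -e "$path" ] && [ ! -d "$path" ]; then
    echo "error: $path exists but is not a directory" >&2
    return 1
  fi
  mkdir -p "$path"
  # mkdir -p leaves an existing directory's mode untouched, so set it explicitly.
  chmod "$mode" "$path"
}

# Ensure a file is absent; succeeds whether or not it existed beforehand.
ensure_absent() {
  rm -f -- "$1"
}
```

Both helpers can be re-run any number of times and end in the same state, which is the convergence property the lens looks for.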

supply-chain.md

From consensus feedback:

  • Terraform/Tofu provider and module pinning
  • Build-time network access (__noChroot, RUN curl during build)
  • GitHub Actions permissions block (GITHUB_TOKEN defaults)
  • builtins.fetchTarball without sha256
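
The fetchTarball item lends itself to a quick grep-based check. The sketch below is a heuristic, not the lens's actual detection logic: it only flags calls that pass a bare URL, a form that has no place for a sha256. find_unpinned_fetchtarball is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch: flag fetchTarball calls that pass a bare URL. That form cannot carry
# a sha256, so the fetched content is not pinned by hash.
# Heuristic only: the attribute-set form ({ url = ...; }) with a missing sha256
# is not detected. find_unpinned_fetchtarball is an illustrative name.

find_unpinned_fetchtarball() {
  local root=${1:-.}
  # Matches: fetchTarball "https://..."  and  fetchTarball https://...
  grep -rnE --include='*.nix' 'fetchTarball[[:space:]]+"?https?:' "$root"
}
```

Each hit can then be rewritten to the attribute-set form with url and sha256; running nix-prefetch-url --unpack on the URL is one way to obtain the hash.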

observability.md

From consensus feedback:

  • Resource visibility (disk, inodes, file descriptors, OOM)
  • Heartbeats/dead-man's-switch for scheduled jobs
  • Version/commit hash in startup logs
  • Config dump on startup (redacted)
  • Log rotation policies
  • K8s ignores Dockerfile HEALTHCHECK note
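
The heartbeat item can be sketched with nothing more than a timestamp file. This is a minimal illustration, assuming a file-based heartbeat rather than an external monitoring service; the function names and the heartbeat path are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: file-based dead-man's switch for a scheduled job.
# Function names and the heartbeat path are illustrative assumptions.

# Call at the end of a successful run: store the current epoch time.
record_heartbeat() {
  date +%s > "$1"
}

# Succeeds if the heartbeat exists and is at most max_age seconds old.
heartbeat_is_fresh() {
  local file=$1 max_age=$2 last now
  [ -r "$file" ] || return 1
  last=$(cat "$file")
  now=$(date +%s)
  [ $(( now - last )) -le "$max_age" ]
}
```

The job calls record_heartbeat as its last step, and a separate monitor alerts when heartbeat_is_fresh fails. The check needs to run independently of the job's own scheduler, otherwise a scheduler outage silences both.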

Files Created

  • skills/ops-review/lenses/idempotency.md - Safe re-execution, convergence
  • skills/ops-review/lenses/supply-chain.md - Dependency provenance, pinning
  • skills/ops-review/lenses/observability.md - Visibility, monitoring, debuggability

Commands Used

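```bash
# Run orch consensus on a lens, passing the file on stdin
cat skills/ops-review/lenses/supply-chain.md | uv run orch consensus "Review this…" gemini gpt flash-or

# Create the testing bead in the target repo, then record its dependency
cd ~/proj/dotfiles && bd create title="…" type=task body="…"
bd dep add dotfiles-je5 dotfiles-x2m

# Close the completed task with a reason
bd close skills-9cu.6 reason="Lens created with orch consensus feedback: …"
```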

Process and Workflow

What Worked Well

  • Orch consensus continues to add value - each model catches different issues
  • flash-or consistently fastest with good practical feedback
  • gpt provides most comprehensive lists (sometimes too comprehensive)
  • gemini good at Terraform/IaC patterns
  • Filing testing beads in target repos with expected results creates clear validation criteria

What Was Challenging

  • Command substitution $(cat file) didn't work reliably for passing file content to orch
  • Models sometimes provide overlapping suggestions - need to filter to most impactful
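
The worklog does not record why the command substitution failed. One known difference between the two approaches is shown below: command substitution strips trailing newlines, and the result becomes a command-line argument subject to the operating system's argument size limit, while stdin passes the bytes through unchanged. The file contents here are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Sketch: compare what a consumer sees via stdin vs. via $(cat file).
# The sample content is invented; only the byte counts matter.
set -euo pipefail

tmp=$(mktemp)
printf 'line one\nline two\n\n\n' > "$tmp"

# Via stdin: the bytes arrive unchanged.
piped_bytes=$(wc -c < "$tmp" | tr -d ' ')

# Via command substitution: trailing newlines are stripped before the
# content ever reaches the receiving program.
arg=$(cat "$tmp")
arg_bytes=$(printf '%s' "$arg" | wc -c | tr -d ' ')

echo "stdin: $piped_bytes bytes, substitution: $arg_bytes bytes"
rm -f "$tmp"
```

For large lens files the argument size limit is the more likely culprit, which is consistent with the pipe being the more reliable choice.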

Learning and Insights

Technical Insights

  • K8s ignores Dockerfile HEALTHCHECK - liveness, readiness, and startup probes are what the kubelet actually acts on
  • GITHUB_TOKEN can default to write access, depending on repository/organization settings - set an explicit permissions block
  • builtins.fetchTarball is a common way to fetch nixpkgs without a hash (security gap)
  • Dead-man's-switch pattern essential for cron - detects "job didn't run at all"
  • mkdir -p returning 0 for an existing directory, even one with the wrong mode or owner, is a "no-op illusion" (it does fail when the path is a regular file)

Process Insights

  • Piping file content to orch more reliable than command substitution
  • Three models is the sweet spot - adding more yields diminishing returns
  • gemini + gpt + flash-or covers: IaC, completeness, practical ops

Lens Design Insights

  • "False Positive Risks" section essential - prevents over-flagging
  • Each lens benefits from "Common Fixes" section with copy-paste solutions
  • Crisp boundaries between lenses reduce duplicate findings

Context for Future Work

Open Questions

  • Should Phase 3 lenses also go through orch consensus?
  • How to handle lens overlap when reviewing (priority order?)
  • Metrics for false positive rate validation

Next Steps

  • Phase 3 lenses: nix-hygiene, resilience, orchestration
  • nix-hygiene backed by statix/deadnix (linter-first)
  • Update lenses/README.md with new lenses
  • Consider closing epic when Phase 3 complete

Related Work

Raw Notes

  • Skill deployment just means adding the skill to the claudeCodeSkills list in home/claude.nix
  • Lenses auto-deploy via enableLenses = true (default)
  • direnv use_api_keys provides API keys for orch in any repo with .envrc
  • ops-review epic: 11/14 tasks closed, Phase 3 remaining

Session Metrics

  • Commits made: 3
  • Files touched: 32
  • Lines added/removed: +2014/-263
  • Tests added: 0
  • Lenses created: 3 (idempotency, supply-chain, observability)