docs: worklog for code-review skill creation and worklog cleanup
#+TITLE: Code Review Skill Creation and Worklog Cleanup
#+DATE: 2025-12-28
#+KEYWORDS: code-review, skill, worklog, refactoring, orch-consensus, lenses
#+COMMITS: 6
#+COMPRESSION_STATUS: uncompressed

* Session Summary
** Date: 2025-12-28 (continuation of the 2025-12-26 session)
** Focus Area: Creating the /code-review skill, cleaning up the worklog skill

* Accomplishments
- [X] Ran orch consensus on code-review workflow design (gpt + gemini; qwen was flaky)
- [X] Created /code-review skill based on consensus recommendations
- [X] Closed proto skills-fvc and 7 child tasks (replaced by skill)
- [X] Added code-review to dotfiles claudeCodeSkills deployment
- [X] Added code-review to delbaker .skills manifest
- [X] Holistic review of skills repo (50 open issues, 2 blocked epics)
- [X] Completed all 5 worklog cleanup tasks (127 -> 88 lines, -31%)
- [X] Tested updated extract-metrics.sh script
- [X] Ran code-review on updated worklog skill (clean - no issues worth filing)
- [X] Filed 5 issues in dotfiles from code-review of flake.nix

* Key Decisions
** Decision 1: Skill over Proto for code-review workflow
- Context: Had both lenses (prompts) and a beads proto (skills-fvc) for code review
- Options considered:
  1. Keep proto as workflow orchestrator - unused, adds complexity
  2. Create Claude Code skill as entrypoint - matches actual usage pattern
  3. Ad-hoc documentation only - too loose
- Rationale: Consensus from GPT + Gemini agreed a skill is the right abstraction; the proto was never actually used (bd pour/wisp commands).
- Impact: Simpler mental model - /code-review is the entrypoint, lenses are the prompts it uses

** Decision 2: Interactive by default for code-review
- Context: How much automation for issue filing?
- Options considered:
  1. Full automation - file all findings automatically
  2. Interactive - present findings, ask before filing
  3. Report only - never file, just output
- Rationale: Both models recommended interactive; it prevents issue spam and keeps a human in the loop.
- Impact: Skill asks "which findings to file?" after presenting its summary

** Decision 3: Consolidate worklog skill aggressively
- Context: 5 cleanup tasks from the earlier lens review
- Rationale: Quick wins, reduced maintenance burden, and a test of the lens -> issue -> fix cycle
- Impact: 127 -> 88 lines (-31%), cleaner skill prompt

* Problems & Solutions
| Problem | Solution | Learning |
|---------|----------|----------|
| Orch consensus with qwen hanging | Kill and retry with gpt + gemini only | qwen has reliability issues on long prompts |
| Orch consensus timing out | Run models separately with orch chat, synthesize manually | Parallel queries work; the consensus command buffers until all models complete |
| Proto tasks polluting other repos | Close proto, use skill instead | molecules.jsonl cross-repo loading needs work (bd-k2wg) |
| extract-metrics.sh not showing branch/status | Added BRANCH and STATUS output to script | Script was metrics-focused; now includes full git context |
| Semantic compression references | Already removed when merging Guidelines/Remember | Cleanup tasks sometimes overlap |

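The extract-metrics.sh fix above (adding BRANCH and STATUS output) can be sketched roughly as follows. This is a hypothetical reconstruction, not the script's actual contents; the function names are illustrative.

```shell
# Hypothetical sketch of the branch/status additions to extract-metrics.sh.
# Function names are illustrative; the real script may differ.

git_branch() {
  git rev-parse --abbrev-ref HEAD
}

git_status_word() {
  # "clean" when the working tree has no staged, unstaged, or untracked
  # changes; "dirty" otherwise.
  if [ -z "$(git status --porcelain)" ]; then
    echo clean
  else
    echo dirty
  fi
}

print_git_context() {
  echo "Git Context"
  echo "BRANCH: $(git_branch)"
  echo "STATUS: $(git_status_word)"
}
```

Using =git status --porcelain= keeps the check stable across git versions, since the porcelain format is intended for scripting.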
* Technical Details

** Code Changes
- Total files modified: 20
- Key files changed:
  - =skills/code-review/SKILL.md= - New skill (120 lines)
  - =skills/code-review/README.md= - Skill documentation
  - =skills/code-review/lenses/*.md= - Bundled lens prompts
  - =skills/worklog/SKILL.md= - Refactored (127 -> 88 lines)
  - =skills/worklog/scripts/extract-metrics.sh= - Added branch/status output
  - =modules/ai-skills.nix= - Added code-review to skills list
  - =~/proj/dotfiles/home/claude.nix= - Added code-review to claudeCodeSkills
  - =~/proj/delbaker/.skills= - Added code-review to manifest

** New Files Created
- =skills/code-review/SKILL.md= - Main skill prompt
- =skills/code-review/README.md= - Quick reference
- =skills/code-review/lenses/= - Bundled copies of lens prompts

** Commands Used
#+begin_src bash
# Orch consensus (failed with 3 models)
uv run orch consensus --temperature 1.0 "..." gemini gpt qwen3

# Orch chat (worked for individual models)
uv run orch chat "..." --model gpt --temperature 1.0
uv run orch chat "..." --model gemini --temperature 1.0

# Test updated extract-metrics script
./skills/worklog/scripts/extract-metrics.sh

# Update skills flake in dotfiles
cd ~/proj/dotfiles && nix flake update skills
#+end_src

** Architecture Notes
- Skill deployment: home-manager symlinks skills from the nix store to ~/.claude/skills/
- Per-repo skills: a .skills manifest plus use-skills.sh creates repo-local symlinks
- Lenses are bundled in the skill but also deployed to ~/.config/lenses/ for direct orch use
- The proto/molecules layer was deemed overhead - a skill is simpler for this use case

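The per-repo mechanism above can be illustrated with a minimal sketch, assuming the .skills manifest lists one skill name per line; the real use-skills.sh, its paths, and its manifest format may differ.

```shell
# Illustrative sketch of repo-local skill linking: read a .skills manifest
# (assumed here to list one skill name per line) and symlink each named
# skill from a central skills checkout into the repo's .claude/skills/.
# All names and paths are assumptions, not the actual script.

link_skills() {
  local manifest=$1 source_dir=$2 target_dir=$3
  mkdir -p "$target_dir"
  while IFS= read -r skill; do
    [ -z "$skill" ] && continue                      # skip blank lines
    ln -sfn "$source_dir/$skill" "$target_dir/$skill"
  done < "$manifest"
}
```

With =ln -sfn=, rerunning the script is idempotent: an existing symlink is replaced rather than nested.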
* Process and Workflow

** What Worked Well
- Orch consensus (when it worked) provided useful multi-model perspective
- Quick iteration: create skill -> deploy -> test on real target (dotfiles flake.nix)
- TodoWrite for tracking the 5 worklog tasks
- Beads for tracking issues and closing them as work completed
- Running code-review on recently modified code as validation

** What Was Challenging
- Orch reliability: qwen hanging, consensus command timing out
- Remote git server down throughout the session (local commits only)
- Context recovery from previous session compaction

* Learning and Insights

** Technical Insights
- orch chat is more reliable than orch consensus for long prompts
- Skills are the right abstraction for Claude Code workflows - simpler than protos
- Shell script changes need a home-manager rebuild to deploy

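The chat-over-consensus workaround noted above can be scripted as a simple parallel fan-out. The wrapper function and output file names below are illustrative; only the =orch chat= invocation shape (=--model=, =--temperature=) comes from this session's commands.

```shell
# Query each model separately in parallel instead of waiting on
# `orch consensus`, capturing per-model transcripts to synthesize by hand.
# The wrapper function and output file names are illustrative.

query_models() {
  local prompt=$1 outdir=$2
  uv run orch chat "$prompt" --model gpt --temperature 1.0 > "$outdir/gpt.out" &
  uv run orch chat "$prompt" --model gemini --temperature 1.0 > "$outdir/gemini.out" &
  wait  # block until both background queries finish
}
```

Because the two queries run as background jobs, total wall time is roughly the slower model's latency rather than the sum of both.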
** Process Insights
- The lens -> issue -> fix cycle works well for incremental cleanup
- Running multiple lenses finds overlapping issues (good for synthesis)
- Interactive review prevents over-filing low-value issues

** Architectural Insights
- Skills repo has 3 layers: skills (prompts), lenses (review prompts), workflows (protos)
- Lenses are conceptually a subset of skills - focused, single-purpose prompts
- The proto/molecule layer currently adds complexity without proportional benefit

* Context for Future Work

** Open Questions
- Should lenses output JSON for structured parsing?
- How to handle orch reliability issues (qwen, timeouts)?
- Should the code-review skill use orch internally or leave it optional?

** Next Steps
- Run code-review on other skills (niri-window-capture has a pending review)
- Worklog cleanup tasks j2a and njb are both done, so all cleanup tasks are now complete
- Address the dotfiles issues filed this session (5 issues in flake.nix)
- Rebuild home-manager to deploy the updated skills

** Related Work
- [[file:2025-12-26-multi-lens-code-review-workflow-testing.org][2025-12-26 Multi-Lens Code Review Testing]] - Created lenses, tested on orch
- [[file:2025-12-24-adr-revision-lsp-research-code-audit.org][2025-12-24 ADR Revision]] - Initial lens creation
- orch-loq: qwen empty responses bug (filed in orch repo)
- bd-k2wg: molecules.jsonl hierarchical loading (filed in beads repo)

* Raw Notes
- Session started from context recovery (previous session compacted)
- GPT recommendation: skill as entrypoint, orch for synthesis only, JSON output, interactive by default
- Gemini recommendation: consolidate into skills/, single agent explores, orch at end for filtering
- Both agreed: delete proto, make skill, interactive review
- Worklog cleanup tasks all came from the earlier lens review (2025-12-25)
- extract-metrics.sh output header changed from "Session Metrics" to "Git Context"

** Orch Consensus Key Points
From GPT:
- Skill = primary workflow entrypoint
- Orch = synthesis/filtering only, not for running every lens
- JSON source of truth, markdown is rendering
- Repo-local beads storage to avoid cross-repo pollution

From Gemini:
- Rename lenses to skills (we kept them separate)
- Single agent explores, orch filters at end
- "Driver" pattern - human approves before filing
- Delete proto as unused complexity

** Commits This Session
1. feat: add /code-review skill with bundled lenses
2. docs: add code-review to skills list (ai-skills.nix)
3. feat: add code-review skill (dotfiles)
4. chore: update skills flake (dotfiles)
5. refactor(worklog): consolidate skill prompt
6. refactor(worklog): consolidate git commands into script

* Session Metrics
- Commits made: 6 (across skills and dotfiles repos)
- Files touched: 20
- Lines added/removed: +829/-70
- Issues filed: 5 (in dotfiles)
- Issues closed: 8 (proto) + 5 (worklog) = 13
- Tests added: 0