Commit graph

48 commits

Author SHA1 Message Date
dan d463fe58bb docs: add worklog for skills directory standardization 2026-01-14 19:18:25 -08:00
dan 3cc7540c46 docs: add worklog for HQ deployment and Codex skills integration
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 06:56:49 -08:00
dan e6d777e589 feat: add Codex per-repo skills support
- use-skills.sh: symlink to $CODEX_HOME/skills when CODEX_HOME is set
- docs: update PER-REPO-SKILLS.md and RFC-SKILLS-MANIFEST.md with Codex flow
- hq: add model configuration section (sonnet-4.5, Claude Max)
- hq: update launch commands with explicit --model flag

Closes skills-legi

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 06:50:56 -08:00
dan dcb29c12cd docs: add HQ deployment and worker release documentation
- Add docs/releasing-worker.md with build and release process
- Add pkgs/worker/default.nix template for nix packaging
- Add skills/hq/README.md with installation and deployment guide
- Update skills/hq/SKILL.md with detailed requirements table
- Add hq and review-gate to skills.nix
- Add releases/ to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-12 15:36:01 -08:00
dan 0bffc07b5f docs: add worklog for HQ architecture and beads cleanup session
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 21:40:07 -08:00
dan 4da1890fc3 docs: add scenario schema and example test fixtures
Scaffold for agent capability benchmark harness (skills-qng9):
- docs/specs/scenario-schema.md: YAML schema for test scenarios
- tests/scenarios/: Easy, medium, hard example scenarios
- tests/fixtures/: Python fixtures for testing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 21:24:28 -08:00
dan 7c7733bc64 docs: clarify VPS deployment goal for skills
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 21:24:13 -08:00
dan d3d23f0cd8 docs: add worklog for worker CLI cleanup session
Documents exit code fixes, P3/P4 refactors, and tech debt resolution.
10 commits, 13 files, +980/-241 lines, 56/56 tests passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 16:04:08 -08:00
dan 61cce7e380 refactor: extract helpers and add developer docs
- Add optFromDb generic helper for nullable DbValue conversion
- Extract rowToWorkerInfo helper in state.nim (dedup getWorker/getAllWorkers)
- Extract loadContextFromPath helper in context.nim (dedup readContext/findContext)
- Simplify poll() using optFromDb for optional field extraction
- Add developer guide for compiling, testing, and deployment

Closes: skills-r3k, skills-luzk, skills-dszn, skills-m0e2

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 15:34:15 -08:00
dan daec0f3b85 docs: ADR-006 Nim language choice for worker CLI
Documents rationale for using Nim with ORC for the worker coordination
CLI: fast startup, single binary, Python-like syntax, excellent SQLite
support via tiny_sqlite, CLI generation via cligen.

Closes skills-q40

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 23:27:32 -08:00
dan 1c66d019bd feat: add worker CLI scaffold in Nim
Multi-agent coordination CLI with SQLite message bus:
- State machine: ASSIGNED -> WORKING -> IN_REVIEW -> APPROVED -> COMPLETED
- Commands: spawn, start, done, approve, merge, cancel, fail, heartbeat
- SQLite WAL mode, dedicated heartbeat thread, channel-based IPC
- cligen for CLI, tiny_sqlite for DB, ORC memory management

Design docs for branch-per-worker, state machine, message passing,
and human observability patterns.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 18:47:47 -08:00
dan d6c47c26f5 docs: worklog for multi-agent Lego architecture design session
Covers: review-gate Stop hook fixes, circuit breaker, research on
OpenHands/Gastown/JWZ patterns, epic creation, phased approach

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 13:35:58 -08:00
dan 75c5edb86c docs: cross-agent enforcement architecture design
Comprehensive design covering:
- Abstract layers (message passing, memory, enforcement)
- Four enforcement strategies:
  - Hook-based (Claude/Gemini)
  - Orchestrator-enforced (OpenCode/Codex)
  - Validator sidecar (universal)
  - Proxy-based (API interception)
- Circuit breakers (semantic drift, three-strike, budget)
- Adversarial reviewer pattern
- State flow diagram
- Implementation phases

Based on web research via orch (gemini --websearch).

Addresses: skills-8sj

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 19:51:09 -08:00
dan 8c033eedd1 docs: add Gemini path fix (includeDirectories setting)
Gemini CLI can access ~/.claude/skills/ via:
  settings.json: { "context": { "includeDirectories": ["~/.claude/skills"] } }
  or CLI: gemini --include-directories ~/.claude/skills

Closes: skills-8nl, skills-bo8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 19:35:28 -08:00
dan c14075ae7e docs: web research on cross-agent patterns (via orch)
Key findings from gemini --websearch:
- Manager-Worker orchestration (Maestro pattern)
- alice/idle adversarial review gates (emes)
- Git-as-state for agent coordination
- tissue for machine-first issue tracking
- Circuit breakers: semantic drift, three-strike, budget limits
- Sandboxing: Wasm and Docker playgrounds

Validates our direction: beads, orch, file-based coordination.
Gaps: orchestrator-enforced gates, agent messaging, sandboxing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 17:50:37 -08:00
dan ec2d856c05 docs: add agent capability matrix for cross-agent design
Comprehensive comparison of Claude Code, Gemini CLI, OpenCode, and Codex:
- Hooks/lifecycle events (Claude/Gemini best, OpenCode most comprehensive)
- Subagent spawning (MCP is universal bridge)
- File access (Gemini has path restrictions - skills-bo8)
- Sandboxing (Codex has OS-level, others approval-based)
- State persistence (need external store for cross-agent)

Key finding: Orchestrator pattern works across all agents.
Stop hooks only in Claude/Gemini - others need protocol-based gates.

Closes: skills-fqu

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 17:32:17 -08:00
dan 4773abe56f docs: correct alice framing - adversarial agent review for automation
alice is for reviewing AGENT work in unattended/autonomous contexts,
not code review. Key use cases:
- Autonomous runs on ops-jrz1
- CI/CD pipelines with agents
- High-stakes changes without human oversight

Added hybrid approach recommendation: use alice concepts (Stop hook,
adversarial methodology) with our infrastructure (beads, orch).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 16:45:49 -08:00
dan 239c758dc7 docs: research idle/alice quality gate mechanism
Comprehensive analysis of emes idle/alice plugin:
- Hook chain (6 hooks, Stop is key blocker)
- State management via jwz (topic-based messaging)
- alice agent (read-only Opus reviewer)
- Circuit breakers against infinite loops

Conclusion: alice pattern is overkill for code-review (we ARE the
reviewer). More useful: "review reminder" hook that checks if
code-review was run before exit on significant changes.

Closes: skills-9jk

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 16:43:46 -08:00
dan a198b31add docs: clarify deployment strategy (beads local, tissue remote)
Local (skills, dotfiles): beads + our dual-publish
Remote (ops-jrz1 VPS): tissue + emes ecosystem

They coexist by environment, not replacing each other.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 16:06:45 -08:00
dan 8a76f4e9cb docs: add plugin systems comparison (ours vs claude vs emes)
Compares three approaches:
- Our system: cross-agent, Nix, lenses
- Claude plugins: official, hooks, marketplace
- emes: mechanical enforcement, tissue, idle, jwz

Living document for iterating on architecture.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 15:39:19 -08:00
dan a84066c1dd docs: add ADR-005 for dual-publish plugin architecture
Captures the decision to maintain both:
- Nix deployment (cross-agent: Gemini, OpenCode)
- Claude plugin system (hooks, marketplace)

Documents trade-offs, consequences, and mitigations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 11:44:15 -08:00
dan d97c96be4c feat: add marketplace.json and update dual-publish guide
- Create .claude-plugin/marketplace.json at repo root
- Register orch as first dual-publish plugin
- Update emes-conversion-guide.md to explain dual-publish pattern
- Cross-agent support (Gemini, OpenCode) via Nix
- Claude plugin system for hooks and /plugin install UX

Part of skills-6x1 (emes plugin architecture epic)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 11:21:32 -08:00
dan f24b2bb518 feat: add emes plugin structure to orch skill
- Add .claude-plugin/plugin.json with metadata
- Copy SKILL.md to skills/orch.md for auto-discovery
- Keep original SKILL.md for Nix backward compat
- Add emes-conversion-guide.md documenting the pattern

Part of skills-6x1 (emes plugin architecture epic)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 11:03:37 -08:00
dan ff1d294d59 docs: worklog for agent update tests and bug fix 2026-01-02 02:30:03 -08:00
dan 1e9d6cb93d docs: add first markdown-format worklog
Tests the new worklog template with YAML frontmatter.
Documents ops-review Phase 3 completion and worklog migration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 21:17:27 -08:00
dan fa97fca041 feat: complete ops-review skill with all 10 lenses
Phase 2 lenses (reliability):
- idempotency: safe re-run, atomic ops, convergence
- supply-chain: pinning, provenance, build-time network
- observability: health checks, logging, metrics

Phase 3 lenses (architecture):
- nix-hygiene: statix/deadnix patterns, module design
- resilience: timeouts, retries, resource limits
- orchestration: ordering, dependencies, coupling

All lenses validated via orch consensus (gemini, gpt, flash-or).
Testing delegated to target repos: dotfiles-je5, prox-setup-kqg.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 21:02:39 -08:00
dan fb882a9434 feat: add ops-review skill with Phase 1 lenses
Multi-lens review skill for operational infrastructure (Nix, shell,
Docker, CI/CD). Modeled on code-review with linter-first hybrid
architecture.

Phase 1 lenses (core safety):
- secrets: credential exposure, Nix store, Docker layers, CI masking
- shell-safety: shellcheck-backed, temp files, guard snippets
- blast-radius: targeting/scoping, dry-run, rollback
- privilege: least-privilege, containers, systemd sandboxing

Design reviewed via orch consensus (sonar, flash-or, gemini, gpt).
Lenses deploy to ~/.config/lenses/ops/ via home-manager.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 17:36:24 -08:00
dan 34afa86b77 docs: worklog for vision benchmark, orch patterns, claude-search 2025-12-29 20:45:01 -05:00
dan bd83887669 research: vision model UI understanding benchmark
Tested Claude Opus 4.5 on btop and GitHub screenshots.
Findings: excellent text/state/layout, approximate coordinates.
Recommendation: hybrid AT-SPI + vision approach.
2025-12-29 15:26:13 -05:00
dan 324b4a6fa3 docs: worklog for issue triage and playwright-visit implementation
Session covered:
- 9 issues closed (design decisions + implementation)
- playwright-visit skill created and tested
- READMEs added for web-search and web-research
- ADR-001 parked, multiple design questions resolved

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 01:22:19 -05:00
dan c7c6bbf796 docs: park ADR-001 skills-molecules integration
Current simpler approach working well:
- Skills as standalone entrypoints
- Agent judgment sufficient for invocation
- Molecules not actively used

Revisit when complex orchestration is needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 23:27:41 -05:00
dan e209db9230 docs: worklog for niri fixes, opencode research, readme update
Session covered:
- skills-m21: niri-window-capture robustness improvements
- skills-czz: OpenCode agents research
- skills-4yn: screenshot-latest deployment
- skills-a23: README update with all 14 skills

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 22:10:24 -05:00
dan fb5e3af8e1 docs: worklog for code-review skill creation and worklog cleanup 2025-12-28 00:06:38 -05:00
dan 2103e0994d docs: worklog for multi-lens code review workflow testing 2025-12-26 02:04:08 -05:00
dan f8372b1e17 docs: worklog for ADR revision, LSP research, code audit session
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 02:54:02 -05:00
dan c1f644e6a6 ADRs: add skill manifest, versioning, and trace security designs
- ADR-002: Skill manifest format with JSON Schema, path bases, preconditions
- ADR-003: Versioning with Nix store paths, lockfiles, interface contracts
- ADR-004: Trace security with HMAC redaction, entropy detection, trace modes

Refined based on orch consensus feedback from GPT and Gemini.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 20:55:18 -05:00
dan e366343dd7 docs: worklog for wayland desktop automation session 2025-12-17 14:32:20 -08:00
dan 437265b916 feat: add spec-review skill and ai-tools-doctor docs
spec-review: Multi-model review of spec-kit artifacts using orch
- SKILL.md with progressive disclosure pattern
- Review processes: spec, plan, tasks, gate-check
- Prompts for critique, review, and go/no-go decisions

ai-tools-doctor: RFC and implementation report for diagnostics skill

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-15 00:43:52 -08:00
dan 90e72f1095 fix(use-skills): prevent stderr from corrupting symlink targets
Remove 2>&1 from nix build capture. When repo is dirty, nix emits
warnings to stderr which were being merged into $out and used as
symlink targets, creating broken symlinks like:

  orch -> warning: Git tree '...' is dirty\n/nix/store/...

Now stderr goes to terminal, only stdout (store path) captured.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 12:42:26 -08:00
dan def212bc5b docs: complete worklog for doc-review design session
Adds Vale discovery, spin-off decision, migration details,
and updated session metrics to the design session worklog.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 17:54:05 -08:00
dan 139a521a89 doc-review: design session complete, spun off to ~/proj/doc-review
- Added use_api_keys to .envrc for orch access
- Worklog documents full design process
- Beads closed: skills-bcu, skills-1ig, skills-53k, skills-d6r
- Architecture: Vale + LLM hybrid (deterministic + semantic)
- Implementation continues in dedicated repo
2025-12-04 16:44:49 -08:00
dan 148f219887 chore: replace STATIC_DATA.md with upstream, add session worklog
Code review found upstream STATIC_DATA.md was better quality despite
being 5 lines shorter. Added comprehensive session worklog.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 20:43:51 -08:00
dan e921fd96df feat: add per-repo skill deployment pattern
- Add bin/use-skills.sh helper with use_skill and load_skills_from_manifest
- Add .skills manifest pattern for declarative skill configuration
- Fix ai-skills.nix: remove broken npm plugin code, update skill list
- Add update-opencode, web-search, web-research to flake.nix availableSkills
- Add RFC and documentation for team adoption

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-30 14:47:02 -08:00
dan 26a6469604 feat: add web-research skill and automate deployment
Includes:
- New 'web-search' skill
- New 'web-research' skill with multi-backend support (claude, llm, kagi)
- Automated deployment in bin/deploy-skill.sh
- Sops-nix integration for Kagi API key
- Documentation updates
2025-11-23 23:18:32 -08:00
dan 9e544cd03a docs(worklog): final update with all commits 2025-11-10 15:09:13 -08:00
dan 819ec78b33 docs(worklog): update with guardrails commit 2025-11-10 15:06:17 -08:00
dan ec8dba8097 docs(worklog): update commit count for tufte-press evolution session 2025-11-10 15:03:57 -08:00
dan 5fea49b7c0 feat(tufte-press): evolve skill to complete workflow with JSON generation and build automation
- Transform tufte-press from reference guide to conversation-aware generator
- Add JSON generation from conversation context following strict schema
- Create build automation scripts with Nix environment handling
- Integrate CUPS printing with duplex support
- Add comprehensive workflow documentation

Scripts added:
- skills/tufte-press/scripts/generate-and-build.sh (242 lines)
- skills/tufte-press/scripts/build-card.sh (23 lines)

Documentation:
- Updated SKILL.md with complete workflow instructions (370 lines)
- Updated README.md with usage examples (340 lines)
- Created SKILL-DEVELOPMENT-STRATEGY-tufte-press.md (450 lines)
- Added worklog: 2025-11-10-tufte-press-skill-evolution.org

Features:
- Agent generates valid JSON from conversation
- Schema validation before build (catches errors early)
- Automatic Nix shell entry for dependencies
- PDF build via tufte-press toolchain
- Optional print with duplex support
- Self-contained margin notes enforced
- Complete end-to-end testing

Workflow: Conversation → JSON → Validate → Build → Print

Related: niri-window-capture, screenshot-latest, worklog skills
2025-11-10 15:03:44 -08:00