#+TITLE: use-skills.sh Symlink Corruption Bug Fix and Orch Invocation Analysis
#+DATE: 2025-12-14
#+KEYWORDS: use-skills, symlink, nix, stderr, orch, direnv, cross-repo-deps, home-manager
#+COMMITS: 0 (uncommitted fix pending)
#+COMPRESSION_STATUS: uncompressed

* Session Summary
** Date: 2025-12-14 (Sunday)
** Focus Area: Diagnosing and fixing skill loading failures in per-repo deployment

* Accomplishments
- [X] Analyzed talu project's skill configuration status
- [X] Diagnosed why orch skill failed to work in talu session (from user-provided transcript)
- [X] Root-caused the symlink corruption bug in use-skills.sh
- [X] Fixed the stderr capture issue that corrupted symlink targets
- [X] Verified fix by reloading skills in talu - symlinks now correct
- [X] Filed skills-fvx (P1 bug) for symlink corruption - closed with fix
- [X] Filed skills-d87 (P2 bug) for orch invocation mechanism
- [X] Filed dotfiles-3to (P2 task) for adding orch to home-manager
- [X] Ran orch consensus to evaluate orch deployment options
- [ ] Commit the use-skills.sh fix (pending)

* Key Decisions
** Decision 1: Fix stderr capture by removing 2>&1
- Context: nix build emits warnings to stderr when repo is dirty
- Options considered:
  1. Remove 2>&1 entirely - let stderr go to terminal, capture only stdout
  2. Filter out warning lines with grep
  3. Redirect stderr to temp file, only show on failure
- Rationale: Option 1 is simplest and correct - nix already shows errors on failure, no need to capture and re-echo
- Impact: Symlinks now contain only the store path, not warning text

** Decision 2: orch belongs in home-manager, not bundled in skill
- Context: orch skill provides documentation but CLI isn't in PATH
- Options considered (via orch consensus with gemini + gpt):
  1. Wrapper script in skill - self-contained but hardcoded path
  2. Global install via home-manager - system tool approach
  3. Per-project direnv PATH - repetitive, fragile for agents
  4. (Gemini suggestion) Build CLI from source in skill package
- Rationale: orch is a general-purpose system tool (like git, rg), not a project-specific dependency. System tools belong in home-manager.
- Impact: Cross-repo coordination needed - skills repo documents, dotfiles repo installs

** Decision 3: Cross-repo dependencies noted in description, not formal
- Context: skills-d87 is blocked by dotfiles-3to, but bd dep doesn't support cross-repo
- Options: Hard dep, soft/formal dep, or text note
- Rationale: Text note is sufficient for human readers, no tooling benefit from more formal tracking
- Impact: skills-d87 description mentions "Blocked by: dotfiles-3to"

* Problems & Solutions
| Problem | Solution | Learning |
|---------|----------|----------|
| Symlinks contained "warning: Git tree is dirty" in target path | Remove 2>&1 from nix build capture - let stderr go to terminal | Shell command substitution captures all stdout, including merged stderr |
| orch command not found when agent tried to use skill | Skill documents tool but doesn't provide it - need global install | Skills can be documentation-only for system tools |
| Can't create formal cross-repo dependency | Note in issue description | bd beads is per-repo; cross-repo tracking is manual |

* Technical Details
** Code Changes
- Total files modified: 1 (bin/use-skills.sh)
- Key files changed:
  - ~bin/use-skills.sh~ - Removed 2>&1 from nix build command

** The Bug
Original code:
#+BEGIN_SRC bash
out=$(nix build --print-out-paths --no-link "${SKILLS_REPO}#${skill}" 2>&1) || {
  echo "use_skill: failed to build ${skill}" >&2
  echo "$out" >&2
  return 1
}
#+END_SRC

When repo is dirty, nix emits warning to stderr. The ~2>&1~ merges stderr into stdout, so ~$out~ becomes:
#+BEGIN_EXAMPLE
warning: Git tree '/home/dan/proj/skills' is dirty
/nix/store/j952hgxixifscafb42vmw9vgdphi1djs-ai-skill-orch
#+END_EXAMPLE

This multiline string with warning becomes the symlink target - completely broken.
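
The capture behavior described above can be reproduced without nix. The following is a minimal sketch, not the code from use-skills.sh: ~fake_build~ is a hypothetical stand-in for ~nix build --print-out-paths~, and the store path and temp directory are made up for illustration.

#+BEGIN_SRC bash
# Hypothetical stand-in for `nix build --print-out-paths`:
# writes a warning to stderr and a path-like string to stdout.
fake_build() {
  echo "warning: Git tree is dirty" >&2
  echo "/nix/store/example-skill"
}

# With 2>&1, the warning is folded into the captured text.
merged=$(fake_build 2>&1)

# Without it, only stdout is captured; the warning goes to the terminal.
clean=$(fake_build)

# Use both strings as symlink targets in a scratch directory.
tmp=$(mktemp -d)
ln -s "$merged" "$tmp/broken"
ln -s "$clean" "$tmp/ok"

readlink "$tmp/broken"   # target includes the warning text
readlink "$tmp/ok"       # target is just the path
#+END_SRC

Neither target exists on disk here, so both links dangle; the point is only what ends up stored as the target string.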

** The Fix
#+BEGIN_SRC bash
out=$(nix build --print-out-paths --no-link "${SKILLS_REPO}#${skill}") || {
  echo "use_skill: failed to build ${skill}" >&2
  return 1
}
#+END_SRC

Now stderr goes to terminal (where warnings belong), stdout captured cleanly.

** Commands Used
#+BEGIN_SRC bash
# Verify talu's skill setup
cat ~/proj/talu/.envrc
cat ~/proj/talu/.skills
ls -la ~/proj/talu/.claude/skills/

# Diagnose broken symlinks
readlink -f ~/proj/talu/.claude/skills/orch  # showed "symlink broken"

# Test the fix
cd ~/proj/talu && rm -rf .claude/skills .opencode/skills
source ~/proj/skills/bin/use-skills.sh && load_skills_from_manifest
ls -la .claude/skills/  # now shows clean paths

# orch consensus for design decision
cd ~/proj/orch && uv run orch consensus "..." gemini gpt --mode vote
#+END_SRC

** Architecture Notes
- Skills system has two layers: skill packages (nix) and skill loading (direnv/bash)
- Skills can be documentation-only (assume tool exists) or bundled (include tool)
- System tools (git, rg, orch) should be globally installed, not per-skill
- Per-repo skill deployment via .skills manifest + direnv

* Process and Workflow
** What Worked Well
- User provided exact transcript of failure - made diagnosis quick
- orch consensus gave useful opposing viewpoints on design decision
- Cross-repo issue filing maintained traceability

** What Was Challenging
- Shell stderr/stdout behavior is easy to get wrong
- Cross-repo dependencies have no formal tooling support

* Learning and Insights
** Technical Insights
- ~$(cmd 2>&1)~ is dangerous when you only want stdout - stderr gets mixed in
- nix build warnings go to stderr even on success
- Symlinks happily accept multiline strings as targets (they just won't resolve)

** Process Insights
- When skill invocation fails, check: (1) symlink validity, (2) skill.md readability, (3) actual tool availability
- orch consensus is useful for getting opposing viewpoints on design decisions

** Architectural Insights
- Distinction between "system tools" and "project tools" helps decide where to install
- Skills documenting system tools don't need to bundle them - just assume they exist
- Cross-repo coordination is a reality; text notes in descriptions are pragmatic

* Context for Future Work
** Open Questions
- Should skills have a way to declare system tool dependencies?
- Would a "skill doctor" tool help diagnose skill loading issues?

** Next Steps
- Commit use-skills.sh fix
- dotfiles team implements dotfiles-3to (add orch to home-manager)
- Then skills-d87 can be closed

** Related Work
- Previous: [[file:2025-11-30-per-repo-skill-deployment-design.org][Per-Repo Skill Deployment Design]]
- Cross-repo: dotfiles-3to (Add orch CLI to home-manager packages)

* Raw Notes
- The failure cascade: dirty repo → nix warning → stderr merged → symlink corrupted → skill.md unreadable → agent doesn't know invocation → bare command fails → hunt for workaround
- User asked "how much do we want to mix nix and agentic dev tooling" - good architectural tension to keep in mind
- Gemini suggested the "proper" fix (build tool in skill), GPT suggested pragmatic fix (global install) - ended up with pragmatic

* Session Metrics
- Commits made: 0 (fix uncommitted)
- Files touched: 2 (bin/use-skills.sh, .beads/issues.jsonl)
- Lines added/removed: +3/-2
- Issues created: 2 (skills-fvx closed, skills-d87 open)
- Cross-repo issues: 1 (dotfiles-3to)