# Branch-per-Worker Isolation Design

**Status**: Draft
**Bead**: skills-roq
**Epic**: skills-s6y (Multi-agent orchestration: Lego brick architecture)

## Overview

This document defines the git branching strategy for multi-agent coordination. Each worker operates in an isolated git worktree on a dedicated branch, with mandatory rebase before review.

## Design Principles

1. **Orchestrator controls branch lifecycle** - Creates, assigns, cleans up
2. **Worktrees for parallelism** - Each worker gets isolated directory
3. **Integration branch as staging** - Buffer before main
4. **SQLite = process truth, Git = code truth** - Don't duplicate state
5. **Mandatory rebase** - Fresh base before review (consensus requirement)

## Key Decisions

### Branch Naming: `type/task-id`

**Decision**: Use `type/task-id` (e.g., `feat/skills-abc`, `fix/skills-xyz`)

**Rationale** (2/3 consensus):
- Branch describes *work*, not *worker*
- Survives task reassignment (if Claude fails, Gemini can continue)
- Worker identity in commit author: `Author: claude-3.5`

**Rejected alternative**: `worker-id/task-id` - becomes misleading on reassignment

### Worktrees vs Checkout

**Decision**: Git worktrees (parallel directories)

**Rationale** (3/3 consensus):
- `git checkout` updates working directory globally
- If Worker A checks out while Worker B writes, corruption possible
- Worktrees share `.git` object database, isolate filesystem
- Maps cleanly to "one worker = one workspace"

```
/project/
├── .git/                    # Shared object database
├── worktrees/
│   ├── skills-abc/          # Worker 1's worktree
│   │   └── (full working copy)
│   └── skills-xyz/          # Worker 2's worktree
│       └── (full working copy)
└── (main working copy)      # For orchestrator
```

### Integration Branch

**Decision**: Use rolling `integration` branch as staging before `main`

**Rationale** (3/3 consensus):
- AI agents introduce subtle regressions
- Integration branch = "demilitarized zone" for combined testing
- Human review before promoting to main
- Allows
batching of related changes

```
main        ────────────────────────●─────────────
                                    ↑
integration ────●────●────●────●────●
                ↑    ↑    ↑    ↑
feat/T-101  ────●    │    │    │
feat/T-102  ─────────●    │    │
fix/T-103   ──────────────●    │
feat/T-104  ───────────────────●
```

### Conflict Handling

**Decision**: Worker resolves trivial conflicts, escalates semantic conflicts

**Rationale** (2/3 consensus - flash-or, gpt):
- Blanket "never resolve" is safe but slows throughput
- Mechanical conflicts (formatting, imports, non-overlapping) are safe
- Logic conflicts require human judgment

**Rules**:

```python
def handle_rebase_conflict(conflict_info):
    # Trivial: resolve automatically
    if is_trivial_conflict(conflict_info):
        resolve_mechanically()
        run_tests()
        if tests_pass():
            continue_rebase()
        else:
            abort_and_escalate()
    # Semantic: always escalate
    else:
        git_rebase_abort()
        set_state(CONFLICTED)
        notify_orchestrator()
```

**Trivial conflict criteria**:
- Only whitespace/formatting changes
- Import statement ordering
- Non-overlapping edits in same file
- Less than N lines changed in conflict region

**Escalate if**:
- Conflict touches core logic
- Conflict spans multiple files
- Test failures after resolution
- Uncertain about correctness

### State Machine Mapping

**Decision**: SQLite is process truth, Git is code truth

**Rationale** (2/3 consensus - gemini, gpt):
- Don't encode state in Git (tags, notes) - causes sync issues
- Observable Git signals already exist:

| Worker State | Git Observable |
|--------------|----------------|
| ASSIGNED | Branch exists, worktree created |
| WORKING | New commits appearing |
| IN_REVIEW | Branch pushed, PR opened (or flag in SQLite) |
| APPROVED | PR approved |
| COMPLETED | Merged to integration/main |
| CONFLICTED | Rebase aborted, no new commits |

**Link via task-id**: Commit trailers connect the two:

```
feat: implement user authentication

Task: skills-abc
Agent: claude-3.5
```

### Cross-Worker Dependencies

**Decision**: Strict serialization - don't depend on uncommitted
work

**Rationale** (3/3 consensus):
- "Speculative execution" creates a house of cards
- If A's code is rejected, B's work becomes invalid
- Cheaper to wait than to waste tokens on orphaned code

**Pattern for parallel work on related features**:
1. Orchestrator creates epic branch: `epic/auth-system`
2. Both workers branch from epic: `feat/T-101`, `feat/T-102`
3. Workers rebase onto epic, not main
4. Epic merged to integration when all tasks complete

### Branch Cleanup

**Decision**: Delete after merge, archive failures

**Rationale** (3/3 consensus):
- Prevent branch bloat
- Archive failures for post-mortem analysis

```bash
# On successful merge
# (remove the worktree first: git refuses to delete a branch
# that is still checked out in a worktree)
git worktree remove worktrees/T-101
git branch -d feat/T-101

# On failure/abandonment
# (add --force to the remove if the worktree has uncommitted changes)
git worktree remove worktrees/T-101
git branch -m feat/T-101 archive/T-101-$(date +%Y%m%d)
```

## Workflow

### 1. Task Assignment

Orchestrator prepares workspace:

```bash
# 1. Fetch latest
git fetch origin

# 2. Create branch from integration
git branch feat/$TASK_ID origin/integration

# 3. Create worktree
git worktree add worktrees/$TASK_ID feat/$TASK_ID

# 4. Update SQLite (Python call on the message bus, not a shell command)
#    publish(db, 'orchestrator', {
#        'type': 'task_assign',
#        'to': worker_id,
#        'correlation_id': task_id,
#        'payload': {
#            'branch': f'feat/{task_id}',
#            'worktree': f'worktrees/{task_id}'
#        }
#    })
```

### 2. Worker Starts

Worker receives assignment:

```bash
cd worktrees/$TASK_ID

# Confirm environment
git status
git log --oneline -3

# Begin work...
```

### 3. Worker Commits

During work:

```bash
git add -A
git commit -m "feat: implement feature X

Task: $TASK_ID
Agent: $AGENT_ID"
```

Update SQLite:

```python
publish(db, agent_id, {
    'type': 'state_change',
    'correlation_id': task_id,
    'payload': {'from': 'ASSIGNED', 'to': 'WORKING'}
})
```

### 4. Pre-Review Rebase (Mandatory)

Before requesting review:

```bash
# 1. Fetch latest integration
git fetch origin integration

# 2. Attempt rebase, 3. handle result
if git rebase origin/integration; then
    # Success - push and request review
    # (a branch that was already pushed needs --force-with-lease after a rebase)
    git push -u origin feat/$TASK_ID
    # Update SQLite: IN_REVIEW
else
    # Conflict - check if trivial
    # (is_trivial_conflict and resolve_and_continue are placeholder helpers)
    if is_trivial_conflict; then
        resolve_and_continue
    else
        git rebase --abort
        # Update SQLite: CONFLICTED
    fi
fi
```

### 5. Review

Review happens (human or review-gate):

```python
# Check review state
review = get_review_state(task_id)

if review['decision'] == 'approved':
    publish(db, 'reviewer', {
        'type': 'review_result',
        'correlation_id': task_id,
        'payload': {'decision': 'approved'}
    })
elif review['decision'] == 'changes_requested':
    publish(db, 'reviewer', {
        'type': 'review_result',
        'correlation_id': task_id,
        'payload': {
            'decision': 'changes_requested',
            'feedback': review['comments']
        }
    })
    # Worker returns to WORKING state
```

### 6. Merge

On approval:

```bash
# Orchestrator merges to integration
git checkout integration
git merge --no-ff feat/$TASK_ID -m "Merge feat/$TASK_ID: $TITLE"
git push origin integration

# Cleanup (worktree first: git refuses to delete a branch
# that is still checked out in a worktree)
git worktree remove worktrees/$TASK_ID
git branch -d feat/$TASK_ID
git push origin --delete feat/$TASK_ID
```

### 7. Promote to Main

Periodically (or per-task):

```bash
# When integration is green
git checkout main
git merge --ff-only integration
git push origin main
```

## Directory Structure

```
/project/
├── .git/
├── .worker-state/
│   ├── bus.db               # SQLite message bus
│   └── workers/
│       └── worker-auth.json
├── worktrees/               # Worker worktrees (gitignored)
│   ├── skills-abc/
│   └── skills-xyz/
└── (main working copy)
```

Add to `.gitignore`:

```
worktrees/
```

## Conflict Resolution Script

```bash
#!/bin/bash
# scripts/try-rebase.sh
# Usage: try-rebase.sh TASK_ID [TARGET_BRANCH]
# Exit codes: 0 = rebased, 1 = worktree not found, 2 = CONFLICTED

TASK_ID=$1
TARGET_BRANCH=${2:-origin/integration}

cd "worktrees/$TASK_ID" || exit 1
git fetch origin

# Attempt rebase
if git rebase "$TARGET_BRANCH"; then
    echo "Rebase successful"
    exit 0
fi

# Check conflict severity (grep -c . counts non-empty lines,
# so an empty file list counts as 0 rather than 1)
CONFLICT_FILES=$(git diff --name-only --diff-filter=U)
CONFLICT_COUNT=$(printf '%s\n' "$CONFLICT_FILES" | grep -c .)

# Trivial: at most two conflicted files
if [ "$CONFLICT_COUNT" -le 2 ]; then
    # Try automatic resolution for whitespace/formatting.
    # Caution: during a rebase, --theirs is the worker's commit being
    # replayed, so this keeps the worker's whole file and drops the
    # target branch's edits to it.
    for file in $CONFLICT_FILES; do
        if git checkout --theirs -- "$file" 2>/dev/null; then
            git add "$file"
        else
            echo "Cannot auto-resolve: $file"
            git rebase --abort
            exit 2  # CONFLICTED
        fi
    done

    # GIT_EDITOR=true keeps --continue from opening an editor
    if GIT_EDITOR=true git rebase --continue; then
        echo "Auto-resolved trivial conflicts"
        exit 0
    fi
fi

# Non-trivial: abort and escalate
git rebase --abort
echo "Conflict requires human intervention"
exit 2  # CONFLICTED
```

## Integration with State Machine

| State | Git Action | SQLite Message |
|-------|------------|----------------|
| IDLE → ASSIGNED | Branch + worktree created | `task_assign` |
| ASSIGNED → WORKING | First commit | `state_change` |
| WORKING → IN_REVIEW | Rebase success + push | `review_request` |
| WORKING → CONFLICTED | Rebase failed | `state_change` + `escalate` |
| IN_REVIEW → APPROVED | Review passes | `review_result` |
| IN_REVIEW → WORKING | Changes requested | `review_result` |
| APPROVED → COMPLETED | Merged | `task_done` |

## Open Questions

1. **Worktree location**: `./worktrees/` or `/tmp/worktrees/`?
2. **Integration → main cadence**: Per-task, hourly, daily, manual?
3. **Epic branches**: How complex should the epic workflow be?
4. **Failed branch retention**: How long to keep archived branches?

## References

- OpenHands: https://docs.openhands.dev/sdk/guides/iterative-refinement
- Gastown worktrees: https://github.com/steveyegge/gastown
- Git worktrees: https://git-scm.com/docs/git-worktree
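
## Appendix: Trivial-Conflict Check (Sketch)

The "trivial conflict criteria" under Conflict Handling are stated only in prose, so the following is a minimal sketch of one possible reading, not the project's implementation. It treats a file as trivially conflicted when every conflict region is below a size cap and differs only in whitespace or in the order of import lines. The function names, the `max_lines` default of 10 (standing in for the unspecified N), and the choice to combine the criteria this way are all assumptions made for illustration.

```python
def conflict_regions(text):
    """Yield (ours, theirs) line lists for each git conflict region in text."""
    ours, theirs, side = [], [], None
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            ours, theirs, side = [], [], "ours"
        elif line.startswith("||||||| ") and side:
            side = "base"  # diff3-style base section, ignored here
        elif line == "=======" and side:
            side = "theirs"
        elif line.startswith(">>>>>>> ") and side:
            yield ours, theirs
            side = None
        elif side == "ours":
            ours.append(line)
        elif side == "theirs":
            theirs.append(line)


def _normalize(lines):
    # Collapse runs of whitespace and drop blank lines.
    return [" ".join(line.split()) for line in lines if line.strip()]


def _all_imports(lines):
    # True only for a non-empty list made up entirely of import statements.
    return bool(lines) and all(
        line.startswith(("import ", "from ")) for line in lines
    )


def is_trivial_conflict(text, max_lines=10):
    """Hypothetical check; max_lines stands in for the unspecified N."""
    regions = list(conflict_regions(text))
    if not regions:
        return False  # nothing to resolve
    for ours, theirs in regions:
        if len(ours) + len(theirs) >= max_lines:
            return False  # region too large to call trivial
        a, b = _normalize(ours), _normalize(theirs)
        whitespace_only = a == b
        import_reorder = _all_imports(a + b) and sorted(a) == sorted(b)
        if not (whitespace_only or import_reorder):
            return False
    return True
```

A caller would pass the contents of each conflicted file and escalate as soon as any file returns `False`. The "non-overlapping edits in same file" criterion is left out because git normally merges such edits without raising a conflict.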