Bug Report: Ralph Loop Iteration Counter Not Incrementing
Date: 2026-01-22
Repo: dotfiles (using skills flake's ralph-wiggum extension)
Extension: ~/.pi/agent/extensions/ralph-wiggum/index.ts
Summary
The Ralph loop iteration counter stays stuck at 1 even when the agent completes work and calls ralph_done. The iteration prompt shows "Iteration 1/50" throughout the entire session, never advancing.
Observed Behavior
- Started ralph loop with the ralph_start tool
- Completed 7 categories of review work (35 lens passes)
- Called ralph_done multiple times after completing work
- Each ralph_done call returned: "Pending messages already queued. Skipping ralph_done."
- Iteration counter never incremented past 1
- Work completed successfully but loop showed "Iteration 1/50" the entire time
- Final completion banner showed "1 iterations" despite doing ~7 logical iterations of work
Root Cause Analysis
In the ralph_done tool's execute function (line ~460):
    async execute(_toolCallId, _params, _onUpdate, ctx) {
      if (!currentLoop) {
        return { content: [{ type: "text", text: "No active Ralph loop." }], details: {} };
      }
      const state = loadState(ctx, currentLoop);
      if (!state || state.status !== "active") {
        return { content: [{ type: "text", text: "Ralph loop is not active." }], details: {} };
      }
      // THIS IS THE PROBLEM
      if (ctx.hasPendingMessages()) {
        return {
          content: [{ type: "text", text: "Pending messages already queued. Skipping ralph_done." }],
          details: {},
        };
      }
      // Iteration only increments AFTER the pending messages check
      state.iteration++;
      // ...
    }
The ctx.hasPendingMessages() check returns true when:
- Other tool calls are batched with ralph_done
- Follow-up messages are queued from previous operations
- Any async operations have pending responses
In practice, this guard ALWAYS triggers during normal agent operation because:
- Agent makes multiple tool calls (read files, run commands, file issues)
- Agent then calls ralph_done
- Previous tool responses create "pending messages"
- Guard triggers, iteration skipped
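The sequence above can be reproduced in isolation with a mock context. This is a minimal sketch, not the extension's code: MockCtx, SketchState, and ralphDoneAsShipped are hypothetical stand-ins, and the assumption that tool results from the current turn count as pending messages is this report's hypothesis, not confirmed behavior of pi.

```typescript
// Hypothetical stand-ins; none of these names come from pi's ExtensionAPI.
interface SketchState {
  iteration: number;
  status: "active" | "completed";
}

class MockCtx {
  private pending: string[] = [];

  // A new agent turn starts with an empty queue.
  startTurn(): void {
    this.pending = [];
  }

  // ASSUMPTION: every tool result in the current turn is queued as a pending message.
  queueToolResult(name: string): void {
    this.pending.push(name);
  }

  hasPendingMessages(): boolean {
    return this.pending.length > 0;
  }
}

// Mirrors the ordering in the excerpt: the guard runs before the increment.
function ralphDoneAsShipped(ctx: MockCtx, state: SketchState): string {
  if (state.status !== "active") {
    return "Ralph loop is not active.";
  }
  if (ctx.hasPendingMessages()) {
    return "Pending messages already queued. Skipping ralph_done.";
  }
  state.iteration++;
  return `Iteration ${state.iteration} started.`;
}

const stuckState: SketchState = { iteration: 1, status: "active" };
const mockCtx = new MockCtx();
const replies: string[] = [];

// Seven logical iterations, each doing ordinary tool work before ralph_done.
for (let turn = 0; turn < 7; turn++) {
  mockCtx.startTurn();
  mockCtx.queueToolResult("read");
  mockCtx.queueToolResult("bash");
  replies.push(ralphDoneAsShipped(mockCtx, stuckState));
}

console.log(stuckState.iteration, replies[0]);
```

Under that assumption the counter stays at 1 after all seven turns, which matches the observed session.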
Impact
- User confusion: Progress appears stuck at iteration 1
- No reflection checkpoints: reflectEvery never triggers since iteration never advances
- Incorrect completion stats: Final banner shows wrong iteration count
- Work document diverges: Agent's actual progress doesn't match Ralph's iteration state
Reproduction Steps
- Start a ralph loop:
    /ralph start test-loop --items-per-iteration 5
- Have the agent do ANY work involving multiple tool calls:
  - Read a few files
  - Run some bash commands
  - Call ralph_done
- Observe: ralph_done returns "Pending messages already queued"
- Check state file:
    cat .ralph/test-loop.state.json | jq .iteration
    # Always returns 1
Proposed Fixes
Option A: Remove the guard entirely
The guard's purpose seems to be preventing duplicate iteration messages, but it's too aggressive:
    // Remove this block entirely
    if (ctx.hasPendingMessages()) {
      return { ... };
    }
Risk: Might cause duplicate prompts if agent calls ralph_done multiple times.
Option B: Increment iteration regardless, only skip prompt delivery
    // Always increment
    state.iteration++;
    saveState(ctx, state);
    updateUI(ctx);
    // Only skip the PROMPT delivery if there are pending messages
    if (ctx.hasPendingMessages()) {
      return {
        content: [{ type: "text", text: `Iteration ${state.iteration} recorded. Prompt deferred due to pending messages.` }],
        details: {},
      };
    }
    // Continue with prompt delivery...
Benefit: Counter stays accurate even if prompt is deferred.
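To make the ordering change concrete, here is a small self-contained sketch of Option B. The names LoopState, PendingAware, DoneResult, and ralphDoneOptionB are illustrative only, and sendPrompt stands in for whatever the extension actually does to deliver the next iteration prompt.

```typescript
// Hypothetical types for illustration; not taken from the extension or from pi.
interface LoopState {
  iteration: number;
}

interface PendingAware {
  hasPendingMessages(): boolean;
}

interface DoneResult {
  text: string;
  promptSent: boolean;
}

function ralphDoneOptionB(
  ctx: PendingAware,
  state: LoopState,
  sendPrompt: (iteration: number) => void,
): DoneResult {
  // Record progress first so the counter never depends on queue state.
  state.iteration++;

  // Only the prompt delivery is conditional on the queue being empty.
  if (ctx.hasPendingMessages()) {
    return {
      text: `Iteration ${state.iteration} recorded. Prompt deferred due to pending messages.`,
      promptSent: false,
    };
  }

  sendPrompt(state.iteration);
  return { text: `Iteration ${state.iteration} started.`, promptSent: true };
}

const busyCtx: PendingAware = { hasPendingMessages: () => true };
const idleCtx: PendingAware = { hasPendingMessages: () => false };
const loopState: LoopState = { iteration: 1 };
const sentPrompts: number[] = [];
const recordPrompt = (n: number): void => {
  sentPrompts.push(n);
};

const deferred = ralphDoneOptionB(busyCtx, loopState, recordPrompt);
const delivered = ralphDoneOptionB(idleCtx, loopState, recordPrompt);

console.log(loopState.iteration, deferred.promptSent, delivered.promptSent, sentPrompts);
```

One consequence worth weighing: with this ordering, repeated ralph_done calls within a single turn would each advance the counter, so the duplicate-call concern from Option A shifts from duplicate prompts to possible over-counting.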
Option C: Check for pending USER messages only
If the context API can distinguish message types (hasPendingUserMessages below is a hypothetical method that may not exist):
    if (ctx.hasPendingUserMessages?.()) { // More specific check
      return { ... };
    }
Benefit: Tool responses wouldn't block iteration.
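As a rough illustration of what a type-aware check could look like, the sketch below tags each queued message with a kind. Everything here (QueuedKind, QueuedMessage, SketchQueue, and hasPendingUserMessages itself) is assumed for the example; whether pi's queue carries this information is one of the open questions in this report.

```typescript
// Hypothetical queue model; pi's real message queue may look nothing like this.
type QueuedKind = "user" | "toolResult" | "followUp";

interface QueuedMessage {
  kind: QueuedKind;
  text: string;
}

class SketchQueue {
  private items: QueuedMessage[] = [];

  push(message: QueuedMessage): void {
    this.items.push(message);
  }

  // Coarse check: anything at all in the queue counts.
  hasPendingMessages(): boolean {
    return this.items.length > 0;
  }

  // Narrow check: only messages typed by the user count.
  hasPendingUserMessages(): boolean {
    return this.items.some((m) => m.kind === "user");
  }
}

const queue = new SketchQueue();
queue.push({ kind: "toolResult", text: "read ok" });

const blockedByAny = queue.hasPendingMessages();
const blockedByUser = queue.hasPendingUserMessages();

queue.push({ kind: "user", text: "please stop" });
const blockedAfterUserInput = queue.hasPendingUserMessages();

console.log(blockedByAny, blockedByUser, blockedAfterUserInput);
```

With only a tool result queued, the coarse check blocks ralph_done while the narrow check lets it through; the narrow check blocks only once real user input is waiting.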
Option D: Use a flag to prevent re-entry
    // At module level
    let ralphDoneInProgress = false;

    // In execute
    if (ralphDoneInProgress) {
      return { content: [{ type: "text", text: "ralph_done already in progress." }], details: {} };
    }
    ralphDoneInProgress = true;
    try {
      // ... do the work
    } finally {
      ralphDoneInProgress = false;
    }
Benefit: Prevents actual re-entry without blocking on unrelated pending messages.
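The flag only helps if a second call can arrive while the first is suspended at an await. A minimal sketch of that situation follows; guardedRalphDone, doneInProgress, and the events log are illustrative names, and the work callback is a placeholder for the real state update and prompt delivery.

```typescript
// Module-level flag, as in Option D. Names are illustrative only.
let doneInProgress = false;
const events: string[] = [];

async function guardedRalphDone(work: () => Promise<void>): Promise<string> {
  if (doneInProgress) {
    events.push("rejected");
    return "ralph_done already in progress.";
  }
  doneInProgress = true;
  events.push("started");
  try {
    // The function suspends here, which is when an overlapping call can arrive.
    await work();
    return "done";
  } finally {
    // Always release the flag, even if work() throws.
    doneInProgress = false;
    events.push("finished");
  }
}

// Placeholder for the real work (state save, prompt delivery).
const placeholderWork = (): Promise<void> => Promise.resolve();

// Two overlapping calls: the second starts before the first has resumed.
const firstCall = guardedRalphDone(placeholderWork);
const secondCall = guardedRalphDone(placeholderWork);

void Promise.all([firstCall, secondCall]).then((results) => {
  console.log(results, events);
});
```

The second call is rejected because the first is still suspended, while a later call made after the first settles would run normally. This guards against overlap only; it does nothing about sequential duplicate calls in the same turn.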
Recommended Fix
Option B seems safest:
- Iteration counter always reflects actual progress
- UI stays accurate
- Prompt delivery can be deferred without losing state
- Backwards compatible
Additional Context
State file after "completion" (iteration stuck at 1):
    {
      "name": "nix-modules-review",
      "taskFile": ".ralph/nix-modules-review.md",
      "iteration": 1,
      "maxIterations": 50,
      "itemsPerIteration": 5,
      "reflectEvery": 0,
      "active": false,
      "status": "completed",
      "startedAt": "2026-01-22T22:49:53.055Z",
      "completedAt": "2026-01-22T22:55:10.628Z"
    }
Actual work completed:
- 7 module categories reviewed
- 5 lenses per category = 35 review passes
- 14 issues filed in beads
- Epic created and closed
The iteration should have been ~7-8, not 1.
Questions for Investigation
- What exactly does ctx.hasPendingMessages() check? Is it documented in pi's ExtensionAPI?
- Is this guard necessary for correctness, or just a precaution?
- Are there other extensions using similar patterns that work correctly?
- Should ralph_done be designed to be called as the ONLY tool in a response (documented behavior)?
Workaround (Current)
The agent can manually copy the completed work doc to .ralph/ and output <promise>COMPLETE</promise>, which triggers completion detection via the agent_end event handler and bypasses ralph_done entirely. This is what happened in the observed session.