skills/specs/001-screenshot-analysis/RESET.md


# Specification Reset: Screenshot Analysis
## The Over-Engineering Problem
**Lines of specification**: 635 lines across 3 documents
**Lines of code needed**: ~10-20 lines of bash
**What happened**: Classic solution-first thinking instead of problem-first thinking.
## The Actual Problem Statement
**User workflow**:

1. Press Mod4+S → select region → space
2. Screenshot saved to `~/Pictures/Screenshots/`
3. Want AI to see it immediately
4. Don't want to type `~/Pictures/Screenshots/filename.png` every time

**User says**: "Look at my last screenshot"

**AI needs**: The image file
## Reality Check
### One-Line Solution
```bash
ls -t ~/Pictures/Screenshots/*.{png,jpg,jpeg} 2>/dev/null | head -1
```
This already works. Finding the latest file is **solved**.
### What We Actually Need
**Option 1: Helper script** (if file-based is fine)
```bash
#!/usr/bin/env bash
# skills/screenshot-latest
ls -t ~/Pictures/Screenshots/*.{png,jpg,jpeg} 2>/dev/null | head -1
```
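
One caveat with the helper above: when the directory is empty it prints nothing and still exits 0, so a caller cannot tell "no screenshots" from success. A slightly more defensive variant, as a sketch only, could look like this. The function name `latest_screenshot` and the optional directory argument are assumptions for illustration, not part of the original spec.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the newest png/jpg/jpeg in a directory.
# Returns 1 (and prints nothing) when no screenshot is found.
latest_screenshot() {
  local dir="${1:-$HOME/Pictures/Screenshots}"
  local newest="" f
  for f in "$dir"/*.png "$dir"/*.jpg "$dir"/*.jpeg; do
    # Unmatched globs stay literal; the -f check skips them.
    [[ -f "$f" ]] || continue
    # -nt compares modification times (bash conditional operator).
    if [[ -z "$newest" || "$f" -nt "$newest" ]]; then
      newest="$f"
    fi
  done
  [[ -n "$newest" ]] || return 1
  printf '%s\n' "$newest"
}
```

Calling `latest_screenshot` with no argument checks `~/Pictures/Screenshots`. It avoids parsing `ls` output and stays within the 10-20 line budget.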
**Option 2: Direct capture** (if we want to skip files)
```bash
#!/usr/bin/env bash
# skills/screenshot-capture
grim - | wl-copy # Capture to clipboard
# or: capture to a file. Compute the path once so the echoed
# path always matches the file that was written.
out="/tmp/screen-$(date +%s).png"
grim - > "$out" && echo "$out"
```
## Questions We Should Have Asked First
1. **Is finding files the real problem?**
   - No. Finding the latest file is trivial (`ls -t | head -1`)
2. **Do we need the file at all?**
   - Can we capture directly from compositor? YES (`grim -`)
   - Can AI read from clipboard? UNKNOWN
   - Can AI read from stdin? UNKNOWN
3. **What's the actual pain point?**
   - Typing paths? → Solved with 1-line helper
   - File management? → Not mentioned by user
   - Latency? → Not mentioned by user
   - Memory usage? → Files are already on disk
4. **What does "skill" mean in this context?**
   - Is it a bash script? (seems like it)
   - Is it an OpenCode integration? (unclear)
   - Is it a prompt template? (maybe?)
## The Specification Trap
**We wrote**:

- 11 functional requirements
- 3 user stories with acceptance scenarios
- 5 success criteria
- 82 implementation tasks
- 4 bash scripts with full test coverage

**We should have written**:

- "Find latest screenshot: `ls -t ~/Pictures/Screenshots/*.png | head -1`"
- "Test: Create temp dir, touch files, verify script returns newest"
- "Done"
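
The test described in the list above fits in a few lines. This is a sketch under assumptions: `find_latest` and `run_test` are hypothetical names, and the one-liner is wrapped in a function only so the test can point it at a temp directory.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the one-liner so the directory is a parameter.
find_latest() { ls -t "$1"/*.png 2>/dev/null | head -1; }

run_test() {
  local tmp got
  tmp="$(mktemp -d)"
  # Give the two files distinct, known modification times.
  touch -t 202401010000 "$tmp/old.png"
  touch -t 202401020000 "$tmp/new.png"
  got="$(find_latest "$tmp")"
  rm -rf "$tmp"
  [[ "$got" == "$tmp/new.png" ]]
}

run_test && echo PASS || echo FAIL
```

Setting explicit timestamps with `touch -t` avoids relying on creation order, which can tie within the same second.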
## Root Cause Analysis
**Failure mode**: Applied enterprise feature specification process to a 10-line script
**Why it happened**:
1. Used `/speckit.specify` tool without calibrating scope
2. Answered spec template questions instead of questioning the template
3. Generated tasks from requirements instead of questioning requirements
4. Focused on "how to do it properly" instead of "should we even do this"
## Path Forward
### Immediate Actions
1. **Clarify skill definition**
   - What is a "skill" in the OpenCode/Claude Code context?
   - Is it a bash script, prompt template, or integration?
2. **Test direct capture**
   - Can `grim - | base64` be piped to AI?
   - Can AI read from clipboard via `wl-paste`?
   - What's the actual integration mechanism?
3. **Verify actual user workflow**
   - Does user want file-finding or screen-capture?
   - Is this about past screenshots or current screen?
   - Is this about "show AI what I see" or "find old screenshots"?
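
Before running the direct-capture experiments, it helps to confirm which tools are installed. A minimal sketch follows. `probe_tools` is a hypothetical helper name. It only checks `PATH` and says nothing about whether the AI client can consume the output, which remains the open question.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which of the named commands are on PATH.
probe_tools() {
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s: found\n' "$tool"
    else
      printf '%s: missing\n' "$tool"
    fi
  done
}

probe_tools grim wl-copy wl-paste base64
```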
### Decision Tree
```
Do we need to capture NEW screens?
├─ YES → Use `grim -` for direct capture
│   └─ Can AI read from clipboard/stdin?
│       ├─ YES → Skip files entirely
│       └─ NO → Capture to /tmp/screenshot.png
└─ NO → Finding existing files
    └─ One-liner: ls -t ~/Pictures/Screenshots/*.png | head -1
```
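
The same tree can be written as a small function once the two open questions have answers. This is an illustrative sketch: `choose_strategy` and the three strategy labels it prints are made-up names, and the `yes`/`no` arguments stand in for answers that still need to be found by testing.

```bash
#!/usr/bin/env bash
# Hypothetical encoding of the decision tree.
# $1: "yes" if we need to capture NEW screens
# $2: "yes" if the AI can read from clipboard/stdin
choose_strategy() {
  local need_new="$1" ai_reads_stream="$2"
  if [[ "$need_new" == yes ]]; then
    if [[ "$ai_reads_stream" == yes ]]; then
      echo "capture-to-clipboard"   # skip files entirely
    else
      echo "capture-to-tmp-file"    # e.g. /tmp/screenshot.png
    fi
  else
    echo "find-latest-file"         # the ls -t one-liner
  fi
}
```

For example, `choose_strategy no no` prints `find-latest-file`, which is the branch the original user request ("Look at my last screenshot") maps to.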
## Recommendations
**Stop**:

- Implementing the 82-task plan
- Building config file system
- Creating time-based filtering
- Writing comprehensive test suites

**Start**:

1. Write 10-line proof-of-concept
2. Test with actual AI workflow
3. Observe what breaks
4. Fix that one thing
5. Ship it

**Success criteria**: User types "show me the screen" and sees analysis in <2 seconds.

**Not success criteria**:

- Handles 1000+ files efficiently
- Supports Nth screenshot lookup
- Configurable via JSON
- Has 80% test coverage
- Follows enterprise best practices
## Next Session Goal
**Single question to answer**: What's the simplest thing that makes `"look at my last screenshot"` work?
**Acceptance**: User says "that works, thanks"
**Not acceptance**: Comprehensive framework for screenshot management with plugin architecture