skills/specs/001-screenshot-analysis/RESET.md
dan 5fea49b7c0 feat(tufte-press): evolve skill to complete workflow with JSON generation and build automation
- Transform tufte-press from reference guide to conversation-aware generator
- Add JSON generation from conversation context following strict schema
- Create build automation scripts with Nix environment handling
- Integrate CUPS printing with duplex support
- Add comprehensive workflow documentation

Scripts added:
- skills/tufte-press/scripts/generate-and-build.sh (242 lines)
- skills/tufte-press/scripts/build-card.sh (23 lines)

Documentation:
- Updated SKILL.md with complete workflow instructions (370 lines)
- Updated README.md with usage examples (340 lines)
- Created SKILL-DEVELOPMENT-STRATEGY-tufte-press.md (450 lines)
- Added worklog: 2025-11-10-tufte-press-skill-evolution.org

Features:
- Agent generates valid JSON from conversation
- Schema validation before build (catches errors early)
- Automatic Nix shell entry for dependencies
- PDF build via tufte-press toolchain
- Optional print with duplex support
- Self-contained margin notes enforced
- Complete end-to-end testing

Workflow: Conversation → JSON → Validate → Build → Print

Related: niri-window-capture, screenshot-latest, worklog skills
2025-11-10 15:03:44 -08:00

4.4 KiB

Specification Reset: Screenshot Analysis

The Over-Engineering Problem

Lines of specification: 635 lines across 3 documents Lines of code needed: ~10-20 lines of bash

What happened: Classic solution-first thinking instead of problem-first thinking.

The Actual Problem Statement

User workflow:

  1. Press Mod4+S → select region → space
  2. Screenshot saved to ~/Pictures/Screenshots/
  3. Want AI to see it immediately
  4. Don't want to type ~/Pictures/Screenshots/filename.png every time

User says: "Look at my last screenshot" AI needs: The image file

Reality Check

One-Line Solution

ls -t ~/Pictures/Screenshots/*.{png,jpg,jpeg} 2>/dev/null | head -1

This already works. Finding the latest file is solved.

What We Actually Need

Option 1: Helper script (if file-based is fine)

#!/usr/bin/env bash
# skills/screenshot-latest
ls -t ~/Pictures/Screenshots/*.{png,jpg,jpeg} 2>/dev/null | head -1

Option 2: Direct capture (if we want to skip files)

#!/usr/bin/env bash
# skills/screenshot-capture
grim - | wl-copy  # Capture to clipboard
# or
grim - > /tmp/screen-$(date +%s).png && echo "/tmp/screen-$(date +%s).png"

Questions We Should Have Asked First

  1. Is finding files the real problem?

    • No. Finding the latest file is trivial (ls -t | head -1)
  2. Do we need the file at all?

    • Can we capture directly from compositor? YES (grim -)
    • Can AI read from clipboard? UNKNOWN
    • Can AI read from stdin? UNKNOWN
  3. What's the actual pain point?

    • Typing paths? → Solved with 1-line helper
    • File management? → Not mentioned by user
    • Latency? → Not mentioned by user
    • Memory usage? → Files are already on disk
  4. What does "skill" mean in this context?

    • Is it a bash script? (seems like it)
    • Is it an OpenCode integration? (unclear)
    • Is it a prompt template? (maybe?)

The Specification Trap

We wrote:

  • 11 functional requirements
  • 3 user stories with acceptance scenarios
  • 5 success criteria
  • 82 implementation tasks
  • 4 bash scripts with full test coverage

We should have written:

  • "Find latest screenshot: ls -t ~/Pictures/Screenshots/*.png | head -1"
  • "Test: Create temp dir, touch files, verify script returns newest"
  • "Done"

Root Cause Analysis

Failure mode: Applied enterprise feature specification process to a 10-line script

Why it happened:

  1. Used /speckit.specify tool without calibrating scope
  2. Answered spec template questions instead of questioning the template
  3. Generated tasks from requirements instead of questioning requirements
  4. Focused on "how to do it properly" instead of "should we even do this"

Path Forward

Immediate Actions

  1. Clarify skill definition

    • What is a "skill" in the OpenCode/Claude Code context?
    • Is it a bash script, prompt template, or integration?
  2. Test direct capture

    • Can grim - | base64 be piped to AI?
    • Can AI read from clipboard via wl-paste?
    • What's the actual integration mechanism?
  3. Verify actual user workflow

    • Does user want file-finding or screen-capture?
    • Is this about past screenshots or current screen?
    • Is this about "show AI what I see" or "find old screenshots"?

Decision Tree

Do we need to capture NEW screens?
├─ YES → Use `grim -` for direct capture
│         └─ Can AI read from clipboard/stdin?
│            ├─ YES → Skip files entirely
│            └─ NO  → Capture to /tmp/screenshot.png
└─ NO  → Finding existing files
          └─ One-liner: ls -t ~/Pictures/Screenshots/*.png | head -1

Recommendations

Stop:

  • Implementing the 82-task plan
  • Building config file system
  • Creating time-based filtering
  • Writing comprehensive test suites

Start:

  1. Write 10-line proof-of-concept
  2. Test with actual AI workflow
  3. Observe what breaks
  4. Fix that one thing
  5. Ship it

Success criteria: User types "show me the screen" and sees analysis in <2 seconds.

Not success criteria:

  • ✗ Handles 1000+ files efficiently
  • ✗ Supports Nth screenshot lookup
  • ✗ Configurable via JSON
  • ✗ Has 80% test coverage
  • ✗ Follows enterprise best practices

Next Session Goal

Single question to answer: What's the simplest thing that makes "look at my last screenshot" work?

Acceptance: User says "that works, thanks"

Not acceptance: Comprehensive framework for screenshot management with plugin architecture