skills/specs/001-screenshot-analysis/COMPARISON.md
dan 5fea49b7c0 feat(tufte-press): evolve skill to complete workflow with JSON generation and build automation
- Transform tufte-press from reference guide to conversation-aware generator
- Add JSON generation from conversation context following strict schema
- Create build automation scripts with Nix environment handling
- Integrate CUPS printing with duplex support
- Add comprehensive workflow documentation

Scripts added:
- skills/tufte-press/scripts/generate-and-build.sh (242 lines)
- skills/tufte-press/scripts/build-card.sh (23 lines)

Documentation:
- Updated SKILL.md with complete workflow instructions (370 lines)
- Updated README.md with usage examples (340 lines)
- Created SKILL-DEVELOPMENT-STRATEGY-tufte-press.md (450 lines)
- Added worklog: 2025-11-10-tufte-press-skill-evolution.org

Features:
- Agent generates valid JSON from conversation
- Schema validation before build (catches errors early)
- Automatic Nix shell entry for dependencies
- PDF build via tufte-press toolchain
- Optional print with duplex support
- Self-contained margin notes enforced
- Complete end-to-end testing

Workflow: Conversation → JSON → Validate → Build → Print

Related: niri-window-capture, screenshot-latest, worklog skills
2025-11-10 15:03:44 -08:00

6.9 KiB

Specification vs Implementation: A Reality Check

The Numbers

Metric Original Spec Minimal Implementation Ratio
Total Lines 635 lines 185 lines 3.4x
Specification 635 lines 83 lines (SKILL.md) 7.6x
Implementation 0 lines (not started) 22 lines (bash script)
Task Breakdown 82 tasks 0 tasks (just built it)

Key insight: 635 lines of planning to describe 22 lines of code.

What We Specified

Original spec included:

  • 11 functional requirements (FR-001 through FR-010)
  • 3 prioritized user stories with acceptance scenarios
  • 5 success criteria with measurable outcomes
  • Configuration system (JSON files)
  • Time-based filtering ("screenshot from 5 minutes ago")
  • Nth screenshot lookup ("second-to-last screenshot")
  • Symlink handling policy
  • Comprehensive error handling
  • 82 implementation tasks
  • 4 separate bash scripts
  • Full test coverage with bats
  • Data models and contracts

What we actually needed:

  • Find latest file: ls -t ~/Pictures/Screenshots/*.png | head -1
  • Handle errors: Check if directory exists and has files
  • Done

What We Built

Minimal implementation:

  • 1 SKILL.md file (agent instructions)
  • 1 bash script (22 lines including comments)
  • 1 README (user documentation)
  • 1 example output

Features supported:

  • Find latest screenshot
  • Clear error messages
  • Standard formats (PNG, JPG, JPEG)
  • Works immediately
  • Zero configuration

Features NOT supported (and that's OK):

  • Custom directories
  • Nth screenshot lookup
  • Time-based filtering
  • Configuration files
  • Symlink handling policy

Time Analysis

Specification process:

  • Session 1: Wrote spec.md (~30 minutes)
  • Session 2: Clarified ambiguities (~15 minutes)
  • Session 3: Created plan.md (~30 minutes)
  • Session 4: Generated tasks.md (~20 minutes)
  • Session 5: Analyzed over-engineering (~20 minutes)
  • Total: ~115 minutes

Implementation:

  • Wrote SKILL.md (~10 minutes)
  • Wrote find-latest.sh (~5 minutes)
  • Wrote README.md (~5 minutes)
  • Tested (~2 minutes)
  • Total: ~22 minutes

Ratio: 5.2x more time spent on specification than implementation

Root Cause Analysis

What Went Wrong

  1. Template-Driven Development

    • Used /speckit.specify tool without scope calibration
    • Answered every template question instead of questioning the template
    • Treated "specification" as a deliverable rather than a tool
  2. Solution-First Thinking

    • Jumped to "how to find files properly" before validating "is file-finding the problem"
    • Designed abstractions before writing concrete code
    • Optimized for extensibility we don't need
  3. Over-Engineering Bias

    • Added features user didn't request (time filtering, Nth lookup)
    • Created configuration system for single hard-coded value
    • Planned comprehensive test coverage for 22-line script
  4. Lost Sight of Value

    • User wanted: "Don't make me type paths"
    • We delivered: "Comprehensive screenshot management framework"
    • Forgot to ask: "What's the simplest thing that works?"

What Went Right (After Reset)

  1. Reality Check

    • Tested actual command: ls -t | head -1 (works instantly)
    • Counted spec lines: 635 (ridiculous for this problem)
    • Asked: "What are we actually solving?"
  2. Scope Reduction

    • Removed all SHOULD requirements
    • Removed all "future enhancement" features
    • Focused on single use case: "find latest screenshot"
  3. Build First, Plan Second

    • Wrote the script (22 lines)
    • Tested it (works)
    • Documented it (SKILL.md + README)
    • Shipped it

Lessons Learned

For Specifications

Do:

  • Start with one-liner proof of concept
  • Validate the problem is real before solving it
  • Test actual commands before designing abstractions
  • Question every requirement: "What if we didn't do this?"
  • Write code first, plan later for small problems

Don't:

  • Fill in specification templates without scope calibration
  • Add features user didn't request
  • Optimize for hypothetical future needs
  • Create configuration systems for constants
  • Break down tasks before trying the task

For Development

Do:

  • Solve the specific problem user described
  • Use simplest solution that works
  • Ship quickly, iterate on feedback
  • Document limitations clearly
  • Make it easy to enhance later IF needed

Don't:

  • Build extensibility before you need it
  • Abstract before you have concrete examples
  • Add configuration before you have variations
  • Plan comprehensive testing before you have code
  • Assume you know what users will ask for

Decision Framework

When deciding whether to specify or implement first:

Is the solution obvious?
├─ YES → Write code first (this case)
│         └─ Spec is just documentation of what you built
└─ NO  → Write spec first
          ├─ Multiple teams need coordination
          ├─ Complex domain requiring analysis
          ├─ High risk of rework
          └─ Unclear requirements needing clarification

For this problem:

  • Solution obvious? YES (ls -t | head -1)
  • Multiple teams? NO (just me)
  • Complex domain? NO (file system operations)
  • Risk of rework? LOW (22 lines of bash)
  • Unclear requirements? NO (user described exact problem)

Verdict: Should have written code first.

Success Metrics (Revised)

Original success criteria:

  • SC-001: Work in 100% of cases (unmeasurable without usage data)
  • SC-002: Complete in <1 second for 1000 files (premature optimization)
  • SC-003: Succeed in 95% of requests (can't measure before shipping)
  • SC-004: Clear error messages (this is good, kept it)
  • SC-005: Save 40+ keystrokes (accurate but overcomplicated)

Actual success metric:

  • User says: "look at my screenshot"
  • AI responds:
  • User says: "thanks" (not "that's not the right file")

That's it. That's the only metric that matters.

Recommendation for Future

When given a user request like "make it so I don't have to type paths":

  1. Immediately test: Try one-liner solution
  2. If it works: Ship it with documentation
  3. If it doesn't: THEN write specification

For this problem:

# Time to validate solution: 5 seconds
ls -t ~/Pictures/Screenshots/*.png | head -1

# Time to specify properly: 115 minutes
# Time to implement: 22 minutes
# Time wasted: 93 minutes

Final Thoughts

The best spec is often the code itself.

For simple problems:

  • Code is the specification
  • Tests are the acceptance criteria
  • README is the user story
  • Working software is the proof

Save comprehensive specifications for when you actually need them:

  • Coordinating multiple developers
  • Uncertain problem domains
  • High-risk architectural decisions
  • Compliance/audit requirements

Not for 22-line bash scripts.


This document intentionally created AFTER implementation, not before.