# Specification vs Implementation: A Reality Check

## The Numbers

| Metric | Original Spec | Minimal Implementation | Ratio (spec : implementation) |
|--------|---------------|------------------------|-------|
| Total Lines | 635 lines | 185 lines | 3.4x |
| Specification | 635 lines | 83 lines (SKILL.md) | 7.7x |
| Implementation | 0 lines (not started) | 22 lines (bash script) | n/a |
| Task Breakdown | 82 tasks | 0 tasks (just built it) | ∞ |

**Key insight**: We wrote 635 lines of planning to describe what became 22 lines of code.

## What We Specified

**Original spec included**:
- 11 functional requirements (FR-001 through FR-010)
- 3 prioritized user stories with acceptance scenarios
- 5 success criteria with measurable outcomes
- Configuration system (JSON files)
- Time-based filtering ("screenshot from 5 minutes ago")
- Nth screenshot lookup ("second-to-last screenshot")
- Symlink handling policy
- Comprehensive error handling
- 82 implementation tasks
- 4 separate bash scripts
- Full test coverage with bats
- Data models and contracts

**What we actually needed**:
- Find latest file: `ls -t ~/Pictures/Screenshots/*.png | head -1`
- Handle errors: Check if directory exists and has files
- Done

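The two real requirements (find the newest file, fail clearly when the directory is missing or empty) fit in one small function. The following is a minimal sketch, not the project's actual `find-latest.sh`: the function name `find_latest_screenshot` and the optional directory argument are assumptions added so the example can be run against any folder.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; not the project's actual find-latest.sh.

# Print the newest PNG/JPG/JPEG file in a directory.
# The directory argument is an illustrative assumption; it defaults to
# the path used by the one-liner above.
find_latest_screenshot() {
  local dir="${1:-$HOME/Pictures/Screenshots}"

  if [ ! -d "$dir" ]; then
    echo "Error: directory not found: $dir" >&2
    return 1
  fi

  local latest
  # ls -t lists newest first. Globs with no match are passed to ls
  # literally, so its "No such file" complaints are discarded.
  latest=$(ls -t "$dir"/*.png "$dir"/*.jpg "$dir"/*.jpeg 2>/dev/null | head -n 1)

  if [ -z "$latest" ]; then
    echo "Error: no screenshots found in $dir" >&2
    return 1
  fi

  printf '%s\n' "$latest"
}
```

Calling `find_latest_screenshot` with no argument checks `~/Pictures/Screenshots`. Parsing `ls` output breaks on file names that contain newlines, which is an acceptable trade-off for a screenshots folder but worth knowing about.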
## What We Built

**Minimal implementation**:
- 1 SKILL.md file (agent instructions)
- 1 bash script (22 lines including comments)
- 1 README (user documentation)
- 1 example output

**Features supported**:
- ✅ Find latest screenshot
- ✅ Clear error messages
- ✅ Standard formats (PNG, JPG, JPEG)
- ✅ Works immediately
- ✅ Zero configuration

**Features NOT supported** (and that's OK):
- ❌ Custom directories
- ❌ Nth screenshot lookup
- ❌ Time-based filtering
- ❌ Configuration files
- ❌ Symlink handling policy

## Time Analysis

**Specification process**:
- Session 1: Wrote spec.md (~30 minutes)
- Session 2: Clarified ambiguities (~15 minutes)
- Session 3: Created plan.md (~30 minutes)
- Session 4: Generated tasks.md (~20 minutes)
- Session 5: Analyzed over-engineering (~20 minutes)
- **Total**: ~115 minutes

**Implementation**:
- Wrote SKILL.md (~10 minutes)
- Wrote find-latest.sh (~5 minutes)
- Wrote README.md (~5 minutes)
- Tested (~2 minutes)
- **Total**: ~22 minutes

**Ratio**: about 5.2x as much time spent on specification as on implementation (115 / 22)

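The totals and the ratio can be re-derived from the per-session estimates with a few lines of shell arithmetic. This is only a check of the figures listed in this section; the variable names are illustrative.

```bash
# Per-session estimates, in minutes, copied from the lists in this section.
spec_minutes=$((30 + 15 + 30 + 20 + 20))
impl_minutes=$((10 + 5 + 5 + 2))

# Shell arithmetic is integer-only, so awk does the division.
ratio=$(awk -v s="$spec_minutes" -v i="$impl_minutes" 'BEGIN { printf "%.1f", s / i }')

echo "spec=${spec_minutes}m impl=${impl_minutes}m ratio=${ratio}x"
# prints: spec=115m impl=22m ratio=5.2x
```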
## Root Cause Analysis

### What Went Wrong

1. **Template-Driven Development**
   - Used the `/speckit.specify` tool without scope calibration
   - Answered every template question instead of questioning the template
   - Treated "specification" as a deliverable rather than a tool

2. **Solution-First Thinking**
   - Jumped to "how to find files properly" before validating "is file-finding the problem"
   - Designed abstractions before writing concrete code
   - Optimized for extensibility we didn't need

3. **Over-Engineering Bias**
   - Added features the user didn't request (time filtering, Nth lookup)
   - Created a configuration system for a single hard-coded value
   - Planned comprehensive test coverage for a 22-line script

4. **Lost Sight of Value**
   - User wanted: "Don't make me type paths"
   - We delivered: "Comprehensive screenshot management framework"
   - Forgot to ask: "What's the simplest thing that works?"

### What Went Right (After Reset)

1. **Reality Check**
   - Tested actual command: `ls -t | head -1` (works instantly)
   - Counted spec lines: 635 (ridiculous for this problem)
   - Asked: "What are we actually solving?"

2. **Scope Reduction**
   - Removed all SHOULD requirements
   - Removed all "future enhancement" features
   - Focused on single use case: "find latest screenshot"

3. **Build First, Plan Second**
   - Wrote the script (22 lines)
   - Tested it (works)
   - Documented it (SKILL.md + README)
   - Shipped it

## Lessons Learned

### For Specifications

**Do**:
- ✅ Start with a one-liner proof of concept
- ✅ Validate the problem is real before solving it
- ✅ Test actual commands before designing abstractions
- ✅ Question every requirement: "What if we didn't do this?"
- ✅ Write code first, plan later for small problems

**Don't**:
- ❌ Fill in specification templates without scope calibration
- ❌ Add features the user didn't request
- ❌ Optimize for hypothetical future needs
- ❌ Create configuration systems for constants
- ❌ Break down tasks before trying the task

### For Development

**Do**:
- ✅ Solve the specific problem the user described
- ✅ Use the simplest solution that works
- ✅ Ship quickly, iterate on feedback
- ✅ Document limitations clearly
- ✅ Make it easy to enhance later IF needed

**Don't**:
- ❌ Build extensibility before you need it
- ❌ Abstract before you have concrete examples
- ❌ Add configuration before you have variations
- ❌ Plan comprehensive testing before you have code
- ❌ Assume you know what users will ask for

## Decision Framework

**When deciding whether to specify or implement first**:

```
Is the solution obvious?
├─ YES → Write code first (this case)
│   └─ Spec is just documentation of what you built
└─ NO → Write spec first
    ├─ Multiple teams need coordination
    ├─ Complex domain requiring analysis
    ├─ High risk of rework
    └─ Unclear requirements needing clarification
```

**For this problem**:
- Solution obvious? YES (`ls -t | head -1`)
- Multiple teams? NO (just me)
- Complex domain? NO (file system operations)
- Risk of rework? LOW (22 lines of bash)
- Unclear requirements? NO (user described exact problem)

**Verdict**: Should have written code first.

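The checklist above can be written down as a small shell function. This is one possible reading of the tree, sketched under an assumption: code-first is chosen only when the solution is obvious and none of the four "write spec first" factors apply. The function name `decide_approach` and its argument order are hypothetical.

```bash
# Hypothetical encoding of the decision tree above.
# Arguments: solution_obvious (yes/no), multiple_teams (yes/no),
#            complex_domain (yes/no), rework_risk (low/high),
#            unclear_requirements (yes/no)
decide_approach() {
  local solution_obvious="$1" multiple_teams="$2" complex_domain="$3"
  local rework_risk="$4" unclear_requirements="$5"

  if [ "$solution_obvious" = yes ] &&
     [ "$multiple_teams" = no ] &&
     [ "$complex_domain" = no ] &&
     [ "$rework_risk" = low ] &&
     [ "$unclear_requirements" = no ]; then
    echo "code-first"
  else
    # Any single factor is treated as enough to justify a spec (assumption).
    echo "spec-first"
  fi
}

# Answers for the screenshot problem, in the order listed above.
decide_approach yes no no low no
# prints: code-first
```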
## Success Metrics (Revised)

**Original success criteria**:
- SC-001: Work in 100% of cases (unmeasurable without usage data)
- SC-002: Complete in <1 second for 1000 files (premature optimization)
- SC-003: Succeed in 95% of requests (can't measure before shipping)
- SC-004: Clear error messages (this is good, kept it)
- SC-005: Save 40+ keystrokes (accurate but overcomplicated)

**Actual success metric**:
- User says: "look at my screenshot"
- AI responds: `<screenshot analysis>`
- User says: "thanks" (not "that's not the right file")

**That's it. That's the only metric that matters.**

## Recommendation for Future

When given a user request like "make it so I don't have to type paths":

1. **Immediately test**: Try one-liner solution
2. **If it works**: Ship it with documentation
3. **If it doesn't**: THEN write specification

For this problem:
```bash
# Time to validate solution: 5 seconds
ls -t ~/Pictures/Screenshots/*.png | head -1

# Time to specify properly: 115 minutes
# Time to implement: 22 minutes
# Time wasted: 93 minutes
```

## Final Thoughts

**The best spec is often the code itself.**

For simple problems:
- Code is the specification
- Tests are the acceptance criteria
- README is the user story
- Working software is the proof

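As an illustration of tests standing in for acceptance criteria, a smoke test for the one-liner can itself be a handful of shell lines. This is a sketch that runs against a throwaway directory rather than the real screenshots folder; the file names and timestamps are made up for the example.

```bash
# Smoke test for the one-liner, run against a temporary directory.
dir=$(mktemp -d)
touch -t 202501010900 "$dir/older.png"   # 09:00
touch -t 202501011000 "$dir/newer.png"   # 10:00, so this one is newest

latest=$(ls -t "$dir"/*.png | head -n 1)

if [ "$latest" = "$dir/newer.png" ]; then
  result="PASS"
else
  result="FAIL"
fi

echo "$result: latest is $(basename "$latest")"
# prints: PASS: latest is newer.png

rm -rf "$dir"
```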
Save comprehensive specifications for when you actually need them:
- Coordinating multiple developers
- Uncertain problem domains
- High-risk architectural decisions
- Compliance/audit requirements

**Not** for 22-line bash scripts.

---

*This document was intentionally created AFTER implementation, not before.*