Feature Specification: Screenshot Analysis Skill
Feature Branch: 001-screenshot-analysis
Created: 2025-11-08
Status: Draft
Input: User description: "We want to start thinking about a skill that has the AI look at the last screenshot, it's mostly so we don't have to type 'they're in ~/Pictures/Screenshots' everytime."
Clarifications
Session 2025-11-08
- Q: Configuration storage mechanism? → A: Skill-specific config file (e.g., ~/.config/opencode/skills/screenshot-analysis/config.json)
- Q: Symlink handling behavior? → A: Ignore symlinks (skip any symlinked screenshot files)
- Q: Same-timestamp file handling? → A: Use filename lexicographic ordering as tiebreaker
User Scenarios & Testing (mandatory)
User Story 1 - Quick Screenshot Analysis (Priority: P1)
A user takes a screenshot and immediately asks the AI agent to analyze it without having to specify the file path or location.
Why this priority: This is the core value proposition - eliminating the need to type file paths repeatedly. This single feature delivers immediate value and addresses the primary user pain point.
Independent Test: Can be fully tested by taking a screenshot, asking "analyze the last screenshot", and verifying the agent finds and analyzes the correct file without requiring a path.
Acceptance Scenarios:
- Given a screenshot was just taken and saved to ~/Pictures/Screenshots, When user requests "look at my last screenshot", Then the agent locates the most recent file and analyzes it
- Given multiple screenshots exist in the directory, When user requests screenshot analysis, Then the agent identifies and uses the most recently modified file (recency is defined by modification time per FR-002)
- Given user asks "what's in my latest screenshot", When the skill executes, Then the agent reads the screenshot file and provides visual analysis
User Story 2 - Reference Previous Screenshots (Priority: P2)
A user wants to reference screenshots from earlier in the conversation or session without re-uploading or specifying paths.
Why this priority: Extends the basic functionality to support conversation continuity and reduces friction when working with multiple screenshots over time.
Independent Test: Take multiple screenshots over time, then reference them using relative terms like "the screenshot from 5 minutes ago" or "the second-to-last screenshot".
Acceptance Scenarios:
- Given three screenshots taken at different times, When user requests "show me the previous screenshot", Then the agent selects the second-most-recent file
- Given a screenshot from earlier in the session, When user requests "compare this to the earlier screenshot", Then the agent retrieves both the latest and a previous screenshot
- Given user asks for "screenshots from today", When the skill executes, Then the agent lists or analyzes all screenshots created today
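The relative references in these scenarios can be sketched as small selection functions over a list of screenshot records. This is an illustrative Python sketch, not the skill's implementation: the `Screenshot` record and the function names are hypothetical, and reading "the screenshot from 5 minutes ago" as "the file whose modification time is closest to five minutes before now" is an assumption the spec does not settle.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Screenshot:
    """Hypothetical record: a file path plus its modification time."""
    path: str
    modified: datetime


def newest_first(shots: List[Screenshot]) -> List[Screenshot]:
    # Newest modification time first; the later filename wins a tie.
    return sorted(shots, key=lambda s: (s.modified, s.path), reverse=True)


def previous(shots: List[Screenshot]) -> Screenshot:
    """'The previous screenshot': the second-most-recent file."""
    ordered = newest_first(shots)
    if len(ordered) < 2:
        raise LookupError("Fewer than two screenshots are available")
    return ordered[1]


def from_today(shots: List[Screenshot], now: datetime) -> List[Screenshot]:
    """'Screenshots from today': files modified on the same calendar day as `now`."""
    return [s for s in newest_first(shots) if s.modified.date() == now.date()]


def closest_to(shots: List[Screenshot], age: timedelta, now: datetime) -> Screenshot:
    """'The screenshot from 5 minutes ago' (assumed meaning: nearest to now - age)."""
    if not shots:
        raise LookupError("No screenshots are available")
    target = now - age
    return min(shots, key=lambda s: abs(s.modified - target))
```

Passing `now` explicitly keeps the functions deterministic, which makes the acceptance scenarios easy to test with fixed timestamps.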
User Story 3 - Custom Screenshot Directory Support (Priority: P3)
A user who stores screenshots in a different location can configure the skill to use their preferred directory.
Why this priority: Enables flexibility for users with non-standard configurations, but the default location (~/Pictures/Screenshots) covers the majority use case.
Independent Test: Configure a custom screenshot directory, take a screenshot there, and verify the skill finds it correctly.
Acceptance Scenarios:
- Given user has configured a custom screenshot directory, When they request screenshot analysis, Then the skill searches the configured location instead of the default
- Given no custom directory is configured, When the skill executes, Then it defaults to ~/Pictures/Screenshots
- Given the configured directory doesn't exist, When the skill runs, Then it provides a clear error message and falls back to checking the default location
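The fallback behaviour in these scenarios can be sketched as a single resolution step. This is a minimal Python illustration under stated assumptions: `resolve_screenshot_dir` is a hypothetical name, and returning the warning alongside the directory (rather than printing it) is a design choice made here, not something the spec prescribes.

```python
from pathlib import Path
from typing import Optional, Tuple

DEFAULT_DIR = Path("~/Pictures/Screenshots")


def resolve_screenshot_dir(
    configured: Optional[str], default: Path = DEFAULT_DIR
) -> Tuple[Path, Optional[str]]:
    """Return (directory to search, warning message or None)."""
    default = default.expanduser()
    if configured is None:
        # No custom directory configured: use the default silently.
        return default, None
    custom = Path(configured).expanduser()
    if custom.is_dir():
        return custom, None
    # Configured directory is missing: report it and fall back to the default.
    warning = (
        f"Configured screenshot directory {custom} does not exist; "
        f"falling back to {default}"
    )
    return default, warning
```

Returning the warning as a value lets the caller decide how to surface it to the user while still continuing with the default location.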
Edge Cases
- What happens when ~/Pictures/Screenshots is empty (no screenshots exist)?
- How does the system handle permission errors when reading the directory?
- What if multiple screenshots have the same timestamp? (Resolved: use lexicographic filename ordering as tiebreaker per FR-002)
- How does the skill behave if the screenshot file is corrupted or unreadable?
- What if the user's system uses a different default screenshot location (e.g., macOS vs Linux)?
- How does the skill handle very large screenshot files?
- What if the directory contains symlinks to screenshot files (should be ignored per FR-002a)?
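Several of these edge cases (missing directory, permission errors, empty directory) reduce to turning a filesystem condition into an actionable message. The following Python sketch shows one way to do that; the function name and the message wording are hypothetical, and corrupted or very large files are deliberately left out because the spec does not say how they should be handled.

```python
import os
from pathlib import Path
from typing import Optional

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def describe_scan_problem(directory: Path) -> Optional[str]:
    """Return a user-facing error message, or None if a screenshot is available."""
    try:
        with os.scandir(directory) as entries:
            has_image = any(
                # follow_symlinks=False makes symlinks report as non-files,
                # so symlinked screenshots are ignored.
                entry.is_file(follow_symlinks=False)
                and Path(entry.name).suffix.lower() in IMAGE_SUFFIXES
                for entry in entries
            )
    except FileNotFoundError:
        return f"Screenshot directory {directory} does not exist."
    except NotADirectoryError:
        return f"{directory} is not a directory."
    except PermissionError:
        return f"Cannot read {directory}: permission denied."
    if not has_image:
        return f"No screenshots (PNG, JPG, JPEG) found in {directory}."
    return None
```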
Requirements (mandatory)
Functional Requirements
- FR-001: Skill MUST automatically locate the most recent screenshot file in ~/Pictures/Screenshots without user-provided path
- FR-002: Skill MUST determine file recency from file modification time; when multiple files have identical timestamps, it MUST use lexicographic filename ordering as the tiebreaker (the filename that sorts later is treated as more recent)
- FR-002a: Skill MUST ignore symlinks when scanning for screenshot files (only consider regular files)
- FR-003: Skill MUST support common screenshot formats (PNG, JPG, JPEG)
- FR-004: Skill MUST provide clear error messages if no screenshots are found
- FR-005: Skill MUST be invokable through natural language triggers (e.g., "look at my last screenshot", "analyze my recent screenshot")
- FR-006: Skill MUST pass the screenshot file path to the agent's image analysis capability
- FR-007: Skill MUST handle missing or inaccessible screenshot directory gracefully
- FR-008: Skill SHOULD support relative time references (e.g., "screenshot from 5 minutes ago")
- FR-009: Skill SHOULD allow configuration of custom screenshot directories via skill-specific config file (e.g., ~/.config/opencode/skills/screenshot-analysis/config.json or ~/.claude/skills/screenshot-analysis/config.json)
- FR-010: Skill SHOULD support finding the Nth most recent screenshot (e.g., "previous screenshot", "second-to-last screenshot")
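The discovery rules in FR-001 through FR-003 and FR-010 can be sketched together as one directory scan. Python is used here purely for illustration (the spec itself lists Bash helper scripts under Dependencies), and the function names are hypothetical. The sketch assumes that format detection by file extension is sufficient.

```python
import os
from pathlib import Path
from typing import List

SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg"}  # FR-003


def list_screenshots(directory: Path) -> List[Path]:
    """Return the screenshots in `directory`, most recent first."""
    candidates = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # FR-002a: only regular files count; symlinks are skipped.
            if not entry.is_file(follow_symlinks=False):
                continue
            if Path(entry.name).suffix.lower() not in SUPPORTED_SUFFIXES:
                continue
            mtime_ns = entry.stat(follow_symlinks=False).st_mtime_ns
            candidates.append((mtime_ns, entry.name, Path(entry.path)))
    # FR-002: newest modification time first; on a tie the filename
    # that sorts later is treated as more recent.
    candidates.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [path for _, _, path in candidates]


def nth_most_recent(directory: Path, n: int = 1) -> Path:
    """FR-001 / FR-010: n=1 is the latest screenshot, n=2 the previous one."""
    shots = list_screenshots(directory)
    if n < 1 or n > len(shots):
        raise FileNotFoundError(
            f"No screenshot #{n} in {directory} ({len(shots)} available)"
        )
    return shots[n - 1]
```

Sorting on the pair (modification time, filename) implements the tiebreaker in a single pass, and the path returned by `nth_most_recent` is what FR-006 would hand to the agent's image analysis.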
Key Entities
- Screenshot File: Image file in the screenshots directory with metadata (path, timestamp, format)
- Screenshot Directory: Configurable location where screenshots are stored (default: ~/Pictures/Screenshots)
- Skill Configuration: Optional JSON config file at ~/.config/opencode/skills/screenshot-analysis/config.json (or ~/.claude/skills/screenshot-analysis/config.json for Claude Code) with fields:
  - screenshot_dir (custom directory path)
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: Users can request screenshot analysis without typing file paths in 100% of cases where screenshots exist
- SC-002: Skill correctly identifies the most recent screenshot in under 1 second for directories with up to 1000 files
- SC-003: Skill successfully locates screenshots in 95% of user requests when screenshots exist
- SC-004: Error messages are clear and actionable when screenshots cannot be found or accessed
- SC-005: Reduce user keystrokes by an average of 40+ characters per screenshot analysis request (eliminating "~/Pictures/Screenshots/filename.png")
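SC-002 is the one criterion here that can be checked mechanically. A rough timing harness might look like the following Python sketch; `newest_png` is a simplified stand-in for the skill's real lookup, the synthetic filenames are invented for the test, and timings measured on a temporary directory are only indicative of real-world behaviour.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Tuple


def newest_png(directory: Path) -> Path:
    """Stand-in lookup: newest regular .png file by mtime, then by name."""
    with os.scandir(directory) as entries:
        files = [
            (entry.stat(follow_symlinks=False).st_mtime_ns, entry.name)
            for entry in entries
            if entry.is_file(follow_symlinks=False)
            and entry.name.lower().endswith(".png")
        ]
    _, name = max(files)
    return directory / name


def time_lookup(file_count: int = 1000) -> Tuple[str, float]:
    """Create `file_count` dummy screenshots and time one lookup (seconds)."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = Path(tmp)
        base_ns = 1_700_000_000 * 10**9
        for i in range(file_count):
            path = directory / f"shot-{i:04d}.png"
            path.touch()
            # Give each file a distinct, increasing modification time.
            stamp = base_ns + i * 10**9
            os.utime(path, ns=(stamp, stamp))
        start = time.perf_counter()
        newest = newest_png(directory)
        elapsed = time.perf_counter() - start
    return newest.name, elapsed


if __name__ == "__main__":
    name, elapsed = time_lookup(1000)
    print(f"found {name} in {elapsed:.4f}s (SC-002 budget: 1s)")
```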
Assumptions
Default Behavior
- Users store screenshots in the standard location (~/Pictures/Screenshots) on Linux systems
- Screenshot files carry filename timestamps or file modification times that allow reliable sorting by recency
- The agent has image analysis capabilities (can read and analyze image files)
Technical Environment
- File system is accessible and readable
- Standard Unix/Linux file utilities are available
- Screenshot files use standard image formats
User Interaction
- Users will use natural language to request screenshot analysis
- Users understand relative time references ("last", "latest", "recent", "previous")
- Users expect immediate analysis without additional prompts
Configuration
- Custom configuration is optional - defaults work for most users
- Configuration stored in skill-specific JSON file at ~/.config/opencode/skills/screenshot-analysis/config.json or ~/.claude/skills/screenshot-analysis/config.json
- Config file format:
{"screenshot_dir": "/path/to/screenshots"}
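Reading this config could be sketched as follows in Python. The two candidate paths come from the spec; the precedence between them (OpenCode first), the function name, and the choice to skip an unreadable or malformed file rather than fail are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Iterable, Optional

CONFIG_CANDIDATES = (
    Path("~/.config/opencode/skills/screenshot-analysis/config.json"),
    Path("~/.claude/skills/screenshot-analysis/config.json"),
)


def load_screenshot_dir(
    candidates: Iterable[Path] = CONFIG_CANDIDATES,
) -> Optional[str]:
    """Return `screenshot_dir` from the first usable config file, else None."""
    for candidate in candidates:
        path = candidate.expanduser()
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # Assumption: an unreadable or malformed config is skipped.
            continue
        value = data.get("screenshot_dir") if isinstance(data, dict) else None
        if isinstance(value, str) and value:
            return value
    # No config found: the caller should use ~/Pictures/Screenshots.
    return None
```

A return value of `None` signals "not configured", which keeps the default-directory decision in one place in the caller.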
Out of Scope
The following are explicitly NOT included in this feature:
- Screenshot capture functionality (assumes screenshots already exist)
- Image editing or manipulation
- Screenshot organization or tagging
- Screenshot upload to external services
- Optical Character Recognition (OCR) - unless built into agent's image analysis
- Screenshot comparison or diff functionality (may be future enhancement)
- Cross-platform screenshot location detection (focuses on Linux ~/Pictures/Screenshots)
- Screenshot history management or database
Dependencies
- Agent must support image file analysis
- File system access (read permissions on screenshots directory)
- Bash scripting environment for helper scripts
- Standard Unix tools for file operations (ls, find, stat)
Notes
- This skill is a convenience wrapper that eliminates repetitive path typing
- The actual image analysis is delegated to the agent's existing capabilities
- Focus is on file discovery and path resolution, not image processing
- Should work with both Claude Code and OpenCode agents