# Feature Specification: Screenshot Analysis Skill

**Feature Branch**: `001-screenshot-analysis`  
**Created**: 2025-11-08  
**Status**: Draft  
**Input**: User description: "We want to start thinking about a skill that has the AI look at the last screenshot, it's mostly so we don't have to type 'they're in ~/Pictures/Screenshots' everytime."

## Clarifications

### Session 2025-11-08

- Q: Configuration storage mechanism? → A: Skill-specific config file (e.g., ~/.config/opencode/skills/screenshot-analysis/config.json)
- Q: Symlink handling behavior? → A: Ignore symlinks (skip any symlinked screenshot files)
- Q: Same-timestamp file handling? → A: Use filename lexicographic ordering as tiebreaker

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Quick Screenshot Analysis (Priority: P1)

A user takes a screenshot and immediately asks the AI agent to analyze it without having to specify the file path or location.

**Why this priority**: This is the core value proposition - eliminating the need to type file paths repeatedly. This single feature delivers immediate value and addresses the primary user pain point.

**Independent Test**: Can be fully tested by taking a screenshot, asking "analyze the last screenshot", and verifying the agent finds and analyzes the correct file without requiring a path.

**Acceptance Scenarios**:

1. **Given** a screenshot was just taken and saved to ~/Pictures/Screenshots, **When** user requests "look at my last screenshot", **Then** the agent locates the most recent file and analyzes it
2. **Given** multiple screenshots exist in the directory, **When** user requests screenshot analysis, **Then** the agent identifies and uses the most recently modified file (per FR-002)
3. **Given** user asks "what's in my latest screenshot", **When** the skill executes, **Then** the agent reads the screenshot file and provides visual analysis

---

### User Story 2 - Reference Previous Screenshots (Priority: P2)

A user wants to reference screenshots from earlier in the conversation or session without re-uploading or specifying paths.

**Why this priority**: Extends the basic functionality to support conversation continuity and reduces friction when working with multiple screenshots over time.

**Independent Test**: Take multiple screenshots over time, then reference them using relative terms like "the screenshot from 5 minutes ago" or "the second-to-last screenshot".

**Acceptance Scenarios**:

1. **Given** three screenshots taken at different times, **When** user requests "show me the previous screenshot", **Then** the agent selects the second-most-recent file
2. **Given** a screenshot from earlier in the session, **When** user requests "compare this to the earlier screenshot", **Then** the agent retrieves both the latest and a previous screenshot
3. **Given** user asks for "screenshots from today", **When** the skill executes, **Then** the agent lists or analyzes all screenshots taken today

---

### User Story 3 - Custom Screenshot Directory Support (Priority: P3)

A user who stores screenshots in a different location can configure the skill to use their preferred directory.

**Why this priority**: Enables flexibility for users with non-standard configurations, but the default location (~/Pictures/Screenshots) covers the majority use case.

**Independent Test**: Configure a custom screenshot directory, take a screenshot there, and verify the skill finds it correctly.

**Acceptance Scenarios**:

1. **Given** user has configured a custom screenshot directory, **When** they request screenshot analysis, **Then** the skill searches the configured location instead of the default
2. **Given** no custom directory is configured, **When** the skill executes, **Then** it defaults to ~/Pictures/Screenshots
3. **Given** the configured directory doesn't exist, **When** the skill runs, **Then** it provides a clear error message and falls back to checking the default location

### Edge Cases

- What happens when ~/Pictures/Screenshots is empty (no screenshots exist)?
- How does the system handle permission errors when reading the directory?
- What if multiple screenshots have the same timestamp? (Resolved: use lexicographic filename ordering as tiebreaker per FR-002)
- How does the skill behave if the screenshot file is corrupted or unreadable?
- What if the user's system uses a different default screenshot location (e.g., macOS vs Linux)?
- How does the skill handle very large screenshot files?
- What if the directory contains symlinks to screenshot files? (Resolved: symlinks are ignored per FR-002a)

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: Skill MUST automatically locate the most recent screenshot file in ~/Pictures/Screenshots without user-provided path
- **FR-002**: Skill MUST determine file recency based on file modification time; when multiple files have identical timestamps, use filename lexicographic ordering as tiebreaker (the lexicographically later filename is treated as more recent)
- **FR-002a**: Skill MUST ignore symlinks when scanning for screenshot files (only consider regular files)
- **FR-003**: Skill MUST support common screenshot formats (PNG, JPG, JPEG)
- **FR-004**: Skill MUST provide clear error messages if no screenshots are found
- **FR-005**: Skill MUST be invokable through natural language triggers (e.g., "look at my last screenshot", "analyze my recent screenshot")
- **FR-006**: Skill MUST pass the screenshot file path to the agent's image analysis capability
- **FR-007**: Skill MUST handle missing or inaccessible screenshot directory gracefully
- **FR-008**: Skill SHOULD support relative time references (e.g., "screenshot from 5 minutes ago")
- **FR-009**: Skill SHOULD allow configuration of custom screenshot directories via skill-specific config file (e.g., ~/.config/opencode/skills/screenshot-analysis/config.json or ~/.claude/skills/screenshot-analysis/config.json)
- **FR-010**: Skill SHOULD support finding the Nth most recent screenshot (e.g., "previous screenshot", "second-to-last screenshot")

### Key Entities

- **Screenshot File**: Image file in the screenshots directory with metadata (path, timestamp, format)
- **Screenshot Directory**: Configurable location where screenshots are stored (default: ~/Pictures/Screenshots)
- **Skill Configuration**: Optional JSON config file at ~/.config/opencode/skills/screenshot-analysis/config.json (or ~/.claude/skills/screenshot-analysis/config.json for Claude Code) with fields: `screenshot_dir` (custom directory path)

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users can request screenshot analysis without typing file paths in 100% of cases where screenshots exist
- **SC-002**: Skill correctly identifies the most recent screenshot in under 1 second for directories with up to 1000 files
- **SC-003**: Skill successfully locates screenshots in 95% of user requests when screenshots exist
- **SC-004**: Error messages are clear and actionable when screenshots cannot be found or accessed
- **SC-005**: Reduce user keystrokes by an average of 40+ characters per screenshot analysis request (eliminating "~/Pictures/Screenshots/filename.png")

## Assumptions

### Default Behavior

- Users store screenshots in the standard location (~/Pictures/Screenshots) on Linux systems
- Screenshot files carry modification times (and often timestamped filenames) that allow reliable sorting by recency
- The agent has image analysis capabilities (can read and analyze image files)

### Technical Environment

- File system is accessible and readable
- Standard Unix/Linux file utilities are available
- Screenshot files use standard image formats

### User Interaction

- Users will use natural language to request screenshot analysis
- Users understand relative time references ("last", "latest", "recent", "previous")
- Users expect immediate analysis without additional prompts

### Configuration

- Custom configuration is optional - defaults work for most users
- Configuration stored in skill-specific JSON file at ~/.config/opencode/skills/screenshot-analysis/config.json or ~/.claude/skills/screenshot-analysis/config.json
- Config file format: `{"screenshot_dir": "/path/to/screenshots"}`

## Out of Scope

The following are explicitly NOT included in this feature:

- Screenshot capture functionality (assumes screenshots already exist)
- Image editing or manipulation
- Screenshot organization or tagging
- Screenshot upload to external services
- Optical Character Recognition (OCR) - unless built into agent's image analysis
- Screenshot comparison or diff functionality (may be future enhancement)
- Cross-platform screenshot location detection (focuses on Linux ~/Pictures/Screenshots)
- Screenshot history management or database

## Dependencies

- Agent must support image file analysis
- File system access (read permissions on screenshots directory)
- Bash scripting environment for helper scripts
- Standard Unix tools (ls, find, stat for file operations)

## Notes

- This skill is a convenience wrapper that eliminates repetitive path typing
- The actual image analysis is delegated to the agent's existing capabilities
- Focus is on file discovery and path resolution, not image processing
- Should work with both Claude Code and OpenCode agents
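
The discovery rules in FR-002, FR-002a, FR-003, and FR-010, together with the config fallback from User Story 3, can be sketched as follows. This is a minimal illustration, not the planned implementation: it is written in Python for readability even though the Dependencies section anticipates Bash helper scripts, the function names (`resolve_screenshot_dir`, `list_screenshots`, `nth_most_recent`) are hypothetical, and checking the OpenCode config path before the Claude Code one is an assumption the spec does not state.

```python
import json
import os
from pathlib import Path

# Default directory and config locations taken from the spec.
DEFAULT_DIR = Path("~/Pictures/Screenshots").expanduser()
CONFIG_PATHS = (  # lookup order is an assumption, not specified
    Path("~/.config/opencode/skills/screenshot-analysis/config.json").expanduser(),
    Path("~/.claude/skills/screenshot-analysis/config.json").expanduser(),
)
EXTENSIONS = {".png", ".jpg", ".jpeg"}  # FR-003


def resolve_screenshot_dir(config_paths=CONFIG_PATHS, default=DEFAULT_DIR):
    """Return the configured `screenshot_dir` if it exists, else the default."""
    for config_path in config_paths:
        try:
            config = json.loads(config_path.read_text())
        except (OSError, ValueError):
            continue  # missing, unreadable, or malformed config: try the next one
        configured = config.get("screenshot_dir") if isinstance(config, dict) else None
        if configured:
            candidate = Path(configured).expanduser()
            if candidate.is_dir():
                return candidate
            # A real helper would also report the bad path here (User Story 3).
    return default


def list_screenshots(directory):
    """Return screenshot paths in `directory`, most recent first."""
    entries = []
    with os.scandir(directory) as scan:
        for entry in scan:
            # FR-002a: skip symlinks and anything that is not a regular file.
            if entry.is_symlink() or not entry.is_file(follow_symlinks=False):
                continue
            if Path(entry.name).suffix.lower() not in EXTENSIONS:
                continue
            mtime_ns = entry.stat(follow_symlinks=False).st_mtime_ns
            entries.append((mtime_ns, entry.name, Path(entry.path)))
    # FR-002: newest mtime first; on ties the lexicographically later name wins.
    entries.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [path for _, _, path in entries]


def nth_most_recent(directory, n=1):
    """Return the Nth most recent screenshot (n=1 is the latest), per FR-010."""
    shots = list_screenshots(directory)
    if not shots:
        raise FileNotFoundError(f"No PNG/JPG/JPEG screenshots found in {directory}")
    if not 1 <= n <= len(shots):
        raise IndexError(f"Requested screenshot {n}, but only {len(shots)} exist")
    return shots[n - 1]
```

A caller would typically run `nth_most_recent(resolve_screenshot_dir())` and hand the resulting path to the agent's image analysis (FR-006). `os.scandir` raises `FileNotFoundError` or `PermissionError` when the directory is missing or unreadable, so the caller is the place to turn those into the clear error messages required by FR-004 and FR-007.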