beads: add AT-SPI findings (skills-pdg)

This commit is contained in:
dan 2025-12-17 14:00:10 -08:00
parent 19e0750693
commit 0b971558a5
2 changed files with 3 additions and 1 deletions

View file

@ -1 +1 @@
0.30.0
0.30.2

View file

@ -33,11 +33,13 @@
{"id":"skills-fvx","title":"use-skills.sh: stderr from nix build corrupts symlinks when repo is dirty","description":"In use-skills.sh, the line:\n\n```bash\nout=$(nix build --print-out-paths --no-link \"${SKILLS_REPO}#${skill}\" 2\u003e\u00261) || {\n```\n\nThe `2\u003e\u00261` merges stderr into stdout. When the skills repo is dirty, nix emits a warning to stderr which gets captured into $out and used as the symlink target.\n\nResult: symlinks like:\n```\norch -\u003e warning: Git tree '/home/dan/proj/skills' is dirty\n/nix/store/j952hgxixifscafb42vmw9vgdphi1djs-ai-skill-orch\n```\n\nFix: redirect stderr to /dev/null or filter it out before creating symlink.","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-12-14T11:54:03.06502295-08:00","updated_at":"2025-12-14T11:59:25.472044754-08:00","closed_at":"2025-12-14T11:59:25.472044754-08:00"}
{"id":"skills-gas","title":"spec-review: File discovery is brittle, can pick wrong file silently","description":"The fallback `find ... | head -1` is non-deterministic and can select wrong spec/plan/tasks file without user noticing. Branch names with `/` also break path construction.\n\nFixes:\n- Fail fast if expected file missing\n- Print chosen file path before proceeding\n- Require explicit confirmation if falling back\n- Handle branch names with slashes","status":"closed","priority":2,"issue_type":"bug","created_at":"2025-12-15T00:23:22.762045913-08:00","updated_at":"2025-12-15T00:46:16.231199231-08:00","closed_at":"2025-12-15T00:46:16.231199231-08:00"}
{"id":"skills-h9f","title":"spec-review: Balance negativity bias in prompts","description":"'Be critical' and 'devil's advocate' can bias toward over-flagging without acknowledging what's good.\n\nAdd:\n- 'List top 3 strongest parts of the document'\n- 'Call out where document is sufficiently clear/testable'\n- Categorize :against concerns as confirmed/plausible/rejected","status":"closed","priority":3,"issue_type":"task","created_at":"2025-12-15T00:23:26.418087998-08:00","updated_at":"2025-12-15T14:07:39.520818417-08:00","closed_at":"2025-12-15T14:07:39.520818417-08:00"}
{"id":"skills-kg7","title":"Desktop automation for Wayland/niri","description":"Explore and implement desktop automation solutions for Wayland (niri compositor).\n\n## First goal: Seeing what's on screen\n\n### Already have (niri-window-capture skill)\n- ✅ Window metadata: list windows, titles, app_id, workspace, position\n- ✅ Screenshots: capture any window invisibly, even from other workspaces\n\n### Missing pieces\n- ❌ **UI tree (AT-SPI)**: semantic structure - buttons, labels, text fields, menus\n - Works for GTK/Qt apps that expose accessibility\n - Electron apps often poor, custom toolkits nothing\n - Query: \"what buttons are visible?\", \"where is the Save button?\"\n \n- ❌ **OCR fallback**: extract text from screenshots when AT-SPI unavailable\n - tesseract for local OCR\n - vision LLM (Claude) for understanding layout + text\n - Needed for apps that don't expose AT-SPI\n\n### Levels of \"seeing\"\n1. **Window awareness**: what's open, where (have this)\n2. **Pixel capture**: screenshots (have this)\n3. **Semantic understanding**: UI elements, their roles, states (need AT-SPI)\n4. **Text extraction**: OCR when semantic not available (need OCR)\n\n## Context\nWayland's security model intentionally blocks X11-style global automation. Solutions require:\n- Privileged daemons (uinput for input injection)\n- Compositor-specific IPC (niri msg)\n- App opt-in via AT-SPI accessibility\n- Portal permissions for screenshots\n\n## Available building blocks\n- **niri IPC**: window management, screenshots, workspace control\n- **ydotool**: kernel uinput input injection (keyboard/mouse)\n- **AT-SPI/dogtail**: app UI introspection (GTK/Qt apps that expose accessibility)\n- **xdg-desktop-portal**: screenshot permissions\n\n## Key questions\n- Which apps do we care about? (GTK, Qt, Electron, web?)\n- Is AT-SPI sufficient for our target apps?\n- When to use OCR vs AT-SPI?\n- How to combine screenshot + AT-SPI for element location?\n\n## Architecture sketch\n```\n┌─────────────────────────────────────────────┐\n│ User/Agent Scripts │\n├─────────────────────────────────────────────┤\n│ Unified Python API? │\n├──────────┬──────────┬──────────┬────────────┤\n│ niri IPC │ ydotool │ AT-SPI │ OCR │\n│ (windows)│ (input) │ (UI tree)│ (fallback) │\n├──────────┴──────────┴──────────┴────────────┤\n│ Wayland compositor / kernel / D-Bus │\n└─────────────────────────────────────────────┘\n```\n\n## Approach\nEngineer solutions based on specific team needs as they arise, rather than building a monolithic framework upfront.","status":"open","priority":2,"issue_type":"epic","created_at":"2025-12-17T12:42:17.863074501-08:00","updated_at":"2025-12-17T12:56:08.188860952-08:00","dependencies":[{"issue_id":"skills-kg7","depends_on_id":"skills-pdg","type":"blocks","created_at":"2025-12-17T13:59:59.915105471-08:00","created_by":"daemon"}]}
{"id":"skills-kmj","title":"Orch skill: document or handle orch not in PATH","description":"Skill docs show 'orch consensus' but orch requires 'uv run' from ~/proj/orch. Either update skill to invoke correctly or document installation requirement.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-01T17:29:48.844997238-08:00","updated_at":"2025-12-01T18:28:11.374048504-08:00","closed_at":"2025-12-01T18:28:11.374048504-08:00"}
{"id":"skills-lie","title":"Compare DEPENDENCIES.md with upstream","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T20:15:53.925914243-08:00","updated_at":"2025-12-03T20:19:28.665641809-08:00","closed_at":"2025-12-03T20:19:28.665641809-08:00","dependencies":[{"issue_id":"skills-lie","depends_on_id":"skills-ebh","type":"discovered-from","created_at":"2025-12-03T20:15:53.9275694-08:00","created_by":"daemon"}]}
{"id":"skills-lvg","title":"Compare ISSUE_CREATION.md with upstream","description":"","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-03T20:15:54.609282051-08:00","updated_at":"2025-12-03T20:19:29.134966356-08:00","closed_at":"2025-12-03T20:19:29.134966356-08:00","dependencies":[{"issue_id":"skills-lvg","depends_on_id":"skills-ebh","type":"discovered-from","created_at":"2025-12-03T20:15:54.610717055-08:00","created_by":"daemon"}]}
{"id":"skills-m21","title":"Apply niri-window-capture code review recommendations","description":"CODE-REVIEW-niri-window-capture.md identifies action items: add dependency checks to scripts, improve error handling for niri failures, add screenshot directory validation, implement rate limiting. See High/Medium priority sections.","status":"open","priority":2,"issue_type":"task","created_at":"2025-11-30T11:58:24.648846875-08:00","updated_at":"2025-11-30T11:58:24.648846875-08:00"}
{"id":"skills-mx3","title":"spec-review: Define consensus thresholds and decision rules","description":"'Use judgment' for mixed results leads to inconsistent decisions.\n\nDefine:\n- What constitutes consensus (2/3? unanimous?)\n- How to handle NEUTRAL votes\n- Tie-break rules\n- When human override is acceptable and how to document it","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-15T00:23:24.121175736-08:00","updated_at":"2025-12-15T13:58:04.339283238-08:00","closed_at":"2025-12-15T13:58:04.339283238-08:00"}
{"id":"skills-pdg","title":"Enable AT-SPI for UI tree access","description":"## Findings\n\nAT-SPI (Assistive Technology Service Provider Interface) provides semantic UI tree access - buttons, labels, text fields, their states and coordinates.\n\n### Current state\n- AT-SPI is **disabled** on this NixOS system\n- Environment has `NO_AT_BRIDGE=1` and `GTK_A11Y=none`\n- No apps are exposing accessibility info\n\n### To enable\n```nix\nservices.gnome.at-spi2-core.enable = true;\n```\n\nThen rebuild and re-login (apps must start fresh to register with bus).\n\n### App support\n- **GTK apps**: Should work automatically\n- **Qt apps**: Need `QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1` env var\n- **Electron**: Varies by app, often poor support\n\n### Trade-offs\n- Adds runtime overhead to all GTK/Qt apps\n- May want as boot-time option rather than always-on\n- Only useful for automation/accessibility use cases\n\n### Tools once enabled\n- `python3-pyatspi` / `dogtail` for querying UI tree\n- `accerciser` for visual inspection of accessibility tree\n\n### Next steps\n1. Add NixOS option (possibly as boot-time toggle)\n2. Test with common apps (Firefox, terminals, etc.)\n3. Build skill to query UI elements\n\n## Related\nParent epic: skills-kg7 (Desktop automation for Wayland/niri)","status":"open","priority":2,"issue_type":"task","created_at":"2025-12-17T13:59:55.799402507-08:00","updated_at":"2025-12-17T13:59:55.799402507-08:00"}
{"id":"skills-pu4","title":"Clean up stale beads.left.jsonl merge artifact","description":"bd doctor flagged multiple JSONL files. beads.left.jsonl is empty merge artifact that should be removed: git rm .beads/beads.left.jsonl","status":"closed","priority":2,"issue_type":"task","created_at":"2025-11-30T11:58:33.292221449-08:00","updated_at":"2025-11-30T12:37:49.916795223-08:00","closed_at":"2025-11-30T12:37:49.916795223-08:00"}
{"id":"skills-qeh","title":"Add README.md for web-research skill","description":"web-research skill has SKILL.md and scripts but no README.md. AGENTS.md says README.md is for humans, contains installation instructions, usage examples, prerequisites.","status":"open","priority":2,"issue_type":"task","created_at":"2025-11-30T11:58:14.475647113-08:00","updated_at":"2025-11-30T12:00:30.309340468-08:00","dependencies":[{"issue_id":"skills-qeh","depends_on_id":"skills-vb5","type":"blocks","created_at":"2025-11-30T12:01:30.278784381-08:00","created_by":"daemon"}]}
{"id":"skills-rpf","title":"Implement playwright-visit skill for browser automation","description":"## Overview\nBrowser automation skill using Playwright to visit web pages, take screenshots, and extract content.\n\n## Key Findings (from dotfiles investigation)\n\n### Working Setup\n- Use `python312Packages.playwright` from nixpkgs (handles Node driver binary patching for NixOS)\n- Use `executable_path='/run/current-system/sw/bin/chromium'` to use system chromium\n- No `playwright install` needed - no browser binary downloads\n\n### Profile Behavior\n- Fresh/blank profile every launch by default\n- No cookies, history, or logins from user's browser\n- Can persist state with `storage_state` parameter if needed\n\n### Example Code\n```python\nfrom playwright.sync_api import sync_playwright\n\nwith sync_playwright() as p:\n browser = p.chromium.launch(\n executable_path='/run/current-system/sw/bin/chromium',\n headless=True\n )\n page = browser.new_page()\n page.goto('https://example.com')\n print(page.title())\n browser.close()\n```\n\n### Why Not uv/pip?\n- Playwright pip package bundles a Node.js driver binary\n- NixOS can't run dynamically linked executables without patching\n- nixpkgs playwright handles this properly\n\n## Implementation Plan\n1. Create `skills/playwright-visit/` directory\n2. Add flake.nix with devShell providing playwright\n3. Create CLI script with subcommands:\n - `screenshot \u003curl\u003e \u003coutput.png\u003e` - capture page\n - `text \u003curl\u003e` - extract text content \n - `html \u003curl\u003e` - get rendered HTML\n - `pdf \u003curl\u003e \u003coutput.pdf\u003e` - save as PDF\n4. Create skill definition for Claude Code integration\n5. Document usage in skill README\n\n## Dependencies\n- nixpkgs python312Packages.playwright\n- System chromium (already in dotfiles)\n\n## Related\n- dotfiles issue dotfiles-m09 (playwright skill request)","status":"open","priority":2,"issue_type":"feature","created_at":"2025-12-16T16:02:28.577381007-08:00","updated_at":"2025-12-16T16:02:28.577381007-08:00"}