orch skill: sync with CLI v0.1.0
- Update model aliases (gpt-5.2, claude-opus-4.5, etc.)
- Add new models: deepseek, r1, qwen, glm, sonar
- Document --synthesize, --websearch, --serial flags
- Document stdin piping, orch models, orch sessions
- Add --allow-expensive usage guidance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
parent c1f644e6a6
commit f8db8771ea
@ -1,3 +1,4 @@
{"id":"skills-0nl","title":"Update orch skill to match CLI v0.1.0","description":"The orch skill (in ~/.claude/skills/orch/) is out of sync with the orch CLI.\n\nNeeded updates:\n- Fix model aliases: gpt-5 → gpt-5.2, claude-opus-4.1 → claude-opus-4.5\n- Add new aliases: deepseek, r1, qwen, qwen-fast, glm, sonar, sonar-pro\n- Document --synthesize flag for response aggregation\n- Document stdin piping support\n- Document orch models command\n- Document orch sessions command\n- Add --websearch, --serial, --allow-expensive options\n\nReference: ~/proj/orch/README.md and src/orch/models_registry.py","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-23T21:11:46.294285184-05:00","updated_at":"2025-12-24T01:29:54.408882125-05:00","closed_at":"2025-12-24T01:29:54.408882125-05:00","close_reason":"Updated skill with current CLI features and model aliases"}
{"id":"skills-0og","title":"spec-review: Define output capture and audit trail","description":"Reviews happen in terminal then disappear. No audit trail, no diffable history.\n\nAdd:\n- Guidance to tee output to review file (e.g., specs/{branch}/review.md)\n- Standard location for gate check results\n- Template for recording decisions and rationale","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-15T00:23:23.705164812-08:00","updated_at":"2025-12-15T13:02:32.313084337-08:00","closed_at":"2025-12-15T13:02:32.313084337-08:00"}
{"id":"skills-1ig","title":"Brainstorm agent-friendly doc conventions","description":"# Agent-Friendly Doc Conventions - Hybrid Architecture\n\n## FINAL ARCHITECTURE: Vale + LLM Hybrid\n\n### Insight\n\u003e \"Good old deterministic testing (dumb robots) is the best way to keep in check LLMs (smart robots) at volume.\"\n\n### Split by Tool\n\n| Category | Rubrics | Tool |\n|----------|---------|------|\n| Vale-only | Format Integrity, Deterministic Instructions, Terminology Strictness, Token Efficiency | Fast, deterministic, CI-friendly |\n| Vale + LLM | Semantic Headings, Configuration Precision, Security Boundaries | Vale flags, LLM suggests fixes |\n| LLM-only | Contextual Independence, Code Executability, Execution Verification | Semantic understanding required |\n\n### Pipeline\n\n```\n┌─────────────────────────────────────────────────────────────┐\n│ Stage 1: Vale (deterministic, fast, free) │\n│ - Runs in CI on every commit │\n│ - Catches 40% of issues instantly │\n│ - No LLM cost for clean docs │\n└─────────────────────┬───────────────────────────────────────┘\n │ only if Vale passes\n ▼\n┌─────────────────────────────────────────────────────────────┐\n│ Stage 2: LLM Triage (cheap model) │\n│ - Evaluates 3 semantic rubrics │\n│ - Identifies which need patches │\n└─────────────────────┬───────────────────────────────────────┘\n │ only if issues found\n ▼\n┌─────────────────────────────────────────────────────────────┐\n│ Stage 3: LLM Specialists (capable model) │\n│ - One agent per failed rubric │\n│ - Generates patches │\n└─────────────────────────────────────────────────────────────┘\n```\n\n### Why This Works\n- Vale is battle-tested, fast, CI-native\n- LLM only fires when needed (adaptive cost)\n- Deterministic rules catch predictable issues\n- LLM handles semantic/contextual issues\n\n---\n\n## Vale Rules Needed\n\n### Format Integrity\n- Existence: code blocks without language tags\n- Regex for unclosed fences\n\n### Deterministic Instructions \n- Existence: hedging words (\"might\", \"may want to\", \"consider\", \"you could\")\n\n### Terminology Strictness\n- Consistency: flag term variations\n\n### Token Efficiency\n- Existence: filler phrases (\"In this section we will...\", \"As you may know...\")\n\n### Semantic Headings (partial)\n- Existence: banned headings (\"Overview\", \"Introduction\", \"Getting Started\")\n\n### Configuration Precision (partial)\n- Existence: vague versions (\"Python 3.x\", \"recent version\")\n\n### Security Boundaries (partial)\n- Existence: hardcoded API key patterns\n\n---\n\n## NEXT STEPS\n\n1. Create Vale style for doc-review rubrics\n2. Test Vale on sample docs\n3. Design LLM prompts for semantic rubrics only\n4. Wire into orch or standalone","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-04T14:02:04.898026177-08:00","updated_at":"2025-12-04T16:43:53.0608948-08:00","closed_at":"2025-12-04T16:43:53.0608948-08:00"}
{"id":"skills-1n3","title":"Set up agent skills for Gemini CLI","description":"The AI agent skills (worklog, web-search, etc.) configured in .skills are not currently working when using the Gemini CLI. \\n\\nObserved behavior:\\n- 'worklog' command not found even after 'direnv reload'.\\n- .envrc sources ~/proj/skills/bin/use-skills.sh, but skills are not accessible in the Gemini agent session.\\n\\nNeed to:\\n1. Investigate how Gemini CLI loads its environment compared to Claude Code.\\n2. Update 'use-skills.sh' or direnv configuration to support Gemini CLI.\\n3. Ensure skill symlinks/binaries are correctly in the PATH for Gemini.","status":"open","priority":2,"issue_type":"task","created_at":"2025-12-22T17:39:28.106296919-05:00","updated_at":"2025-12-22T17:39:28.106296919-05:00"}
@ -39,25 +39,37 @@ orch consensus "PROMPT" MODEL1 MODEL2 [MODEL3...]
```

**Model Aliases** (use these):
-- `flash` → gemini-2.5-flash-preview (fast, cheap)
-- `gemini` → gemini-3-pro-preview (strong reasoning)
-- `qwen` → qwen3-8b (fast, cheap)
-- `deepseek` → deepseek-v3 (balanced)
-- `r1` → deepseek-r1 (strongest reasoning)
-- `gpt` / `gpt5` → gpt-5.1 (strong reasoning)
-- `gpt4` → gpt-4o (legacy)
+| Alias | Model | Notes |
+|-------|-------|-------|
+| `flash` | gemini-3-flash-preview | Fast, free |
+| `gemini` | gemini-3-pro-preview | Strong reasoning, free |
+| `gpt` / `gpt5` | gpt-5.2 | Strong reasoning |
+| `gpt4` | gpt-4o | Legacy |
+| `claude` / `sonnet` | claude-sonnet-4.5 | Balanced (via OpenRouter) |
+| `haiku` | claude-haiku-4.5 | Fast, cheap |
+| `opus` | claude-opus-4.5 | Strongest, expensive |
+| `deepseek` | deepseek-v3.2 | Good value |
+| `r1` | deepseek-r1-0528 | Reasoning model, expensive |
+| `qwen` | qwen3-235b-a22b | Good value |
+| `qwen-fast` | qwen3-8b | Very fast/cheap |
+| `glm` | glm-4.7 | Reasoning capable |
+| `sonar` | perplexity/sonar | Web search built-in |
+| `sonar-pro` | perplexity/sonar-pro | Better web search |
+
+Use `orch models` to see all available models with pricing and status.

## Model Selection

-**Quick sanity check**: Use `flash qwen` for fast, cheap validation. Good for "am I missing something obvious?" checks.
+**Quick sanity check**: Use `flash qwen-fast` for fast, cheap validation. Good for "am I missing something obvious?" checks.

**Standard consensus**: Use `flash gemini deepseek` for balanced perspectives across providers. Default for most decisions.

-**Deep analysis**: Include `r1` or `gpt` when stakes are high or reasoning is complex. These models think longer but cost more.
+**Deep analysis**: Include `r1` or `gpt` when stakes are high or reasoning is complex. These models think longer but cost more. Use `--allow-expensive` for r1/opus.

-**Diverse viewpoints**: Mix providers (Google + DeepSeek + OpenAI) rather than multiple models from one provider. Different training leads to genuinely different perspectives.
+**Diverse viewpoints**: Mix providers (Google + DeepSeek + OpenAI + Anthropic) rather than multiple models from one provider. Different training leads to genuinely different perspectives.

-**Cost-conscious**: `flash` and `qwen` are 10-20x cheaper than premium models. Start cheap, escalate if needed.
+**Cost-conscious**: `flash` and `qwen-fast` are 10-100x cheaper than premium models. Start cheap, escalate if needed.
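The start-cheap, escalate-if-needed flow above can be sketched as two invocations (the prompt is illustrative; aliases and flags are the ones documented in this skill):

```shell
# Cheap first pass with the fast aliases
orch consensus "Is caching sessions in Redis reasonable here?" flash qwen-fast

# Escalate to reasoning models only if the cheap pass disagrees or feels shallow
orch consensus "Is caching sessions in Redis reasonable here?" gemini gpt r1 --allow-expensive
```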

**Options**:
- `--mode vote` (default) - Models give Support/Oppose/Neutral verdict
@ -65,13 +77,23 @@ orch consensus "PROMPT" MODEL1 MODEL2 [MODEL3...]
- `--mode critique` - Find flaws and weaknesses
- `--mode open` - Freeform responses, no structured output
- `--temperature 0.1` - Lower = more focused (default 0.1)
-- `--file PATH` - Include file as context
+- `--enhance` - Use AI to improve prompt before querying
+- `--file PATH` - Include file as context (can use multiple times)
+- `--websearch` - Enable web search (Gemini models only)
+- `--serial` - Run models in sequence instead of parallel
+- `--strategy` - Serial strategy: neutral (default), refine, debate, brainstorm
+- `--synthesize MODEL` - Aggregate all responses into summary using MODEL
+- `--allow-expensive` - Allow expensive/slow models (opus, r1)
+- `--timeout SECS` - Timeout per model (default 300)
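The options above compose; as an unverified sketch (this exact combination is not shown in the skill), a serial debate with a synthesized summary might look like:

```shell
# Hypothetical combination: run three models in sequence in debate mode,
# then have gemini condense the responses into one summary
orch consensus "Should we drop Python 3.9 support?" flash deepseek gpt \
  --serial --strategy debate \
  --synthesize gemini \
  --timeout 120
```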

**Stances** (devil's advocate):
Append `:for`, `:against`, or `:neutral` to bias a model's perspective:
```bash
-orch consensus "Should we rewrite in Rust?" gpt5:for deepseek:against gemini:neutral
+orch consensus "Should we rewrite in Rust?" gpt:for claude:against gemini:neutral
```

+**Stdin piping**:
+```bash
+cat code.py | orch consensus "Is this implementation correct?" flash gemini
+```

### orch chat
@ -81,6 +103,31 @@ Single-model conversation (when you don't need consensus):
orch chat "MESSAGE" --model gemini
```

+Options:
+- `--model MODEL` - Model to use (default: gemini)
+- `--session ID` - Continue a session
+- `--file PATH` - Attach file
+- `--websearch` / `--no-websearch` - Toggle search (default: on)
+- `--allow-expensive` - Allow expensive models
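A sketch of continuing a conversation with `--session` (the session ID is a placeholder; real IDs come from `orch sessions list`):

```shell
# Start a conversation
orch chat "Summarize the tradeoffs of SQLite vs Postgres" --model gemini

# Pick up the same thread later (substitute a real session ID)
orch chat "Which fits a single-writer workload?" --session SESSION_ID
```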

+### orch models
+
+List and inspect available models:
+```bash
+orch models                  # List all models with status
+orch models resolve <alias>  # Show details for specific alias
+```
+
+### orch sessions
+
+Manage conversation sessions:
+```bash
+orch sessions list           # List all sessions
+orch sessions show <id>      # Show session details
+orch sessions clean 7d       # Delete sessions older than 7 days
+orch sessions export <id>    # Export session as JSON
+```

## Usage Patterns

### Quick Second Opinion
@ -92,7 +139,7 @@ orch consensus "I think we should use SQLite for this because [reasons]. Is this
### Architecture Decision
When facing a tradeoff:
```bash
-orch consensus "Microservices vs monolith for a 3-person team building an e-commerce site?" flash gemini deepseek --mode vote
+orch consensus "Microservices vs monolith for a 3-person team building an e-commerce site?" flash gemini gpt --mode vote
```

### Code Review
@ -104,7 +151,7 @@ orch consensus "Is this error handling approach correct and complete?" flash gem
### Devil's Advocate
Get opposing viewpoints deliberately:
```bash
-orch consensus "Should we adopt Kubernetes?" gpt5:for deepseek:against flash:neutral
+orch consensus "Should we adopt Kubernetes?" gpt:for claude:against flash:neutral
```

### Brainstorm
@ -119,6 +166,18 @@ Find weaknesses before presenting:
orch consensus "What are the flaws in this API design?" flash gemini --file api-spec.yaml --mode critique
```

+### Synthesize Responses
+
+Get a unified summary from multiple perspectives:
+```bash
+orch consensus "Evaluate this architecture" flash gemini gpt --synthesize gemini
+```
+
+### Use Reasoning Models
+
+For complex analysis requiring deep thinking:
+```bash
+orch consensus "Analyze the security implications" r1 gemini --allow-expensive
+```

## Output Format

Vote mode returns structured verdicts:
@ -128,13 +187,13 @@ Vote mode returns structured verdicts:
│ SUPPORT: 2 OPPOSE: 1 NEUTRAL: 0 │
└─────────────────────────────────────────────────────────────┘

-[flash] gemini-2.5-flash - SUPPORT
+[flash] gemini-3-flash-preview - SUPPORT
Reasoning: ...

[gemini] gemini-3-pro-preview - SUPPORT
Reasoning: ...

-[deepseek] deepseek-v3 - OPPOSE
+[claude] claude-sonnet-4.5 - OPPOSE
Reasoning: ...
```
@ -142,15 +201,12 @@ Reasoning: ...

1. **Use for genuine uncertainty** - Don't use orch for trivial decisions or to avoid thinking
2. **Provide context** - Better prompts get better consensus; use `--file` when relevant
-3. **Choose models wisely** - flash/qwen for quick checks, r1/gpt for complex reasoning
+3. **Choose models wisely** - flash/qwen-fast for quick checks, r1/opus for complex reasoning
4. **Consider stances** - Devil's advocate is powerful for stress-testing ideas
5. **Parse the reasoning** - The verdict matters less than understanding the reasoning
+6. **Mind the cost** - opus and r1 require `--allow-expensive`; use cheaper models for iteration

## Requirements

- `orch` CLI installed (via home-manager or system packages)
-- API keys configured (OPENROUTER_KEY, GOOGLE_API_KEY, OPENAI_API_KEY)
-
-## Examples
-
-See `examples/` directory for sample outputs from different consensus modes.
+- API keys configured: GEMINI_API_KEY, OPENAI_API_KEY, OPENROUTER_KEY