# ADR-004: Trace Security and Redaction Policy

## Status

Draft (Revised)

## Context

Skill execution traces (wisps) capture:
- Environment variables
- Input arguments
- Tool calls with arguments
- File paths and contents
- Stdout/stderr

This data often contains secrets:
- API keys (AWS, GitHub, OpenAI)
- Tokens and passwords
- PII (usernames, emails, paths)
- Proprietary data

Wisps are gitignored but still risky:
- Local machine compromise
- Accidental sharing
- Squashing into public digests
- Elevation to published skills

## Decision

### Default-Deny Policy

Traces capture minimal information by default. Sensitive data requires explicit opt-in.

### HMAC-Based Redaction

Instead of plain `[REDACTED]`, use HMAC hashing to enable correlation without revealing values:

```yaml
# Format: [REDACTED:hmac:<first-8-chars-of-hmac>]
inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  other_token: "[REDACTED:hmac:a1b2c3d4]"      # Same value = same hash
  different_token: "[REDACTED:hmac:e5f6a7b8]"  # Different value
```

**Benefits:**
- Can detect whether the same secret was reused across executions that share a session key
- Can correlate inputs without knowing values
- HMAC key is per-session, not stored in trace

**Implementation:**
```python
import hashlib
import hmac

def redact_sensitive(value: str, session_key: bytes) -> str:
    """Redact value with HMAC for correlation."""
    # Keyed SHA-256; only the first 8 hex characters (32 bits) are kept as the tag.
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"
```

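A minimal usage sketch of the correlation property, assuming the session key is 32 random bytes created at session start and held only in memory. `new_session_key` is a hypothetical helper; this ADR does not specify how keys are generated.

```python
import hashlib
import hmac
import secrets

def new_session_key() -> bytes:
    # Hypothetical helper: a fresh random key per session, never written to the trace.
    return secrets.token_bytes(32)

def redact_sensitive(value: str, session_key: bytes) -> str:
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"

key_a = new_session_key()
key_b = new_session_key()

# Same value and same key always produce the same tag.
same_1 = redact_sensitive("hunter2", key_a)
same_2 = redact_sensitive("hunter2", key_a)

# Same value under another session's key: tags are not comparable across sessions.
other_session = redact_sensitive("hunter2", key_b)
```

Because the key never leaves the session, a tag only says "these two fields held the same value in this session"; it cannot be matched against traces from other sessions.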
### Environment Variables

**Default**: Only capture from allowlist.

```yaml
trace_env_allowlist:
  - USER
  - HOME
  - PROJECT
  - PWD
  - SHELL
  - TERM
  - LANG
  - TZ
```

**Never capture** (hardcoded denylist with glob patterns):
```yaml
trace_env_denylist:
  - "*_KEY"
  - "*_SECRET"
  - "*_TOKEN"
  - "*_PASSWORD"
  - "*_CREDENTIAL*"
  - "AWS_*"
  - "GITHUB_TOKEN"
  - "OPENAI_API_KEY"
  - "ANTHROPIC_API_KEY"
```

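The allowlist and denylist above could be combined roughly as follows. This is a sketch, not the actual implementation: `filter_env` is a hypothetical name, and it assumes the denylist takes precedence when a name matches both lists.

```python
from fnmatch import fnmatchcase

ENV_ALLOWLIST = {"USER", "HOME", "PROJECT", "PWD", "SHELL", "TERM", "LANG", "TZ"}
ENV_DENYLIST = [
    "*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "*_CREDENTIAL*",
    "AWS_*", "GITHUB_TOKEN", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
]

def filter_env(env: dict[str, str]) -> dict[str, str]:
    """Return only the variables that may be written to a trace."""
    captured = {}
    for name, value in env.items():
        # Assumption: a denylist match always wins, even for allowlisted names.
        if any(fnmatchcase(name, pattern) for pattern in ENV_DENYLIST):
            continue
        if name in ENV_ALLOWLIST:
            captured[name] = value
    return captured

sample = {
    "USER": "alice",
    "AWS_REGION": "us-east-1",
    "GITHUB_TOKEN": "ghp_example",
    "EDITOR": "vim",  # neither allowed nor denied: dropped by default-deny
}
result = filter_env(sample)
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform.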
### Input Arguments

**Default**: Input values are NOT captured in plain text unless explicitly marked safe.

Skills must opt in to capturing an input in the manifest:

```yaml
inputs:
  required:
    - name: api_token
      type: string
      sensitive: true  # Default, will be HMAC-redacted
    - name: project_name
      type: string
      sensitive: false  # Explicitly safe to capture
```

**Rationale**: Safer to miss debugging data than leak secrets. Most inputs can be reconstructed from context.

Trace output:
```yaml
inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  project_name: "my-project"  # Only captured because sensitive: false
```

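As an illustration of the opt-in rule, the sketch below applies the `sensitive` flag (defaulting to `true`) to a set of input values. `capture_inputs` is a hypothetical name, and the manifest is assumed to be already parsed into a list of dictionaries.

```python
import hashlib
import hmac

def redact_sensitive(value: str, session_key: bytes) -> str:
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"

def capture_inputs(manifest_inputs: list[dict], values: dict[str, str],
                   session_key: bytes) -> dict[str, str]:
    """Build the `inputs` section of a trace from declared inputs and their values."""
    declared = {item["name"]: item for item in manifest_inputs}
    captured = {}
    for name, value in values.items():
        # Undeclared inputs and inputs without the flag are treated as sensitive.
        if declared.get(name, {}).get("sensitive", True):
            captured[name] = redact_sensitive(value, session_key)
        else:
            captured[name] = value
    return captured

manifest = [
    {"name": "api_token", "type": "string"},  # no flag: sensitive by default
    {"name": "project_name", "type": "string", "sensitive": False},
]
trace_inputs = capture_inputs(
    manifest,
    {"api_token": "s3cr3t", "project_name": "my-project", "extra": "undeclared"},
    session_key=b"demo-key-for-illustration-only",
)
```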
### Tool Calls

**Default**: Capture command name and exit code. Parse arguments structurally.

#### Structured Argument Parsing

Instead of regex on raw strings, parse arguments properly:

```yaml
tool_calls:
  - cmd: "curl"
    parsed_args:
      url: "https://api.example.com/endpoint"
      headers:
        - name: "Authorization"
          value: "[REDACTED:hmac:b2c3d4e5]"
        - name: "Content-Type"
          value: "application/json"
      method: "POST"
    exit_code: 0
    duration_ms: 1234
```

For commands we don't have parsers for, fall back to pattern redaction:

```yaml
tool_calls:
  - cmd: "unknown-tool"
    raw_args: "--token [REDACTED:hmac:c3d4e5f6] --output file.txt"
    exit_code: 0
```

#### Known Command Parsers

Implement argument parsers for common commands:
- `curl`: Parse `-H`, `--header`, `-u`, `--user`, `-d`, `--data`
- `git`: Parse credentials in URLs, `-c` config values
- `aws`: Parse `--profile`, environment-based auth
- `docker`: Parse `-e`, `--env`, registry auth

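To make the idea concrete, here is a deliberately small sketch of what a `curl` parser might look like. `parse_curl` and the `SENSITIVE_HEADERS` set are assumptions for illustration; a real parser would need to know every curl option that takes a value, which this one does not.

```python
import hashlib
import hmac
import shlex

# Assumed set of header names whose values are always redacted.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "cookie"}

def redact_sensitive(value: str, session_key: bytes) -> str:
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"

def parse_curl(command: str, session_key: bytes) -> dict:
    """Parse a curl command line into a structure with secrets redacted."""
    tokens = iter(shlex.split(command)[1:])  # drop the "curl" executable name
    parsed: dict = {"url": None, "headers": []}
    method = None
    for tok in tokens:
        if tok in ("-H", "--header"):
            name, _, value = next(tokens, "").partition(":")
            name, value = name.strip(), value.strip()
            if name.lower() in SENSITIVE_HEADERS:
                value = redact_sensitive(value, session_key)
            parsed["headers"].append({"name": name, "value": value})
        elif tok in ("-u", "--user"):
            parsed["user"] = redact_sensitive(next(tokens, ""), session_key)
        elif tok in ("-d", "--data"):
            parsed["data"] = redact_sensitive(next(tokens, ""), session_key)
        elif tok in ("-X", "--request"):
            method = next(tokens, None)
        elif not tok.startswith("-"):
            # Limitation: the value of an unrecognised option would land here too.
            parsed["url"] = tok
    # curl sends POST when a body is supplied and no method is given, otherwise GET.
    parsed["method"] = method or ("POST" if "data" in parsed else "GET")
    return parsed

example = parse_curl(
    'curl -X POST -H "Authorization: Bearer abc123" '
    '-H "Content-Type: application/json" https://api.example.com/endpoint',
    session_key=b"demo-key-for-illustration-only",
)
```

Using `shlex.split` keeps quoted header values together as one token, which is the main advantage over running regexes on the raw command string.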
#### Fallback Redaction Patterns

For unparsed commands, apply pattern matching:
```
Bearer [^\s]+         → Bearer [REDACTED:hmac:...]
token=[^\s&]+         → token=[REDACTED:hmac:...]
password=[^\s&]+      → password=[REDACTED:hmac:...]
-p [^\s]+             → -p [REDACTED:hmac:...]
--password[= ][^\s]+  → --password [REDACTED:hmac:...]
```

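The patterns above might be applied along these lines. This is a sketch with a hypothetical `redact_raw_args` function; it keeps the original separator (`=` or space) rather than normalising it, and it guards against one pattern re-redacting a marker produced by an earlier one (for example `password=` followed by `--password[= ]`).

```python
import hashlib
import hmac
import re

def redact_sensitive(value: str, session_key: bytes) -> str:
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"

# Each pattern has two groups: the prefix to keep and the secret to replace.
FALLBACK_PATTERNS = [
    re.compile(r"(Bearer )(\S+)"),
    re.compile(r"(token=)([^\s&]+)"),
    re.compile(r"(password=)([^\s&]+)"),
    re.compile(r"(-p )(\S+)"),
    re.compile(r"(--password[= ])(\S+)"),
]

def redact_raw_args(raw_args: str, session_key: bytes) -> str:
    """Apply every fallback pattern in turn to an unparsed argument string."""
    def replace(match: re.Match) -> str:
        prefix, secret = match.group(1), match.group(2)
        if secret.startswith("[REDACTED:"):
            return match.group(0)  # already handled by an earlier pattern
        return prefix + redact_sensitive(secret, session_key)

    for pattern in FALLBACK_PATTERNS:
        raw_args = pattern.sub(replace, raw_args)
    return raw_args

redacted = redact_raw_args("--password=hunter2 --output file.txt", b"demo-key")
```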
### Entropy Detection

Catch secrets that slip through pattern matching:

```python
import math
from collections import Counter

def entropy(s: str) -> float:
    """Calculate Shannon entropy of string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    probs = [c / len(s) for c in counts.values()]
    return -sum(p * math.log2(p) for p in probs)

def looks_like_secret(value: str) -> bool:
    """Heuristic: high entropy + sufficient length = probably secret."""
    if len(value) < 16:
        return False
    # Long random strings over a large alphabet typically score > 4.5.
    # Entropy is at most log2(distinct characters), so a value needs at least
    # 23 distinct characters to pass; hex-only strings (max 4.0) never do.
    return entropy(value) > 4.5
```

Apply entropy detection to:
- Unrecognized command arguments
- Environment variable values (before allowlist check)
- Input values marked `sensitive: false` (as safety check)

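A short sketch of applying the heuristic to unrecognized command arguments, token by token. `redact_high_entropy_tokens` is a hypothetical name, and splitting on whitespace is a simplifying assumption (it also normalises runs of spaces). The two sample values show one consequence of the 4.5 threshold: a hex string has at most 16 distinct characters, so it scores at most 4.0 and is not flagged.

```python
import hashlib
import hmac
import math
from collections import Counter

def entropy(s: str) -> float:
    if not s:
        return 0.0
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

def looks_like_secret(value: str, threshold: float = 4.5, min_length: int = 16) -> bool:
    return len(value) >= min_length and entropy(value) > threshold

def redact_sensitive(value: str, session_key: bytes) -> str:
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"

def redact_high_entropy_tokens(raw_args: str, session_key: bytes) -> str:
    """Replace whitespace-separated tokens that look like secrets."""
    tokens = [
        redact_sensitive(tok, session_key) if looks_like_secret(tok) else tok
        for tok in raw_args.split()
    ]
    return " ".join(tokens)

random_like = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef"  # 32 distinct characters
hex_like = "0123456789abcdef0123456789abcdef"     # only 16 distinct characters

result = redact_high_entropy_tokens(
    f"--output file.txt {random_like} {hex_like}", b"demo-key"
)
```

If hex-encoded secrets matter in practice, the threshold in `.beads/config.yaml` would need to be lowered or complemented by a length-and-charset rule; that choice is outside what this ADR specifies.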
### Stdin

**Never capture stdin.** Sensitive data is often piped:
- Passwords via `echo $PASS | command`
- API responses with tokens
- File contents with secrets

```yaml
tool_calls:
  - cmd: "some-command"
    stdin: "[NOT_CAPTURED]"  # Always this value
    exit_code: 0
```

### File Contents

**Default**: Never capture file contents.

Only capture:
- File path (sanitized per ADR-003)
- File size
- Content hash (sha256)
- Action (created/modified/deleted)

```yaml
outputs:
  artifacts:
    - path: "docs/worklogs/2025-12-23.org"
      size: 2048
      sha256: "abc123..."
      action: created
      # content: NOT CAPTURED
```

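The artifact record above could be produced without ever holding the file contents in the trace, for example as sketched below. `describe_artifact` is a hypothetical helper; it covers created and modified files only (a deleted file has nothing left to hash) and leaves path sanitization out of scope.

```python
import hashlib
from pathlib import Path

def describe_artifact(path: Path, action: str) -> dict:
    """Return trace metadata for an existing file: path, size, hash, action."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return {
        "path": str(path),  # assumption: sanitized elsewhere before storage
        "size": path.stat().st_size,
        "sha256": digest.hexdigest(),
        "action": action,
    }
```

Calling `describe_artifact(Path("docs/worklogs/2025-12-23.org"), "created")` would yield a dictionary with the same four keys as the YAML entry above and no `content` key.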
### Stdout/Stderr

**Default**: Not captured.

**Opt-in**: Skill can enable with automatic redaction:

```yaml
execution:
  capture_output: true
  output_max_lines: 100
```

Before storage, captured output is run through:
1. Pattern redaction
2. Entropy detection
3. HMAC replacement

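One way to picture the pipeline is as an ordered list of redactor functions applied to each line. The sketch below is an assumption-heavy illustration: `sanitize_output` and `bearer_redactor` are hypothetical names, only one pattern redactor is shown, and truncation keeps the first `output_max_lines` lines (the ADR does not say whether the head or the tail is kept).

```python
import hashlib
import hmac
import re
from typing import Callable, Sequence

Redactor = Callable[[str], str]

def redact_sensitive(value: str, session_key: bytes) -> str:
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"

def bearer_redactor(session_key: bytes) -> Redactor:
    """Build a pattern redactor that replaces Bearer tokens with HMAC tags."""
    pattern = re.compile(r"(Bearer )(\S+)")
    return lambda line: pattern.sub(
        lambda m: m.group(1) + redact_sensitive(m.group(2), session_key), line
    )

def sanitize_output(text: str, max_lines: int, redactors: Sequence[Redactor]) -> str:
    """Truncate, then run every redactor over every remaining line, in order."""
    cleaned = []
    for line in text.splitlines()[:max_lines]:  # assumed: keep the head
        for redact in redactors:
            line = redact(line)
        cleaned.append(line)
    return "\n".join(cleaned)

raw = "ok\nAuthorization: Bearer abc123\nthird line"
stored = sanitize_output(raw, max_lines=2, redactors=[bearer_redactor(b"demo-key")])
```

An entropy-based redactor would be appended to the same list, so that it only sees text the pattern stage has already processed.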
### Trace Modes

Support different capture levels for different contexts:

| Mode | Use Case | Capture Level |
|------|----------|---------------|
| `local` | Debugging on your machine | More permissive, still redacts secrets |
| `export` | Sharing with others | Aggressive redaction, path sanitization |
| `elevation` | Promoting to skill | Maximum redaction, human review required |

```yaml
# In trace metadata
trace:
  mode: local
  redaction_version: "1.0"
  session_key_id: "abc123"  # For HMAC correlation within session
```

**Mode transitions:**
```bash
# Local trace (default)
bd wisp show <id>

# Export for sharing
bd wisp export <id> --mode=export > trace.yaml

# Prepare for elevation
bd wisp export <id> --mode=elevation > trace.yaml
# → Requires manual review before elevation proceeds
```

### Classification Levels

Skills declare classification in manifest:

| Level | Description | Trace Policy |
|-------|-------------|--------------|
| `public` | Safe to share externally | Standard redaction, can elevate |
| `internal` | Normal internal use | Standard redaction, elevation requires review |
| `secret` | Contains sensitive data | Maximum redaction, elevation blocked |

```yaml
# In SKILL.md frontmatter
classification: internal
```

**Behavior by classification**:

- `public`: Standard tracing, eligible for elevation
- `internal`: Standard tracing, elevation requires `--force` and review
- `secret`:
  - All inputs treated as sensitive
  - Env vars: Only allowlist
  - Tool args: Maximum redaction + entropy detection
  - Elevation: Blocked entirely

### Elevation Gate

When elevating a molecule to a skill:

1. Check skill classification
2. If `secret`: Block with error
3. If `internal`: Warn, require `--force`, show redaction summary
4. If `public`: Proceed with standard review

```bash
$ bd elevate mol-123
Error: Molecule used skill with classification=secret
Cannot elevate without manual review

$ bd elevate mol-456
Warning: Molecule used internal skill.
Redacted fields: api_token, auth_header, 3 env vars
Review trace for sensitive data.
Use --force to proceed.

$ bd elevate mol-456 --force
Elevated to skill draft: skills/new-skill/
Please review before publishing.
```

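The four steps above amount to a small decision function. The sketch below is illustrative only: `GateDecision` and `elevation_gate` are hypothetical names, the messages are paraphrased from the sample session, and rejecting an unknown classification is an assumption made in the spirit of the default-deny policy.

```python
from dataclasses import dataclass

@dataclass
class GateDecision:
    allowed: bool
    message: str

def elevation_gate(classification: str, force: bool = False) -> GateDecision:
    """Decide whether a molecule may be elevated, based on skill classification."""
    if classification == "secret":
        # --force does not override a secret classification.
        return GateDecision(False, "Error: Molecule used skill with classification=secret")
    if classification == "internal":
        if not force:
            return GateDecision(False, "Warning: Molecule used internal skill. Use --force to proceed.")
        return GateDecision(True, "Elevated to skill draft. Please review before publishing.")
    if classification == "public":
        return GateDecision(True, "Proceeding with standard review.")
    # Assumption: anything unrecognised is rejected rather than treated as public.
    return GateDecision(False, f"Error: unknown classification {classification!r}")
```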
### Configuration

Users can extend allowlist/denylist in `.beads/config.yaml`:

```yaml
trace:
  mode: local  # default mode
  env_allowlist:
    - MY_SAFE_VAR
  env_denylist:
    - MY_SECRET_*
  redact_patterns:
    - "my-api-key-[a-z0-9]+"
  entropy_threshold: 4.5  # Adjust sensitivity
```

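Since users extend rather than replace the built-in lists, merging could look like the sketch below. `merge_trace_config` is a hypothetical name, both arguments are assumed to be the already-parsed `trace` mapping, and list entries from the user are appended to the defaults so the hardcoded denylist can never be removed.

```python
LIST_KEYS = ("env_allowlist", "env_denylist", "redact_patterns")
SCALAR_KEYS = ("mode", "entropy_threshold")

def merge_trace_config(defaults: dict, user: dict) -> dict:
    """Combine built-in trace settings with the user's `trace` section."""
    merged = dict(defaults)
    for key in LIST_KEYS:
        combined = list(defaults.get(key, [])) + list(user.get(key, []))
        merged[key] = list(dict.fromkeys(combined))  # de-duplicate, keep order
    for key in SCALAR_KEYS:
        if key in user:
            merged[key] = user[key]  # scalars are simply overridden
    return merged

defaults = {
    "mode": "local",
    "env_allowlist": ["USER", "HOME"],
    "env_denylist": ["*_KEY"],
    "redact_patterns": [],
    "entropy_threshold": 4.5,
}
user = {
    "env_allowlist": ["MY_SAFE_VAR"],
    "env_denylist": ["MY_SECRET_*"],
    "redact_patterns": ["my-api-key-[a-z0-9]+"],
    "entropy_threshold": 4.0,
}
merged = merge_trace_config(defaults, user)
```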
## Consequences

### Positive

- Secrets don't leak into traces by default
- HMAC enables correlation without revealing values
- Entropy detection catches novel secret patterns
- Structured parsing more reliable than regex
- Clear mode separation for different contexts
- Defense in depth (patterns + entropy + opt-in)

### Negative

- Less data available for debugging (especially in export mode)
- HMAC adds computational overhead
- Entropy detection may have false positives
- Structured parsers need maintenance per command
- Configuration complexity

### Neutral

- Existing wisps unaffected (new policy applies going forward)
- Trade-off between utility and safety favors safety
- Local mode still provides reasonable debugging data

## Implementation Checklist

- [ ] Implement HMAC redaction with session keys
- [ ] Implement env var filtering with allowlist/denylist
- [ ] Add `sensitive` field support to manifest parser (default true)
- [ ] Build structured argument parsers for curl, git, aws, docker
- [ ] Implement fallback pattern redaction
- [ ] Implement entropy detection
- [ ] Add stdin never-capture enforcement
- [ ] Implement trace modes (local/export/elevation)
- [ ] Add classification field to manifest
- [ ] Implement elevation gate with redaction summary
- [ ] Add config.yaml trace section support
- [ ] Document patterns, allowlists, and entropy thresholds

## Open Questions

1. Should HMAC keys be derivable from trace metadata for authorized replay?
2. How to handle secrets in multi-line values (JSON blobs, certificates)?
3. Should we offer a "paranoid mode" that captures nothing but exit codes?
4. How to detect and handle base64-encoded secrets?