# ADR-004: Trace Security and Redaction Policy

## Status

Draft (Revised)

## Context

Skill execution traces (wisps) capture:
- Environment variables
- Input arguments
- Tool calls with arguments
- File paths and contents
- Stdout/stderr

This data often contains secrets:
- API keys (AWS, GitHub, OpenAI)
- Tokens and passwords
- PII (usernames, emails, paths)
- Proprietary data

Wisps are gitignored, but they can still leak through:
- Local machine compromise
- Accidental sharing
- Squashing into public digests
- Elevation to published skills

## Decision

### Default-Deny Policy

Traces capture minimal information by default. Sensitive data requires explicit opt-in.

### HMAC-Based Redaction

Instead of plain `[REDACTED]`, use HMAC hashing to enable correlation without revealing values:

```yaml
# Format: [REDACTED:hmac:<hash>]
inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  other_token: "[REDACTED:hmac:a1b2c3d4]"      # Same value = same hash
  different_token: "[REDACTED:hmac:e5f6a7b8]"  # Different value
```

**Benefits:**
- Can detect whether the same secret was used across executions within a session
- Can correlate inputs without knowing values
- HMAC key is per-session, not stored in trace

**Implementation:**

```python
import hmac
import hashlib


def redact_sensitive(value: str, session_key: bytes) -> str:
    """Redact value with HMAC for correlation."""
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    # Keep only the first 8 hex characters of the digest as the correlation tag.
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"
```

### Environment Variables

**Default**: Capture only variables on the allowlist.

```yaml
trace_env_allowlist:
  - USER
  - HOME
  - PROJECT
  - PWD
  - SHELL
  - TERM
  - LANG
  - TZ
```

**Never capture** (hardcoded denylist with glob patterns):

```yaml
trace_env_denylist:
  - "*_KEY"
  - "*_SECRET"
  - "*_TOKEN"
  - "*_PASSWORD"
  - "*_CREDENTIAL*"
  - "AWS_*"
  - "GITHUB_TOKEN"
  - "OPENAI_API_KEY"
  - "ANTHROPIC_API_KEY"
```

### Input Arguments

**Default**: Inputs are NOT captured unless explicitly marked safe.
Skills must opt in to capturing inputs in their manifest:

```yaml
inputs:
  required:
    - name: api_token
      type: string
      sensitive: true   # Default, will be HMAC-redacted
    - name: project_name
      type: string
      sensitive: false  # Explicitly safe to capture
```

**Rationale**: Safer to miss debugging data than leak secrets. Most inputs can be reconstructed from context.

Trace output:

```yaml
inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  project_name: "my-project"  # Only captured because sensitive: false
```

### Tool Calls

**Default**: Capture command name and exit code. Parse arguments structurally.

#### Structured Argument Parsing

Instead of regex on raw strings, parse arguments properly:

```yaml
tool_calls:
  - cmd: "curl"
    parsed_args:
      url: "https://api.example.com/endpoint"
      headers:
        - name: "Authorization"
          value: "[REDACTED:hmac:b2c3d4e5]"
        - name: "Content-Type"
          value: "application/json"
      method: "POST"
    exit_code: 0
    duration_ms: 1234
```

For commands we don't have parsers for, fall back to pattern redaction:

```yaml
tool_calls:
  - cmd: "unknown-tool"
    raw_args: "--token [REDACTED:hmac:c3d4e5f6] --output file.txt"
    exit_code: 0
```

#### Known Command Parsers

Implement argument parsers for common commands:
- `curl`: Parse `-H`, `--header`, `-u`, `--user`, `-d`, `--data`
- `git`: Parse credentials in URLs, `-c` config values
- `aws`: Parse `--profile`, environment-based auth
- `docker`: Parse `-e`, `--env`, registry auth

#### Fallback Redaction Patterns

For unparsed commands, apply pattern matching:

```
Bearer [^\s]+          → Bearer [REDACTED:hmac:...]
token=[^\s&]+          → token=[REDACTED:hmac:...]
password=[^\s&]+       → password=[REDACTED:hmac:...]
-p [^\s]+              → -p [REDACTED:hmac:...]
--password[= ][^\s]+   → --password [REDACTED:hmac:...]
```

### Entropy Detection

Catch secrets that slip through pattern matching:

```python
import math
from collections import Counter


def entropy(s: str) -> float:
    """Calculate Shannon entropy of string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    probs = [c / len(s) for c in counts.values()]
    return -sum(p * math.log2(p) for p in probs)


def looks_like_secret(value: str) -> bool:
    """Heuristic: high entropy + sufficient length = probably secret."""
    if len(value) < 16:
        return False
    # Long random strings over a large alphabet typically exceed 4.5.
    # A string of n characters has entropy at most log2(n), so values shorter
    # than 23 characters can never exceed 4.5, and hex strings top out at 4.0.
    return entropy(value) > 4.5
```

Apply entropy detection to:
- Unrecognized command arguments
- Environment variable values (before allowlist check)
- Input values marked `sensitive: false` (as safety check)

### Stdin

**Never capture stdin.** Sensitive data is often piped:
- Passwords via `echo $PASS | command`
- API responses with tokens
- File contents with secrets

```yaml
tool_calls:
  - cmd: "some-command"
    stdin: "[NOT_CAPTURED]"  # Always this value
    exit_code: 0
```

### File Contents

**Default**: Never capture file contents. Only capture:
- File path (sanitized per ADR-003)
- File size
- Content hash (sha256)
- Action (created/modified/deleted)

```yaml
outputs:
  artifacts:
    - path: "docs/worklogs/2025-12-23.org"
      size: 2048
      sha256: "abc123..."
      action: created
      # content: NOT CAPTURED
```

### Stdout/Stderr

**Default**: Not captured.

**Opt-in**: Skill can enable with automatic redaction:

```yaml
execution:
  capture_output: true
  output_max_lines: 100
```

Before storage, output is run through:
1. Pattern redaction
2. Entropy detection
3. HMAC replacement
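
The three output stages can be sketched as a single pass over captured lines. This is a minimal illustration, not the project's implementation: `redact_output`, `redact_line`, `hmac_tag`, and the `PATTERNS` subset are hypothetical names, tokens are split on single spaces for simplicity, and it assumes `output_max_lines` keeps the first N lines (the ADR does not say which end is kept).

```python
import hashlib
import hmac
import math
import re
from collections import Counter

# Hypothetical subset of the fallback patterns. Group 1 is the prefix to keep,
# group 2 is the secret to replace.
PATTERNS = [
    re.compile(r"(Bearer )(\S+)"),
    re.compile(r"(token=)([^\s&]+)"),
    re.compile(r"(password=)([^\s&]+)"),
]
# Matches a tag that an earlier stage already inserted.
MARKER = re.compile(r"\[REDACTED:hmac:[0-9a-f]{8}\]")


def hmac_tag(value: str, session_key: bytes) -> str:
    """Stage 3: replace a value with its truncated HMAC tag."""
    digest = hmac.new(session_key, value.encode(), hashlib.sha256).hexdigest()
    return f"[REDACTED:hmac:{digest[:8]}]"


def entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())


def redact_line(line: str, session_key: bytes, threshold: float = 4.5) -> str:
    # Stage 1: pattern redaction, substituting the HMAC tag for each match.
    for pattern in PATTERNS:
        line = pattern.sub(
            lambda m: m.group(1) + hmac_tag(m.group(2), session_key), line
        )
    # Stage 2: entropy detection on the remaining space-separated tokens.
    tokens = []
    for token in line.split(" "):
        already_tagged = MARKER.search(token) is not None
        if not already_tagged and len(token) >= 16 and entropy(token) > threshold:
            token = hmac_tag(token, session_key)
        tokens.append(token)
    return " ".join(tokens)


def redact_output(text: str, session_key: bytes, max_lines: int = 100) -> str:
    """Truncate to max_lines (assumed: first N lines), then redact each line."""
    lines = text.splitlines()[:max_lines]
    return "\n".join(redact_line(line, session_key) for line in lines)
```

Running patterns before entropy keeps recognizable prefixes such as `Bearer` in the trace, so a reader can still see what kind of value was redacted.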
### Trace Modes

Support different capture levels for different contexts:

| Mode | Use Case | Capture Level |
|------|----------|---------------|
| `local` | Debugging on your machine | More permissive, still redacts secrets |
| `export` | Sharing with others | Aggressive redaction, path sanitization |
| `elevation` | Promoting to skill | Maximum redaction, human review required |

```yaml
# In trace metadata
trace:
  mode: local
  redaction_version: "1.0"
  session_key_id: "abc123"  # For HMAC correlation within session
```

**Mode transitions:**

```bash
# Local trace (default)
bd wisp show

# Export for sharing
bd wisp export --mode=export > trace.yaml

# Prepare for elevation
bd wisp export --mode=elevation > trace.yaml
# → Requires manual review before elevation proceeds
```

### Classification Levels

Skills declare classification in manifest:

| Level | Description | Trace Policy |
|-------|-------------|--------------|
| `public` | Safe to share externally | Standard redaction, can elevate |
| `internal` | Normal internal use | Standard redaction, elevation requires review |
| `secret` | Contains sensitive data | Maximum redaction, elevation blocked |

```yaml
# In SKILL.md frontmatter
classification: internal
```

**Behavior by classification**:
- `public`: Standard tracing, eligible for elevation
- `internal`: Standard tracing, elevation requires `--force` and review
- `secret`:
  - All inputs treated as sensitive
  - Env vars: Only allowlist
  - Tool args: Maximum redaction + entropy detection
  - Elevation: Blocked entirely

### Elevation Gate

When elevating a molecule to a skill:

1. Check skill classification
2. If `secret`: Block with error
3. If `internal`: Warn, require `--force`, show redaction summary
4. If `public`: Proceed with standard review

```bash
$ bd elevate mol-123
Error: Molecule used skill with classification=secret
Elevation is blocked for secret skills.

$ bd elevate mol-456
Warning: Molecule used internal skill.
Redacted fields: api_token, auth_header, 3 env vars
Review trace for sensitive data.
Use --force to proceed.

$ bd elevate mol-456 --force
Elevated to skill draft: skills/new-skill/
Please review before publishing.
```

### Configuration

Users can extend allowlist/denylist in `.beads/config.yaml`:

```yaml
trace:
  mode: local  # default mode
  env_allowlist:
    - MY_SAFE_VAR
  env_denylist:
    - "MY_SECRET_*"
  redact_patterns:
    - "my-api-key-[a-z0-9]+"
  entropy_threshold: 4.5  # Adjust sensitivity
```

## Consequences

### Positive

- Secrets don't leak into traces by default
- HMAC enables correlation without revealing values
- Entropy detection catches novel secret patterns
- Structured parsing is more reliable than regex
- Clear mode separation for different contexts
- Defense in depth (patterns + entropy + opt-in)

### Negative

- Less data available for debugging (especially in export mode)
- HMAC adds computational overhead
- Entropy detection may have false positives
- Structured parsers need maintenance per command
- Configuration complexity

### Neutral

- Existing wisps unaffected (new policy applies going forward)
- Trade-off between utility and safety favors safety
- Local mode still provides reasonable debugging data

## Implementation Checklist

- [ ] Implement HMAC redaction with session keys
- [ ] Implement env var filtering with allowlist/denylist
- [ ] Add `sensitive` field support to manifest parser (default true)
- [ ] Build structured argument parsers for curl, git, aws, docker
- [ ] Implement fallback pattern redaction
- [ ] Implement entropy detection
- [ ] Add stdin never-capture enforcement
- [ ] Implement trace modes (local/export/elevation)
- [ ] Add classification field to manifest
- [ ] Implement elevation gate with redaction summary
- [ ] Add config.yaml trace section support
- [ ] Document patterns, allowlists, and entropy thresholds

## Open Questions

1. Should HMAC keys be derivable from trace metadata for authorized replay?
2. How to handle secrets in multi-line values (JSON blobs, certificates)?
3. Should we offer a "paranoid mode" that captures nothing but exit codes?
4. How to detect and handle base64-encoded secrets?