skills/docs/adr/004-trace-security-redaction.md
dan c1f644e6a6 ADRs: add skill manifest, versioning, and trace security designs
- ADR-002: Skill manifest format with JSON Schema, path bases, preconditions
- ADR-003: Versioning with Nix store paths, lockfiles, interface contracts
- ADR-004: Trace security with HMAC redaction, entropy detection, trace modes

Refined based on orch consensus feedback from GPT and Gemini.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 20:55:18 -05:00

ADR-004: Trace Security and Redaction Policy

Status

Draft (Revised)

Context

Skill execution traces (wisps) capture:

  • Environment variables
  • Input arguments
  • Tool calls with arguments
  • File paths and contents
  • Stdout/stderr

This data often contains secrets:

  • API keys (AWS, GitHub, OpenAI)
  • Tokens and passwords
  • PII (usernames, emails, paths)
  • Proprietary data

Wisps are gitignored but still risky:

  • Local machine compromise
  • Accidental sharing
  • Squashing into public digests
  • Elevation to published skills

Decision

Default-Deny Policy

Traces capture minimal information by default. Sensitive data requires explicit opt-in.

HMAC-Based Redaction

Instead of plain [REDACTED], use HMAC hashing to enable correlation without revealing values:

# Format: [REDACTED:hmac:<first-8-chars-of-hmac>]
inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  other_token: "[REDACTED:hmac:a1b2c3d4]"  # Same value = same hash
  different_token: "[REDACTED:hmac:e5f6a7b8]"  # Different value = different hash

Benefits:

  • Can detect whether the same secret was used across executions in the same session
  • Can correlate inputs without knowing their values
  • HMAC key is per-session and not stored in the trace, so hashes from different sessions are not comparable

Implementation:

import hmac
import hashlib

def redact_sensitive(value: str, session_key: bytes) -> str:
    """Redact value with HMAC for correlation."""
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"
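
A minimal usage sketch of the per-session key. It assumes the key is generated with `secrets.token_bytes` at session start and held only in memory; the ADR does not say how keys are created, and `new_session_key` is a hypothetical helper. `redact_sensitive` is repeated so the example runs on its own:

```python
import hashlib
import hmac
import secrets


def new_session_key() -> bytes:
    """Hypothetical helper: a random 256-bit key, kept in memory only."""
    return secrets.token_bytes(32)


def redact_sensitive(value: str, session_key: bytes) -> str:
    """Redact value with HMAC for correlation."""
    h = hmac.new(session_key, value.encode(), hashlib.sha256)
    return f"[REDACTED:hmac:{h.hexdigest()[:8]}]"


key_a = new_session_key()
key_b = new_session_key()

tag_1 = redact_sensitive("hunter2", key_a)
tag_2 = redact_sensitive("hunter2", key_a)  # same value, same session
tag_3 = redact_sensitive("hunter2", key_b)  # same value, different session

print(tag_1 == tag_2)  # True: tags correlate within a session
print(tag_1 == tag_3)  # a different key yields an unrelated tag
```

The 8-character prefix carries only 32 bits, so two unrelated values can occasionally share a tag. A match is best read as a strong hint, not proof.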

Environment Variables

Default: Only capture from allowlist.

trace_env_allowlist:
  - USER
  - HOME
  - PROJECT
  - PWD
  - SHELL
  - TERM
  - LANG
  - TZ

Never capture (hardcoded denylist with glob patterns):

trace_env_denylist:
  - "*_KEY"
  - "*_SECRET"
  - "*_TOKEN"
  - "*_PASSWORD"
  - "*_CREDENTIAL*"
  - "AWS_*"
  - "GITHUB_TOKEN"
  - "OPENAI_API_KEY"
  - "ANTHROPIC_API_KEY"
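
One way the two lists could be combined, as a sketch only. It assumes denylist patterns are shell-style globs matched case-sensitively against the variable name, and that a denylist match always overrides the allowlist; `filter_env` is a hypothetical name:

```python
from fnmatch import fnmatchcase

ALLOWLIST = ["USER", "HOME", "PROJECT", "PWD", "SHELL", "TERM", "LANG", "TZ"]
DENYLIST = [
    "*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "*_CREDENTIAL*",
    "AWS_*", "GITHUB_TOKEN", "OPENAI_API_KEY", "ANTHROPIC_API_KEY",
]


def filter_env(env: dict[str, str]) -> dict[str, str]:
    """Keep allowlisted variables, dropping any whose name matches the denylist."""
    kept = {}
    for name, value in env.items():
        if any(fnmatchcase(name, pattern) for pattern in DENYLIST):
            continue  # assumption: the denylist always wins
        if name in ALLOWLIST:
            kept[name] = value
    return kept


sample = {
    "USER": "dan",
    "EDITOR": "vim",            # not allowlisted, dropped
    "AWS_REGION": "us-east-1",  # matches AWS_*, dropped
    "GITHUB_TOKEN": "ghp_x",    # matches *_TOKEN, dropped
}
print(filter_env(sample))  # {'USER': 'dan'}
```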

Input Arguments

Default: Inputs are NOT captured unless explicitly marked safe.

Skills must opt in to capturing inputs in the manifest:

inputs:
  required:
    - name: api_token
      type: string
      sensitive: true    # Default, will be HMAC-redacted
    - name: project_name
      type: string
      sensitive: false   # Explicitly safe to capture

Rationale: Safer to miss debugging data than leak secrets. Most inputs can be reconstructed from context.

Trace output:

inputs:
  api_token: "[REDACTED:hmac:a1b2c3d4]"
  project_name: "my-project"  # Only captured because sensitive: false
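
A sketch of how the manifest flags could drive capture. It assumes the manifest's `required` list has already been parsed into a list of dicts, and that inputs missing from the manifest are treated as sensitive; `capture_inputs` is a hypothetical name:

```python
import hashlib
import hmac


def capture_inputs(manifest_inputs, values, session_key):
    """Return trace-safe inputs; anything not marked sensitive: false is redacted."""
    sensitivity = {spec["name"]: spec.get("sensitive", True) for spec in manifest_inputs}
    captured = {}
    for name, value in values.items():
        if sensitivity.get(name, True):  # assumption: undeclared inputs are sensitive
            digest = hmac.new(session_key, str(value).encode(), hashlib.sha256).hexdigest()
            captured[name] = f"[REDACTED:hmac:{digest[:8]}]"
        else:
            captured[name] = value
    return captured


manifest = [
    {"name": "api_token", "type": "string"},  # sensitive defaults to true
    {"name": "project_name", "type": "string", "sensitive": False},
]
trace_inputs = capture_inputs(
    manifest,
    {"api_token": "sk-123", "project_name": "my-project"},
    session_key=b"example-session-key",
)
print(trace_inputs["project_name"])  # my-project
```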

Tool Calls

Default: Capture command name and exit code. Parse arguments structurally.

Structured Argument Parsing

Instead of regex on raw strings, parse arguments properly:

tool_calls:
  - cmd: "curl"
    parsed_args:
      url: "https://api.example.com/endpoint"
      headers:
        - name: "Authorization"
          value: "[REDACTED:hmac:b2c3d4e5]"
        - name: "Content-Type"
          value: "application/json"
      method: "POST"
    exit_code: 0
    duration_ms: 1234

For commands we don't have parsers for, fall back to pattern redaction:

tool_calls:
  - cmd: "unknown-tool"
    raw_args: "--token [REDACTED:hmac:c3d4e5f6] --output file.txt"
    exit_code: 0

Known Command Parsers

Implement argument parsers for common commands:

  • curl: Parse -H, --header, -u, --user, -d, --data
  • git: Parse credentials in URLs, -c config values
  • aws: Parse --profile, environment-based auth
  • docker: Parse -e, --env, registry auth
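
As an illustration of the curl case, the sketch below tokenizes a command line with `shlex` and redacts sensitive headers. It handles only the space-separated `-H` / `--header` forms; the set of sensitive header names and the function names are assumptions, not part of this ADR:

```python
import hashlib
import hmac
import shlex

# Illustrative set only; the real list would be a policy decision.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "cookie", "x-api-key"}


def _tag(value: str, key: bytes) -> str:
    digest = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()
    return f"[REDACTED:hmac:{digest[:8]}]"


def parse_curl_headers(command: str, session_key: bytes) -> list[dict]:
    """Extract -H/--header arguments from a curl command, redacting sensitive ones."""
    argv = shlex.split(command)
    headers = []
    i = 0
    while i < len(argv):
        if argv[i] in ("-H", "--header") and i + 1 < len(argv):
            name, _, value = argv[i + 1].partition(":")
            name, value = name.strip(), value.strip()
            if name.lower() in SENSITIVE_HEADERS:
                value = _tag(value, session_key)
            headers.append({"name": name, "value": value})
            i += 1  # skip the header value we just consumed
        i += 1
    return headers


cmd = (
    'curl -X POST -H "Authorization: Bearer abc123" '
    '-H "Content-Type: application/json" https://api.example.com/endpoint'
)
parsed = parse_curl_headers(cmd, b"example-session-key")
print(parsed[1])  # {'name': 'Content-Type', 'value': 'application/json'}
```

Working on tokenized arguments means quoting and spacing are resolved before redaction, which is the main advantage over matching the raw string.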

Fallback Redaction Patterns

For unparsed commands, apply pattern matching:

Bearer [^\s]+        → Bearer [REDACTED:hmac:...]
token=[^\s&]+        → token=[REDACTED:hmac:...]
password=[^\s&]+     → password=[REDACTED:hmac:...]
-p [^\s]+            → -p [REDACTED:hmac:...]
--password[= ][^\s]+ → --password [REDACTED:hmac:...]
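
The patterns above could be applied roughly as follows. This is a sketch: the guard against re-redacting an existing tag is an addition of this example, needed because `password=` and `--password[= ]` can both match the same argument:

```python
import hashlib
import hmac
import re

# Each pattern captures (prefix, secret), mirroring the list above.
PATTERNS = [
    re.compile(r"(Bearer )([^\s]+)"),
    re.compile(r"(token=)([^\s&]+)"),
    re.compile(r"(password=)([^\s&]+)"),
    re.compile(r"(-p )([^\s]+)"),
    re.compile(r"(--password[= ])([^\s]+)"),
]


def redact_raw_args(raw: str, session_key: bytes) -> str:
    """Apply every fallback pattern, replacing matched secrets with HMAC tags."""

    def replace(match: re.Match) -> str:
        prefix, secret = match.group(1), match.group(2)
        if secret.startswith("[REDACTED:"):
            return match.group(0)  # already handled by an earlier pattern
        digest = hmac.new(session_key, secret.encode(), hashlib.sha256).hexdigest()
        return f"{prefix}[REDACTED:hmac:{digest[:8]}]"

    for pattern in PATTERNS:
        raw = pattern.sub(replace, raw)
    return raw


redacted = redact_raw_args("--token=abc123 --output file.txt", b"example-session-key")
print(redacted.endswith("--output file.txt"))  # True
```

Note that `-p [^\s]+` also matches non-secret uses of `-p` such as a port number. That over-redaction is consistent with the default-deny stance, but it costs some debugging detail.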

Entropy Detection

Catch secrets that slip through pattern matching:

import math
from collections import Counter

def entropy(s: str) -> float:
    """Calculate Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    probs = [c / len(s) for c in counts.values()]
    return -sum(p * math.log2(p) for p in probs)

def looks_like_secret(value: str) -> bool:
    """Heuristic: high entropy + sufficient length = probably secret."""
    if len(value) < 16:
        return False
    # Entropy is capped at log2(number of distinct characters), so a 4.5
    # threshold can only fire on values with at least 23 distinct characters;
    # hex-only values top out at 4.0 and never trigger it.
    return entropy(value) > 4.5

Apply entropy detection to:

  • Unrecognized command arguments
  • Environment variable values (before allowlist check)
  • Input values marked sensitive: false (as safety check)
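
A quick check of how the 4.5 threshold behaves. Per-character Shannon entropy cannot exceed log2 of the number of distinct characters in the string, which limits what the heuristic can catch. The sample strings are made up for illustration:

```python
import math
import string
from collections import Counter


def entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())


hex_like = "0123456789abcdef" * 2   # 32 chars, 16 distinct symbols
mixed = string.ascii_letters[:32]   # 32 chars, all distinct

print(entropy(hex_like))  # 4.0: the ceiling for hex-only values
print(entropy(mixed))     # 5.0: 32 distinct symbols

# Smallest number of distinct characters that can exceed the threshold.
print(math.ceil(2 ** 4.5))  # 23
```

Under these numbers, hex-encoded secrets and any value shorter than 23 characters pass the entropy check untouched, so they rely on the pattern and allowlist layers.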

Stdin

Never capture stdin. Sensitive data is often piped:

  • Passwords via echo $PASS | command
  • API responses with tokens
  • File contents with secrets

tool_calls:
  - cmd: "some-command"
    stdin: "[NOT_CAPTURED]"  # Always this value
    exit_code: 0

File Contents

Default: Never capture file contents.

Only capture:

  • File path (sanitized per ADR-003)
  • File size
  • Content hash (sha256)
  • Action (created/modified/deleted)

outputs:
  artifacts:
    - path: "docs/worklogs/2025-12-23.org"
      size: 2048
      sha256: "abc123..."
      action: created
      # content: NOT CAPTURED
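
A sketch of building such a record without reading the contents into the trace. `describe_artifact` is a hypothetical helper, path sanitization is left out, and a deleted file would need its size and hash taken before removal:

```python
import hashlib
from pathlib import Path


def describe_artifact(path: Path, action: str) -> dict:
    """Build a trace record for a file without storing its contents."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in chunks so large files are never held in memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return {
        "path": str(path),  # assumption: sanitized elsewhere before storage
        "size": path.stat().st_size,
        "sha256": digest.hexdigest(),
        "action": action,
    }
```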

Stdout/Stderr

Default: Not captured.

Opt-in: Skill can enable with automatic redaction:

execution:
  capture_output: true
  output_max_lines: 100

Before storage, captured output is run through:

  1. Pattern redaction
  2. Entropy detection
  3. HMAC replacement
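
The ordering can be sketched as a single function. Assumptions in this example: truncation keeps the first `output_max_lines` lines, one `Bearer` pattern stands in for the full pattern list, and the entropy pass works on space-separated tokens; all function names are hypothetical:

```python
import hashlib
import hmac
import math
import re
from collections import Counter

BEARER = re.compile(r"(Bearer )(\S+)")  # stand-in for the full pattern list


def _tag(secret: str, key: bytes) -> str:
    digest = hmac.new(key, secret.encode(), hashlib.sha256).hexdigest()
    return f"[REDACTED:hmac:{digest[:8]}]"


def _entropy(s: str) -> float:
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())


def _is_secret_like(token: str, threshold: float) -> bool:
    if token.startswith("[REDACTED:") or len(token) < 16:
        return False  # skip existing tags and short tokens
    return _entropy(token) > threshold


def redact_output(text: str, key: bytes, max_lines: int = 100, threshold: float = 4.5) -> str:
    """Truncate, apply pattern redaction, then an entropy pass; matches become HMAC tags."""
    redacted = []
    for line in text.splitlines()[:max_lines]:
        line = BEARER.sub(lambda m: m.group(1) + _tag(m.group(2), key), line)
        tokens = [
            _tag(tok, key) if _is_secret_like(tok, threshold) else tok
            for tok in line.split(" ")
        ]
        redacted.append(" ".join(tokens))
    return "\n".join(redacted)
```

Running patterns first keeps known formats tagged by their own value; the entropy pass then only sees what the patterns missed.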

Trace Modes

Support different capture levels for different contexts:

| Mode      | Use Case                  | Capture Level                            |
|-----------|---------------------------|------------------------------------------|
| local     | Debugging on your machine | More permissive, still redacts secrets   |
| export    | Sharing with others       | Aggressive redaction, path sanitization  |
| elevation | Promoting to skill        | Maximum redaction, human review required |

# In trace metadata
trace:
  mode: local
  redaction_version: "1.0"
  session_key_id: "abc123"  # For HMAC correlation within session

Mode transitions:

# Local trace (default)
bd wisp show <id>

# Export for sharing
bd wisp export <id> --mode=export > trace.yaml

# Prepare for elevation
bd wisp export <id> --mode=elevation > trace.yaml
# → Requires manual review before elevation proceeds

Classification Levels

Skills declare classification in manifest:

| Level    | Description              | Trace Policy                                  |
|----------|--------------------------|-----------------------------------------------|
| public   | Safe to share externally | Standard redaction, can elevate               |
| internal | Normal internal use      | Standard redaction, elevation requires review |
| secret   | Contains sensitive data  | Maximum redaction, elevation blocked          |

# In SKILL.md frontmatter
classification: internal

Behavior by classification:

  • public: Standard tracing, eligible for elevation
  • internal: Standard tracing, elevation requires --force and review
  • secret:
    • All inputs treated as sensitive
    • Env vars: Only allowlist
    • Tool args: Maximum redaction + entropy detection
    • Elevation: Blocked entirely

Elevation Gate

When elevating a molecule to a skill:

  1. Check skill classification
  2. If secret: Block with error
  3. If internal: Warn, require --force, show redaction summary
  4. If public: Proceed with standard review

$ bd elevate mol-123
Error: Molecule used skill with classification=secret
       Elevation is blocked for secret skills

$ bd elevate mol-456
Warning: Molecule used internal skill.
         Redacted fields: api_token, auth_header, 3 env vars
         Review trace for sensitive data.
         Use --force to proceed.

$ bd elevate mol-456 --force
Elevated to skill draft: skills/new-skill/
Please review before publishing.
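
The gate logic above can be sketched as a small decision function. It assumes a molecule may have used several skills and that the strictest classification wins; the `Decision` enum and function name are hypothetical:

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    NEEDS_FORCE = "needs_force"
    BLOCKED = "blocked"


def elevation_decision(classifications: list[str], force: bool = False) -> Decision:
    """Decide whether a molecule may be elevated, given the skills it used."""
    if "secret" in classifications:
        return Decision.BLOCKED  # --force does not override this
    if "internal" in classifications and not force:
        return Decision.NEEDS_FORCE
    return Decision.PROCEED


print(elevation_decision(["public", "internal"]))              # Decision.NEEDS_FORCE
print(elevation_decision(["public", "internal"], force=True))  # Decision.PROCEED
print(elevation_decision(["internal", "secret"], force=True))  # Decision.BLOCKED
```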

Configuration

Users can extend allowlist/denylist in .beads/config.yaml:

trace:
  mode: local  # default mode
  env_allowlist:
    - MY_SAFE_VAR
  env_denylist:
    - MY_SECRET_*
  redact_patterns:
    - "my-api-key-[a-z0-9]+"
  entropy_threshold: 4.5  # Adjust sensitivity
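
A sketch of how user settings might be layered on the built-in defaults. It assumes the `trace` section has already been loaded into a dict and that user entries are appended to the built-in lists, never replacing them; `merge_trace_config` is a hypothetical name and the default lists are abbreviated:

```python
DEFAULT_ALLOWLIST = ["USER", "HOME", "PROJECT", "PWD", "SHELL", "TERM", "LANG", "TZ"]
HARDCODED_DENYLIST = ["*_KEY", "*_SECRET", "*_TOKEN", "*_PASSWORD", "AWS_*"]  # abbreviated


def merge_trace_config(user: dict) -> dict:
    """Combine built-in defaults with the `trace` section of a user config."""
    return {
        "mode": user.get("mode", "local"),
        # Assumption: users can only add entries, never remove built-in ones.
        "env_allowlist": DEFAULT_ALLOWLIST + list(user.get("env_allowlist", [])),
        "env_denylist": HARDCODED_DENYLIST + list(user.get("env_denylist", [])),
        "redact_patterns": list(user.get("redact_patterns", [])),
        "entropy_threshold": float(user.get("entropy_threshold", 4.5)),
    }


merged = merge_trace_config({
    "env_allowlist": ["MY_SAFE_VAR"],
    "env_denylist": ["MY_SECRET_*"],
})
print(merged["mode"])  # local
```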

Consequences

Positive

  • Secrets don't leak into traces by default
  • HMAC enables correlation without revealing values
  • Entropy detection catches novel secret patterns
  • Structured parsing more reliable than regex
  • Clear mode separation for different contexts
  • Defense in depth (patterns + entropy + opt-in)

Negative

  • Less data available for debugging (especially in export mode)
  • HMAC adds computational overhead
  • Entropy detection may have false positives
  • Structured parsers need maintenance per command
  • Configuration complexity

Neutral

  • Existing wisps unaffected (new policy applies going forward)
  • Trade-off between utility and safety favors safety
  • Local mode still provides reasonable debugging data

Implementation Checklist

  • Implement HMAC redaction with session keys
  • Implement env var filtering with allowlist/denylist
  • Add sensitive field support to manifest parser (default true)
  • Build structured argument parsers for curl, git, aws, docker
  • Implement fallback pattern redaction
  • Implement entropy detection
  • Add stdin never-capture enforcement
  • Implement trace modes (local/export/elevation)
  • Add classification field to manifest
  • Implement elevation gate with redaction summary
  • Add config.yaml trace section support
  • Document patterns, allowlists, and entropy thresholds

Open Questions

  1. Should HMAC keys be derivable from trace metadata for authorized replay?
  2. How to handle secrets in multi-line values (JSON blobs, certificates)?
  3. Should we offer a "paranoid mode" that captures nothing but exit codes?
  4. How to detect and handle base64-encoded secrets?