Add scan_roles and whitelist documentation (#55)

author Stefan Gasser <redacted>

Tue, 20 Jan 2026 22:12:25 +0000 (23:12 +0100)

committer GitHub <redacted>

Tue, 20 Jan 2026 22:12:25 +0000 (23:12 +0100)
author Stefan Gasser <redacted>
Tue, 20 Jan 2026 22:12:25 +0000 (23:12 +0100)
committer GitHub <redacted>
Tue, 20 Jan 2026 22:12:25 +0000 (23:12 +0100)
diff --git a/docs/configuration/pii-detection.mdx b/docs/configuration/pii-detection.mdx

index 3c8ad6feddb79208b8d0dc75d83310794555a184..c8a7417e0f9854177fff6a46502fd1035f0f6aaf 100644 (file)
--- a/docs/configuration/pii-detection.mdx
+++ b/docs/configuration/pii-detection.mdx
@@ -97,3 +97,38 @@ pii_detection:
    # score_threshold: 0.5  # More aggressive
    # score_threshold: 0.9  # More conservative
  ```
+
+## Whitelist
+
+Exclude specific text patterns from PII masking. Useful for preventing false positives on company names or product identifiers.
+
+```yaml
+masking:
+  whitelist:
+    - "Acme Corp"
+    - "Product XYZ"
+```
+
+Patterns match bidirectionally - detected text containing a whitelist entry (or vice versa) is excluded.
+
+## Scan Roles
+
+By default, all message roles are scanned. To scan only user-controlled content:
+
+```yaml
+pii_detection:
+  scan_roles:
+    - user
+    - tool
+    - function
+```
+
+| Role | Description |
+|------|-------------|
+| `user` | User messages (primary source of PII) |
+| `assistant` | Assistant responses |
+| `system` | System prompts |
+| `tool` | Tool/function call results |
+| `function` | Legacy function results (OpenAI) |
+
+This reduces Presidio API calls for large system prompts and avoids false positives on app-controlled content.
diff --git a/docs/configuration/secrets-detection.mdx b/docs/configuration/secrets-detection.mdx

index d5b3f62691ac96504461bd1c35fc955cd99b4164..0b90bb403dd528d99684110953df8ecd68a41e1a 100644 (file)
--- a/docs/configuration/secrets-detection.mdx
+++ b/docs/configuration/secrets-detection.mdx
@@ -96,6 +96,26 @@ secrets_detection:
      - CONNECTION_STRING  # postgres://user:pass@host, mongodb://user:pass@host
  ```
  
+## Scan Roles
+
+By default, all message roles are scanned. To scan only user-controlled content:
+
+```yaml
+secrets_detection:
+  scan_roles:
+    - user
+    - tool
+    - function
+```
+
+| Role | Description |
+|------|-------------|
+| `user` | User messages (primary source of secrets) |
+| `assistant` | Assistant responses |
+| `system` | System prompts |
+| `tool` | Tool/function call results |
+| `function` | Legacy function results (OpenAI) |
+
  ## Performance
  
  For large payloads, limit scanning:
author	Stefan Gasser <redacted>
	Tue, 20 Jan 2026 22:12:25 +0000 (23:12 +0100)
committer	GitHub <redacted>
	Tue, 20 Jan 2026 22:12:25 +0000 (23:12 +0100)
docs/configuration/pii-detection.mdx		patch \| blob \| history
docs/configuration/secrets-detection.mdx		patch \| blob \| history