From: Stefan Gasser Date: Tue, 20 Jan 2026 22:12:25 +0000 (+0100) Subject: Add scan_roles and whitelist documentation (#55) X-Git-Url: http://git.99rst.org/?a=commitdiff_plain;h=64b96e8ef3a048b44e315de416c789972f8c90df;p=sgasser-llm-shield.git Add scan_roles and whitelist documentation (#55) --- diff --git a/docs/configuration/pii-detection.mdx b/docs/configuration/pii-detection.mdx index 3c8ad6f..c8a7417 100644 --- a/docs/configuration/pii-detection.mdx +++ b/docs/configuration/pii-detection.mdx @@ -97,3 +97,38 @@ pii_detection: # score_threshold: 0.5 # More aggressive # score_threshold: 0.9 # More conservative ``` + +## Whitelist + +Exclude specific text patterns from PII masking. Useful for preventing false positives on company names or product identifiers. + +```yaml +masking: + whitelist: + - "Acme Corp" + - "Product XYZ" +``` + +Patterns match bidirectionally - detected text containing a whitelist entry (or vice versa) is excluded. + +## Scan Roles + +By default, all message roles are scanned. To scan only user-controlled content: + +```yaml +pii_detection: + scan_roles: + - user + - tool + - function +``` + +| Role | Description | +|------|-------------| +| `user` | User messages (primary source of PII) | +| `assistant` | Assistant responses | +| `system` | System prompts | +| `tool` | Tool/function call results | +| `function` | Legacy function results (OpenAI) | + +This reduces Presidio API calls for large system prompts and avoids false positives on app-controlled content. diff --git a/docs/configuration/secrets-detection.mdx b/docs/configuration/secrets-detection.mdx index d5b3f62..0b90bb4 100644 --- a/docs/configuration/secrets-detection.mdx +++ b/docs/configuration/secrets-detection.mdx @@ -96,6 +96,26 @@ secrets_detection: - CONNECTION_STRING # postgres://user:pass@host, mongodb://user:pass@host ``` +## Scan Roles + +By default, all message roles are scanned. To scan only user-controlled content: + +```yaml +secrets_detection: + scan_roles: + - user + - tool + - function +``` + +| Role | Description | +|------|-------------| +| `user` | User messages (primary source of secrets) | +| `assistant` | Assistant responses | +| `system` | System prompts | +| `tool` | Tool/function call results | +| `function` | Legacy function results (OpenAI) | + ## Performance For large payloads, limit scanning: