From: Stefan Gasser Date: Sun, 11 Jan 2026 10:11:54 +0000 (+0100) Subject: Add Mintlify documentation and simplify README (#24) X-Git-Url: http://git.99rst.org/?a=commitdiff_plain;h=4f592d6a7f4123707773236afd1551cc53f69dc9;p=sgasser-llm-shield.git Add Mintlify documentation and simplify README (#24) * Add Mintlify documentation and simplify README Documentation: - Add complete Mintlify docs with introduction, quickstart, integrations - Add concept guides: mask mode, route mode, PII detection, secrets detection - Add API reference: chat completions, models, status, dashboard API - Add configuration guides: overview, providers, PII, secrets, logging - Include dashboard screenshot and branding assets README: - Simplify structure, move detailed docs to Mintlify - Add centered badges and navigation links - Add "What is PasteGuard?" section explaining problem/solution - Update example to Dr. Sarah Chen (consistent across all docs) - Reorder integrations (OpenAI SDK, LangChain, LlamaIndex first) - Move Presidio attribution inline with PII section - Add Tech Stack section Code: - Update description to "Privacy proxy for LLMs" in package.json, startup banner, and /info endpoint Closes #21 * docs: fix secrets default to block (matches current code) --- diff --git a/README.md b/README.md index 6e0732f..d12ff9c 100644 --- a/README.md +++ b/README.md @@ -2,303 +2,116 @@ PasteGuard

-[![CI](https://github.com/sgasser/pasteguard/actions/workflows/ci.yml/badge.svg)](https://github.com/sgasser/pasteguard/actions/workflows/ci.yml) -[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) - -Privacy proxy for LLMs. Masks personal data and secrets / credentials before sending to your provider (OpenAI, Azure, etc.), or routes sensitive requests to local LLM. +

+ CI + License +

-PasteGuard Dashboard +

+ Privacy proxy for LLMs. Masks personal data and secrets before sending to your provider. +

-## Mask Mode (Default) +

+ Quick Start · + Documentation · + Integrations +

-Replaces personal data with placeholders before sending to LLM. Unmasks the response automatically. +
-``` -You send: "Email john@acme.com about the meeting with Sarah Miller" -OpenAI receives: "Email about the meeting with " -OpenAI responds: "I'll contact to schedule with ..." -You receive: "I'll contact john@acme.com to schedule with Sarah Miller..." -``` +PasteGuard Dashboard -- No local GPU needed -- Supports streaming with real-time unmasking +## What is PasteGuard? -## Route Mode +When you use LLM APIs, every prompt is sent to external servers — including customer names, emails, and sensitive business data. Many organizations have policies against sending PII to third-party AI services. -Requests with personal data go to local LLM. Everything else goes to your provider. +PasteGuard is an OpenAI-compatible proxy that sits between your app and the LLM API. It detects personal data and secrets before they leave your network. -``` -"Help with code review" → OpenAI (best quality) -"Email john@acme.com about..." → Ollama (stays on your network) -``` +**Two ways to protect your data:** -- Requires local LLM (Ollama, vLLM, LocalAI) -- Full data isolation - personal data never leaves your network - -## What It Detects +- **Mask Mode** — Replace PII with placeholders, send to your provider, restore in response. No local infrastructure needed. +- **Route Mode** — Send PII requests to a local LLM (Ollama, vLLM, llama.cpp), everything else to your provider. Data never leaves your network. -### PII (Personal Identifiable Information) +Works with OpenAI, Azure, and any OpenAI-compatible API. Just change one URL. -| Type | Examples | -| ------------- | --------------------------- | -| Names | John Smith, Sarah Miller | -| Emails | john@acme.com | -| Phone numbers | +1 555 123 4567 | -| Credit cards | 4111-1111-1111-1111 | -| IBANs | DE89 3704 0044 0532 0130 00 | -| IP addresses | 192.168.1.1 | -| Locations | New York, Berlin | +## Features -Additional entity types can be enabled: `US_SSN`, `US_PASSPORT`, `CRYPTO`, `NRP`, `MEDICAL_LICENSE`, `URL`. 
+- **PII Detection** — Names, emails, phone numbers, credit cards, IBANs, and more
+- **Secrets Detection** — API keys, tokens, private keys caught before they reach the LLM
+- **Streaming Support** — Real-time unmasking as tokens arrive
+- **24 Languages** — Works in English, German, French, and 21 more
+- **OpenAI-Compatible** — Change one URL, keep your code
+- **Self-Hosted** — Your servers, your data stays yours
+- **Open Source** — Apache 2.0 license, full transparency
+- **Dashboard** — See every protected request in real-time

-### Secrets (Secrets Shield)
+## How It Works

-| Type | Pattern |
-| -------------------- | --------------------------------------------------------------------------------------------------------- |
-| OpenSSH private keys | `-----BEGIN OPENSSH PRIVATE KEY-----` |
-| PEM private keys | `-----BEGIN RSA PRIVATE KEY-----`, `-----BEGIN PRIVATE KEY-----`, `-----BEGIN ENCRYPTED PRIVATE KEY-----` |
-| OpenAI API keys | `sk-proj-...`, `sk-...` (48+ chars) |
-| AWS access keys | `AKIA...` (20 chars) |
-| GitHub tokens | `ghp_...`, `gho_...`, `ghu_...`, `ghs_...`, `ghr_...` |
-| JWT tokens | `eyJ...` (three base64 segments) |
-| Bearer tokens | `Bearer ...` (20+ char tokens) |
+```
+You send: "Write a follow-up email to Dr. Sarah Chen (sarah.chen@hospital.org)
+ about next week's project meeting"

-Secrets detection runs **before** PII detection. Three actions available:
-- **block** (default): Returns HTTP 400, request never reaches LLM
-- **redact**: Replaces secrets with placeholders, unredacts in response (reversible)
-- **route_local**: Routes to local LLM (route mode only)
+LLM receives: "Write a follow-up email to <PERSON_1> (<EMAIL_ADDRESS_1>)
+ about next week's project meeting"

-Detected secrets are never logged in their original form.
+LLM responds: "Dear <PERSON_1>, Following up on our discussion..."

-**Languages**: 24 languages supported (configurable at build time). Auto-detected per request.
+You receive: "Dear Dr. Sarah Chen, Following up on our discussion..."
+``` -Powered by [Microsoft Presidio](https://microsoft.github.io/presidio/). +PasteGuard sits between your app and the LLM provider. It's OpenAI-compatible — just change the base URL. ## Quick Start -### Docker (recommended) - ```bash git clone https://github.com/sgasser/pasteguard.git cd pasteguard cp config.example.yaml config.yaml - -# Option 1: English only (default, ~1.5GB) docker compose up -d - -# Option 2: Multiple languages (~2.5GB) -# Edit config.yaml to add languages, then: -LANGUAGES=en,de,fr,es,it docker compose up -d ``` -### Local Development - -```bash -git clone https://github.com/sgasser/pasteguard.git -cd pasteguard -bun install -cp config.example.yaml config.yaml - -# Option 1: English only (default) -docker compose up presidio-analyzer -d - -# Option 2: Multiple languages -# Edit config.yaml to add languages, then: -LANGUAGES=en,de,fr,es,it docker compose build presidio-analyzer -docker compose up presidio-analyzer -d - -bun run dev -``` +Point your app to `http://localhost:3000/openai/v1` instead of `https://api.openai.com/v1`. Dashboard: http://localhost:3000/dashboard -**Usage:** Point your app to `http://localhost:3000/openai/v1` instead of `https://api.openai.com/v1`. +For multiple languages, configuration options, and more: **[Read the docs →](https://pasteguard.com/docs/quickstart)** -## Language Configuration +## Integrations -By default, only English is installed to minimize image size. 
Add more languages at build time: +Works with any OpenAI-compatible tool: -```bash -# English only (default, smallest image ~1.5GB) -docker compose build - -# English + German -LANGUAGES=en,de docker compose build - -# Multiple languages -LANGUAGES=en,de,fr,it,es docker compose build -``` - -**Available languages (24):** -`ca`, `zh`, `hr`, `da`, `nl`, `en`, `fi`, `fr`, `de`, `el`, `it`, `ja`, `ko`, `lt`, `mk`, `nb`, `pl`, `pt`, `ro`, `ru`, `sl`, `es`, `sv`, `uk` +- OpenAI SDK (Python/JS) +- LangChain +- LlamaIndex +- Cursor +- Open WebUI +- LibreChat -**Language Fallback Behavior:** - -- Text language is auto-detected for each request -- If detected language is not installed, falls back to `fallback_language` (default: `en`) -- Dashboard shows fallback as `FR→EN` when French text is detected but only English is installed -- Response header `X-PasteGuard-Language-Fallback: true` indicates fallback was used - -Update `config.yaml` to match your installed languages: - -```yaml -pii_detection: - languages: - - en - - de -``` +**[See all integrations →](https://pasteguard.com/docs/integrations)** -See [presidio/languages.yaml](presidio/languages.yaml) for full details including context words. 
- -## Configuration - -**Mask mode:** - -```yaml -mode: mask -providers: - upstream: - type: openai - base_url: https://api.openai.com/v1 -masking: - placeholder_format: "<{TYPE}_{N}>" # Format for masked values - show_markers: false # Add visual markers to unmasked values -``` - -**Route mode:** - -```yaml -mode: route -providers: - upstream: - type: openai - base_url: https://api.openai.com/v1 - local: - type: ollama - base_url: http://localhost:11434 - model: llama3.2 # Model for all local requests -routing: - default: upstream - on_pii_detected: local -``` - -**Customize PII detection:** - -```yaml -pii_detection: - score_threshold: 0.7 # Confidence (0.0 - 1.0) - entities: # What to detect - - PERSON - - EMAIL_ADDRESS - - PHONE_NUMBER - - CREDIT_CARD - - IBAN_CODE -``` - -**Secrets detection (Secrets Shield):** - -```yaml -secrets_detection: - enabled: true # Enable secrets detection - action: block # block | redact | route_local - entities: # Secret types to detect - - OPENSSH_PRIVATE_KEY - - PEM_PRIVATE_KEY - # API Keys (opt-in): - # - API_KEY_OPENAI - # - API_KEY_AWS - # - API_KEY_GITHUB - # Tokens (opt-in): - # - JWT_TOKEN - # - BEARER_TOKEN - max_scan_chars: 200000 # Performance limit (0 = no limit) - log_detected_types: true # Log types (never logs content) -``` - -- **block** (default): Returns HTTP 400 error, request never reaches LLM -- **redact**: Replaces secrets with placeholders, unredacts in response (reversible, like PII masking) -- **route_local**: Routes to local provider when secrets detected (requires route mode) - -**Logging options:** - -```yaml -logging: - database: ./data/pasteguard.db - retention_days: 30 # 0 = keep forever - log_content: false # Log full request/response - log_masked_content: true # Log masked content for dashboard -``` - -**Dashboard authentication:** - -```yaml -dashboard: - auth: - username: admin - password: ${DASHBOARD_PASSWORD} -``` - -**Environment variables:** Config values support `${VAR}` and `${VAR:-default}` 
substitution. - -See [config.example.yaml](config.example.yaml) for all options. - -## API Reference - -**Endpoints:** - -| Endpoint | Description | -| ---------------------------------- | ---------------------------- | -| `POST /openai/v1/chat/completions` | Chat API (OpenAI-compatible) | -| `GET /openai/v1/models` | List models | -| `GET /dashboard` | Monitoring UI | -| `GET /dashboard/api/logs` | Request logs (JSON) | -| `GET /dashboard/api/stats` | Statistics (JSON) | -| `GET /health` | Health check | -| `GET /info` | Current configuration | - -**Response headers:** - -| Header | Value | -|--------|-------| -| `X-Request-ID` | Request identifier (forwarded or generated) | -| `X-PasteGuard-Mode` | `route` / `mask` | -| `X-PasteGuard-PII-Detected` | `true` / `false` | -| `X-PasteGuard-PII-Masked` | `true` / `false` (mask mode) | -| `X-PasteGuard-Provider` | `upstream` / `local` | -| `X-PasteGuard-Language` | Detected language code | -| `X-PasteGuard-Language-Fallback` | `true` if fallback was used | -| `X-PasteGuard-Secrets-Detected` | `true` if secrets detected | -| `X-PasteGuard-Secrets-Types` | Comma-separated list (e.g., `OPENSSH_PRIVATE_KEY,API_KEY_OPENAI`) | -| `X-PasteGuard-Secrets-Redacted` | `true` if secrets were redacted (action: redact) | - -## Development - -```bash -docker compose up presidio-analyzer -d # Start detection service -bun run dev # Dev server with hot reload -bun test # Run tests -bun run check # Lint & format -``` - -## Benchmarks - -PII detection accuracy benchmark with test cases for multiple languages: - -```bash -# Start Presidio with all benchmark languages -LANGUAGES=en,de,fr,it,es docker compose up presidio-analyzer -d - -# Run all tests -bun run benchmarks/pii-accuracy/run.ts - -# Run specific languages only -bun run benchmarks/pii-accuracy/run.ts --languages de,en - -# Verbose output -bun run benchmarks/pii-accuracy/run.ts --verbose -``` +## What It Detects -Test data in `benchmarks/pii-accuracy/test-data/` (one file per 
language). +**PII** (powered by [Microsoft Presidio](https://microsoft.github.io/presidio/)) +- Names +- Emails +- Phone numbers +- Credit cards +- IBANs +- IP addresses +- Locations + +**Secrets** +- OpenSSH private keys +- PEM private keys +- OpenAI API keys +- AWS access keys +- GitHub tokens +- JWT tokens +- Bearer tokens + +## Tech Stack + +[Bun](https://bun.sh) · [Hono](https://hono.dev) · [Microsoft Presidio](https://microsoft.github.io/presidio/) · SQLite ## License diff --git a/config.example.yaml b/config.example.yaml index a6a1b7d..96e18c2 100644 --- a/config.example.yaml +++ b/config.example.yaml @@ -150,7 +150,7 @@ logging: log_content: false # Log masked content for dashboard preview (default: true) - # Shows what was actually sent to upstream LLM with PII replaced by tokens + # Shows what was actually sent to upstream LLM with PII replaced by placeholders # Disable if you don't want any content stored, even masked log_masked_content: true diff --git a/docs/api-reference/chat-completions.mdx b/docs/api-reference/chat-completions.mdx new file mode 100644 index 0000000..df01dc7 --- /dev/null +++ b/docs/api-reference/chat-completions.mdx @@ -0,0 +1,122 @@ +--- +title: Chat Completions +description: POST /openai/v1/chat/completions +--- + +# Chat Completions + +Generate chat completions. Identical to OpenAI's endpoint. 
+ +``` +POST /openai/v1/chat/completions +``` + +## Request + +```bash +curl http://localhost:3000/openai/v1/chat/completions \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "gpt-5.2", + "messages": [ + {"role": "user", "content": "Hello"} + ] + }' +``` + +## Parameters + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `model` | string | Yes | Model ID (e.g., `gpt-5.2`) | +| `messages` | array | Yes | Conversation messages | +| `stream` | boolean | No | Enable streaming | +| `temperature` | number | No | Sampling temperature (0-2) | +| `max_tokens` | number | No | Maximum tokens to generate | + +All OpenAI parameters are supported and forwarded to your provider. + +## Response + +```json +{ + "id": "chatcmpl-abc123", + "object": "chat.completion", + "created": 1677858242, + "model": "gpt-5.2", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "Hello! How can I help you today?" 
+ }, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 10, + "completion_tokens": 15, + "total_tokens": 25 + } +} +``` + +## Streaming + +Set `stream: true` for Server-Sent Events: + + + +```python Python +from openai import OpenAI + +client = OpenAI(base_url="http://localhost:3000/openai/v1") + +stream = client.chat.completions.create( + model="gpt-5.2", + messages=[{"role": "user", "content": "Write a haiku"}], + stream=True +) + +for chunk in stream: + if chunk.choices[0].delta.content: + print(chunk.choices[0].delta.content, end="") +``` + +```javascript JavaScript +import OpenAI from 'openai'; + +const client = new OpenAI({ + baseURL: 'http://localhost:3000/openai/v1' +}); + +const stream = await client.chat.completions.create({ + model: 'gpt-5.2', + messages: [{ role: 'user', content: 'Write a haiku' }], + stream: true +}); + +for await (const chunk of stream) { + process.stdout.write(chunk.choices[0]?.delta?.content || ''); +} +``` + + + +## Response Headers + +PasteGuard adds headers to indicate PII and secrets handling: + +| Header | Description | +|--------|-------------| +| `X-PasteGuard-Mode` | Current mode (`mask` or `route`) | +| `X-PasteGuard-Provider` | Provider used (`upstream` or `local`) | +| `X-PasteGuard-PII-Detected` | `true` if PII was found | +| `X-PasteGuard-PII-Masked` | `true` if PII was masked (mask mode only) | +| `X-PasteGuard-Language` | Detected language code | +| `X-PasteGuard-Language-Fallback` | `true` if configured language was not available | +| `X-PasteGuard-Secrets-Detected` | `true` if secrets were found | +| `X-PasteGuard-Secrets-Types` | Comma-separated list of detected secret types | +| `X-PasteGuard-Secrets-Redacted` | `true` if secrets were redacted | diff --git a/docs/api-reference/dashboard-api.mdx b/docs/api-reference/dashboard-api.mdx new file mode 100644 index 0000000..1545a9e --- /dev/null +++ b/docs/api-reference/dashboard-api.mdx @@ -0,0 +1,100 @@ +--- +title: Dashboard API +description: Request 
logs and statistics
+---
+
+# Dashboard API
+
+The dashboard provides an API for accessing request logs and statistics.
+
+## Logs
+
+Get request history with pagination.
+
+```
+GET /dashboard/api/logs
+```
+
+### Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `limit` | number | 100 | Number of logs to return (max 1000) |
+| `offset` | number | 0 | Pagination offset |
+
+### Request
+
+```bash
+curl "http://localhost:3000/dashboard/api/logs?limit=100&offset=0"
+```
+
+### Response
+
+```json
+{
+  "logs": [
+    {
+      "id": 1,
+      "timestamp": "2026-01-15T10:30:00Z",
+      "mode": "mask",
+      "provider": "upstream",
+      "model": "gpt-5.2",
+      "pii_detected": true,
+      "entities": "[\"EMAIL_ADDRESS\",\"PERSON\"]",
+      "latency_ms": 1234,
+      "scan_time_ms": 45,
+      "prompt_tokens": 150,
+      "completion_tokens": 200,
+      "user_agent": "OpenAI-Python/1.0.0",
+      "language": "en",
+      "language_fallback": false,
+      "detected_language": "en",
+      "masked_content": "Hello <PERSON_1>",
+      "secrets_detected": 0,
+      "secrets_types": null
+    }
+  ],
+  "pagination": {
+    "limit": 100,
+    "offset": 0,
+    "count": 1
+  }
+}
+```
+
+---
+
+## Statistics
+
+Get aggregated statistics.
+ +``` +GET /dashboard/api/stats +``` + +### Request + +```bash +curl http://localhost:3000/dashboard/api/stats +``` + +### Response + +```json +{ + "total_requests": 1500, + "pii_requests": 342, + "pii_percentage": 22.8, + "upstream_requests": 1200, + "local_requests": 300, + "avg_scan_time_ms": 45, + "total_tokens": 125000, + "requests_last_hour": 42, + "entity_breakdown": [ + { "entity": "EMAIL_ADDRESS", "count": 150 }, + { "entity": "PERSON", "count": 120 }, + { "entity": "PHONE_NUMBER", "count": 72 } + ], + "mode": "mask" +} +``` diff --git a/docs/api-reference/models.mdx b/docs/api-reference/models.mdx new file mode 100644 index 0000000..ca9fe05 --- /dev/null +++ b/docs/api-reference/models.mdx @@ -0,0 +1,59 @@ +--- +title: Models +description: GET /openai/v1/models +--- + +# Models + +List available models from your configured provider. + +``` +GET /openai/v1/models +``` + +## Request + +```bash +curl http://localhost:3000/openai/v1/models \ + -H "Authorization: Bearer $OPENAI_API_KEY" +``` + +## Response + +```json +{ + "object": "list", + "data": [ + {"id": "gpt-5.2", "object": "model", "owned_by": "openai"} + ] +} +``` + +## SDK Usage + + + +```python Python +from openai import OpenAI + +client = OpenAI(base_url="http://localhost:3000/openai/v1") + +models = client.models.list() +for model in models: + print(model.id) +``` + +```javascript JavaScript +import OpenAI from 'openai'; + +const client = new OpenAI({ + baseURL: 'http://localhost:3000/openai/v1' +}); + +const models = await client.models.list(); +for (const model of models.data) { + console.log(model.id); +} +``` + + diff --git a/docs/api-reference/status.mdx b/docs/api-reference/status.mdx new file mode 100644 index 0000000..df338d4 --- /dev/null +++ b/docs/api-reference/status.mdx @@ -0,0 +1,103 @@ +--- +title: Status +description: Health check and info endpoints +--- + +# Status Endpoints + +## Health Check + +Check if PasteGuard and its dependencies are running. 
+ +``` +GET /health +``` + +### Request + +```bash +curl http://localhost:3000/health +``` + +### Response + +Returns HTTP 200 when healthy, HTTP 503 when degraded. + +```json +{ + "status": "healthy", + "services": { + "presidio": "up" + }, + "timestamp": "2026-01-15T10:30:00Z" +} +``` + +When a service is down: + +```json +{ + "status": "degraded", + "services": { + "presidio": "down" + }, + "timestamp": "2026-01-15T10:30:00Z" +} +``` + +In route mode, `local_llm` is also included in services. + +--- + +## Info + +Get current configuration and mode. + +``` +GET /info +``` + +### Request + +```bash +curl http://localhost:3000/info +``` + +### Response + +```json +{ + "name": "PasteGuard", + "version": "1.0.0", + "description": "Privacy proxy for LLMs", + "mode": "mask", + "providers": { + "upstream": { "type": "openai" } + }, + "pii_detection": { + "languages": ["en"], + "fallback_language": "en", + "score_threshold": 0.7, + "entities": ["PERSON", "EMAIL_ADDRESS", "PHONE_NUMBER"] + }, + "masking": { + "show_markers": false + } +} +``` + +When language validation is available, `languages` becomes an object: + +```json +{ + "pii_detection": { + "languages": { + "configured": ["en", "de", "fr"], + "available": ["en", "de"], + "missing": ["fr"] + } + } +} +``` + +In route mode, `routing` and `providers.local` are also included. diff --git a/docs/concepts/mask-mode.mdx b/docs/concepts/mask-mode.mdx new file mode 100644 index 0000000..df1e05e --- /dev/null +++ b/docs/concepts/mask-mode.mdx @@ -0,0 +1,91 @@ +--- +title: Mask Mode +description: Replace PII with placeholders before sending to your provider +--- + +# Mask Mode + +Mask mode replaces PII with placeholders before sending to your LLM provider. The response is automatically unmasked before returning to you. + +## How It Works + + + + Your app sends: `"Write a follow-up email to Dr. Sarah Chen (sarah.chen@hospital.org)"` + + + PasteGuard finds: `Dr. 
Sarah Chen` (PERSON), `sarah.chen@hospital.org` (EMAIL)
+
+
+    Provider receives: `"Write a follow-up email to <PERSON_1> (<EMAIL_ADDRESS_1>)"`
+
+
+    Provider responds: `"Dear <PERSON_1>, Following up on our discussion..."`
+
+
+    You receive: `"Dear Dr. Sarah Chen, Following up on our discussion..."`
+
+
+## When to Use
+
+- Simple setup without local infrastructure
+- Want to use external LLM providers while protecting PII
+
+## Configuration
+
+```yaml
+mode: mask
+
+providers:
+  upstream:
+    type: openai
+    base_url: https://api.openai.com/v1
+```
+
+### Masking Options
+
+```yaml
+masking:
+  show_markers: false
+  marker_text: "[protected]"
+```
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `show_markers` | `false` | Add visual markers around unmasked values |
+| `marker_text` | `[protected]` | Marker text if enabled |
+
+## Response Headers
+
+Mask mode sets these headers on responses:
+
+```
+X-PasteGuard-Mode: mask
+X-PasteGuard-Provider: upstream
+X-PasteGuard-PII-Detected: true
+X-PasteGuard-PII-Masked: true
+X-PasteGuard-Language: en
+```
+
+If the detected language wasn't configured and fell back to `fallback_language`:
+
+```
+X-PasteGuard-Language-Fallback: true
+```
+
+## Streaming Support
+
+Mask mode supports streaming responses. PasteGuard buffers tokens and unmasks placeholders in real-time as they arrive.
+
+```python
+stream = client.chat.completions.create(
+    model="gpt-5.2",
+    messages=[{"role": "user", "content": "Email Dr.
Sarah Chen at sarah.chen@hospital.org"}], + stream=True +) + +for chunk in stream: + # PII is already unmasked in each chunk + print(chunk.choices[0].delta.content, end="") +``` diff --git a/docs/concepts/pii-detection.mdx b/docs/concepts/pii-detection.mdx new file mode 100644 index 0000000..f2c0036 --- /dev/null +++ b/docs/concepts/pii-detection.mdx @@ -0,0 +1,68 @@ +--- +title: PII Detection +description: Personal data detection powered by Microsoft Presidio +--- + +# PII Detection + +PasteGuard uses Microsoft Presidio for PII detection, supporting 24 languages with automatic language detection. + +## Supported Entities + +| Entity | Examples | +|--------|----------| +| `PERSON` | Dr. Sarah Chen, John Smith | +| `EMAIL_ADDRESS` | sarah.chen@hospital.org | +| `PHONE_NUMBER` | +1-555-123-4567 | +| `CREDIT_CARD` | 4111-1111-1111-1111 | +| `IBAN_CODE` | DE89 3704 0044 0532 0130 00 | +| `IP_ADDRESS` | 192.168.1.1 | +| `LOCATION` | New York, 123 Main St | +| `US_SSN` | 123-45-6789 | +| `US_PASSPORT` | 123456789 | +| `CRYPTO` | Bitcoin addresses | +| `URL` | https://example.com | + +## Language Support + +PasteGuard supports 24 languages. The language is auto-detected from your input text. + +**Available languages:** Catalan, Chinese, Croatian, Danish, Dutch, English, Finnish, French, German, Greek, Italian, Japanese, Korean, Lithuanian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovenian, Spanish, Swedish, Ukrainian + +### Configure Languages + +Languages must be installed during Docker build: + +```bash +LANGUAGES=en,de,fr docker compose build +``` + +If only one language is specified, language detection is skipped for better performance. + +## Confidence Scoring + +Each detected entity has a confidence score (0.0 - 1.0). The default threshold is 0.7. 
+
+- **Higher threshold** = fewer false positives, might miss some PII
+- **Lower threshold** = catches more PII, more false positives
+
+```yaml
+pii_detection:
+  score_threshold: 0.7
+```
+
+## Response Headers
+
+When PII is detected:
+
+```
+X-PasteGuard-PII-Detected: true
+X-PasteGuard-PII-Masked: true  # mask mode only
+X-PasteGuard-Language: en
+```
+
+If the fallback language was used:
+
+```
+X-PasteGuard-Language-Fallback: true
+```
diff --git a/docs/concepts/route-mode.mdx b/docs/concepts/route-mode.mdx
new file mode 100644
index 0000000..5cafa77
--- /dev/null
+++ b/docs/concepts/route-mode.mdx
@@ -0,0 +1,126 @@
+---
+title: Route Mode
+description: Route PII requests to a local LLM
+---
+
+# Route Mode
+
+Route mode sends requests containing PII to a local LLM. Requests without PII go to your configured provider.
+
+## How It Works
+
+
+
+    Request contains PII → routed to **Local LLM** (Ollama, vLLM, llama.cpp, etc.)
+
+    PII stays on your network.
+
+
+    No PII detected → routed to **Your Provider** (OpenAI, Azure, etc.)
+
+    Full provider performance.
+ + + +## When to Use + +- Have local GPU resources +- Need complete data isolation for sensitive requests +- Must prevent any PII from leaving your network + +## Configuration + +```yaml +mode: route + +providers: + upstream: + type: openai + base_url: https://api.openai.com/v1 + local: + type: ollama + base_url: http://localhost:11434 + model: llama3.2 + +routing: + default: upstream + on_pii_detected: local +``` + +### Routing Options + +| Option | Description | +|--------|-------------| +| `default` | Provider for requests without PII | +| `on_pii_detected` | Provider for requests with PII | + +## Local Provider Setup + +### Ollama + +```yaml +providers: + local: + type: ollama + base_url: http://localhost:11434 + model: llama3.2 +``` + +### vLLM + +```yaml +providers: + local: + type: openai + base_url: http://localhost:8000/v1 + model: meta-llama/Llama-2-7b-chat-hf +``` + +### llama.cpp + +```yaml +providers: + local: + type: openai + base_url: http://localhost:8080/v1 + model: local +``` + +### LocalAI + +```yaml +providers: + local: + type: openai + base_url: http://localhost:8080/v1 + model: your-model-name + api_key: ${LOCAL_API_KEY} # if required +``` + +## Response Headers + +Route mode sets these headers on responses: + +When a request is routed to local: + +``` +X-PasteGuard-Mode: route +X-PasteGuard-Provider: local +X-PasteGuard-PII-Detected: true +X-PasteGuard-Language: en +``` + +When routed to your provider: + +``` +X-PasteGuard-Mode: route +X-PasteGuard-Provider: upstream +X-PasteGuard-PII-Detected: false +X-PasteGuard-Language: en +``` + +If the detected language wasn't configured and fell back to `fallback_language`: + +``` +X-PasteGuard-Language-Fallback: true +``` diff --git a/docs/concepts/secrets-detection.mdx b/docs/concepts/secrets-detection.mdx new file mode 100644 index 0000000..8206536 --- /dev/null +++ b/docs/concepts/secrets-detection.mdx @@ -0,0 +1,84 @@ +--- +title: Secrets Detection +description: Detect and protect private keys, 
API keys, and tokens +--- + +# Secrets Detection + +PasteGuard detects secrets before PII detection and can block, redact, or route requests containing sensitive credentials. + +## Supported Secret Types + +### Private Keys (enabled by default) + +| Type | Pattern | +|------|---------| +| `OPENSSH_PRIVATE_KEY` | `-----BEGIN OPENSSH PRIVATE KEY-----` | +| `PEM_PRIVATE_KEY` | `-----BEGIN RSA PRIVATE KEY-----`, etc. | + +### API Keys (opt-in) + +| Type | Pattern | +|------|---------| +| `API_KEY_OPENAI` | `sk-...` (48+ chars) | +| `API_KEY_AWS` | `AKIA...` (20 chars) | +| `API_KEY_GITHUB` | `ghp_...`, `gho_...`, `ghu_...`, `ghs_...`, `ghr_...` (40+ chars) | + +### Tokens (opt-in) + +| Type | Pattern | +|------|---------| +| `JWT_TOKEN` | `eyJ...` (three base64 segments) | +| `BEARER_TOKEN` | `Bearer ...` (40+ char tokens) | + +## Actions + +| Action | Description | +|--------|-------------| +| `block` | Return HTTP 400, request never reaches LLM (default) | +| `redact` | Replace secrets with placeholders, restore in response | +| `route_local` | Route to local LLM (requires route mode) | + +### Block (Default) + +```yaml +secrets_detection: + enabled: true + action: block +``` + +Request is rejected with HTTP 400. The secret never reaches the LLM. + +### Redact + +```yaml +secrets_detection: + action: redact +``` + +Secrets are replaced with placeholders and restored in the response (like PII masking). + +### Route to Local + +```yaml +mode: route +secrets_detection: + action: route_local +``` + +Requests with secrets are sent to your local LLM instead. 
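The documented patterns are simple enough to sketch as regular expressions. The following is an illustrative approximation of those patterns, not PasteGuard's actual detector; the names `SECRET_PATTERNS` and `detect_secrets` are hypothetical:

```python
import re

# Illustrative approximations of the documented patterns,
# not PasteGuard's actual rules.
SECRET_PATTERNS = {
    "OPENSSH_PRIVATE_KEY": re.compile(r"-----BEGIN OPENSSH PRIVATE KEY-----"),
    "PEM_PRIVATE_KEY": re.compile(r"-----BEGIN (?:RSA |ENCRYPTED )?PRIVATE KEY-----"),
    "API_KEY_AWS": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),               # 20 chars total
    "API_KEY_GITHUB": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),  # 40+ chars
    "JWT_TOKEN": re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+\b"),        # three segments
}

def detect_secrets(text: str) -> list[str]:
    """Return the secret types found in text, in pattern order."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

print(detect_secrets("token=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.abc123"))  # → ['JWT_TOKEN']
```

A detector like this is what the configured `action` would then act on: block the request, redact the matched spans, or route to the local provider.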
+ +## Response Headers + +When secrets are detected: + +``` +X-PasteGuard-Secrets-Detected: true +X-PasteGuard-Secrets-Types: OPENSSH_PRIVATE_KEY,API_KEY_OPENAI +``` + +If secrets were redacted: + +``` +X-PasteGuard-Secrets-Redacted: true +``` diff --git a/docs/configuration/logging.mdx b/docs/configuration/logging.mdx new file mode 100644 index 0000000..26a1861 --- /dev/null +++ b/docs/configuration/logging.mdx @@ -0,0 +1,90 @@ +--- +title: Logging +description: Configure request logging +--- + +# Logging Configuration + +```yaml +logging: + database: ./data/pasteguard.db + retention_days: 30 + log_content: false + log_masked_content: true +``` + +## Options + +| Option | Default | Description | +|--------|---------|-------------| +| `database` | `./data/pasteguard.db` | SQLite database path | +| `retention_days` | `30` | Days to keep logs. `0` = forever | +| `log_content` | `false` | Log raw request/response (contains PII!) | +| `log_masked_content` | `true` | Log masked version for dashboard | + +## Database + +Logs are stored in SQLite: + +```yaml +logging: + database: ./data/pasteguard.db +``` + +In Docker, this is persisted via volume: + +```yaml +volumes: + - ./data:/app/data +``` + +## Retention + +Logs older than `retention_days` are automatically deleted: + +```yaml +logging: + retention_days: 30 # Keep for 30 days + # retention_days: 7 # Keep for 7 days + # retention_days: 0 # Keep forever +``` + +## Content Logging + +### Raw Content (not recommended) + +Logs original request/response including PII: + +```yaml +logging: + log_content: true # Contains sensitive data! +``` + +### Masked Content (default) + +Logs masked version for dashboard preview: + +```yaml +logging: + log_masked_content: true +``` + +Shows what was actually sent to your provider with PII replaced by placeholders. 
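As a toy illustration of what a masked preview contains, assuming the `<TYPE_N>` placeholder format and hand-supplied entity spans — the function and spans below are hypothetical, not PasteGuard's code:

```python
def mask_preview(text: str, entities: list[tuple[int, int, str]]) -> str:
    """Replace (start, end, type) spans with <TYPE_N> placeholders."""
    entities = sorted(entities)
    counters: dict[str, int] = {}
    labels: dict[tuple[int, int], str] = {}
    # Number placeholders left to right: the first PERSON is <PERSON_1>, etc.
    for start, end, etype in entities:
        counters[etype] = counters.get(etype, 0) + 1
        labels[(start, end)] = f"<{etype}_{counters[etype]}>"
    # Apply right to left so earlier offsets stay valid.
    for start, end, _ in reversed(entities):
        text = text[:start] + labels[(start, end)] + text[end:]
    return text

print(mask_preview(
    "Email Dr. Sarah Chen at sarah.chen@hospital.org",
    [(6, 20, "PERSON"), (24, 47, "EMAIL_ADDRESS")],
))  # → Email <PERSON_1> at <EMAIL_ADDRESS_1>
```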
+
+### No Content
+
+Disable all content logging:
+
+```yaml
+logging:
+  log_content: false
+  log_masked_content: false
+```
+
+Only metadata (timestamps, models, PII detected) is logged.
+
+## Security
+
+- Secret content is **never** logged, even if `log_content: true`
+- Only secret types are logged if `log_detected_types: true`
+- Masked content shows placeholders like `<PERSON_1>`, not real PII
diff --git a/docs/configuration/overview.mdx b/docs/configuration/overview.mdx
new file mode 100644
index 0000000..fbe2b7f
--- /dev/null
+++ b/docs/configuration/overview.mdx
@@ -0,0 +1,69 @@
+---
+title: Overview
+description: Configuration basics
+---
+
+# Configuration Overview
+
+PasteGuard is configured via `config.yaml`. Copy from the example:
+
+```bash
+cp config.example.yaml config.yaml
+```
+
+## Mode
+
+Privacy mode determines how PII is handled.
+
+```yaml
+mode: mask
+```
+
+| Value | Description |
+|-------|-------------|
+| `mask` | Replace PII with placeholders, send to provider, restore in response |
+| `route` | PII requests go to your local LLM (Ollama, vLLM, llama.cpp); others go to your configured provider |
+
+See [Mask Mode](/concepts/mask-mode) and [Route Mode](/concepts/route-mode) for details.
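For `route`, the decision reduces to a lookup against the routing rules. A minimal sketch, assuming the `routing` keys shown in the config examples; `choose_provider` is a hypothetical name, not PasteGuard's code:

```python
def choose_provider(pii_detected: bool, routing: dict[str, str]) -> str:
    """Pick the provider name for a request under route mode."""
    # Mirrors the config: routing: {default: upstream, on_pii_detected: local}
    return routing["on_pii_detected"] if pii_detected else routing["default"]

routing = {"default": "upstream", "on_pii_detected": "local"}
print(choose_provider(False, routing))  # → upstream
print(choose_provider(True, routing))   # → local
```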
+ +## Server + +```yaml +server: + port: 3000 + host: "0.0.0.0" +``` + +| Option | Default | Description | +|--------|---------|-------------| +| `port` | `3000` | HTTP port | +| `host` | `0.0.0.0` | Bind address | + +## Dashboard + +```yaml +dashboard: + enabled: true + auth: + username: admin + password: ${DASHBOARD_PASSWORD} +``` + +| Option | Default | Description | +|--------|---------|-------------| +| `enabled` | `true` | Enable dashboard at `/dashboard` | +| `auth.username` | - | Basic auth username | +| `auth.password` | - | Basic auth password | + +## Environment Variables + +Use `${VAR}` or `${VAR:-default}` syntax: + +```yaml +providers: + upstream: + api_key: ${OPENAI_API_KEY} + +pii_detection: + presidio_url: ${PRESIDIO_URL:-http://localhost:5002} +``` diff --git a/docs/configuration/pii-detection.mdx b/docs/configuration/pii-detection.mdx new file mode 100644 index 0000000..2097aef --- /dev/null +++ b/docs/configuration/pii-detection.mdx @@ -0,0 +1,89 @@ +--- +title: PII Detection Config +description: Configure PII detection settings +--- + +# PII Detection Configuration + +```yaml +pii_detection: + presidio_url: http://localhost:5002 + languages: [en, de] + fallback_language: en + score_threshold: 0.7 + entities: + - PERSON + - EMAIL_ADDRESS + - PHONE_NUMBER + - CREDIT_CARD + - IBAN_CODE + - IP_ADDRESS + - LOCATION +``` + +## Options + +| Option | Default | Description | +|--------|---------|-------------| +| `presidio_url` | `http://localhost:5002` | Presidio analyzer URL | +| `languages` | `[en]` | Languages to detect. 
Must match Docker build | +| `fallback_language` | `en` | Fallback if detected language not in list | +| `score_threshold` | `0.7` | Minimum confidence (0.0-1.0) | +| `entities` | See below | Entity types to detect | + +## Languages + +Languages must be installed during Docker build: + +```bash +LANGUAGES=en,de,fr docker compose build +``` + +Available languages (24): +`ca`, `zh`, `hr`, `da`, `nl`, `en`, `fi`, `fr`, `de`, `el`, `it`, `ja`, `ko`, `lt`, `mk`, `nb`, `pl`, `pt`, `ro`, `ru`, `sl`, `es`, `sv`, `uk` + +### Single Language + +If only one language is specified, language detection is skipped for better performance: + +```yaml +pii_detection: + languages: [en] +``` + +### Fallback Language + +If the detected language isn't in your list, the fallback is used: + +```yaml +pii_detection: + languages: [en, de] + fallback_language: en # Used for French text, etc. +``` + +## Entities + +| Entity | Examples | +|--------|----------| +| `PERSON` | Dr. Sarah Chen, John Smith | +| `EMAIL_ADDRESS` | sarah.chen@hospital.org | +| `PHONE_NUMBER` | +1-555-123-4567 | +| `CREDIT_CARD` | 4111-1111-1111-1111 | +| `IBAN_CODE` | DE89 3704 0044 0532 0130 00 | +| `IP_ADDRESS` | 192.168.1.1 | +| `LOCATION` | New York, 123 Main St | +| `US_SSN` | 123-45-6789 | +| `US_PASSPORT` | 123456789 | +| `CRYPTO` | Bitcoin addresses | +| `URL` | https://example.com | + +## Score Threshold + +Higher = fewer false positives, might miss some PII. Lower = catches more PII, more false positives. 
+ +```yaml +pii_detection: + score_threshold: 0.7 # Default, good balance + # score_threshold: 0.5 # More aggressive + # score_threshold: 0.9 # More conservative +``` diff --git a/docs/configuration/providers.mdx b/docs/configuration/providers.mdx new file mode 100644 index 0000000..38c1a01 --- /dev/null +++ b/docs/configuration/providers.mdx @@ -0,0 +1,141 @@ +--- +title: Providers +description: Configure your LLM providers +--- + +# Providers + +PasteGuard supports two provider types: your configured provider (upstream) and local. + +## Upstream Provider + +Required for both modes. Your LLM provider (OpenAI, Azure, etc.). + +```yaml +providers: + upstream: + type: openai + base_url: https://api.openai.com/v1 + # api_key: ${OPENAI_API_KEY} # Optional fallback +``` + +| Option | Description | +|--------|-------------| +| `type` | `openai` | +| `base_url` | API endpoint | +| `api_key` | Optional. Used if client doesn't send Authorization header | + +### Supported Providers + +Any OpenAI-compatible API works: + +```yaml +# OpenAI +providers: + upstream: + type: openai + base_url: https://api.openai.com/v1 + +# Azure OpenAI +providers: + upstream: + type: openai + base_url: https://your-resource.openai.azure.com/openai/v1 + +# OpenRouter +providers: + upstream: + type: openai + base_url: https://openrouter.ai/api/v1 + api_key: ${OPENROUTER_API_KEY} + +# LiteLLM Proxy +providers: + upstream: + type: openai + base_url: http://localhost:4000 # LiteLLM default port + +# Together AI +providers: + upstream: + type: openai + base_url: https://api.together.xyz/v1 + +# Groq +providers: + upstream: + type: openai + base_url: https://api.groq.com/openai/v1 +``` + +## Local Provider + +Required for Route mode only. Your local LLM. 
+ +```yaml +providers: + local: + type: ollama + base_url: http://localhost:11434 + model: llama3.2 +``` + +| Option | Description | +|--------|-------------| +| `type` | `ollama` or `openai` (for compatible servers) | +| `base_url` | Local LLM endpoint | +| `model` | Model to use for all PII requests | +| `api_key` | Only needed for OpenAI-compatible servers | + +### Ollama + +```yaml +providers: + local: + type: ollama + base_url: http://localhost:11434 + model: llama3.2 +``` + +### vLLM + +```yaml +providers: + local: + type: openai + base_url: http://localhost:8000/v1 + model: meta-llama/Llama-2-7b-chat-hf +``` + +### llama.cpp + +```yaml +providers: + local: + type: openai + base_url: http://localhost:8080/v1 + model: local +``` + +### LocalAI + +```yaml +providers: + local: + type: openai + base_url: http://localhost:8080/v1 + model: your-model + api_key: ${LOCAL_API_KEY} # if required +``` + +## API Key Handling + +PasteGuard forwards your client's `Authorization` header to your provider. 
You can optionally set `api_key` in config as a fallback: + +```yaml +providers: + upstream: + type: openai + base_url: https://api.openai.com/v1 + api_key: ${OPENAI_API_KEY} # Used if client doesn't send auth +``` diff --git a/docs/configuration/secrets-detection.mdx b/docs/configuration/secrets-detection.mdx new file mode 100644 index 0000000..89a23e7 --- /dev/null +++ b/docs/configuration/secrets-detection.mdx @@ -0,0 +1,100 @@ +--- +title: Secrets Detection Config +description: Configure secrets detection settings +--- + +# Secrets Detection Configuration + +```yaml +secrets_detection: + enabled: true + action: block + entities: + - OPENSSH_PRIVATE_KEY + - PEM_PRIVATE_KEY + max_scan_chars: 200000 + log_detected_types: true +``` + +## Options + +| Option | Default | Description | +|--------|---------|-------------| +| `enabled` | `true` | Enable secrets detection | +| `action` | `block` | Action when secrets found | +| `entities` | Private keys | Secret types to detect | +| `max_scan_chars` | `200000` | Max characters to scan (0 = unlimited) | +| `log_detected_types` | `true` | Log detected types (never logs content) | +| `redact_placeholder` | `` | Placeholder format for redaction | + +## Actions + +| Action | Description | +|--------|-------------| +| `block` | Return HTTP 400, request never reaches LLM (default) | +| `redact` | Replace secrets with placeholders, restore in response | +| `route_local` | Route to local LLM (requires route mode) | + +### Block (Default) + +```yaml +secrets_detection: + action: block +``` + +### Redact + +```yaml +secrets_detection: + action: redact +``` + +### Route to Local + +```yaml +mode: route +secrets_detection: + action: route_local +``` + +## Secret Types + +### Private Keys (enabled by default) + +```yaml +secrets_detection: + entities: + - OPENSSH_PRIVATE_KEY # -----BEGIN OPENSSH PRIVATE KEY----- + - PEM_PRIVATE_KEY # RSA, PRIVATE KEY, ENCRYPTED PRIVATE KEY +``` + +### API Keys (opt-in) + +```yaml +secrets_detection: + 
entities: + - API_KEY_OPENAI # sk-... (48+ chars) + - API_KEY_AWS # AKIA... (20 chars) + - API_KEY_GITHUB # ghp_, gho_, ghu_, ghs_, ghr_ (40+ chars) +``` + +### Tokens (opt-in) + +```yaml +secrets_detection: + entities: + - JWT_TOKEN # eyJ... (three base64 segments) + - BEARER_TOKEN # Bearer ... (40+ char tokens) +``` + +## Performance + +For large payloads, limit scanning: + +```yaml +secrets_detection: + max_scan_chars: 200000 # 200KB default + # max_scan_chars: 0 # Scan entire request +``` + +Secrets placed after the limit won't be detected. diff --git a/docs/favicon.svg b/docs/favicon.svg new file mode 100644 index 0000000..6f3e667 --- /dev/null +++ b/docs/favicon.svg @@ -0,0 +1,7 @@ + + + + diff --git a/docs/images/dashboard.png b/docs/images/dashboard.png new file mode 100644 index 0000000..1da1f02 Binary files /dev/null and b/docs/images/dashboard.png differ diff --git a/docs/integrations.mdx b/docs/integrations.mdx new file mode 100644 index 0000000..b1c6716 --- /dev/null +++ b/docs/integrations.mdx @@ -0,0 +1,134 @@ +--- +title: Integrations +description: Use PasteGuard with IDEs, chat interfaces, and SDKs +--- + +# Integrations + +PasteGuard works with any tool that supports the OpenAI API. Just change the base URL to point to PasteGuard. + +## Cursor + +In Cursor settings, configure a custom OpenAI base URL: + +1. Open **Settings** → **Models** +2. Scroll to **API Keys** section +3. Enable **Override OpenAI Base URL** toggle +4. Enter: + ``` + http://localhost:3000/openai/v1 + ``` +5. Add your OpenAI API key above + +All requests from Cursor now go through PasteGuard with PII protection. + +## Chat Interfaces + +### Open WebUI + +In your Docker Compose or environment: + +```yaml +services: + open-webui: + environment: + - OPENAI_API_BASE_URL=http://pasteguard:3000/openai/v1 + - OPENAI_API_KEY=your-key +``` + +Or point Open WebUI to PasteGuard as an "OpenAI-compatible" connection. 
+ +### LibreChat + +Configure in your `librechat.yaml`: + +```yaml +version: 1.2.8 +cache: true +endpoints: + custom: + - name: "PasteGuard" + apiKey: "${OPENAI_API_KEY}" + baseURL: "http://localhost:3000/openai/v1" + models: + default: ["gpt-5.2"] + fetch: false + titleConvo: true + titleModel: "gpt-5.2" +``` + +## Python / JavaScript + +### OpenAI SDK + + + +```python Python +from openai import OpenAI + +client = OpenAI( + base_url="http://localhost:3000/openai/v1", + api_key="your-key" +) +``` + +```javascript JavaScript +import OpenAI from 'openai'; + +const client = new OpenAI({ + baseURL: 'http://localhost:3000/openai/v1', + apiKey: 'your-key' +}); +``` + + + +### LangChain + +```python +from langchain_openai import ChatOpenAI + +llm = ChatOpenAI( + base_url="http://localhost:3000/openai/v1", + api_key="your-key" +) +``` + +### LlamaIndex + +```python +from llama_index.llms.openai_like import OpenAILike + +llm = OpenAILike( + api_base="http://localhost:3000/openai/v1", + api_key="your-key", + model="gpt-5.2", + is_chat_model=True +) +``` + +## Environment Variable + +Most tools respect the `OPENAI_API_BASE` or `OPENAI_BASE_URL` environment variable: + +```bash +export OPENAI_API_BASE=http://localhost:3000/openai/v1 +export OPENAI_API_KEY=your-key +``` + +## Verify It Works + +Check the response headers to confirm PasteGuard is processing requests: + +```bash +curl -i http://localhost:3000/openai/v1/chat/completions \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "Hi"}]}' +``` + +Look for: +``` +X-PasteGuard-Mode: mask +X-PasteGuard-Provider: upstream +``` diff --git a/docs/introduction.mdx b/docs/introduction.mdx new file mode 100644 index 0000000..a075b9c --- /dev/null +++ b/docs/introduction.mdx @@ -0,0 +1,80 @@ +--- +title: Introduction +description: Privacy proxy for LLMs +--- + +# What is PasteGuard? 
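The same check can be scripted with the Python standard library. A small sketch that sends a minimal request and collects the `X-PasteGuard-*` response headers (model name and key are placeholders):

```python
import json
import urllib.request

def verify_pasteguard(base_url: str, api_key: str) -> dict:
    """POST a minimal chat request and return any X-PasteGuard-* headers."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps({
            "model": "gpt-5.2",
            "messages": [{"role": "user", "content": "Hi"}],
        }).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return {
            k: v for k, v in resp.headers.items()
            if k.lower().startswith("x-pasteguard-")
        }
```

An empty result means requests are bypassing the proxy (check your base URL).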
+ +PasteGuard is an OpenAI-compatible proxy that protects personal data and secrets before sending to your LLM provider (OpenAI, Azure, etc.). + +## The Problem + +When using LLM APIs, every prompt is sent to external servers - including customer names, emails, and sensitive business data. Many organizations have policies against sending PII to third-party AI services. + +## The Solution + +PasteGuard sits between your app and the LLM API: + +``` +Your App → PasteGuard → LLM API +``` + +Two privacy modes: + +| Mode | How it works | +|------|--------------| +| **Mask** | Replace PII with placeholders, send to provider, restore in response | +| **Route** | PII requests stay on your local LLM (Ollama, vLLM, llama.cpp), others go to your configured provider | + +## Quick Example + + + + ``` + Write a follow-up email to Dr. Sarah Chen (sarah.chen@hospital.org) + ``` + + + Detected: `Dr. Sarah Chen` → ``, `sarah.chen@hospital.org` → `` + + + ``` + Write a follow-up email to () + ``` + + + ``` + Dear Dr. Sarah Chen, Following up on our discussion... + ``` + + + +The LLM never sees the real data. PII is masked before sending and restored in the response. 
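The mask-and-restore round trip can be sketched in a few lines of Python. The placeholder format here is illustrative, not PasteGuard's actual tokens:

```python
def mask(text: str, detections: list[tuple[str, str]]) -> tuple[str, dict[str, str]]:
    """Replace each detected (span, placeholder) pair; return masked text
    plus the reverse map needed to restore the response."""
    mapping = {}
    for span, placeholder in detections:
        text = text.replace(span, placeholder)
        mapping[placeholder] = span
    return text, mapping

def unmask(text: str, mapping: dict[str, str]) -> str:
    """Restore original spans in the provider's response."""
    for placeholder, span in mapping.items():
        text = text.replace(placeholder, span)
    return text
```

The masked prompt goes to the provider; `unmask` is applied to the reply, so the real name and email never leave your infrastructure.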
+ +## Features + +- **PII Detection** — Names, emails, phone numbers, credit cards, IBANs, and more +- **Secrets Detection** — API keys, tokens, private keys caught before they reach the LLM +- **Streaming Support** — Real-time unmasking as tokens arrive +- **24 Languages** — Works in English, German, French, and 21 more +- **OpenAI-Compatible** — Change one URL, keep your code +- **Self-Hosted** — Your servers, your data stays yours +- **Open Source** — Apache 2.0 license, full transparency +- **Dashboard** — See every protected request in real-time + +## Next Steps + + + + Get running in 2 minutes + + + Replace PII with placeholders + + + Route PII to local LLM + + + Customize your setup + + diff --git a/docs/logo-dark.svg b/docs/logo-dark.svg new file mode 100644 index 0000000..e6c4262 --- /dev/null +++ b/docs/logo-dark.svg @@ -0,0 +1,19 @@ + + + + + + + + + + + + diff --git a/docs/logo-light.svg b/docs/logo-light.svg new file mode 100644 index 0000000..f60d8ca --- /dev/null +++ b/docs/logo-light.svg @@ -0,0 +1,19 @@ + + + + + + + + + + + + diff --git a/docs/mint.json b/docs/mint.json new file mode 100644 index 0000000..f0b077b --- /dev/null +++ b/docs/mint.json @@ -0,0 +1,71 @@ +{ + "$schema": "https://mintlify.com/docs.json", + "name": "PasteGuard", + "logo": { + "light": "/wordmark-light.svg", + "dark": "/wordmark-dark.svg", + "href": "/" + }, + "favicon": "/favicon.svg", + "colors": { + "primary": "#b45309", + "light": "#b45309", + "dark": "#d97706", + "background": { + "light": "#fafaf9", + "dark": "#0c0a09" + } + }, + "font": { + "headings": { + "family": "system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif", + "weight": 700 + }, + "body": { + "family": "system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif" + } + }, + "topbarLinks": [ + { + "name": "GitHub", + "url": "https://github.com/sgasser/pasteguard" + } + ], + "navigation": [ + { + "group": "Getting Started", + "pages": ["introduction", "quickstart", "integrations"] 
+ }, + { + "group": "Concepts", + "pages": [ + "concepts/mask-mode", + "concepts/route-mode", + "concepts/pii-detection", + "concepts/secrets-detection" + ] + }, + { + "group": "API Reference", + "pages": [ + "api-reference/chat-completions", + "api-reference/models", + "api-reference/status", + "api-reference/dashboard-api" + ] + }, + { + "group": "Configuration", + "pages": [ + "configuration/overview", + "configuration/providers", + "configuration/pii-detection", + "configuration/secrets-detection", + "configuration/logging" + ] + } + ], + "footerSocials": { + "github": "https://github.com/sgasser/pasteguard" + } +} diff --git a/docs/quickstart.mdx b/docs/quickstart.mdx new file mode 100644 index 0000000..e37bd0f --- /dev/null +++ b/docs/quickstart.mdx @@ -0,0 +1,134 @@ +--- +title: Quickstart +description: Get PasteGuard running in 2 minutes +--- + +# Quickstart + +## 1. Start PasteGuard + +```bash +git clone https://github.com/sgasser/pasteguard.git +cd pasteguard +cp config.example.yaml config.yaml +docker compose up -d +``` + +PasteGuard runs on `http://localhost:3000`. Dashboard at `http://localhost:3000/dashboard`. + + +By default, only English is installed (~1.5GB image). To add more languages: + +```bash +# English + German + French (~2.5GB) +LANGUAGES=en,de,fr docker compose up -d --build +``` + +Update `config.yaml` to match: + +```yaml +pii_detection: + languages: + - en + - de + - fr +``` + +See [PII Detection Config](/configuration/pii-detection) for all 24 available languages. + + +## 2. Make a Request + +Point your OpenAI client to PasteGuard: + + + +```python Python +from openai import OpenAI + +client = OpenAI( + base_url="http://localhost:3000/openai/v1", + api_key="your-openai-key" +) + +response = client.chat.completions.create( + model="gpt-5.2", + messages=[ + {"role": "user", "content": "Write a follow-up email to Dr. 
Sarah Chen (sarah.chen@hospital.org)"} + ] +) + +print(response.choices[0].message.content) +``` + +```javascript JavaScript +import OpenAI from 'openai'; + +const client = new OpenAI({ + baseURL: 'http://localhost:3000/openai/v1', + apiKey: 'your-openai-key' +}); + +const response = await client.chat.completions.create({ + model: 'gpt-5.2', + messages: [ + { role: 'user', content: 'Write a follow-up email to Dr. Sarah Chen (sarah.chen@hospital.org)' } + ] +}); + +console.log(response.choices[0].message.content); +``` + +```bash cURL +curl http://localhost:3000/openai/v1/chat/completions \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "gpt-5.2", + "messages": [ + {"role": "user", "content": "Write a follow-up email to Dr. Sarah Chen (sarah.chen@hospital.org)"} + ] + }' +``` + + + +## 3. Verify PII Protection + +Check the response headers: + +``` +X-PasteGuard-PII-Detected: true +X-PasteGuard-PII-Masked: true +``` + +The name and email were masked before reaching OpenAI and restored in the response. + +## 4. View Dashboard + +Open `http://localhost:3000/dashboard` in your browser to see: + +- Request history +- Detected PII entities +- Masked content sent to the LLM + + + PasteGuard Dashboard + + +## What's Next? 
+ + + + How PII masking works + + + Route sensitive requests locally + + + Customize detection and providers + + + Explore the API + + diff --git a/docs/wordmark-dark.svg b/docs/wordmark-dark.svg new file mode 100644 index 0000000..066e14f --- /dev/null +++ b/docs/wordmark-dark.svg @@ -0,0 +1,28 @@ + + + + + + + + + + + + + PasteGuard + diff --git a/docs/wordmark-light.svg b/docs/wordmark-light.svg new file mode 100644 index 0000000..9e6a481 --- /dev/null +++ b/docs/wordmark-light.svg @@ -0,0 +1,28 @@ + + + + + + + + + + + + + PasteGuard + diff --git a/package.json b/package.json index e56e02e..7347496 100644 --- a/package.json +++ b/package.json @@ -1,7 +1,7 @@ { "name": "pasteguard", "version": "0.1.0", - "description": "Guard your paste - Privacy-aware LLM proxy that masks PII before sending to cloud LLMs", + "description": "Privacy proxy for LLMs. Masks personal data and secrets before sending to your provider.", "type": "module", "main": "src/index.ts", "scripts": { diff --git a/src/index.ts b/src/index.ts index aa0c49d..d073156 100644 --- a/src/index.ts +++ b/src/index.ts @@ -175,7 +175,7 @@ Provider: console.log(` ╔═══════════════════════════════════════════════════════════╗ ║ PasteGuard ║ -║ Guard your paste - Privacy-aware LLM proxy ║ +║ Privacy proxy for LLMs ║ ╚═══════════════════════════════════════════════════════════╝ Server: http://${host}:${port} diff --git a/src/routes/info.ts b/src/routes/info.ts index c4358cc..d8fbf98 100644 --- a/src/routes/info.ts +++ b/src/routes/info.ts @@ -16,7 +16,7 @@ infoRoutes.get("/info", (c) => { const info: Record = { name: "PasteGuard", version: pkg.version, - description: "Guard your paste - Privacy-aware LLM proxy", + description: "Privacy proxy for LLMs", mode: config.mode, providers: { upstream: {