πŸ”’ Detect security leaks in AI-assisted codebases. Static analysis tool for Python & JS/TS with cross-file taint tracking.

Privalyse/privalyse-cli

Privalyse Logo

Privacy Guardrails for AI Applications

Catch PII leaks to LLMs before they hit production.


Privalyse Demo


Privalyse CLI is a static analysis tool that builds a Semantic Data Flow Graph of your AI application. It traces PII from source to AI sinkβ€”detecting privacy violations that regex-based tools miss.

  • ❌ Traditional Linter: "Variable user_email used on line 42."
  • βœ… Privalyse: "User Email (Source) β†’ Prompt Template β†’ OpenAI API (Sink) = Privacy Leak"

πŸ€– Built for AI Applications

Privalyse is purpose-built for LLM-integrated applications. It detects when sensitive user data is being sent to:

| Provider | Support |
| --- | --- |
| OpenAI (GPT-4, o1, Embeddings) | βœ… Full |
| Anthropic (Claude) | βœ… Full |
| Google (Gemini, Vertex AI) | βœ… Full |
| Mistral AI | βœ… Full |
| Groq | βœ… Full |
| Cohere | βœ… Full |
| Ollama (Local LLMs) | βœ… Full |
| LangChain / LlamaIndex | βœ… Full |
| Hugging Face | βœ… Full |
| Generic HTTP to AI APIs | βœ… Full |

πŸ›‘οΈ Works with privalyse-mask

privalyse-mask is our companion library for masking PII before sending it to LLMs.

Privalyse CLI automatically recognizes privalyse-mask usage and won't flag already-masked data as leaks.

from privalyse_mask import PrivalyseMasker
from openai import OpenAI

masker = PrivalyseMasker()
client = OpenAI()

# User input with PII
user_input = "My name is Peter and my email is peter@example.com"

# βœ… Mask before sending to LLM
masked_text, mapping = masker.mask(user_input)
# -> "My name is {Name_x92} and my email is {Email_abc123}"

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": masked_text}]  # βœ… Safe - masked data
)

# Restore original values in response
final_response = masker.unmask(response.choices[0].message.content, mapping)

Privalyse CLI will:

  • βœ… Not flag the masked_text being sent to OpenAI (it's sanitized)
  • ⚠️ Flag if you send user_input directly without masking
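By contrast, here is a minimal sketch of the pattern that gets flagged: the raw input reaches the prompt payload unmasked. The variable names are illustrative, reusing the hypothetical input from the example above.

```python
# PII source: raw user input with a name and an email address
user_input = "My name is Peter and my email is peter@example.com"

# ⚠️ Unmasked PII placed directly into the prompt payload. When this payload
# is passed to client.chat.completions.create(...), Privalyse traces the
# user_input β†’ messages β†’ OpenAI API path and reports a privacy leak.
messages = [{"role": "user", "content": user_input}]
```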

⚑ Quick Start

Install & Run

pip install privalyse-cli
privalyse
# βœ… Done. Check scan_results.md

GitHub Actions

# .github/workflows/privacy.yml
name: AI Privacy Scan
on: [push, pull_request]

jobs:
  privalyse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Privalyse
        uses: privalyse/privalyse-cli@v0.3.1

Pre-Commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: privalyse
        name: Privalyse AI Privacy Scan
        entry: privalyse
        language: system
        pass_filenames: false

πŸ“š Documentation


πŸš€ Features

πŸ€– AI Guardrails (Primary Focus)

Specialized checks for LLM-integrated applications.

  • Prevents: Sending sensitive customer data to model prompts
  • Audits: OpenAI, Anthropic, Google Gemini, LangChain, and more
  • Recognizes: privalyse-mask and other sanitization libraries
  • Tracks: Data flow from user input β†’ prompt β†’ AI API

πŸ•΅οΈβ€β™‚οΈ Secret Detection

Detects hardcoded API keys, tokens, and credentials.

  • Supports: AWS, Stripe, OpenAI, Slack, Anthropic, and generic high-entropy strings
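The generic high-entropy check can be illustrated with a Shannon-entropy score. This is a simplified sketch of the general technique; the function names and the 3.5 bits/char threshold are illustrative, not Privalyse's actual detector.

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of s."""
    counts = Counter(s)
    return -sum((n / len(s)) * math.log2(n / len(s)) for n in counts.values())

def looks_like_secret(token: str) -> bool:
    # Long, random-looking tokens score high; ordinary identifiers score low.
    return len(token) >= 20 and shannon_entropy(token) > 3.5
```

An ordinary variable name like `user_email` scores well below the cutoff, while a 23-character random API key scores near log2(23) β‰ˆ 4.5 bits per character.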

πŸ—£οΈ PII Leak Prevention

Identifies PII leaking into logs, external APIs, or analytics.

  • Detects: Emails, Phone Numbers, Credit Cards, SSNs, Names, Addresses
  • Context Aware: Understands variable names like user_email or customer_ssn
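The idea behind context awareness can be sketched as combining two signals: a suggestive variable name plus a matching value pattern. This is an illustrative heuristic for SSNs only, not Privalyse's actual rule set.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
NAME_HINT = re.compile(r"(ssn|email|phone|address)", re.IGNORECASE)

def pii_signal(var_name: str, value: str) -> bool:
    # A hit requires BOTH a suggestive variable name and a matching value
    # pattern, which cuts false positives on look-alike values (e.g. order IDs).
    return bool(NAME_HINT.search(var_name)) and bool(SSN_RE.search(value))
```

With this combined check, `customer_ssn = "123-45-6789"` is flagged, while the same digits stored in `order_id` are not.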

βš–οΈ GDPR & Data Sovereignty

Maps data flows to ensure compliance.

  • Flags: Data transfers to non-EU AI providers
  • Verifies: Usage of sanitization/masking functions before data egress

πŸ”§ Recognized Sanitizers

Privalyse automatically recognizes these sanitization patterns and won't flag sanitized data:

| Library/Pattern | Recognition |
| --- | --- |
| privalyse-mask (PrivalyseMasker.mask()) | βœ… Full |
| presidio (Microsoft Presidio) | βœ… Full |
| scrubadub | βœ… Full |
| Custom functions with: mask, anonymize, hash, encrypt, redact, sanitize | βœ… Full |
| Masked text patterns: {Name_xyz}, {Email_abc} | βœ… Full |
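Per the naming convention above, a project-local helper is treated as a sanitizer when its name contains one of those keywords. A hypothetical example (the function is ours for illustration, not part of any library):

```python
import hashlib

def redact_email(email: str) -> str:
    # "redact" in the function name marks the return value as sanitized,
    # and the output mimics the {Email_xxx} masked-text pattern.
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()[:6]
    return f"{{Email_{digest}}}"

safe = redact_email("peter@example.com")  # safe to pass to an LLM sink
```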

πŸ€– For AI Agents & MCP Servers

Privalyse is agent-friendly. Get structured JSON output for autonomous remediation:

privalyse --format json --out privalyse_report.json

AI coding agents can read the report and automatically fix privacy leaks.
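A sketch of how an agent might consume the report. The field names here (`findings`, `file`, `line`, `rule`) are assumptions for illustration; inspect your generated privalyse_report.json for the actual schema.

```python
import json

# Assumed report shape β€” check the real output for the actual schema.
raw = '{"findings": [{"file": "app.py", "line": 42, "rule": "pii-to-llm-sink"}]}'
report = json.loads(raw)

for finding in report.get("findings", []):
    print(f'{finding["file"]}:{finding["line"]}  {finding["rule"]}')
```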


πŸ—ΊοΈ Roadmap

  • Python Support (Full AST Analysis)
  • JavaScript/TypeScript Support (AST & Regex)
  • Cross-File Taint Tracking
  • privalyse-mask Integration
  • VS Code Extension (Coming Soon)
  • Custom Rule Engine

🀝 Contributing

We love contributions! Check out CONTRIBUTING.md to get started.

πŸ“„ License

MIT License. See LICENSE for details.
