
πŸ›‘οΈ WeightsWatcher

Supply Chain Security for AI Models

WeightsWatcher is a cryptographic integrity verification system for Machine Learning artifacts. It protects production environments from Model Poisoning, Ransomware, and Time-of-Check to Time-of-Use (TOCTOU) attacks by ensuring that the model loaded into memory is bit-for-bit identical to the one validated during training.



🚨 The Problem

In modern MLOps, models are trained in secure environments but deployed to edge devices or cloud servers where file systems are vulnerable.

  • Pickle is unsafe: A standard torch.load() call can execute arbitrary code during deserialization (RCE).
  • Race Conditions (TOCTOU): Checking a hash before loading doesn't prevent an attacker from swapping the file during the read operation (see the sketch after this list).
  • "Evil Maid" Attacks: If an attacker gains write access to your server, they can overwrite both your model and your checksum file.

πŸ›‘οΈ The Solution

WeightsWatcher wraps standard loaders with a "Secure Shim" that enforces:

  1. Digital Signatures (RSA): Verifies that the lock file was signed by a trusted Private Key.
  2. Parallel Merkle Hashing: Uses multi-core processing to hash large models (10GB+) securely and efficiently.
  3. Active Sentry Mode: A background watchdog that monitors the file system and instantly locks down the API if the model file is touched.
  4. Safe Defaults: Enforces weights_only=True for PyTorch to prevent code execution (see the sketch below).
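
Point 4 mirrors a safeguard that plain PyTorch also exposes directly; a minimal sketch without WeightsWatcher:

```python
import torch

# weights_only=True restricts unpickling to tensors and plain containers,
# refusing the arbitrary objects that make classic torch.load an RCE vector.
weights = torch.load("production_model.pt", weights_only=True)
```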

🚀 Installation

WeightsWatcher is modular. Install only what you need:

```bash
# Core Library (CLI & Crypto only)
pip install weightswatcher

# With PyTorch support (Example 01)
pip install "weightswatcher[torch]"

# With LLM/Transformers support (Example 02)
pip install "weightswatcher[llm]"

# With API/FastAPI Sentry support (Example 03)
pip install "weightswatcher[api]"

# ⚡ For Development (All features + tests)
pip install -e ".[dev]"
```

πŸ› οΈ Usage

1. The CLI (DevOps)

Manage keys and lock files directly from the terminal.

```bash
# 1. Generate RSA Keypair
weightswatcher keygen --out .

# 2. Lock & Sign a Model (Training Stage)
weightswatcher lock production_model.pt --key private_key.pem

# 3. Verify a Model (Deployment Stage)
weightswatcher verify production_model.pt --key public_key.pem
```

2. Python API (Developers)

Integrate secure loading into your inference code.

```python
from weightswatcher import secure_load

# This will RAISE an exception if the signature is invalid
# or if the file content has been tampered with.
weights = secure_load(
    "production_model.pt",
    public_key_path="public_key.pem",
    weights_only=True,
)

model.load_state_dict(weights)
```
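
In a service you usually want to fail closed on any verification error. The exception type raised by secure_load is not specified above, so this sketch catches broadly:

```python
import sys

from weightswatcher import secure_load

try:
    weights = secure_load(
        "production_model.pt",
        public_key_path="public_key.pem",
        weights_only=True,
    )
except Exception as err:  # the exact exception class is an assumption
    print(f"Refusing to start: integrity check failed ({err})", file=sys.stderr)
    sys.exit(1)

# ...then proceed with model.load_state_dict(weights) as above.
```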

🧪 Examples & Demos

The examples/ directory contains runnable scripts demonstrating real-world attack vectors.

Example 01: Core Security Demo (Bit Rot vs. Evil Maid)

File: examples/01_real_world_crypto_test.py

A comprehensive security test that runs two scenarios sequentially:

  1. Act 2 (Bit Rot): Simulates random file corruption. WeightsWatcher blocks this via Hash Mismatch.
  2. Act 3 (Evil Maid): Simulates an attacker modifying the model and updating the lock file hashes to hide their tracks. WeightsWatcher blocks this via Signature Mismatch.

```text
[ACT 2] 💥 Attack A: Simple Corruption...
    ✅ SUCCESS: Blocked by Hash Mismatch.
    🛑 LOG: Corruption detected in Chunk #0

[ACT 3] 😈 Attack B: The 'Evil Maid'...
    ✅ SUCCESS: Blocked by Signature Verification.
    🛑 LOG: 🚨 INVALID SIGNATURE: The manifest file has been tampered with!
```
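
For intuition, the "Evil Maid" step amounts to something like the hypothetical sketch below: the attacker can rewrite the lock file's hashes, but without the private key cannot re-sign the manifest, so signature verification fails.

```python
# Hypothetical sketch of the attack the script simulates (file names follow
# the Usage section; the lock-file layout is an assumption).
with open("production_model.pt", "r+b") as f:
    f.write(b"EVIL")  # overwrite the first bytes of the model

# The attacker then recomputes the chunk hashes and writes them back into
# the lock file so every hash matches again. But private_key.pem is not on
# the server, so the RSA signature over the manifest cannot be regenerated,
# and verification against public_key.pem fails.
```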

Example 02: The LLM Lobotomy 🧠

File: examples/02_llm_test.py

Demonstrates why integrity matters for GenAI. The script downloads GPT-2 Medium (~1.5 GB) and shows how a "Silent Corruption" attack can leave a model running but brain-damaged.

The Scenario: An attacker modifies the model file on disk, zeroing out a 1MB block of the Embedding Matrix. The model still loads without errors (no crash), but its vocabulary is destroyed.

What you will see:

  1. Baseline: The model successfully lists the colors of the rainbow.
  2. The Attack: We inject the corruption.
  3. The Defense: WeightsWatcher detects the hash mismatch and refuses to load the file.
  4. The "What If": The script forcibly loads the corrupted model to show the consequences. The output becomes incoherent.
```text
[3] 🤖 Generating Text (Baseline)...
    📝 Prompt: 'The colors of the rainbow are red, orange, yellow,'
    ✅ Output: ...green, blue, indigo, and violet.

[4] 😈 SIMULATING ATTACK: Corrupting Vocabulary...
    ⚠️  Injected 1.0 MB of ZEROS at offset 10.0 MB.

[5] 🛡️  Attempting Secure Load...
    ✅ SUCCESS: Attack Blocked!
    🛑 LOG: Corruption detected in Chunk #1

[6] 💀 DEMO: Forcing load to show damage...
    📝 Prompt: 'The colors of the rainbow are red, orange, yellow,'
    💀 Output: ...black, car, dog, 2024, the, the...
```
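
The injection step amounts to something like this sketch (the sizes and offset are taken from the log above; the weights file path is hypothetical):

```python
MB = 1024 * 1024

# Zero out 1 MB of on-disk weights at a 10 MB offset. The file still
# deserializes cleanly, but every embedding row stored in that region
# becomes a zero vector: the "Silent Corruption" failure mode.
with open("gpt2-medium/pytorch_model.bin", "r+b") as f:  # path is an assumption
    f.seek(10 * MB)
    f.write(b"\x00" * MB)
```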

Example 03: The Sentry (Active Defense) 🛡️

File: examples/03_fastapi_integration.py

Runs a live FastAPI server protected by a background Watchdog. This demonstrates "Event-Driven Security" where the API automatically shuts down if the model file is tampered with.

How to Run & Verify:

1. Start the Server (Terminal 1)

```bash
python examples/03_fastapi_integration.py
```

2. Open the Test Interface (Browser) Navigate to http://localhost:8000/docs. The built-in Swagger UI lets you interact with the API without needing complex curl commands.

3. Send a Valid Request

  • Click on the green POST /predict bar.
  • Click the Try it out button (top right).
  • Click the big blue Execute button.
  • Result: Scroll down to "Server response". You should see Code 200 and a JSON body:

    ```json
    {
      "status": "success",
      "prediction": "class_1",
      "confidence": 0.98
    }
    ```

4. Attack the Model (Terminal 2) Keep the server running. Open a new terminal window and corrupt the model file on disk:

```bash
echo HACKED >> api_model.pt
```

Watch Terminal 1: You will see the Sentry detect the change and trip the kill-switch immediately.

5. Verify the Lockout (Browser)

  • Go back to your browser.
  • Click the blue Execute button again.
  • Result: The API now rejects the request. You will see Code 503 (Service Unavailable):

    ```json
    {
      "detail": "🚨 SECURITY ALERT: System compromised. Model integrity check failed."
    }
    ```

πŸ—οΈ Architecture

The "Manifest" Protocol

WeightsWatcher splits models into 10MB chunks. The manifest contains the SHA-256 hash of every chunk, and the manifest itself is signed with an RSA Private Key.
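
A condensed sketch of the lock step, assuming hashlib plus the pyca/cryptography package (the real implementation hashes chunks in parallel and builds a Merkle tree; this sequential version only shows the chunk-then-sign shape):

```python
import hashlib
import json

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

CHUNK = 10 * 1024 * 1024  # 10 MB, matching the chunk size described above

def build_manifest(path: str) -> bytes:
    """Hash the file chunk by chunk and serialize the hash list."""
    chunk_hashes = []
    with open(path, "rb") as f:
        while block := f.read(CHUNK):
            chunk_hashes.append(hashlib.sha256(block).hexdigest())
    return json.dumps({"file": path, "chunks": chunk_hashes}).encode()

def sign_manifest(manifest: bytes, private_key_path: str) -> bytes:
    """Sign the manifest bytes with an RSA private key (PSS padding)."""
    with open(private_key_path, "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=None)
    return key.sign(
        manifest,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
```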

The Sentry Pattern (Event-Driven Security)

Instead of re-hashing the model on every request (high latency), WeightsWatcher uses an OS-level file system watcher.

```mermaid
graph TD
    A[Attack: Malicious Write] -->|File Modified| B(OS Kernel Event)
    B -->|Notify| C{WeightsWatcher Sentry}
    C -->|Trigger| D[Parallel Integrity Scan]
    D -->|Fail| E[Global Panic Switch]
    E -->|Block| F["API Endpoints (503)"]
```
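
A minimal sketch of the pattern using the watchdog package (the dependency is an assumption; WeightsWatcher's internals may differ):

```python
import threading

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

COMPROMISED = threading.Event()  # the global panic switch

class ModelSentry(FileSystemEventHandler):
    """Trips the panic switch when the watched model file changes."""

    def __init__(self, model_path: str):
        self.model_path = model_path

    def on_modified(self, event):
        if event.src_path.endswith(self.model_path):
            # Simplified: the real flow re-runs the integrity scan first
            # and only panics if that scan fails.
            COMPROMISED.set()

observer = Observer()
observer.schedule(ModelSentry("api_model.pt"), path=".", recursive=False)
observer.start()

# Each request handler then fails closed once the switch is tripped, e.g.:
# if COMPROMISED.is_set():
#     raise HTTPException(status_code=503, detail="Model integrity check failed.")
```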

🤖 CI/CD Integration

This repo includes GitHub Actions workflows to automate your supply chain security.

  • verify_models.yml: Runs on PRs. Scans the repo to ensure all .pt files match their lock files (a rough equivalent is sketched after this list).
  • release_model.yml: Runs on Tags (v*). Automatically signs release artifacts using a Private Key stored in GitHub Secrets.
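
As a rough picture of what the PR check does, an equivalent sketch in Python (the actual workflow contents are not reproduced here):

```python
import pathlib
import subprocess
import sys

# Verify every .pt file in the repo with the CLI from the Usage section;
# exit non-zero so the CI job fails if any model does not match its lock file.
failures = 0
for model in pathlib.Path(".").rglob("*.pt"):
    result = subprocess.run(
        ["weightswatcher", "verify", str(model), "--key", "public_key.pem"]
    )
    failures += result.returncode != 0

sys.exit(1 if failures else 0)
```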

πŸ—ΊοΈ Roadmap

  • Chunked Merkle Tree Hashing
  • RSA Digital Signatures
  • Parallel Processing (Multi-core hashing)
  • Active Sentry (Watchdog)
  • CLI Tool
  • Support for TensorFlow/Keras (.h5)
  • Integration with MLflow

License

MIT
