WeightsWatcher is a cryptographic integrity verification system for Machine Learning artifacts. It protects production environments from Model Poisoning, Ransomware, and Time-of-Check to Time-of-Use (TOCTOU) attacks by ensuring that the model loaded into memory is bit-for-bit identical to the one validated during training.
In modern MLOps, models are trained in secure environments but deployed to edge devices or cloud servers where file systems are vulnerable.
- Pickle is unsafe: Standard `torch.load()` can execute arbitrary code (RCE); see the sketch after this list.
- Race Conditions (TOCTOU): Checking a hash before loading doesn't prevent an attacker from swapping the file during the read operation.
- "Evil Maid" Attacks: If an attacker gains write access to your server, they can overwrite both your model and your checksum file.
WeightsWatcher wraps standard loaders with a "Secure Shim" that enforces:
- Digital Signatures (RSA): Verifies that the lock file was signed by a trusted Private Key.
- Parallel Merkle Hashing: Uses multi-core processing to hash large models (10GB+) securely and efficiently.
- Active Sentry Mode: A background watchdog that monitors the file system and instantly locks down the API if the model file is touched.
- Safe Defaults: Enforces `weights_only=True` for PyTorch to prevent code execution.
WeightsWatcher is modular. Install only what you need:
# Core Library (CLI & Crypto only)
pip install weightswatcher
# With PyTorch support (Examples 01 & 02)
pip install "weightswatcher[torch]"
# With LLM/Transformers support (Example 03)
pip install "weightswatcher[llm]"
# With API/FastAPI Sentry support (Example 04)
pip install "weightswatcher[api]"
# β‘ For Development (All features + tests)
pip install -e ".[dev]"Manage keys and lock files directly from the terminal.
```bash
# 1. Generate RSA Keypair
weightswatcher keygen --out .

# 2. Lock & Sign a Model (Training Stage)
weightswatcher lock production_model.pt --key private_key.pem

# 3. Verify a Model (Deployment Stage)
weightswatcher verify production_model.pt --key public_key.pem
```

Integrate secure loading into your inference code.
```python
from weightswatcher import secure_load

# This will RAISE an exception if the signature is invalid
# or if the file content has been tampered with.
weights = secure_load(
    "production_model.pt",
    public_key_path="public_key.pem",
    weights_only=True,
)

model.load_state_dict(weights)
```

The `examples/` directory contains runnable scripts demonstrating real-world attack vectors.
File: examples/01_real_world_crypto_test.py
A comprehensive security test that runs two scenarios sequentially:
- Act 2 (Bit Rot): Simulates random file corruption. WeightsWatcher blocks this via Hash Mismatch.
- Act 3 (Evil Maid): Simulates an attacker modifying the model and updating the lock file hashes to hide their tracks. WeightsWatcher blocks this via Signature Mismatch.
```
[ACT 2] 🔥 Attack A: Simple Corruption...
✅ SUCCESS: Blocked by Hash Mismatch.
LOG: Corruption detected in Chunk #0

[ACT 3] Attack B: The 'Evil Maid'...
✅ SUCCESS: Blocked by Signature Verification.
LOG: 🚨 INVALID SIGNATURE: The manifest file has been tampered with!
```
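The Act 3 result hinges on public-key verification: an attacker with write access can recompute every chunk hash in the lock file, but cannot forge the signature over it without the private key. A minimal sketch of that check, assuming the `cryptography` package with RSA-PSS and SHA-256 (illustrative choices, not necessarily WeightsWatcher's exact parameters):

```python
# Why re-hashing is not enough: only the private-key holder can re-sign the manifest.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)

# Training side: sign the honest manifest.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
manifest = b'{"chunks": ["aa11...", "bb22..."]}'
signature = private_key.sign(manifest, pss, hashes.SHA256())

# Evil Maid: swap in fresh hashes that match the tampered model.
forged_manifest = b'{"chunks": ["ee55...", "ff66..."]}'

# Deployment side: verification against the trusted signature fails.
try:
    private_key.public_key().verify(signature, forged_manifest, pss, hashes.SHA256())
except InvalidSignature:
    print("INVALID SIGNATURE: The manifest file has been tampered with!")
```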
File: examples/02_llm_test.py
Demonstrates why integrity matters for GenAI. We download GPT-2 Medium (~1.5GB) and show how a "Silent Corruption" attack can leave a model running but brain-damaged.
The Scenario: An attacker modifies the model file on disk, zeroing out a 1MB block of the Embedding Matrix. The model still loads without errors (no crash), but its vocabulary is destroyed.
What you will see:
- Baseline: The model successfully lists the colors of the rainbow.
- The Attack: We inject the corruption.
- The Defense: WeightsWatcher detects the hash mismatch and refuses to load the file.
- The "What If": The script forcibly loads the corrupted model to show the consequences. The output becomes incoherent.
```
[3] 🤖 Generating Text (Baseline)...
Prompt: 'The colors of the rainbow are red, orange, yellow,'
✅ Output: ...green, blue, indigo, and violet.

[4] SIMULATING ATTACK: Corrupting Vocabulary...
⚠️ Injected 1.0 MB of ZEROS at offset 10.0 MB.

[5] 🛡️ Attempting Secure Load...
✅ SUCCESS: Attack Blocked!
LOG: Corruption detected in Chunk #1

[6] DEMO: Forcing load to show damage...
Prompt: 'The colors of the rainbow are red, orange, yellow,'
Output: ...black, car, dog, 2024, the, the...
```
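For reference, the corruption injected in step [4] needs nothing more than a seek-and-overwrite on the weights file. A hypothetical reproduction with plain file I/O (the file name and offset are illustrative assumptions; the actual logic lives in `examples/02_llm_test.py`):

```python
# Hypothetical "Silent Corruption": zero out a 1 MB block in the middle of the
# weights file. The file still deserializes, but the affected embeddings are gone.
ONE_MB = 1024 * 1024

with open("pytorch_model.bin", "r+b") as f:  # illustrative file name
    f.seek(10 * ONE_MB)          # land inside the embedding matrix
    f.write(b"\x00" * ONE_MB)    # overwrite 1 MB with zeros; file size is unchanged
```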
File: examples/03_fastapi_integration.py
Runs a live FastAPI server protected by a background Watchdog. This demonstrates "Event-Driven Security" where the API automatically shuts down if the model file is tampered with.
How to Run & Verify:
1. Start the Server (Terminal 1)

```bash
python examples/03_fastapi_integration.py
```

2. Open the Test Interface
Open your web browser to: http://localhost:8000/docs
This uses the built-in Swagger UI to let you interact with the API without needing complex curl commands.
3. Send a Valid Request
- Click on the green `POST /predict` bar.
- Click the Try it out button (top right).
- Click the big blue Execute button.
- Result: Scroll down to "Server response". You should see Code 200 and a JSON body:
{ "status": "success", "prediction": "class_1", "confidence": 0.98 }
4. Attack the Model (Terminal 2) Keep the server running. Open a new terminal window and corrupt the model file on disk:
```bash
echo HACKED >> api_model.pt
```

Watch Terminal 1: You will see the Sentry detect the change and trip the kill-switch immediately.
5. Verify the Lockout (Browser)
- Go back to your browser.
- Click the blue Execute button again.
- Result: The API now rejects the request. You will see Code 503 (Service Unavailable):
{ "detail": "π¨ SECURITY ALERT: System compromised. Model integrity check failed." }
WeightsWatcher splits models into 10MB chunks. The lock file is a manifest containing the SHA-256 hash of every chunk, and the manifest itself is signed with an RSA Private Key.
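Chunked hashing itself needs only the standard library; here is a sketch under the assumption of a simple JSON manifest layout (not WeightsWatcher's actual lock-file format):

```python
# Sketch: hash a model file in fixed 10 MB blocks and record one digest per block.
# The JSON layout is an illustrative assumption, not the real lock-file format.
import hashlib
import json

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, as described above


def build_manifest(path: str) -> str:
    digests = []
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digests.append(hashlib.sha256(chunk).hexdigest())
    return json.dumps({"chunk_size": CHUNK_SIZE, "chunks": digests})
```

Because every block is hashed independently, a verifier can report exactly which block changed (the "Corruption detected in Chunk #N" lines above) and can fan the work out across CPU cores, which is what the parallel hashing mode exploits.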
Instead of re-hashing the model on every request (high latency), WeightsWatcher uses an OS-level file system watcher.
```mermaid
graph TD
    A[Attack: Malicious Write] -->|File Modified| B(OS Kernel Event)
    B -->|Notify| C{WeightsWatcher Sentry}
    C -->|Trigger| D[Parallel Integrity Scan]
    D -->|Fail| E[Global Panic Switch]
    E -->|Block| F["API Endpoints (503)"]
```
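The same chain can be sketched with the `watchdog` package, which wraps the OS-level notification APIs (inotify, FSEvents, ReadDirectoryChangesW). The handler and re-scan hook below are illustrative assumptions, not WeightsWatcher's internals:

```python
# Sketch of the Sentry: subscribe to kernel file events and trip a panic flag
# when the protected model file changes. A real implementation would re-run the
# chunk/signature scan inside on_modified before deciding to panic.
import os

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WATCHED_FILE = os.path.abspath("api_model.pt")


class SentryHandler(FileSystemEventHandler):
    def __init__(self) -> None:
        self.panic = False

    def on_modified(self, event) -> None:
        if os.path.abspath(event.src_path) == WATCHED_FILE:
            self.panic = True  # the API layer checks this flag and returns 503


handler = SentryHandler()
observer = Observer()
observer.schedule(handler, path=os.path.dirname(WATCHED_FILE), recursive=False)
observer.start()  # runs in a background thread until observer.stop() is called
```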
This repo includes GitHub Actions workflows to automate your supply chain security.
- `verify_models.yml`: Runs on PRs. Scans the repo to ensure all `.pt` files match their lock files.
- `release_model.yml`: Runs on Tags (`v*`). Automatically signs release artifacts using a Private Key stored in GitHub Secrets.
- Chunked Merkle Tree Hashing
- RSA Digital Signatures
- Parallel Processing (Multi-core hashing)
- Active Sentry (Watchdog)
- CLI Tool
- Support for TensorFlow/Keras (`.h5`)
- Integration with MLflow
MIT