RAG system for semantic search and Q&A over APT threat intelligence reports

hackerman70000/apt-rag

RAG for APT Threat Intelligence

RAG (Retrieval-Augmented Generation) system for semantic search and Q&A over APT (Advanced Persistent Threat) reports.

[Figure: RAG Performance Improvement]

Requirements

Installation

uv sync
cp .env.template .env  # then set HF_TOKEN in .env
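
Before running the tools, it can help to confirm that `.env` actually contains the keys this README mentions (`HF_TOKEN` for the pipeline, `OPENROUTER_API_KEY` for benchmarking). The following is a minimal sketch, not part of the repository: `parse_env` and `missing_keys` are hypothetical helpers, the parser handles only plain `KEY=value` lines, and the project itself may load its environment differently.

```python
def parse_env(text):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env, required):
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


# Illustrative file contents; in practice read them with open(".env").read().
sample = (
    "HF_TOKEN=hf_example\n"
    "# OPENROUTER_API_KEY is only needed for benchmarking\n"
    "OPENROUTER_API_KEY=\n"
)
env = parse_env(sample)
missing = missing_keys(env, ["HF_TOKEN", "OPENROUTER_API_KEY"])
print("Missing keys:", missing)
```

An empty value counts as missing here, so a key copied from the template but never filled in is reported.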

Usage

# 1. Extract text from PDFs
uv run tools/extract data/reports/ data/extracted/

# 2. Generate embeddings
uv run tools/embed data/extracted/

# 3. Query the system
uv run tools/query "What TTPs does APT28 use?"
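
Conceptually, the query step of a RAG pipeline embeds the question, ranks the stored chunks by similarity to it, and passes the best matches to the language model as context. The sketch below illustrates that idea only; it is not the code behind `tools/query`. The function names, the toy 3-dimensional vectors, the placeholder chunk texts, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, retrieved):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {text}" for _, text in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy data: real embeddings come from an embedding model and have far more dimensions.
chunks = [
    ("placeholder chunk A", [0.9, 0.1, 0.0]),
    ("placeholder chunk B", [0.1, 0.9, 0.0]),
    ("placeholder chunk C", [0.0, 0.2, 0.9]),
]
query_vec = [0.8, 0.2, 0.1]  # stands in for the embedded question

hits = top_k(query_vec, chunks, k=2)
prompt = build_prompt("What TTPs does APT28 use?", hits)
print(prompt)
```

With these toy vectors, chunk A ranks first and chunk B second, so only those two appear in the prompt.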

Evaluation (RAG vs LLM-only)

# Benchmark (requires OPENROUTER_API_KEY in .env)
uv run tools/benchmark --mode both --model openai/gpt-oss-20b

# LLM-as-a-Judge evaluation
uv run tools/evaluate data/evaluation/benchmark.json --model openai/gpt-oss-120b

# Visualize results
uv run jupyter notebook notebooks/evaluation_results.ipynb
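
To compare the two modes outside the notebook, judge scores can be averaged per mode. The sketch below assumes a record layout with `mode` and `score` fields; that layout is hypothetical, and the actual output of `tools/evaluate` may be structured differently. The numbers are made-up placeholders, not results from this project.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_mode(records):
    """Group judge scores by mode and return the mean score for each mode."""
    by_mode = defaultdict(list)
    for record in records:
        by_mode[record["mode"]].append(record["score"])
    return {mode: mean(scores) for mode, scores in by_mode.items()}


# Placeholder records in an assumed schema (not real evaluation data).
records = [
    {"question_id": 1, "mode": "rag", "score": 4},
    {"question_id": 1, "mode": "llm", "score": 2},
    {"question_id": 2, "mode": "rag", "score": 5},
    {"question_id": 2, "mode": "llm", "score": 3},
]

summary = mean_score_by_mode(records)
for mode, score in summary.items():
    print(f"{mode}: {score:.2f}")
```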

Tests

uv run pytest
