# LocalRAG

> This example was extracted from AGPA, my fully autonomous general-purpose agent (closed-source, ~150k LOC).

A local Retrieval-Augmented Generation (RAG) system for .NET that uses BERT embeddings and multiple search strategies for efficient semantic search and information retrieval.

## Overview

LocalRAG provides a complete RAG implementation that runs entirely on your local machine, with no external API dependencies. It combines BERT-based embeddings with multiple search strategies to provide fast and accurate semantic search capabilities.
## Features

- BERT-based Text Embeddings: Uses ONNX Runtime for high-performance BERT inference
- Multiple Search Strategies:
  - Locality-Sensitive Hashing (LSH) for efficient similarity search
  - Full-Text Search (FTS5) integration via SQLite
  - Memory-based vector indexing for real-time queries
- SQLite Database: Persistent storage for embeddings and metadata
- Configurable Processing: Adjustable chunking, overlap, and threading parameters
- Asynchronous API: Non-blocking operations for better performance
- Windows Forms Demo: Example application demonstrating usage
## Prerequisites

- .NET 10.0 SDK or later
- Windows, Linux, or macOS
- BERT ONNX model (see setup instructions below)
- BERT vocabulary file (vocab.txt)
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/LocalRAG.git
  cd LocalRAG
  ```

- Restore NuGet packages:

  ```bash
  dotnet restore
  ```

- Download a BERT model in ONNX format:
  - Visit Hugging Face ONNX Models
  - Models under Apache 2.0; see Hugging Face for details
  - Download a BERT model (e.g., `bert-base-uncased` or `bert-large-uncased`)
  - Place the `.onnx` file in the `onnxBERT/` directory
  - Download the corresponding `vocab.txt` file
  - Place it in the `Vocabularies/` directory

- Build the project:

  ```bash
  dotnet build
  ```

## Quick Start

```csharp
using LocalRAG;

// Configure the RAG system
var config = new RAGConfiguration
{
    ModelPath = "onnxBERT/model.onnx",
    VocabularyPath = "Vocabularies/vocab.txt",
    DatabasePath = "Database/embeddings.db"
};

// Initialize the database
using var database = new EmbeddingDatabaseNew(config);

// Add documents
await database.AddRequestToEmbeddingDatabaseAsync(
    requestId: "doc1",
    theRequest: "What is machine learning?",
    embed: true
);

await database.UpdateTextResponse(
    requestId: "doc1",
    message: "Machine learning is a subset of artificial intelligence...",
    embed: true
);

// Search for similar content
var results = await database.SearchEmbeddingsAsync(
    searchText: "artificial intelligence",
    topK: 5,
    minimumSimilarity: 0.75f
);

foreach (var result in results)
{
    Console.WriteLine($"Similarity: {result.Similarity:F3}");
    Console.WriteLine($"Request: {result.Request}");
    Console.WriteLine($"Response: {result.TextResponse}");
}
```

## Configuration

The RAGConfiguration class provides various settings:
```csharp
public class RAGConfiguration
{
    // File paths
    public string DatabasePath { get; set; }    // SQLite database location
    public string ModelPath { get; set; }       // ONNX model file
    public string VocabularyPath { get; set; }  // BERT vocab file

    // Embedding settings
    public int MaxSequenceLength { get; set; } = 512;
    public int WordsPerString { get; set; } = 40;
    public double OverlapPercentage { get; set; } = 15;

    // LSH settings
    public int NumberOfHashFunctions { get; set; } = 8;
    public int NumberOfHashTables { get; set; } = 10;

    // Performance settings
    public int InterOpNumThreads { get; set; } = 32;
    public int IntraOpNumThreads { get; set; } = 2;
    public int MaxCacheItems { get; set; } = 10000;
}
```

## Core Components

- EmbedderClassNew: Handles BERT embedding generation using ONNX Runtime
- EmbeddingDatabaseNew: Main database interface with SQLite storage
- MemoryHashIndex: In-memory hash-based indexing for fast lookups
- FeedbackDatabaseValues: Data model for stored documents and embeddings
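The chunking settings in RAGConfiguration (WordsPerString, OverlapPercentage) control how documents are split before embedding. A minimal sketch of sliding-window chunking with percentage overlap, shown here in Python for brevity; `chunk_words` is a hypothetical helper, not part of LocalRAG's API:

```python
def chunk_words(words, words_per_chunk=40, overlap_pct=15):
    """Split a word list into windows of words_per_chunk words, where
    consecutive windows share overlap_pct percent of their words."""
    overlap = int(words_per_chunk * overlap_pct / 100)  # words shared between windows
    step = words_per_chunk - overlap                    # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + words_per_chunk])
        if start + words_per_chunk >= len(words):
            break  # the last window already reached the end of the text
    return chunks

# A 100-word document with the default settings (40 words, 15% overlap):
chunks = chunk_words([f"w{i}" for i in range(100)])
# the window advances 34 words at a time, so chunks start at words 0, 34, and 68
```

Overlap keeps a sentence that straddles a chunk boundary fully inside at least one chunk, at the cost of embedding some words twice.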
## How It Works

- Text is preprocessed (tokenized, stop words removed)
- BERT generates embeddings via ONNX Runtime
- Embeddings are indexed using LSH for fast retrieval
- Multiple search strategies are combined for optimal results
- Results are ranked by similarity score
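The LSH indexing and similarity-ranking steps can be sketched with random-hyperplane hashing and cosine scoring. This is a toy illustration assuming sign-of-projection LSH with a single hash table, not LocalRAG's actual implementation:

```python
import math
import random

random.seed(42)
DIM = 8         # toy embedding dimension (BERT-base actually produces 768)
NUM_PLANES = 8  # hash functions per table, cf. NumberOfHashFunctions

# Each hash function is a random hyperplane; the bit records which side a vector falls on.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_PLANES)]

def lsh_key(vec):
    """Concatenate one sign bit per hyperplane into a bucket key."""
    return tuple(1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
                 for plane in planes)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Index: bucket key -> list of (doc_id, vector)
docs = {f"doc{i}": [random.gauss(0, 1) for _ in range(DIM)] for i in range(50)}
index = {}
for doc_id, vec in docs.items():
    index.setdefault(lsh_key(vec), []).append((doc_id, vec))

# Query: only the query's bucket is scanned, then the candidates are ranked exactly.
query = docs["doc7"]  # a vector guaranteed to collide with itself
candidates = index.get(lsh_key(query), [])
ranked = sorted(candidates, key=lambda pair: cosine(query, pair[1]), reverse=True)
```

Nearby vectors tend to fall on the same side of each hyperplane, so they land in the same bucket; exact cosine similarity is then computed only for that bucket's candidates instead of the whole corpus. Multiple hash tables (NumberOfHashTables) reduce the chance that a true neighbor is missed by one unlucky hyperplane split.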
## Demo Application

The DemoApp project provides a Windows Forms application demonstrating LocalRAG usage:

```bash
cd DemoApp
dotnet run
```

The demo shows:
- Adding documents with embeddings
- Searching for similar content
- Retrieving conversation history
- Formatting search results
## Performance Considerations

- First Run: Initial embedding generation may be slow
- Caching: Frequently accessed embeddings are cached in memory
- Threading: Adjust `InterOpNumThreads` and `IntraOpNumThreads` based on your CPU
- Database Size: SQLite performs well up to several million embeddings
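The two thread settings correspond to ONNX Runtime's standard inter-op/intra-op session options. For reference, the same knobs in the Python onnxruntime API (a configuration fragment; the thread counts are illustrative and the model path must exist):

```python
import onnxruntime as ort

opts = ort.SessionOptions()
opts.inter_op_num_threads = 4  # threads for running independent graph nodes in parallel
opts.intra_op_num_threads = 2  # threads used inside a single operator (e.g., a matmul)
session = ort.InferenceSession("onnxBERT/model.onnx", sess_options=opts)
```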
## Troubleshooting

### Model file not found

Ensure the ONNX model file exists at the configured `ModelPath`. Download it from Hugging Face if needed.

### Out of memory

Reduce `MaxCacheItems` or `MaxSequenceLength` in the configuration.

### Slow performance
- Use a smaller BERT model (base vs. large)
- Increase thread count if you have more CPU cores
- Enable GPU support via ONNX Runtime GPU packages
## Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
## License

This project is licensed under the Apache License 2.0 - see LICENSE.txt for details.
## Acknowledgments

- Built with ONNX Runtime
- Uses FastBertTokenizer for tokenization
- BERT models from Hugging Face
## Roadmap

- GPU acceleration support
- More embedding models (Sentence Transformers, etc.)
- Vector database integration options
- REST API interface
- Multi-language support
## Support

For questions and issues, please open an issue on GitHub.