A sophisticated Research Assistant powered by GraphRAG technology for analyzing documents and research data.
BasicBot is an advanced research assistant that leverages Graph Retrieval-Augmented Generation (GraphRAG) to provide intelligent analysis of documents, research papers, and technical content. Built on a modern AI stack, it combines local LLM inference with graph-based knowledge representation for superior document understanding and question answering.
- FastAPI Backend - High-performance async API server
- Next.js Frontend - Modern React-based user interface
- Neo4j Graph Database - Advanced graph data storage and querying
- Ollama Integration - Local LLM inference with Granite models
- Vector Embeddings - Semantic search and similarity matching
- RLHF Adaptation - Continuous learning from user interactions
- Multi-modal Retrieval: Hybrid search combining semantic vectors and graph relationships
- Document Analysis: Advanced processing of various document types
- Context-Aware Responses: Maintains conversation history and adapts to user needs
- Citation Tracking: Automatically cites document sources in responses
- GraphRAG Implementation: Leverages Neo4j's graph data science capabilities
- Adaptive RLHF: Learns and improves response quality over time
- Plugin Architecture: Extensible system for additional data sources and models
- Real-time Evaluation: Built-in performance metrics and quality grading
- Responsive Web Interface: Clean, intuitive design with dark mode
- Real-time Chat: Streaming responses with typing indicators
- Document Management: Upload and organize research materials
- System Monitoring: Comprehensive health checks and metrics
- Comprehensive API: RESTful endpoints with automatic documentation
- Evaluation Framework: Built-in testing and performance measurement
- Modular Design: Clean separation of concerns for easy maintenance
- Docker Integration: Containerized deployment with docker-compose
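The hybrid retrieval mentioned above (semantic vectors combined with graph relationships) can be sketched in a few lines. This is an illustrative toy, not BasicBot's actual implementation: the field names (`embedding`, `graph_distance`), the `alpha` weighting, and the proximity formula are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(query_vec, doc, alpha=0.7):
    # alpha blends vector similarity with graph proximity; both the blend
    # and the document fields are illustrative, not BasicBot's schema.
    vec_score = cosine(query_vec, doc["embedding"])
    graph_score = 1.0 / (1.0 + doc["graph_distance"])  # nearer nodes score higher
    return alpha * vec_score + (1 - alpha) * graph_score

docs = [
    {"id": "d1", "embedding": [1.0, 0.0], "graph_distance": 2},
    {"id": "d2", "embedding": [0.6, 0.8], "graph_distance": 0},
]
ranked = sorted(docs, key=lambda d: hybrid_score([1.0, 0.0], d), reverse=True)
```

A document that is only moderately similar in vector space can still rank highly if it sits close to the query's entities in the graph, which is the core idea behind hybrid GraphRAG retrieval.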
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     Next.js     │     │     FastAPI     │     │      Neo4j      │
│    Frontend     │────►│     Backend     │────►│    Graph DB     │
│   (Port 3000)   │     │   (Port 8000)   │     │   (Port 7687)   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 ▼
                         ┌───────────────┐
                         │ Ollama Models │
                         │ (Port 11434)  │
                         └───────────────┘
```
- Frontend Layer
  - React 19 with Next.js 14
  - TypeScript for type safety
  - Radix UI components
  - Tailwind CSS styling
- Backend Layer
  - FastAPI with async support
  - Modular service architecture
  - Pydantic data validation
  - CORS-enabled for frontend integration
- Data Layer
  - Neo4j graph database with GDS
  - Vector embeddings with similarity search
  - Schema-based data modeling
  - Redis for caching and sessions
- AI/ML Layer
  - Ollama integration for local inference
  - Granite4 micro model for efficiency
  - MXBAI embeddings for semantic search
  - Adaptive RLHF learning system
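To make the AI/ML layer concrete, here is a minimal sketch of a non-streaming call to Ollama's local `/api/generate` endpoint using only the standard library. The endpoint shape follows Ollama's REST API; the helper names are our own, and the model tag is taken from the persona configuration shown later in this README.

```python
import json
import urllib.request

def build_generate_payload(prompt, model="granite4:micro-h"):
    """Request body for Ollama's /api/generate; stream=False asks for one JSON reply."""
    return {"model": model, "prompt": prompt, "stream": False}

def ollama_generate(prompt, host="localhost:11434"):
    """Blocking completion call (requires a running Ollama server)."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"http://{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama returns the completion text in the "response" field
        return json.load(resp)["response"]
```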
- Python 3.9+ with pip package manager
- Node.js 16+ with npm package manager
- Docker Desktop for containerized services
- Ollama for local LLM inference
- 8GB+ RAM recommended for optimal performance (see system requirements below)
| Component | Minimum | Recommended |
|---|---|---|
| RAM | 8GB | 16GB+ |
| CPU | 4 cores | 8+ cores |
| Storage | 20GB | 50GB+ |
| Network | Stable internet | High-speed |
```bash
git clone https://github.com/kliewerdaniel/basicbot.git
cd basicbot

# Run comprehensive setup script
./setup.sh
```

This script will:
- Create Python virtual environment
- Install all dependencies
- Setup Docker containers (Neo4j, Redis)
- Pull required Ollama models
- Create database schema and indexes
- Perform initial data ingestion if files are present
```bash
# Start all services
./start.sh
```

- Web Interface: http://localhost:3000
- API Documentation: http://localhost:8000/docs
- Neo4j Browser: http://localhost:7474
- API Health Check: http://localhost:8000/api/health
- Navigate to the web interface
- Upload documents through the document management panel
- Ask questions in the chat interface
- Review responses with source citations and relevance scores
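Beyond the web interface, the chat endpoint can be called programmatically. The sketch below builds the request body shown in the curl example using only the standard library; the helper names (`build_chat_payload`, `ask`) are illustrative, not part of BasicBot's codebase.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default FastAPI port

def build_chat_payload(query, chat_history=None, session_id=None):
    """Assemble the JSON body expected by POST /api/chat."""
    return {
        "query": query,
        "chat_history": chat_history or [],
        "session_id": session_id,
    }

def ask(query, session_id=None):
    """Send one question and return the parsed JSON response (backend must be running)."""
    body = json.dumps(build_chat_payload(query, session_id=session_id)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```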
```bash
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the main approaches to neural network optimization?",
    "chat_history": [],
    "session_id": "optional-session-id"
  }'
```

```bash
curl "http://localhost:8000/api/search?q=neural%20network%20optimization&limit=10"
```

```bash
curl http://localhost:8000/api/health
```

```bash
# Ingest CSV data files
python3 scripts/ingest_data.py --csv data/your_data.csv --create-indexes

# Ingest PDF research papers
python3 scripts/ingest_research_data.py --directory data/research_papers/
```

| Variable | Default | Description |
|---|---|---|
| `NEO4J_URI` | `bolt://localhost:7687` | Neo4j connection URI |
| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | `research2025` | Neo4j password |
| `REDIS_URL` | `redis://localhost:6379` | Redis connection URL |
| `OLLAMA_HOST` | `localhost:11434` | Ollama server address |
| `PORT` | `8000` | FastAPI server port |
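A minimal sketch of how these variables might be consumed in Python, with the table's defaults as fallbacks. The variable names and defaults come from the table; the loading code itself is an assumption, not BasicBot's actual configuration module.

```python
import os

# Defaults mirror the configuration table; set the corresponding
# environment variable to override any of them.
NEO4J_URI = os.getenv("NEO4J_URI", "bolt://localhost:7687")
NEO4J_USERNAME = os.getenv("NEO4J_USERNAME", "neo4j")
NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD", "research2025")
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "localhost:11434")
PORT = int(os.getenv("PORT", "8000"))  # FastAPI server port
```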
Models are configured in `data/persona.json`:

```json
{
  "name": "Research Assistant",
  "ollama_model": "granite4:micro-h",
  "rlhf_thresholds": {
    "retrieval_required": 0.6,
    "citation_requirement": 0.8,
    "formality_level": 0.7
  }
}
```

```bash
# Run comprehensive evaluation suite
python3 evaluation/run_evaluation.py
```

The system provides several evaluation metrics:
- Retrieval Quality: Precision and recall of document retrieval
- Response Accuracy: Alignment with ground truth answers
- Context Relevance: Usefulness of retrieved documents
- Response Quality: Readability and completeness scores
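The first metric above, retrieval quality, reduces to standard precision and recall over document IDs. A minimal sketch (the function and document IDs are illustrative, not taken from `evaluation/metrics.py`):

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved docs that are relevant.
    Recall: fraction of relevant docs that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# 3 docs retrieved, 2 of them relevant; 4 relevant docs exist in total
p, r = precision_recall(["d1", "d2", "d3"], ["d2", "d3", "d4", "d5"])
```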
Evaluation datasets are located in `evaluation/datasets/`:

- `research_assistant_v1.json` - General research questions
- `stress_tests.json` - Edge cases and performance limits
```
basicbot/
├── frontend/               # Next.js application
│   ├── src/
│   │   ├── app/            # Next.js app router
│   │   ├── components/     # React components
│   │   └── lib/            # Utilities and configurations
│   └── package.json
├── scripts/                # Python core logic
│   ├── eps_reasoning_agent.py
│   ├── eps_retriever.py
│   ├── graph_schema.py
│   └── ingest_*.py
├── evaluation/             # Testing and metrics
│   ├── run_evaluation.py
│   ├── metrics.py
│   └── datasets/
├── data/                   # Sample data and configurations
├── test_*.py               # Test scripts
├── main.py                 # FastAPI application
├── requirements.txt        # Python dependencies
├── docker-compose.yml      # Container orchestration
└── README.md
```
```bash
# Backend development
cd basicbot
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt  # If available

# Frontend development
cd frontend
npm install
npm run dev

# Database development (in separate terminal)
docker-compose up neo4j redis
```

```bash
# Backend tests
python3 -m pytest

# Frontend tests
cd frontend
npm test

# Integration tests
./test.sh
```

| Endpoint | Method | Description |
|---|---|---|
| `/api/chat` | POST | Main chat interface with GraphRAG |
| `/api/search` | GET | Direct document search |
| `/api/health` | GET | System health check |
| `/api/status` | GET | Detailed system status |
| `/api/ingest` | POST | Trigger data ingestion |
| `/api/evaluate` | POST | Run evaluation suite |
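The health endpoint makes a convenient liveness probe from scripts or CI. A minimal sketch using only the standard library (the helper name is illustrative):

```python
import urllib.request

def is_healthy(base_url="http://localhost:8000"):
    """Return True if /api/health answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=2) as resp:
            return resp.status == 200
    except OSError:  # connection refused, DNS failure, timeout, etc.
        return False
```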
Request:

```json
{
  "query": "What are the benefits of using convolutional neural networks?",
  "chat_history": [
    {
      "role": "user",
      "content": "How do neural networks work?"
    },
    {
      "role": "assistant",
      "content": "Neural networks are computational models..."
    }
  ],
  "session_id": "session-123"
}
```

Response:

```json
{
  "response": "Convolutional neural networks offer several key benefits...",
  "context_used": [...],
  "quality_grade": 0.85,
  "retrieval_method": "hybrid",
  "retrieval_performed": true,
  "sources": [...],
  "session_id": "session-123"
}
```

- Query Response Time: 2-5 seconds for complex questions
- Document Ingestion: ~1000 documents/hour
- Memory Usage: 4-8GB during normal operation
- Concurrent Users: 10-20 simultaneous sessions
- Database: Neo4j can handle millions of documents
- LLM: Ollama supports multiple concurrent requests
- Frontend: Next.js handles high traffic efficiently
- Caching: Redis layer improves response times for repeated queries
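The Redis caching layer mentioned above amounts to keying responses by a hash of the query and expiring them after a TTL. The in-memory class below is a toy stand-in to show the pattern, not BasicBot's Redis code:

```python
import hashlib
import time

class QueryCache:
    """Illustrative in-memory stand-in for a Redis response cache with TTL."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}

    def _key(self, query):
        # Hash the query so arbitrary text maps to a fixed-size key
        return hashlib.sha256(query.encode("utf-8")).hexdigest()

    def get(self, query):
        entry = self._store.get(self._key(query))
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]
        return None  # miss or expired

    def put(self, query, response):
        self._store[self._key(query)] = (response, time.monotonic())

cache = QueryCache(ttl_seconds=60)
cache.put("What is GraphRAG?", "cached answer")
```

A real deployment would use Redis `SETEX`/`GET` with the same hashed key so the cache survives restarts and is shared across workers.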
- Local AI: All processing happens locally using Ollama
- No Data Transmission: Documents stay on your system
- Container Isolation: Services run in isolated Docker containers
- Input Sanitization: All inputs are validated and sanitized
- Session Management: Secure session handling with UUIDs
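UUID-based session handling, as mentioned above, can be done entirely with the standard library. A small sketch (the helper names are illustrative):

```python
import uuid

def new_session_id():
    """UUID4 gives a random, practically unguessable session identifier."""
    return str(uuid.uuid4())

def is_valid_session_id(value):
    """Accept only canonical UUID4 strings; reject anything malformed."""
    try:
        return str(uuid.UUID(value, version=4)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False
```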
Application won't start

```bash
# Check Docker services
docker-compose ps

# Check Ollama status
ollama list

# Verify ports are available
lsof -i :8000,3000,7474,7687
```

Poor response quality

```bash
# Check data ingestion
python3 -c "from scripts.retriever import Retriever; r=Retriever(); print(len(r.retrieve_context('test',1)))"

# Review RLHF thresholds
cat data/persona.json
```

Database connection errors

```bash
# Verify Neo4j is running
curl http://localhost:7474

# Check connection settings
docker-compose logs neo4j
```

- Check the Health Endpoint: `/api/health` for system status
- Review Logs: Check container logs with `docker-compose logs`
- Run Diagnostics: Execute `./test.sh` for system diagnostics
This project is licensed under the MIT License - see the LICENSE file for details.
- Neo4j for graph database technology
- Ollama for local LLM capabilities
- FastAPI for excellent Python web framework
- Next.js for modern React development
- Open Source Community for development tools and libraries
BasicBot - Transforming research analysis with GraphRAG technology