An advanced research tool powered by Large Language Models (LLMs) that automatically decomposes complex queries, executes multi-round searches, and generates in-depth analysis reports.
DeepResearchScribe is an intelligent research assistant that combines the power of large language models with advanced search capabilities to provide comprehensive research reports on complex topics. The system automatically breaks down complex research queries into multiple focused aspects, conducts iterative searches, and synthesizes the findings into structured, professional reports.
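In outline, the research loop behaves roughly as follows (a conceptual sketch only; these helper callables are illustrative stand-ins, not the project's actual API, which lives in `src/core/researcher.py`):

```python
# Conceptual sketch of the multi-round research workflow described above.
# The injected callables are illustrative stand-ins, not the project's API.
from typing import Callable, List

def research_loop(query: str,
                  decompose: Callable[[str], List[str]],
                  search: Callable[[str], List[str]],
                  synthesize: Callable[[str, List[str]], str],
                  max_rounds: int = 3) -> str:
    """Decompose a query into aspects, search each over several rounds, then synthesize."""
    aspects = decompose(query)               # LLM splits the query into focused aspects
    findings: List[str] = []
    for _ in range(max_rounds):
        for aspect in aspects:
            findings.extend(search(aspect))  # one search pass per aspect and round
    return synthesize(query, findings)       # structured report from gathered evidence
```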
Demo screenshots (see the assets/ directory):

- Clean and intuitive web interface for research queries
- Real-time search results and analysis interface
- Complete research workflow demonstration
Check out our sample reports, *Current Development of AI* and *Current Development of LLMs*: comprehensive analyses demonstrating the system's ability to generate detailed, structured research reports on complex topics such as AI development trends, key participants, technological breakthroughs, and strategic implications.
- Intelligent Search: Integrated with the Jina Search API, supporting re-ranking and content filtering
- Multi-Round Reasoning: LLM-powered multi-round search reasoning and analysis
- Structured Reports: Automatically decomposes complex topics into multiple sections and generates structured reports
- Web Interface: Intuitive Streamlit-based web interface
- Flexible Configuration: Support for multiple LLM providers and local model deployment
- Smart Filtering: AI-powered content filtering and relevance scoring
- Python 3.8+
- Internet connection (for API calls and searches)
- Optional: GPU for local model deployment
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/DeepResearchScribe.git
  cd DeepResearchScribe
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure environment variables:

  ```bash
  cp .env.example .env
  # Edit the .env file and add your API keys
  ```
Launch the web interface:

```bash
streamlit run src/ui/streamlit_app.py
```

Or run a query from the command line:

```bash
python scripts/run_cli.py "Your research topic"
```

Or use the Python API directly:

```python
from src.core.researcher import DeepResearcher

researcher = DeepResearcher()
result = researcher.run("Artificial Intelligence in Healthcare")
print(result['final_report'])
```

The project is organized as follows:

```
DeepResearchScribe/
├── src/
│   ├── __init__.py
│   ├── core/
│   │   ├── __init__.py
│   │   ├── researcher.py        # Main research engine
│   │   ├── llm_connector.py     # LLM connection handler
│   │   ├── search_tool.py       # Search tool integration
│   │   └── integrator.py        # Content integration
│   ├── ui/
│   │   ├── __init__.py
│   │   └── streamlit_app.py     # Web interface
│   └── utils/
│       ├── __init__.py
│       ├── helpers.py           # Utility functions
│       ├── parser.py            # Content parsing
│       └── prompt_templates.py  # LLM prompts
├── assets/                      # Demo images and resources
├── config/                      # Configuration files
├── example_report/              # Sample research reports
├── llm_responses/               # Cached LLM responses
├── scripts/                     # Execution scripts
├── tests/                       # Test files
├── requirements.txt             # Dependencies
├── deploy.sh                    # Deployment script
└── README.md                    # Project documentation
```
You need to configure the following environment variables in your .env file:
```bash
# Required: LLM API Configuration
DEEPSEEK_API_KEY=your_deepseek_api_key_here

# Required: Search API Configuration
JINA_API_KEY=your_jina_api_key_here

# Optional: Local Model Server (if using local deployment)
VLLM_SERVER_URL=http://localhost:8000

# Optional: Content Filtering (requires an LLM key for intelligent filtering)
ENABLE_CONTENT_FILTERING=true
FILTERING_MODEL=deepseek-chat  # or your preferred model
```
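These variables are read at process startup. As a minimal sketch of how they might be loaded (assuming the common `python-dotenv` package; check `requirements.txt` for the project's actual loader):

```python
# Minimal sketch of loading the .env configuration at startup.
# Assumes the python-dotenv package; the project may load these differently.
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment

DEEPSEEK_API_KEY = os.environ["DEEPSEEK_API_KEY"]  # required, raises KeyError if missing
JINA_API_KEY = os.environ["JINA_API_KEY"]          # required, raises KeyError if missing
VLLM_SERVER_URL = os.getenv("VLLM_SERVER_URL", "http://localhost:8000")  # optional
FILTERING_ENABLED = os.getenv("ENABLE_CONTENT_FILTERING", "true").lower() == "true"
```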
DeepResearchScribe uses the Jina Search API for comprehensive web searches with advanced filtering capabilities:

- Get a Jina API key: Sign up at Jina AI and obtain your API key
- Configure filtering: For intelligent content filtering, provide an LLM API key (DeepSeek, OpenAI, etc.)
- Search parameters: The system automatically optimizes search queries and applies relevance filtering; a direct-call sketch follows this list
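For reference, a direct call to the search endpoint might look like this minimal sketch (it assumes Jina's public `s.jina.ai` endpoint and the `requests` package; the project's own wrapper lives in `src/core/search_tool.py`):

```python
# Minimal sketch of a direct Jina Search API call (assumption: the public
# s.jina.ai endpoint; the project's own wrapper lives in src/core/search_tool.py).
import os
from typing import List

import requests

def jina_search(query: str, max_results: int = 20) -> List[dict]:
    """Return raw web search hits (title, url, content) for a query."""
    response = requests.get(
        f"https://s.jina.ai/{query}",
        headers={
            "Authorization": f"Bearer {os.environ['JINA_API_KEY']}",
            "Accept": "application/json",  # request structured JSON instead of plain text
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("data", [])[:max_results]
```

The re-ranking and content filtering described above are applied on top of these raw hits.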
```python
# Example search configuration
SEARCH_CONFIG = {
    'max_results': 20,
    'enable_reranking': True,
    'content_filtering': True,
    'relevance_threshold': 0.7
}
```

For enhanced privacy and control, you can deploy local LLM models using the provided deployment script. We recommend Qwen3 series models for optimal performance and reliability.
Choose a Qwen3 model based on your hardware capabilities:
| Model | Parameters | VRAM Required | Use Case |
|---|---|---|---|
| Qwen/Qwen3-32B | 32B | 64GB+ | Best performance, research servers |
| Qwen/Qwen3-14B | 14B | 28GB+ | Balanced performance, mid-range GPUs |
| Qwen/Qwen3-8B | 8B | 16GB+ | Good performance, consumer GPUs |
Why Qwen3 Series?
- Superior Reasoning: Excellent performance in multi-step reasoning and complex analysis
- Research Optimized: Specifically tuned for research and analytical tasks
- Multilingual Support: Strong capabilities in both English and Chinese content analysis
- Long Context: Support for extended context windows (up to 131K tokens)
- Open Source: Fully open source with permissive licensing for commercial use
The deploy.sh script includes automated local model deployment with Qwen3:

```bash
# Make the script executable
chmod +x deploy.sh

# Run deployment (includes Qwen3 model server setup)
./deploy.sh
```

For custom local model deployment with the Qwen3 series:
```bash
# Install LMDeploy for model serving
pip install lmdeploy

# Deploy Qwen3-32B (recommended for servers)
lmdeploy serve api_server Qwen/Qwen3-32B \
    --model-name qwen3-32b \
    --session-len 131000 \
    --server-port 8000 \
    --max-batch-size 1 \
    --cache-max-entry-count 0.7 \
    --tp 4

# Deploy Qwen3-14B (recommended for mid-range setups)
lmdeploy serve api_server Qwen/Qwen3-14B \
    --model-name qwen3-14b \
    --session-len 131000 \
    --server-port 8000 \
    --max-batch-size 2 \
    --cache-max-entry-count 0.8 \
    --tp 2

# Deploy Qwen3-8B (recommended for consumer GPUs)
lmdeploy serve api_server Qwen/Qwen3-8B \
    --model-name qwen3-8b \
    --session-len 131000 \
    --server-port 8000 \
    --max-batch-size 4 \
    --cache-max-entry-count 0.9 \
    --tp 1
```
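Once a server is up, you can sanity-check it with any OpenAI-compatible client, since LMDeploy's `api_server` exposes an OpenAI-style `/v1` endpoint (a minimal sketch; the model name must match the `--model-name` you deployed):

```python
# Quick sanity check against the local LMDeploy server started above.
# Assumes the `openai` Python package; no real API key is needed locally.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")
reply = client.chat.completions.create(
    model="qwen3-32b",  # must match the --model-name passed to lmdeploy
    messages=[{"role": "user", "content": "Reply with a single word: ready?"}],
    max_tokens=16,
)
print(reply.choices[0].message.content)
```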
Update your .env file for local model usage with Qwen3:

```bash
# Local model configuration
USE_LOCAL_MODEL=true
VLLM_SERVER_URL=http://localhost:8000

# Choose your Qwen3 model
LOCAL_MODEL_NAME=qwen3-32b  # or qwen3-14b, qwen3-8b

# Optional: model-specific settings
MODEL_MAX_TOKENS=131000
MODEL_TEMPERATURE=0.7
```

Basic research from Python:

```python
from src.core.researcher import DeepResearcher

# Initialize researcher
researcher = DeepResearcher()

# Conduct research
result = researcher.run("Climate change impact on agriculture")

# Access results
print("Final Report:", result['final_report'])
print("Search History:", result['search_history'])
print("Key Findings:", result['key_insights'])
```

With a custom configuration:

```python
# Custom configuration
config = {
    'max_search_rounds': 10,
    'min_sources_per_topic': 5,
    'report_depth': 'comprehensive',
    'enable_citations': True
}

researcher = DeepResearcher(config=config)
result = researcher.run("Quantum computing applications", config=config)
```

Batch processing:

```python
# Process multiple research topics
topics = [
    "Renewable energy trends 2024",
    "Artificial intelligence ethics",
    "Space exploration technologies"
]

results = []
for topic in topics:
    result = researcher.run(topic)
    results.append(result)
```
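If you want to keep the batch output, a small illustrative helper can write each report to disk (it assumes only the `final_report` field shown above; the filename scheme is arbitrary):

```python
# Save each generated report as a Markdown file (illustrative helper;
# assumes the result dicts carry the 'final_report' string shown above).
from pathlib import Path

out_dir = Path("example_report")
out_dir.mkdir(exist_ok=True)
for topic, result in zip(topics, results):
    filename = topic.lower().replace(" ", "_") + ".md"
    (out_dir / filename).write_text(result["final_report"], encoding="utf-8")
```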
The web interface provides:

- Topic Input: Enter your research query in natural language
- Configuration Panel: Adjust search parameters and model settings
- Progress Tracking: Real-time progress updates during research
- Results Display: Structured presentation of findings
- Source Tracking: Monitor search sources and credibility
- Content Preview: Preview relevant content before integration
- Relevance Scoring: AI-powered relevance assessment
- Citation Management: Automatic citation generation and tracking
- API Key Errors
  - Verify your API keys are correctly set in `.env`
  - Check API quotas and rate limits
  - Ensure network connectivity

- Local Model Issues
  - Check GPU memory availability
  - Verify model path and permissions
  - Monitor server logs for errors

- Search Quality Issues
  - Adjust relevance thresholds
  - Enable content filtering
  - Refine search queries

- Performance Optimization
  - Use local models for faster processing
  - Enable response caching
  - Adjust batch sizes for your hardware
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
```bash
# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
python -m pytest tests/

# Format code
black src/ scripts/ tests/

# Lint code
flake8 src/ scripts/ tests/
```

This project is licensed under the MIT License - see the LICENSE file for details.
- Jina AI for powerful search capabilities
- DeepSeek for advanced language models
- LMDeploy for local model deployment
- Streamlit for the web interface framework
For questions or suggestions, please:
- Submit an issue on GitHub
- Contact the maintainer directly
- Join our community discussions
If you find this project useful, please consider giving it a star on GitHub!
Note: This tool is designed for research and educational purposes. Always verify information from multiple sources and apply critical thinking to the generated reports.
