DeepResearchScribe - Intelligent Multi-Round Research Assistant


An advanced multi-round reasoning research tool powered by Large Language Models (LLMs) that automatically decomposes complex queries, executes multi-round searches, and generates in-depth analysis reports.

🎯 Project Overview

DeepResearchScribe is an intelligent research assistant that combines the power of large language models with advanced search capabilities to provide comprehensive research reports on complex topics. The system automatically breaks down complex research queries into multiple focused aspects, conducts iterative searches, and synthesizes the findings into structured, professional reports.
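
Conceptually, a single round of this pipeline looks like the sketch below. The function names and signatures are illustrative stand-ins, not the project's actual API; the real engine lives in src/core/researcher.py.

from typing import Callable, List

def research_once(
    topic: str,
    decompose: Callable[[str], List[str]],        # LLM: topic -> focused sub-questions
    search: Callable[[str], List[str]],           # search tool: sub-question -> findings
    synthesize: Callable[[str, List[str]], str],  # LLM: findings -> structured report
) -> str:
    """One round of the decompose -> search -> synthesize pipeline.
    The real engine repeats this, feeding gaps identified in one round
    back in as new sub-questions for the next."""
    findings: List[str] = []
    for aspect in decompose(topic):
        findings.extend(search(aspect))
    return synthesize(topic, findings)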

🌟 Results Showcase

Main Interface

[Screenshot: clean and intuitive web interface for research queries]

Search Content Analysis

[Screenshot: real-time search results and analysis interface]

Research Process Demo

[Demo: complete research workflow demonstration]

πŸ“„ Sample Output

Check out our Current Development of AI and Current Development of LLMs sample reports in example_report/: comprehensive analyses demonstrating the system's capability to generate detailed, structured research reports on complex topics such as AI development trends, key participants, technological breakthroughs, and strategic implications.

Key Features

  • πŸ” Intelligent Search: Integrated with Jina Search API supporting re-ranking and content filtering
  • 🧠 Multi-Round Reasoning: LLM-powered multi-round search reasoning and analysis
  • πŸ“Š Structured Reports: Automatically decompose complex topics into multiple sections and generate structured reports
  • 🌐 Web Interface: Intuitive Streamlit-based web interface
  • πŸ”§ Flexible Configuration: Support for multiple LLM providers and local model deployment
  • 🎯 Smart Filtering: AI-powered content filtering and relevance scoring

πŸš€ Quick Start

Requirements

  • Python 3.8+
  • Internet connection (for API calls and searches)
  • Optional: GPU for local model deployment

Installation

  1. Clone the repository:

    git clone https://github.com/Kwen-Chen/DeepResearchScribe.git
    cd DeepResearchScribe
  2. Install dependencies:

    pip install -r requirements.txt
  3. Configure environment variables:

    cp .env.example .env
    # Edit .env file and add your API keys

Usage

Web Interface (Recommended)

streamlit run src/ui/streamlit_app.py

Command Line Interface

python scripts/run_cli.py "Your research topic"

Python API

from src.core.researcher import DeepResearcher

researcher = DeepResearcher()
result = researcher.run("Artificial Intelligence in Healthcare")
print(result['final_report'])

πŸ“ Project Structure

DeepResearchScribe/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ core/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ researcher.py        # Main research engine
β”‚   β”‚   β”œβ”€β”€ llm_connector.py     # LLM connection handler
β”‚   β”‚   β”œβ”€β”€ search_tool.py       # Search tool integration
β”‚   β”‚   └── integrator.py        # Content integration
β”‚   β”œβ”€β”€ ui/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   └── streamlit_app.py     # Web interface
β”‚   └── utils/
β”‚       β”œβ”€β”€ __init__.py
β”‚       β”œβ”€β”€ helpers.py           # Utility functions
β”‚       β”œβ”€β”€ parser.py            # Content parsing
β”‚       └── prompt_templates.py  # LLM prompts
β”œβ”€β”€ assets/                      # Demo images and resources
β”œβ”€β”€ config/                      # Configuration files
β”œβ”€β”€ example_report/              # Sample research reports
β”œβ”€β”€ llm_responses/              # Cached LLM responses
β”œβ”€β”€ scripts/                    # Execution scripts
β”œβ”€β”€ tests/                      # Test files
β”œβ”€β”€ requirements.txt            # Dependencies
β”œβ”€β”€ deploy.sh                   # Deployment script
└── README.md                   # Project documentation

πŸ”§ Configuration

Environment Variables

You need to configure the following environment variables in your .env file:

# Required: LLM API Configuration
DEEPSEEK_API_KEY=your_deepseek_api_key_here

# Required: Search API Configuration  
JINA_API_KEY=your_jina_api_key_here

# Optional: Local Model Server (if using local deployment)
VLLM_SERVER_URL=http://localhost:8000

# Optional: Content Filtering (requires LLM key for intelligent filtering)
ENABLE_CONTENT_FILTERING=true
FILTERING_MODEL=deepseek-chat  # or your preferred model
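
How the application consumes these variables is an implementation detail, but a minimal sketch, assuming the common python-dotenv pattern (pip install python-dotenv; not confirmed from the source), looks like this:

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory

deepseek_key = os.environ["DEEPSEEK_API_KEY"]   # required
jina_key = os.environ["JINA_API_KEY"]           # required
vllm_url = os.getenv("VLLM_SERVER_URL", "http://localhost:8000")  # optional
filtering = os.getenv("ENABLE_CONTENT_FILTERING", "false").lower() == "true"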

πŸ” Jina Search Setup

DeepResearchScribe uses the Jina Search API for comprehensive web searches with advanced filtering capabilities:

  1. Get Jina API Key: Sign up at Jina AI and obtain your API key
  2. Configure Filtering: For intelligent content filtering, you need to provide an LLM API key (DeepSeek, OpenAI, etc.)
  3. Search Parameters: The system automatically optimizes search queries and applies relevance filtering

# Example search configuration
SEARCH_CONFIG = {
    'max_results': 20,
    'enable_reranking': True,
    'content_filtering': True,
    'relevance_threshold': 0.7
}
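
For reference, a raw Jina Search request, independent of this project's wrapper in src/core/search_tool.py, can be sketched as below. The endpoint and response shape follow Jina's public search API; verify both against the current Jina documentation before relying on them.

import os
from urllib.parse import quote

import requests

query = "climate change impact on agriculture"
resp = requests.get(
    "https://s.jina.ai/" + quote(query),
    headers={
        "Authorization": f"Bearer {os.environ['JINA_API_KEY']}",
        "Accept": "application/json",  # request structured JSON instead of plain text
    },
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json().get("data", []):  # each hit typically carries title/url/content
    print(hit.get("title"), "->", hit.get("url"))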

πŸ–₯️ Local Model Deployment

For enhanced privacy and control, you can deploy local LLM models using the provided deployment script. We recommend using Qwen3 series models for optimal performance and reliability.

🌟 Recommended Models: Qwen3 Series

We recommend the following Qwen3 models based on your hardware capabilities:

Model            Parameters   VRAM Required   Use Case
Qwen/Qwen3-32B   32B          64GB+           Best performance, research servers
Qwen/Qwen3-14B   14B          28GB+           Balanced performance, mid-range GPUs
Qwen/Qwen3-8B    8B           16GB+           Good performance, consumer GPUs

Why Qwen3 Series?

  • Superior Reasoning: Excellent performance in multi-step reasoning and complex analysis
  • Research Optimized: Specifically tuned for research and analytical tasks
  • Multilingual Support: Strong capabilities in both English and Chinese content analysis
  • Long Context: Support for extended context windows (up to 131K tokens)
  • Open Source: Fully open source with permissive licensing for commercial use

Using the Deployment Script

The deploy.sh script includes automated local model deployment with Qwen3:

# Make the script executable
chmod +x deploy.sh

# Run deployment (includes Qwen3 model server setup)
./deploy.sh

Manual Local Model Setup

For custom local model deployment with Qwen3 series:

# Install LMDeploy for model serving
pip install lmdeploy

# Deploy Qwen3-32B model (recommended for servers)
lmdeploy serve api_server Qwen/Qwen3-32B \
    --model-name qwen3-32b \
    --session-len 131000 \
    --server-port 8000 \
    --max-batch-size 1 \
    --cache-max-entry-count 0.7 \
    --tp 4

# Deploy Qwen3-14B model (recommended for mid-range setups)
lmdeploy serve api_server Qwen/Qwen3-14B \
    --model-name qwen3-14b \
    --session-len 131000 \
    --server-port 8000 \
    --max-batch-size 2 \
    --cache-max-entry-count 0.8 \
    --tp 2

# Deploy Qwen3-8B model (recommended for consumer GPUs)
lmdeploy serve api_server Qwen/Qwen3-8B \
    --model-name qwen3-8b \
    --session-len 131000 \
    --server-port 8000 \
    --max-batch-size 4 \
    --cache-max-entry-count 0.9 \
    --tp 1

Configuration for Local Models

Update your .env file for local model usage with Qwen3:

# Local model configuration
USE_LOCAL_MODEL=true
VLLM_SERVER_URL=http://localhost:8000

# Choose your Qwen3 model
LOCAL_MODEL_NAME=qwen3-32b  # or qwen3-14b, qwen3-8b

# Optional: Model-specific settings
MODEL_MAX_TOKENS=131000
MODEL_TEMPERATURE=0.7
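
Because LMDeploy's api_server exposes an OpenAI-compatible endpoint, a quick smoke test of the local server can use the standard openai client (pip install openai); the model name must match the --model-name passed above:

# Smoke test against the local LMDeploy server started above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")  # local server ignores the key
reply = client.chat.completions.create(
    model="qwen3-32b",  # must match --model-name
    messages=[{"role": "user", "content": "Summarize the benefits of local LLM deployment."}],
    temperature=0.7,
)
print(reply.choices[0].message.content)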

πŸ“ˆ Usage Examples

Basic Research

from src.core.researcher import DeepResearcher

# Initialize researcher
researcher = DeepResearcher()

# Conduct research
result = researcher.run("Climate change impact on agriculture")

# Access results
print("Final Report:", result['final_report'])
print("Search History:", result['search_history'])
print("Key Findings:", result['key_insights'])

Advanced Configuration

# Custom configuration
config = {
    'max_search_rounds': 10,
    'min_sources_per_topic': 5,
    'report_depth': 'comprehensive',
    'enable_citations': True
}

researcher = DeepResearcher(config=config)
result = researcher.run("Quantum computing applications", config=config)

Batch Processing

# Process multiple research topics
topics = [
    "Renewable energy trends 2024",
    "Artificial intelligence ethics",
    "Space exploration technologies"
]

results = []
for topic in topics:
    result = researcher.run(topic)
    results.append(result)

🌐 Web Interface Features

Main Interface

  • Topic Input: Enter your research query in natural language
  • Configuration Panel: Adjust search parameters and model settings
  • Progress Tracking: Real-time progress updates during research
  • Results Display: Structured presentation of findings

Search Analysis Interface

  • Source Tracking: Monitor search sources and credibility
  • Content Preview: Preview relevant content before integration
  • Relevance Scoring: AI-powered relevance assessment (a minimal sketch follows this list)
  • Citation Management: Automatic citation generation and tracking
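
As a rough illustration of that relevance scoring, a score-and-threshold pass might look like the sketch below; the score callable is a hypothetical stand-in for the project's LLM-based scorer, and the default threshold mirrors relevance_threshold from the search configuration shown earlier.

from typing import Callable, Dict, List

def filter_by_relevance(
    results: List[Dict],
    score: Callable[[str, str], float],  # e.g. an LLM judging query/content fit on 0-1
    query: str,
    threshold: float = 0.7,              # cf. relevance_threshold in SEARCH_CONFIG
) -> List[Dict]:
    """Keep results scoring at or above the threshold, best first."""
    kept = []
    for result in results:
        result["relevance"] = score(query, result.get("content", ""))
        if result["relevance"] >= threshold:
            kept.append(result)
    # Highest-scoring sources first, mirroring re-ranking
    return sorted(kept, key=lambda r: r["relevance"], reverse=True)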

πŸ› Troubleshooting

Common Issues

  1. API Key Errors

    • Verify your API keys are correctly set in .env
    • Check API quotas and rate limits
    • Ensure network connectivity
  2. Local Model Issues

    • Check GPU memory availability
    • Verify model path and permissions
    • Monitor server logs for errors
  3. Search Quality Issues

    • Adjust relevance thresholds
    • Enable content filtering
    • Refine search queries
  4. Performance Optimization

    • Use local models for faster processing
    • Enable response caching (see the sketch after this list)
    • Adjust batch sizes for your hardware
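
On the caching point, the llm_responses/ directory suggests responses are persisted to disk. A minimal sketch of that idea, keyed by a hash of the prompt (illustrative only; the project's actual scheme may differ):

import hashlib
import json
from pathlib import Path
from typing import Callable

CACHE_DIR = Path("llm_responses")

def cached_call(prompt: str, call_llm: Callable[[str], str]) -> str:
    """Return a cached response if this exact prompt was seen before."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():                    # cache hit: skip the API call
        return json.loads(path.read_text())["response"]
    response = call_llm(prompt)          # cache miss: call the model and store the result
    path.write_text(json.dumps({"prompt": prompt, "response": response}))
    return response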

🀝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Setup

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
python -m pytest tests/

# Format code
black src/ scripts/ tests/

# Lint code
flake8 src/ scripts/ tests/

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

πŸ“ž Contact

For questions or suggestions, please:

  • Submit an issue on GitHub
  • Contact the maintainer directly
  • Join our community discussions

πŸ“š Related Resources

⭐ Star History

If you find this project useful, please consider giving it a star on GitHub!


Note: This tool is designed for research and educational purposes. Always verify information from multiple sources and apply critical thinking to the generated reports.
