
Open Source Ad Serving Platform with ML-Powered CTR Prediction | Self-hosted alternative to Google Ad Manager | Python, FastAPI, PyTorch


seanZhang414/openadserver


OpenAdServer

Open Source Ad Serving Platform with ML-Powered CTR Prediction
Production-ready ad server for SMBs, startups, and developers

Features • Quick Start • Docs • Benchmarks • Roadmap


🌟 If this project helps you, please give it a star! 🌟


🤔 Why OpenAdServer?

Most ad servers are either too simple (serving only static banners) or too complex (requiring Google-scale infrastructure).

OpenAdServer is the sweet spot: a production-ready, self-hosted ad platform with machine-learning-powered CTR prediction, designed for teams that want full control without the complexity.

Comparison

| Feature | OpenAdServer | Google Ad Manager | Revive Adserver | AdButler |
|---|---|---|---|---|
| Self-hosted | ✅ | ❌ | ✅ | ❌ |
| ML CTR Prediction | ✅ DeepFM/LR | ❌ | ❌ | ❌ |
| Real-time eCPM Bidding | ✅ | ✅ | ❌ | ⚠️ |
| Modern Tech Stack | ✅ Python/FastAPI | N/A | ❌ PHP | ❌ |
| One-click Deploy | ✅ Docker | ❌ | ⚠️ | ❌ |
| Free & Open Source | ✅ | ❌ | ✅ | ❌ |
| No Revenue Share | ✅ | ❌ 💰 | ✅ | ❌ 💰 |

Perfect For

  • 🏢 SMBs building their own ad network
  • 🎮 Gaming companies monetizing in-app traffic
  • 📱 App developers running house ads or direct deals
  • 🛒 E-commerce platforms with sponsored listings
  • 🔬 Researchers studying computational advertising
  • 🎓 Students learning ad-tech systems

✨ Features

🚀 Ad Serving

  • High-Performance API: roughly 10 ms P99 latency with FastAPI (LR model, see Benchmarks)
  • Multiple Ad Formats: Banner, native, video, interstitial
  • Smart Targeting: Geo, device, OS, demographics, interests
  • Frequency Capping: Daily/hourly limits per user
  • Budget Pacing: Smooth delivery throughout the day
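
To make the frequency-capping bullet concrete, here is a minimal sketch of a fixed-window counter with hourly and daily limits. It is not taken from the OpenAdServer code base: the `FrequencyCap` class, its method names, and the in-memory dictionary are assumptions for illustration. A deployment backed by Redis would more likely use `INCR` with `EXPIRE` on a per-window key.

```python
import time
from collections import defaultdict
from typing import Optional

HOUR, DAY = 3600, 86400


class FrequencyCap:
    """Fixed-window impression counter per (user, ad). Hypothetical helper."""

    def __init__(self, hourly_limit: int, daily_limit: int) -> None:
        self.limits = {HOUR: hourly_limit, DAY: daily_limit}
        # (user_id, ad_id, window_seconds, window_index) -> impressions served
        self.counts = defaultdict(int)

    def allow(self, user_id: str, ad_id: str, now: Optional[float] = None) -> bool:
        """Return True and record an impression if no window is exhausted."""
        now = time.time() if now is None else now
        keys = [(user_id, ad_id, window, int(now // window)) for window in self.limits]
        if any(self.counts[key] >= self.limits[key[2]] for key in keys):
            return False
        for key in keys:
            self.counts[key] += 1
        return True


cap = FrequencyCap(hourly_limit=2, daily_limit=3)
first_hour = [cap.allow("user_12345", "ad_1001", now=0) for _ in range(3)]
second_hour = cap.allow("user_12345", "ad_1001", now=HOUR)
third_hour = cap.allow("user_12345", "ad_1001", now=2 * HOUR)
print(first_hour, second_hour, third_hour)
```

The third impression in the first hour is rejected by the hourly limit, and the request in the third hour is rejected by the daily limit even though its hourly window is empty.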

🤖 Machine Learning

  • CTR Prediction Models: DeepFM, Logistic Regression, FM
  • Real-time Inference: <5 ms average prediction latency (LR model, see Benchmarks)
  • Automatic Feature Engineering: Sparse/dense feature processing
  • Model Hot-swap: Update models without downtime
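
One common way to update a model without downtime is to keep the active model behind a single reference and replace that reference only after the candidate passes a smoke test. The sketch below illustrates that idea under stated assumptions: `ModelHolder`, `swap`, and the probability check are hypothetical names, and the README does not describe how OpenAdServer itself implements hot-swapping.

```python
import threading
from typing import Callable, Sequence

Predictor = Callable[[Sequence[float]], float]


class ModelHolder:
    """Holds the active predictor and swaps it atomically. Hypothetical helper."""

    def __init__(self, model: Predictor, version: str) -> None:
        self._lock = threading.Lock()
        self._current = (model, version)

    @property
    def version(self) -> str:
        return self._current[1]

    def predict(self, features: Sequence[float]) -> float:
        # Read the reference once so an in-flight request keeps using
        # the model it started with, even if a swap happens meanwhile.
        model, _ = self._current
        return model(features)

    def swap(self, new_model: Predictor, version: str, smoke_input: Sequence[float]) -> None:
        # Validate the candidate before exposing it to traffic.
        probability = new_model(smoke_input)
        if not 0.0 <= probability <= 1.0:
            raise ValueError(f"model {version} returned invalid pCTR {probability}")
        with self._lock:  # serialize concurrent swaps
            self._current = (new_model, version)


holder = ModelHolder(lambda features: 0.01, "v1")
before = holder.predict([1.0])
holder.swap(lambda features: 0.02, "v2", smoke_input=[0.0])
after = holder.predict([1.0])
print(before, after, holder.version)
```

A candidate that fails the smoke test raises before the reference changes, so the previous model keeps serving.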

💰 Monetization

  • eCPM Ranking: Ranks candidate ads by expected revenue per thousand impressions
  • Multiple Bid Types: CPM, CPC, CPA, oCPM
  • Real-time Bidding Ready: OpenRTB compatibility is on the roadmap
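
The README does not spell out the eCPM formula, so the sketch below uses the conventional definitions as an assumption: a CPM bid is already a price per thousand impressions, a CPC bid is scaled by the predicted click-through rate, and a CPA bid is scaled by both the click-through and conversion rates. The function name `ecpm` is hypothetical, the fallback rates mirror `default_pctr` and `default_pcvr` from the configuration example in this README, and oCPM is left out because its treatment is not described.

```python
DEFAULT_PCTR = 0.01   # mirrors default_pctr in the configuration example
DEFAULT_PCVR = 0.001  # mirrors default_pcvr in the configuration example


def ecpm(bid_type: str, bid: float,
         pctr: float = DEFAULT_PCTR, pcvr: float = DEFAULT_PCVR) -> float:
    """Expected revenue per 1000 impressions (conventional formulas, assumed)."""
    if bid_type == "CPM":
        return bid                       # already priced per 1000 impressions
    if bid_type == "CPC":
        return bid * pctr * 1000         # pay per click
    if bid_type == "CPA":
        return bid * pctr * pcvr * 1000  # pay per conversion
    raise ValueError(f"unsupported bid type: {bid_type}")


candidates = [
    ("ad_cpm", ecpm("CPM", 12.0)),
    ("ad_cpc", ecpm("CPC", 1.0, pctr=0.0355)),
    ("ad_cpa", ecpm("CPA", 50.0, pctr=0.02, pcvr=0.1)),
]
ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
for ad_id, value in ranked:
    print(ad_id, round(value, 2))
```

Under these assumptions, a 1.00 CPC bid with a pCTR of 0.0355 yields an eCPM of about 35.5, which is consistent with the `ecpm` and `pctr` values in the sample response shown in this README.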

📊 Analytics

  • Event Tracking: Impressions, clicks, conversions
  • Real-time Dashboards: Grafana integration
  • Prometheus Metrics: Full observability

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Ad Request Flow                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   📱 Client                                                     │
│      │                                                          │
│      ▼                                                          │
│   ┌──────────┐    ┌───────────┐    ┌──────────┐    ┌─────────┐  │
│   │ FastAPI  │───▶│ Retrieval │───▶│ Ranking  │───▶│Response │  │
│   │  Router  │    │(Targeting)│    │ (eCPM)   │    │         │  │
│   └──────────┘    └───────────┘    └──────────┘    └─────────┘  │
│        │               │                │                       │
│        ▼               ▼                ▼                       │
│   ┌──────────┐    ┌───────────┐    ┌──────────┐                 │
│   │PostgreSQL│    │   Redis   │    │ PyTorch  │                 │
│   │(Campaigns)│   │  (Cache)  │    │ (Models) │                 │
│   └──────────┘    └───────────┘    └──────────┘                 │
│                                                                 │
│   Pipeline: Retrieve → Filter → Predict → Rank → Return         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
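
The Retrieve → Filter → Predict → Rank → Return pipeline in the diagram can be sketched as a chain of small functions. Everything here is illustrative: the `Ad` dataclass, the function names, country-only targeting, the budget-only filter, and CPC-style eCPM are assumptions, not the project's actual modules under `rec_engine/`.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass
class Ad:
    ad_id: str
    countries: FrozenSet[str]   # targeting rule (assumed: country only)
    cpc_bid: float
    budget_left: float
    pctr: float = 0.0
    ecpm: float = 0.0


def retrieve(ads: List[Ad], country: str) -> List[Ad]:
    """Retrieve: keep ads whose targeting matches the request."""
    return [ad for ad in ads if country in ad.countries]


def apply_filters(ads: List[Ad]) -> List[Ad]:
    """Filter: drop ads that have exhausted their budget."""
    return [ad for ad in ads if ad.budget_left > 0]


def predict(ads: List[Ad], model: Callable[[Ad], float]) -> List[Ad]:
    """Predict: attach a pCTR from the model to every candidate."""
    for ad in ads:
        ad.pctr = model(ad)
    return ads


def rank(ads: List[Ad], num_ads: int) -> List[Ad]:
    """Rank: score by eCPM (CPC bid x pCTR x 1000) and return the top ads."""
    for ad in ads:
        ad.ecpm = ad.cpc_bid * ad.pctr * 1000
    return sorted(ads, key=lambda ad: ad.ecpm, reverse=True)[:num_ads]


def serve(ads: List[Ad], country: str, num_ads: int,
          model: Callable[[Ad], float]) -> List[Ad]:
    return rank(predict(apply_filters(retrieve(ads, country)), model), num_ads)


inventory = [
    Ad("ad_a", frozenset({"US"}), cpc_bid=1.0, budget_left=10.0),
    Ad("ad_b", frozenset({"US", "CA"}), cpc_bid=2.0, budget_left=0.0),
    Ad("ad_c", frozenset({"CA"}), cpc_bid=5.0, budget_left=10.0),
    Ad("ad_d", frozenset({"US"}), cpc_bid=0.5, budget_left=10.0),
]
toy_pctr = {"ad_a": 0.02, "ad_d": 0.05}
winners = serve(inventory, "US", num_ads=2,
                model=lambda ad: toy_pctr.get(ad.ad_id, 0.01))
print([ad.ad_id for ad in winners])
```

In this toy inventory `ad_c` is dropped at retrieval, `ad_b` at filtering, and `ad_d` outranks `ad_a` because its higher pCTR outweighs its lower bid.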

🚀 Quick Start

Option 1: Docker Compose (Recommended)

# Clone the repository
git clone https://github.com/pysean/openadserver.git
cd openadserver

# Start all services (PostgreSQL, Redis, Ad Server)
docker compose up -d

# Initialize sample data
python scripts/init_test_data.py

# Verify it's running
curl http://localhost:8000/health
# {"status":"healthy","version":"1.0.0"}

Option 2: Local Development

# Prerequisites: Python 3.11+, PostgreSQL 14+, Redis 6+

# Install dependencies
pip install -e ".[dev]"

# Start databases
docker compose up -d postgres redis

# Run the server
OPENADSERVER_ENV=dev python -m openadserver.ad_server.main

📡 Your First Ad Request

curl -X POST http://localhost:8000/api/v1/ad/request \
  -H "Content-Type: application/json" \
  -d '{
    "slot_id": "banner_home",
    "user_id": "user_12345",
    "device": {"os": "ios", "os_version": "17.0"},
    "geo": {"country": "US", "city": "new_york"},
    "num_ads": 3
  }'

Response:

{
  "request_id": "req_a1b2c3d4",
  "ads": [
    {
      "ad_id": "ad_1001_5001",
      "campaign_id": 1001,
      "creative": {
        "title": "Summer Sale - 50% Off!",
        "description": "Limited time offer",
        "image_url": "https://cdn.example.com/ads/summer-sale.jpg",
        "landing_url": "https://shop.example.com/sale"
      },
      "tracking": {
        "impression_url": "http://localhost:8000/api/v1/event/track?type=impression&req=req_a1b2c3d4&ad=1001",
        "click_url": "http://localhost:8000/api/v1/event/track?type=click&req=req_a1b2c3d4&ad=1001"
      },
      "metadata": {
        "ecpm": 35.50,
        "pctr": 0.0355
      }
    }
  ],
  "count": 1
}
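
For callers that prefer Python over curl, the same request can be built with the standard library, and the response can be reduced to the fields a client usually needs. This is a minimal sketch: `build_ad_request` and `summarize_ads` are hypothetical helper names, and the field names are taken from the sample request and response above. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request
from typing import List, Tuple


def build_ad_request(base_url: str, slot_id: str, user_id: str,
                     num_ads: int = 1) -> urllib.request.Request:
    """Build (but do not send) a POST to the ad request endpoint."""
    payload = {"slot_id": slot_id, "user_id": user_id, "num_ads": num_ads}
    return urllib.request.Request(
        f"{base_url}/api/v1/ad/request",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_ads(response_body: str) -> List[Tuple[str, float, str]]:
    """Extract (ad_id, ecpm, click_url) for each returned ad."""
    body = json.loads(response_body)
    return [
        (ad["ad_id"], ad["metadata"]["ecpm"], ad["tracking"]["click_url"])
        for ad in body["ads"]
    ]


request = build_ad_request("http://localhost:8000", "banner_home", "user_12345", num_ads=3)

# Against a running server, the call would look like:
#   with urllib.request.urlopen(request) as response:
#       ads = summarize_ads(response.read().decode("utf-8"))

sample = json.dumps({
    "request_id": "req_a1b2c3d4",
    "ads": [{
        "ad_id": "ad_1001_5001",
        "tracking": {"click_url": "http://localhost:8000/api/v1/event/track?type=click&req=req_a1b2c3d4&ad=1001"},
        "metadata": {"ecpm": 35.50, "pctr": 0.0355},
    }],
    "count": 1,
})
ads = summarize_ads(sample)
print(ads)
```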

📖 Documentation

API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/ad/request | POST | Request ads for a placement |
| /api/v1/event/track | GET/POST | Track impression/click/conversion |
| /api/v1/campaign | CRUD | Manage campaigns |
| /api/v1/creative | CRUD | Manage creatives |
| /api/v1/advertiser | CRUD | Manage advertisers |
| /health | GET | Health check |
| /metrics | GET | Prometheus metrics |

Configuration

# configs/production.yaml
server:
  host: "0.0.0.0"
  port: 8000
  workers: 4

database:
  host: "postgres"
  port: 5432
  name: "openadserver"
  user: "adserver"
  password: "${DB_PASSWORD}"

redis:
  host: "redis"
  port: 6379
  db: 0

ad_serving:
  enable_ml_prediction: true
  model_path: "models/deepfm_ctr.pt"
  default_pctr: 0.01
  default_pcvr: 0.001
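
The `${DB_PASSWORD}` placeholder implies that secrets are injected from the environment rather than stored in the YAML file. How OpenAdServer resolves the placeholder is not shown here, so the following is only a sketch of one way to do it after the YAML has been parsed into a dictionary. `expand_env` and `expand_config` are hypothetical names, and failing loudly on a missing variable is a design choice of this example.

```python
import os
import re
from typing import Any, Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${NAME} in value; raise if NAME is not set."""
    env = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, value)


def expand_config(node: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Walk a parsed config tree and expand placeholders in string values."""
    if isinstance(node, dict):
        return {key: expand_config(value, env) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_config(item, env) for item in node]
    if isinstance(node, str):
        return expand_env(node, env)
    return node


parsed = {"database": {"host": "postgres", "port": 5432, "password": "${DB_PASSWORD}"}}
resolved = expand_config(parsed, env={"DB_PASSWORD": "s3cret"})
print(resolved["database"]["password"])
```

Raising on an unset variable surfaces a missing secret at startup instead of letting the literal placeholder reach the database driver.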

Train Your Own CTR Model

# Prepare training data from your logs
python scripts/prepare_training_data.py \
  --input logs/events/ \
  --output data/training/

# Train DeepFM model
python -m openadserver.trainer.train_ctr \
  --model deepfm \
  --data data/training/train.parquet \
  --epochs 10 \
  --output models/

# Or train a simpler LR model (faster, good baseline)
python -m openadserver.trainer.train_ctr \
  --model lr \
  --data data/training/train.parquet \
  --output models/

# Evaluate model
python -m openadserver.trainer.evaluate \
  --model models/deepfm_ctr.pt \
  --data data/training/test.parquet
# AUC: 0.72, LogLoss: 0.45
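
For readers unfamiliar with the two metrics printed by the evaluation step, the sketch below computes them from labels and predicted probabilities in plain Python. It is independent of the project's `evaluate` module: AUC is computed with the rank-sum (Mann-Whitney) formulation with average ranks for ties, and log loss clips probabilities to avoid `log(0)`. The function names and the toy data are illustrative.

```python
import math
from typing import List, Sequence


def log_loss(labels: Sequence[int], probs: Sequence[float], eps: float = 1e-15) -> float:
    """Mean binary cross-entropy, with probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum formulation."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    ranks: List[float] = [0.0] * len(probs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores so they share an average rank.
        while j + 1 < len(order) and probs[order[j + 1]] == probs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(rank for rank, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(f"AUC: {auc(labels, probs):.2f}, LogLoss: {log_loss(labels, probs):.2f}")
```

AUC measures how well the model orders clicks above non-clicks, while log loss also penalizes poorly calibrated probabilities, which matters when pCTR feeds directly into eCPM.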

📊 Benchmarks

Stress Test Results (Simulated 2 vCPU / 6GB)

Full pipeline tested: Retrieval → Filter → Prediction → Ranking → Rerank

Test Environment: SQLite in-memory + FakeRedis (zero external dependencies). Results reflect core pipeline performance without network I/O overhead.

| Model | QPS | Avg Latency | P95 | P99 | Relative QPS |
|---|---|---|---|---|---|
| LR | 189.7 | 5.24ms | 7.64ms | 10.02ms | 100% (baseline) |
| FM | 166.1 | 5.99ms | 8.10ms | 11.54ms | 87.6% |
| DeepFM | 151.2 | 6.58ms | 10.30ms | 14.13ms | 79.7% |

Pipeline Stage Breakdown (LR Model)

┌─────────────────┬───────────┬─────────────┐
│     Stage       │  Avg (ms) │  % of Total │
├─────────────────┼───────────┼─────────────┤
│ Retrieval       │   0.97    │    18.5%    │
│ Filter          │   0.20    │     3.8%    │
│ Prediction (ML) │   3.63    │    69.3%    │  ← Bottleneck
│ Ranking         │   0.35    │     6.7%    │
│ Rerank          │   0.10    │     1.9%    │
└─────────────────┴───────────┴─────────────┘
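
The percentages in the stage breakdown can be reproduced from the per-stage averages. The short sketch below assumes the shares are taken relative to the 5.24 ms average latency reported for the LR model. That assumption matches the published figures, whereas dividing by the sum of the stages (5.25 ms) would give slightly different values.

```python
AVG_LATENCY_MS = 5.24  # LR average latency from the stress test table

stage_ms = {
    "Retrieval": 0.97,
    "Filter": 0.20,
    "Prediction (ML)": 3.63,
    "Ranking": 0.35,
    "Rerank": 0.10,
}

# Share of the average request latency spent in each stage, in percent.
share = {name: round(ms / AVG_LATENCY_MS * 100, 1) for name, ms in stage_ms.items()}
bottleneck = max(stage_ms, key=stage_ms.get)

for name, pct in share.items():
    print(f"{name:<16} {pct:>5.1f}%")
print("Bottleneck:", bottleneck)
```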

Capacity Estimation (1M DAU)

| Model | Single-Server QPS | Peak QPS Needed | Servers Required |
|---|---|---|---|
| LR | ~190 | 870 | 5 |
| FM | ~166 | 870 | 6 |
| DeepFM | ~151 | 870 | 6 |

Note: The calculation assumes 1M DAU × 15 requests/user/day = 15M requests/day → ~174 average QPS → ~870 peak QPS (5x peak factor). Production deployments with PostgreSQL + Redis may add roughly 10-20% I/O overhead.
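
The capacity estimate can be reproduced, or rerun with different traffic assumptions, using a few lines of arithmetic. The function name and parameters are illustrative, and the single-server QPS values are the measured figures from the stress test table.

```python
import math
from typing import Tuple

SECONDS_PER_DAY = 86_400


def capacity(dau: int, requests_per_user: float, peak_factor: float,
             server_qps: float) -> Tuple[float, float, int]:
    """Return (average QPS, peak QPS, servers needed at peak)."""
    avg_qps = dau * requests_per_user / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    return avg_qps, peak_qps, math.ceil(peak_qps / server_qps)


measured_qps = {"LR": 189.7, "FM": 166.1, "DeepFM": 151.2}
servers = {}
for model, qps in measured_qps.items():
    avg_qps, peak_qps, servers[model] = capacity(1_000_000, 15, 5, qps)
    print(f"{model}: avg {avg_qps:.0f} QPS, peak {peak_qps:.0f} QPS, {servers[model]} servers")
```

The unrounded peak is about 868 QPS. The table's 870 comes from rounding the average to 174 before applying the 5x factor, and the server counts are the same either way. The estimate leaves out the 10-20% I/O overhead mentioned in the note as well as any redundancy headroom.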


🧪 Dataset & Model Training

Criteo Click Logs Dataset

We use the Criteo Display Advertising Challenge dataset for CTR model training and evaluation.

Dataset Characteristics:

  • Size: ~45GB (full), 100K samples used for benchmarks
  • Features: 13 integer features (I1-I13), 26 categorical features (C1-C26)
  • Label: Click (0/1)
  • Positive Rate: ~3.4%

Model Comparison (100K Criteo Samples)

| Model | Test AUC | Model Size | Description |
|---|---|---|---|
| LR | 0.7577 | 0.49 MB | Logistic Regression: fastest, best AUC |
| FM | 0.7472 | 4.34 MB | Factorization Machine: captures feature interactions |
| DeepFM | 0.7178 | 8.77 MB | Deep FM: deep learning + FM combined |

On this 100K-sample benchmark, LR achieves the highest AUC with the fastest inference, so it is the recommended model for production.

Feature Engineering

  • Numba JIT acceleration for feature hashing and encoding
  • Sparse features: 26 categorical features (user, ad, context)
  • Dense features: 13 numerical features (normalized)
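
As an illustration of the sparse and dense processing listed above, the sketch below encodes one Criteo-style row: categorical values are mapped to bucket indices with the hashing trick, and integer features are compressed with `log1p`. This is plain Python without Numba, and the bucket count, the CRC32 hash, the `log1p` transform, and the function names are all assumptions rather than the project's actual pipeline.

```python
import math
import zlib
from typing import List, Optional, Sequence, Tuple

NUM_BUCKETS = 2 ** 20  # assumed size of the hashed feature space


def hash_feature(field: str, value: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a (field, value) pair to a stable bucket index."""
    # zlib.crc32 is deterministic across processes, unlike the built-in hash().
    return zlib.crc32(f"{field}={value}".encode("utf-8")) % num_buckets


def normalize_dense(value: Optional[float]) -> float:
    """Treat missing or negative counts as 0 and compress the range with log1p."""
    if value is None or value < 0:
        return 0.0
    return math.log1p(value)


def encode_row(dense: Sequence[Optional[float]],
               sparse: Sequence[str]) -> Tuple[List[float], List[int]]:
    """Encode 13 integer features (I1-I13) and 26 categorical features (C1-C26)."""
    dense_vec = [normalize_dense(value) for value in dense]
    sparse_idx = [hash_feature(f"C{i}", value) for i, value in enumerate(sparse, start=1)]
    return dense_vec, sparse_idx


dense_raw = [1, None, 5] + [0] * 10
sparse_raw = ["68fd1e64"] * 26
dense_vec, sparse_idx = encode_row(dense_raw, sparse_raw)
print(len(dense_vec), len(sparse_idx))
```

Prefixing each value with its field name means that the same raw token in two different columns is hashed as two distinct features.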

Run Stress Test

The stress test uses SQLite in-memory for campaign data and FakeRedis for frequency capping, enabling zero-dependency testing:

# Quick test (10 campaigns, 100 requests, no ML)
python scripts/criteo/stress_test.py --campaigns 10 --requests 100 --no-ml

# With ML model (LR recommended)
python scripts/criteo/stress_test.py --campaigns 200 --requests 10000 --model lr

# Compare all models
python scripts/criteo/compare_models.py

📁 Project Structure

openadserver/
├── ad_server/              # FastAPI application
│   ├── routers/            # API endpoints (ad, event, campaign)
│   ├── services/           # Business logic
│   └── middleware/         # Logging, metrics, auth
├── rec_engine/             # Recommendation engine
│   ├── retrieval/          # Candidate retrieval & targeting
│   ├── ranking/            # eCPM bidding & ranking
│   ├── filter/             # Budget, frequency, quality filters
│   └── reranking/          # Diversity & exploration
├── ml_engine/              # Machine learning
│   ├── models/             # DeepFM, LR, FM implementations
│   ├── features/           # Feature engineering pipeline
│   └── serving/            # Online prediction server
├── common/                 # Shared utilities
│   ├── config.py           # Configuration management
│   ├── database.py         # PostgreSQL connection
│   ├── cache.py            # Redis client
│   └── logger.py           # Structured logging
├── trainer/                # Model training
├── scripts/                # Utility scripts
├── configs/                # YAML configurations
├── deployment/             # Docker, K8s, Nginx
└── tests/                  # Test suite

🗺️ Roadmap

✅ v1.0 (Current)

  • Core ad serving API
  • eCPM-based ranking (CPM/CPC/CPA)
  • Targeting engine (geo, device, demographics)
  • DeepFM & LR CTR models
  • Event tracking (impression/click/conversion)
  • Docker Compose deployment
  • Prometheus + Grafana monitoring

🚧 v1.1 (Next)

  • Admin dashboard UI (React)
  • Campaign management API
  • Audience segments
  • A/B testing framework

🔮 v2.0 (Future)

  • OpenRTB 2.5 support
  • Header bidding
  • Multi-tenant SaaS mode
  • Kubernetes Helm charts
  • Video ad support (VAST)

🆚 Why Not Just Use...

Google Ad Manager?

  • 💰 Takes 20-30% revenue share
  • 🔒 Your data belongs to Google
  • 🚫 Limited customization
  • OpenAdServer: Keep 100% revenue, own your data

Revive Adserver?

  • 👴 Legacy PHP codebase
  • 🐌 No ML capabilities
  • 📊 Basic reporting only
  • OpenAdServer: Modern Python, ML-powered, real eCPM

Building from scratch?

  • ⏰ 6-12 months development
  • 💸 $100K+ engineering cost
  • 🐛 Countless edge cases
  • OpenAdServer: Production-ready in hours

🤝 Contributing

We love contributions! See CONTRIBUTING.md for guidelines.

# Setup development environment
make setup

# Run tests
make test

# Run linting
make lint

# Format code
make format

📄 License

Apache License 2.0. See LICENSE for details.

Free for commercial use. No attribution required (but appreciated! 🙏)


💬 Community & Support


Built with ❤️ by engineers who've scaled ad systems to 100M+ daily requests
Extracted from production systems serving billions of ad impressions


Keywords: open source ad server, self-hosted ad platform, ad serving, programmatic advertising, CTR prediction, DeepFM, ad tech, digital advertising platform, ad network software, DSP, SSP, advertising API, ad management system, Python ad server
