Skip to content

KunalKumar-1/http-server

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Highly Scalable HTTP Server with Go

License: MIT Go Version Performance

Production-ready, high-performance HTTP server built in Go, designed to handle 1M+ requests with optimal concurrency. Features worker pools, rate limiting, comprehensive metrics, and advanced observability.

Table of Contents

Introduction

This repository contains a production-ready, highly scalable HTTP server built with Golang. The server demonstrates modern concurrency patterns, efficient resource management, and comprehensive observability features. It's designed to handle extreme loads (1M+ requests) while maintaining low latency and high reliability.

The project includes both a high-performance server and a sophisticated load testing client for benchmarking and validation.

Architecture

System Architecture

┌─────────────────────────────────────────────────────────────┐
│                     HTTP Server (Gin)                       │
└────────────────────────┬────────────────────────────────────┘
                         │
    ┌────────────────────▼─────────────────────────┐
    │         Middleware Stack                     │
    │  ┌────────────────────────────────────────┐  │
    │  │  Request ID Generator                  │  │
    │  │  Structured Logger                     │  │
    │  │  Metrics Collector                     │  │
    │  │  Token Bucket Rate Limiter             │  │
    │  │  Request Timeout Handler               │  │
    │  └────────────────────────────────────────┘  │
    └────────────────────┬─────────────────────────┘
                         │
    ┌────────────────────▼─────────────────────────┐
    │            Route Handlers                    │
    │  ┌──────────────┐      ┌──────────────────┐ │
    │  │ Fast Compute │      │ Intensive Compute│ │
    │  │ (Direct)     │      │ (Worker Pool)    │ │
    │  └──────────────┘      └────────┬─────────┘ │
    └─────────────────────────────────┼───────────┘
                                      │
                         ┌────────────▼──────────────┐
                         │     Worker Pool           │
                         │  ┌────────────────────┐   │
                         │  │  Job Queue         │   │
                         │  │  (Buffered Chan)   │   │
                         │  └──────────┬─────────┘   │
                         │             │             │
                         │  ┌──────────▼─────────┐   │
                         │  │  Worker Goroutines │   │
                         │  │  (Configurable)    │   │
                         │  └──────────┬─────────┘   │
                         │             │             │
                         │  ┌──────────▼─────────┐   │
                         │  │  Results Channel   │   │
                         │  └────────────────────┘   │
                         └───────────────────────────┘

Core Components

  • Gin Router: High-performance HTTP router and middleware framework
  • Worker Pool: Efficient concurrent job processing with configurable workers
  • Rate Limiter: Token bucket algorithm preventing system overload
  • Metrics System: Atomic counters tracking performance in real-time
  • Health Checker: Pluggable health check system for monitoring
  • Graceful Shutdown: Clean shutdown with configurable timeout

Getting Started

Prerequisites

Before you begin, ensure you have the following installed:

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd bootcamp-web-http
  2. Install dependencies:

    go mod download
  3. Build the server:

    go build -o server cmd/main.go
  4. Build the load testing client:

    go build -o client client/client.go

Running the Server

  1. Start the server with default configuration:

    ./server
  2. Start with custom configuration:

    export SERVER_PORT=:3000
    export WORKER_COUNT=16
    export QUEUE_SIZE=20000
    export RATE_LIMIT=200000
    export ENVIRONMENT=production
    ./server
  3. Verify the server is running:

    curl http://localhost:8080/health

Expected output:

{
  "status": {
    "worker_pool": "healthy"
  },
  "timestamp": 1704974400,
  "goroutines": 42,
  "version": "1.0.0"
}

API Endpoints

Health & Monitoring

Health Check

GET /health

Returns the current health status of the server.

Response:

{
  "status": {
    "worker_pool": "healthy"
  },
  "timestamp": 1704974400,
  "goroutines": 42,
  "version": "1.0.0"
}

Example:

curl http://localhost:8080/health

Metrics Endpoint

GET /metrics
# or
GET /api/v1/stats

Returns comprehensive server performance metrics.

Response:

{
  "total_requests": 1543289,
  "success_count": 1542100,
  "error_count": 1189,
  "active_requests": 234,
  "rejected_count": 0,
  "success_rate": 99.92,
  "avg_latency_us": 125,
  "queue_depth": 45,
  "worker_count": 16,
  "goroutines": 42,
  "timestamp": 1704974400
}

Metrics Explained:

  • total_requests: Total requests received since startup
  • success_count: Successfully processed requests (2xx responses)
  • error_count: Failed requests (4xx/5xx responses)
  • rejected_count: Requests rejected by rate limiter (429)
  • active_requests: Currently processing requests
  • avg_latency_us: Average request latency in microseconds
  • queue_depth: Current number of jobs in worker pool queue
  • success_rate: Success percentage (success_count/total_requests * 100)
  • worker_count: Number of active worker goroutines
  • goroutines: Current number of goroutines
  • timestamp: Unix timestamp of the metrics snapshot

Example:

curl http://localhost:8080/metrics | jq

Compute Endpoints

Fast Compute (Direct Processing)

POST /api/v1/compute/fast
Content-Type: application/json

{
  "number": 42
}

Performs direct computation (number²) without using the worker pool. Ideal for lightweight operations.

Request Body:

{
  "number": integer (required)
}

Response (Success - 200 OK):

{
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "job_id": 0,
  "result": 1764,
  "latency_us": 45,
  "timestamp": 1704974400
}

Response Fields:

  • request_id: Unique identifier for request tracing
  • job_id: Job identifier (0 for direct processing)
  • result: Computation result (number²)
  • latency_us: Processing latency in microseconds
  • timestamp: Unix timestamp

Example:

curl -X POST http://localhost:8080/api/v1/compute/fast \
  -H "Content-Type: application/json" \
  -d '{"number": 42}'

CPU Intensive Compute (Worker Pool)

POST /api/v1/compute/intensive
Content-Type: application/json

{
  "number": 42
}

Performs CPU-intensive computation using the worker pool. Suitable for heavy operations requiring concurrency control.

Request Body:

{
  "number": integer (required)
}

Response (Success - 200 OK):

{
  "request_id": "550e8400-e29b-41d4-a716-446655440001",
  "job_id": 12345,
  "result": 1764,
  "latency_us": 234,
  "timestamp": 1704974400
}

Example:

curl -X POST http://localhost:8080/api/v1/compute/intensive \
  -H "Content-Type: application/json" \
  -d '{"number": 100}'

Error Responses

400 Bad Request

Invalid request format or missing required fields.

{
  "request_id": "550e8400-e29b-41d4-a716-446655440002",
  "error": "Invalid request format",
  "code": "INVALID_REQUEST",
  "timestamp": 1704974400
}

Example:

# Missing number field
curl -X POST http://localhost:8080/api/v1/compute/fast \
  -H "Content-Type: application/json" \
  -d '{}'

408 Request Timeout

Request exceeded the configured timeout duration.

{
  "request_id": "550e8400-e29b-41d4-a716-446655440003",
  "error": "request timeout",
  "code": "TIMEOUT",
  "timestamp": 1704974400
}

429 Too Many Requests

Rate limit exceeded. Server is protecting itself from overload.

{
  "request_id": "550e8400-e29b-41d4-a716-446655440004",
  "error": "Rate limit exceeded",
  "code": "RATE_LIMIT_EXCEEDED",
  "timestamp": 1704974400
}

503 Service Unavailable

Worker pool queue is full. Server cannot accept more jobs.

{
  "request_id": "550e8400-e29b-41d4-a716-446655440005",
  "error": "job queue full",
  "code": "SERVICE_OVERLOADED",
  "timestamp": 1704974400
}

500 Internal Server Error

Internal processing error or worker pool error.

Possible error codes:

  • INTERNAL_ERROR: General internal server error
  • PROCESSING_ERROR: Error during job processing in worker pool

Example (INTERNAL_ERROR):

{
  "request_id": "550e8400-e29b-41d4-a716-446655440006",
  "error": "internal server error",
  "code": "INTERNAL_ERROR",
  "timestamp": 1704974400
}

Example (PROCESSING_ERROR):

{
  "request_id": "550e8400-e29b-41d4-a716-446655440007",
  "error": "invalid data type for cpu_intensive job",
  "code": "PROCESSING_ERROR",
  "timestamp": 1704974400
}

Load Testing

Running Load Tests

The included load testing client can simulate high-volume traffic to benchmark server performance.

  1. Ensure the server is running:

    ./server
  2. In another terminal, run the client:

    ./client

Client Configuration

Modify the test scenarios in client/client.go:

config := &ClientConfig{
    ServerURL:      "http://localhost:8080",
    TotalRequests:  1_000_000,        // Total requests to send
    Concurrency:    500,              // Concurrent connections
    RequestTimeout: 30 * time.Second, // Timeout per request
    TestDuration:   5 * time.Minute,  // Max test duration
    WarmupRequests: 1000,             // Warmup requests
    ReportInterval: 5 * time.Second,  // Progress report interval
}

Test Scenarios

The client runs two test scenarios:

  1. Fast Endpoint Load Test

    • 1,000,000 requests
    • 500 concurrent connections
    • Tests direct processing path
  2. CPU Intensive Endpoint Load Test

    • 100,000 requests
    • 500 concurrent connections
    • Tests worker pool performance

Sample Output

=== Load Testing HTTP Server ===

✓ Server health check passed
✓ Running 1000 warmup requests...
  Warmup completed in 2.3s

=== Fast Endpoint Test ===
Sending 1,000,000 requests with 500 concurrent connections...

Progress: 250,000/1,000,000 (25%) | RPS: 45,234 | Errors: 12
Progress: 500,000/1,000,000 (50%) | RPS: 47,891 | Errors: 23
Progress: 750,000/1,000,000 (75%) | RPS: 46,543 | Errors: 31
Progress: 1,000,000/1,000,000 (100%) | RPS: 48,120 | Errors: 45

Results:
  Total Requests:     1,000,000
  Successful:         999,955 (99.99%)
  Failed:             45 (0.01%)
  Duration:           21.2s
  Requests/sec:       47,169
  Avg Latency:        10.5ms
  Min Latency:        1.2ms
  Max Latency:        234.5ms
  Rate Limit Hits:    0

Load Test Metrics

The client reports:

  • Total requests/second (RPS): Throughput measurement
  • Success rate: Percentage of successful requests
  • Latency statistics: Average, min, and max latency
  • Error breakdown: Detailed error categorization by status code
  • Rate limit hits: Number of 429 responses

Project Structure

bootcamp-web-http/
├── cmd/
│   └── main.go                    # HTTP server implementation
│       ├── Server struct          # Main server configuration
│       ├── WorkerPool             # Worker pool implementation
│       ├── RateLimiter            # Token bucket rate limiter
│       ├── HealthChecker          # Health check system
│       ├── Metrics                # Metrics collection
│       └── Middleware             # HTTP middleware stack
├── client/
│   └── client.go                  # Load testing client
│       ├── ClientConfig           # Client configuration
│       ├── LoadTestClient         # Load test client implementation
│       └── Metrics                # Client metrics tracking
├── go.mod                         # Go module dependencies
├── go.sum                         # Dependency checksums
└── README.md                      # Project documentation

Key Components

  • cmd/main.go: Complete server implementation with all features
  • client/client.go: Sophisticated load testing client
  • Server struct: Main server configuration and state
  • WorkerPool: Concurrent job processing system
  • RateLimiter: Token bucket algorithm implementation
  • HealthChecker: Pluggable health check system
  • Metrics: Atomic performance counters

Technologies Used

  • Golang: Primary language (1.23+)
  • Gin: High-performance HTTP web framework
  • sync/atomic: Lock-free atomic operations for metrics
  • context: Request cancellation and timeouts
  • time: Token bucket rate limiting
  • encoding/json: JSON request/response handling
  • log/slog: Structured logging
  • os/signal: Graceful shutdown handling

Configuration

Environment Variables

Variable Description Default Required
SERVER_PORT Server listen address :8080 No
WORKER_COUNT Number of worker goroutines CPU_COUNT * 2 No
QUEUE_SIZE Worker pool job queue size 10000 No
SHUTDOWN_TIMEOUT Graceful shutdown timeout 30s No
REQUEST_TIMEOUT Request timeout duration 30s No
RATE_LIMIT Requests per second limit 100000 No
ENVIRONMENT Environment mode development No

Configuration Examples

Development Mode:

export SERVER_PORT=:8080
export WORKER_COUNT=8
export QUEUE_SIZE=5000
export RATE_LIMIT=50000
export ENVIRONMENT=development
./server

Production Mode:

export SERVER_PORT=:8080
export WORKER_COUNT=32
export QUEUE_SIZE=50000
export RATE_LIMIT=200000
export ENVIRONMENT=production
export REQUEST_TIMEOUT=60s
export SHUTDOWN_TIMEOUT=60s
./server

High-Throughput Mode:

export WORKER_COUNT=64
export QUEUE_SIZE=100000
export RATE_LIMIT=500000
./server

Performance Characteristics

Server Performance

  • Throughput: 100,000+ requests/second (with default rate limit)
  • Latency: Sub-millisecond average for fast endpoint
  • Concurrency: Efficiently handles 500+ concurrent connections
  • Scalability: Worker pool architecture enables horizontal scaling
  • Memory: Low memory footprint with connection pooling
  • CPU: Efficient CPU utilization with worker pools

Benchmark Results

Fast Endpoint (Direct Processing):

  • RPS: 150,000+ req/s
  • Avg Latency: <1ms
  • P95 Latency: <5ms
  • P99 Latency: <10ms

CPU Intensive Endpoint (Worker Pool):

  • RPS: 50,000+ req/s
  • Avg Latency: 2-5ms
  • P95 Latency: <15ms
  • P99 Latency: <30ms

Load Test Scenarios

Scenario 1: Fast Endpoint

  • Total Requests: 1,000,000
  • Concurrency: 500
  • Success Rate: 99.99%
  • Duration: ~21 seconds

Scenario 2: CPU Intensive Endpoint

  • Total Requests: 100,000
  • Concurrency: 500
  • Success Rate: 99.95%
  • Duration: ~2 seconds

Key Features

1. Worker Pool Architecture

Efficient concurrent request processing using configurable worker pools:

type WorkerPool struct {
    workers     int
    jobQueue    chan Job
    resultQueue chan JobResult
    ctx         context.Context
    cancel      context.CancelFunc
}

Benefits:

  • Controlled concurrency prevents resource exhaustion
  • Buffered job queue handles traffic bursts
  • Graceful degradation when queue is full

2. Token Bucket Rate Limiting

Built-in rate limiting prevents system overload:

type RateLimiter struct {
    rate        int
    bucket      int
    maxBucket   int
    lastRefill  time.Time
    mu          sync.Mutex
}

Features:

  • Configurable requests per second
  • Smooth traffic distribution
  • Prevents thundering herd problems

3. Request Queuing

Configurable job queue for handling burst traffic:

  • Buffered channels for job distribution
  • Queue depth monitoring
  • Backpressure when queue is full

4. Graceful Shutdown

Clean shutdown with configurable timeout:

  • Stops accepting new connections
  • Waits for in-flight requests
  • Shuts down worker pool gracefully
  • Logs final statistics

5. Health Checks

Built-in health check endpoint:

  • Customizable health checks
  • Goroutine count monitoring
  • Timestamp tracking

6. Comprehensive Metrics

Real-time performance tracking:

  • Request counts (total, success, error, rejected)
  • Latency tracking (atomic operations)
  • Queue depth monitoring
  • Active request tracking
  • Success rate calculation

7. Request Timeout

Configurable request timeout middleware:

  • Per-request timeout enforcement
  • Context cancellation
  • Timeout error responses

8. Request Tracing

Automatic request ID generation:

  • UUID-based request IDs
  • Request ID propagation through middleware
  • Included in all responses and logs

9. Structured Logging

JSON-formatted structured logging:

  • Configurable log levels
  • Request/response logging
  • Performance metrics in logs
  • Production-ready format

Monitoring & Observability

Real-Time Metrics

Monitor server performance in real-time:

# View metrics
curl http://localhost:8080/metrics | jq

# Watch metrics continuously
watch -n 1 'curl -s http://localhost:8080/metrics | jq'

Key Metrics to Monitor

  1. Success Rate: Should stay above 99.9%
  2. Average Latency: Monitor for degradation
  3. Queue Depth: Should not consistently max out
  4. Active Requests: Indicates current load
  5. Rejected Requests: Rate limit effectiveness

Logging

Development Mode:

  • Human-readable console output
  • Detailed request/response logging
  • Debug information included

Production Mode:

  • JSON-formatted structured logs
  • Optimized log levels
  • Request ID tracing

Example log entry:

{
  "time": "2025-01-11T10:30:00Z",
  "level": "INFO",
  "msg": "request completed",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "method": "POST",
  "path": "/api/v1/compute/fast",
  "status": 200,
  "latency_ms": 1.2,
  "client_ip": "192.168.1.100"
}

Health Monitoring

Implement custom health checks:

server.healthCheck.RegisterCheck("database", func() error {
    // Check database connection
    return db.Ping()
})

server.healthCheck.RegisterCheck("cache", func() error {
    // Check cache connection
    return cache.Ping()
})

Development

Adding New Endpoints

  1. Create a handler function:
func (s *Server) HandleNewEndpoint(c *gin.Context) {
    requestID := c.GetString("request_id")
    
    // Your logic here
    
    c.JSON(http.StatusOK, Response{
        RequestID: requestID,
        Result:    result,
        Timestamp: time.Now().Unix(),
    })
}
  1. Register the route in SetupRouter:
api := router.Group("/api/v1")
{
    api.POST("/your/endpoint", server.HandleNewEndpoint)
}

Adding Middleware

func CustomMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        // Before request
        start := time.Now()
        
        c.Next()
        
        // After request
        latency := time.Since(start)
        log.Printf("Request took %v", latency)
    }
}

// Register middleware
router.Use(CustomMiddleware())

Adding Health Checks

server.healthCheck.RegisterCheck("custom_check", func() error {
    if isUnhealthy() {
        return fmt.Errorf("custom check failed: %v", reason)
    }
    return nil
})

Code Formatting

# Format all Go files
go fmt ./...

# Using goimports
go install golang.org/x/tools/cmd/goimports@latest
goimports -w .

Running Tests

# Run all tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run tests with race detection
go test -race ./...

# Benchmark tests
go test -bench=. -benchmem

Deployment

Docker Deployment

Create a Dockerfile:

# Build stage
FROM golang:1.23-alpine AS builder

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o server ./cmd/main.go

# Runtime stage
FROM alpine:latest

RUN apk --no-cache add ca-certificates
WORKDIR /root/

COPY --from=builder /app/server .

EXPOSE 8080

CMD ["./server"]

Build and run:

# Build image
docker build -t scalable-http-server .

# Run container
docker run -p 8080:8080 \
  -e WORKER_COUNT=16 \
  -e QUEUE_SIZE=20000 \
  -e RATE_LIMIT=200000 \
  -e ENVIRONMENT=production \
  scalable-http-server

Docker Compose

Create a docker-compose.yml:

version: '3.8'

services:
  server:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SERVER_PORT=:8080
      - WORKER_COUNT=32
      - QUEUE_SIZE=50000
      - RATE_LIMIT=200000
      - ENVIRONMENT=production
      - REQUEST_TIMEOUT=60s
      - SHUTDOWN_TIMEOUT=60s
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

Run with Docker Compose:

docker-compose up -d --build

Kubernetes Deployment

Create deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: scalable-http-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: http-server
  template:
    metadata:
      labels:
        app: http-server
    spec:
      containers:
      - name: server
        image: scalable-http-server:latest
        ports:
        - containerPort: 8080
        env:
        - name: SERVER_PORT
          value: ":8080"
        - name: WORKER_COUNT
          value: "32"
        - name: QUEUE_SIZE
          value: "50000"
        - name: RATE_LIMIT
          value: "200000"
        - name: ENVIRONMENT
          value: "production"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
          limits:
            memory: "512Mi"
            cpu: "1000m"
---
apiVersion: v1
kind: Service
metadata:
  name: http-server-service
spec:
  selector:
    app: http-server
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  type: LoadBalancer

Deploy to Kubernetes:

kubectl apply -f deployment.yaml

Production Best Practices

  1. Environment Configuration

    • Always set ENVIRONMENT=production
    • Use environment-specific configurations
    • Externalize sensitive configuration
  2. Resource Tuning

    • Set WORKER_COUNT based on CPU cores (2-4x CPU count)
    • Configure QUEUE_SIZE based on expected traffic (10x RPS)
    • Adjust RATE_LIMIT to prevent overload
  3. Monitoring

    • Integrate with Prometheus/Grafana
    • Set up alerts for high error rates
    • Monitor queue depth and latency
  4. Load Balancing

    • Run multiple instances behind a load balancer
    • Use health checks for instance management
    • Implement circuit breakers
  5. Logging

    • Use centralized logging (ELK, Splunk)
    • Implement log rotation
    • Enable structured JSON logging
  6. Security

    • Use HTTPS/TLS in production
    • Implement authentication/authorization
    • Enable CORS with proper configuration
    • Add security headers middleware

Roadmap

  • High-performance HTTP server
  • Worker pool architecture
  • Token bucket rate limiting
  • Comprehensive metrics
  • Graceful shutdown
  • Health check system
  • Request tracing
  • Structured logging
  • Load testing client
  • Prometheus metrics integration
  • OpenTelemetry tracing
  • Circuit breaker pattern
  • Request caching layer
  • WebSocket support
  • gRPC endpoints
  • Database connection pooling
  • Redis integration
  • API authentication (JWT)
  • API versioning
  • OpenAPI/Swagger documentation
  • Distributed tracing (Jaeger)
  • Performance profiling endpoints
  • Horizontal pod autoscaling (HPA)
  • Service mesh integration (Istio)

Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests for new features
  4. Ensure all tests pass (go test ./...)
  5. Commit your changes (git commit -m 'Add some amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Contribution Guidelines

  • Follow Go code style guidelines (gofmt, golint)
  • Maintain or improve test coverage
  • Update documentation for new features
  • Add examples for new endpoints
  • Write clear commit messages
  • Include performance benchmarks for optimizations

Code Review Checklist

  • Code follows Go best practices
  • Tests added and passing
  • Documentation updated
  • No performance regressions
  • Error handling implemented
  • Logging added where appropriate
  • Metrics updated if needed

License

This project is licensed under the MIT License - see the LICENSE file for details.


About

Highly scalable HTTP server and client scalable upto 1 Million requests.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages