Environment:
- OpenVPN 2.6.14 with DCO enabled
- Linux kernel 6.14.0 with ovpn-dco kernel module
- Production setup: 50+ OpenVPN server processes per machine
- dnsproxy forwarding traffic using random 127.0.x.x source IPs (censorship circumvention)
Problem:
A server with DCO enabled consistently shows 70-150% more peer entries in its status files than an identical non-DCO server carrying an equal share of the load:
- DCO server: 1,282 unique clients (after deduplication)
- Non-DCO server: 741 unique clients
- Inflation: 541 extra entries (73%)
Root Cause Analysis:
When clients disconnect from process A and reconnect to process B, the old entry in process A is not removed:
- DCO: Old entries persist indefinitely in status files → accumulation over time
- Non-DCO: Old entries are cleaned up properly → 1:1 ratio maintained
Evidence:
Example client appearing in 3 different OpenVPN processes:
server-72581: connected 05:38:25 (still in status file after 2+ hours)
server-91967: connected 05:56:00 (still in status file after 1.5+ hours)
server-91970: connected 06:30:16 (current active connection)
All three entries remain in their respective status files. With non-DCO, only the most recent connection appears.
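For anyone who wants to reproduce the cross-process check, here is a minimal sketch of how the duplicates can be found. It assumes the status files are written with `status-version 2` (comma-separated, with a `HEADER,CLIENT_LIST,...` row naming the columns) and that clients are identified by common name; the function names and the `status_by_process` mapping are illustrative, not part of OpenVPN.

```python
import csv
from collections import defaultdict
from io import StringIO


def parse_clients(status_text):
    """Return (common_name, connected_since_epoch) pairs from one status file.

    Assumption: `status-version 2` format, where a HEADER,CLIENT_LIST row
    names the columns and each connected client is a CLIENT_LIST row.
    """
    columns = None
    clients = []
    for row in csv.reader(StringIO(status_text)):
        if len(row) >= 2 and row[0] == "HEADER" and row[1] == "CLIENT_LIST":
            columns = row[2:]
        elif row and row[0] == "CLIENT_LIST" and columns:
            fields = dict(zip(columns, row[1:]))
            clients.append((fields["Common Name"],
                            int(fields["Connected Since (time_t)"])))
    return clients


def find_cross_process_duplicates(status_by_process):
    """Map each common name seen in more than one process to its entries.

    `status_by_process` maps a process label to that process's status file
    contents. Entries are (connected_since, process) tuples, newest first.
    """
    seen = defaultdict(list)
    for process, text in status_by_process.items():
        for name, since in parse_clients(text):
            seen[name].append((since, process))
    return {
        name: sorted(entries, reverse=True)
        for name, entries in seen.items()
        if len({process for _, process in entries}) > 1
    }
```

The first entry for each name is the most recent connection; every later entry is a candidate stale peer in another process.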
Keepalive Configuration:
keepalive 25 180
In server mode, keepalive doubles the timeout on the server side, so the server-side timeout should be 360 seconds (180 × 2). Old entries nevertheless never expire.
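The 360-second figure follows from how `keepalive` expands into `ping` and `ping-restart` as described in the OpenVPN manual: in server mode the local `ping-restart` is twice the configured timeout, while the value pushed to clients is not doubled. A small sketch of that arithmetic (the function name and return shape are illustrative only):

```python
def expand_keepalive(interval, timeout, server_mode=True):
    """Return the ping / ping-restart values that `keepalive` expands to.

    In server mode the local ping-restart is doubled, and the undoubled
    values are pushed to clients.
    """
    local_restart = timeout * 2 if server_mode else timeout
    result = {"ping": interval, "ping-restart": local_restart}
    if server_mode:
        result["pushed"] = {"ping": interval, "ping-restart": timeout}
    return result


print(expand_keepalive(25, 180))
# {'ping': 25, 'ping-restart': 360, 'pushed': {'ping': 25, 'ping-restart': 180}}
```

So with `keepalive 25 180`, a peer that stops sending traffic should be dropped by the server after at most 360 seconds, well under the 1.5 to 2+ hours seen above.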
Log Analysis (2-hour window):
- New connections created: 634
- DEL_PEER notifications received: 283
- Gap: 351 peers never sent expiry notification
This suggests that the DCO kernel module is not triggering keepalive expiry for all disconnected peers.
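The counts above can be reproduced with a sketch like the following. The exact log wording for peer creation and DEL_PEER notifications depends on the OpenVPN version and verbosity level, so both regular expressions are supplied by the caller rather than assumed here; `count_peer_events` is a hypothetical helper name.

```python
import re


def count_peer_events(log_lines, new_pattern, del_pattern):
    """Count peer-creation and peer-deletion log lines and report the gap.

    `new_pattern` and `del_pattern` are caller-supplied regular expressions;
    they must be adapted to the messages the running build actually logs.
    """
    new_re = re.compile(new_pattern)
    del_re = re.compile(del_pattern)
    created = deleted = 0
    for line in log_lines:
        if new_re.search(line):
            created += 1
        elif del_re.search(line):
            deleted += 1
    return {"created": created, "deleted": deleted, "gap": created - deleted}
```

With the figures from this 2-hour window, the gap is 634 - 283 = 351. Matching events by peer ID instead of only counting them would show which specific peers never received a deletion notification.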
What We've Tried:
- Periodic cleanup of orphaned instances - Failed: Either found nothing to clean or removed active connections
- Duplicate detection at instance creation - Failed: Common name not available until after TLS handshake completes
- Duplicate detection after TLS handshake - Partial success: Prevents within-process duplicates, but doesn't fix cross-process inflation (the main problem)
Current Status:
- Within-process duplicates: Fixed (0 duplicates found in same process)
- Cross-process stale entries: Not fixed (70% inflation persists)
Question for OpenVPN Team:
Why would DCO fail to send DEL_PEER notifications for ~50% of disconnected peers, causing stale entries to persist indefinitely in userspace status files? Is this a known limitation of the DCO keepalive mechanism, or is there a configuration/implementation issue we're missing?
Any guidance on how to ensure proper cleanup of stale DCO peer entries would be greatly appreciated.