Rate Limiting Component
The rate limiting component (`internal/rate`) provides per-host rate limiting to ensure respectful probing and prevent overwhelming target servers.
Overview
The rate limiting component implements a token bucket rate limiter with per-host isolation, automatic cleanup, and configurable burst capacity. It ensures SPYDER operates as a responsible internet citizen by respecting server capacity and preventing abuse.
Core Structure
PerHost
Main rate limiting structure that manages per-host limiters:
type PerHost struct {
	mu         sync.Mutex             // Thread-safe access protection
	m          map[string]*limitEntry // Per-host limiter storage
	perSecond  float64                // Requests per second rate
	burst      int                    // Burst capacity
	maxEntries int                    // Maximum stored entries (10,000)
}
limitEntry
Individual host rate limiting entry:
type limitEntry struct {
	limiter  *rate.Limiter // Token bucket limiter for the host
	lastUsed time.Time     // Last access time for cleanup
}
Core Functions
New(perSecond float64, burst int) *PerHost
Creates a new per-host rate limiter with automatic cleanup.
Parameters:
- `perSecond`: Maximum requests per second per host
- `burst`: Maximum burst capacity per host
Returns:
- `*PerHost`: Configured rate limiter instance
Features:
- Automatic Cleanup: Starts background goroutine for memory management
- Memory Protection: Limits maximum entries to 10,000 hosts
- Thread Safety: Mutex-protected concurrent access
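A minimal sketch of what this constructor could look like, built on the PerHost structure documented above. The 10,000-entry cap and the background cleanup goroutine come from this page; the rest is illustrative rather than the actual internal/rate implementation.
func New(perSecond float64, burst int) *PerHost {
	p := &PerHost{
		m:          make(map[string]*limitEntry),
		perSecond:  perSecond,
		burst:      burst,
		maxEntries: 10000, // memory-protection cap documented above
	}
	go p.cleanup() // single background goroutine for periodic cleanup
	return p
}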
Allow(host string) bool
Checks if a request is allowed under the rate limit without blocking.
Parameters:
- `host`: The target hostname for rate limiting
Returns:
- `bool`: `true` if the request is allowed, `false` if rate limited
Behavior:
- Immediate Response: Non-blocking check
- Token Consumption: Consumes token if available
- Lazy Initialization: Creates a limiter entry on a host's first request
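The sketch below shows one way the lazy initialization and lock handling described above could fit together. It assumes the PerHost and limitEntry definitions from earlier on this page plus `golang.org/x/time/rate`; getEntry is a hypothetical helper, and the real package may organize this differently.
// getEntry is a hypothetical helper: it creates the per-host limiter on first
// access and stamps lastUsed for the cleanup pass.
func (p *PerHost) getEntry(host string) *limitEntry {
	p.mu.Lock()
	defer p.mu.Unlock()
	e, ok := p.m[host]
	if !ok {
		e = &limitEntry{limiter: rate.NewLimiter(rate.Limit(p.perSecond), p.burst)}
		p.m[host] = e
	}
	e.lastUsed = time.Now()
	return e
}

// Allow is non-blocking: the mutex protects only the map lookup, and the
// token-bucket check runs after the lock is released.
func (p *PerHost) Allow(host string) bool {
	return p.getEntry(host).limiter.Allow()
}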
Wait(host string)
Blocks until a request token becomes available for the host.
Parameters:
- `host`: The target hostname for rate limiting
Behavior:
- Blocking Operation: Waits until token is available
- Guaranteed Execution: Always allows request after wait
- Context-Free: Uses background context for waiting
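Continuing the same sketch, Wait can reuse the hypothetical getEntry helper from the Allow example. The background context reflects the "Context-Free" behavior noted above, so the wait cannot be cancelled.
// Wait blocks until the host's bucket yields a token. The error from the
// underlying limiter is ignored here: a background context never expires, and
// a single token never exceeds the burst (assuming burst >= 1).
func (p *PerHost) Wait(host string) {
	_ = p.getEntry(host).limiter.Wait(context.Background())
}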
Rate Limiting Algorithm
Token Bucket Implementation
- Algorithm: Uses the `golang.org/x/time/rate` token bucket
- Token Refill: Continuous refill at the specified rate
- Burst Handling: Allows bursts up to configured capacity
- Precision: Supports fractional requests per second
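For reference, the underlying `golang.org/x/time/rate` limiter expresses these properties directly. The standalone example below is independent of SPYDER and simply demonstrates fractional rates and burst draining.
package main

import (
	"fmt"

	"golang.org/x/time/rate"
)

func main() {
	// Refill at 0.5 tokens per second; the bucket holds at most 2 tokens.
	lim := rate.NewLimiter(rate.Limit(0.5), 2)

	// The first two calls drain the burst; the third is rejected until the
	// bucket refills (roughly two seconds later at this rate).
	fmt.Println(lim.Allow(), lim.Allow(), lim.Allow()) // true true false
}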
Per-Host Isolation
- Independent Limits: Each host has its own rate limiter
- No Cross-Contamination: One host's rate limiting doesn't affect others
- Dynamic Creation: Limiters created on first access per host
Automatic Cleanup System
Background Cleanup Process
func (p *PerHost) cleanup() {
	ticker := time.NewTicker(5 * time.Minute) // Runs every 5 minutes
	defer ticker.Stop()
	for range ticker.C {
		// Remove entries unused for over 1 hour once the map exceeds maxEntries.
	}
}
Cleanup Triggers
- Time-Based: Runs every 5 minutes
- Memory-Based: Only cleans when exceeding 10,000 entries
- Age-Based: Removes entries unused for over 1 hour
Memory Management
- Prevents Memory Leaks: Removes unused host entries
- Production Ready: Handles long-running operation scenarios
- Configurable Limits: Maximum 10,000 concurrent host entries
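One way the size- and age-based checks described above might be written, assuming the fields documented earlier. The removeStale name and the structure of the pass are illustrative; only the one-hour age and 10,000-entry threshold come from this page.
// Illustrative cleanup pass, called from the ticker loop in cleanup(): it only
// does work once the map has grown past maxEntries, then evicts entries that
// have not been used for over an hour.
func (p *PerHost) removeStale() {
	const maxAge = time.Hour

	p.mu.Lock()
	defer p.mu.Unlock()

	if len(p.m) <= p.maxEntries {
		return // memory-based trigger not hit yet
	}
	cutoff := time.Now().Add(-maxAge)
	for host, e := range p.m {
		if e.lastUsed.Before(cutoff) {
			delete(p.m, host) // age-based eviction
		}
	}
}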
Thread Safety
Concurrent Access Protection
- Mutex Locking: Protects map operations with mutex
- Read/Write Consistency: Ensures consistent limiter state
- Race Condition Prevention: Safe for concurrent goroutine access
Lock Optimization
- Minimal Lock Duration: Releases lock before token bucket operations
- Per-Host Granularity: Independent limiters reduce contention
- Lazy Initialization: Creates entries only when needed
Integration Points
Probe Pipeline Integration
- Pre-Request Check: `Allow()` for immediate rate limit checking
- Blocking Wait: `Wait()` for guaranteed request execution
- Host-Based: Applied per target hostname
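As a hedged example, a probe worker could combine the limiter with an HTTP fetch roughly as follows. probeURL is a placeholder rather than an actual SPYDER function, and the import of the `internal/rate` package is omitted.
// Illustrative probe-side usage; only Allow/Wait come from the documented API.
func probeURL(limiter *rate.PerHost, rawURL string) (*http.Response, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	host := u.Hostname()

	// Non-blocking path: callers that can reschedule work check Allow first.
	if !limiter.Allow(host) {
		return nil, fmt.Errorf("rate limited: %s", host)
	}
	return http.Get(u.String())
}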
Configuration Integration
- Rate Configuration: Configurable via probe settings
- Burst Configuration: Adjustable burst capacity per deployment
- Cleanup Tuning: Fixed cleanup intervals for production stability
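For instance, probe settings might be mapped onto New roughly like this; ProbeConfig and its field names are hypothetical and only show where the two parameters come from.
// Hypothetical configuration wiring; neither ProbeConfig nor its fields are
// part of the documented API.
type ProbeConfig struct {
	PerHostRPS float64 // requests per second per host
	Burst      int     // burst capacity per host
}

func limiterFromConfig(cfg ProbeConfig) *rate.PerHost {
	return rate.New(cfg.PerHostRPS, cfg.Burst)
}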
Performance Considerations
Memory Usage
- Per-Host Storage: Memory usage scales with unique hosts
- Automatic Cleanup: Prevents unlimited memory growth
- Lightweight Entries: Minimal memory footprint per host
CPU Usage
- Efficient Algorithms: Uses optimized token bucket implementation
- Background Cleanup: Minimal CPU overhead for maintenance
- Lock Contention: Minimal due to per-host isolation
Use Cases
Respectful Probing
limiter := rate.New(1.0, 3) // 1 req/sec, burst of 3
if limiter.Allow("example.com") {
	// Make request immediately
} else {
	// Rate limited, handle accordingly
}
Guaranteed Execution
limiter := rate.New(0.5, 1) // 0.5 req/sec, burst of 1
limiter.Wait("example.com") // Wait for token
// Request is guaranteed to be allowed
Configuration Examples
Conservative Settings
rate.New(0.1, 1) // 1 request per 10 seconds, burst of 1 (effectively no bursting)
Standard Settings
rate.New(1.0, 3) // 1 request per second, burst of 3
Aggressive Settings
rate.New(10.0, 20) // 10 requests per second, burst of 20
Error Handling
Graceful Degradation
- No Error Returns: Rate limiting always succeeds
- Blocking Behavior: `Wait()` blocks until success
- Immediate Feedback: `Allow()` provides immediate status
Resource Management
- Memory Limits: Automatic cleanup prevents resource exhaustion
- Goroutine Management: Single cleanup goroutine per limiter instance
- Clean Shutdown: Cleanup goroutine terminates with limiter
Monitoring Metrics
Rate limiting should be monitored for:
- Rate Limit Hit Rate: Percentage of requests that are rate limited
- Average Wait Time: Time spent waiting for rate limit clearance
- Active Host Count: Number of hosts currently being rate limited
- Memory Usage: Memory consumption of rate limiter storage
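The package does not export these metrics itself; a thin wrapper along the lines of the hypothetical sketch below is one way to collect the hit rate, and the same pattern extends to the other metrics.
// Hypothetical instrumentation wrapper; plain sync/atomic counters are used
// here, but the same shape works with Prometheus or any other metrics library.
type meteredLimiter struct {
	inner   *rate.PerHost
	allowed atomic.Int64
	limited atomic.Int64
}

func (m *meteredLimiter) Allow(host string) bool {
	ok := m.inner.Allow(host)
	if ok {
		m.allowed.Add(1)
	} else {
		m.limited.Add(1) // feeds the rate-limit hit rate metric
	}
	return ok
}

// HitRate reports the fraction of checks that were rate limited.
func (m *meteredLimiter) HitRate() float64 {
	a, l := m.allowed.Load(), m.limited.Load()
	if a+l == 0 {
		return 0
	}
	return float64(l) / float64(a+l)
}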
Best Practices
Rate Selection
- Server Respect: Choose rates that respect target server capacity
- Network Conditions: Consider network latency and server response times
- Burst Sizing: Configure burst to handle legitimate traffic spikes
Host Management
- Hostname Consistency: Use consistent hostname formats for effective limiting
- Apex vs Subdomain: Consider whether to limit by apex domain or individual hosts
- DNS Resolution: Apply rate limiting after DNS resolution to actual target hosts
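A sketch of the kind of normalization that keeps limits effective; normalizeHost is hypothetical and not part of `internal/rate`.
// Hypothetical pre-processing step: without it, "Example.com", "example.com"
// and "example.com:443" would each get an independent limiter.
func normalizeHost(raw string) string {
	h := strings.ToLower(strings.TrimSpace(raw))
	// Strip an explicit port so host:port variants share one bucket.
	if host, _, err := net.SplitHostPort(h); err == nil {
		return host
	}
	return h
}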
Security Considerations
DoS Prevention
- Self-Protection: Prevents SPYDER from overwhelming target servers
- Reputation Protection: Maintains good internet citizenship
- Compliance: Helps comply with terms of service and robots.txt
Resource Protection
- Memory Bounds: Automatic cleanup prevents memory exhaustion attacks
- CPU Bounds: Efficient algorithms prevent CPU exhaustion
- Goroutine Bounds: Single cleanup goroutine prevents goroutine leaks
Troubleshooting
Common Issues
- Rate Too High: Servers returning errors or blocking requests
- Rate Too Low: Probe performance slower than expected
- Memory Growth: Cleanup not removing old entries effectively
Debugging Steps
- Monitor Rate Limit Hits: Check how often rate limits are triggered
- Server Response Analysis: Monitor target server response patterns
- Memory Usage Tracking: Watch rate limiter memory consumption
- Performance Profiling: Analyze impact on overall probe performance
Advanced Configuration
Dynamic Rate Adjustment
While not built-in, rate limits can be adjusted by:
- Creating new rate limiter instances
- Implementing rate adjustment based on server response patterns
- Using different rates for different host categories
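One hedged way to express the "create a new instance" approach is an atomic pointer swap, so callers always see a complete limiter. The wrapper below is not part of the package, and swapping instances discards accumulated per-host token state.
// Hypothetical wrapper for retuning rates at runtime.
type adjustableLimiter struct {
	current atomic.Pointer[rate.PerHost]
}

func newAdjustableLimiter(perSecond float64, burst int) *adjustableLimiter {
	a := &adjustableLimiter{}
	a.current.Store(rate.New(perSecond, burst))
	return a
}

func (a *adjustableLimiter) SetRate(perSecond float64, burst int) {
	a.current.Store(rate.New(perSecond, burst)) // existing per-host state is discarded
}

func (a *adjustableLimiter) Allow(host string) bool {
	return a.current.Load().Allow(host)
}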
Integration with Circuit Breakers
- Rate limiting complements circuit breaker functionality
- Provides primary request throttling
- Circuit breakers handle failure scenarios
- Together they provide comprehensive traffic control
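A minimal sketch of how the two controls might be layered; the breaker interface here is invented purely for illustration.
// Hypothetical layering of rate limiting and circuit breaking: the limiter
// throttles request volume, the breaker short-circuits hosts that keep failing.
type breaker interface {
	Allow(host string) bool      // false while the circuit is open
	Record(host string, ok bool) // feed back success or failure
}

func guardedRequest(lim *rate.PerHost, br breaker, host string, do func() error) error {
	if !br.Allow(host) {
		return fmt.Errorf("circuit open for %s", host)
	}
	lim.Wait(host) // primary throttle: never exceed the per-host rate

	err := do()
	br.Record(host, err == nil)
	return err
}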