# DDoS Protection Challenge Module

A Cloudflare-style "Under Attack" mode for aproxy that protects your service from DDoS attacks, aggressive scraping, and automated bots.

## How It Works

This module implements a multi-layered defense system:

### 1. Challenge-Response System
When an unverified visitor (one without a valid token) accesses your site, they see a security challenge page instead of the actual content. With the default `button` challenge, the visitor must click a "Verify I'm Human" button to prove they're not a bot.

### 2. Honeypot Detection
The challenge page includes a hidden link that is invisible to humans but may be discovered by automated scrapers and bots. If this link is accessed, the requesting IP is immediately banned for the configured duration.

### 3. Token-Based Validation
After successfully completing the challenge, users receive a cookie containing a cryptographic token. This token remains valid for the configured duration (default: 24 hours), so legitimate users don't have to solve challenges repeatedly.

### 4. IP Banning
IPs that trigger the honeypot are temporarily banned and cannot access protected paths. The ban duration is configurable.

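
The four layers above can be read as a single decision per request. The following is a minimal sketch in plain Lua, not the module's actual code: `new_store` is a hypothetical in-memory stand-in for an nginx shared dictionary, `decide` and its return values are illustrative names, and the order of checks (honeypot, then ban, then token) is an assumption. It also assumes every path is protected.

```lua
-- Hypothetical stand-in for an nginx shared dict: keys with expiry times.
-- Under OpenResty the real store would be ngx.shared.<dict name>.
local function new_store()
  local items = {}
  return {
    set = function(key, ttl, now) items[key] = now + ttl end,
    has = function(key, now)
      local expires = items[key]
      return expires ~= nil and expires > now
    end,
  }
end

-- Decide what to do with one request (all names are illustrative).
local function decide(req, bans, tokens, cfg, now)
  if req.path == '/__aproxy_challenge_trap' then
    bans.set(req.ip, cfg.ban_duration, now) -- honeypot hit: ban the IP
    return 'ban'
  end
  if bans.has(req.ip, now) then
    return 'forbidden'                      -- still inside the ban window
  end
  if req.token and tokens.has(req.token, now) then
    return 'pass'                           -- valid token: forward to backend
  end
  return 'challenge'                        -- unverified: show challenge page
end

-- Usage: timestamps are plain seconds to keep the example deterministic.
local cfg = { ban_duration = 3600, token_duration = 86400 }
local bans, tokens = new_store(), new_store()
tokens.set('abc', cfg.token_duration, 0)
print(decide({ ip = '1.1.1.1', path = '/', token = 'abc' }, bans, tokens, cfg, 10))
```

Passing `now` explicitly keeps the sketch testable; a real handler would read the clock and the cookie from the request instead.
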
## Why This Helps With DDoS/Scraping

- **Computational Cost**: Most DDoS attacks and scrapers make thousands of requests, and each request that reaches your application costs CPU time. This module intercepts unverified requests before they reach your backend.
- **Bot Detection**: Automated tools often don't execute JavaScript or render pages properly. The challenge page requires interaction, filtering out most bots.
- **Honeypot Trap**: Scrapers that parse HTML for links will likely find and follow the honeypot link, getting themselves banned.
- **Rate Limiting Effect**: Even sophisticated bots that can solve the challenge have to do extra work, which effectively rate-limits them.

## Configuration

### Nginx Setup

**REQUIRED**: Add these shared dictionaries to your nginx/OpenResty configuration:

```nginx
http {
    # Shared dictionary for banned IPs
    lua_shared_dict aproxy_bans 10m;

    # Shared dictionary for valid tokens
    lua_shared_dict aproxy_tokens 10m;

    # ... rest of your config
}
```

### aproxy Configuration

Add to your `conf.lua`:

```lua
return {
    version = 1,
    wantedScripts = {
        ['ddos_protection_challenge'] = {
            ban_duration = 3600,        -- 1 hour ban for honeypot triggers
            token_duration = 86400,     -- 24 hour token validity
            cookie_name = 'aproxy_token',
            shared_dict_bans = 'aproxy_bans',
            shared_dict_tokens = 'aproxy_tokens',
            protected_paths = {         -- Optional: specific paths to protect
                '/api/.*',              -- Protect all API endpoints
                '/search',              -- Protect search endpoint
            },
        }
    }
}
```

**Protect Specific Paths Only**: If `protected_paths` is omitted or empty, the challenge applies to ALL requests. To protect only expensive endpoints and leave static assets unprotected, list the paths explicitly:

```lua
-- Protect only expensive API endpoints
protected_paths = {'/api/.*', '/search'}

-- This allows static assets, images, etc. to pass through freely
-- while requiring challenge for costly operations
```

**Challenge Types**: Choose from three different challenge mechanisms:

```lua
-- Option 1: Simple button (default) - easiest for users
challenge_type = 'button'

-- Option 2: Multiple-choice question - better bot filtering
challenge_type = 'question'

-- Option 3: Proof-of-work - computational challenge, strongest protection
challenge_type = 'pow'
pow_difficulty = 4  -- Number of leading zeros (4 = ~1-3 seconds)
```

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `ban_duration` | number | 3600 | How long to ban IPs (in seconds) that trigger the honeypot |
| `token_duration` | number | 86400 | How long tokens remain valid after passing challenge (in seconds) |
| `cookie_name` | string | `aproxy_token` | Name of the validation cookie |
| `shared_dict_bans` | string | `aproxy_bans` | Name of nginx shared dict for banned IPs |
| `shared_dict_tokens` | string | `aproxy_tokens` | Name of nginx shared dict for valid tokens |
| `protected_paths` | list | `[]` (all paths) | List of PCRE regex patterns for paths to protect. If empty, all paths are protected |
| `challenge_type` | string | `button` | Type of challenge: `button`, `question`, or `pow` |
| `pow_difficulty` | number | 4 | Proof-of-work difficulty (leading zeros). Only used when `challenge_type` is `pow` |

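
To make the defaults in the table concrete, here is a small sketch of how a user config could be merged over them. `DEFAULTS`, `VALID_TYPES`, and `with_defaults` are hypothetical names for illustration; how the module actually applies its defaults is not documented here.

```lua
-- Defaults copied from the options table above.
local DEFAULTS = {
  ban_duration = 3600,
  token_duration = 86400,
  cookie_name = 'aproxy_token',
  shared_dict_bans = 'aproxy_bans',
  shared_dict_tokens = 'aproxy_tokens',
  protected_paths = {},       -- empty list: protect all paths
  challenge_type = 'button',
  pow_difficulty = 4,
}

local VALID_TYPES = { button = true, question = true, pow = true }

-- Return a new table with user settings layered over the defaults.
local function with_defaults(cfg)
  local out = {}
  for key, value in pairs(DEFAULTS) do
    out[key] = value
  end
  for key, value in pairs(cfg or {}) do
    out[key] = value
  end
  if not VALID_TYPES[out.challenge_type] then
    error('unknown challenge_type: ' .. tostring(out.challenge_type))
  end
  return out
end

-- Usage: only the overridden keys change.
local cfg = with_defaults({ challenge_type = 'pow', pow_difficulty = 5 })
print(cfg.challenge_type, cfg.pow_difficulty, cfg.ban_duration)
```

Rejecting an unknown `challenge_type` early is a design choice of this sketch; it surfaces typos at load time rather than on the first request.
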
## Special Endpoints

This module uses two special endpoints:

- `/__aproxy_challenge_verify` - Challenge form submission endpoint (POST)
- `/__aproxy_challenge_trap` - Honeypot link that bans IPs (GET)

⚠️ **Warning**: Don't create routes with these paths in your application.

## User Experience

### First Visit
1. User visits your site
2. Sees a security check page with a "Verify I'm Human" button
3. Clicks the button
4. Gets redirected to their original destination
5. Cookie is set for 24 hours (configurable)

### Subsequent Visits
- Users with valid cookies pass through immediately
- No challenge shown until cookie expires

### Bots/Scrapers
- Simple bots see the challenge page and likely fail to proceed
- HTML parsers might find and click the honeypot link → IP banned
- Sophisticated bots have to solve the challenge, slowing them down significantly

## Challenge Types

The module supports three different types of challenges, allowing you to experiment with different DDoS mitigation strategies:

### 1. Button Challenge (`challenge_type = 'button'`)

**How it works**: Users see a simple page with a "Verify I'm Human" button and click it to pass.

**Pros**:
- Easiest for legitimate users
- Minimal friction for human visitors (a single click)
- Fast (instant)

**Cons**:
- Can be bypassed by sophisticated bots that can interact with forms
- Minimal computational cost for attackers

**Best for**: General protection where UX is the priority

```lua
challenge_type = 'button'
```

### 2. Question Challenge (`challenge_type = 'question'`)

**How it works**: Users must answer a simple multiple-choice question (e.g., "What is 7 + 5?", "How many days in a week?").

**Pros**:
- Harder for simple bots to bypass
- Still easy for humans
- Moderate filtering of automated tools

**Cons**:
- Requires human interaction
- Can be annoying if cookies expire frequently
- Sophisticated bots with NLP can solve these

**Best for**: Balancing security and UX, filtering out simple scrapers

```lua
challenge_type = 'question'
```

### 3. Proof-of-Work Challenge (`challenge_type = 'pow'`)

**How it works**: The client's browser must find a nonce whose SHA-256 hash has a specific number of leading zeros. JavaScript solves this automatically in the background.

**Pros**:
- Strong protection against volumetric attacks
- Requires actual computational cost from attacker
- Transparent to user (happens automatically in ~1-3 seconds)
- Bots must burn CPU time to access your site

**Cons**:
- Requires JavaScript enabled
- Uses client CPU (battery drain on mobile)
- Slower than other methods (configurable)
- Can be bypassed by distributed attackers (but at higher cost)

**Best for**: Sites under active attack, expensive endpoints, maximum protection

```lua
challenge_type = 'pow'
pow_difficulty = 4  -- Difficulty levels:
                    -- 3 = ~0.1 seconds (light)
                    -- 4 = ~1-3 seconds (moderate, default)
                    -- 5 = ~10-30 seconds (strong)
                    -- 6 = ~few minutes (very strong)
```

**How PoW difficulty works**: The `pow_difficulty` setting determines how many leading zeros the hash must have. Each additional zero makes the challenge ~16x harder:
- Difficulty 3: Client tries ~4,000 hashes (0.1s on a modern device)
- Difficulty 4: Client tries ~65,000 hashes (1-3s)
- Difficulty 5: Client tries ~1,000,000 hashes (10-30s)

This creates real computational cost for attackers - at difficulty 4, a bot making 1000 requests/sec would need to spend 1000-3000 seconds of CPU time for every second of traffic it generates.

**Security**: The server verifies the proof-of-work by computing `SHA-256(challenge + nonce)` and checking that it has the required leading zeros. Bots cannot bypass this by submitting random nonces.

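
The difficulty figures and the server-side check can be sketched as follows. This assumes that "leading zeros" means leading zero hex digits of the digest, which is what the ~16x factor suggests; `expected_attempts`, `has_leading_zeros`, and `verify_pow` are hypothetical names, and the hash function is injected so the sketch runs in plain Lua. Under OpenResty, `sha256_hex` could be built from `resty.sha256` and `resty.string.to_hex`.

```lua
-- Average number of hashes a client must try, assuming each required
-- leading zero is one hex digit (1 in 16 chance per digit).
local function expected_attempts(difficulty)
  return 16 ^ difficulty
end

-- True when the hex digest starts with `difficulty` zero characters.
local function has_leading_zeros(hex_digest, difficulty)
  return hex_digest:sub(1, difficulty) == string.rep('0', difficulty)
end

-- Server-side check: hash challenge .. nonce and test the prefix.
-- `sha256_hex` is an injected function: string -> lowercase hex digest.
local function verify_pow(challenge, nonce, difficulty, sha256_hex)
  return has_leading_zeros(sha256_hex(challenge .. nonce), difficulty)
end

-- Usage: the counts behind the ~4,000 / ~65,000 / ~1,000,000 figures.
for difficulty = 3, 5 do
  print(difficulty, expected_attempts(difficulty))
end
```

Because the server recomputes the hash itself, a submitted nonce is only accepted if it really produces the required prefix; guessing a nonce succeeds with probability 1 in `expected_attempts(difficulty)`.
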
## Path-Based Protection

You can configure the module to protect only specific paths, which is useful for:

- **Protecting expensive endpoints** while leaving static assets unrestricted
- **Selective protection** for API routes that cause high computational cost
- **Hybrid approach** where public pages are open but authenticated/search endpoints are protected

### Example Use Cases

**Protect only API endpoints:**
```lua
protected_paths = {'/api/.*'}
-- Static assets, homepage, etc. pass through freely
-- Only /api/* routes require the challenge
```

**Protect multiple expensive operations:**
```lua
protected_paths = {
    '/api/.*',                 -- All API routes
    '/search',                 -- Search endpoint
    '/.well-known/webfinger',  -- Webfinger (can be DB-heavy)
}
```

**Protect everything (default):**
```lua
protected_paths = {}
-- OR simply omit the protected_paths config entirely
-- All requests require challenge verification
```

### Important Notes on Path Protection

1. **Special endpoints always work**: The challenge verification (`/__aproxy_challenge_verify`) and honeypot (`/__aproxy_challenge_trap`) endpoints always function regardless of the `protected_paths` configuration.

2. **IP bans are path-specific**: A banned IP can still access unprotected paths. Bans only apply to protected paths. This is intentional - you probably don't want to prevent banned IPs from loading CSS/images.

3. **Token applies everywhere**: Once a user passes the challenge for a protected path, their token is valid for ALL protected paths. They don't need to solve the challenge separately for each path.

4. **Use PCRE regex**: Patterns are PCRE regular expressions, so you can use advanced patterns like `^/api/v[0-9]+/search$` for complex matching.

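
The matching rule described in this section (empty list protects everything, otherwise any matching pattern protects the path) might look like the sketch below. `is_protected` is a hypothetical helper. Under OpenResty the matcher would typically be `ngx.re.find`, which uses PCRE; `string.find` with Lua patterns is only a stand-in so the example runs in plain Lua, and the two syntaxes differ for many patterns (for example, `-` is a quantifier in Lua patterns).

```lua
-- Returns true when `path` should go through the challenge.
-- `find` is the matcher; pass ngx.re.find under OpenResty for PCRE.
local function is_protected(path, protected_paths, find)
  find = find or string.find  -- stand-in only: Lua patterns, not PCRE
  if protected_paths == nil or #protected_paths == 0 then
    return true               -- nothing configured: protect every path
  end
  for _, pattern in ipairs(protected_paths) do
    if find(path, pattern) then
      return true
    end
  end
  return false
end

-- Usage with the patterns from the examples above.
local paths = { '/api/.*', '/search' }
print(is_protected('/api/users', paths))      -- API route: challenged
print(is_protected('/static/app.css', paths)) -- static asset: passes through
```

Whether the module anchors patterns is not stated here. In this sketch they are unanchored, so `/search` would also match `/blog/search-tips`; writing `^/search$` makes the intent explicit.
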
## Security Considerations

2. **Cookie Security**: Cookies are set with `HttpOnly` and `SameSite=Lax` flags for security. Consider adding `Secure` flag if you're running HTTPS only.

3. **Shared Dictionary Size**: Size the shared dictionaries appropriately:
   - Each banned IP takes ~100 bytes
   - Each token takes ~100 bytes
   - 10MB can store ~100,000 entries

4. **IP Address Source**: Uses `ngx.var.remote_addr`. If behind a proxy/load balancer, configure nginx to use the correct IP:
   ```nginx
   set_real_ip_from 10.0.0.0/8;  # Your proxy IP range
   real_ip_header X-Forwarded-For;
   ```

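
As an illustration of the cookie flags mentioned under Cookie Security, the sketch below builds a `Set-Cookie` value with `HttpOnly` and `SameSite=Lax`, plus an optional `Secure` flag. `build_cookie` is a hypothetical helper, and the `Path=/` and `Max-Age` attributes are assumptions; the module's actual cookie string may differ.

```lua
-- Build a Set-Cookie header value for the validation token.
local function build_cookie(name, token, max_age, secure)
  local parts = {
    name .. '=' .. token,
    'Path=/',               -- assumption: cookie valid for the whole site
    'Max-Age=' .. max_age,  -- assumption: lifetime taken from token_duration
    'HttpOnly',
    'SameSite=Lax',
  }
  if secure then
    parts[#parts + 1] = 'Secure'  -- only send the cookie over HTTPS
  end
  return table.concat(parts, '; ')
end

-- Usage: an HTTPS-only deployment would pass secure = true.
print(build_cookie('aproxy_token', 'abc', 86400, true))
```

Under OpenResty the resulting string would be assigned to `ngx.header['Set-Cookie']`.
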