Rate Limiting and API Key Auth
Two pieces of axum middleware that turn a learning toy into something you can leave running on the public internet.
Why Both
| Concern | Without protection | With protection |
|---|---|---|
| Account creation spam | Anyone can fill accounts | API key required |
| Request flooding | One client can saturate the server | 429 after burst |
| Resource exhaustion | OOM in mempool / disk | Bounded |
| Bad actors finding it | Just hits work | Slowed; visible in logs |
For our project: API key on POST /accounts, rate limit on everything.
Tower Middleware Model
axum is built on tower — every middleware is a function that wraps a request handler.
Request flow
───────────►
┌─────────────────────────────────┐
│ CorsLayer │
│ ┌────────────────────────────┐ │
│ │ GovernorLayer (rate limit) │ │
│ │ ┌─────────────────────┐ │ │
│ │ │ require_api_key │ │ │
│ │ │ ┌───────────────┐ │ │ │
│ │ │ │ create_account│ │ │ │
│ │ │ └───────────────┘ │ │ │
│ │ └─────────────────────┘ │ │
│ └────────────────────────────┘ │
└─────────────────────────────────┘
A request enters from the outside (CORS) and unwinds to the handler in the middle. Each layer can:
- Inspect the request
- Reject (return early without calling
next.run) - Modify request before passing on
- Modify response after the handler returns
Two Styles of Middleware
1. Function style (middleware::from_fn)
The simplest. A regular async fn:
async fn require_api_key(
headers: HeaderMap,
request: Request,
next: Next,
) -> Result<Response, StatusCode> {
// ... check ...
if ok {
Ok(next.run(request).await) // call the inner handler
} else {
Err(StatusCode::UNAUTHORIZED) // bail out
}
}
// Apply:
.layer(middleware::from_fn(require_api_key))
The function takes extractors (like HeaderMap) plus the special Next.
2. Tower service style
For complex stuff (state, configuration, lifetimes). You implement tower::Service. We don’t use this here, but GovernorLayer is one example — it comes pre-built so you just configure and add it.
API Key Auth Implementation
async fn require_api_key(
headers: HeaderMap,
request: Request,
next: Next,
) -> Result<Response, StatusCode> {
let expected = std::env::var("API_KEY").unwrap_or_default();
// No API_KEY env → skip auth (dev mode)
if expected.is_empty() {
return Ok(next.run(request).await);
}
let provided = headers
.get("x-api-key")
.and_then(|v| v.to_str().ok())
.unwrap_or("");
if provided == expected {
Ok(next.run(request).await)
} else {
Err(StatusCode::UNAUTHORIZED)
}
}
Key points:
- Read env at request time, not startup. Lets you change keys without restart.
- Empty env → skip. Dev mode without auth, prod with
API_KEY=.... - Constant-time compare would be more secure (
constant_time_eqcrate). For our learning toy, plain==is fine — but in production, timing attacks can leak the key character by character.
Per-Route vs Global
// Global — applies to all routes
.layer(middleware::from_fn(require_api_key))
// Per-route — only this method on this path
.route("/accounts", post(create_account)
.layer(middleware::from_fn(require_api_key)))
// Sub-router with shared middleware
let admin = Router::new()
.route("/accounts", post(create_account))
.route_layer(middleware::from_fn(require_api_key));
let app = Router::new().merge(admin)...;
For us: only POST /accounts requires a key. POST /tx doesn’t, because the signature itself is the auth.
Rate Limiting with tower-governor
Token bucket algorithm:
[●●●●●] initial bucket (capacity = burst_size)
│
▼ each request takes 1 token
[●●●●·]
[●●●··]
[●●···]
↻ +1 token added every (1 / per_second) seconds
If bucket is empty: 429 Too Many Requests.
Configuration
use tower_governor::{governor::GovernorConfigBuilder, GovernorLayer};
let conf = std::sync::Arc::new(
GovernorConfigBuilder::default()
.per_second(2) // refill rate
.burst_size(5) // max bucket size
.finish()
.unwrap()
);
let governor_layer = GovernorLayer::new(conf);
Behavior:
- First 5 requests instant (drain the bucket)
- 6th request blocked (bucket empty)
- After ~3s: bucket refills to 5
- Steady state: max 2 req/sec sustained
Connection Info Injection
tower-governor keys the bucket on client IP by default. To get the IP, axum needs to be told to expose it:
axum::serve(
listener,
app.into_make_service_with_connect_info::<std::net::SocketAddr>(),
)
.await
.unwrap();
Without this, tower-governor returns 500 “Unable To Extract Key!”.
into_make_service_with_connect_info::<SocketAddr>() injects the client SocketAddr into request extensions; governor pulls it from there.
Alternative key extractors
use tower_governor::key_extractor::GlobalKeyExtractor;
GovernorConfigBuilder::default()
.key_extractor(GlobalKeyExtractor) // one bucket for everyone
.finish()
Useful for:
- Dev/testing where IP detection is fiddly
- Behind a load balancer where every “client IP” looks the same (you’d want to extract from
X-Forwarded-Forinstead)
For prod behind a reverse proxy, you typically:
- Read
X-Forwarded-Forheader - Use a custom
KeyExtractor
Choosing Limits
| Endpoint kind | Sensible default |
|---|---|
| Public read API | per_second 60, burst 100 |
| Public write API | per_second 10, burst 20 |
| Auth endpoint | per_second 1, burst 5 |
| Internal/admin | None (auth handles it) |
For our learning project: 2/s + burst 5 is too tight (the playground hits /params + /tx + /accounts/X in quick succession). For real demo: bump to ~20/s + burst 50.
Layer Order Matters
let app = Router::new()
.route(...)
.layer(governor_layer) // applied AFTER cors
.layer(cors); // applied FIRST (outermost)
Reading: cors is outermost — it sees the request first, handles OPTIONS preflight, and sets CORS headers on responses.
Why: if rate-limit blocked the OPTIONS preflight, browsers would never make the actual request, and you’d never see CORS errors. CORS first lets browsers correctly diagnose CORS problems.
General order:
- Outermost: connection-level concerns — CORS, request ID, tracing
- Middle: rate limiting / quota
- Inner: auth, parsing
- Innermost: handler
Testing
# Auth (with API_KEY=secret123 set)
curl -i -X POST http://localhost:3000/accounts \
-H "Content-Type: application/json" \
-d '{"id":1,"balance":1,"pubkey":"1"}'
# → 401 Unauthorized
curl -i -X POST http://localhost:3000/accounts \
-H "Content-Type: application/json" \
-H "x-api-key: secret123" \
-d '{"id":1,"balance":1,"pubkey":"1"}'
# → 200 / "account 1 created"
# Rate limit
for i in {1..10}; do
curl -s -o /dev/null -w "$i: %{http_code} " http://localhost:3000/health
done
# 1: 200 2: 200 3: 200 4: 200 5: 200 6: 429 7: 429 ...
Production Concerns
We’ve skipped:
| Concern | What you’d add |
|---|---|
| Constant-time auth compare | constant_time_eq crate |
| Multiple API keys / rotation | DB lookup or HMAC-based tokens |
| OAuth / JWT | jsonwebtoken crate |
X-Forwarded-For for real IP | Custom KeyExtractor |
| Per-endpoint limits | Multiple route_layer |
| Distributed rate limit (multi-instance) | Redis + tower-redis-rate-limit |
| Burst protection vs sustained protection | Two layered limiters |
429 with Retry-After header | Already done by tower-governor |
| Auth audit log | Inside the middleware fn, log success/fail |
These all fit naturally into the same tower::Service model — additive, not destructive. That’s why people like axum/tower.
Rust Skills Reinforced
async fnmiddleware —from_fnto register a function as a layerNext— abstraction over “the rest of the request pipeline”HeaderMapextractor — type-safe header accessResult<Response, StatusCode>— early-return as 4xx/5xxstd::sync::Arc<T>— share the governor config across requests (it’s read-only state)- Layer ordering — onion model, last
.layer()is innermost - Per-route vs global layering —
.layer()vs.route_layer() into_make_service_with_connect_info::<T>()— opt into peer info injection
Mental Model
Middleware = a function that wraps the handler.
Next.run(request)= “call the inner thing, give me back its response.” Layer order = onion: last.layer(x)is outermost. Per-route layer =.route(path, handler.layer(x)).
That covers 95% of real axum middleware code.