# Runbook: Rate Limit by User

- Canonical URL: https://docs.fairvisor.com/docs/cookbook/rate-limit-by-user/
> Design, roll out, and operate per-user limits with minimal collateral impact.


## Purpose / When to use

Use this runbook when you need stable per-user fairness and protection against noisy clients without throttling whole tenants.

## Blast radius & risk level

- Risk level: medium
- Typical impact if misconfigured: a high 429 rate for legitimate traffic, either because many requests share one identity key value or because the descriptor is missing

## Signals / symptoms

- One user can starve shared backend capacity
- Per-IP limits look healthy, but user-level abuse still gets through
- `fairvisor_descriptor_missing_total` grows for your identity key

## Detection queries

```promql
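# Reject rate by reason over the last 5 minutes (run each query separately).
sum by (reason) (rate(fairvisor_decisions_total{action="reject"}[5m]))
# Rate of requests missing the JWT subject descriptor.
rate(fairvisor_descriptor_missing_total{key="jwt:sub"}[5m])
# Rate of requests missing the fallback header descriptor.
rate(fairvisor_descriptor_missing_total{key="header:x-user-id"}[5m])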
```

Optionally, trace the decision headers for a single request:

```bash
curl -i -X POST http://localhost:8080/v1/decision \
  -H 'X-Original-Method: GET' \
  -H 'X-Original-URI: /api/v1/items' \
  -H 'Authorization: Bearer <jwt>'
```

## Triage checklist

1. Pick the identity source (`jwt:sub` preferred, `header:x-user-id` fallback).
2. Confirm the identity field exists on all production traffic paths.
3. Confirm the gateway forwards the required headers consistently.
4. Validate that `fairvisor_descriptor_missing_total` shows no spikes for the chosen key before enforcement.
5. Confirm the route selector scope is narrow enough (avoid accidental global coverage).
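
Steps 1 and 2 can be checked offline against a sample of captured request headers before any policy exists. The sketch below is one way to do that, not Fairvisor's own extraction logic: `jwt_sub` and `presence_by_key` are hypothetical helper names, the JWT payload is decoded without signature verification, and the sample format (one header dict per request) is an assumption.

```python
import base64
import json


def jwt_sub(authorization):
    """Return the `sub` claim of a Bearer JWT, or None. No signature check."""
    if not authorization or not authorization.lower().startswith("bearer "):
        return None
    parts = authorization[7:].strip().split(".")
    if len(parts) != 3:
        return None
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    sub = claims.get("sub") if isinstance(claims, dict) else None
    return sub if isinstance(sub, str) and sub else None


def presence_by_key(samples):
    """Fraction of sampled requests that carry each candidate identity key."""
    total = len(samples)
    counts = {"jwt:sub": 0, "header:x-user-id": 0}
    for headers in samples:
        lowered = {k.lower(): v for k, v in headers.items()}
        if jwt_sub(lowered.get("authorization")):
            counts["jwt:sub"] += 1
        if lowered.get("x-user-id", "").strip():
            counts["header:x-user-id"] += 1
    return {key: (n / total if total else 0.0) for key, n in counts.items()}
```

A key whose presence is well below 1.0 on the routes you plan to cover is a sign that enforcement would hit requests without a descriptor, so fix forwarding or pick the other key first.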

## Mitigation playbook

Safe-first path:

1. Create policy with `mode: shadow` and per-user token bucket.
2. Observe would-reject volume for at least one full traffic cycle (for example, long enough to include a daily peak).
3. Tune `tokens_per_second` and `burst` until false positives are acceptable.
4. Promote to `mode: enforce` and bump `bundle_version` to a strictly higher value.

Reference policy snippet:

```json
{
  "id": "api-per-user-limit",
  "spec": {
    "mode": "shadow",
    "selector": { "pathPrefix": "/api/" },
    "rules": [
      {
        "name": "per-user-rps",
        "limit_keys": ["jwt:sub"],
        "algorithm": "token_bucket",
        "algorithm_config": {
          "tokens_per_second": 10,
          "burst": 20
        }
      }
    ]
  }
}
```

Fallback identity variant:

```json
"limit_keys": ["header:x-user-id"]
```
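
To build intuition for what `tokens_per_second` and `burst` mean when tuning, here is a minimal generic token bucket keyed by user. It is a sketch of the textbook algorithm, not Fairvisor's implementation: the class names are hypothetical, each request is assumed to cost one token, and buckets are assumed to start full.

```python
class TokenBucket:
    """Generic token bucket: refills continuously, capped at `burst`."""

    def __init__(self, tokens_per_second, burst):
        self.rate = tokens_per_second
        self.capacity = burst
        self.tokens = float(burst)  # assumption: a new bucket starts full
        self.last = None

    def allow(self, now):
        # Refill for the time elapsed since the previous request.
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # assumption: every request costs one token
            return True
        return False


class PerUserLimiter:
    """One independent bucket per identity value (for example, a JWT `sub`)."""

    def __init__(self, tokens_per_second, burst):
        self.rate = tokens_per_second
        self.burst = burst
        self.buckets = {}

    def allow(self, user, now):
        if user not in self.buckets:
            self.buckets[user] = TokenBucket(self.rate, self.burst)
        return self.buckets[user].allow(now)


limiter = PerUserLimiter(tokens_per_second=10, burst=20)
# A noisy user sends 50 requests in the same instant: only the burst passes.
noisy_allowed = sum(limiter.allow("noisy", 0.0) for _ in range(50))
# A different user is unaffected because the bucket is keyed per user.
quiet_allowed = limiter.allow("quiet", 0.0)
```

With the values from the reference policy, a single user can spend up to 20 requests at once and then sustains 10 per second, while other users keep their own full buckets. Replaying recorded per-user timestamps through a model like this is one rough way to compare candidate values before changing the shadow policy.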

## Verification checklist

1. Reject reason distribution is stable and expected.
2. `fairvisor_descriptor_missing_total` is near zero for chosen key.
3. No unexpected 429 surge on core endpoints.
4. `RateLimit-*` headers align with expected user-level buckets.
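
For step 4, it can help to pull the `RateLimit-*` fields out of a captured `curl -i` response and compare them with the configured bucket by eye. The sketch below only filters by the `RateLimit-` prefix; it does not assume which specific fields Fairvisor emits, and `ratelimit_headers` is a hypothetical helper name.

```python
def ratelimit_headers(raw_response_head):
    """Collect RateLimit-* fields from raw `curl -i` output (header part)."""
    found = {}
    lines = raw_response_head.splitlines()
    for line in lines[1:]:  # skip the HTTP status line
        if not line.strip():
            break  # a blank line ends the header section
        name, sep, value = line.partition(":")
        if sep and name.strip().lower().startswith("ratelimit-"):
            # Header names are case-insensitive, so normalise to lower case.
            found[name.strip().lower()] = value.strip()
    return found
```

Sending a few requests as the same user and watching the extracted values change between calls gives a quick check that limits are tracked per user rather than per IP or per route.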

## Exit criteria

- No sustained user-facing error regression
- Per-user fairness objective achieved
- Alert noise stays within team threshold

## Rollback / recovery path

1. Switch the policy back to `mode: shadow`.
2. If traffic is still noisy, remove the policy and redeploy the last known-good bundle.
3. Verify that the reject rate returns to its baseline.

## Post-incident notes

Record:

- chosen identity key and reason
- final threshold values
- false-positive examples
- gateway forwarding fixes applied

## Do not

- Do not enforce per-user limits before validating descriptor presence.
- Do not combine multiple new identity keys in one rollout.
- Do not rely on IP as the primary identity for authenticated APIs.

## Related docs

- [Rules & Descriptors](/docs/policy/rules/)
- [Shadow Mode](/docs/policy/shadow-mode/)
- [Decision Tracing](/docs/reference/decision-tracing/)

