Cardinality management
What cardinality is
Cardinality is the number of unique combinations of label values. Each combination is a separate time series in the TSDB. As a rough guide: 10k series is nothing, 1M hurts, and 10M brings the TSDB down.
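To make the definition concrete, here is a minimal sketch that counts series the way the definition describes: every distinct set of label values is one series. The label names and sample data are made up for illustration.

```python
def count_series(samples):
    """Count distinct label-value combinations; each one is its own time series."""
    return len({frozenset(labels.items()) for labels in samples})


# Illustrative samples only: four observations, three distinct combinations.
samples = [
    {"method": "GET", "status": "200"},
    {"method": "GET", "status": "200"},  # same combination, so same series
    {"method": "GET", "status": "500"},
    {"method": "POST", "status": "200"},
]

series = count_series(samples)
print(series)  # 3
```

Repeated observations with identical labels do not add series; only a new combination does.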
The main sources of an explosion
- user_id / session_id in labels: thousands of users × N other labels = millions of series
- HTTP path without templating: /users/123/orders instead of /users/:id/orders
- Timestamps in labels: never
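The multiplication behind the first bullet can be worked through as a rough upper bound: the worst-case series count is the product of the number of distinct values per label. The label names and counts below are hypothetical, chosen only to show the effect of adding user_id.

```python
from math import prod


def worst_case_series(values_per_label):
    """Upper bound on series: product of distinct value counts across labels."""
    return prod(values_per_label.values())


# Hypothetical counts for one HTTP request metric.
base = {"method": 5, "path": 20, "status": 6}
with_user = {**base, "user_id": 5000}

print(worst_case_series(base))       # 600
print(worst_case_series(with_user))  # 3000000
```

Real counts are usually below this bound because not every combination occurs, but a single unbounded label still multiplies everything else.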
How to find it
GET /api/v1/orgs/:slug/metrics
Then for a suspicious metric:
GET /api/v1/orgs/:slug/metrics/:name/labels
More than 1000 values per label is a red flag.
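The check against the labels endpoint could be scripted roughly as follows. This is a sketch under stated assumptions: the host in BASE_URL is a placeholder, authentication is omitted, and the response shape (a "labels" object mapping each label name to a list of its values) is a guess, since the source does not document it. Only the URL path and the 1000-value threshold come from the text above.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://metrics.example.com"  # assumption: placeholder host
THRESHOLD = 1000  # from the rule of thumb above


def fetch_json(path):
    """GET a path on BASE_URL and decode the JSON body (no auth; add as needed)."""
    with urlopen(BASE_URL + path) as resp:
        return json.load(resp)


def suspicious_labels(label_values, threshold=THRESHOLD):
    """Return {label: distinct value count} for labels above the threshold."""
    counts = {name: len(set(values)) for name, values in label_values.items()}
    return {name: n for name, n in counts.items() if n > threshold}


def audit_metric(slug, name, fetch=fetch_json):
    """Fetch the labels of one metric and flag the high-cardinality ones."""
    path = f"/api/v1/orgs/{quote(slug, safe='')}/metrics/{quote(name, safe='')}/labels"
    # Assumed response shape: {"labels": {"<label>": ["<value>", ...]}}
    body = fetch(path)
    return suspicious_labels(body["labels"])
```

Passing `fetch` as a parameter keeps the threshold logic testable without a network call; adjust the parsing once the real response format is known.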
The fix
- Template the URL path (/users/:id/...)
- Drop labels on the agent via include/exclude configuration
- Move user_id into logs, not metrics
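The first two fixes can be sketched in application code as below. This is an illustration, not the agent's actual mechanism: the rule that numeric and UUID-shaped segments become :id is an assumption about what counts as an identifier, and drop_labels only mimics the effect of an exclude list, since the agent's configuration syntax is not shown here.

```python
import re

# Assumption: a path segment is an identifier if it is all digits or UUID-shaped.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def template_path(path):
    """Replace identifier-like path segments with the placeholder :id."""
    return "/".join(
        ":id" if _ID_SEGMENT.match(segment) else segment
        for segment in path.split("/")
    )


def drop_labels(labels, exclude):
    """Return a copy of the labels without the excluded names."""
    return {k: v for k, v in labels.items() if k not in exclude}


print(template_path("/users/123/orders"))  # /users/:id/orders
print(drop_labels({"path": "/users/:id/orders", "user_id": "42"}, {"user_id"}))
# {'path': '/users/:id/orders'}
```

Where the web framework exposes the matched route pattern, using that as the label value is more reliable than regex-based templating.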