Understanding cardinality is crucial for building effective observability systems. This concept, borrowed from set theory, directly impacts the performance and cost of your monitoring infrastructure.
The Mathematical Origin
Cardinality comes from set theory and refers to the number of distinct elements in a set. In mathematics, the cardinality of a set A is denoted as |A|.
Simple Example
Consider a metric with two dimensions: service and status.
Initial state:
service: {api, web, db} → 3 values
status: {success, error} → 2 values
Cardinality = 3 × 2 = 6 unique metric series
Now add a high-cardinality dimension:
service: {api, web, db} → 3 values
status: {success, error} → 2 values
user_id: {user1, user2, …, user1000} → 1000 values
New cardinality = 3 × 2 × 1000 = 6,000 unique metric series
Each unique combination of dimension values creates a separate time series that must be tracked, stored, and queried.
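The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `max_series` and `enumerate_series` are made up for this example, and the result is an upper bound, since a real system only stores the combinations that actually occur.

```python
from itertools import product
from math import prod

# Dimension names and values taken from the example above.
dimensions = {
    "service": ["api", "web", "db"],
    "status": ["success", "error"],
}


def max_series(dims):
    """Upper bound on series count: the product of each dimension's value count."""
    return prod(len(values) for values in dims.values())


def enumerate_series(dims):
    """Yield every label combination as a dict, one per potential time series."""
    names = list(dims)
    for combo in product(*dims.values()):
        yield dict(zip(names, combo))


print(max_series(dimensions))  # 6

# Adding a 1000-value user_id dimension multiplies the bound by 1000.
with_users = {**dimensions, "user_id": [f"user{i}" for i in range(1, 1001)]}
print(max_series(with_users))  # 6000
```

Calling `list(enumerate_series(dimensions))` returns the six label sets themselves, which makes it easy to see what each series would be keyed on.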
Impact on Observability
Memory and Storage
High-cardinality metrics consume far more resources, because memory and storage grow roughly in proportion to the number of series:
- Low cardinality (6 series): Minimal memory, fast queries
- High cardinality (6,000 series): Roughly 1,000x more memory, slower queries
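A back-of-the-envelope estimate makes the comparison concrete. The sketch below assumes memory scales linearly with series count, and the per-series figure is an illustrative placeholder, not a measurement of any particular backend. Substitute a value observed on your own system.

```python
def estimate_memory_bytes(series_count, bytes_per_series):
    """Rough estimate: total memory is series count times per-series overhead."""
    return series_count * bytes_per_series


# Illustrative placeholder only; real overhead depends on the backend.
ASSUMED_BYTES_PER_SERIES = 4_000

low = estimate_memory_bytes(6, ASSUMED_BYTES_PER_SERIES)
high = estimate_memory_bytes(6_000, ASSUMED_BYTES_PER_SERIES)

print(high // low)  # 1000
```

Whatever per-series value is used, the ratio between the two cases stays at 1,000, which is the point of the comparison.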
Query Performance
As cardinality grows, query performance degrades, because each query has to locate and read more series.
Best Practices
- Avoid high-cardinality dimensions: Don't use user_id, request_id, or timestamps as labels
- Use aggregation: Group similar events together (e.g., by status instead of individual errors)
- Monitor cardinality: Track the number of unique series per metric
- Use logs for high-cardinality data: Save detailed, unique identifiers for log analysis, not metrics
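Two of these practices, dropping high-cardinality labels and tracking series per metric, can be sketched as follows. The function names, the `requests_total` metric name, and the sample events are all hypothetical; a real pipeline would take its samples from the metrics client or backend.

```python
from collections import defaultdict

# Hypothetical label names treated as too high-cardinality for metrics.
HIGH_CARDINALITY_LABELS = {"user_id", "request_id", "timestamp"}


def metric_labels(event):
    """Keep only the labels that are safe to use as metric dimensions."""
    return {k: v for k, v in event.items() if k not in HIGH_CARDINALITY_LABELS}


def series_per_metric(samples):
    """Count the distinct label sets seen for each metric name."""
    seen = defaultdict(set)
    for name, labels in samples:
        seen[name].add(frozenset(labels.items()))
    return {name: len(label_sets) for name, label_sets in seen.items()}


events = [
    {"service": "api", "status": "error", "user_id": "user1"},
    {"service": "api", "status": "error", "user_id": "user2"},
    {"service": "web", "status": "success", "user_id": "user3"},
]

raw = series_per_metric(("requests_total", e) for e in events)
safe = series_per_metric(("requests_total", metric_labels(e)) for e in events)

print(raw)   # {'requests_total': 3}
print(safe)  # {'requests_total': 2}
```

The dropped identifiers are not lost if the full event is also written to a log, which is where per-user or per-request detail belongs.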
The Cardinality Explosion Problem
When a dimension has unbounded or very large value sets (like user IDs, request IDs, or IP addresses), you create a cardinality explosion:
- Every new unique value adds another series for each combination of the other dimensions
- Most series receive only a single data point (no aggregation benefit)
- Storage and query costs grow linearly with the number of unique values
Key insight: Metrics should aggregate, not enumerate. If you need to track every unique occurrence, use logs or traces instead.
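A small simulation illustrates the difference between aggregating and enumerating. The events here are synthetic (1000 requests, each from a distinct made-up user, with every tenth request failing), and `samples_per_series` is a hypothetical helper written for this sketch.

```python
from collections import Counter

# Synthetic events: 1000 requests, one per hypothetical user, 1 in 10 failing.
events = [
    {
        "service": "api",
        "status": "success" if i % 10 else "error",
        "user_id": f"user{i}",
    }
    for i in range(1000)
]


def samples_per_series(events, label_names):
    """Count how many events land in each series for the chosen labels."""
    return Counter(tuple(e[name] for name in label_names) for e in events)


with_user = samples_per_series(events, ["service", "status", "user_id"])
without_user = samples_per_series(events, ["service", "status"])

# With user_id as a label, every series holds exactly one event.
print(len(with_user), max(with_user.values()))  # 1000 1

# Without it, the same events collapse into two meaningful series.
print(len(without_user))                 # 2
print(without_user[("api", "error")])    # 100
```

With `user_id` as a label the metric is effectively a list of events. Without it, the error count is directly readable from a single series.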