Understanding how to configure dimensions in OpenTelemetry’s SpanMetrics connector is crucial for creating effective, efficient metrics from spans. This guide explains the fundamental concepts and best practices for avoiding common pitfalls like cardinality explosion.
What is a Metric Series?
A metric series (also called a time series) is a unique stream of metric data points identified by:
- A metric name (e.g., ci_span_metrics_calls)
- A unique combination of dimension labels (e.g., {service.name="api", ci.job.name="build", ci.status="success"})
Example: Understanding Metric Series
Metric name: ci_span_metrics_calls
Three different metric series for the same metric:
Series 1: ci_span_metrics_calls{service.name="api", ci.job.name="build", ci.status="success"}
Series 2: ci_span_metrics_calls{service.name="api", ci.job.name="build", ci.status="failed"}
Series 3: ci_span_metrics_calls{service.name="api", ci.job.name="test", ci.status="success"}
Even though they share the same metric name, each combination of label values creates a separate series that tracks its own count over time.
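To make series identity concrete, here is a minimal sketch that derives one canonical key per series from the metric name and its labels. The seriesKey helper is hypothetical and written only for illustration; it is not how spanmetricsconnector builds its keys internally.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey is a hypothetical helper: it builds a canonical identity for a
// metric series from the metric name and its labels. Label names are sorted
// so that map iteration order does not change the result.
func seriesKey(name string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return name + "{" + strings.Join(pairs, ", ") + "}"
}

func main() {
	// The three label sets from the example above.
	labelSets := []map[string]string{
		{"service.name": "api", "ci.job.name": "build", "ci.status": "success"},
		{"service.name": "api", "ci.job.name": "build", "ci.status": "failed"},
		{"service.name": "api", "ci.job.name": "test", "ci.status": "success"},
	}

	seen := map[string]struct{}{}
	for _, labels := range labelSets {
		key := seriesKey("ci_span_metrics_calls", labels)
		seen[key] = struct{}{}
		fmt.Println(key)
	}
	fmt.Println("distinct series:", len(seen))
}
```

Changing any single label value yields a different key, and therefore a different series.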
How Series Work with SpanMetrics
When a span arrives, spanmetricsconnector:
- Extracts the dimension values from the span
- Builds a unique key from ALL dimension values
- Finds or creates the metric series for that key
- Increments the counter for that series by 1
// Build a unique key from the service name, the span, and all configured dimensions.
key := p.buildKey(serviceName, span, callsDimensions, resourceAttr)
// Attributes are passed as a function, so GetOrCreate can build them on demand.
attributesFun := func() pcommon.Map {
	return p.buildAttributes(serviceName, span, resourceAttr, callsDimensions, ils.Scope())
}

// aggregate sums metrics
s, limitReached := sums.GetOrCreate(key, attributesFun, startTimestamp)
if !limitReached && p.config.Exemplars.Enabled && !span.TraceID().IsEmpty() {
	s.AddExemplar(span.TraceID(), span.SpanID(), duration)
}
// Increment the call counter of this series.
s.Add(1)
Good Aggregation (Proper Dimensions)
Span 1: {job="build", status="success"} → Series A count: 1
Span 2: {job="build", status="success"} → Series A count: 2 ✅ Aggregated!
Span 3: {job="build", status="success"} → Series A count: 3 ✅ Aggregated!
Result: 1 series with meaningful count
Bad Aggregation (With Timestamps)
Span 1: {job="build", started_at="100"} → Series A count: 1
Span 2: {job="build", started_at="101"} → Series B count: 1 ❌ New series!
Span 3: {job="build", started_at="102"} → Series C count: 1 ❌ New series!
Result: 3 series, each with count 1 (no aggregation)
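The find-or-create-then-increment flow can be reproduced with a plain map. This is a simplified sketch, not the connector's implementation: the span and aggregator types and the record method are hypothetical stand-ins, and the key format is an assumption chosen for readability.

```go
package main

import (
	"fmt"
	"strings"
)

// span is a simplified stand-in for a trace span: just a bag of attributes.
type span map[string]string

// aggregator keeps one counter per unique combination of dimension values.
type aggregator struct {
	dimensions []string
	counts     map[string]int
}

func newAggregator(dimensions ...string) *aggregator {
	return &aggregator{dimensions: dimensions, counts: map[string]int{}}
}

// record extracts the configured dimensions from the span, builds a key from
// ALL of them, and increments the counter stored under that key. A missing
// key starts at zero, so the map lookup doubles as "find or create".
func (a *aggregator) record(s span) {
	parts := make([]string, 0, len(a.dimensions))
	for _, d := range a.dimensions {
		parts = append(parts, d+"="+s[d])
	}
	a.counts[strings.Join(parts, ",")]++
}

func main() {
	spans := []span{
		{"job": "build", "status": "success", "started_at": "100"},
		{"job": "build", "status": "success", "started_at": "101"},
		{"job": "build", "status": "success", "started_at": "102"},
	}

	good := newAggregator("job", "status")    // bounded dimensions
	bad := newAggregator("job", "started_at") // per-span unique dimension

	for _, s := range spans {
		good.record(s)
		bad.record(s)
	}

	fmt.Println("series with job+status:    ", len(good.counts))
	fmt.Println("series with job+started_at:", len(bad.counts))
}
```

The same three spans collapse into one series under the first configuration and fan out into three under the second, because the timestamp makes every key unique.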
Cardinality Impact
The number of possible metric series (cardinality) is the product of the number of distinct values each dimension can take:
# Example with good dimensions:
10 jobs × 3 statuses × 5 projects = 150 possible series ✅ Manageable
# Example with timestamp dimensions:
10 jobs × 3 statuses × 1000 unique timestamps = 30,000 series ❌ Explosion!
High cardinality causes:
- Excessive memory usage
- Poor query performance
- Storage problems in metrics backends
- Most series having count = 1 (no aggregation benefit)
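The cardinality arithmetic above is easy to check before rolling out a configuration. The sketch below assumes you can estimate the number of distinct values per dimension; maxSeries is a hypothetical helper, and the result is an upper bound, since not every combination necessarily occurs in practice.

```go
package main

import "fmt"

// maxSeries returns the upper bound on the number of series for one metric:
// the product of the number of distinct values of each dimension.
func maxSeries(distinctValues map[string]int) int {
	total := 1
	for _, n := range distinctValues {
		total *= n
	}
	return total
}

func main() {
	good := map[string]int{"job": 10, "status": 3, "project": 5}
	bad := map[string]int{"job": 10, "status": 3, "started_at": 1000}

	fmt.Println("good dimensions:     ", maxSeries(good)) // 150
	fmt.Println("timestamp dimensions:", maxSeries(bad))  // 30000
}
```

A dimension with an unbounded value set, such as a timestamp or an ID, makes this product grow with traffic instead of staying fixed.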
Visual Impact: Aggregation vs No Aggregation
✅ WITH PROPER DIMENSIONS (Good Aggregation)
10 Spans arrive with: {job="build", status="success"}
Span 1 ──┐
Span 2 ──┤
Span 3 ──┤
Span 4 ──┼──► Series A: {job="build", status="success"} → COUNT: 10
Span 5 ──┤
Span 6 ──┤
Span 7 ──┤
Span 8 ──┤
Span 9 ──┤
Span 10 ─┘
Result: 1 metric series with value 10
- ✅ Useful for queries: “How many build successes?”
- ✅ Storage: 1 time series to track
- ✅ Memory: Minimal
❌ WITH TIMESTAMP DIMENSIONS (No Aggregation)
Same 10 Spans, but each has unique timestamp:
Span 1 ──► Series A: {job="build", started_at="100"} → COUNT: 1
Span 2 ──► Series B: {job="build", started_at="101"} → COUNT: 1
Span 3 ──► Series C: {job="build", started_at="102"} → COUNT: 1
Span 4 ──► Series D: {job="build", started_at="103"} → COUNT: 1
Span 5 ──► Series E: {job="build", started_at="104"} → COUNT: 1
Span 6 ──► Series F: {job="build", started_at="105"} → COUNT: 1
Span 7 ──► Series G: {job="build", started_at="106"} → COUNT: 1
Span 8 ──► Series H: {job="build", started_at="107"} → COUNT: 1
Span 9 ──► Series I: {job="build", started_at="108"} → COUNT: 1
Span 10 ─► Series J: {job="build", started_at="109"} → COUNT: 1
Result: 10 metric series, each with value 1
- ❌ Useless for queries: “How many build successes?”
- ❌ Storage: 10 time series to track (10x overhead!)
- ❌ Memory: Grows linearly with span count
Over time with 1000 spans:
- 1000 series, each with count 1 ❌
- Instead of 1 series with count 1000 ✅
Impact Over Time
The figures below assume a steady rate of 100 spans per minute.
| Time | Good Dimensions | Bad Dimensions (Timestamps) |
|---|---|---|
| Minute 1 | 1 series, count=100 | 100 series, count=1 each |
| Minute 2 | 1 series, count=200 | 200 series, count=1 each |
| Minute 3 | 1 series, count=300 | 300 series, count=1 each |
| Day 1 | 1 series, count=144k | 144,000 series, count=1 each ⚠️ |
| Month 1 | 1 series, count=4.3M | 4,300,000 series, count=1 each 💥 |
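The figures in the table can be reproduced with a short projection. This sketch assumes a steady 100 spans per minute (inferred from the table's first row) and a 30-day month; seriesCount is a hypothetical helper that models only the two extremes shown in the table.

```go
package main

import "fmt"

// seriesCount estimates how many series exist after the given number of
// minutes. With a per-span unique dimension (such as a start timestamp),
// every span creates its own series; otherwise all spans share one series.
func seriesCount(minutes, spansPerMinute int, uniquePerSpan bool) int {
	if minutes <= 0 || spansPerMinute <= 0 {
		return 0
	}
	if uniquePerSpan {
		return minutes * spansPerMinute
	}
	return 1
}

func main() {
	const spansPerMinute = 100 // assumed rate, matching the table

	rows := []struct {
		label   string
		minutes int
	}{
		{"Minute 1", 1},
		{"Minute 2", 2},
		{"Minute 3", 3},
		{"Day 1", 24 * 60},
		{"Month 1", 30 * 24 * 60},
	}

	for _, r := range rows {
		fmt.Printf("%-9s good=%d series, timestamps=%d series\n",
			r.label,
			seriesCount(r.minutes, spansPerMinute, false),
			seriesCount(r.minutes, spansPerMinute, true))
	}
}
```

With bounded dimensions the series count stays constant while only the counter grows; with a timestamp dimension the series count itself tracks the total number of spans.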
Key Insight: With timestamps as dimensions, you’re creating a new metric series for every single span instead of aggregating them. You lose all the benefits of metrics!