How to get the count of distinct values for a certain label in Prometheus?
We are monitoring Kafka consumers with Prometheus and Grafana by drawing a consume-rate curve per topic and partition. We noticed that the consumer for a partition may stop working because of some error. It would be convenient to add an alert, if there is a function that counts the number of distinct partitions (as a label value) that are being consumed.
Update:
We have a time series like this:
consume_rate_count{topic="my-kafka-topic",partition="0"} 320 1495164869031
consume_rate_count{topic="my-kafka-topic",partition="1"} 316 1495164869031
consume_rate_count{topic="my-kafka-topic",partition="2"} 331 1495164869031
consume_rate_count{topic="my-kafka-topic",partition="3"} 322 1495164869031
And we're looking for a way to get the count of distinct partitions with a positive consume rate. So if we get the following data, an alert will be triggered, because we have 4 partitions in total, but only 3 of them are being consumed.
consume_rate_count{topic="my-kafka-topic",partition="0"} 320 1495164869031
consume_rate_count{topic="my-kafka-topic",partition="1"} 316 1495164869031
consume_rate_count{topic="my-kafka-topic",partition="2"} 0 1495164869031
consume_rate_count{topic="my-kafka-topic",partition="3"} 322 1495164869031
Solution 1:[1]
consume_rate_count == 0
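will do it: this expression returns only the series whose value is 0, so an alerting rule on it fires for each partition that is not being consumed.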
Solution 2:[2]
The following query returns, for each topic, the number of partitions with a non-zero consume rate:
count(consume_rate_count > 0) without (partition)
The query uses the > operator and the count() aggregate function to count the time series with a non-zero value, grouping them by all the labels except partition. See these docs about the > operator.
The following query returns, for each topic, the number of partitions with a zero consume rate:
count(consume_rate_count == 0) without (partition)
The following query returns a non-empty result (e.g. fires an alert) for topics that have at least one partition with a zero consume rate and at least one partition with a non-zero consume rate:
count(consume_rate_count == 0) without (partition) > 0
and
count(consume_rate_count > 0) without (partition) > 0
This query uses the and operator, which returns time series from the left side only if there are time series with the same set of labels on the right side. See these docs for details about this operator.
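The effect of combining the two counts with and can be sketched in Python as follows. This is only an illustration of the matching logic, not Prometheus code: `count_without_partition` and `vector_and` are hypothetical helpers, and `fully-stalled-topic` is a made-up second topic added to show a case that the query deliberately does not report.

```python
from collections import Counter

# Hypothetical samples: one topic with a single stalled partition,
# and a second (made-up) topic where every partition is stalled.
samples = [
    ({"topic": "my-kafka-topic", "partition": "0"}, 320),
    ({"topic": "my-kafka-topic", "partition": "1"}, 316),
    ({"topic": "my-kafka-topic", "partition": "2"}, 0),
    ({"topic": "my-kafka-topic", "partition": "3"}, 322),
    ({"topic": "fully-stalled-topic", "partition": "0"}, 0),
    ({"topic": "fully-stalled-topic", "partition": "1"}, 0),
]


def count_without_partition(samples, predicate):
    """Mimic count(<filter>) without (partition)."""
    counts = Counter()
    for labels, value in samples:
        if predicate(value):
            group = tuple(sorted((k, v) for k, v in labels.items() if k != "partition"))
            counts[group] += 1
    return dict(counts)


def vector_and(left, right):
    """Keep left-hand entries whose label set also exists on the right."""
    return {labels: value for labels, value in left.items() if labels in right}


# Every group that exists has a count of at least 1, so the trailing
# "> 0" comparisons in the query keep all groups and are omitted here.
stalled = count_without_partition(samples, lambda v: v == 0)
consumed = count_without_partition(samples, lambda v: v > 0)
alerts = vector_and(stalled, consumed)
print(alerts)
```

Only `my-kafka-topic` survives, with the value from the left side (the number of stalled partitions). The made-up topic is dropped because it has no entry on the right side.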
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | brian-brazil |
| Solution 2 | valyala |