Calculate the maximum per-second value with Prometheus
Since I am a Prometheus newbie, I do not know how to express the following question as a query:
"What is the maximum number of messages which have been processed per second during the last day?" The metric is named messages_in_total.
I tried
max_over_time(messages_in_total{}[1d])
- but this returns the maximum of the counter value
increase(messages_in_total{}[1d])
- but this returns the amount by which the counter increased
What I really need would be something like (pseudocode)
1.) Transform the range vector, which contains absolute messages_in_total values, into a range vector which holds a per-second value.
2.) Get the max out of it
Example:
- initial range vector values = (3000, 4000, 7000, 8009)
- adjusted range vector values with rate for each second (values are guessed) = (40, 70, 40)
- max_value => 70 messages processed per second
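The two pseudocode steps can be sketched outside Prometheus in a few lines of Python. This is only an illustration of the arithmetic: the timestamps (one sample every 100 seconds) are an assumption, since the question gives counter values but no sample times, and counter resets are ignored. The resulting rates therefore differ from the guessed values in the example.

```python
# Hypothetical (timestamp_seconds, counter_value) samples. The counter values
# come from the example; the 100 s spacing is an assumption.
samples = [(0, 3000), (100, 4000), (200, 7000), (300, 8009)]

def per_second_rates(samples):
    """Step 1: turn absolute counter values into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # Increase between two neighbouring samples divided by elapsed seconds.
        rates.append((v1 - v0) / (t1 - t0))
    return rates

rates = per_second_rates(samples)
max_rate = max(rates)  # Step 2: take the maximum of the per-second rates
print(rates, max_rate)
```

With the assumed spacing, four counter samples give three per-second rates, and the largest one (30 messages per second, between the second and third sample) is the answer to the question.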
Any ideas?
Solution 1:[1]
It is possible.
Example query:
max_over_time(
irate( messages_in_total[2m] )[1d:1m]
)
This will:
- Take the last 1 day
- For every 1 minute in that 1 day range, execute
irate( messages_in_total[2m] )
- Combine the results into a range vector
- Call max_over_time on that range vector
See the subquery documentation for more information!
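The evaluation steps of the subquery can be mimicked in plain Python. The following is a rough sketch, not Prometheus code: the helper names irate_at and max_over_subquery and the sample data are made up for illustration, counter resets are ignored, and the lookbehind window is treated as (t - window, t].

```python
def irate_at(samples, t, window):
    """Rate between the last two samples in (t - window, t], or None."""
    recent = [(ts, v) for ts, v in samples if t - window < ts <= t]
    if len(recent) < 2:
        return None  # fewer than two samples: no rate can be computed
    (t0, v0), (t1, v1) = recent[-2:]
    return (v1 - v0) / (t1 - t0)

def max_over_subquery(samples, start, end, step, window):
    """Evaluate irate_at every `step` seconds in [start, end], return the max."""
    results = []
    for t in range(start, end + 1, step):
        value = irate_at(samples, t, window)
        if value is not None:
            results.append(value)
    return max(results) if results else None

# Hypothetical counter scraped every 60 s: (timestamp_seconds, value).
samples = [(0, 0), (60, 600), (120, 3000), (180, 3300), (240, 3900)]

# Rough analogue of max_over_time(irate(...[2m])[<range>:1m]) over 240 s.
peak = max_over_subquery(samples, start=0, end=240, step=60, window=120)
print(peak)
```

Here the inner function is evaluated at t = 0, 60, 120, 180 and 240. The first evaluation sees only one sample and yields nothing, and the largest of the remaining values is 40 per second, from the jump between t = 60 and t = 120.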
Solution 2:[2]
While the answer in Solution 1 returns the maximum per-second rate of the messages_in_total metric over the last 24 hours, it has the following potential issues:
- It may skip some of the raw samples if the interval between them (aka scrape_interval) is smaller than one minute. This can be fixed by reducing the step value in square brackets after the colon, so that it doesn't exceed the scrape_interval.
- It may return an empty or incomplete result if the scrape interval exceeds the 2m lookbehind window (i.e. 2 minutes). This can be fixed by increasing the lookbehind window in the inner square brackets from 2m to a value exceeding 2 x scrape_interval.
- It may become very slow and resource hungry because of subquery overhead.
- Subqueries are easy to misuse, so they may silently return unexpected results.
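The first two issues can be reproduced with a small Python simulation. This is a hedged sketch rather than Prometheus behaviour in full detail: max_irate is a hypothetical helper, the data is invented, counter resets are ignored, and the lookbehind window is treated as (t - window, t].

```python
def max_irate(samples, end, step, window):
    """Max of irate-like values evaluated every `step` seconds from 0 to `end`."""
    best = None
    for t in range(0, end + 1, step):
        recent = [(ts, v) for ts, v in samples if t - window < ts <= t]
        if len(recent) >= 2:
            (t0, v0), (t1, v1) = recent[-2:]  # only the last two samples count
            rate = (v1 - v0) / (t1 - t0)
            best = rate if best is None else max(best, rate)
    return best

# Hypothetical counter scraped every 15 s: +15 per scrape (1/s),
# except for a burst of +1500 between t=15 and t=30 (100/s).
fast = []
value = 0
for ts in range(0, 121, 15):
    value += 1500 if ts == 30 else 15
    fast.append((ts, value))

coarse = max_irate(fast, end=120, step=60, window=120)  # step > scrape interval
fine = max_irate(fast, end=120, step=15, window=120)    # step == scrape interval

# Hypothetical counter scraped every 150 s, i.e. slower than the 2m window.
slow = [(0, 0), (150, 150), (300, 300)]
empty = max_irate(slow, end=300, step=60, window=120)    # window too small
widened = max_irate(slow, end=300, step=60, window=330)  # window > 2 x 150 s

print(coarse, fine, empty, widened)
```

With a 60 s step the burst falls between evaluation points and the reported maximum is 1 per second, while a 15 s step catches the 100 per second burst. For the slowly scraped counter, the 120 s window never contains two samples, so no result is produced until the window is widened beyond twice the scrape interval.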
While Prometheus doesn't provide a reliable and easy to use solution for these issues, other Prometheus-like systems may have one. For example, the following MetricsQL query (MetricsQL is the query language of VictoriaMetrics) returns the maximum, the minimum and the average per-second increase rates of the messages_in_total time series over the last 24 hours:
rollup_rate(messages_in_total[1d])
It uses the rollup_rate function.
If you need only the maximum per-second rate, then the query can be wrapped into the label_match function, which leaves only the time series with the rollup="max" label:
label_match(
rollup_rate(messages_in_total[1d]),
"rollup", "max"
)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | valyala |