Modelling Total Requests received per hour in Prometheus and Grafana

I am trying to chart the total number of requests per hour in Grafana, using Prometheus counters.

I have a counter, http_requests, which is incremented on every request.

I am using increase(http_requests[60m]) to calculate the total requests in the last 60 minutes from a given instant T.

But this gives me a trend line, and I want a histogram with one value per clock hour.

For example:

10:00-11:00 - 100 (calculated by counter_value_at_11 - counter_value_at_10)

Now let's say the current time is 11:30. For the 11:00-12:00 bucket I want the count so far, i.e. (count_now - count_at_11).
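The bucket arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not Prometheus code: the timestamps and counter values are made up, value_at and hourly_counts are hypothetical helper names, and counter resets are ignored.

```python
from datetime import datetime, timedelta

def value_at(samples, t):
    """Return the most recent counter value recorded at or before time t."""
    latest = None
    for ts, value in samples:
        if ts <= t:
            latest = value
    return latest

def hourly_counts(samples, now):
    """Return {bucket_start: count} for each clock hour up to `now`.

    The last bucket is partial: it covers bucket_start..now.
    """
    start = samples[0][0].replace(minute=0, second=0, microsecond=0)
    counts = {}
    while start < now:
        end = min(start + timedelta(hours=1), now)
        before = value_at(samples, start)
        after = value_at(samples, end)
        if before is not None and after is not None:
            counts[start] = after - before
        start += timedelta(hours=1)
    return counts

# Made-up counter readings (time-sorted), for illustration only.
samples = [
    (datetime(2024, 1, 1, 10, 0), 500),
    (datetime(2024, 1, 1, 10, 30), 560),
    (datetime(2024, 1, 1, 11, 0), 600),
    (datetime(2024, 1, 1, 11, 30), 640),
]
counts = hourly_counts(samples, now=datetime(2024, 1, 1, 11, 30))
for bucket_start, count in counts.items():
    print(bucket_start.strftime("%H:%M"), count)
```

With these readings the 10:00 bucket holds 600 - 500 = 100 requests and the still-open 11:00 bucket holds 640 - 600 = 40.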

1.) Can counters be used to model such data?

2.) I am open to using other metric types in Prometheus if they support such modelling.



Solution 1:[1]

For a histogram-style graph, go to Visualization -> Draw Modes and enable the Bars toggle.

To bucket the data by hour, set the Min step value to "1h" in the Query section.

Solution 2:[2]

The following PromQL query returns the per-hour increase for the http_requests metric:

last_over_time(increase(http_requests[1h])[1h:1h])

This query uses a subquery to wrap increase() in last_over_time(): the [1h:1h] subquery evaluates increase() once per hour, and last_over_time() returns the most recent of those hourly values.

The returned numbers lag by one hour, e.g. the counter increase for 10:00-11:00 is shown during the next hour, 11:00-12:00. This time shift can be removed by adding offset -1h to the query:

last_over_time(increase(http_requests[1h] offset -1h)[1h:1h])

Prometheus doesn't support negative offsets by default, so this query returns the error "negative offset is disabled, use --enable-feature=promql-negative-offset to enable it" unless Prometheus runs with the --enable-feature=promql-negative-offset command-line flag. Other Prometheus-like systems, such as VictoriaMetrics, support negative offsets out of the box.
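To fetch the hourly values outside Grafana, one option is the Prometheus HTTP API endpoint /api/v1/query_range, with start and end snapped to the top of the hour and a step of 3600 seconds. The sketch below only builds the request URL and does not contact a server; the address http://localhost:9090 and the helper name hourly_range_url are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumption: Prometheus is reachable at its usual local address.
PROMETHEUS_URL = "http://localhost:9090"

def hourly_range_url(query, end, hours):
    """Build a query_range URL whose evaluation points fall on whole hours."""
    # Snap the end of the range down to the top of the hour.
    aligned_end = end.replace(minute=0, second=0, microsecond=0)
    aligned_start = aligned_end - timedelta(hours=hours)
    params = {
        "query": query,
        "start": int(aligned_start.timestamp()),
        "end": int(aligned_end.timestamp()),
        "step": 3600,  # one evaluation per hour, in seconds
    }
    return f"{PROMETHEUS_URL}/api/v1/query_range?{urlencode(params)}"

url = hourly_range_url(
    "last_over_time(increase(http_requests[1h])[1h:1h])",
    end=datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc),
    hours=3,
)
print(url)
```

The resulting URL can be requested with any HTTP client. Because both ends of the range are inclusive, a three-hour range yields four evaluation points.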

Note also that Prometheus has the following issues with the increase() function:

  • increase() over an integer counter can return fractional results because of extrapolation. See this issue for details.
  • increase(http_requests[1h]) doesn't take into account the counter increase between the last raw sample of the previous hour and the first raw sample of the current hour. See this article and this comment for details. This may result in lower-than-expected increase() results for slow-moving counters.
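Both effects can be reproduced with a rough model of increase(): take the difference between the last and first samples inside the window, then scale it up to the full window length. This is a simplified sketch, not Prometheus's actual implementation (which applies extra limits to the extrapolation and handles counter resets); the sample data and the helper names are made up for illustration.

```python
WINDOW = 3600  # one hour, in seconds

def window_points(samples, start, end):
    """Samples whose timestamp falls in (start, end]."""
    return [(t, v) for t, v in samples if start < t <= end]

def simplified_increase(samples, start, end):
    """In-window delta scaled up to the full window length (simplified model)."""
    points = window_points(samples, start, end)
    if len(points) < 2:
        return 0.0
    (first_t, first_v), (last_t, last_v) = points[0], points[-1]
    return (last_v - first_v) * (end - start) / (last_t - first_t)

# Issue 1: an integer counter scraped every 600 s, starting 300 s into the hour.
# The samples span 3000 s and rise by 4, so scaling to 3600 s gives a fraction.
busy = list(zip(range(300, 3600, 600), [100, 101, 101, 102, 103, 104]))
busy_increase = simplified_increase(busy, 0, WINDOW)

# Issue 2: a slow counter that goes from 7 to 8 between the scrapes at
# 3300 s and 3900 s, i.e. across the hour boundary at 3600 s.
slow = [(t, 7 if t <= 3300 else 8) for t in range(300, 7200, 600)]
slow_hour1 = simplified_increase(slow, 0, WINDOW)
slow_hour2 = simplified_increase(slow, WINDOW, 2 * WINDOW)
true_increase = slow[-1][1] - slow[0][1]

print("busy counter, hour 1:", busy_increase)
print("slow counter, hour 1 + hour 2:", slow_hour1 + slow_hour2)
print("slow counter, actual increase:", true_increase)
```

In this model the busy counter reports a fractional increase although it only ever moves in whole steps, and the slow counter's single increment is attributed to neither hour because it happens between the two windows' samples.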

Both issues are going to be fixed in Prometheus according to this design doc. In the meantime, other Prometheus-like systems such as VictoriaMetrics may be used; they are free from these issues.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ryosagisu
Solution 2 valyala