Prometheus Alert Manager - CPU high not alerting

I configured Prometheus Alertmanager, but it does not alert when the CPU usage of one of my servers reaches 99%. This is the alert:

- alert: HostHighCpuLoad
  expr: avg(irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "High usage on {{ $labels.instance }}"
    description: "{{ $labels.instance }} has an average CPU idle of {{ $value }}%"

It looks like my expression takes the global average across all my servers, but I need to monitor this metric for each server individually.

Has anyone run into this problem before?



Solution 1:[1]

Yes, without a grouping clause, avg() takes the average across all instances, so one busy server is hidden by the idle ones. Change the expression to:

avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
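
For context, here is a sketch of how the full rule might look with that expression in place. The surrounding `groups` wrapper and the group name `host-cpu` are assumptions for illustration and do not come from the question. Grouping by `instance` also keeps the `instance` label on the result, so `{{ $labels.instance }}` in the annotations should resolve to the affected host rather than come out empty.

```yaml
groups:
  - name: host-cpu   # hypothetical group name, not from the question
    rules:
      - alert: HostHighCpuLoad
        # Average idle percentage over the CPU cores of each instance.
        # The condition holds when an instance is less than 30% idle,
        # i.e. more than 70% busy.
        expr: avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "{{ $labels.instance }} has an average CPU idle of {{ $value }}%"
```

If promtool is available, running `promtool check rules` against the rule file is one way to confirm the syntax before reloading Prometheus.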

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Marcelo Ávila de Oliveira