My node-exporter metrics are something like: process_cpu_seconds_total{instance="10.1.1.1:8080",job="node_info"} process_cpu_seconds_total{instance="10.1.1.2:80
Periodically I see the container Status: terminated - OOMKilled (exit code: 137). But it's scheduled to a node with plenty of memory. $ k get statefulset -n met
How can I calculate the per-second instant rate of increase of the time series in Prometheus or Grafana without using rate() or irate()? This drive function is
If I have two metrics: kube_pod_container_status_restarts_total {..., pod="my_pod_name_42", ...} and container_memory_usage_bytes {..., pod_name="my_pod_nam
I have configured Prometheus via helm chart https://github.com/helm/charts/tree/master/stable/prometheus-operator I need to update Prometheus rules and configur
Does anyone know how to send metrics from Airflow to Prometheus? I'm not finding much documentation about it. I tried the airflow operator metrics on Grafana but it d
Hi there, I'm trying to configure Kubernetes CronJob monitoring & alerts with Prometheus. I found this helpful guide, but I always get a many-to-many matc
I have a Prometheus federation with 2 Prometheus servers - one per Kubernetes cluster and a central one to rule them all. Over time the scrape durations increase.
I've configured Prometheus on CentOS; version details are as follows: prometheus-2.5.0.linux-386. I've added two targets in the prometheus.yml configuration file
I wonder if/how it's possible to add dynamic labels. I don't know the key or the quantity of the labels which I would like to add to my gauge values. What I tr
Let's suppose I have a metric purchases_total. It's a counter (which constantly increases). I would like to make a table in Grafana which: Shows the last 7 day
I have two exporters for feeding data into prometheus - the node exporter and the elasticsearch exporter. I'm trying to combine sources from both exporters into
In the prometheus configuration I have a job with these specs: - job_name: name_of_my_job scrape_interval: 5m scrape_timeout: 30s metrics_path:
I have a metric in Prometheus called unifi_devices_wireless_received_bytes_total, it represents the cumulative total amount of bytes a wireless device has recei
How can I match all Prometheus metrics except some? E.g. {__name__!~"metric_to_discard"} does not work; it returns Error executing query: parse error at char
I want to send an email alert (I have a custom template) that looks like this: Description = Disk is almost full: < 20% left Summary = Volume D: on 192.168.1
I'm trying to get my query to sum over intervals in Grafana but I get this error: "query processing would load too many samples into memory in query execution"
I am trying to chart the total number of requests each hour with Grafana and Prometheus counters. So I have a counter which gets incremented at every request http_
I have a situation where my metric is set to 0 by a program when everything works fine. I would like to treat a null value as an error value (in my case 1). The eas
I have spent 3 days reading about this, even configuring a set of containers to test them, but I have doubts. I understand that the architecture of Prometheus +