Category "prometheus"

Prometheus for k8s multi clusters

I have 3 Kubernetes clusters (prod, test, monitoring). I am new to Prometheus, so I have tested it by installing it in my test environment with the helm chart: #

How to write query on the elements value in prometheus

I have a metric called kube_node_status_condition in which I have elements that have value 0 or 1. I need to write the query in such a way that it will only

How to get all the metrics of an instance with prometheus api?

I want to fetch the monitored host's metrics through the Prometheus API, and I need to initiate a request for each metric requested. curl http://IP:9090/api/v1

How to set Prometheus Alertmanager external URL via configuration

I'm using a vanilla Docker container to start an Alertmanager. As far as I know, I cannot provide the external URL via parameter in this case, so I have to find

password protect prometheus access [closed]

I am using Prometheus to graph stats on my server. The problem is that anybody can access the graphs from http://my.Ip.Adress:port/index

How can I setup network traffic alerts on a Linux machine using Prometheus?

I am using Prometheus to monitor network traffic on Linux machines. I see several useful metrics like node_network_receive_bytes_total, node_network_transmit_by

PrometheusOperator Helm Chart: adding labels to default rules

I need to add a label to all default rules that come with the Helm chart. I tried setting the label under commonLabels in the values file, to no avail. I also t

How to get the count of distinct values for a certain label in prometheus?

We are monitoring kafka consumers with prometheus and grafana by drawing a consuming rate curve per topic and partition. We noticed that a consumer for some par

prometheus how to change instance name

"targets": [ "10.123.175.30:9100","10.125.150.14:9100"], "labels": { "env": "dev", "job": "node", "group": "developer" } for t

Is there a way to produce an alert when an IIS site goes down using Prometheus?

I have a few IIS sites I would like to monitor using Prometheus. Specifically monitor and alert on outages. I cannot figure out how to grab a metric when a site

Silence prometheus alerts based on label value / Ignore alerts from label

tl;dr I have a label in prometheus called "ignore" with value "yes": metric_test{label1="label1",ignore="yes"} 1 I want to disable alerts for any metrics with

Is there a way to use a Prometheus counter with a holt_winters function call?

The Prometheus documentation describes the holt_winters() function that can be used to generate a smoothed curve. However the documentation states it should

Prometheus query to return Top 5 results

I am running a disk space used query in Prometheus and would like to return only the top 5 or 10 entries from the search result. Is there any way I can achieve th

How to get the 95th percentile of an average in Prometheus?

So I'm aware of some percentile functions in PromQL like histogram_quantile which is used in a case like this: // Over the past 5 minutes, what's the maximum ht

CPU Load average rule for 5 minutes

We are using Prometheus-Grafana. Now we want to set an alert for the 5-minute CPU load average. We have 60 servers which have different CPU cores, like few machine

Prometheus and nfs storage

As per the Prometheus storage.md, the recommendation is not to use NFS storage as a persistent volume for Prometheus. But solutions like prometheus operator and ope

How to sum prometheus counters when k8s pods restart

I'm running Prometheus in a Kubernetes cluster. All is running fine and my UI pods are counting visitors. Please ignore the title, what you see here is the

Calculate success rate with prometheus when the numerator is null

We have many use cases where we want to calculate a success rate but no tasks succeeded. We would expect the success rate to be 0, but it's

Adding metric labels in prometheus on the fly

I have a counter metric in Prometheus. I want to add labels to it dynamically. For example, if my request comes in as http://abc123.com/{p1}, I want my custom_metric_na

Prometheus query quantile of pod memory usage performance

I'd like to get the 0.95 percentile memory usage of my pods from the last x time. However, this query starts to take too long if I use a 'big' (7 / 10d) range. T