Nginx Ingress Controller - Failed Calling Webhook

I set up a k8s cluster using kubeadm (v1.18) on an Ubuntu virtual machine. Now I need to add an Ingress Controller. I decided on nginx (but I'm open to other solutions). I installed it according to the docs, section "bare-metal":

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.31.1/deploy/static/provider/baremetal/deploy.yaml

The installation seems fine to me:

kubectl get all -n ingress-nginx

NAME                                            READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-b8smg        0/1     Completed   0          8m21s
pod/ingress-nginx-admission-patch-6nbjb         0/1     Completed   1          8m21s
pod/ingress-nginx-controller-78f6c57f64-m89n8   1/1     Running     0          8m31s

NAME                                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-controller             NodePort    10.107.152.204   <none>        80:32367/TCP,443:31480/TCP   8m31s
service/ingress-nginx-controller-admission   ClusterIP   10.110.191.169   <none>        443/TCP                      8m31s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           8m31s

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-78f6c57f64   1         1         1       8m31s

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           2s         8m31s
job.batch/ingress-nginx-admission-patch    1/1           3s         8m31s

However, when trying to apply a custom Ingress, I get the following error:

Error from server (InternalError): error when creating "yaml/xxx/xxx-ingress.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: Temporary Redirect

Any idea what could be wrong?

I suspected DNS, but other NodePort services are working as expected and DNS works within the cluster.

The only thing I can see is that I don't have a default-http-backend which is mentioned in the docs here. However, this seems normal in my case, according to this thread.

Last but not least, I also tried the installation with manifests (after removing the ingress-nginx namespace from the previous installation) and the installation via Helm chart. Both gave the same result.

I'm pretty much a beginner on k8s and this is my playground-cluster. So I'm open to alternative solutions as well, as long as I don't need to set up the whole cluster from scratch.

Update: With "applying custom Ingress", I mean: kubectl apply -f <myIngress.yaml>

Content of myIngress.yaml

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /someroute/fittingmyneeds
        pathType: Prefix
        backend:
          serviceName: some-service
          servicePort: 5000


Solution 1:[1]

I am not sure if this still helps this late, but could it be that your cluster is behind a proxy? In that case you have to have no_proxy configured correctly. Specifically, it has to include .svc,.cluster.local; otherwise validation webhook requests such as https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s will be routed via the proxy server (note the .svc in the URL).

I had exactly this issue and adding .svc to the no_proxy variable helped. You can try this out quickly by modifying the /etc/kubernetes/manifests/kube-apiserver.yaml file, which will in turn automatically recreate your Kubernetes API server pod.

This applies not just to ingress validation, but also to anything else that refers to a URL in your cluster ending with .svc or .namespace.svc.cluster.local (e.g. see this bug).
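
To see whether a given no_proxy value would actually exclude the webhook host, a rough suffix check can help. The sketch below is plain POSIX shell; the function name no_proxy_covers is made up for illustration, and the matching rule (exact match, or the host ends with the entry as a dot-separated suffix) is a simplification of how common HTTP clients interpret no_proxy, so treat the result as a hint only.

```shell
# Hypothetical helper: succeeds if HOST ($1) is covered by the comma-separated
# no_proxy LIST ($2). Matching is simplified: exact match or domain-suffix match.
# The body runs in a subshell so IFS and globbing changes do not leak out.
no_proxy_covers() (
  host=$1
  set -f        # disable globbing while the list is split
  IFS=', '      # split on commas, dropping surrounding spaces
  for entry in $2; do
    [ -n "$entry" ] || continue
    bare=${entry#.}   # ".svc" and "svc" are treated alike
    case "$host" in
      "$entry"|*".$bare") exit 0 ;;
    esac
  done
  exit 1
)
```

For example, no_proxy_covers ingress-nginx-controller-admission.ingress-nginx.svc "localhost,127.0.0.1" reports no match, which is the situation where the webhook call would go through the proxy. The value to test is the one set in the kube-apiserver manifest, not necessarily the one in your login shell.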

Solution 2:[2]

Another option you have is to remove the Validating Webhook entirely:

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

I found I had to do that on another issue, but the workaround/solution works here as well.

This isn't the best answer; the best answer is to figure out why this doesn't work. But at some point, you live with workarounds.

I'm installing on Docker for Mac, so I used the cloud rather than baremetal version:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.34.1/deploy/static/provider/cloud/deploy.yaml

Solution 3:[3]

In my case I'd mixed the installations up. I resolved the issue by executing the following steps:

$ kubectl get validatingwebhookconfigurations 

I iterated through the list of configurations returned by the above command and deleted each one using

$ kubectl delete validatingwebhookconfigurations [configuration-name]
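
When it is unclear which configurations are leftovers, it can help to list each one next to the Service it calls before deleting anything. This is a minimal sketch using kubectl's jsonpath output; the function name list_webhook_targets is hypothetical.

```shell
# Hypothetical helper: print every ValidatingWebhookConfiguration followed by
# the namespace/name of the Service(s) its webhooks are configured to call.
list_webhook_targets() {
  kubectl get validatingwebhookconfigurations \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.webhooks[*].clientConfig.service.namespace}{"/"}{.webhooks[*].clientConfig.service.name}{"\n"}{end}'
}
```

Entries that point at a Service or namespace which no longer exists are the likely candidates for a stale configuration.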

Solution 4:[4]

In my case I didn't need to delete the ValidatingWebhookConfiguration. The issue was that I was using a private cluster on GCP, version 1.17.14-gke.1600. If I understood it correctly, on a default Kubernetes installation the validating webhook API (which of course is running on the master node) is exposed on port 443. On GCP, the port was changed to 8443 for security reasons: to bind port 443, the service needs root access to the node, and they didn't want that. Since a private cluster only allows ports 80/443 externally for Ingress on the nodes (that is, all the nodes will only accept requests to these ports), when Kubernetes tries to validate your Ingress against validatingwebhook-address:8443 it fails; it would not fail if the webhook ran on 443. This thread contains more detailed information.

So the current workaround, as recommended by Google itself (but very poorly documented), is to add a firewall rule on GCP that allows inbound (Ingress) TCP requests to your master node on port 8443, so that the other nodes within the cluster can reach the master for the validating webhook API running on it on that port.

As to how to create the rule, this is how I did it:

  1. Went to Firewall Rules and added a new one.
  2. At the field Network I selected the VPC from which my cluster is.
  3. Direction of traffic I set as Ingress
  4. Action on match to Allow
  5. Targets to Specified target tags
  6. The Target tags can be found on the master node details in a property called Network tags. To find it, I opened a new window, went to my cluster node pools, found the master node pool. Then entered one of the nodes to look for the Virtual Machine details. There I found Network Tags. Copied its value and went back to the Firewall Rule form.
  7. Pasted the copied network tag to the tag field
  8. At Protocols and ports, checked the Specified protocols and ports
  9. Then checked TCP and placed 8443
  10. Saved the rule and applied the manifest again.
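
The console steps above map roughly onto a single gcloud call. The sketch below only prints the command so it can be reviewed before running; the function name firewall_rule_cmd and all four arguments are placeholders. The source range is an assumption that is not part of the steps above: the ingress-nginx deployment notes describe this rule as letting the control plane reach port 8443 on the nodes, so the cluster's control-plane CIDR is the likely value.

```shell
# Hypothetical helper: print (do not run) a gcloud command mirroring the
# console steps. Arguments, all supplied by the caller:
#   $1 rule name, $2 VPC network, $3 network tag, $4 source CIDR (assumption)
firewall_rule_cmd() {
  printf 'gcloud compute firewall-rules create %s --network %s --direction INGRESS --action ALLOW --rules tcp:8443 --target-tags %s --source-ranges %s\n' \
    "$1" "$2" "$3" "$4"
}
```

Calling firewall_rule_cmd allow-webhook-8443 my-vpc my-node-tag 172.16.0.0/28 prints the full command; every value in that call is invented for illustration.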

NOTE: Most threads out there will say it's port 9443. It may work, but I first tried 8443 since it was reported to work in this thread. It worked for me, so I didn't even try 9443.

Solution 5:[5]

I've solved this issue. The problem is that you are using Kubernetes version 1.18, but the ValidatingWebhookConfiguration in the current ingress-nginx uses the older API; see the doc: https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites

Ensure that the Kubernetes cluster is at least as new as v1.16 (to use admissionregistration.k8s.io/v1), or v1.9 (to use admissionregistration.k8s.io/v1beta1).

And in the current YAML:

 # Source: ingress-nginx/templates/admission-webhooks/validating-webhook.yaml
    # before changing this value, check the required kubernetes version
    # https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites
apiVersion: admissionregistration.k8s.io/v1beta1

and in the rules:

apiVersions:
          - v1beta1

So you need to change it to v1:

apiVersion: admissionregistration.k8s.io/v1

and add v1 to the rule:

apiVersions:
          - v1beta1
          - v1

After you change it and redeploy, your custom Ingress will deploy successfully.

Solution 6:[6]

Finally, I managed to run Ingress Nginx properly by changing the way I installed it. I still don't understand why the previous installation didn't work, but I'll nevertheless share the solution along with some more insights into the original problem.

Solution

Uninstall ingress nginx: Delete the ingress-nginx namespace. This does not remove the validating webhook configuration - delete this one manually. Then install MetalLB and install ingress nginx again. I now used the version from the Helm stable repo. Now everything works as expected. Thanks to Long on the kubernetes slack channel!

Some more insights into the original problem

The YAML files provided by the installation guide contain a ValidatingWebhookConfiguration:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    helm.sh/chart: ingress-nginx-2.0.3
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.32.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  name: ingress-nginx-admission
  namespace: ingress-nginx
webhooks:
  - name: validate.nginx.ingress.kubernetes.io
    rules:
      - apiGroups:
          - extensions
          - networking.k8s.io
        apiVersions:
          - v1beta1
        operations:
          - CREATE
          - UPDATE
        resources:
          - ingresses
    failurePolicy: Fail
    clientConfig:
      service:
        namespace: ingress-nginx
        name: ingress-nginx-controller-admission
        path: /extensions/v1beta1/ingresses

Validation is performed whenever I create or update an ingress (the content of my ingress.yaml doesn't matter). The validation failed, because when calling the service, the response is a Temporary Redirect. I don't know why. The corresponding service is:

apiVersion: v1
kind: Service
metadata:
  labels:
    helm.sh/chart: ingress-nginx-2.0.3
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.32.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller-admission
  namespace: ingress-nginx
spec:
  type: ClusterIP
  ports:
    - name: https-webhook
      port: 443
      targetPort: webhook
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller

The pod matching the selector comes from this deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    helm.sh/chart: ingress-nginx-2.0.3
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.32.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/instance: ingress-nginx
      app.kubernetes.io/component: controller
  revisionHistoryLimit: 10
  minReadySeconds: 0
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/component: controller
    spec:
      dnsPolicy: ClusterFirst
      containers:
        - name: controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.32.0
          imagePullPolicy: IfNotPresent
          lifecycle:
            preStop:
              exec:
                command:
                  - /wait-shutdown
          args:
            - /nginx-ingress-controller
            - --election-id=ingress-controller-leader
            - --ingress-class=nginx
            - --configmap=ingress-nginx/ingress-nginx-controller
            - --validating-webhook=:8443
            - --validating-webhook-certificate=/usr/local/certificates/cert
            - --validating-webhook-key=/usr/local/certificates/key
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            runAsUser: 101
            allowPrivilegeEscalation: true
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
            - name: webhook
              containerPort: 8443
              protocol: TCP
          volumeMounts:
            - name: webhook-cert
              mountPath: /usr/local/certificates/
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 90Mi
      serviceAccountName: ingress-nginx
      terminationGracePeriodSeconds: 300
      volumes:
        - name: webhook-cert
          secret:
            secretName: ingress-nginx-admission

Something in this validation chain goes wrong. It would be interesting to know what and why, but I can continue working with my MetalLB solution. Note that this solution does not contain a validating webhook at all.

Solution 7:[7]

On a baremetal cluster, I disabled the admissionWebhooks during the Helm 3 install:

kubectl create ns ingress-nginx

helm install [RELEASE_NAME] ingress-nginx/ingress-nginx -n ingress-nginx --set controller.admissionWebhooks.enabled=false

Solution 8:[8]

This might be caused by a configuration left over from a previous nginx-ingress-controller installation.
You can try running the following command:

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

Solution 9:[9]

If you are using Terraform and Helm, disable the validating webhook:

resource "helm_release" "nginx_ingress" {

...

  set {
    name  = "controller.admissionWebhooks.enabled"
    value = "false"
  }

...

}

Solution 10:[10]

What worked for me was to increase the timeout while waiting for the ingress to come up.

Solution 11:[11]

I was bringing up a cluster with a known-good configuration; another had been created just last week in essentially the same way. My error message was a little more specific about what failed in the webhook:

Error: Failed to create Ingress
'auth-system/alertmanager-oauth2-proxy' 
because: Internal error occurred: failed calling webhook
"validate.nginx.ingress.kubernetes.io": Post
"https://nginx-nginx-ingress-controller-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s":
x509: certificate signed by unknown authority

It turned out that, among my many configs, one had a typo in the DNS names passed in when creating nginx. So nginx thought it had one domain name, but it got a certificate for a slightly different DNS name, which caused the validating webhook to fail.

The solution was not to delete the hook, but to fix the underlying problem in the nginx DNS configuration so that it matched the domain in its X.509 certificate.

Solution 12:[12]

I had this error. Basically I have a script installing the nginx controller with helm; the script then immediately installs an application that uses ingress, also with helm. That app install failed, just the ingress part.

The solution was to wait 60s after installing nginx, to give the admission webhook time to come up and be ready.
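
Instead of a fixed sleep, the script can block until the controller pod reports ready. This is a sketch assuming the controller runs in the ingress-nginx namespace and carries the app.kubernetes.io/component=controller label, as in the manifests quoted earlier in this thread; the function name and the 120s default are illustrative. Pod readiness is only a proxy here: it does not prove that the webhook endpoint itself is answering.

```shell
# Hypothetical helper: block until the ingress-nginx controller pod is Ready.
#   $1 optional timeout (defaults to 120s)
# Namespace and label selector are assumptions taken from the quoted manifests.
wait_for_ingress_controller() {
  kubectl wait --namespace ingress-nginx \
    --for=condition=ready pod \
    --selector=app.kubernetes.io/component=controller \
    --timeout="${1:-120s}"
}
```

In an install script this would sit between the Helm install of the controller and the install of the first application that creates an Ingress, so the second step only starts once the first has settled or the timeout has expired.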