Kubernetes nginx ingress controller upgrade never finishes
We have an issue upgrading our nginx ingress controller:
We have thousands of Ingress objects, all with the same ingress class, provided as an annotation rather than as an IngressClass object (because of the nginx controller version; see versions at the end).
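For reference, here is a minimal sketch of the shape of one of our Ingress objects; the name, namespace and host are placeholders, not our real resources:

```
# Hypothetical example of the shape of our Ingress objects: the class comes from the
# deprecated annotation, not from spec.ingressClassName or an IngressClass resource.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1beta1   # controller 0.40.x still watches v1beta1
kind: Ingress
metadata:
  name: example-app                     # placeholder
  namespace: my-namespace
  annotations:
    kubernetes.io/ingress.class: nginx  # same class on all of the ingresses
spec:
  rules:
    - host: example-app.example.com     # placeholder
      http:
        paths:
          - path: /
            backend:
              serviceName: example-app
              servicePort: 80
EOF
```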
When we run the upgrade, the new ReplicaSet's pods just don't finish syncing the ingresses, stay stuck on 0/1 Running, and eventually get restarted.
If we do a helm rollback to the previous revision, the pods that come up finish the sync in seconds and reach a 1/1 Running state.
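For context, the rollback that does work is just a plain helm rollback of the release; the release name and namespace below are placeholders:

```
# Placeholder release name and namespace -- adjust to your setup.
helm history nginx-ingress-controller -n my-namespace      # find the last good revision
helm rollback nginx-ingress-controller <REVISION> -n my-namespace
kubectl rollout status deployment/nginx-ingress-controller -n my-namespace
```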
We tried installing a new release of the latest chart version and having it "listen" on the same ingress class, but it seems like the old and the new deployments fight one another for control of the ingresses, so the pods are stuck in a restart loop and the deployment never finishes.
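Roughly, the side-by-side install we attempted looked like the sketch below; the release name is a placeholder, and the value name comes from the chart's values.yaml as we understand it, so double-check it against the chart version you install:

```
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
# controller.ingressClass keeps the new controller watching the same legacy
# kubernetes.io/ingress.class annotation value as the old one.
helm install nginx-ingress-new ingress-nginx/ingress-nginx \
  --namespace my-namespace \
  --set controller.ingressClass=nginx
```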
When doing an upgrade (rather than a new chart release install), we tried scaling the old ReplicaSet of the deployment down to zero (causing downtime) to rule out any race-condition loop, but the new pods still never finished starting.
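The scale-down itself was just the usual kubectl scale on the old ReplicaSet; the label selector and ReplicaSet name below are placeholders:

```
# Find the old ReplicaSet (label is the chart's default app.kubernetes.io/name label).
kubectl get rs -n my-namespace -l app.kubernetes.io/name=ingress-nginx
kubectl scale rs nginx-ingress-controller-<OLD_HASH> -n my-namespace --replicas=0
kubectl get pods -n my-namespace -w    # the new pods still never reached 1/1 Ready
```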
I know it’s a lot of ingresses, but as I said before - when rolling back to a previous revision, the pods manage to finish the syncing in seconds and there aren’t any "races" or collisions.
I know using the ingress class annotation is deprecated, but in order to migrate to the IngressClass object we need to upgrade the ingress controller first.
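For reference, this is the kind of IngressClass object we would migrate to after the upgrade; a sketch only, since the current 0.40.x controller still relies on the annotation:

```
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx
spec:
  controller: k8s.io/ingress-nginx   # standard controller value for ingress-nginx
EOF
```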
Here's an example of the output of kubectl describe on a failing pod of the latest chart and app versions:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m17s default-scheduler Successfully assigned my-namespace/nginx-ingress-controller-d58fdfd89-54b2w to ip-10-10-163-229.ec2.internal
Warning RELOAD 84s nginx-ingress-controller Error reloading NGINX:
-------------------------------------------------------------------------------
Error: signal: terminated
2022/05/11 09:31:52 [warn] 41#41: the "http2_max_field_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg3797289662:150
nginx: [warn] the "http2_max_field_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg3797289662:150
2022/05/11 09:31:52 [warn] 41#41: the "http2_max_header_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg3797289662:151
nginx: [warn] the "http2_max_header_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg3797289662:151
2022/05/11 09:31:52 [warn] 41#41: the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /tmp/nginx/nginx-cfg3797289662:152
nginx: [warn] the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /tmp/nginx/nginx-cfg3797289662:152
-------------------------------------------------------------------------------
Normal Killing 84s kubelet Container controller failed liveness probe, will be restarted
Normal Pulled 72s (x2 over 2m16s) kubelet Container image "k8s.gcr.io/ingress-nginx/controller:v1.2.0@sha256:d8196e3bc1e72547c5dec66d6556c0ff92a23f6d0919b206be170bc90d5f9185" already present on machine
Normal Created 71s (x2 over 2m16s) kubelet Created container controller
Normal Started 71s (x2 over 2m16s) kubelet Started container controller
Warning Unhealthy 34s (x8 over 2m4s) kubelet Liveness probe failed: HTTP probe failed with statuscode: 500
Warning Unhealthy 26s (x10 over 2m6s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Warning RELOAD 14s nginx-ingress-controller Error reloading NGINX:
-------------------------------------------------------------------------------
Error: signal: terminated
2022/05/11 09:32:57 [warn] 40#40: the "http2_max_field_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg1563251063:150
nginx: [warn] the "http2_max_field_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg1563251063:150
2022/05/11 09:32:57 [warn] 40#40: the "http2_max_header_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg1563251063:151
nginx: [warn] the "http2_max_header_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx/nginx-cfg1563251063:151
2022/05/11 09:32:57 [warn] 40#40: the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /tmp/nginx/nginx-cfg1563251063:152
nginx: [warn] the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /tmp/nginx/nginx-cfg1563251063:152
-------------------------------------------------------------------------------
Here are the versions of everything we use:
- Kubernetes version 1.19.16 (EKS)
- Helm version 3.8.2
- Helm repo used: https://kubernetes.github.io/ingress-nginx
- current ingress controller version: helm chart 3.7.1, app version 0.40.2
- we've tried upgrading to the latest chart and also to chart 3.8.0, app version 0.41.0, but both resulted in the same outcome

Any ideas what we can do?