'kubectl exec/logs on GKE returns "remote error: tls: internal error"

I'm currently getting errors when trying to exec or get logs for my pods on my GKE cluster.

$ kubectl logs <POD-NAME>
Error from server: Get "https://<NODE-PRIVATE-IP>:10250/containerLogs/default/<POD-NAME>/<DEPLOYMENT-NAME>": remote error: tls: internal error
$ kubectl exec -it <POD-NAME> -- sh
Error from server: error dialing backend: remote error: tls: internal error

One suspicious thing I found while troubleshooting is that all CSRs are getting denied...

$ kubectl get csr
NAME        AGE     SIGNERNAME                      REQUESTOR                 CONDITION
csr-79zkn   4m16s   kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7b5sx   91m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7fzjh   103m    kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7gstl   19m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7hrvm   11m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7mn6h   87m     kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
csr-7nd7h   4m57s   kubernetes.io/kubelet-serving   system:node:<NODE-NAME>   Denied
...

Any idea why this is happening ? Maybe a firewall issue ?

Thanks in advance !

Update 1

Here the same commands with a verbose output --v=8, withtout the goroutines stack trace

$ kubectl logs --v=8 <POD-NAME>

I0527 09:27:59.624843   10407 loader.go:375] Config loaded from file:  /home/kevin/.kube/config
I0527 09:27:59.628621   10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:27:59.628635   10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.628644   10407 round_trippers.go:431]     Accept: application/json, */*
I0527 09:27:59.628649   10407 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.727411   10407 round_trippers.go:446] Response Status: 200 OK in 98 milliseconds
I0527 09:27:59.727461   10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.727480   10407 round_trippers.go:452]     Audit-Id: ...
I0527 09:27:59.727496   10407 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:27:59.727512   10407 round_trippers.go:452]     Content-Type: application/json
I0527 09:27:59.727528   10407 round_trippers.go:452]     Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.727756   10407 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"<POD-BASE-NAME>","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"...","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"<NAME>","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:27:59.745985   10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/log
I0527 09:27:59.746035   10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.746055   10407 round_trippers.go:431]     Accept: application/json, */*
I0527 09:27:59.746071   10407 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.800586   10407 round_trippers.go:446] Response Status: 500 Internal Server Error in 54 milliseconds
I0527 09:27:59.800638   10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.800654   10407 round_trippers.go:452]     Audit-Id: ...
I0527 09:27:59.800668   10407 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:27:59.800680   10407 round_trippers.go:452]     Content-Type: application/json
I0527 09:27:59.800693   10407 round_trippers.go:452]     Content-Length: 217
I0527 09:27:59.800712   10407 round_trippers.go:452]     Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.800772   10407 request.go:1097] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error","code":500}
I0527 09:27:59.801848   10407 helpers.go:216] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error",
  "code": 500
}]
F0527 09:27:59.801944   10407 helpers.go:115] Error from server: Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error

kubectl exec --v=8 -it <POD-NAME> -- sh

I0527 09:44:48.673774   11157 loader.go:375] Config loaded from file:  /home/kevin/.kube/config
I0527 09:44:48.678514   11157 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:44:48.678528   11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.678535   11157 round_trippers.go:431]     Accept: application/json, */*
I0527 09:44:48.678543   11157 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.795864   11157 round_trippers.go:446] Response Status: 200 OK in 117 milliseconds
I0527 09:44:48.795920   11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.795963   11157 round_trippers.go:452]     Audit-Id: ...
I0527 09:44:48.795995   11157 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:44:48.796019   11157 round_trippers.go:452]     Content-Type: application/json
I0527 09:44:48.796037   11157 round_trippers.go:452]     Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.796644   11157 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"...","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:44:48.814315   11157 round_trippers.go:420] POST https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/exec?command=sh&container=<SERVICE-NAME>&stdin=true&stdout=true&tty=true
I0527 09:44:48.814372   11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.814391   11157 round_trippers.go:431]     User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.814406   11157 round_trippers.go:431]     X-Stream-Protocol-Version: v4.channel.k8s.io
I0527 09:44:48.814420   11157 round_trippers.go:431]     X-Stream-Protocol-Version: v3.channel.k8s.io
I0527 09:44:48.814445   11157 round_trippers.go:431]     X-Stream-Protocol-Version: v2.channel.k8s.io
I0527 09:44:48.814471   11157 round_trippers.go:431]     X-Stream-Protocol-Version: channel.k8s.io
I0527 09:44:48.913928   11157 round_trippers.go:446] Response Status: 500 Internal Server Error in 99 milliseconds
I0527 09:44:48.913977   11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.914005   11157 round_trippers.go:452]     Audit-Id: ...
I0527 09:44:48.914029   11157 round_trippers.go:452]     Cache-Control: no-cache, private
I0527 09:44:48.914054   11157 round_trippers.go:452]     Content-Type: application/json
I0527 09:44:48.914077   11157 round_trippers.go:452]     Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.914099   11157 round_trippers.go:452]     Content-Length: 149
I0527 09:44:48.915741   11157 helpers.go:216] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "error dialing backend: remote error: tls: internal error",
  "code": 500
}]
F0527 09:44:48.915837   11157 helpers.go:115] Error from server: error dialing backend: remote error: tls: internal error

Update 2

After connecting to one of the GKE worker nodes and checking kubelet logs I found these wired lines

May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.271022    1272 log.go:181] http: TLS handshake error from 10.156.0.9:54672: no serving certificate available for the kubelet
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.305628    1272 log.go:181] http: TLS handshake error from 10.156.0.9:54674: no serving certificate available for the kubelet
May 27 09:30:12 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:12.067998    1272 log.go:181] http: TLS handshake error from 10.156.0.11:57610: no serving certificate available for the kubelet
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.144826    1272 certificate_manager.go:412] Rotating certificates
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.154322    1272 reflector.go:207] Starting reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.448976    1272 reflector.go:213] Stopping reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: E0527 09:30:14.449045    1272 certificate_manager.go:454] certificate request was not signed: cannot watch on the certificate signing request: certificate signing request is denied, reason: AutoDenied, message:

Update 3

I've updated the cluster version from 1.19.9-gke.1400 to 1.19.9-gke.1900. Didn't solved the problem...

Made a Credentials Rotation on the cluster. But didn't solved as well...

Final

After trying lot of changes in the cluster :

  • Restarting kubelet on nodes
  • Restarting nodes
  • Upscaling/Downscaling node pool size
  • Upgrading cluster version
  • Rotating cluster certificates

Even creating a new cluster (on the same project, with same VPC, etc) didn't solve the issue...

This problem might be related to changes made on Firewall rules.

Only solution found, creating a new GKE cluster in a new GCP project and migrating workflow using Velero.



Solution 1:[1]

This is Known issue and the issue causes when the cluster get upgraded but error has been fixed in some versions.The fixed GKE versions are:

1.19.11-gke.1700+
1.20.7-gke.1800+
1.21.1-gke.1800+

So the recommended workaround is creating a new cluster in a new project.

Solution 2:[2]

Assume this answer may help ,

*This issue is due to pending Certificate Sigining Requests for nodes made by kubelet running in each node *

to check the csr pending for nodes

kubectl get csr  --sort-by=.metadata.creationTimestamp

then approve the csr for each node

kubectl certificate  approve  <csr-id>

Further details ,refer this section in k8s documentation

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#kubelet-serving-certs

One known limitation is that the CSRs (Certificate Signing Requests) for these certificates cannot be automatically approved by the default signer in the kube-controller-manager - kubernetes.io/kubelet-serving. This will require action from the user or a third party controller

.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2