kubectl exec/logs on GKE returns "remote error: tls: internal error"
I'm currently getting errors when trying to exec or get logs for my pods on my GKE cluster.
$ kubectl logs <POD-NAME>
Error from server: Get "https://<NODE-PRIVATE-IP>:10250/containerLogs/default/<POD-NAME>/<DEPLOYMENT-NAME>": remote error: tls: internal error
$ kubectl exec -it <POD-NAME> -- sh
Error from server: error dialing backend: remote error: tls: internal error
One suspicious thing I found while troubleshooting is that all CSRs are getting denied...
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-79zkn 4m16s kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7b5sx 91m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7fzjh 103m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7gstl 19m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7hrvm 11m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7mn6h 87m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7nd7h 4m57s kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
...
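For reference, the denial reason for a single CSR can be inspected directly (using one of the names from the listing above):
$ kubectl describe csr csr-79zkn
# or the full object, including the status conditions:
$ kubectl get csr csr-79zkn -o yaml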
Any idea why this is happening? Maybe a firewall issue?
Thanks in advance!
Update 1
Here are the same commands with verbose output (--v=8), without the goroutine stack traces.
$ kubectl logs --v=8 <POD-NAME>
I0527 09:27:59.624843 10407 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:27:59.628621 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:27:59.628635 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.628644 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.628649 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.727411 10407 round_trippers.go:446] Response Status: 200 OK in 98 milliseconds
I0527 09:27:59.727461 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.727480 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.727496 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.727512 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.727528 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.727756 10407 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"<POD-BASE-NAME>","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"...","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"<NAME>","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:27:59.745985 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/log
I0527 09:27:59.746035 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.746055 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.746071 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.800586 10407 round_trippers.go:446] Response Status: 500 Internal Server Error in 54 milliseconds
I0527 09:27:59.800638 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.800654 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.800668 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.800680 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.800693 10407 round_trippers.go:452] Content-Length: 217
I0527 09:27:59.800712 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.800772 10407 request.go:1097] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error","code":500}
I0527 09:27:59.801848 10407 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get \"https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>\": remote error: tls: internal error",
"code": 500
}]
F0527 09:27:59.801944 10407 helpers.go:115] Error from server: Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error
$ kubectl exec --v=8 -it <POD-NAME> -- sh
I0527 09:44:48.673774 11157 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:44:48.678514 11157 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:44:48.678528 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.678535 11157 round_trippers.go:431] Accept: application/json, */*
I0527 09:44:48.678543 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.795864 11157 round_trippers.go:446] Response Status: 200 OK in 117 milliseconds
I0527 09:44:48.795920 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.795963 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.795995 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.796019 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.796037 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.796644 11157 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"...","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"...\"}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:44:48.814315 11157 round_trippers.go:420] POST https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/exec?command=sh&container=<SERVICE-NAME>&stdin=true&stdout=true&tty=true
I0527 09:44:48.814372 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.814391 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.814406 11157 round_trippers.go:431] X-Stream-Protocol-Version: v4.channel.k8s.io
I0527 09:44:48.814420 11157 round_trippers.go:431] X-Stream-Protocol-Version: v3.channel.k8s.io
I0527 09:44:48.814445 11157 round_trippers.go:431] X-Stream-Protocol-Version: v2.channel.k8s.io
I0527 09:44:48.814471 11157 round_trippers.go:431] X-Stream-Protocol-Version: channel.k8s.io
I0527 09:44:48.913928 11157 round_trippers.go:446] Response Status: 500 Internal Server Error in 99 milliseconds
I0527 09:44:48.913977 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.914005 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.914029 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.914054 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.914077 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.914099 11157 round_trippers.go:452] Content-Length: 149
I0527 09:44:48.915741 11157 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "error dialing backend: remote error: tls: internal error",
"code": 500
}]
F0527 09:44:48.915837 11157 helpers.go:115] Error from server: error dialing backend: remote error: tls: internal error
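Both traces show that the API server returns the pod object fine; it is the hop from the control plane to the kubelet on port 10250 that fails during the TLS handshake. With network access to a node's private IP (e.g. from another node in the same VPC), the handshake failure can be reproduced directly; a minimal check, assuming such access:
$ openssl s_client -connect <NODE-PRIVATE-IP>:10250 </dev/null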
Update 2
After connecting to one of the GKE worker nodes and checking the kubelet logs, I found these weird lines:
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.271022 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54672: no serving certificate available for the kubelet
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.305628 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54674: no serving certificate available for the kubelet
May 27 09:30:12 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:12.067998 1272 log.go:181] http: TLS handshake error from 10.156.0.11:57610: no serving certificate available for the kubelet
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.144826 1272 certificate_manager.go:412] Rotating certificates
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.154322 1272 reflector.go:207] Starting reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.448976 1272 reflector.go:213] Stopping reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: E0527 09:30:14.449045 1272 certificate_manager.go:454] certificate request was not signed: cannot watch on the certificate signing request: certificate signing request is denied, reason: AutoDenied, message:
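The "no serving certificate available for the kubelet" message means the kubelet never obtained a signed serving certificate, which matches the denied CSRs above. On the node this can be confirmed by listing the kubelet's certificate directory (the path below is the usual kubelet default and may differ on your node image):
$ sudo ls -l /var/lib/kubelet/pki/
# a healthy node has a kubelet-server-current.pem symlink pointing at a signed
# serving certificate; with the denials above, that file would be missing or stale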
Update 3
I've upgraded the cluster version from 1.19.9-gke.1400 to 1.19.9-gke.1900. This didn't solve the problem...
I also performed a credentials rotation on the cluster, but that didn't solve it either...
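For reference, the credentials rotation was the standard gcloud workflow; a sketch with placeholder cluster name and zone:
$ gcloud container clusters update <CLUSTER-NAME> --zone <ZONE> --start-credential-rotation
# node pools must be recreated/upgraded before completing the rotation
$ gcloud container clusters update <CLUSTER-NAME> --zone <ZONE> --complete-credential-rotation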
Final
After trying a lot of changes on the cluster:
- Restarting kubelet on nodes (see the sketch after this list)
- Restarting nodes
- Upscaling/Downscaling node pool size
- Upgrading cluster version
- Rotating cluster certificates
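For the first two items, each node was handled over SSH; a minimal sketch, assuming default GKE images where kubelet runs as a systemd unit:
$ gcloud compute ssh <NODE-NAME> --zone <ZONE>
$ sudo systemctl restart kubelet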
Even creating a new cluster (in the same project, with the same VPC, etc.) didn't solve the issue...
This problem might be related to changes made to the firewall rules.
The only solution found was creating a new GKE cluster in a new GCP project and migrating the workloads using Velero.
Solution 1: [1]
This is a known issue that occurs when the cluster gets upgraded, but the error has been fixed in some versions. The fixed GKE versions are:
- 1.19.11-gke.1700+
- 1.20.7-gke.1800+
- 1.21.1-gke.1800+
If upgrading to one of these versions is not possible, the recommended workaround is creating a new cluster in a new project.
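Checking which versions are available and upgrading the control plane would look roughly like this (a sketch; cluster name and zone are placeholders):
$ gcloud container get-server-config --zone <ZONE>
$ gcloud container clusters upgrade <CLUSTER-NAME> --zone <ZONE> --master --cluster-version 1.19.11-gke.1700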
Solution 2: [2]
This answer may also help:
This issue is due to pending Certificate Signing Requests for the nodes, made by the kubelet running on each node.
To check the pending CSRs for the nodes:
$ kubectl get csr --sort-by=.metadata.creationTimestamp
Then approve the CSR for each node:
$ kubectl certificate approve <csr-id>
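If there are many of them, the pending CSRs can be approved in one pass; a sketch (this only works for CSRs still in the Pending state; CSRs that are already Denied, as in the listing above, cannot be approved and would have to be deleted so the kubelet re-submits them):
$ kubectl get csr --no-headers | awk '$NF == "Pending" {print $1}' | xargs -r kubectl certificate approve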
For further details, refer to this section of the Kubernetes documentation:
https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#kubelet-serving-certs
"One known limitation is that the CSRs (Certificate Signing Requests) for these certificates cannot be automatically approved by the default signer in the kube-controller-manager - kubernetes.io/kubelet-serving. This will require action from the user or a third party controller."
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |