'AWS Databricks Cluster terminated.Reason:Container launch failure
We're developing custom runtime for databricks cluster. We need to version and archive our clusters for client. We made it run successfully in our own environment but we're not able to make it work in client's environment. It's large corporation with many restrictions.
We’re able to start EC2 instance and pull image, but there must be some other blocker. I think ec2 instance is succefully running, but I have error in databricks
Cluster terminated.Reason:Container launch failure
An unexpected error was encountered while launching containers on worker instances for the cluster. Please retry and contact Databricks if the problem persists.
Instance ID: i-0fb50653895453fdf
Internal error message: Failed to launch spark container on instance i-0fb50653895453fdf. Exception: Container setup has timed out
It should be in some settings/permissions inside client's environment.
Here is end of ec2 log
-----END SSH HOST KEY KEYS----- [ 59.876874] cloud-init[1705]: Cloud-init v. 21.4-0ubuntu1~18.04.1 running 'modules:final' at Wed, 09 Mar 2022 15:05:30 +0000. Up 17.38 seconds. [ 59.877016] cloud-init[1705]: Cloud-init v. 21.4-0ubuntu1~18.04.1 finished at Wed, 09 Mar 2022 15:06:13 +0000. Datasource DataSourceEc2Local. Up 59.86 seconds [ 59.819059] audit: kauditd hold queue overflow [
66.068641] audit: kauditd hold queue overflow [ 66.070755] audit: kauditd hold queue overflow [ 66.072833] audit: kauditd hold queue overflow [ 74.733249] audit: kauditd hold queue overflow [
74.735227] audit: kauditd hold queue overflow [ 74.737109] audit: kauditd hold queue overflow [ 79.899966] audit: kauditd hold queue overflow [ 79.903557] audit: kauditd hold queue overflow [
79.907108] audit: kauditd hold queue overflow [ 89.324990] audit: kauditd hold queue overflow [ 89.329193] audit: kauditd hold queue overflow [ 89.333125] audit: kauditd hold queue overflow [ 106.617320] audit: kauditd hold queue overflow [ 106.620980] audit: kauditd hold queue overflow [ 107.464865] audit: kauditd hold queue overflow [ 127.175767] audit: kauditd hold queue overflow [ 127.179897] audit: kauditd hold queue overflow [ 127.215281] audit: kauditd hold queue overflow [ 132.190357] audit: kauditd hold queue overflow [ 132.193968] audit: kauditd hold queue overflow [ 132.197546] audit: kauditd hold queue overflow [ 156.211713] audit: kauditd hold queue overflow [ 156.215388] audit: kauditd hold queue overflow [ 228.558571] audit: kauditd hold queue overflow [ 228.562120] audit: kauditd hold queue overflow [ 228.565629] audit: kauditd hold queue overflow [ 316.405562] audit: kauditd hold queue overflow [ 316.409136] audit: kauditd hold queue overflow
Solution 1:[1]
This is usually caused by slowness in downloading the custom docker image, please check if you can download from the docker repository properly from the network where your VMs are launched.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | user13035930 |