Can you use a Jupyter notebook on my GCP VM to run TPU training in Google Cloud?
I am switching from running TPUs in Colab to running TPUs in Google Cloud. I am used to running training in the Colab Jupyter notebook, but according to the GCP TPU quickstart guide, I'll need to work from the shell and convert my code into a script.
https://cloud.google.com/tpu/docs/quickstart
Is there a way to open a Jupyter notebook on my GCP VM?
Solution 1:[1]
Yes, you can open and run a Jupyter notebook on your GCP VM. There are probably other ways to do this, but here's what I followed and what worked for me -
Phase 1 - Make sure you have set up your GCP project and created a VM instance in a zone where TPUs are supported. I used us-central1-f.
Phase 2 - Make sure your VM (Compute Engine), Cloud TPU, and Cloud Storage are all set up and linked according to the instructions provided here - https://cloud.google.com/tpu/docs/quickstart
Phase 3 - For the VM, you need to create a firewall rule with the following settings:
- Name:
- Targets: All instances in the network
- Source IP ranges: 0.0.0.0/0
- Protocols and ports: Select “Specified protocols and ports” option.
- tcp: 8888
Keep the other settings at their defaults.
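If you'd rather not click through the console, the same rule can probably be created with the gcloud CLI. The sketch below only assembles a `gcloud compute firewall-rules create` command that mirrors the settings above; it does not run it. The rule name `allow-jupyter-8888` and the `default` network are assumptions, since the answer leaves the name blank and does not say which network the VM is on.

```python
import shlex


def firewall_rule_command(rule_name, network="default", port=8888,
                          source_range="0.0.0.0/0"):
    """Return the gcloud argv for an ingress rule that opens one TCP port.

    No --target-tags flag is passed, so the rule applies to all
    instances in the network, matching the "Targets" setting above.
    """
    return [
        "gcloud", "compute", "firewall-rules", "create", rule_name,
        f"--network={network}",
        "--direction=INGRESS",
        f"--allow=tcp:{port}",
        f"--source-ranges={source_range}",
    ]


# "allow-jupyter-8888" is a hypothetical rule name, not from the answer.
command = firewall_rule_command("allow-jupyter-8888")
print(shlex.join(command))
```

The printed line can be pasted into a shell where gcloud is authenticated for your project, or the list can be passed to `subprocess.run`.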
Phase 4 - You need to install the following:
- Anaconda
wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
bash Anaconda3-4.2.0-Linux-x86_64.sh
- TensorFlow, Keras, and any other libraries you need
source ~/.bashrc
pip install tensorflow
pip install keras
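To confirm the installs landed in the Python environment that Jupyter will use, a quick check along these lines may help. This is a minimal sketch: `missing_packages` is a hypothetical helper, and it only tests whether each top-level package can be found, not whether it works with a TPU.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Package names taken from the pip commands above.
print(missing_packages(["tensorflow", "keras"]))
```

An empty list means both packages were found; any name that is printed still needs to be installed in that environment.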
Phase 5 - Make sure you set up your Jupyter configuration
$ jupyter notebook --generate-config
$ nano ~/.jupyter/jupyter_notebook_config.py # I use nano editor
Drop these four lines at the top of this config file and save
c = get_config()
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888
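If you set up VMs often, the edit can also be done without opening an editor. Here's a rough sketch that prepends the same four settings to a config file; `prepend_lines` is a hypothetical helper, and the settings are copied unchanged from the lines above.

```python
from pathlib import Path

# The four settings from the answer, kept verbatim.
JUPYTER_SETTINGS = [
    "c = get_config()",
    "c.NotebookApp.ip = '*'",
    "c.NotebookApp.open_browser = False",
    "c.NotebookApp.port = 8888",
]


def prepend_lines(config_path, lines):
    """Insert `lines` at the top of the file, keeping existing content."""
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    path.write_text("\n".join(lines) + "\n" + existing)
```

Calling `prepend_lines(Path.home() / ".jupyter" / "jupyter_notebook_config.py", JUPYTER_SETTINGS)` after generating the config should have the same effect as the manual edit.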
And that's it. You just need to run
$ jupyter notebook
and open http://your_external_IP:8888 in your browser.
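If the page doesn't load, it helps to know whether the port is reachable at all before digging into the Jupyter side. A minimal sketch, run from your local machine, using only the standard library; `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

For example, `port_is_open("your_external_IP", 8888)` with the VM's real external IP substituted in. A `False` result usually points at the firewall rule or at the server not listening on the external interface.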
Solution 2:[2]
If you're using the Helm chart for JupyterHub on GKE, it appears that you can also use a JupyterHub profile. Make sure to set the correct overrides for the KubeSpawner settings:
singleuser:
profileList:
scheduler_name: default-scheduler
extra_annotations:
tf-version.cloud-tpus.google.com: "pytorch-1.11"
extra_resource_limits:
cloud-tpus.google.com/v2: 8
It's not documented, but you'll need to use the "default-scheduler", since GKE requires it to spawn TPU instances.
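The snippet above lost its indentation, so here is one guess at the full nesting, written as a Python dict and printed as JSON. The `display_name` and `kubespawner_override` keys are assumptions based on KubeSpawner's profile list format and do not appear in the snippet; the display name itself is a made-up label. The override values are copied from the snippet as-is.

```python
import json

# One profile entry. "display_name" and "kubespawner_override" are
# assumptions based on KubeSpawner's profile_list format; the display
# name is a hypothetical label. The override values come from the answer.
tpu_profile = {
    "display_name": "TPU v2-8 (hypothetical label)",
    "kubespawner_override": {
        "scheduler_name": "default-scheduler",
        "extra_annotations": {
            "tf-version.cloud-tpus.google.com": "pytorch-1.11",
        },
        "extra_resource_limits": {
            "cloud-tpus.google.com/v2": 8,
        },
    },
}

helm_values = {"singleuser": {"profileList": [tpu_profile]}}

# Print the nested structure so it can be compared with a values file.
print(json.dumps(helm_values, indent=2))
```

Since YAML parsers generally accept JSON, the printed output should be usable as a starting point for a Helm values file, but check it against the chart version you are running.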
Additional documentation here:
https://cloud.google.com/tpu/docs/kubernetes-engine-setup#job-spec
https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Vamsi Sistla |
Solution 2 | Henry Tseng |