Can you use a Jupyter notebook on my GCP VM to run TPU training in Google Cloud?

I am switching from running TPUs in Colab to running TPUs in Google Cloud. I am used to running training in a Colab Jupyter notebook, but according to the GCP TPU quickstart guide, I'll need to use the shell and convert my code into a script.

https://cloud.google.com/tpu/docs/quickstart

Is there a way to open a Jupyter notebook on my GCP VM?



Solution 1:[1]

Yes, you can open and run a Jupyter notebook on your GCP VM. There may be other ways to do this, but here is what I followed and what worked for me:

Phase 1 - Make sure you have set up your GCP project and created a VM instance in a zone where TPUs are supported. I used us-central1-f.

Phase 2 - Make sure your VM (Compute Engine), Cloud TPU, and Cloud Storage are all set up and linked according to the instructions provided here - https://cloud.google.com/tpu/docs/quickstart

Phase 3 - For the VM, create a firewall rule with the following settings:

  • Name:
  • Targets: All instances in the network
  • Source IP ranges: 0.0.0.0/0
  • Protocols and ports: Select the “Specified protocols and ports” option.
  • tcp: 8888

Keep the other settings at their defaults.
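If you prefer the command line to the console form, the same rule can be expressed as a gcloud compute firewall-rules create call. The sketch below only builds and prints that command so you can review it before running it. The rule name "allow-jupyter" and the "default" network are assumptions of mine, not part of the steps above.

```python
import shlex


def firewall_rule_command(rule_name, port=8888, source_range="0.0.0.0/0",
                          network="default"):
    """Build the gcloud argument list for an ingress rule with the settings above.

    Omitting target tags makes the rule apply to all instances in the network.
    """
    return [
        "gcloud", "compute", "firewall-rules", "create", rule_name,
        f"--network={network}",            # assumed network name
        "--direction=INGRESS",
        f"--allow=tcp:{port}",             # Jupyter's port from the list above
        f"--source-ranges={source_range}",
    ]


if __name__ == "__main__":
    # "allow-jupyter" is a hypothetical rule name; pick your own.
    print(shlex.join(firewall_rule_command("allow-jupyter")))
```

Passing your own address as source_range (for example "203.0.113.7/32") instead of 0.0.0.0/0 limits who can reach the port.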

Phase 4 - You need to install the following:

  • Anaconda
wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
bash Anaconda3-4.2.0-Linux-x86_64.sh
  • TensorFlow, Keras, and any other libraries you need
source ~/.bashrc
pip install tensorflow
pip install keras
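Before moving on, it can help to confirm that the packages actually landed in the Python environment Jupyter will use. This is a minimal sketch using only the standard library. The helper name missing_packages is mine, and the list of packages to check is just an example.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Example list; extend it with whatever other libraries you installed.
    missing = missing_packages(["tensorflow", "keras", "notebook"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All packages found.")
```

Run it with the same python that pip installed into (after source ~/.bashrc), otherwise it may report packages as missing that are installed elsewhere.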

Phase 5 - Make sure you set up your Jupyter configuration

$ jupyter notebook --generate-config
$ nano ~/.jupyter/jupyter_notebook_config.py # I use nano editor

Add these four lines at the top of the config file and save:

c = get_config()
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888
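If you would rather not edit the file by hand, the same four lines can be added with a small script. This is a sketch, not part of the original steps: the helper name ensure_settings is mine, and it matches lines by exact text, so a line that is already present with different spacing or a trailing comment will be added again (harmless, since a later assignment simply overrides an earlier one).

```python
from pathlib import Path

# The four lines from the step above.
SETTINGS = [
    "c = get_config()",
    "c.NotebookApp.ip = '*'",
    "c.NotebookApp.open_browser = False",
    "c.NotebookApp.port = 8888",
]


def ensure_settings(config_path, settings=SETTINGS):
    """Put any missing settings at the top of the file; return the lines added."""
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    present = {line.strip() for line in existing.splitlines()}
    to_add = [s for s in settings if s not in present]
    if to_add:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("\n".join(to_add) + "\n" + existing)
    return to_add


# Example call (path as generated by `jupyter notebook --generate-config`):
# ensure_settings(Path.home() / ".jupyter" / "jupyter_notebook_config.py")
```

Calling it a second time adds nothing, so it is safe to rerun.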

And that's it. You just need to run

$ jupyter notebook

and open http://your_external_IP:8888 in your browser.
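If the page does not load, a quick way to tell a firewall problem from a Jupyter problem is to test whether the port accepts TCP connections at all. The sketch below uses only the standard library. The function name port_is_open is mine, and your_external_IP is the same placeholder as above.

```python
import socket


def port_is_open(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call from your local machine (replace the placeholder first):
# port_is_open("your_external_IP")
```

If this returns False from your local machine but True when run on the VM against 127.0.0.1, the firewall rule from Phase 3 is the likely culprit.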

Solution 2:[2]

If you're using the Helm chart for JupyterHub on GKE, it appears that you can also use a JupyterHub profile. Make sure to set the correct overrides for the KubeSpawner settings:

singleuser:
  profileList:
    - display_name: "TPU"  # placeholder name, not from the original answer
      kubespawner_override:
        scheduler_name: default-scheduler
        extra_annotations:
          tf-version.cloud-tpus.google.com: "pytorch-1.11"
        extra_resource_limits:
          cloud-tpus.google.com/v2: 8

This isn't documented, but you'll need to use the "default-scheduler", since GKE requires it to spawn TPU instances.

Additional documentation here:

https://cloud.google.com/tpu/docs/kubernetes-engine-setup#job-spec

https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Vamsi Sistla
Solution 2    Henry Tseng