Jupyter Node

This app chart will deploy a JupyterLab server accessible through a web interface, with the home directory on persistent storage. It can be configured to:

  • Use up to eight Nvidia GPUs per instance (Nvidia GTX 1080 Ti or Nvidia RTX 2080 Ti).
  • Access Jupyter through a web interface on a subdomain, e.g. myname.icedc.se.
  • Run Python, R, Julia, and other programming languages to perform data analysis, machine learning, or scientific computing.
  • Provide external SSH access through a NodePort.
  • Install software through Ubuntu/Debian apt or Python pip.
  • Deploy long-running workloads, such as training machine learning models over several days.
  • Access S3 storage with Python using boto3, and from the command line with rclone.
  • Attach pre-existing persistent storage that is shared with other users in the same namespace.
  • Access a password-protected HTTP server running in the same pod as Jupyter through jupyter-server-proxy.

Helm Chart on GitLab

Installation

Install the app through Rancher.

Follow the steps below:

  1. On the Rancher homepage, select the icekube cluster.
  2. In the left-side navigation pane, open the Apps section.
  3. Click on Charts and find the dropdown menu labeled All. Select rise-charts.
    Screenshot of the app Charts page.
  4. Open the card labeled jupyternode with the Jupyter logo.
  5. Click on the Install button to start the installation process.
  6. Before you can install the app, you must create a namespace; read the EKC usage page. Provide a valid name, such as johank-jupyternotebook. Note that spaces are not permitted in the name.
    Screenshot of the namespace selection page.
  7. Select your Namespace and a unique Name for the app. Be aware that the installation cannot proceed without sufficient quotas in the selected namespace.
    Screenshot of the app configuration page.
  8. For a basic configuration of your JupyterLab server, you must supply the following information:
    1. An Authentication token.
    2. A valid Subdomain for icedc.se. All subdomains of *.icedc.se resolve to the Kubernetes ingress, so we suggest a name of the form [yourprojectname]-[anyname].icedc.se, for example johank-notebook.icedc.se.
  9. You should also specify:

    1. Hardware requirements, such as CPU, GPU and memory for resource allocation.
    2. The size of persistent storage you require, specified in Gibibytes (GiB). Note that storage quotas for a namespace are usually defined in Mebibytes (MiB); to convert MiB to GiB, divide by 1024. For example, a quota of 10240 MiB corresponds to 10 GiB.

    Note

    Containers are ephemeral, and all data outside persistent storage will be lost when the container is restarted. This can happen if the Kubernetes server running the container is rebooted for maintenance, or if the container crashes.

    Only store your work in persistent storage, i.e. /home/jovyan, /tf, or /root. You can also use S3 storage for large files, such as datasets, models, and checkpoints.

  10. Once these parameters are set, click the Install button to create the JupyterLab server. After installation, the server can be accessed via its URL, e.g. https://johank-notebook.icedc.se.

The following sections provide more detailed information on how to configure the app.

Supported Docker images

Requirements

The app supports images built on Ubuntu/Debian, since apt is used to automatically install required software such as SSH.

If you want to use Nvidia GPUs, the image must have Nvidia CUDA pre-installed.

Default Docker image

The default Docker image is based on the latest version of TensorFlow with GPU support, which includes Nvidia CUDA and cuDNN.

You may want to use a specific tag instead, such as :2.16.1-gpu, to ensure compatibility with your workload.

Jupyter not pre-installed

If you want to use a Docker image that does not have Jupyter pre-installed, you can install JupyterLab through the Entrypoint override. This requires Python pip to be installed in the image, for example:
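
Entrypoint override
pip install jupyterlab
jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --no-browser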

JupyterLab or Jupyter Notebook pre-installed

You can use the app with a Docker image that has Jupyter pre-installed, such as:
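
  • tensorflow/tensorflow:latest-gpu-jupyter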

Basic CUDA image
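
  • nvidia/cuda:12.2.0-cudnn8-runtime-ubuntu22.04 (for example; Jupyter is not pre-installed, so install JupyterLab through the Entrypoint override as shown above)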

JupyterLab without CUDA

If you do not need a GPU, you can use a smaller image, such as:
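
  • jupyter/minimal-notebook (from the Jupyter Docker Stacks)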

You can also use older images by specifying the tag, e.g.

  • tensorflow/tensorflow:1.14.0-gpu-py3-jupyter.

Custom Docker image

If you need specific versions of CUDA, TensorFlow, PyTorch, etc., create your Docker image and publish it on Harbor. This Dockerfile example can be used as a starting point.
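
For instance, a minimal Dockerfile might look like this (the base image and packages here are assumptions; adjust them to your workload):

Dockerfile
FROM tensorflow/tensorflow:2.16.1-gpu
RUN pip install --no-cache-dir jupyterlab jupyter-server-proxy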

You can use the image by specifying its address on Harbor, e.g.

  • registry.ice.ri.se/myproject/custom-jupyter:0.0.1

Configure

General

Docker image (required): The Docker image to use for the Jupyter server, as described in the Supported Docker images section.

Authentication token (required): The token used to access Jupyter through the web UI. Changing it later changes the password (a container restart is required).

Subdomain for icedc.se (required): Access Jupyter through a subdomain of icedc.se, for example, myname.icedc.se. The subdomain must be unique, so check with your browser that it is not already in use.

Access

Enable SSH access (optional): If you want to access the Jupyter server through SSH, enable this option. You must provide your SSH public key.

Authorized keys (optional): Paste your public key from ~/.ssh/id_rsa.pub here. This will allow you to access the server through a NodePort. Read the EKC development guide for more information.

Jupyter port (required): The port on which to expose the Jupyter server. The default is 8888. If you want to use a different port, you must also change the port in the Entrypoint override.

Enable Jupyter URL path (optional): If you want to access Jupyter through a subdomain path, e.g. https://myname.icedc.se/jupyter, enable this option. The subdomain path must be unique.

Path at subdomain (optional): The path at the subdomain to access Jupyter, e.g. jupyter.

Port for root path (optional): The port on which to expose another service than Jupyter (e.g. a custom HTTP server). The default is 8000.

Create default HTTP server (optional): If you want to create a default HTTP server on the root path, enable this option. The server will be accessible through the subdomain.

Jupyter config file (optional): These contents will be saved to /root/.jupyter/jupyter_lab_config.py. Use this file to configure Jupyter, for example, to access an HTTP server using jupyter-server-proxy.

Hardware

Requested CPU (required): The minimum number of CPU cores to request for the Jupyter server, in milli-CPU units (m). The default is 1000m, which is equivalent to one CPU core. Additional cores may be used freely depending on the server load, up to a maximum of 64 cores.

Memory limit (required): The maximum amount of memory the Jupyter server may use, in Gibibytes (Gi).

Attach GPU (optional): If you want to use an Nvidia GPU, enable this option. Select the GPU type and the number of GPUs to request. You must use a Docker image that has Nvidia CUDA pre-installed.

Persistent storage

Storage size (required): The amount of persistent storage to request for the Jupyter server, in Gibibytes (Gi). The storage volume will be mounted in the Jupyter home directory.

Jupyter home directory path(s) (optional): If you want to use different home directory paths than /home/jovyan, /tf, and /root, specify them here. This is useful if your Docker image has a different default home directory.

Existing shared volumes (optional): If you want to use existing persistent storage that is shared with other deployments in the same namespace, specify the volumes here as a comma-separated list, e.g. vol1,vol2. A volume named vol1 will be mounted at /mnt/vol1, and so on.

Mount path for shared volumes (optional): The path at which to mount the shared volumes. The default is /mnt.

S3 storage

S3 storage (optional): S3 buckets are useful for storing large files, such as datasets, models, and checkpoints. Unlike persistent storage, S3 buckets can be dynamically resized and shared with other software platforms.

To access S3 storage through boto3 or rclone, provide your credentials:

S3 endpoint: s3.ice.ri.se

S3 access key: 20 characters

S3 secret key: 40 characters

See the documentation for S3 Access keys for more information.

The credentials will be saved to /root/.aws/credentials.
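
The file uses the standard AWS credentials format, for example:

/root/.aws/credentials
[default]
aws_access_key_id = <your 20-character access key>
aws_secret_access_key = <your 40-character secret key>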

Commands

Autostart script (optional): Script to install your workload and start it. Runs detached at container post start and does not block Jupyter from starting.

  • The script is run from the file ~/autostart.sh.
  • Output is saved to file ~/autostart.log.

In this example, we clone a repository and autostart a Python script. You can clone private git repositories using access tokens.

Autostart script
git clone https://github.com/myname/ml-project.git || true # ignore errors if repo already exists
cd ml-project
python ./my-training-script.py
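
To clone a private repository, you can embed an access token in the clone URL (the username, token, and repository below are placeholders):

Clone with an access token
git clone https://<USERNAME>:<ACCESS_TOKEN>@github.com/myname/private-ml-project.git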

Entrypoint override (optional): This field can be used to override the Docker image ENTRYPOINT. The command must block the container from exiting. It is typically used to install JupyterLab and start it.

Entrypoint override
pip install jupyterlab jupyter-server-proxy
jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --no-browser

or

Keep container running
sleep infinity

or leave blank to use the default Docker image ENTRYPOINT.

Usage

Access the Jupyter web interface through the subdomain you specified, e.g.

https://myname.icedc.se

and log in with the authentication token.

Screenshot of the JupyterLab login page.

See the JupyterLab or Jupyter Notebook documentation for more information.

Access an HTTP server

Suppose that you have developed a web application that you want to make accessible over the web.

You can run an HTTP server in the same pod as Jupyter and proxy web requests to it using the extension jupyter-server-proxy. The proxy requires you to log in to Jupyter, so it is not suitable for public access.

Set up the proxy in the Jupyter configuration file with the following settings:

~/.jupyter/jupyter_lab_config.py
c.ServerProxy.servers = {
    'web': {            # name of the proxy path
        'port': 8000,   # port to proxy '/web' path to
        'command': [    # optional command to run
           'python', '-m', 'http.server', '8000'
        ],
        'launcher_entry': {
            'title': 'Python HTTP Server',
        }
    }
}

Here, we start a Python HTTP server on port 8000, accessible at https://myname.icedc.se/web.

Read more about ServerProxy settings in the jupyter-server-proxy documentation.

Model serving through ASGI

Modern machine-learning applications are often served using ASGI servers, such as Uvicorn, or frameworks built on them, such as Gradio (both covered below).

Follow the previous instructions to access an HTTP server through the Jupyter server subdomain.

You must configure the ASGI server to use the correct root path, i.e. /web, for CSS and JavaScript files to be served correctly.

Uvicorn

server-start.sh
uvicorn main:app --root-path /web

Gradio

server-start.py
import gradio as gr

def greet(name):
    return "Hello " + name + "!"

demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch(server_name="0.0.0.0", root_path="/web", server_port=8000)

Long-running workloads

If you want to run long-running workloads, such as training AI models that take several days to complete, you should use checkpoints to regularly save the state of your model to persistent storage (the Jupyter home directory).

If/when the Jupyter container is restarted, the execution of your workload will be interrupted. Any work that has not been saved to persistent storage will be lost. By using checkpoints, you can resume your work from the last saved state.

After the container is restarted, the Autostart script will always be executed. You can use the script to automatically resume your work from the last checkpoint.

Read more about running .ipynb notebooks from the terminal with nbconvert, and using checkpoints in TensorFlow and PyTorch.
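
For example, an Autostart script can re-run a notebook from the terminal (my-training-notebook.ipynb is a placeholder; the checkpoint logic lives inside the notebook):

Autostart script
jupyter nbconvert --to notebook --execute --inplace my-training-notebook.ipynb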

Connect with Visual Studio Code

If you do not want to use the Jupyter web interface, you can connect to the Jupyter server using Visual Studio Code.

  1. When installing/upgrading the app in Rancher, provide your SSH public key from ~/.ssh/id_rsa.pub in the Authorized keys field.
  2. Follow the EKC usage page for Visual Studio Code.
  3. Optionally, you can install the Jupyter extension for Visual Studio Code to run Jupyter Notebooks directly in the editor.

S3 storage

You can access S3 storage from Python, for example, to save a PyTorch model:

pip install -U boto3 torch
import boto3, torch, io
# ... Create and train your model
s3 = boto3.resource("s3", endpoint_url="https://s3.ice.ri.se")
bucket = s3.Bucket("my-bucket")
buffer = io.BytesIO()
torch.save(model, buffer)
bucket.put_object(Key="my_model_file.json", Body=buffer.getvalue())

The model will be saved to s3://my-bucket/my_model_file.json. You can then similarly load the PyTorch model.
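
Loading the model back works the same way; a minimal sketch:

import boto3, torch, io
s3 = boto3.resource("s3", endpoint_url="https://s3.ice.ri.se")
buffer = io.BytesIO()
# Download the object into an in-memory buffer
s3.Bucket("my-bucket").download_fileobj("my_model_file.json", buffer)
buffer.seek(0)
model = torch.load(buffer)  # on recent PyTorch versions, add weights_only=False to load a full model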

Your S3 credentials will also be saved to /root/.config/rclone/rclone.conf, allowing you to access S3 storage from the command line:

rclone ls s3:my-bucket
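
You can also copy files to and from a bucket, for example to back up a local checkpoint directory (the paths are placeholders):

rclone copy ~/checkpoints s3:my-bucket/checkpoints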

Troubleshooting

My Jupyter app is not starting

If the Jupyter app is not starting, click on the app in Rancher and check the logs. If the logs are empty, the app is probably waiting for a large Docker image to be downloaded, or persistent storage to be created. If the logs contain errors, the app may not be configured correctly. Common issues are:

  • The subdomain is already in use by another app.

  • You are trying to mount a volume that does not exist or is not configured to be shared with multiple deployments.

  • The Docker image is not available or does not have apt or pip installed, which is required to install JupyterLab.

My Notebook execution has stopped

If your Notebook workload suddenly stops executing, the container has likely been restarted. This can happen if you have set a memory limit that is too low, or if the Kubernetes server running Jupyter has crashed.

  • If the container runs out of memory, it will be automatically restarted on the same Kubernetes server. Increase the Memory limit to avoid this.

  • When the Kubernetes server crashes, the Jupyter container will be automatically restarted on a working server. Save checkpoints to persistent storage, and use the Autostart script to automatically resume your work.