Connecting to Rose-Hulman's GPU servers

The Rose-Hulman Department of Computer Science and Software Engineering has two compute servers equipped with GPUs. You'll use these servers to train your models.

Access to these servers is restricted to the Rose-Hulman campus network. If you’re off campus you’ll need to first connect to the Rose-Hulman VPN. This EIT knowledge base article explains how to do this.

These are the available servers:

Server   CPU cores   RAM     GPUs                               JupyterHub link
hinton   32          128GB   8x Nvidia 1080Ti, 11GB             https://hinton.csse.rose-hulman.edu:8000/
gebru    48          192GB   10x Nvidia Quadro RTX 6000, 24GB   https://gebru.csse.rose-hulman.edu:8000/

You may follow the links in the left-hand column to the server landing pages. These have tech specs, basic use instructions, and information about their namesake researchers.

You’re welcome to use either server. The gebru server is newer and faster, but I found that real-world performance was similar for the sorts of models we’ll be training in this assignment.

Working on the GPU servers

We’ll use the servers in three ways, all covered in the server instruction pages above:

If you’re coming from a non-CSSE background and want help with any of these, please speak up!

The GPU servers share their home folders, so any files you create on one server will also be available on the other.

First-time Setup

(NOTE: the department-supplied Python environment isn’t working. I’ll update this section once I debug things.)

Our department sysadmin, Darryl Mouck, has prepared a Python environment for our class to use. You’ll need to follow some short instructions to connect to it:

  1. ssh to either GPU server
  2. Run the following two commands, exactly as written:
mkdir -p ~/.local/share/jupyter/kernels
ln -s /opt/venv/kernels/csse461 ~/.local/share/jupyter/kernels/csse461

From now on, when you connect to the GPU servers via JupyterHub you will be able to choose the csse461 Python kernel.

If you would like to use the course Python kernel outside of a Jupyter Notebook (if you are running code at the command line), first load the kernel with this command:

source /opt/venv/kernels/csse461/bin/activate

Your command prompt will change to indicate that you are now using the csse461 Python virtual environment.
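If you're ever unsure which Python environment a notebook or shell is actually using, a quick generic check (not specific to our servers) is to inspect `sys.executable` and `sys.prefix`:

```python
# Print which Python interpreter and environment are currently active.
# In a notebook running the csse461 kernel, these paths should point
# into the course environment rather than the system Python.
import sys

print(sys.executable)  # path to the running interpreter
print(sys.prefix)      # root directory of the active environment
```

If the printed paths point at the system Python instead of the environment you expected, the wrong kernel is selected.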

Creating Your own Python Environment

It’s easy to create a Python environment on the CSSE GPU servers. First, ssh to one of the servers. From your home folder, run

python -m venv my_venv

to create a new Python virtual environment. Next, type

source my_venv/bin/activate

to switch to using your new venv. After creating a new Python environment, it’s always good practice to run this command before doing anything else:

pip install --upgrade pip

Now you may install any Python packages you need with pip. At the minimum, you must install this package:

pip install ipykernel

Finally you can install your venv onto JupyterHub with this command:

python -m ipykernel install --user --name=my_venv

Of course, instead of my_venv you can call your environment whatever you want. Just tweak all of the commands above accordingly.
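For convenience, here is the whole sequence above collected into one copy-pasteable block, using the placeholder name my_venv:

```shell
# Create, activate, and register a personal Python environment.
python -m venv my_venv                                # create the venv
source my_venv/bin/activate                           # switch into it
pip install --upgrade pip                             # good practice: upgrade pip first
pip install ipykernel                                 # required to register with JupyterHub
python -m ipykernel install --user --name=my_venv     # expose it as a Jupyter kernel
```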

Server Etiquette

As a general rule, when we’re doing GPU computing we’d rather not have two different jobs on the same card. When this happens, both jobs tend to run slowly. Even worse, each GPU has a fixed amount of on-board memory; if the card’s memory fills up, all programs running on that card will crash.

So, it’s your job to check for a free GPU before you start a job. If you crash a classmate’s training run you owe them a cookie.

To check the status of the GPUs on your server, first connect to the server over ssh. Then run this command:

nvidia-smi

You’ll receive a snapshot report about GPU usage on the server. If you’d prefer a continuously-running dashboard display you can try nvtop instead of nvidia-smi.
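Once you've found a free card with nvidia-smi, one common way to keep your job on that card (a general CUDA convention, not anything specific to these servers) is to set CUDA_VISIBLE_DEVICES before importing your deep learning framework:

```python
# Pin this process to a single GPU. The index "3" is hypothetical:
# use whichever card nvidia-smi showed as free. Set this BEFORE
# importing torch/tensorflow so the framework only ever sees that card.
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "3"

# Frameworks imported after this point see only GPU 3 (as device 0),
# so your training run can't spill onto a card a classmate is using.
```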

Getting Help

If you have a question about using the GPU servers, please ask me! Email and Teams are best if you can’t ask in person.

If you think you need some system administration support, like installing a new Python package, or troubleshooting a server that has gone offline, please email both me and the CSSE department sysadmin, Darryl Mouck.