The Rose-Hulman department of Computer Science and Software Engineering has two compute servers equipped with GPUs. You’ll use these servers to train your models.
Access to these servers is restricted to the Rose-Hulman campus network. If you’re off campus you’ll need to first connect to the Rose-Hulman VPN. This EIT knowledge base article explains how to do this.
These are the available servers:
Server | CPU cores | RAM | GPUs | JupyterHub link |
---|---|---|---|---|
hinton | 32 | 128GB | 8x Nvidia 1080Ti, 11GB | https://hinton.csse.rose-hulman.edu:8000/ |
gebru | 48 | 192GB | 10x Nvidia Quadro RTX 6000, 24GB | https://gebru.csse.rose-hulman.edu:8000/ |
You may follow the links in the left-hand column to the server landing pages. These have tech specs, basic use instructions, and information about their namesake researchers.
You’re welcome to use either server. The gebru server is newer and faster, but I found that real-world performance was similar for the sorts of models we’ll be training in this assignment.
We’ll use the servers in three ways, all covered in the server instruction pages above:
- JupyterHub, to run notebooks
- `ssh`, to check current GPU usage, and perhaps to manage git
- `scp`, as needed

If you’re coming from a non-CSSE background and want help with any of these, please speak up!
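As a rough sketch of the ssh and scp usage mentioned above, the helper functions below could go in the `~/.bashrc` on your own machine. The host name is taken from the JupyterHub link in the table, on the assumption that ssh uses the same name. The username `your_username` and the function names `gpu_ssh`, `gpu_upload`, and `gpu_download` are hypothetical placeholders.

```shell
# Assumption: the ssh host name matches the one in the JupyterHub link.
GPU_HOST="hinton.csse.rose-hulman.edu"
# Hypothetical placeholder; replace with your own username.
GPU_USER="your_username"

# Open an interactive shell on the server.
gpu_ssh() {
    ssh "${GPU_USER}@${GPU_HOST}"
}

# Copy a local file into your home folder on the server.
gpu_upload() {
    scp "$1" "${GPU_USER}@${GPU_HOST}:"
}

# Copy a file from your server home folder into the current directory.
gpu_download() {
    scp "${GPU_USER}@${GPU_HOST}:$1" .
}
```

For example, `gpu_upload train.py` would copy a local file named `train.py` (a made-up file name) to your home folder on the server.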
The GPU servers share their home folders. So any files you create on one server will also be there on the other one.
(NOTE: the department-supplied Python environment isn’t working. I’ll update this section once I debug things.)
Our department sysadmin, Darryl Mouck, has prepared a Python environment for our class to use. You’ll need to follow some short instructions to connect to it:
mkdir -p ~/.local/share/jupyter/kernels
ln -s /opt/venv/kernels/csse461 ~/.local/share/jupyter/kernels/csse461
From now on, when you connect to the GPU servers via JupyterHub you will be able to choose the `csse461` Python kernel.
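If the kernel doesn’t show up in JupyterHub, one way to check from an ssh session is to ask Jupyter which kernels it can see. This is only a sketch: it assumes the `jupyter` command is available in your ssh shell on the server, which may not be the case, and `check_kernel` is a hypothetical helper name.

```shell
# Report whether Jupyter can see a kernel with the given name.
check_kernel() {
    name="$1"
    if ! command -v jupyter >/dev/null 2>&1; then
        echo "jupyter is not on your PATH"
        return 1
    fi
    # "jupyter kernelspec list" prints one line per installed kernel.
    if jupyter kernelspec list | grep -q -- "$name"; then
        echo "$name kernel found"
    else
        echo "$name kernel not found"
        return 1
    fi
}
```

Run it as `check_kernel csse461` after following the instructions above.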
If you would like to use the course Python kernel outside of a Jupyter Notebook (if you are running code at the command line), first load the kernel with this command:
Your command prompt will change to indicate that you are now using the `csse461` Python virtual environment.
It’s easy to create a Python environment on the CSSE GPU servers. First, ssh to one of the servers. From your home folder, run
to create a new Python virtual environment. Next, type
to switch to using your new venv. After creating a new Python environment, it’s always good practice to run this command before doing anything else:
Now you may install any Python packages you need with pip. At the minimum, you must install this package:
Finally you can install your venv onto JupyterHub with this command:
Of course, instead of `my_venv` you can call your environment whatever you want. Just tweak all of the commands above accordingly.
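The steps above can be sketched with the standard `venv`, `pip`, and `ipykernel` tools. Treat this as an assumption about the usual commands rather than the exact ones intended for this course: `make_venv` is a hypothetical helper name, and the required package is assumed to be `ipykernel`.

```shell
# Create a venv, upgrade pip, and register the venv as a Jupyter kernel.
# Usage: make_venv my_venv
make_venv() {
    env_name="${1:-my_venv}"

    # Create the virtual environment in the current directory.
    python3 -m venv "$env_name" || return 1

    # Switch to the new venv for the rest of this shell session.
    . "$env_name/bin/activate"

    # Upgrade pip before installing anything else.
    pip install --upgrade pip || return 1

    # Assumption: ipykernel is the package JupyterHub needs to run the venv.
    pip install ipykernel || return 1

    # Register the venv as a kernel for your user account.
    python -m ipykernel install --user --name "$env_name"
}
```

After pasting the function into an ssh session on one of the servers, `make_venv my_venv` runs all of the steps from your current folder. The new kernel should then appear in JupyterHub’s kernel list.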
As a general rule, when we’re doing GPU computing we’d rather not have two different jobs on the same card. When this happens both jobs tend to run slowly. Even worse, each GPU has a fixed amount of on-board memory. If the card gets full, all programs running on that card will crash.
So, it’s your job to check for a free GPU before you start a job. If you crash a classmate’s training run you owe them a cookie.
To check the status of the GPUs on your server, first connect to the server over `ssh`. Then run this command:
nvidia-smi
You’ll receive a snapshot report about GPU usage on the server. If you’d prefer a continuously-running dashboard display you can try `nvtop` instead of `nvidia-smi`.
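To turn the snapshot into a quick answer to “which card is free?”, one option is the CSV query mode of `nvidia-smi`. The sketch below assumes that the card with the least memory in use is the best candidate, which is a heuristic rather than a guarantee. `least_used_gpu` is a hypothetical helper name.

```shell
# Print the index of the GPU with the least memory in use.
# Expects "index, memory_used_MiB" rows on stdin, one per GPU.
least_used_gpu() {
    sort -t, -k2,2n | head -n 1 | cut -d, -f1
}

# On a GPU server, feed the helper live data from nvidia-smi.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits \
        | least_used_gpu
fi
```

It’s still worth glancing at the full `nvidia-smi` report before starting a long run, since a card that looks idle may be claimed by a classmate a moment later.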
If you have a question about using the GPU servers, please ask me! Email and Teams are best if you can’t ask in person.
If you think you need some system administration support, like installing a new Python package, or troubleshooting a server that has gone offline, please email both me and the CSSE department sysadmin, Darryl Mouck.