Athena User Guide

Athena User Guide 2018-05-31T13:56:39+00:00

Instructions:

To register as a new Athena user, click Registration, then wait for the DGX Admin to email you your username and password.

If you are new to docker/nvidia-docker or the DGX-1, please refer to this user guide: https://sutd-athena.github.io

House rules:

  1. Run only production code on Athena. Do not debug on Athena; use the Dev Boxes (Apollo) instead. To request access, write to the DGX Admin at dgx-admin@sutd.edu.sg.
  2. Always name your docker container [username]_[containername], for example “chenglim_tensorflow1”, so that the DGX Admin can contact you easily.
  3. Shut down containers after use to free GPU memory. Don’t leave a container running while you are not working.
    1. $ docker stop [container_id]
    2. $ docker rm [container_id]
  4. Always mount your /dgxdata/[username] directory in your docker container, and be considerate when using / (root) disk space.
  5. If you are working on Athena, check Slack and email regularly for announcements.
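
House rules 2 and 4 can be sketched as two small shell helpers. The function names container_name and data_mount are hypothetical, introduced here only for illustration; they build strings and do not call docker. The container-side path /data is an assumption borrowed from the Quick Tutorial.

```shell
# Hypothetical helpers (not part of Athena or docker): they only build strings.

# Build a container name in the [username]_[containername] form (house rule 2).
container_name() {
    printf '%s_%s\n' "$1" "$2"
}

# Build the -v argument that mounts /dgxdata/[username] (house rule 4).
# The container-side path /data is an assumption taken from the Quick Tutorial.
data_mount() {
    printf '/dgxdata/%s:/data\n' "$1"
}

container_name chenglim tensorflow1   # prints chenglim_tensorflow1
data_mount chenglim                   # prints /dgxdata/chenglim:/data
```

The two strings can then be passed to --name and -v when starting a container.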

Quick Tutorial:

To start a new container

$ NV_GPU=4,5 nvidia-docker run -d --user 0 -it -v /dgxdata/chenglim:/data --name clteo_tf nvcr.io/nvidia/tensorflow:18.03-py3
  1. “NV_GPU=4,5” restricts the container to GPUs 4 and 5 only
  2. “-d” runs the container in detached (background) mode
  3. “--user 0” runs the container process as uid 0 (root)
  4. “-it” interactive mode with a tty
  5. “-v /dgxdata/chenglim:/data” maps the host path /dgxdata/chenglim to the container path /data
  6. “--name clteo_tf” gives the container a name, which later commands can refer to
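
The command above can be parameterised so that each user fills in their own values. The following is a minimal sketch, assuming a hypothetical helper start_cmd whose argument order is chosen here for illustration. It only prints the command line so it can be reviewed before running; it does not start a container itself.

```shell
# Sketch only: prints the nvidia-docker command instead of running it.
# start_cmd and its argument order are hypothetical, not an Athena tool.
start_cmd() (
    gpus=$1     # GPU indices to expose, e.g. 4,5
    user=$2     # Athena username, used for the /dgxdata/[username] mount
    name=$3     # container name
    image=$4    # image to run
    printf 'NV_GPU=%s nvidia-docker run -d --user 0 -it -v /dgxdata/%s:/data --name %s %s\n' \
        "$gpus" "$user" "$name" "$image"
)

start_cmd 4,5 chenglim clteo_tf nvcr.io/nvidia/tensorflow:18.03-py3
```

With these arguments the helper prints the same command line shown at the start of this section.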

To enter a running container from the host machine

$ docker exec -it clteo_tf /bin/bash 
  1. “-it” interactive mode with tty
  2. “/bin/bash” runs /bin/bash inside the container as the interactive shell
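
docker exec only works on a running container, so it can help to check first. The following is a minimal sketch, assuming a hypothetical helper named enter_container; it uses docker ps with --format to list the names of running containers before opening a shell.

```shell
# Hypothetical helper: open a shell in a container only if it is running.
enter_container() {
    # `docker ps` lists running containers; --format prints one name per line.
    # grep -x requires a whole-line match, -F treats the name as a literal string.
    if docker ps --format '{{.Names}}' | grep -qxF -- "$1"; then
        docker exec -it "$1" /bin/bash
    else
        echo "container '$1' is not running" >&2
        return 1
    fi
}
```

Calling enter_container clteo_tf opens the shell when the container is up, and otherwise prints a message and returns a non-zero status.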

To stop and remove a container

$ docker stop clteo_tf 
$ docker rm clteo_tf
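
The two commands can be combined so that the container is removed only if it stopped cleanly. This is a minimal sketch, assuming a hypothetical helper named remove_container.

```shell
# Hypothetical helper: stop a container, then remove it only if the stop succeeded.
remove_container() {
    docker stop "$1" && docker rm "$1"
}
```

For example, remove_container clteo_tf runs the two commands above in sequence.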