The H2P cluster supports various versions of Python. For general use, I suggest the following modules:
Some packages (e.g. TensorFlow) must be handled in a more specific way; I will discuss them at the end. All of the modules are Anaconda distributions and support the pip and conda installation commands. The tricky part is that no user has the privileges to install packages into the system-wide Python installations. If you need to install your own packages, you have a few options:
- pip install --user &lt;package&gt;: installs into a local directory that you have permission to write to.
- Manage your own environment using virtualenvwrapper.
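If you go the pip install --user route, Python's standard library can tell you exactly where those packages end up. A quick sketch using only the stdlib site module:

```python
import site

# `pip install --user` drops packages into the per-user site-packages
# directory (honored unless PYTHONNOUSERSITE is set).
user_site = site.getusersitepackages()
print("user packages land in:", user_site)
```

On Linux this is typically something like ~/.local/lib/python3.x/site-packages.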
I strongly encourage using virtual environments, which give you complete control over which versions of packages are installed. Here is a small example of installing PyTorch (CPU version) into a virtual environment using virtualenvwrapper.
```
$ module load python/3.7 venv/wrap # Available for the versions of Python listed above
$ mkvirtualenv pytorch
$ workon pytorch
$ pip install numpy torch torchvision
$ python
Python 3.7.0 (default, Jun 28 2018, 13:15:42)
[GCC 7.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.6022, 0.5194, 0.3726],
        [0.0010, 0.7181, 0.7031],
        [0.7442, 0.5017, 0.2003],
        [0.1068, 0.4622, 0.2478],
        [0.8989, 0.8953, 0.0129]])
>>>
```
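If you are ever unsure whether a given Python process is actually running inside your virtual environment, a quick stdlib-only check is to compare sys.prefix against sys.base_prefix (they differ inside an environment on Python 3.3+; this is a general Python behavior, not something specific to the cluster):

```python
import sys

# Inside a virtual environment, sys.prefix points at the environment,
# while sys.base_prefix still points at the interpreter it was built from.
in_venv = sys.prefix != sys.base_prefix
print("running inside a virtual environment:", in_venv)
```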
To list your environments, run workon with no arguments. The nice thing about this tool is that you can "hot-swap" Python environments: if you want to switch from a pytorch environment to a tensorflow environment, simply run workon tensorflow.
In the section above, I covered the CPU version of PyTorch; the CPU version of TensorFlow can be installed in a similar way. However, if you want to use these packages on GPUs, you need to be aware of a few things. First, both packages depend on CUDA. CUDA has 2 components:
- Software: module spider cuda (use cuda/9.1.85 when possible)
- Driver: running on the GPU nodes to drive CUDA software
For PyTorch, if you are installing it yourself, use the CUDA 9.0 build. For TensorFlow, the installation is more tightly coupled to the CUDA version, and I highly suggest you use the version I installed. If you simply want to use TensorFlow: module load cuda/9.1.85 tensorflow/1.12.0-py36 (it will ONLY work on GPU nodes! A py35 build also exists for Python 3.5). Additionally, if you want to install a wheel file into your own environment, I created an environment variable TENSORFLOW_WHEEL_DIR for you to look in. If you decide that you want to try to compile it yourself, start poking around: /ihome/crc/build/tensorflow/tensorflow-1.12.0
- Load the modules in bmooreii-build.sh (it won't work with Python 3.7!), then run the following commands (you will need their output during the ./configure step)
- echo $CUDA_ROOT
- ls $CUDA_ROOT/lib64
- Run ./configure, which is an interactive script
- For compute capabilities, make sure to use 3.5,6.1,7.0
- Leave the part about optimization alone
- If you want the code to work on all of the GPU nodes, set the lines build:opt --copt= and build:opt --host_copt= equal to -march=corei7-avx
- bazel build --jobs=1 --config=opt --config=cuda --config=mkl //tensorflow/tools/pip_package:build_pip_package
- Grab a cup of coffee, this is going to take a while
- This will fail at some point because the login nodes don't have the CUDA driver running
- Check out bmooreii-build.slurm, which simply submits a job to finish the compilation
- Finally build the wheel file
- bash bazel-bin/tensorflow/tools/pip_package/build_pip_package $(pwd)/whl_file
- Wheel file can be installed via pip
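For reference, the -march settings mentioned in the steps above live in the .tf_configure.bazelrc file that ./configure generates in the source tree. After editing, the relevant lines would look roughly like this (a sketch of the two edited lines, not verbatim file contents):

```
build:opt --copt=-march=corei7-avx
build:opt --host_copt=-march=corei7-avx
```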
The following wheel files are available to pip install:
- GPU Only: /ihome/crc/build/tensorflow/tensorflow-1.14.0/whl_file/tensorflow-1.14.0-cp37-cp37m-linux_x86_64-cuda_10.0.130.whl
- GPU Only: /ihome/crc/build/tensorflow/tensorflow-2.0.0/whl_file/tensorflow-2.0.0-cp37-cp37m-linux_x86_64.whl
- CPU Only: /ihome/crc/build/tensorflow/tensorflow-2.0.0/whl_file/cpu/tensorflow-2.0.0-cp37-cp37m-linux_x86_64.whl
To access a virtual environment from JupyterHub, you need to install ipykernel. In the following code snippet, ray is an arbitrary name; you can use any name you like (but make sure to replace all instances of ray). /location/to/python is the version of Python you want to use; if you want the one from python/3.7.0, you can omit this option. To get the path, use which python.
```
module purge
module load python/3.7.0 venv/wrap
mkvirtualenv -p /location/to/python ray
workon ray
pip install ipykernel
python -m ipykernel install --user --name=ray
```
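To double-check which kernels JupyterHub will see, note that ipykernel's --user install writes a kernel spec under ~/.local/share/jupyter/kernels/&lt;name&gt;/kernel.json. A small stdlib-only sketch to list whatever is installed there (the ray name above is just the example environment name):

```python
import json
from pathlib import Path

# `python -m ipykernel install --user --name=ray` writes its spec to
# ~/.local/share/jupyter/kernels/ray/kernel.json; list everything present.
kernel_root = Path.home() / ".local" / "share" / "jupyter" / "kernels"
if kernel_root.is_dir():
    for spec in sorted(kernel_root.glob("*/kernel.json")):
        info = json.loads(spec.read_text())
        print(spec.parent.name, "->", info.get("display_name"))
else:
    print("no user-installed kernels found")
```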