Installing TensorFlow 2 with GPU support on Ubuntu 20.04 LTS

Intro

According to https://www.tensorflow.org/install and https://www.tensorflow.org/install/gpu the following has to be installed:

  • python: 3.5 – 3.7
  • pip: > 19.0

NVIDIA software:

  • NVIDIA GPU drivers: > 418.x
  • CUDA Toolkit: 10.1
  • cuDNN SDK: >= 7.6

Please also note:

  • CUDA 10.1 requires GCC <= 8

At time of writing the following versions were available:

Install compatible versions of python and gcc

Ubuntu 20.04 LTS comes with:

  • python: 3.8
  • gcc: 9

So the first thing we need to do is to install compatible versions of python and gcc - i.e. python 3.7 and gcc 8.

Install gcc's. But let’s use gcc 9 for now as it will be used to install GPU Drivers.

> sudo apt -y install build-essential
> sudo apt -y install gcc-8 g++-8 gcc-9 g++-9

> sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 8 --slave /usr/bin/g++ g++ /usr/bin/g++-8
> sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 9 --slave /usr/bin/g++ g++ /usr/bin/g++-9

> sudo update-alternatives --config gcc

There are 2 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path            Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-9   9         auto mode
  1            /usr/bin/gcc-8   8         manual mode
  2            /usr/bin/gcc-9   9         manual mode

Press <enter> to keep the current choice[*], or type selection number: 0
update-alternatives: using /usr/bin/gcc-9 to provide /usr/bin/gcc (gcc) in auto mode

There are multiple ways how to manage python versions and envs. I’ve selected pyenv + pyenv-virtualenv

> sudo apt-get install -y zlib1g-dev libbz2-dev libreadline-dev libssl-dev libsqlite3-dev libffi-dev
> pyenv install 3.7.7
> pyenv virtualenv 3.7.7 tf2
> pyenv global tf2

> python -V
Python 3.7.7

NVIDIA GPU drivers

Get and run installer.

> sudo bash NVIDIA-Linux-x86_64-440.82.run

ERROR: The Nouveau kernel driver is currently in use by your system.  This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding.  Please consult the NVIDIA
         driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.

… follow the instructions to blacklist Nouveau kernel driver. Then run:

> sudo update-initramfs -u
> sudo reboot

Retry installation, ignore Xorg warnings and configuration if you are working from command line only.

> sudo bash NVIDIA-Linux-x86_64-440.82.run

Note you still need gcc-9 to install drivers. At the end you should get Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 440.82) is now complete.

Verify installation:

> nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  Off  | 00000000:09:00.0 Off |                  N/A |
|  0%   59C    P0    38W / 180W |      0MiB /  8118MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

CUDA Toolkit and cuDNN

Switch gcc to gcc-8

> sudo update-alternatives --config gcc

There are 2 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path            Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-9   9         auto mode
  1            /usr/bin/gcc-8   8         manual mode
  2            /usr/bin/gcc-9   9         manual mode

Press <enter> to keep the current choice[*], or type selection number: 1
update-alternatives: using /usr/bin/gcc-8 to provide /usr/bin/gcc (gcc) in manual mode

gcc --version
gcc (Ubuntu 8.4.0-3ubuntu2) 8.4.0

Run installer, accept license, deselect Driver

> sudo bash cuda_10.1.243_418.87.00_linux.run

x CUDA Installer                                                               x
x - [ ] Driver                                                                 x
x      [ ] 418.87.00                                                           x
x + [X] CUDA Toolkit 10.1                                                      x
x   [X] CUDA Samples 10.1                                                      x
x   [X] CUDA Demo Suite 10.1                                                   x
x   [X] CUDA Documentation 10.1                                                x
x   Options                                                                    x
x   Install                                                                    x

Please make sure that

 -   PATH includes /usr/local/cuda-10.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root

Unzip the cuDNN package

> tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz

Copy the following files into the CUDA Toolkit directory, and change the file permissions.

> sudo cp cuda/include/cudnn.h /usr/local/cuda/include
> sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
> sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

I had to make the following changes:

> rm /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7 /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so
> sudo ln -s libcudnn.so.7.6.5 /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7
> sudo ln -s libcudnn.so.7 /usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so
> sudo ldconfig

TensorFlow 2 with GPU support

> pip install --upgrade pip
> pip install tensorflow
> pip install tensorflow-gpu

Check GPU support is enabled, and you can access your GPU:

> python -c "import tensorflow as tf; tf.config.list_physical_devices('GPU')"

Ignore the warning successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero Also to draft benchmark your GPU you still (repo is not maintained) can do the following:

> git clone https://github.com/tensorflow/benchmarks
> python benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_gpus=1 --model resnet50 --batch_size 64

So you can compare your results with some well-known numbers. You also can run nvidia-smi in parallel to check GPU load.

> watch nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  Off  | 00000000:09:00.0 Off |                  N/A |
| 39%   63C    P2   179W / 180W |   7999MiB /  8118MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     84404      C   ...e/ubuntu/.pyenv/versions/tf2/bin/python  7989MiB |
+-----------------------------------------------------------------------------

Comments