install and configure cuda 9.2 with cudnn 7.1 on ubuntu 16.04


cuda 9.2

  • nvidia driver 396.54
  • cuda 9.2 (do not install the bundled driver; install toolkit and samples only)
  • cudnn 7.1.4 for cuda 9.2 (for TensorRT, caffe, tensorflow, baidu anakin)

cuda 8.0

  • nvidia driver 384.130
  • cuda 8.0 (do not install the bundled driver; install toolkit and samples only)
  • cudnn 6.0.21 for cuda 8.0 (for caffe)


GUI vs tty

  • ctrl+alt+F7 to enter the GUI
  • ctrl+alt+F1-F6 to enter tty1-6, then log in with username and password

use fbterm instead of the default terminal when in tty1

sudo apt-get -y install fbterm
sudo fbterm

cuda and cudnn

  • download from cuda
  • download cudnn-9.2-linux-x64-v7.1.tgz from cudnn


install general dependencies

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev

# blas
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev

sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

sudo apt-get install git cmake build-essential

# fix missing 
#sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

GUI mode

# disable default ubuntu driver
sudo vim /etc/modprobe.d/blacklist-nouveau.conf

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot
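
After the reboot, it may be worth confirming that the nouveau module is really gone before installing the nvidia driver. The following is a minimal sketch, not part of the original steps: nouveau_loaded is a hypothetical helper name, and it only scans lsmod-style output for a module called nouveau.

```shell
# nouveau_loaded: hypothetical helper; reads `lsmod`-style text on stdin and
# succeeds (exit 0) if a module named exactly "nouveau" is listed.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Only query the live system when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded: re-check the blacklist file and initramfs"
    else
        echo "nouveau is not loaded"
    fi
fi
```

Matching on the first column avoids a false positive from modules that merely list nouveau as a user.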

tty mode

ctrl+alt+F1 to enter tty1, then log in with username and password

sudo fbterm

# stop the X server before installing the nvidia driver
sudo service lightdm stop

remove previous nvidia driver + cuda toolkit

# quote the pattern so the shell does not expand it against local files
sudo apt-get remove --purge 'nvidia-*'
# remove 8.0
sudo /usr/local/cuda-8.0/bin/
# remove 9.2
sudo /usr/local/cuda-9.2/bin/

install nvidia driver from ppa

DO NOT install the nvidia driver from a run file; otherwise we
get the login loop problem after reboot.


sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

apt-cache search 'nvidia-'
# nvidia-384
# nvidia-396
sudo apt-get -y install nvidia-396

# test 
sudo nvidia-smi

install cuda toolkit from run file

  1. DO NOT install nvidia driver, install cuda toolkit + samples.

  2. use default install path /usr/local/cuda-9.2

  3. use /usr/local/cuda-9.2/bin/ to uninstall

chmod +x ./

# "unsupported compiler" error ---> use --override
./ --override


Do you accept the previously read EULA? 
(accept/decline/quit): accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.37? (y)es/(n)o/(q)uit: no

Install the CUDA 9.2 Toolkit? 
(y)es/(n)o/(q)uit: yes

Enter Toolkit Location 
    [ default is /usr/local/cuda-9.2 ]:

Do you want to install a symbolic link at /usr/local/cuda? (y)es/(n)o/(q)uit: yes

Install the CUDA 9.2 Samples? 
(y)es/(n)o/(q)uit: yes

Enter CUDA Samples Location 
    [ default is /home/kezunlin ]: 

Installing the CUDA Toolkit in /usr/local/cuda-9.2 ...
Installing the CUDA Samples in /home/kezunlin ...

= Summary =

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-9.2
Samples:  Installed in /home/kezunlin

Please make sure that
 -   PATH includes /usr/local/cuda-9.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.2/lib64, or, add /usr/local/cuda-9.2/lib64 to /etc/ and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.2/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.2/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_6659.log
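
The warning above names a minimum driver version of 384.00. A rough sketch of checking the installed driver against that number follows. It assumes GNU sort with the -V (version sort) option, which Ubuntu ships, and version_ge is a hypothetical helper name; the nvidia-smi query options used are --query-gpu=driver_version and --format=csv,noheader.

```shell
# version_ge A B: hypothetical helper; succeeds if version A >= version B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="384.00"   # minimum named in the installer warning

# Only query the driver when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$required"; then
        echo "driver $installed is new enough"
    else
        echo "driver '$installed' is older than $required (or could not be read)"
    fi
fi
```

With the driver from the ppa step (396.54) the check should succeed.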

reboot to enter GUI

sudo reboot 

OK. The login loop problem no longer occurs.

add library path

system env

vim ~/.bashrc

# for cuda and cudnn
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

source ~/.bashrc
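
Each time the file is sourced again, plain exports like the ones above prepend the same directories once more. One possible guard is sketched below; prepend_path is a hypothetical helper name, not something cuda provides, and it is written in portable sh so it also works in bash.

```shell
# prepend_path VAR DIR: hypothetical helper; prepends DIR to the
# colon-separated variable VAR unless DIR is already listed there.
prepend_path() {
    eval "_pp_current=\${$1-}"       # read the variable whose name is in $1
    case ":$_pp_current:" in
        *":$2:"*) ;;                 # already present: nothing to do
        *) export "$1=$2${_pp_current:+:$_pp_current}" ;;
    esac
}

# for cuda and cudnn
prepend_path PATH /usr/local/cuda/bin
prepend_path LD_LIBRARY_PATH /usr/local/cuda/lib64
```

This form also avoids leaving a trailing ":" when LD_LIBRARY_PATH was empty beforehand.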

or by conf file

sudo vim /etc/

sudo ldconfig
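
When the conf file route is used, the linker cache can be inspected with ldconfig -p to see whether the cuda runtime was picked up. A small sketch, assuming the hypothetical helper name cache_has_lib and the usual "name (arch) => path" layout of the ldconfig -p listing:

```shell
# cache_has_lib NAME: hypothetical helper; reads `ldconfig -p`-style text on
# stdin and succeeds if some entry's library name starts with NAME.
cache_has_lib() {
    awk -v name="$1" 'index($1, name) == 1 { found = 1 } END { exit !found }'
}

# Only query the live cache when ldconfig is on PATH.
if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p | cache_has_lib libcudart.so; then
        echo "libcudart is in the linker cache"
    else
        echo "libcudart not found: check the conf file and re-run ldconfig"
    fi
fi
```

If only LD_LIBRARY_PATH is set in .bashrc, the library will not appear in the cache, and that is expected.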



nvidia-smi
Tue Sep 18 10:35:55 2018       
| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  GeForce GTX 1060    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   58C    P0    31W /  N/A |    288MiB /  6078MiB |      0%      Default |

| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|    0      1636      G   /usr/lib/xorg/Xorg                           164MiB |
|    0      2569      G   compiz                                        40MiB |
|    0      4828      G   ...-token=2DAB0000EFF3321D4D304928FA64B811    81MiB |


cat /proc/driver/nvidia/version


nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148
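
To use this check in a script, the release number can be pulled out of the nvcc -V output. A minimal sketch, assuming the output keeps the "release X.Y," wording shown above; nvcc_release is a hypothetical helper name.

```shell
# nvcc_release: hypothetical helper; reads `nvcc -V` output on stdin and
# prints the release number (for example 9.2).
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

expected="9.2"

# Only run nvcc when it is on PATH.
if command -v nvcc >/dev/null 2>&1; then
    found="$(nvcc -V | nvcc_release)"
    if [ "$found" = "$expected" ]; then
        echo "nvcc reports release $found"
    else
        echo "nvcc reports release '$found', expected $expected"
    fi
fi
```

A mismatch usually means PATH still points at another toolkit, such as a leftover cuda 8.0.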


cd ~/NVIDIA_CUDA-9.2_Samples/1_Utilities/deviceQuery
make
./deviceQuery


./deviceQuery Starting...

    CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1060"
    CUDA Driver Version / Runtime Version          9.2 / 9.2
    CUDA Capability Major/Minor version number:    6.1
    Total amount of global memory:                 6078 MBytes (6373572608 bytes)
    (10) Multiprocessors, (128) CUDA Cores/MP:     1280 CUDA Cores
    GPU Max Clock rate:                            1733 MHz (1.73 GHz)
    Memory Clock rate:                             4004 Mhz
    Memory Bus Width:                              192-bit
    L2 Cache Size:                                 1572864 bytes
    Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
    Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
    Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
    Total amount of constant memory:               65536 bytes
    Total amount of shared memory per block:       49152 bytes
    Total number of registers available per block: 65536
    Warp size:                                     32
    Maximum number of threads per multiprocessor:  2048
    Maximum number of threads per block:           1024
    Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
    Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
    Maximum memory pitch:                          2147483647 bytes
    Texture alignment:                             512 bytes
    Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
    Run time limit on kernels:                     Yes
    Integrated GPU sharing Host Memory:            No
    Support host page-locked memory mapping:       Yes
    Alignment requirement for Surfaces:            Yes
    Device has ECC support:                        Disabled
    Device supports Unified Addressing (UVA):      Yes
    Device supports Compute Preemption:            Yes
    Supports Cooperative Kernel Launch:            Yes
    Supports MultiDevice Co-op Kernel Launch:      Yes
    Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
    Compute Mode:
        < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 1
Result = PASS

we get Result = PASS.
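
The same check can be scripted by looking for the final result line. A sketch, assuming the samples sit in the default location used above and deviceQuery has already been built; query_passed is a hypothetical helper name.

```shell
# query_passed: hypothetical helper; reads deviceQuery output on stdin and
# succeeds only if a line starts with "Result = PASS".
query_passed() {
    grep -q '^Result = PASS'
}

sample_dir="$HOME/NVIDIA_CUDA-9.2_Samples/1_Utilities/deviceQuery"

# Only run the sample when the binary exists.
if [ -x "$sample_dir/deviceQuery" ]; then
    if "$sample_dir/deviceQuery" | query_passed; then
        echo "deviceQuery passed"
    else
        echo "deviceQuery did not pass"
    fi
fi
```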

install cudnn

download cudnn-9.2-linux-x64-v7.1.tgz for ubuntu 16.04

  • copy include to /usr/local/cuda-9.2/include
  • copy lib64 to /usr/local/cuda-9.2/lib64


tar -xzvf cudnn-9.2-linux-x64-v7.1.tgz 
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/* /usr/local/cuda/lib64/
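
To confirm which cudnn version ended up under /usr/local/cuda, the version macros in cudnn.h can be read back. This is a sketch under the assumption that the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as cudnn 7 headers do; cudnn_version is a hypothetical helper name.

```shell
# cudnn_version HEADER: hypothetical helper; prints MAJOR.MINOR.PATCHLEVEL
# from the CUDNN_* #define lines of a cudnn.h file.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

header=/usr/local/cuda/include/cudnn.h

# Only read the header when it has been copied into place.
if [ -r "$header" ]; then
    echo "cudnn version: $(cudnn_version "$header")"
fi
```

For the archive used here the expected result is 7.1.4.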



  • 20180917: created.

Author: kezunlin
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If reproduced, please credit the source kezunlin!