Install and configure TensorRT 4 on Ubuntu 16.04

Guide

version

Ubuntu 16.04 (only 14.04 and 16.04 are supported; Windows is not supported)
~~CUDA 8.0~~ (only 8.0, 9.0 and 9.2 are supported)
CUDA 9.2
cuDNN 7.1.4 (only 7.1 is supported)
TensorRT 4.0.1.6
TensorFlow-gpu v1.4+
Python 3.5.2 (2.7 or 3.5)
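
The platform constraint in the list above can be checked before downloading anything. This is a minimal sketch, assuming the release is read from an os-release style file; the function name `is_supported_ubuntu` is hypothetical and not part of TensorRT.

```shell
# Hypothetical helper: succeeds only on Ubuntu 14.04 or 16.04.
# $1: path to an os-release style file (defaults to /etc/os-release).
is_supported_ubuntu() {
    osr_file=${1:-/etc/os-release}
    osr_id=$(sed -n 's/^ID=//p' "$osr_file" 2>/dev/null | tr -d '"')
    osr_ver=$(sed -n 's/^VERSION_ID=//p' "$osr_file" 2>/dev/null | tr -d '"')
    [ "$osr_id" = "ubuntu" ] || return 1
    case "$osr_ver" in
        14.04|16.04) return 0 ;;
        *) return 1 ;;
    esac
}

if is_supported_ubuntu; then
    echo "supported Ubuntu release for TensorRT 4"
else
    echo "unsupported platform for TensorRT 4"
fi
```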

TensorRT support matrix

4.0.1.6
5.0.2.6

hardware precision matrix

See the hardware precision support matrix in tensorrt-support-matrix.

ubuntu

GeForce GTX 1060 (fp32, int8; no fp16)

jetson products

Jetson TX1 (fp32, fp16)
Jetson TX2 (fp32, fp16)
Jetson AGX Xavier (fp32, fp16, int8, DLA)
Jetson Nano (Jetbot)

install

download and install

Download TensorRT-4.0.1.6.Ubuntu-16.04.4.x86_64-gnu.cuda-8.0.cudnn7.1.tar.gz from here, then unpack it and link it under /opt:

tar zxvf TensorRT-4.0.1.6.Ubuntu-16.04.4.x86_64-gnu.cuda-8.0.cudnn7.1.tar.gz 

ls TensorRT-4.0.1.6
bin  data  doc  graphsurgeon  include  lib  python  samples  targets  TensorRT-Release-Notes.pdf  uff

sudo mv TensorRT-4.0.1.6 /opt/
cd /opt
sudo ln -s TensorRT-4.0.1.6/ tensorrt

Update: after moving from CUDA 8.0 to CUDA 9.2, download TensorRT-4.0.1.6.Ubuntu-16.04.4.x86_64-gnu.cuda-9.2.cudnn7.1.tar.gz from here instead and repeat the same steps.
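
Since the tarball name encodes the TensorRT, CUDA and cuDNN versions, a small helper can confirm that the downloaded file matches the toolkit installed on the machine. This is only a sketch: `parse_trt_tarball` is a hypothetical name, and the pattern assumes the naming scheme of the two files mentioned above.

```shell
# Hypothetical helper: prints "<tensorrt> <cuda> <cudnn>" parsed from the file name.
# Prints nothing if the name does not follow the expected scheme.
parse_trt_tarball() {
    basename "$1" | sed -nE \
        's/^TensorRT-([0-9.]+)\.Ubuntu-.*\.cuda-([0-9.]+)\.cudnn([0-9.]+)\.tar\.gz$/\1 \2 \3/p'
}

parse_trt_tarball TensorRT-4.0.1.6.Ubuntu-16.04.4.x86_64-gnu.cuda-9.2.cudnn7.1.tar.gz
# prints: 4.0.1.6 9.2 7.1
```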

add lib to path

# the file name must end in .conf: /etc/ld.so.conf only includes /etc/ld.so.conf.d/*.conf
sudo vim /etc/ld.so.conf.d/tensorrt.conf
/opt/tensorrt/lib

sudo ldconfig

vim ~/.bashrc
export LD_LIBRARY_PATH=/opt/tensorrt/lib:$LD_LIBRARY_PATH

source ~/.bashrc
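
Re-sourcing ~/.bashrc with the export above adds the same directory to LD_LIBRARY_PATH again each time. One way around that is a guard that prepends only when the directory is missing. This is a sketch, and `prepend_ld_path` is a hypothetical helper name.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH only if it is not already listed.
prepend_ld_path() {
    new_dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$new_dir:"*) ;;  # already present, leave the variable unchanged
        *) LD_LIBRARY_PATH="$new_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

prepend_ld_path /opt/tensorrt/lib
prepend_ld_path /opt/tensorrt/lib   # second call is a no-op
echo "$LD_LIBRARY_PATH"
```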

python package

cd /opt/tensorrt/python
sudo pip2 install tensorrt-4.0.1.6-cp27-cp27mu-linux_x86_64.whl

cd /opt/tensorrt/python
sudo pip3 install tensorrt-4.0.1.6-cp35-cp35m-linux_x86_64.whl
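
The wheel to install depends on the Python version: cp27 for Python 2.7 and cp35 for Python 3.5. A minimal sketch of that mapping follows. `trt_wheel_for` is a hypothetical helper, and the file names are the two shipped in the python/ folder of this release.

```shell
# Hypothetical helper: print the TensorRT 4.0.1.6 wheel name for a "major.minor" Python version.
trt_wheel_for() {
    case "$1" in
        2.7) echo "tensorrt-4.0.1.6-cp27-cp27mu-linux_x86_64.whl" ;;
        3.5) echo "tensorrt-4.0.1.6-cp35-cp35m-linux_x86_64.whl" ;;
        *)   echo "no TensorRT 4.0.1.6 wheel for Python $1" >&2; return 1 ;;
    esac
}

# The version string could come from:
#   python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])'
trt_wheel_for 3.5
```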

uff package

cd /opt/tensorrt/uff 
sudo pip install uff-0.4.0-py2.py3-none-any.whl 

which convert-to-uff
/usr/local/bin/convert-to-uff

folder structure

include

tree include/
include/
├── NvCaffeParser.h
├── NvInfer.h
├── NvInferPlugin.h
├── NvOnnxConfig.h
├── NvOnnxParser.h
├── NvUffParser.h
└── NvUtils.h

lib

ls -al *.4.1.2
lrwxrwxrwx 1 kezunlin kezunlin       21 6月  12 15:42 libnvcaffe_parser.so.4.1.2 -> libnvparsers.so.4.1.2
-rwxrwxr-x 1 kezunlin kezunlin  2806840 6月  12 15:42 libnvinfer_plugin.so.4.1.2
-rwxrwxr-x 1 kezunlin kezunlin 80434488 6月  12 15:42 libnvinfer.so.4.1.2
-rwxrwxr-x 1 kezunlin kezunlin  3951712 6月  12 15:42 libnvparsers.so.4.1.2
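
To confirm that an install contains the libraries listed above, a short check over the lib directory can help. This is a sketch only: `check_trt_libs` is a hypothetical name, and the three library names are taken from the listing.

```shell
# Hypothetical helper: report whether each core TensorRT library exists in a directory.
# Returns non-zero if any of them is missing.
check_trt_libs() {
    lib_dir=$1
    lib_status=0
    for lib_name in libnvinfer libnvinfer_plugin libnvparsers; do
        # ls fails when the glob matches no file
        if ls "$lib_dir/$lib_name".so* >/dev/null 2>&1; then
            echo "ok       $lib_name"
        else
            echo "missing  $lib_name"
            lib_status=1
        fi
    done
    return $lib_status
}

check_trt_libs /opt/tensorrt/lib || echo "TensorRT libraries incomplete in /opt/tensorrt/lib"
```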

bin

tree bin
bin
├── download-digits-model.py
├── giexec
└── trtexec

sample

add envs

vim ~/.bashrc

# tensorrt cuda and cudnn
export CUDA_INSTALL_DIR=/usr/local/cuda
export CUDNN_INSTALL_DIR=/usr/local/cuda

compile all

cd samples/
make -j8

This builds every sample_xxx binary into the bin/ folder.

compile sampleMNIST

cd samples/sampleMNIST
ls 
Makefile  sampleMNIST.cpp
make -j8

error occurs

dpkg-query: no packages found matching cuda-cudart-[0-9]*
../Makefile.config:6: CUDA_INSTALL_DIR variable is not specified, using /usr/local/cuda- by default, use CUDA_INSTALL_DIR=<cuda_directory> to change.
../Makefile.config:9: CUDNN_INSTALL_DIR variable is not specified, using  by default, use CUDNN_INSTALL_DIR=<cudnn_directory> to change.

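Fix: set both variables in ~/.bashrc, pointing them at the directory where CUDA and cuDNN are actually installed on your machine, and reload the shell configuration before running make again.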

vim ~/.bashrc

# tensorrt cuda and cudnn
export CUDA_INSTALL_DIR=/opt/cuda
export CUDNN_INSTALL_DIR=/opt/cuda
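
Before re-running make, it can be worth checking that both variables point at a real install. The sketch below assumes the usual layout where the headers live in include/ (cuda_runtime.h for CUDA, cudnn.h for cuDNN). That layout and the helper name `check_install_dirs` are assumptions, not something taken from the sample Makefile.

```shell
# Hypothetical helper: verify that CUDA_INSTALL_DIR and CUDNN_INSTALL_DIR contain the expected headers.
check_install_dirs() {
    cuda_dir=${CUDA_INSTALL_DIR:-}
    cudnn_dir=${CUDNN_INSTALL_DIR:-}
    if [ -z "$cuda_dir" ] || [ ! -f "$cuda_dir/include/cuda_runtime.h" ]; then
        echo "CUDA_INSTALL_DIR='$cuda_dir' has no include/cuda_runtime.h" >&2
        return 1
    fi
    if [ -z "$cudnn_dir" ] || [ ! -f "$cudnn_dir/include/cudnn.h" ]; then
        echo "CUDNN_INSTALL_DIR='$cudnn_dir' has no include/cudnn.h" >&2
        return 1
    fi
    echo "CUDA and cuDNN headers found"
}
```

A typical use would be `check_install_dirs && make -j8` from the sample directory.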

make again

:
:
Compiling: sampleMNIST.cpp
Compiling: sampleMNIST.cpp
Linking: ../../bin/sample_mnist
Linking: ../../bin/sample_mnist_debug
# Copy every EXTRA_FILE of this sample to bin dir

test sample_mnist

./sample_mnist
Reading Caffe prototxt: ../../../data/mnist/mnist.prototxt
Reading Caffe model: ../../../data/mnist/mnist.caffemodel

Input:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@%.:@@@@@@@@@@@@
@@@@@@@@@@@@@: *@@@@@@@@@@@@
@@@@@@@@@@@@* =@@@@@@@@@@@@@
@@@@@@@@@@@% :@@@@@@@@@@@@@@
@@@@@@@@@@@- *@@@@@@@@@@@@@@
@@@@@@@@@@# .@@@@@@@@@@@@@@@
@@@@@@@@@@: #@@@@@@@@@@@@@@@
@@@@@@@@@+ -@@@@@@@@@@@@@@@@
@@@@@@@@@: %@@@@@@@@@@@@@@@@
@@@@@@@@+ +@@@@@@@@@@@@@@@@@
@@@@@@@@:.%@@@@@@@@@@@@@@@@@
@@@@@@@% -@@@@@@@@@@@@@@@@@@
@@@@@@@% -@@@@@@#..:@@@@@@@@
@@@@@@@% +@@@@@-    :@@@@@@@
@@@@@@@% =@@@@%.#@@- +@@@@@@
@@@@@@@@..%@@@*+@@@@ :@@@@@@
@@@@@@@@= -%@@@@@@@@ :@@@@@@
@@@@@@@@@- .*@@@@@@+ +@@@@@@
@@@@@@@@@@+  .:-+-: .@@@@@@@
@@@@@@@@@@@@+:    :*@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

Output:

0: 
1: 
2: 
3: 
4: 
5: 
6: **********
7: 
8: 
9:

Sample

compile all samples

cd samples/
make -j8

sample_mnist

The run itself is shown above, so only the library dependencies are listed here.

ldd sample_mnist
    linux-vdso.so.1 =>  (0x00007ffecd9f3000)
    libnvinfer.so.4 => /opt/tensorrt/lib/libnvinfer.so.4 (0x00007f48de6f2000)
    libnvparsers.so.4.1.2 => /opt/tensorrt/lib/libnvparsers.so.4.1.2 (0x00007f48de12c000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f48ddf24000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f48ddd20000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f48ddb03000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f48dd781000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f48dd478000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f48dd262000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f48dce98000)
    libcudnn.so.7 => /usr/local/cuda/lib64/libcudnn.so.7 (0x00007f48c8818000)
    libcublas.so.9.2 => /usr/local/cuda/lib64/libcublas.so.9.2 (0x00007f48c4dca000)
    libcudart.so.9.2 => /usr/local/cuda/lib64/libcudart.so.9.2 (0x00007f48c4b60000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f48e42bc000)

So the binary depends on libnvinfer.so, libnvparsers.so, libcudart.so, libcudnn.so and libcublas.so.
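
If one of these libraries cannot be resolved, ldd reports it as "not found" instead of printing a path. A small filter over the ldd output makes such entries easy to spot. This is a sketch, and `missing_libs` is a hypothetical helper name.

```shell
# Hypothetical helper: read ldd output on stdin and print the names of unresolved libraries.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Example with canned ldd-style input; on a real install use: ldd ./sample_mnist | missing_libs
printf '\tlibnvinfer.so.4 => not found\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f48dce98000)\n' | missing_libs
# prints: libnvinfer.so.4
```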

sample_onnx_mnist

./sample_onnx_mnist



---------------------------



@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@%.-@@@@@@@@@@@
@@@@@@@@@@@*-    %@@@@@@@@@@
@@@@@@@@@@= .-.  *@@@@@@@@@@
@@@@@@@@@= +@@@  *@@@@@@@@@@
@@@@@@@@* =@@@@  %@@@@@@@@@@
@@@@@@@@..@@@@%  @@@@@@@@@@@
@@@@@@@# *@@@@-  @@@@@@@@@@@
@@@@@@@: @@@@%   @@@@@@@@@@@
@@@@@@@: @@@@-   @@@@@@@@@@@
@@@@@@@: =+*= +: *@@@@@@@@@@
@@@@@@@*.    +@: *@@@@@@@@@@
@@@@@@@@%#**#@@: *@@@@@@@@@@
@@@@@@@@@@@@@@@: -@@@@@@@@@@
@@@@@@@@@@@@@@@+ :@@@@@@@@@@
@@@@@@@@@@@@@@@*  @@@@@@@@@@
@@@@@@@@@@@@@@@@  %@@@@@@@@@
@@@@@@@@@@@@@@@@  #@@@@@@@@@
@@@@@@@@@@@@@@@@: +@@@@@@@@@
@@@@@@@@@@@@@@@@- +@@@@@@@@@
@@@@@@@@@@@@@@@@*:%@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@


 Prob 0  0.00000: 
 Prob 1  0.00001: 
 Prob 2  0.00002: 
 Prob 3  0.00003: 
 Prob 4  0.00044: 
 Prob 5  0.00005: 
 Prob 6  0.00006: 
 Prob 7  0.00007: 
 Prob 8  0.00008: 
 Prob 9  0.99969: **********

sample_uff_mnist

./sample_uff_mnist
../../../data/mnist/lenet5.uff



---------------------------



@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@%.-@@@@@@@@@@@
@@@@@@@@@@@*-    %@@@@@@@@@@
@@@@@@@@@@= .-.  *@@@@@@@@@@
@@@@@@@@@= +@@@  *@@@@@@@@@@
@@@@@@@@* =@@@@  %@@@@@@@@@@
@@@@@@@@..@@@@%  @@@@@@@@@@@
@@@@@@@# *@@@@-  @@@@@@@@@@@
@@@@@@@: @@@@%   @@@@@@@@@@@
@@@@@@@: @@@@-   @@@@@@@@@@@
@@@@@@@: =+*= +: *@@@@@@@@@@
@@@@@@@*.    +@: *@@@@@@@@@@
@@@@@@@@%#**#@@: *@@@@@@@@@@
@@@@@@@@@@@@@@@: -@@@@@@@@@@
@@@@@@@@@@@@@@@+ :@@@@@@@@@@
@@@@@@@@@@@@@@@*  @@@@@@@@@@
@@@@@@@@@@@@@@@@  %@@@@@@@@@
@@@@@@@@@@@@@@@@  #@@@@@@@@@
@@@@@@@@@@@@@@@@: +@@@@@@@@@
@@@@@@@@@@@@@@@@- +@@@@@@@@@
@@@@@@@@@@@@@@@@*:%@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
10 eltCount
--- OUTPUT ---
0 => -2.75228	 : 
1 => -1.51534	 : 
2 => -4.11729	 : 
3 => 0.316925	 : 
4 => 3.73423	 : 
5 => -3.00593	 : 
6 => -6.18866	 : 
7 => -1.02671	 : 
8 => 1.937	 : 
9 => 14.8275	 : ***

Average over 10 runs is 0.0843257 ms.
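
The values in the OUTPUT block look like unnormalized scores rather than probabilities (several are negative), so the predicted digit is simply the index with the largest value. A minimal sketch of picking it out of the log follows; `argmax_class` is a hypothetical helper name, and the parsing assumes the "index => score" line format shown above.

```shell
# Hypothetical helper: read "index => score" lines on stdin and print the index with the highest score.
argmax_class() {
    awk '$2 == "=>" {
             if (!seen || $3 + 0 > max) { max = $3 + 0; best = $1; seen = 1 }
         }
         END { if (seen) print best }'
}

argmax_class <<'EOF'
3 => 0.316925 :
4 => 3.73423 :
8 => 1.937 :
9 => 14.8275 : ***
EOF
# prints: 9
```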

sample_mnist_api

./sample_mnist_api
Loading weights: ../../../data/mnist/mnistapi.wts

Input:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@+ @@@@@@@@@@@@@@
@@@@@@@@@@@@. @@@@@@@@@@@@@@
@@@@@@@@@@@@- @@@@@@@@@@@@@@
@@@@@@@@@@@#  @@@@@@@@@@@@@@
@@@@@@@@@@@#  *@@@@@@@@@@@@@
@@@@@@@@@@@@  :@@@@@@@@@@@@@
@@@@@@@@@@@@= .@@@@@@@@@@@@@
@@@@@@@@@@@@#  %@@@@@@@@@@@@
@@@@@@@@@@@@% .@@@@@@@@@@@@@
@@@@@@@@@@@@%  %@@@@@@@@@@@@
@@@@@@@@@@@@%  %@@@@@@@@@@@@
@@@@@@@@@@@@@= +@@@@@@@@@@@@
@@@@@@@@@@@@@* -@@@@@@@@@@@@
@@@@@@@@@@@@@*  @@@@@@@@@@@@
@@@@@@@@@@@@@@  @@@@@@@@@@@@
@@@@@@@@@@@@@@  *@@@@@@@@@@@
@@@@@@@@@@@@@@  *@@@@@@@@@@@
@@@@@@@@@@@@@@  *@@@@@@@@@@@
@@@@@@@@@@@@@@  *@@@@@@@@@@@
@@@@@@@@@@@@@@* @@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

Output:

0: 
1: **********
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9:

sample_int8

./sample_int8 mnist

FP32 run:400 batches of size 100 starting at 100
........................................
Top1: 0.9904, Top5: 1
Processing 40000 images averaged 0.00332707 ms/image and 0.332707 ms/batch.

FP16 run:400 batches of size 100 starting at 100
Engine could not be created at this precision

INT8 run:400 batches of size 100 starting at 100
........................................
Top1: 0.9909, Top5: 1
Processing 40000 images averaged 0.00215323 ms/image and 0.215323 ms/batch.
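
The two timing lines allow a rough comparison: FP32 takes 0.332707 ms/batch and INT8 takes 0.215323 ms/batch, with essentially the same Top1 accuracy. The sketch below extracts the per-batch time from a log line and computes the ratio. Both helper names are hypothetical.

```shell
# Hypothetical helper: print the number that precedes "ms/batch" in a sample_int8 log line.
ms_per_batch() {
    awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^ms\/batch/) print $(i - 1) }'
}

# Hypothetical helper: ratio of a baseline time to a faster time, three decimals.
speedup() {
    awk -v base="$1" -v fast="$2" 'BEGIN { printf "%.3f\n", base / fast }'
}

fp32=$(echo "Processing 40000 images averaged 0.00332707 ms/image and 0.332707 ms/batch." | ms_per_batch)
int8=$(echo "Processing 40000 images averaged 0.00215323 ms/image and 0.215323 ms/batch." | ms_per_batch)
speedup "$fp32" "$int8"
# prints: 1.545
```

On this GTX 1060 run, INT8 is therefore roughly 1.5 times faster than FP32 per batch.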

History

20180907: created.
20181119: added TensorRT 5.0.