compile and install caffe-yolov3 on ubuntu 16.04

deep learning

Publish Date: 2018-12-11

Word Count: 658

Read Times: 4 Min

Read Count:

Series

Guide

requirements

my system requirements (same as caffe on ubuntu 16.04)

ubuntu 16.04
GeForce 1060 (6G) sm_61
cuda: 9.2
cudnn: 7.1.4
opencv: 3.3.0
caffe: latest

compile

git clone https://github.com/kezunlin/caffe-yolov3.git
cd caffe-yolov3
mkdir build && cd build && cmake-gui ..

caffe-yolov3 is based on caffe with UpsampleLayer and darknet.
tips: edit CMakeLists.txt for caffe.
see CMakeLists.txt

make

make -j8
make install

tree install
install
├── bin
│   ├── demo
│   ├── dog.jpg
│   └── libcaffeyolo.so
├── include
│   └── caffeyolo
│       ├── activations.h
│       ├── blas.h
│       ├── box.h
│       ├── cuda.h
│       ├── image.h
│       └── yolo_layer.h
├── lib
│   └── libcaffeyolo.so
└── share
    └── cmake
        └── caffeyolo
            ├── caffeyolo-config.cmake
            └── caffeyolo-config-noconfig.cmake

7 directories, 12 files

demo

cd install/bin
./demo

output

num_inputs is 1
num_outputs is 3
I1211 17:13:30.259755 10384 detectnet.cpp:74] Input data layer channels is  3
I1211 17:13:30.259785 10384 detectnet.cpp:75] Input data layer width is  416
I1211 17:13:30.259806 10384 detectnet.cpp:76] Input data layer height is  416
output blob1 shape c= 255, h = 13, w = 13
output blob2 shape c= 255, h = 26, w = 26
output blob3 shape c= 255, h = 52, w = 52
object-detection:  finished processing data operation  (392)ms
object-detection:  finished processing yolov3 network  (135)ms
16: 99%
x = 0.288428,y =  0.660513,w = 0.243282,h =  0.543122
left = 128,right =  314,top = 224,bot =  536
7: 93%
x = 0.756588,y =  0.222789,w = 0.280549,h =  0.147772
left = 473,right =  688,top = 85,bot =  170
1: 99%
x = 0.473371,y =  0.483899,w = 0.517509,h =  0.575438
left = 164,right =  562,top = 112,bot =  444
detectnet-camera:  video device has been un-initialized.
detectnet-camera:  this concludes the test of the video device.

image

net

input && output

input:

data 1,3,416,416

output

layer82-conv 1,255,13,13
layer94-conv 1,255,26,26
layer106-conv 1,255,52,52

python code

### Input: the model's output dict
### Output: list of tuples in ((cx1, cy1), (cx2, cy2), cls, prob)
def rects_prepare(output, inp_dim=416, num_classes=80):
    prediction = None
    # transform prediction coordinates to correspond to pixel location
    for key, value in output.items():
        # anchor sizes are borrowed from YOLOv3 config file
        if key == 'layer82-conv': 
            anchors = [(116, 90), (156, 198), (373, 326)] 
        elif key == 'layer94-conv':
            anchors = [(30, 61), (62, 45), (59, 119)]
        elif key == 'layer106-conv': 
            anchors = [(10, 13), (16, 30), (33, 23)]

yolov3 model

darknet

files

yolov3.weights
yolov3.cfg
coco.names

caffe-yolov3

files

yolov3.caffemodel
yolov3.prototxt
coco.names
yolov3-cpp.prototxt
yolov3-trt.prototxt

yolov3-cpp.prototxt

compared with yolov3.prototxt

layer {
    bottom: "layer82-conv"
    bottom: "layer94-conv"
    bottom: "layer106-conv"
    type: "Yolov3DetectionOutput"
    top: "detection_out"
    name: "detection_out"
    yolov3_detection_output_param {
        nms_threshold: 0.45
        num_classes: 80
        biases: 10
        biases: 13
        biases: 16
        biases: 30
        biases: 33
        biases: 23
        biases: 30
        biases: 61
        biases: 62
        biases: 45
        biases: 59
        biases: 119
        biases: 116
        biases: 90
        biases: 156
        biases: 198
        biases: 373
        biases: 326
        mask: 6
        mask: 7
        mask: 8
        mask: 3
        mask: 4
        mask: 5
        mask: 0
        mask: 1
        mask: 2
        mask_group_num: 3
        anchors_scale: 32
        anchors_scale: 16
        anchors_scale: 8
    }
}

Use Yolov3DetectionOutput layer with caffe (Upsample+Yolov3DetectionOutput)

yolov3-trt.prototxt

see yolov3-trt.prototxt
compared with yolov3-cpp.prototxt

layer {
    bottom: "layer85-conv"
    top: "layer86-upsample"
    name: "layer86-upsample"
    type: "Upsample"
    #upsample_param {
    #    scale: 2
    #}
}
...

layer {
    bottom: "layer97-conv"
    top: "layer98-upsample"
    name: "layer98-upsample"
    type: "Upsample"
    #upsample_param {
    #    scale: 2
    #}
}
...


layer {
    bottom: "layer82-conv"
    bottom: "layer94-conv"
    bottom: "layer106-conv"
    top: "yolo-det"
    name: "yolo-det"
    type: "Yolo"
}