
Tutorial for Training LeNet on MNIST with Caffe

LeNet

The design of LeNet contains the essence of CNNs that are still used in larger models such as the ones in ImageNet. In general, it consists of a convolutional layer followed by a pooling layer, another convolution layer followed by a pooling layer, and then two fully connected layers similar to the conventional multilayer perceptrons.

The classic LeNet structure:

input -> conv1(20) -> pool1 -> conv2(50) -> pool2 -> ip1(500, ReLU) -> ip2(10, softmax) -> output

lenet_train_test.prototxt

  • The batch size is set in net.prototxt rather than in solver.prototxt, so the blob dimensions are fully specified there.
  • bottom: a layer's input blob; top: a layer's output blob (see the sketch after this list).
  • Pixels in the 0-255 range are scaled down to 0-1, so scale = 1/256 = 0.00390625.
  • lr_mult: 1 means the weights are updated with 1x the base learning rate;
  • lr_mult: 2 means the biases are updated with 2x the base learning rate (this usually leads to better convergence rates).
  • InnerProduct outputs z by default, not a = sigmoid(z); no activation is applied.
  • ReLU is an in-place operation, so its input and output blob are both ip1; for most other layers, the input and output blobs must not be the same.
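
A quick way to verify these blob names and shapes is to load the net with pycaffe and walk over its blobs and parameters; a minimal sketch (run from $CAFFE_ROOT so the LMDB paths in the prototxt resolve):

```python
import caffe

caffe.set_mode_cpu()

# load the train/test definition in TEST phase (reads examples/mnist/mnist_test_lmdb)
net = caffe.Net('examples/mnist/lenet_train_test.prototxt', caffe.TEST)
net.forward()  # push one mini-batch through the net

# top blobs: name -> (batch, channels, height, width)
for name, blob in net.blobs.items():
    print(name, blob.data.shape)

# learnable parameters: [weights, biases] per layer
for name, params in net.params.items():
    print(name, [p.data.shape for p in params])
```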

Input Layer types

Input Layer types for train_val.prototxt

```python
solver.net.forward()  # load mini-batch images from the training data
```

**Data**: read from an LMDB/LevelDB database.
```prototxt
name: "mnist"
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}


name: "CaffeNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
# mean pixel / channel-wise mean instead of mean image
# transform_param {
# crop_size: 227
# mean_value: 104
# mean_value: 117
# mean_value: 123
# mirror: true
# }
data_param {
source: "examples/imagenet/ilsvrc12_train_lmdb"
batch_size: 256
backend: LMDB
}
}
```
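
The cropping, mirroring, mean subtraction and scaling above are expressed in transform_param and applied by the Data layer itself. When feeding raw images through pycaffe, roughly the same preprocessing can be reproduced with caffe.io.Transformer; a sketch, assuming the binaryproto mean has already been converted to a .npy file (that path is hypothetical):

```python
import numpy as np
import caffe

# per-channel mean computed from the (hypothetical) converted mean file
mu = np.load('data/ilsvrc12/ilsvrc12_mean.npy').mean(1).mean(1)

# target shape matches the (batch, 3, 227, 227) input the layer above produces
transformer = caffe.io.Transformer({'data': (1, 3, 227, 227)})
transformer.set_transpose('data', (2, 0, 1))     # HxWxC -> CxHxW
transformer.set_mean('data', mu)                 # subtract the channel means
transformer.set_raw_scale('data', 255)           # caffe.io loads images in [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR, as stored in the LMDB

img = caffe.io.load_image('examples/images/cat.jpg')
blob = transformer.preprocess('data', img)       # ready to copy into net.blobs['data']
```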

**ImageData**: read raw images.

```prototxt
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
image_data_param {
source: "data/flickr_style/train.txt"
batch_size: 50
new_height: 256
new_width: 256
}
}
```
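
The source file here is a plain text listing with one image per line: a path (relative to image_data_param's root_folder, if set) followed by an integer label. Something like this (paths and labels are illustrative):

```
images/0001.jpg 0
images/0002.jpg 3
images/0003.jpg 14
```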

Input Layer types for deploy.prototxt

**DummyData**: no labels; used only to run forward passes and get the output probabilities.

```prototxt
layer {
name: "data"
type: "DummyData"
top: "data"
dummy_data_param {
shape {
dim: 1
dim: 3
dim: 227
dim: 227
}
}
}
```

**Input**: typically used for networks that are being deployed.

```prototxt
layer {
name: "data"
type: "Input"
top: "data"
input_param {
shape {
dim: 10
dim: 3
dim: 227
dim: 227
}
}
}
```
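
The shape given in input_param is only the default; with pycaffe the input blob can be reshaped (for example to a single image) before running the net. A sketch, assuming a deploy definition that starts with the Input layer above plus trained weights (both file names are placeholders):

```python
import caffe

net = caffe.Net('deploy.prototxt', 'model.caffemodel', caffe.TEST)

# switch from the default batch of 10 to a single image
net.blobs['data'].reshape(1, 3, 227, 227)
net.reshape()  # propagate the new shape through the rest of the net

out = net.forward()  # dict: output blob name -> numpy array
```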

LeNet train_val.prototxt

```prototxt
name: "LeNet"
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
```
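
With 28x28 MNIST inputs, the kernel sizes and strides above give the following blob shapes (conv output side = (input - kernel)/stride + 1, and each 2x2/stride-2 pooling halves the sides):

```
data    1x28x28
conv1   20x24x24   # 28 - 5 + 1 = 24
pool1   20x12x12   # 24 / 2 = 12
conv2   50x8x8     # 12 - 5 + 1 = 8
pool2   50x4x4     # 8 / 2 = 4
ip1     500
ip2     10
```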

LeNet solver.prototxt

```prototxt
# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"

# batch_size is defined in net.prototxt: train mini-batch size = 64, test mini-batch size = 100

# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100 # test_iter = num_test_images/test_mini_batch_size = 10000/100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000 # ~10.7 epochs: max_iter x train_batch_size / num_train_images = 10000 x 64 / 60000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: GPU
```
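
The same solver definition can be driven from Python instead of the caffe binary; a minimal sketch with caffe.SGDSolver (run from $CAFFE_ROOT so the relative paths in the prototxt resolve):

```python
import caffe

caffe.set_mode_gpu()  # or caffe.set_mode_cpu()

solver = caffe.SGDSolver('examples/mnist/lenet_solver.prototxt')

solver.step(100)  # 100 iterations of forward + backward + parameter update
# solver.solve()  # or run the full max_iter schedule with testing and snapshots

print(solver.net.blobs['loss'].data)  # training loss of the last mini-batch
```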

learning rate policy (todo…)

This is the same policy as our default LeNet.

```python
s.lr_policy = 'inv'
s.gamma = 0.0001
s.power = 0.75
```
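
Under the 'inv' policy the effective learning rate at iteration t is base_lr * (1 + gamma * t) ^ (-power). A few values for the settings used here, just to make the decay concrete:

```python
base_lr, gamma, power = 0.01, 0.0001, 0.75

def inv_lr(t):
    # learning rate schedule under Caffe's 'inv' policy
    return base_lr * (1 + gamma * t) ** (-power)

print(inv_lr(0))      # 0.01
print(inv_lr(5000))   # ~0.0074
print(inv_lr(10000))  # ~0.0059
```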

EDIT HERE to try the fixed rate (and compare with adaptive solvers).
fixed is the simplest policy: it keeps the learning rate constant.

```python
s.lr_policy = 'fixed'
```

Set lr_policy to define how the learning rate changes during training.

```python
# Here, we 'step' the learning rate by multiplying it by a factor `gamma`
# every `stepsize` iterations.
s.lr_policy = 'step'
s.gamma = 0.1
s.stepsize = 20000
```

solver types (todo…)

Solver types include "SGD", "Adam", and "Nesterov", among others.

```python
s.type = "SGD"
```

Train LeNet

```sh
cd $CAFFE_ROOT
./examples/mnist/train_lenet.sh
```

train_lenet.sh simply invokes the caffe binary with the solver definition:

```sh
#!/usr/bin/env sh
set -e

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@
```
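
Once training finishes, the snapshotted weights can be scored on the test set with the same binary; a sketch, assuming the snapshot name that the snapshot_prefix and max_iter above would produce:

```sh
# 100 iterations x test batch size 100 covers the 10,000 test images
./build/tools/caffe test \
    --model=examples/mnist/lenet_train_test.prototxt \
    --weights=examples/mnist/lenet_iter_10000.caffemodel \
    --iterations=100
```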

train output

```
I0807 16:15:29.555564  4273 solver.cpp:310] Iteration 10000, loss = 0.00251452
I0807 16:15:29.555619  4273 solver.cpp:330] Iteration 10000, Testing net (#0)
I0807 16:15:29.634243  4281 data_layer.cpp:73] Restarting data prefetching from start.
I0807 16:15:29.635372  4273 solver.cpp:397]     Test net output #0: accuracy = 0.9909
I0807 16:15:29.635409  4273 solver.cpp:397]     Test net output #1: loss = 0.0302912 (* 1 = 0.0302912 loss)
I0807 16:15:29.635416  4273 solver.cpp:315] Optimization Done.
I0807 16:15:29.635439  4273 caffe.cpp:259] Optimization Done.
```

Deploy model

  • for training: train_test.prototxt + solver.prototxt
  • for deployment: deploy.prototxt + model.caffemodel

Deploy: deploy.prototxt contains no weight_filler or bias_filler; the weights are loaded from the .caffemodel file instead. If no weights file is supplied, the weights and biases default to zeros.
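
Putting the two files together, a deployment forward pass in pycaffe looks roughly like this (both file names are placeholders, and the output blob name depends on the deploy definition):

```python
import numpy as np
import caffe

caffe.set_mode_cpu()

net = caffe.Net('deploy.prototxt', 'model.caffemodel', caffe.TEST)

# placeholder input; in practice copy in a preprocessed image
# (see the Transformer sketch in the Data layer section above)
net.blobs['data'].data[0] = np.zeros(net.blobs['data'].data.shape[1:], dtype=np.float32)

out = net.forward()                # dict: output blob name -> numpy array
scores = list(out.values())[0][0]  # class scores / probabilities for the first image
print(scores.argmax())             # predicted class
```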

PyCaffe

pycaffe interfaces

The Python interface – pycaffe – is the caffe module and its scripts in caffe/python. import caffe to load models, do forward and backward, handle IO, visualize networks, and even instrument model solving. All model data, derivatives, and parameters are exposed for reading and writing.

  • caffe.Net is the central interface for loading, configuring, and running models.
  • caffe.Classifier and caffe.Detector provide convenience interfaces for common tasks.
  • caffe.SGDSolver exposes the solving interface.
  • caffe.io handles input / output with preprocessing and protocol buffers.
  • caffe.draw visualizes network architectures (see the sketch after this list).
  • Caffe blobs are exposed as numpy ndarrays for ease-of-use and efficiency.
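
For example, caffe.draw can render the LeNet definition to an image; a sketch (requires pydot and graphviz to be installed):

```python
from google.protobuf import text_format

import caffe
import caffe.draw
from caffe.proto import caffe_pb2

# parse the net definition into a NetParameter protobuf
net_param = caffe_pb2.NetParameter()
with open('examples/mnist/lenet_train_test.prototxt') as f:
    text_format.Merge(f.read(), net_param)

# draw left-to-right and write a PNG
caffe.draw.draw_net_to_file(net_param, 'lenet.png', 'LR')
```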

Tutorial IPython notebooks are found in caffe/examples: run ipython notebook caffe/examples to try them. For developer reference, docstrings can be found throughout the code.

Compile pycaffe by make pycaffe. Add the module directory to your $PYTHONPATH by export PYTHONPATH=/path/to/caffe/python:$PYTHONPATH or the like for import caffe.

Reference

History

  • 20180807: created.