The design of LeNet contains the essence of CNNs that are still used in larger models such as the ones trained on ImageNet. In general, it consists of a convolutional layer followed by a pooling layer, another convolutional layer followed by a pooling layer, and then two fully connected layers similar to those in conventional multilayer perceptrons.




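  • The batch size is set in net.prototxt rather than solver.prototxt, so that the blob dims are explicit.
  • bottom: a layer's input blob; top: a layer's output blob.
  • Pixel values in the 0-255 range need to be normalized to roughly 0-1: scale = 1/256 = 0.00390625.
  • lr_mult: 1 means the weights are trained with 1x the solver's learning rate.
  • lr_mult: 2 means the biases are trained with 2x the solver's learning rate (this usually leads to better convergence rates).
  • InnerProduct outputs z by default, not a = sigmoid(z).
  • ReLU runs in place: its input and output blobs are both ip1. This works because ReLU is element-wise; layers that change the blob shape, such as Convolution, Pooling, and InnerProduct, need distinct input and output blobs.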
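The arithmetic behind the scale and lr_mult notes above can be checked with a few lines of plain Python. This is only a sketch: the helper names scale_pixels and effective_lr are made up for illustration and are not part of Caffe, and the base learning rate of 0.01 is taken from the LeNet solver settings.

```python
# Scale factor used by transform_param in the prototxt files.
SCALE = 1.0 / 256  # 0.00390625


def scale_pixels(pixels, scale=SCALE):
    """Multiply each raw pixel value (0-255) by the scale factor."""
    return [p * scale for p in pixels]


def effective_lr(base_lr, lr_mult):
    """Learning rate applied to a parameter: solver rate times the layer's lr_mult."""
    return base_lr * lr_mult


if __name__ == "__main__":
    print(SCALE)
    print(scale_pixels([0, 128, 255]))
    print("weights:", effective_lr(0.01, 1), "biases:", effective_lr(0.01, 2))
```

Because the divisor is 256 rather than 255, the largest pixel value maps to 0.99609375, slightly below 1.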

Input Layer types

Input Layer types for train_val.prototxt

solver.net.forward() # load mini-batch images from training data

name: "mnist"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}

ImageData: reads raw image files listed in a text file (one image path and label per line).

layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  image_data_param {
    source: "data/flickr_style/train.txt"
    batch_size: 50
    new_height: 256
    new_width: 256
  }
}

Input Layer types for deploy.prototxt

DummyData: produces placeholder data with no labels; used only to run a forward pass and read the output probabilities.

layer {
  name: "data"
  type: "DummyData"
  top: "data"
  dummy_data_param {
    shape {
      dim: 1
      dim: 3
      dim: 227
      dim: 227
    }
  }
}

Input: typically used for networks that are being deployed.

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 10
      dim: 3
      dim: 227
      dim: 227
    }
  }
}

LeNet train_val.prototxt

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
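
To see how the blob dims evolve through the layers defined above, the following Python sketch traces the shapes by hand. It assumes 28x28 single-channel MNIST inputs and no padding, and uses the usual output-size rules (floor for convolution, ceil for pooling); the function names are invented for this example and are not Caffe APIs.

```python
import math


def conv_out(size, kernel, stride=1):
    """Spatial output size of a convolution without padding (rounds down)."""
    return (size - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer without padding (rounds up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1


def lenet_shapes(batch=64, channels=1, height=28, width=28):
    """Return (blob name, shape) pairs for the LeNet layers listed above."""
    shapes = [("data", (batch, channels, height, width))]
    h, w = conv_out(height, 5), conv_out(width, 5)
    shapes.append(("conv1", (batch, 20, h, w)))
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)
    shapes.append(("pool1", (batch, 20, h, w)))
    h, w = conv_out(h, 5), conv_out(w, 5)
    shapes.append(("conv2", (batch, 50, h, w)))
    h, w = pool_out(h, 2, 2), pool_out(w, 2, 2)
    shapes.append(("pool2", (batch, 50, h, w)))
    shapes.append(("ip1", (batch, 500)))  # ReLU is in place, so ip1 keeps this shape
    shapes.append(("ip2", (batch, 10)))
    return shapes


if __name__ == "__main__":
    for name, shape in lenet_shapes():
        print(name, shape)
```

Under these assumptions pool2 comes out as (64, 50, 4, 4), so ip1 sees 50 * 4 * 4 = 800 inputs per image.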

LeNet solver.prototxt

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"

# batch_size is defined in the net prototxt: train mini-batch size = 64, test mini-batch size = 100

# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100 # test_iter = num_test_images/test_mini_batch_size = 10000/100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000  # epochs = max_iter * train_batch_size / num_train_images = 10000 * 64 / 60000, about 10.7
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: GPU

learning rate policy (todo…)

This is the same policy as our default LeNet.

s.lr_policy = 'inv'
s.gamma = 0.0001
s.power = 0.75

To try a fixed rate instead (and compare with adaptive solvers), use 'fixed', the simplest policy, which keeps the learning rate constant.

s.lr_policy = 'fixed'

Set lr_policy to define how the learning rate changes during training.

# Here, we 'step' the learning rate by multiplying it by a factor `gamma`
# every `stepsize` iterations.
s.lr_policy = 'step'
s.gamma = 0.1
s.stepsize = 20000
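
As a rough illustration of what the three policies above compute, here is a small Python function following the formulas Caffe documents for 'fixed', 'step', and 'inv'. The function name learning_rate is made up for this sketch; it is not a Caffe API, and the base learning rate of 0.01 is borrowed from the LeNet solver.

```python
def learning_rate(policy, iteration, base_lr, gamma=0.0, power=0.0, stepsize=1):
    """Learning rate at a given iteration for a few Caffe lr_policy values."""
    if policy == "fixed":
        # Constant learning rate.
        return base_lr
    if policy == "step":
        # Multiply by gamma once every `stepsize` iterations.
        return base_lr * gamma ** (iteration // stepsize)
    if policy == "inv":
        # Smooth decay: base_lr * (1 + gamma * iter) ^ (-power).
        return base_lr * (1.0 + gamma * iteration) ** (-power)
    raise ValueError("unsupported lr_policy: %r" % policy)


if __name__ == "__main__":
    for it in (0, 5000, 10000):
        print(it, learning_rate("inv", it, 0.01, gamma=0.0001, power=0.75))
    for it in (0, 20000, 40000):
        print(it, learning_rate("step", it, 0.01, gamma=0.1, stepsize=20000))
```

With the 'inv' settings used for LeNet, the rate decays gradually from 0.01 to roughly 0.006 by iteration 10000, while 'step' drops it by a factor of 10 at each multiple of stepsize.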

solver types (todo…)

solver types include “SGD”, “Adam”, and “Nesterov” among others.

s.type = "SGD"

Train LeNet

#!/usr/bin/env sh
set -e

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

train output

I0807 16:15:29.555564  4273 solver.cpp:310] Iteration 10000, loss = 0.00251452
I0807 16:15:29.555619  4273 solver.cpp:330] Iteration 10000, Testing net (#0)
I0807 16:15:29.634243  4281 data_layer.cpp:73] Restarting data prefetching from start.
I0807 16:15:29.635372  4273 solver.cpp:397]     Test net output #0: accuracy = 0.9909
I0807 16:15:29.635409  4273 solver.cpp:397]     Test net output #1: loss = 0.0302912 (* 1 = 0.0302912 loss)
I0807 16:15:29.635416  4273 solver.cpp:315] Optimization Done.
I0807 16:15:29.635439  4273 caffe.cpp:259] Optimization Done.

Deploy model

  • For training: train_test.prototxt + solver.prototxt
  • For deployment: deploy.prototxt + model.caffemodel

deploy: the deploy prototxt has no weight_filler or bias_filler; weights and biases are loaded from weights.caffemodel. If no weights file is given, w and b default to 0.


pycaffe interfaces

The Python interface – pycaffe – is the caffe module and its scripts in caffe/python. Use import caffe to load models, run forward and backward passes, handle IO, visualize networks, and even instrument model solving. All model data, derivatives, and parameters are exposed for reading and writing.

  • caffe.Net is the central interface for loading, configuring, and running models.
  • caffe.Classifier and caffe.Detector provide convenience interfaces for common tasks.
  • caffe.SGDSolver exposes the solving interface.
  • caffe.io handles input / output with preprocessing and protocol buffers.
  • caffe.draw visualizes network architectures.
  • Caffe blobs are exposed as numpy ndarrays for ease-of-use and efficiency.

Tutorial IPython notebooks are found in caffe/examples: run ipython notebook caffe/examples to try them. For developer reference, docstrings can be found throughout the code.

Compile pycaffe with make pycaffe. Add the module directory to your $PYTHONPATH with export PYTHONPATH=/path/to/caffe/python:$PYTHONPATH or similar so that import caffe works.
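
Once pycaffe is importable, a deployed model can be run with caffe.Net roughly as sketched below. Several things here are assumptions rather than facts from these notes: the deploy network is assumed to have an input blob named "data" and an output blob named "prob", the image is assumed to be a numpy array already scaled and shaped (channels, height, width), and the file paths are left to the caller. The helper top_class is plain Python written for this example.

```python
def top_class(probs):
    """Return the index of the largest value in a sequence of probabilities."""
    return max(range(len(probs)), key=lambda i: probs[i])


def classify(deploy_path, weights_path, image):
    """Run one forward pass and return (predicted class, probabilities).

    Assumes the deploy net has an input blob "data" and an output blob "prob",
    and that `image` is a numpy array shaped (channels, height, width).
    """
    import caffe  # imported here so top_class stays usable without pycaffe

    caffe.set_mode_cpu()
    net = caffe.Net(deploy_path, weights_path, caffe.TEST)
    net.blobs["data"].reshape(1, *image.shape)  # batch of one image
    net.blobs["data"].data[...] = image
    output = net.forward()
    probs = output["prob"][0]
    return top_class(probs), probs


if __name__ == "__main__":
    # Only the pure-Python helper is exercised here; classify() needs pycaffe
    # plus a real deploy.prototxt and .caffemodel.
    print(top_class([0.1, 0.7, 0.2]))
```

Keeping the caffe import inside classify is only a convenience for this sketch; in a real script it would normally sit at the top of the file.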



