Running the CIFAR-10 example in the Caffe framework

1. First, run the data/cifar10/get_cifar10.sh script under the Caffe root directory.

#!/usr/bin/env sh
# This scripts downloads the CIFAR10 (binary version) data and unzips it.

DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"

echo "Downloading..."

wget --no-check-certificate http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz

echo "Unzipping..."

tar -xf cifar-10-binary.tar.gz && rm -f cifar-10-binary.tar.gz
mv cifar-10-batches-bin/* . && rm -rf cifar-10-batches-bin

# Creation is split out because leveldb sometimes causes segfault
# and needs to be re-created.

echo "Done."

This fetches and unpacks the CIFAR-10 dataset.
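Once the script finishes, the binary batch files should sit directly in data/cifar10/ (the script moves them out of cifar-10-batches-bin). A quick sanity check, assuming the default layout of the Caffe source tree:

ls data/cifar10/
# expected: batches.meta.txt  data_batch_1.bin ... data_batch_5.bin  test_batch.bin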

2. Next, run examples/cifar10/create_cifar10.sh to convert the data into LMDB format.
On this setup, however, it fails with an error.

To fix it, run the following command:

install_name_tool -add_rpath '/Users/liangzhonghao/anaconda2/lib'  /usr/local/Cellar/caffe/build/examples/cifar10/convert_cifar_data.bin
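install_name_tool -add_rpath adds a runtime search path to the binary so it can locate the Anaconda dynamic libraries at load time. One way to inspect what the binary links against, and to confirm the new rpath is in place, is otool (part of the standard macOS toolchain); the path below is the same Homebrew-built binary as above:

# list the dynamic libraries the converter was linked against
otool -L /usr/local/Cellar/caffe/build/examples/cifar10/convert_cifar_data.bin

# show the LC_RPATH load commands; /Users/liangzhonghao/anaconda2/lib should now appear
otool -l /usr/local/Cellar/caffe/build/examples/cifar10/convert_cifar_data.bin | grep -A 2 LC_RPATH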

Running ./examples/cifar10/create_cifar10.sh again then produces another error.

Fix it the same way, this time for compute_image_mean:

install_name_tool -add_rpath '/Users/liangzhonghao/anaconda2/lib'  /usr/local/Cellar/caffe/build/tools/compute_image_mean

Then run the conversion script once more:

./examples/cifar10/create_cifar10.sh

This time it succeeds.
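At this point it is worth checking that the conversion produced the LMDB databases and the mean file that the training step expects (these are the same paths that show up in the training log below):

ls examples/cifar10/
# should now contain cifar10_train_lmdb/, cifar10_test_lmdb/ and mean.binaryproto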

Training and Testing the "Quick" Model

Since the example already provides the model definition prototxt and the solver prototxt, we can run train_quick.sh directly.

Its contents are:

#!/usr/bin/env sh
set -e

TOOLS=./build/tools

$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver.prototxt $@

# reduce learning rate by factor of 10 after 8 epochs
$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver_lr1.prototxt \
  --snapshot=examples/cifar10/cifar10_quick_iter_4000.solverstate $@
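Note that both caffe train invocations forward $@, so any extra flags passed to train_quick.sh are handed straight to the caffe binary. For example, with a CUDA-enabled build (not the CPU-only build used here) training could presumably be moved to the first GPU like this:

./examples/cifar10/train_quick.sh --gpu=0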

Run the following command:

➜ caffe git:(master) ✗ ./examples/cifar10/train_quick.sh

The output is:

I0523 15:43:36.608793 2712679360 caffe.cpp:211] Use CPU.
I0523 15:43:36.609737 2712679360 solver.cpp:44] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.001
display: 100
max_iter: 4000
lr_policy: "fixed"
momentum: 0.9
weight_decay: 0.004
snapshot: 4000
snapshot_prefix: "examples/cifar10/cifar10_quick"
solver_mode: CPU
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
train_state {
  level: 0
  stage: ""
}
I0523 15:43:36.610075 2712679360 solver.cpp:87] Creating training net from net file: examples/cifar10/cifar10_quick_train_test.prototxt
I0523 15:43:36.610931 2712679360 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer cifar
I0523 15:43:36.610961 2712679360 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0523 15:43:36.610966 2712679360 net.cpp:51] Initializing net from parameters:
name: "CIFAR10_quick"
state {
  phase: TRAIN
  level: 0
  stage: ""
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0523 15:43:36.611205 2712679360 layer_factory.hpp:77] Creating layer cifar
I0523 15:43:36.611467 2712679360 db_lmdb.cpp:35] Opened lmdb examples/cifar10/cifar10_train_lmdb
I0523 15:43:36.611524 2712679360 net.cpp:84] Creating Layer cifar
I0523 15:43:36.611531 2712679360 net.cpp:380] cifar -> data
I0523 15:43:36.611549 2712679360 net.cpp:380] cifar -> label
I0523 15:43:36.611565 2712679360 data_transformer.cpp:25] Loading mean file from: examples/cifar10/mean.binaryproto
I0523 15:43:36.611686 2712679360 data_layer.cpp:45] output data size: 100,3,32,32
I0523 15:43:36.617992 2712679360 net.cpp:122] Setting up cifar
I0523 15:43:36.618022 2712679360 net.cpp:129] Top shape: 100 3 32 32 (307200)
I0523 15:43:36.618028 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 15:43:36.618032 2712679360 net.cpp:137] Memory required for data: 1229200
I0523 15:43:36.618041 2712679360 layer_factory.hpp:77] Creating layer conv1
I0523 15:43:36.618052 2712679360 net.cpp:84] Creating Layer conv1
I0523 15:43:36.618057 2712679360 net.cpp:406] conv1 <- data
I0523 15:43:36.618063 2712679360 net.cpp:380] conv1 -> conv1
I0523 15:43:36.618175 2712679360 net.cpp:122] Setting up conv1
I0523 15:43:36.618180 2712679360 net.cpp:129] Top shape: 100 32 32 32 (3276800)
I0523 15:43:36.618185 2712679360 net.cpp:137] Memory required for data: 14336400
I0523 15:43:36.618192 2712679360 layer_factory.hpp:77] Creating layer pool1
I0523 15:43:36.618199 2712679360 net.cpp:84] Creating Layer pool1
I0523 15:43:36.618202 2712679360 net.cpp:406] pool1 <- conv1
I0523 15:43:36.618206 2712679360 net.cpp:380] pool1 -> pool1
I0523 15:43:36.618216 2712679360 net.cpp:122] Setting up pool1
I0523 15:43:36.618219 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.618224 2712679360 net.cpp:137] Memory required for data: 17613200
I0523 15:43:36.618228 2712679360 layer_factory.hpp:77] Creating layer relu1
I0523 15:43:36.618234 2712679360 net.cpp:84] Creating Layer relu1
I0523 15:43:36.618238 2712679360 net.cpp:406] relu1 <- pool1
I0523 15:43:36.618242 2712679360 net.cpp:367] relu1 -> pool1 (in-place)
I0523 15:43:36.618247 2712679360 net.cpp:122] Setting up relu1
I0523 15:43:36.618250 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.618255 2712679360 net.cpp:137] Memory required for data: 20890000
I0523 15:43:36.618263 2712679360 layer_factory.hpp:77] Creating layer conv2
I0523 15:43:36.618273 2712679360 net.cpp:84] Creating Layer conv2
I0523 15:43:36.618276 2712679360 net.cpp:406] conv2 <- pool1
I0523 15:43:36.618281 2712679360 net.cpp:380] conv2 -> conv2
I0523 15:43:36.618585 2712679360 net.cpp:122] Setting up conv2
I0523 15:43:36.618592 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.618597 2712679360 net.cpp:137] Memory required for data: 24166800
I0523 15:43:36.618602 2712679360 layer_factory.hpp:77] Creating layer relu2
I0523 15:43:36.618607 2712679360 net.cpp:84] Creating Layer relu2
I0523 15:43:36.618609 2712679360 net.cpp:406] relu2 <- conv2
I0523 15:43:36.618614 2712679360 net.cpp:367] relu2 -> conv2 (in-place)
I0523 15:43:36.618619 2712679360 net.cpp:122] Setting up relu2
I0523 15:43:36.618623 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.618628 2712679360 net.cpp:137] Memory required for data: 27443600
I0523 15:43:36.618630 2712679360 layer_factory.hpp:77] Creating layer pool2
I0523 15:43:36.618634 2712679360 net.cpp:84] Creating Layer pool2
I0523 15:43:36.618638 2712679360 net.cpp:406] pool2 <- conv2
I0523 15:43:36.618643 2712679360 net.cpp:380] pool2 -> pool2
I0523 15:43:36.618647 2712679360 net.cpp:122] Setting up pool2
I0523 15:43:36.618654 2712679360 net.cpp:129] Top shape: 100 32 8 8 (204800)
I0523 15:43:36.618662 2712679360 net.cpp:137] Memory required for data: 28262800
I0523 15:43:36.618669 2712679360 layer_factory.hpp:77] Creating layer conv3
I0523 15:43:36.618680 2712679360 net.cpp:84] Creating Layer conv3
I0523 15:43:36.618685 2712679360 net.cpp:406] conv3 <- pool2
I0523 15:43:36.618695 2712679360 net.cpp:380] conv3 -> conv3
I0523 15:43:36.619361 2712679360 net.cpp:122] Setting up conv3
I0523 15:43:36.619372 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 15:43:36.619379 2712679360 net.cpp:137] Memory required for data: 29901200
I0523 15:43:36.619385 2712679360 layer_factory.hpp:77] Creating layer relu3
I0523 15:43:36.619390 2712679360 net.cpp:84] Creating Layer relu3
I0523 15:43:36.619393 2712679360 net.cpp:406] relu3 <- conv3
I0523 15:43:36.619398 2712679360 net.cpp:367] relu3 -> conv3 (in-place)
I0523 15:43:36.619403 2712679360 net.cpp:122] Setting up relu3
I0523 15:43:36.619447 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 15:43:36.619459 2712679360 net.cpp:137] Memory required for data: 31539600
I0523 15:43:36.619467 2712679360 layer_factory.hpp:77] Creating layer pool3
I0523 15:43:36.619477 2712679360 net.cpp:84] Creating Layer pool3
I0523 15:43:36.619484 2712679360 net.cpp:406] pool3 <- conv3
I0523 15:43:36.619493 2712679360 net.cpp:380] pool3 -> pool3
I0523 15:43:36.619505 2712679360 net.cpp:122] Setting up pool3
I0523 15:43:36.619513 2712679360 net.cpp:129] Top shape: 100 64 4 4 (102400)
I0523 15:43:36.619523 2712679360 net.cpp:137] Memory required for data: 31949200
I0523 15:43:36.619529 2712679360 layer_factory.hpp:77] Creating layer ip1
I0523 15:43:36.619539 2712679360 net.cpp:84] Creating Layer ip1
I0523 15:43:36.619546 2712679360 net.cpp:406] ip1 <- pool3
I0523 15:43:36.619555 2712679360 net.cpp:380] ip1 -> ip1
I0523 15:43:36.620586 2712679360 net.cpp:122] Setting up ip1
I0523 15:43:36.620602 2712679360 net.cpp:129] Top shape: 100 64 (6400)
I0523 15:43:36.620607 2712679360 net.cpp:137] Memory required for data: 31974800
I0523 15:43:36.620613 2712679360 layer_factory.hpp:77] Creating layer ip2
I0523 15:43:36.620620 2712679360 net.cpp:84] Creating Layer ip2
I0523 15:43:36.620625 2712679360 net.cpp:406] ip2 <- ip1
I0523 15:43:36.620630 2712679360 net.cpp:380] ip2 -> ip2
I0523 15:43:36.620649 2712679360 net.cpp:122] Setting up ip2
I0523 15:43:36.620656 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 15:43:36.620662 2712679360 net.cpp:137] Memory required for data: 31978800
I0523 15:43:36.620673 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 15:43:36.620682 2712679360 net.cpp:84] Creating Layer loss
I0523 15:43:36.620689 2712679360 net.cpp:406] loss <- ip2
I0523 15:43:36.620697 2712679360 net.cpp:406] loss <- label
I0523 15:43:36.620703 2712679360 net.cpp:380] loss -> loss
I0523 15:43:36.620730 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 15:43:36.620749 2712679360 net.cpp:122] Setting up loss
I0523 15:43:36.620756 2712679360 net.cpp:129] Top shape: (1)
I0523 15:43:36.620764 2712679360 net.cpp:132]     with loss weight 1
I0523 15:43:36.620787 2712679360 net.cpp:137] Memory required for data: 31978804
I0523 15:43:36.620795 2712679360 net.cpp:198] loss needs backward computation.
I0523 15:43:36.620800 2712679360 net.cpp:198] ip2 needs backward computation.
I0523 15:43:36.620807 2712679360 net.cpp:198] ip1 needs backward computation.
I0523 15:43:36.620813 2712679360 net.cpp:198] pool3 needs backward computation.
I0523 15:43:36.620820 2712679360 net.cpp:198] relu3 needs backward computation.
I0523 15:43:36.620832 2712679360 net.cpp:198] conv3 needs backward computation.
I0523 15:43:36.620851 2712679360 net.cpp:198] pool2 needs backward computation.
I0523 15:43:36.620859 2712679360 net.cpp:198] relu2 needs backward computation.
I0523 15:43:36.620867 2712679360 net.cpp:198] conv2 needs backward computation.
I0523 15:43:36.620875 2712679360 net.cpp:198] relu1 needs backward computation.
I0523 15:43:36.620882 2712679360 net.cpp:198] pool1 needs backward computation.
I0523 15:43:36.620889 2712679360 net.cpp:198] conv1 needs backward computation.
I0523 15:43:36.620896 2712679360 net.cpp:200] cifar does not need backward computation.
I0523 15:43:36.620904 2712679360 net.cpp:242] This network produces output loss
I0523 15:43:36.620916 2712679360 net.cpp:255] Network initialization done.
I0523 15:43:36.621170 2712679360 solver.cpp:172] Creating test net (#0) specified by net file: examples/cifar10/cifar10_quick_train_test.prototxt
I0523 15:43:36.621199 2712679360 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer cifar
I0523 15:43:36.621210 2712679360 net.cpp:51] Initializing net from parameters:
name: "CIFAR10_quick"
state {
  phase: TEST
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0523 15:43:36.621821 2712679360 layer_factory.hpp:77] Creating layer cifar
I0523 15:43:36.621913 2712679360 db_lmdb.cpp:35] Opened lmdb examples/cifar10/cifar10_test_lmdb
I0523 15:43:36.621933 2712679360 net.cpp:84] Creating Layer cifar
I0523 15:43:36.621943 2712679360 net.cpp:380] cifar -> data
I0523 15:43:36.621950 2712679360 net.cpp:380] cifar -> label
I0523 15:43:36.621958 2712679360 data_transformer.cpp:25] Loading mean file from: examples/cifar10/mean.binaryproto
I0523 15:43:36.622017 2712679360 data_layer.cpp:45] output data size: 100,3,32,32
I0523 15:43:36.624790 2712679360 net.cpp:122] Setting up cifar
I0523 15:43:36.624822 2712679360 net.cpp:129] Top shape: 100 3 32 32 (307200)
I0523 15:43:36.624830 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 15:43:36.624835 2712679360 net.cpp:137] Memory required for data: 1229200
I0523 15:43:36.624840 2712679360 layer_factory.hpp:77] Creating layer label_cifar_1_split
I0523 15:43:36.624851 2712679360 net.cpp:84] Creating Layer label_cifar_1_split
I0523 15:43:36.624856 2712679360 net.cpp:406] label_cifar_1_split <- label
I0523 15:43:36.624862 2712679360 net.cpp:380] label_cifar_1_split -> label_cifar_1_split_0
I0523 15:43:36.624869 2712679360 net.cpp:380] label_cifar_1_split -> label_cifar_1_split_1
I0523 15:43:36.624876 2712679360 net.cpp:122] Setting up label_cifar_1_split
I0523 15:43:36.624878 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 15:43:36.624882 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 15:43:36.624886 2712679360 net.cpp:137] Memory required for data: 1230000
I0523 15:43:36.624917 2712679360 layer_factory.hpp:77] Creating layer conv1
I0523 15:43:36.624927 2712679360 net.cpp:84] Creating Layer conv1
I0523 15:43:36.624930 2712679360 net.cpp:406] conv1 <- data
I0523 15:43:36.624935 2712679360 net.cpp:380] conv1 -> conv1
I0523 15:43:36.624987 2712679360 net.cpp:122] Setting up conv1
I0523 15:43:36.624991 2712679360 net.cpp:129] Top shape: 100 32 32 32 (3276800)
I0523 15:43:36.624996 2712679360 net.cpp:137] Memory required for data: 14337200
I0523 15:43:36.625002 2712679360 layer_factory.hpp:77] Creating layer pool1
I0523 15:43:36.625008 2712679360 net.cpp:84] Creating Layer pool1
I0523 15:43:36.625011 2712679360 net.cpp:406] pool1 <- conv1
I0523 15:43:36.625015 2712679360 net.cpp:380] pool1 -> pool1
I0523 15:43:36.625022 2712679360 net.cpp:122] Setting up pool1
I0523 15:43:36.625026 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.625031 2712679360 net.cpp:137] Memory required for data: 17614000
I0523 15:43:36.625036 2712679360 layer_factory.hpp:77] Creating layer relu1
I0523 15:43:36.625041 2712679360 net.cpp:84] Creating Layer relu1
I0523 15:43:36.625043 2712679360 net.cpp:406] relu1 <- pool1
I0523 15:43:36.625048 2712679360 net.cpp:367] relu1 -> pool1 (in-place)
I0523 15:43:36.625053 2712679360 net.cpp:122] Setting up relu1
I0523 15:43:36.625056 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.625061 2712679360 net.cpp:137] Memory required for data: 20890800
I0523 15:43:36.625064 2712679360 layer_factory.hpp:77] Creating layer conv2
I0523 15:43:36.625071 2712679360 net.cpp:84] Creating Layer conv2
I0523 15:43:36.625074 2712679360 net.cpp:406] conv2 <- pool1
I0523 15:43:36.625084 2712679360 net.cpp:380] conv2 -> conv2
I0523 15:43:36.625396 2712679360 net.cpp:122] Setting up conv2
I0523 15:43:36.625402 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.625407 2712679360 net.cpp:137] Memory required for data: 24167600
I0523 15:43:36.625412 2712679360 layer_factory.hpp:77] Creating layer relu2
I0523 15:43:36.625417 2712679360 net.cpp:84] Creating Layer relu2
I0523 15:43:36.625422 2712679360 net.cpp:406] relu2 <- conv2
I0523 15:43:36.625425 2712679360 net.cpp:367] relu2 -> conv2 (in-place)
I0523 15:43:36.625429 2712679360 net.cpp:122] Setting up relu2
I0523 15:43:36.625433 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 15:43:36.625437 2712679360 net.cpp:137] Memory required for data: 27444400
I0523 15:43:36.625440 2712679360 layer_factory.hpp:77] Creating layer pool2
I0523 15:43:36.625445 2712679360 net.cpp:84] Creating Layer pool2
I0523 15:43:36.625448 2712679360 net.cpp:406] pool2 <- conv2
I0523 15:43:36.625452 2712679360 net.cpp:380] pool2 -> pool2
I0523 15:43:36.625458 2712679360 net.cpp:122] Setting up pool2
I0523 15:43:36.625460 2712679360 net.cpp:129] Top shape: 100 32 8 8 (204800)
I0523 15:43:36.625464 2712679360 net.cpp:137] Memory required for data: 28263600
I0523 15:43:36.625468 2712679360 layer_factory.hpp:77] Creating layer conv3
I0523 15:43:36.625474 2712679360 net.cpp:84] Creating Layer conv3
I0523 15:43:36.625479 2712679360 net.cpp:406] conv3 <- pool2
I0523 15:43:36.625483 2712679360 net.cpp:380] conv3 -> conv3
I0523 15:43:36.626077 2712679360 net.cpp:122] Setting up conv3
I0523 15:43:36.626083 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 15:43:36.626088 2712679360 net.cpp:137] Memory required for data: 29902000
I0523 15:43:36.626093 2712679360 layer_factory.hpp:77] Creating layer relu3
I0523 15:43:36.626098 2712679360 net.cpp:84] Creating Layer relu3
I0523 15:43:36.626101 2712679360 net.cpp:406] relu3 <- conv3
I0523 15:43:36.626106 2712679360 net.cpp:367] relu3 -> conv3 (in-place)
I0523 15:43:36.626111 2712679360 net.cpp:122] Setting up relu3
I0523 15:43:36.626113 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 15:43:36.626117 2712679360 net.cpp:137] Memory required for data: 31540400
I0523 15:43:36.626121 2712679360 layer_factory.hpp:77] Creating layer pool3
I0523 15:43:36.626126 2712679360 net.cpp:84] Creating Layer pool3
I0523 15:43:36.626129 2712679360 net.cpp:406] pool3 <- conv3
I0523 15:43:36.626145 2712679360 net.cpp:380] pool3 -> pool3
I0523 15:43:36.626152 2712679360 net.cpp:122] Setting up pool3
I0523 15:43:36.626154 2712679360 net.cpp:129] Top shape: 100 64 4 4 (102400)
I0523 15:43:36.626159 2712679360 net.cpp:137] Memory required for data: 31950000
I0523 15:43:36.626163 2712679360 layer_factory.hpp:77] Creating layer ip1
I0523 15:43:36.626168 2712679360 net.cpp:84] Creating Layer ip1
I0523 15:43:36.626173 2712679360 net.cpp:406] ip1 <- pool3
I0523 15:43:36.626176 2712679360 net.cpp:380] ip1 -> ip1
I0523 15:43:36.626969 2712679360 net.cpp:122] Setting up ip1
I0523 15:43:36.626981 2712679360 net.cpp:129] Top shape: 100 64 (6400)
I0523 15:43:36.626986 2712679360 net.cpp:137] Memory required for data: 31975600
I0523 15:43:36.626992 2712679360 layer_factory.hpp:77] Creating layer ip2
I0523 15:43:36.626999 2712679360 net.cpp:84] Creating Layer ip2
I0523 15:43:36.627003 2712679360 net.cpp:406] ip2 <- ip1
I0523 15:43:36.627008 2712679360 net.cpp:380] ip2 -> ip2
I0523 15:43:36.627024 2712679360 net.cpp:122] Setting up ip2
I0523 15:43:36.627028 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 15:43:36.627032 2712679360 net.cpp:137] Memory required for data: 31979600
I0523 15:43:36.627039 2712679360 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0523 15:43:36.627046 2712679360 net.cpp:84] Creating Layer ip2_ip2_0_split
I0523 15:43:36.627053 2712679360 net.cpp:406] ip2_ip2_0_split <- ip2
I0523 15:43:36.627059 2712679360 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0523 15:43:36.627068 2712679360 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0523 15:43:36.627076 2712679360 net.cpp:122] Setting up ip2_ip2_0_split
I0523 15:43:36.627081 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 15:43:36.627085 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 15:43:36.627089 2712679360 net.cpp:137] Memory required for data: 31987600
I0523 15:43:36.627094 2712679360 layer_factory.hpp:77] Creating layer accuracy
I0523 15:43:36.627099 2712679360 net.cpp:84] Creating Layer accuracy
I0523 15:43:36.627102 2712679360 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0523 15:43:36.627106 2712679360 net.cpp:406] accuracy <- label_cifar_1_split_0
I0523 15:43:36.627110 2712679360 net.cpp:380] accuracy -> accuracy
I0523 15:43:36.627116 2712679360 net.cpp:122] Setting up accuracy
I0523 15:43:36.627120 2712679360 net.cpp:129] Top shape: (1)
I0523 15:43:36.627123 2712679360 net.cpp:137] Memory required for data: 31987604
I0523 15:43:36.627126 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 15:43:36.627133 2712679360 net.cpp:84] Creating Layer loss
I0523 15:43:36.627169 2712679360 net.cpp:406] loss <- ip2_ip2_0_split_1
I0523 15:43:36.627178 2712679360 net.cpp:406] loss <- label_cifar_1_split_1
I0523 15:43:36.627183 2712679360 net.cpp:380] loss -> loss
I0523 15:43:36.627189 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 15:43:36.627198 2712679360 net.cpp:122] Setting up loss
I0523 15:43:36.627202 2712679360 net.cpp:129] Top shape: (1)
I0523 15:43:36.627207 2712679360 net.cpp:132]     with loss weight 1
I0523 15:43:36.627213 2712679360 net.cpp:137] Memory required for data: 31987608
I0523 15:43:36.627215 2712679360 net.cpp:198] loss needs backward computation.
I0523 15:43:36.627219 2712679360 net.cpp:200] accuracy does not need backward computation.
I0523 15:43:36.627223 2712679360 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0523 15:43:36.627228 2712679360 net.cpp:198] ip2 needs backward computation.
I0523 15:43:36.627230 2712679360 net.cpp:198] ip1 needs backward computation.
I0523 15:43:36.627234 2712679360 net.cpp:198] pool3 needs backward computation.
I0523 15:43:36.627321 2712679360 net.cpp:198] relu3 needs backward computation.
I0523 15:43:36.627334 2712679360 net.cpp:198] conv3 needs backward computation.
I0523 15:43:36.627341 2712679360 net.cpp:198] pool2 needs backward computation.
I0523 15:43:36.627348 2712679360 net.cpp:198] relu2 needs backward computation.
I0523 15:43:36.627354 2712679360 net.cpp:198] conv2 needs backward computation.
I0523 15:43:36.627387 2712679360 net.cpp:198] relu1 needs backward computation.
I0523 15:43:36.627394 2712679360 net.cpp:198] pool1 needs backward computation.
I0523 15:43:36.627400 2712679360 net.cpp:198] conv1 needs backward computation.
I0523 15:43:36.627409 2712679360 net.cpp:200] label_cifar_1_split does not need backward computation.
I0523 15:43:36.627418 2712679360 net.cpp:200] cifar does not need backward computation.
I0523 15:43:36.627432 2712679360 net.cpp:242] This network produces output accuracy
I0523 15:43:36.627454 2712679360 net.cpp:242] This network produces output loss
I0523 15:43:36.627470 2712679360 net.cpp:255] Network initialization done.
I0523 15:43:36.627553 2712679360 solver.cpp:56] Solver scaffolding done.
I0523 15:43:36.627593 2712679360 caffe.cpp:248] Starting Optimization
I0523 15:43:36.627602 2712679360 solver.cpp:272] Solving CIFAR10_quick
I0523 15:43:36.627610 2712679360 solver.cpp:273] Learning Rate Policy: fixed
I0523 15:43:36.627933 2712679360 solver.cpp:330] Iteration 0, Testing net (#0)
I0523 15:43:46.157997 1515520 data_layer.cpp:73] Restarting data prefetching from start.
I0523 15:43:46.542196 2712679360 solver.cpp:397]     Test net output #0: accuracy = 0.0865
I0523 15:43:46.542232 2712679360 solver.cpp:397]     Test net output #1: loss = 2.3025 (* 1 = 2.3025 loss)
I0523 15:43:46.784966 2712679360 solver.cpp:218] Iteration 0 (0 iter/s, 10.157s/100 iters), loss = 2.30202
I0523 15:43:46.785002 2712679360 solver.cpp:237]     Train net output #0: loss = 2.30202 (* 1 = 2.30202 loss)
I0523 15:43:46.785009 2712679360 sgd_solver.cpp:105] Iteration 0, lr = 0.001
I0523 15:44:08.112608 2712679360 solver.cpp:218] Iteration 100 (4.68889 iter/s, 21.327s/100 iters), loss = 1.67773
I0523 15:44:08.112664 2712679360 solver.cpp:237]     Train net output #0: loss = 1.67773 (* 1 = 1.67773 loss)
I0523 15:44:08.112673 2712679360 sgd_solver.cpp:105] Iteration 100, lr = 0.001
I0523 15:44:29.336644 2712679360 solver.cpp:218] Iteration 200 (4.71187 iter/s, 21.223s/100 iters), loss = 1.59886
I0523 15:44:29.336683 2712679360 solver.cpp:237]     Train net output #0: loss = 1.59886 (* 1 = 1.59886 loss)
I0523 15:44:29.336693 2712679360 sgd_solver.cpp:105] Iteration 200, lr = 0.001
I0523 15:44:50.573981 2712679360 solver.cpp:218] Iteration 300 (4.70876 iter/s, 21.237s/100 iters), loss = 1.31839
I0523 15:44:50.574038 2712679360 solver.cpp:237]     Train net output #0: loss = 1.31839 (* 1 = 1.31839 loss)
I0523 15:44:50.574044 2712679360 sgd_solver.cpp:105] Iteration 300, lr = 0.001
I0523 15:45:12.080576 2712679360 solver.cpp:218] Iteration 400 (4.64987 iter/s, 21.506s/100 iters), loss = 1.24876
I0523 15:45:12.080610 2712679360 solver.cpp:237]     Train net output #0: loss = 1.24876 (* 1 = 1.24876 loss)
I0523 15:45:12.080618 2712679360 sgd_solver.cpp:105] Iteration 400, lr = 0.001
I0523 15:45:32.450579 978944 data_layer.cpp:73] Restarting data prefetching from start.
I0523 15:45:33.342396 2712679360 solver.cpp:330] Iteration 500, Testing net (#0)
I0523 15:45:42.732501 1515520 data_layer.cpp:73] Restarting data prefetching from start.
I0523 15:45:43.134589 2712679360 solver.cpp:397]     Test net output #0: accuracy = 0.5366
I0523 15:45:43.134620 2712679360 solver.cpp:397]     Test net output #1: loss = 1.31952 (* 1 = 1.31952 loss)
I0523 15:45:43.360550 2712679360 solver.cpp:218] Iteration 500 (3.19703 iter/s, 31.279s/100 iters), loss = 1.22391
I0523 15:45:43.360582 2712679360 solver.cpp:237]     Train net output #0: loss = 1.22391 (* 1 = 1.22391 loss)
I0523 15:45:43.360589 2712679360 sgd_solver.cpp:105] Iteration 500, lr = 0.001
I0523 15:46:06.734716 2712679360 solver.cpp:218] Iteration 600 (4.27826 iter/s, 23.374s/100 iters), loss = 1.23177
I0523 15:46:06.734771 2712679360 solver.cpp:237]     Train net output #0: loss = 1.23177 (* 1 = 1.23177 loss)
I0523 15:46:06.734779 2712679360 sgd_solver.cpp:105] Iteration 600, lr = 0.001
...... (the intermediate iterations are omitted here; the log entries follow the same pattern) ......
I0523 16:00:46.286926 2712679360 solver.cpp:218] Iteration 3900 (4.08731 iter/s, 24.466s/100 iters), loss = 0.557826
I0523 16:00:46.286960 2712679360 solver.cpp:237]     Train net output #0: loss = 0.557826 (* 1 = 0.557826 loss)
I0523 16:00:46.286967 2712679360 sgd_solver.cpp:105] Iteration 3900, lr = 0.001
I0523 16:01:09.469552 978944 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:01:10.472170 2712679360 solver.cpp:447] Snapshotting to binary proto file examples/cifar10/cifar10_quick_iter_4000.caffemodel
I0523 16:01:10.475755 2712679360 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/cifar10/cifar10_quick_iter_4000.solverstate
I0523 16:01:10.590515 2712679360 solver.cpp:310] Iteration 4000, loss = 0.641508
I0523 16:01:10.590548 2712679360 solver.cpp:330] Iteration 4000, Testing net (#0)
I0523 16:01:21.619536 1515520 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:01:22.054498 2712679360 solver.cpp:397]     Test net output #0: accuracy = 0.7119
I0523 16:01:22.054538 2712679360 solver.cpp:397]     Test net output #1: loss = 0.848064 (* 1 = 0.848064 loss)
I0523 16:01:22.054548 2712679360 solver.cpp:315] Optimization Done.
I0523 16:01:22.054555 2712679360 caffe.cpp:259] Optimization Done.
I0523 16:01:22.119184 2712679360 caffe.cpp:211] Use CPU.
I0523 16:01:22.120214 2712679360 solver.cpp:44] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.0001
display: 100
max_iter: 5000
lr_policy: "fixed"
momentum: 0.9
weight_decay: 0.004
snapshot: 5000
snapshot_prefix: "examples/cifar10/cifar10_quick"
solver_mode: CPU
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
train_state {
  level: 0
  stage: ""
}
snapshot_format: HDF5
I0523 16:01:22.120556 2712679360 solver.cpp:87] Creating training net from net file: examples/cifar10/cifar10_quick_train_test.prototxt
I0523 16:01:22.120817 2712679360 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer cifar
I0523 16:01:22.120833 2712679360 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0523 16:01:22.120841 2712679360 net.cpp:51] Initializing net from parameters:
name: "CIFAR10_quick"
state {
  phase: TRAIN
  level: 0
  stage: ""
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0523 16:01:22.121104 2712679360 layer_factory.hpp:77] Creating layer cifar
I0523 16:01:22.121320 2712679360 db_lmdb.cpp:35] Opened lmdb examples/cifar10/cifar10_train_lmdb
I0523 16:01:22.121383 2712679360 net.cpp:84] Creating Layer cifar
I0523 16:01:22.121393 2712679360 net.cpp:380] cifar -> data
I0523 16:01:22.121413 2712679360 net.cpp:380] cifar -> label
I0523 16:01:22.121431 2712679360 data_transformer.cpp:25] Loading mean file from: examples/cifar10/mean.binaryproto
I0523 16:01:22.121585 2712679360 data_layer.cpp:45] output data size: 100,3,32,32
I0523 16:01:22.128842 2712679360 net.cpp:122] Setting up cifar
I0523 16:01:22.128867 2712679360 net.cpp:129] Top shape: 100 3 32 32 (307200)
I0523 16:01:22.128875 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:01:22.128880 2712679360 net.cpp:137] Memory required for data: 1229200
I0523 16:01:22.128890 2712679360 layer_factory.hpp:77] Creating layer conv1
I0523 16:01:22.128902 2712679360 net.cpp:84] Creating Layer conv1
I0523 16:01:22.128907 2712679360 net.cpp:406] conv1 <- data
I0523 16:01:22.128914 2712679360 net.cpp:380] conv1 -> conv1
I0523 16:01:22.129009 2712679360 net.cpp:122] Setting up conv1
I0523 16:01:22.129017 2712679360 net.cpp:129] Top shape: 100 32 32 32 (3276800)
I0523 16:01:22.129022 2712679360 net.cpp:137] Memory required for data: 14336400
I0523 16:01:22.129030 2712679360 layer_factory.hpp:77] Creating layer pool1
I0523 16:01:22.129039 2712679360 net.cpp:84] Creating Layer pool1
I0523 16:01:22.129042 2712679360 net.cpp:406] pool1 <- conv1
I0523 16:01:22.129047 2712679360 net.cpp:380] pool1 -> pool1
I0523 16:01:22.129057 2712679360 net.cpp:122] Setting up pool1
I0523 16:01:22.129062 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.129067 2712679360 net.cpp:137] Memory required for data: 17613200
I0523 16:01:22.129071 2712679360 layer_factory.hpp:77] Creating layer relu1
I0523 16:01:22.129078 2712679360 net.cpp:84] Creating Layer relu1
I0523 16:01:22.129083 2712679360 net.cpp:406] relu1 <- pool1
I0523 16:01:22.129087 2712679360 net.cpp:367] relu1 -> pool1 (in-place)
I0523 16:01:22.129093 2712679360 net.cpp:122] Setting up relu1
I0523 16:01:22.129097 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.129102 2712679360 net.cpp:137] Memory required for data: 20890000
I0523 16:01:22.129106 2712679360 layer_factory.hpp:77] Creating layer conv2
I0523 16:01:22.129117 2712679360 net.cpp:84] Creating Layer conv2
I0523 16:01:22.129120 2712679360 net.cpp:406] conv2 <- pool1
I0523 16:01:22.129125 2712679360 net.cpp:380] conv2 -> conv2
I0523 16:01:22.129482 2712679360 net.cpp:122] Setting up conv2
I0523 16:01:22.129487 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.129493 2712679360 net.cpp:137] Memory required for data: 24166800
I0523 16:01:22.129500 2712679360 layer_factory.hpp:77] Creating layer relu2
I0523 16:01:22.129505 2712679360 net.cpp:84] Creating Layer relu2
I0523 16:01:22.129509 2712679360 net.cpp:406] relu2 <- conv2
I0523 16:01:22.129514 2712679360 net.cpp:367] relu2 -> conv2 (in-place)
I0523 16:01:22.129520 2712679360 net.cpp:122] Setting up relu2
I0523 16:01:22.129524 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.129528 2712679360 net.cpp:137] Memory required for data: 27443600
I0523 16:01:22.129534 2712679360 layer_factory.hpp:77] Creating layer pool2
I0523 16:01:22.129537 2712679360 net.cpp:84] Creating Layer pool2
I0523 16:01:22.129541 2712679360 net.cpp:406] pool2 <- conv2
I0523 16:01:22.129547 2712679360 net.cpp:380] pool2 -> pool2
I0523 16:01:22.129554 2712679360 net.cpp:122] Setting up pool2
I0523 16:01:22.129557 2712679360 net.cpp:129] Top shape: 100 32 8 8 (204800)
I0523 16:01:22.129562 2712679360 net.cpp:137] Memory required for data: 28262800
I0523 16:01:22.129566 2712679360 layer_factory.hpp:77] Creating layer conv3
I0523 16:01:22.129573 2712679360 net.cpp:84] Creating Layer conv3
I0523 16:01:22.129577 2712679360 net.cpp:406] conv3 <- pool2
I0523 16:01:22.129585 2712679360 net.cpp:380] conv3 -> conv3
I0523 16:01:22.130280 2712679360 net.cpp:122] Setting up conv3
I0523 16:01:22.130286 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 16:01:22.130292 2712679360 net.cpp:137] Memory required for data: 29901200
I0523 16:01:22.130298 2712679360 layer_factory.hpp:77] Creating layer relu3
I0523 16:01:22.130304 2712679360 net.cpp:84] Creating Layer relu3
I0523 16:01:22.130308 2712679360 net.cpp:406] relu3 <- conv3
I0523 16:01:22.130313 2712679360 net.cpp:367] relu3 -> conv3 (in-place)
I0523 16:01:22.130318 2712679360 net.cpp:122] Setting up relu3
I0523 16:01:22.130353 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 16:01:22.130360 2712679360 net.cpp:137] Memory required for data: 31539600
I0523 16:01:22.130364 2712679360 layer_factory.hpp:77] Creating layer pool3
I0523 16:01:22.130370 2712679360 net.cpp:84] Creating Layer pool3
I0523 16:01:22.130374 2712679360 net.cpp:406] pool3 <- conv3
I0523 16:01:22.130379 2712679360 net.cpp:380] pool3 -> pool3
I0523 16:01:22.130385 2712679360 net.cpp:122] Setting up pool3
I0523 16:01:22.130389 2712679360 net.cpp:129] Top shape: 100 64 4 4 (102400)
I0523 16:01:22.130396 2712679360 net.cpp:137] Memory required for data: 31949200
I0523 16:01:22.130400 2712679360 layer_factory.hpp:77] Creating layer ip1
I0523 16:01:22.130409 2712679360 net.cpp:84] Creating Layer ip1
I0523 16:01:22.130414 2712679360 net.cpp:406] ip1 <- pool3
I0523 16:01:22.130419 2712679360 net.cpp:380] ip1 -> ip1
I0523 16:01:22.131337 2712679360 net.cpp:122] Setting up ip1
I0523 16:01:22.131347 2712679360 net.cpp:129] Top shape: 100 64 (6400)
I0523 16:01:22.131352 2712679360 net.cpp:137] Memory required for data: 31974800
I0523 16:01:22.131358 2712679360 layer_factory.hpp:77] Creating layer ip2
I0523 16:01:22.131364 2712679360 net.cpp:84] Creating Layer ip2
I0523 16:01:22.131369 2712679360 net.cpp:406] ip2 <- ip1
I0523 16:01:22.131374 2712679360 net.cpp:380] ip2 -> ip2
I0523 16:01:22.131392 2712679360 net.cpp:122] Setting up ip2
I0523 16:01:22.131397 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:01:22.131400 2712679360 net.cpp:137] Memory required for data: 31978800
I0523 16:01:22.131407 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 16:01:22.131413 2712679360 net.cpp:84] Creating Layer loss
I0523 16:01:22.131417 2712679360 net.cpp:406] loss <- ip2
I0523 16:01:22.131422 2712679360 net.cpp:406] loss <- label
I0523 16:01:22.131427 2712679360 net.cpp:380] loss -> loss
I0523 16:01:22.131435 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 16:01:22.131448 2712679360 net.cpp:122] Setting up loss
I0523 16:01:22.131453 2712679360 net.cpp:129] Top shape: (1)
I0523 16:01:22.131458 2712679360 net.cpp:132]     with loss weight 1
I0523 16:01:22.131471 2712679360 net.cpp:137] Memory required for data: 31978804
I0523 16:01:22.131476 2712679360 net.cpp:198] loss needs backward computation.
I0523 16:01:22.131495 2712679360 net.cpp:198] ip2 needs backward computation.
I0523 16:01:22.131505 2712679360 net.cpp:198] ip1 needs backward computation.
I0523 16:01:22.131510 2712679360 net.cpp:198] pool3 needs backward computation.
I0523 16:01:22.131515 2712679360 net.cpp:198] relu3 needs backward computation.
I0523 16:01:22.131518 2712679360 net.cpp:198] conv3 needs backward computation.
I0523 16:01:22.131522 2712679360 net.cpp:198] pool2 needs backward computation.
I0523 16:01:22.131527 2712679360 net.cpp:198] relu2 needs backward computation.
I0523 16:01:22.131531 2712679360 net.cpp:198] conv2 needs backward computation.
I0523 16:01:22.131536 2712679360 net.cpp:198] relu1 needs backward computation.
I0523 16:01:22.131541 2712679360 net.cpp:198] pool1 needs backward computation.
I0523 16:01:22.131544 2712679360 net.cpp:198] conv1 needs backward computation.
I0523 16:01:22.131548 2712679360 net.cpp:200] cifar does not need backward computation.
I0523 16:01:22.131552 2712679360 net.cpp:242] This network produces output loss
I0523 16:01:22.131561 2712679360 net.cpp:255] Network initialization done.
I0523 16:01:22.131786 2712679360 solver.cpp:172] Creating test net (#0) specified by net file: examples/cifar10/cifar10_quick_train_test.prototxt
I0523 16:01:22.131814 2712679360 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer cifar
I0523 16:01:22.131826 2712679360 net.cpp:51] Initializing net from parameters:
name: "CIFAR10_quick"
state {
  phase: TEST
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0523 16:01:22.132225 2712679360 layer_factory.hpp:77] Creating layer cifar
I0523 16:01:22.132313 2712679360 db_lmdb.cpp:35] Opened lmdb examples/cifar10/cifar10_test_lmdb
I0523 16:01:22.132342 2712679360 net.cpp:84] Creating Layer cifar
I0523 16:01:22.132356 2712679360 net.cpp:380] cifar -> data
I0523 16:01:22.132364 2712679360 net.cpp:380] cifar -> label
I0523 16:01:22.132372 2712679360 data_transformer.cpp:25] Loading mean file from: examples/cifar10/mean.binaryproto
I0523 16:01:22.132438 2712679360 data_layer.cpp:45] output data size: 100,3,32,32
I0523 16:01:22.134943 2712679360 net.cpp:122] Setting up cifar
I0523 16:01:22.134956 2712679360 net.cpp:129] Top shape: 100 3 32 32 (307200)
I0523 16:01:22.134963 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:01:22.134968 2712679360 net.cpp:137] Memory required for data: 1229200
I0523 16:01:22.134974 2712679360 layer_factory.hpp:77] Creating layer label_cifar_1_split
I0523 16:01:22.134984 2712679360 net.cpp:84] Creating Layer label_cifar_1_split
I0523 16:01:22.135015 2712679360 net.cpp:406] label_cifar_1_split <- label
I0523 16:01:22.135064 2712679360 net.cpp:380] label_cifar_1_split -> label_cifar_1_split_0
I0523 16:01:22.135078 2712679360 net.cpp:380] label_cifar_1_split -> label_cifar_1_split_1
I0523 16:01:22.135116 2712679360 net.cpp:122] Setting up label_cifar_1_split
I0523 16:01:22.135167 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:01:22.135203 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:01:22.135241 2712679360 net.cpp:137] Memory required for data: 1230000
I0523 16:01:22.135313 2712679360 layer_factory.hpp:77] Creating layer conv1
I0523 16:01:22.135330 2712679360 net.cpp:84] Creating Layer conv1
I0523 16:01:22.135335 2712679360 net.cpp:406] conv1 <- data
I0523 16:01:22.135342 2712679360 net.cpp:380] conv1 -> conv1
I0523 16:01:22.135398 2712679360 net.cpp:122] Setting up conv1
I0523 16:01:22.135404 2712679360 net.cpp:129] Top shape: 100 32 32 32 (3276800)
I0523 16:01:22.135411 2712679360 net.cpp:137] Memory required for data: 14337200
I0523 16:01:22.135418 2712679360 layer_factory.hpp:77] Creating layer pool1
I0523 16:01:22.135463 2712679360 net.cpp:84] Creating Layer pool1
I0523 16:01:22.135473 2712679360 net.cpp:406] pool1 <- conv1
I0523 16:01:22.135514 2712679360 net.cpp:380] pool1 -> pool1
I0523 16:01:22.135565 2712679360 net.cpp:122] Setting up pool1
I0523 16:01:22.135574 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.135581 2712679360 net.cpp:137] Memory required for data: 17614000
I0523 16:01:22.135586 2712679360 layer_factory.hpp:77] Creating layer relu1
I0523 16:01:22.135593 2712679360 net.cpp:84] Creating Layer relu1
I0523 16:01:22.135598 2712679360 net.cpp:406] relu1 <- pool1
I0523 16:01:22.135603 2712679360 net.cpp:367] relu1 -> pool1 (in-place)
I0523 16:01:22.135609 2712679360 net.cpp:122] Setting up relu1
I0523 16:01:22.135613 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.135666 2712679360 net.cpp:137] Memory required for data: 20890800
I0523 16:01:22.135673 2712679360 layer_factory.hpp:77] Creating layer conv2
I0523 16:01:22.135681 2712679360 net.cpp:84] Creating Layer conv2
I0523 16:01:22.135686 2712679360 net.cpp:406] conv2 <- pool1
I0523 16:01:22.135700 2712679360 net.cpp:380] conv2 -> conv2
I0523 16:01:22.136068 2712679360 net.cpp:122] Setting up conv2
I0523 16:01:22.136076 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.136081 2712679360 net.cpp:137] Memory required for data: 24167600
I0523 16:01:22.136088 2712679360 layer_factory.hpp:77] Creating layer relu2
I0523 16:01:22.136095 2712679360 net.cpp:84] Creating Layer relu2
I0523 16:01:22.136098 2712679360 net.cpp:406] relu2 <- conv2
I0523 16:01:22.136103 2712679360 net.cpp:367] relu2 -> conv2 (in-place)
I0523 16:01:22.136108 2712679360 net.cpp:122] Setting up relu2
I0523 16:01:22.136112 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:01:22.136117 2712679360 net.cpp:137] Memory required for data: 27444400
I0523 16:01:22.136121 2712679360 layer_factory.hpp:77] Creating layer pool2
I0523 16:01:22.136127 2712679360 net.cpp:84] Creating Layer pool2
I0523 16:01:22.136132 2712679360 net.cpp:406] pool2 <- conv2
I0523 16:01:22.136135 2712679360 net.cpp:380] pool2 -> pool2
I0523 16:01:22.136142 2712679360 net.cpp:122] Setting up pool2
I0523 16:01:22.136147 2712679360 net.cpp:129] Top shape: 100 32 8 8 (204800)
I0523 16:01:22.136152 2712679360 net.cpp:137] Memory required for data: 28263600
I0523 16:01:22.136157 2712679360 layer_factory.hpp:77] Creating layer conv3
I0523 16:01:22.136163 2712679360 net.cpp:84] Creating Layer conv3
I0523 16:01:22.136168 2712679360 net.cpp:406] conv3 <- pool2
I0523 16:01:22.136173 2712679360 net.cpp:380] conv3 -> conv3
I0523 16:01:22.136878 2712679360 net.cpp:122] Setting up conv3
I0523 16:01:22.136888 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 16:01:22.136893 2712679360 net.cpp:137] Memory required for data: 29902000
I0523 16:01:22.136899 2712679360 layer_factory.hpp:77] Creating layer relu3
I0523 16:01:22.136904 2712679360 net.cpp:84] Creating Layer relu3
I0523 16:01:22.136909 2712679360 net.cpp:406] relu3 <- conv3
I0523 16:01:22.136914 2712679360 net.cpp:367] relu3 -> conv3 (in-place)
I0523 16:01:22.136919 2712679360 net.cpp:122] Setting up relu3
I0523 16:01:22.136930 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 16:01:22.136961 2712679360 net.cpp:137] Memory required for data: 31540400
I0523 16:01:22.136968 2712679360 layer_factory.hpp:77] Creating layer pool3
I0523 16:01:22.136976 2712679360 net.cpp:84] Creating Layer pool3
I0523 16:01:22.137001 2712679360 net.cpp:406] pool3 <- conv3
I0523 16:01:22.137008 2712679360 net.cpp:380] pool3 -> pool3
I0523 16:01:22.137017 2712679360 net.cpp:122] Setting up pool3
I0523 16:01:22.137022 2712679360 net.cpp:129] Top shape: 100 64 4 4 (102400)
I0523 16:01:22.137027 2712679360 net.cpp:137] Memory required for data: 31950000
I0523 16:01:22.137032 2712679360 layer_factory.hpp:77] Creating layer ip1
I0523 16:01:22.137039 2712679360 net.cpp:84] Creating Layer ip1
I0523 16:01:22.137044 2712679360 net.cpp:406] ip1 <- pool3
I0523 16:01:22.137050 2712679360 net.cpp:380] ip1 -> ip1
I0523 16:01:22.137981 2712679360 net.cpp:122] Setting up ip1
I0523 16:01:22.137995 2712679360 net.cpp:129] Top shape: 100 64 (6400)
I0523 16:01:22.138002 2712679360 net.cpp:137] Memory required for data: 31975600
I0523 16:01:22.138008 2712679360 layer_factory.hpp:77] Creating layer ip2
I0523 16:01:22.138016 2712679360 net.cpp:84] Creating Layer ip2
I0523 16:01:22.138021 2712679360 net.cpp:406] ip2 <- ip1
I0523 16:01:22.138027 2712679360 net.cpp:380] ip2 -> ip2
I0523 16:01:22.138046 2712679360 net.cpp:122] Setting up ip2
I0523 16:01:22.138051 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:01:22.138056 2712679360 net.cpp:137] Memory required for data: 31979600
I0523 16:01:22.138062 2712679360 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0523 16:01:22.138085 2712679360 net.cpp:84] Creating Layer ip2_ip2_0_split
I0523 16:01:22.138103 2712679360 net.cpp:406] ip2_ip2_0_split <- ip2
I0523 16:01:22.138115 2712679360 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0523 16:01:22.138129 2712679360 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0523 16:01:22.138142 2712679360 net.cpp:122] Setting up ip2_ip2_0_split
I0523 16:01:22.138150 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:01:22.138160 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:01:22.138170 2712679360 net.cpp:137] Memory required for data: 31987600
I0523 16:01:22.138177 2712679360 layer_factory.hpp:77] Creating layer accuracy
I0523 16:01:22.138187 2712679360 net.cpp:84] Creating Layer accuracy
I0523 16:01:22.138219 2712679360 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0523 16:01:22.138231 2712679360 net.cpp:406] accuracy <- label_cifar_1_split_0
I0523 16:01:22.138242 2712679360 net.cpp:380] accuracy -> accuracy
I0523 16:01:22.138257 2712679360 net.cpp:122] Setting up accuracy
I0523 16:01:22.138264 2712679360 net.cpp:129] Top shape: (1)
I0523 16:01:22.138274 2712679360 net.cpp:137] Memory required for data: 31987604
I0523 16:01:22.138279 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 16:01:22.138286 2712679360 net.cpp:84] Creating Layer loss
I0523 16:01:22.138290 2712679360 net.cpp:406] loss <- ip2_ip2_0_split_1
I0523 16:01:22.138327 2712679360 net.cpp:406] loss <- label_cifar_1_split_1
I0523 16:01:22.138334 2712679360 net.cpp:380] loss -> loss
I0523 16:01:22.138342 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 16:01:22.138352 2712679360 net.cpp:122] Setting up loss
I0523 16:01:22.138357 2712679360 net.cpp:129] Top shape: (1)
I0523 16:01:22.138362 2712679360 net.cpp:132]     with loss weight 1
I0523 16:01:22.138368 2712679360 net.cpp:137] Memory required for data: 31987608
I0523 16:01:22.138372 2712679360 net.cpp:198] loss needs backward computation.
I0523 16:01:22.138377 2712679360 net.cpp:200] accuracy does not need backward computation.
I0523 16:01:22.138382 2712679360 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0523 16:01:22.138386 2712679360 net.cpp:198] ip2 needs backward computation.
I0523 16:01:22.138391 2712679360 net.cpp:198] ip1 needs backward computation.
I0523 16:01:22.138396 2712679360 net.cpp:198] pool3 needs backward computation.
I0523 16:01:22.138401 2712679360 net.cpp:198] relu3 needs backward computation.
I0523 16:01:22.138404 2712679360 net.cpp:198] conv3 needs backward computation.
I0523 16:01:22.138408 2712679360 net.cpp:198] pool2 needs backward computation.
I0523 16:01:22.138412 2712679360 net.cpp:198] relu2 needs backward computation.
I0523 16:01:22.138417 2712679360 net.cpp:198] conv2 needs backward computation.
I0523 16:01:22.138444 2712679360 net.cpp:198] relu1 needs backward computation.
I0523 16:01:22.138449 2712679360 net.cpp:198] pool1 needs backward computation.
I0523 16:01:22.138454 2712679360 net.cpp:198] conv1 needs backward computation.
I0523 16:01:22.138463 2712679360 net.cpp:200] label_cifar_1_split does not need backward computation.
I0523 16:01:22.138468 2712679360 net.cpp:200] cifar does not need backward computation.
I0523 16:01:22.138470 2712679360 net.cpp:242] This network produces output accuracy
I0523 16:01:22.138476 2712679360 net.cpp:242] This network produces output loss
I0523 16:01:22.138485 2712679360 net.cpp:255] Network initialization done.
I0523 16:01:22.138537 2712679360 solver.cpp:56] Solver scaffolding done.
I0523 16:01:22.138566 2712679360 caffe.cpp:242] Resuming from examples/cifar10/cifar10_quick_iter_4000.solverstate
I0523 16:01:22.139786 2712679360 sgd_solver.cpp:318] SGDSolver: restoring history
I0523 16:01:22.140019 2712679360 caffe.cpp:248] Starting Optimization
I0523 16:01:22.140027 2712679360 solver.cpp:272] Solving CIFAR10_quick
I0523 16:01:22.140031 2712679360 solver.cpp:273] Learning Rate Policy: fixed
I0523 16:01:22.140113 2712679360 solver.cpp:330] Iteration 4000, Testing net (#0)
I0523 16:01:32.383680 215015424 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:01:32.807214 2712679360 solver.cpp:397]     Test net output #0: accuracy = 0.7119
I0523 16:01:32.807250 2712679360 solver.cpp:397]     Test net output #1: loss = 0.848064 (* 1 = 0.848064 loss)
I0523 16:01:33.065510 2712679360 solver.cpp:218] Iteration 4000 (366.133 iter/s, 10.925s/100 iters), loss = 0.641508
I0523 16:01:33.065546 2712679360 solver.cpp:237]     Train net output #0: loss = 0.641508 (* 1 = 0.641508 loss)
I0523 16:01:33.065553 2712679360 sgd_solver.cpp:105] Iteration 4000, lr = 0.0001
I0523 16:01:56.950950 2712679360 solver.cpp:218] Iteration 4100 (4.18673 iter/s, 23.885s/100 iters), loss = 0.603556
I0523 16:01:56.951002 2712679360 solver.cpp:237]     Train net output #0: loss = 0.603556 (* 1 = 0.603556 loss)
I0523 16:01:56.951010 2712679360 sgd_solver.cpp:105] Iteration 4100, lr = 0.0001
I0523 16:02:21.127391 2712679360 solver.cpp:218] Iteration 4200 (4.13633 iter/s, 24.176s/100 iters), loss = 0.491505
I0523 16:02:21.127429 2712679360 solver.cpp:237]     Train net output #0: loss = 0.491505 (* 1 = 0.491505 loss)
I0523 16:02:21.127437 2712679360 sgd_solver.cpp:105] Iteration 4200, lr = 0.0001
I0523 16:02:46.283135 2712679360 solver.cpp:218] Iteration 4300 (3.97535 iter/s, 25.155s/100 iters), loss = 0.495313
I0523 16:02:46.283190 2712679360 solver.cpp:237]     Train net output #0: loss = 0.495313 (* 1 = 0.495313 loss)
I0523 16:02:46.283198 2712679360 sgd_solver.cpp:105] Iteration 4300, lr = 0.0001
I0523 16:03:10.841265 2712679360 solver.cpp:218] Iteration 4400 (4.07199 iter/s, 24.558s/100 iters), loss = 0.438567
I0523 16:03:10.841303 2712679360 solver.cpp:237]     Train net output #0: loss = 0.438567 (* 1 = 0.438567 loss)
I0523 16:03:10.841310 2712679360 sgd_solver.cpp:105] Iteration 4400, lr = 0.0001
I0523 16:03:33.942627 214478848 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:03:34.958622 2712679360 solver.cpp:330] Iteration 4500, Testing net (#0)
I0523 16:03:45.910739 215015424 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:03:46.349741 2712679360 solver.cpp:397]     Test net output #0: accuracy = 0.752
I0523 16:03:46.349779 2712679360 solver.cpp:397]     Test net output #1: loss = 0.748076 (* 1 = 0.748076 loss)
I0523 16:03:46.589071 2712679360 solver.cpp:218] Iteration 4500 (2.79744 iter/s, 35.747s/100 iters), loss = 0.503921
I0523 16:03:46.589107 2712679360 solver.cpp:237]     Train net output #0: loss = 0.503921 (* 1 = 0.503921 loss)
I0523 16:03:46.589113 2712679360 sgd_solver.cpp:105] Iteration 4500, lr = 0.0001
I0523 16:04:10.851019 2712679360 solver.cpp:218] Iteration 4600 (4.12184 iter/s, 24.261s/100 iters), loss = 0.562534
I0523 16:04:10.851088 2712679360 solver.cpp:237]     Train net output #0: loss = 0.562534 (* 1 = 0.562534 loss)
I0523 16:04:10.851095 2712679360 sgd_solver.cpp:105] Iteration 4600, lr = 0.0001
I0523 16:04:35.547813 2712679360 solver.cpp:218] Iteration 4700 (4.04924 iter/s, 24.696s/100 iters), loss = 0.464102
I0523 16:04:35.547852 2712679360 solver.cpp:237]     Train net output #0: loss = 0.464102 (* 1 = 0.464102 loss)
I0523 16:04:35.547860 2712679360 sgd_solver.cpp:105] Iteration 4700, lr = 0.0001
I0523 16:05:00.517423 2712679360 solver.cpp:218] Iteration 4800 (4.00497 iter/s, 24.969s/100 iters), loss = 0.474584
I0523 16:05:00.517478 2712679360 solver.cpp:237]     Train net output #0: loss = 0.474584 (* 1 = 0.474584 loss)
I0523 16:05:00.517487 2712679360 sgd_solver.cpp:105] Iteration 4800, lr = 0.0001
I0523 16:05:24.429520 2712679360 solver.cpp:218] Iteration 4900 (4.182 iter/s, 23.912s/100 iters), loss = 0.417258
I0523 16:05:24.429554 2712679360 solver.cpp:237]     Train net output #0: loss = 0.417258 (* 1 = 0.417258 loss)
I0523 16:05:24.429563 2712679360 sgd_solver.cpp:105] Iteration 4900, lr = 0.0001
I0523 16:05:47.148733 214478848 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:05:48.086921 2712679360 solver.cpp:457] Snapshotting to HDF5 file examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5
I0523 16:05:48.101351 2712679360 sgd_solver.cpp:283] Snapshotting solver state to HDF5 file examples/cifar10/cifar10_quick_iter_5000.solverstate.h5
I0523 16:05:48.215885 2712679360 solver.cpp:310] Iteration 5000, loss = 0.487594
I0523 16:05:48.215921 2712679360 solver.cpp:330] Iteration 5000, Testing net (#0)
I0523 16:05:58.710295 215015424 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:05:59.149840 2712679360 solver.cpp:397]     Test net output #0: accuracy = 0.754
I0523 16:05:59.149875 2712679360 solver.cpp:397]     Test net output #1: loss = 0.742307 (* 1 = 0.742307 loss)
I0523 16:05:59.149883 2712679360 solver.cpp:315] Optimization Done.
I0523 16:05:59.149888 2712679360 caffe.cpp:259] Optimization Done.

Training is finished, and the test network was already created and evaluated at the end.

Next, we use the trained cifar10 model to make predictions on the test data.

Run the following command:

➜  caffe git:(master) ✗ ./build/tools/caffe.bin test \
-model examples/cifar10/cifar10_quick_train_test.prototxt \
-weights examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5 \
-iterations 100
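# -iterations 100 x test batch_size 100 = 10,000 images, i.e. the full CIFAR-10 test set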

This produces the following predictions on the test dataset:

I0523 16:25:41.234220 2712679360 caffe.cpp:284] Use CPU.
I0523 16:25:41.238044 2712679360 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer cifar
I0523 16:25:41.238080 2712679360 net.cpp:51] Initializing net from parameters:
name: "CIFAR10_quick"
state {
  phase: TEST
  level: 0
  stage: ""
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0523 16:25:41.238523 2712679360 layer_factory.hpp:77] Creating layer cifar
I0523 16:25:41.238731 2712679360 db_lmdb.cpp:35] Opened lmdb examples/cifar10/cifar10_test_lmdb
I0523 16:25:41.238788 2712679360 net.cpp:84] Creating Layer cifar
I0523 16:25:41.238796 2712679360 net.cpp:380] cifar -> data
I0523 16:25:41.238816 2712679360 net.cpp:380] cifar -> label
I0523 16:25:41.238834 2712679360 data_transformer.cpp:25] Loading mean file from: examples/cifar10/mean.binaryproto
I0523 16:25:41.238957 2712679360 data_layer.cpp:45] output data size: 100,3,32,32
I0523 16:25:41.246219 2712679360 net.cpp:122] Setting up cifar
I0523 16:25:41.246245 2712679360 net.cpp:129] Top shape: 100 3 32 32 (307200)
I0523 16:25:41.246253 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:25:41.246258 2712679360 net.cpp:137] Memory required for data: 1229200
I0523 16:25:41.246266 2712679360 layer_factory.hpp:77] Creating layer label_cifar_1_split
I0523 16:25:41.246278 2712679360 net.cpp:84] Creating Layer label_cifar_1_split
I0523 16:25:41.246282 2712679360 net.cpp:406] label_cifar_1_split <- label
I0523 16:25:41.246343 2712679360 net.cpp:380] label_cifar_1_split -> label_cifar_1_split_0
I0523 16:25:41.246367 2712679360 net.cpp:380] label_cifar_1_split -> label_cifar_1_split_1
I0523 16:25:41.246381 2712679360 net.cpp:122] Setting up label_cifar_1_split
I0523 16:25:41.246390 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:25:41.246400 2712679360 net.cpp:129] Top shape: 100 (100)
I0523 16:25:41.246409 2712679360 net.cpp:137] Memory required for data: 1230000
I0523 16:25:41.246417 2712679360 layer_factory.hpp:77] Creating layer conv1
I0523 16:25:41.246438 2712679360 net.cpp:84] Creating Layer conv1
I0523 16:25:41.246448 2712679360 net.cpp:406] conv1 <- data
I0523 16:25:41.246457 2712679360 net.cpp:380] conv1 -> conv1
I0523 16:25:41.246606 2712679360 net.cpp:122] Setting up conv1
I0523 16:25:41.246637 2712679360 net.cpp:129] Top shape: 100 32 32 32 (3276800)
I0523 16:25:41.246680 2712679360 net.cpp:137] Memory required for data: 14337200
I0523 16:25:41.246693 2712679360 layer_factory.hpp:77] Creating layer pool1
I0523 16:25:41.246708 2712679360 net.cpp:84] Creating Layer pool1
I0523 16:25:41.246721 2712679360 net.cpp:406] pool1 <- conv1
I0523 16:25:41.246731 2712679360 net.cpp:380] pool1 -> pool1
I0523 16:25:41.246752 2712679360 net.cpp:122] Setting up pool1
I0523 16:25:41.246781 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:25:41.246788 2712679360 net.cpp:137] Memory required for data: 17614000
I0523 16:25:41.246793 2712679360 layer_factory.hpp:77] Creating layer relu1
I0523 16:25:41.246804 2712679360 net.cpp:84] Creating Layer relu1
I0523 16:25:41.246809 2712679360 net.cpp:406] relu1 <- pool1
I0523 16:25:41.246814 2712679360 net.cpp:367] relu1 -> pool1 (in-place)
I0523 16:25:41.246821 2712679360 net.cpp:122] Setting up relu1
I0523 16:25:41.246825 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:25:41.246830 2712679360 net.cpp:137] Memory required for data: 20890800
I0523 16:25:41.246834 2712679360 layer_factory.hpp:77] Creating layer conv2
I0523 16:25:41.246841 2712679360 net.cpp:84] Creating Layer conv2
I0523 16:25:41.246846 2712679360 net.cpp:406] conv2 <- pool1
I0523 16:25:41.246851 2712679360 net.cpp:380] conv2 -> conv2
I0523 16:25:41.247228 2712679360 net.cpp:122] Setting up conv2
I0523 16:25:41.247236 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:25:41.247242 2712679360 net.cpp:137] Memory required for data: 24167600
I0523 16:25:41.247249 2712679360 layer_factory.hpp:77] Creating layer relu2
I0523 16:25:41.247259 2712679360 net.cpp:84] Creating Layer relu2
I0523 16:25:41.247264 2712679360 net.cpp:406] relu2 <- conv2
I0523 16:25:41.247269 2712679360 net.cpp:367] relu2 -> conv2 (in-place)
I0523 16:25:41.247274 2712679360 net.cpp:122] Setting up relu2
I0523 16:25:41.247278 2712679360 net.cpp:129] Top shape: 100 32 16 16 (819200)
I0523 16:25:41.247283 2712679360 net.cpp:137] Memory required for data: 27444400
I0523 16:25:41.247287 2712679360 layer_factory.hpp:77] Creating layer pool2
I0523 16:25:41.247293 2712679360 net.cpp:84] Creating Layer pool2
I0523 16:25:41.247298 2712679360 net.cpp:406] pool2 <- conv2
I0523 16:25:41.247301 2712679360 net.cpp:380] pool2 -> pool2
I0523 16:25:41.247308 2712679360 net.cpp:122] Setting up pool2
I0523 16:25:41.247313 2712679360 net.cpp:129] Top shape: 100 32 8 8 (204800)
I0523 16:25:41.247318 2712679360 net.cpp:137] Memory required for data: 28263600
I0523 16:25:41.247321 2712679360 layer_factory.hpp:77] Creating layer conv3
I0523 16:25:41.247329 2712679360 net.cpp:84] Creating Layer conv3
I0523 16:25:41.247334 2712679360 net.cpp:406] conv3 <- pool2
I0523 16:25:41.247339 2712679360 net.cpp:380] conv3 -> conv3
I0523 16:25:41.248001 2712679360 net.cpp:122] Setting up conv3
I0523 16:25:41.248008 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 16:25:41.248013 2712679360 net.cpp:137] Memory required for data: 29902000
I0523 16:25:41.248020 2712679360 layer_factory.hpp:77] Creating layer relu3
I0523 16:25:41.248025 2712679360 net.cpp:84] Creating Layer relu3
I0523 16:25:41.248051 2712679360 net.cpp:406] relu3 <- conv3
I0523 16:25:41.248057 2712679360 net.cpp:367] relu3 -> conv3 (in-place)
I0523 16:25:41.248067 2712679360 net.cpp:122] Setting up relu3
I0523 16:25:41.248072 2712679360 net.cpp:129] Top shape: 100 64 8 8 (409600)
I0523 16:25:41.248077 2712679360 net.cpp:137] Memory required for data: 31540400
I0523 16:25:41.248081 2712679360 layer_factory.hpp:77] Creating layer pool3
I0523 16:25:41.248085 2712679360 net.cpp:84] Creating Layer pool3
I0523 16:25:41.248090 2712679360 net.cpp:406] pool3 <- conv3
I0523 16:25:41.248095 2712679360 net.cpp:380] pool3 -> pool3
I0523 16:25:41.248102 2712679360 net.cpp:122] Setting up pool3
I0523 16:25:41.248109 2712679360 net.cpp:129] Top shape: 100 64 4 4 (102400)
I0523 16:25:41.248114 2712679360 net.cpp:137] Memory required for data: 31950000
I0523 16:25:41.248117 2712679360 layer_factory.hpp:77] Creating layer ip1
I0523 16:25:41.248124 2712679360 net.cpp:84] Creating Layer ip1
I0523 16:25:41.248152 2712679360 net.cpp:406] ip1 <- pool3
I0523 16:25:41.248162 2712679360 net.cpp:380] ip1 -> ip1
I0523 16:25:41.248950 2712679360 net.cpp:122] Setting up ip1
I0523 16:25:41.248993 2712679360 net.cpp:129] Top shape: 100 64 (6400)
I0523 16:25:41.249008 2712679360 net.cpp:137] Memory required for data: 31975600
I0523 16:25:41.249014 2712679360 layer_factory.hpp:77] Creating layer ip2
I0523 16:25:41.249020 2712679360 net.cpp:84] Creating Layer ip2
I0523 16:25:41.249024 2712679360 net.cpp:406] ip2 <- ip1
I0523 16:25:41.249038 2712679360 net.cpp:380] ip2 -> ip2
I0523 16:25:41.249080 2712679360 net.cpp:122] Setting up ip2
I0523 16:25:41.249097 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:25:41.249102 2712679360 net.cpp:137] Memory required for data: 31979600
I0523 16:25:41.249115 2712679360 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0523 16:25:41.249120 2712679360 net.cpp:84] Creating Layer ip2_ip2_0_split
I0523 16:25:41.249125 2712679360 net.cpp:406] ip2_ip2_0_split <- ip2
I0523 16:25:41.249130 2712679360 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0523 16:25:41.249143 2712679360 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0523 16:25:41.249150 2712679360 net.cpp:122] Setting up ip2_ip2_0_split
I0523 16:25:41.249155 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:25:41.249164 2712679360 net.cpp:129] Top shape: 100 10 (1000)
I0523 16:25:41.249171 2712679360 net.cpp:137] Memory required for data: 31987600
I0523 16:25:41.249174 2712679360 layer_factory.hpp:77] Creating layer accuracy
I0523 16:25:41.249183 2712679360 net.cpp:84] Creating Layer accuracy
I0523 16:25:41.249187 2712679360 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0523 16:25:41.249191 2712679360 net.cpp:406] accuracy <- label_cifar_1_split_0
I0523 16:25:41.249195 2712679360 net.cpp:380] accuracy -> accuracy
I0523 16:25:41.249202 2712679360 net.cpp:122] Setting up accuracy
I0523 16:25:41.249205 2712679360 net.cpp:129] Top shape: (1)
I0523 16:25:41.249209 2712679360 net.cpp:137] Memory required for data: 31987604
I0523 16:25:41.249214 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 16:25:41.249219 2712679360 net.cpp:84] Creating Layer loss
I0523 16:25:41.249223 2712679360 net.cpp:406] loss <- ip2_ip2_0_split_1
I0523 16:25:41.249236 2712679360 net.cpp:406] loss <- label_cifar_1_split_1
I0523 16:25:41.249241 2712679360 net.cpp:380] loss -> loss
I0523 16:25:41.249249 2712679360 layer_factory.hpp:77] Creating layer loss
I0523 16:25:41.249266 2712679360 net.cpp:122] Setting up loss
I0523 16:25:41.249274 2712679360 net.cpp:129] Top shape: (1)
I0523 16:25:41.249279 2712679360 net.cpp:132]     with loss weight 1
I0523 16:25:41.249300 2712679360 net.cpp:137] Memory required for data: 31987608
I0523 16:25:41.249305 2712679360 net.cpp:198] loss needs backward computation.
I0523 16:25:41.249310 2712679360 net.cpp:200] accuracy does not need backward computation.
I0523 16:25:41.249320 2712679360 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0523 16:25:41.249325 2712679360 net.cpp:198] ip2 needs backward computation.
I0523 16:25:41.249330 2712679360 net.cpp:198] ip1 needs backward computation.
I0523 16:25:41.249366 2712679360 net.cpp:198] pool3 needs backward computation.
I0523 16:25:41.249388 2712679360 net.cpp:198] relu3 needs backward computation.
I0523 16:25:41.249392 2712679360 net.cpp:198] conv3 needs backward computation.
I0523 16:25:41.249408 2712679360 net.cpp:198] pool2 needs backward computation.
I0523 16:25:41.249413 2712679360 net.cpp:198] relu2 needs backward computation.
I0523 16:25:41.249416 2712679360 net.cpp:198] conv2 needs backward computation.
I0523 16:25:41.249420 2712679360 net.cpp:198] relu1 needs backward computation.
I0523 16:25:41.249424 2712679360 net.cpp:198] pool1 needs backward computation.
I0523 16:25:41.249428 2712679360 net.cpp:198] conv1 needs backward computation.
I0523 16:25:41.249431 2712679360 net.cpp:200] label_cifar_1_split does not need backward computation.
I0523 16:25:41.249436 2712679360 net.cpp:200] cifar does not need backward computation.
I0523 16:25:41.249439 2712679360 net.cpp:242] This network produces output accuracy
I0523 16:25:41.249444 2712679360 net.cpp:242] This network produces output loss
I0523 16:25:41.249451 2712679360 net.cpp:255] Network initialization done.
I0523 16:25:41.251152 2712679360 hdf5.cpp:32] Datatype class: H5T_FLOAT
I0523 16:25:41.252013 2712679360 caffe.cpp:290] Running for 100 iterations.
I0523 16:25:41.367466 2712679360 caffe.cpp:313] Batch 0, accuracy = 0.81
I0523 16:25:41.367501 2712679360 caffe.cpp:313] Batch 0, loss = 0.650321
I0523 16:25:41.465518 2712679360 caffe.cpp:313] Batch 1, accuracy = 0.75
I0523 16:25:41.465550 2712679360 caffe.cpp:313] Batch 1, loss = 0.767328
I0523 16:25:41.560680 2712679360 caffe.cpp:313] Batch 2, accuracy = 0.71
I0523 16:25:41.560712 2712679360 caffe.cpp:313] Batch 2, loss = 0.810281
I0523 16:25:41.656878 2712679360 caffe.cpp:313] Batch 3, accuracy = 0.7
I0523 16:25:41.656913 2712679360 caffe.cpp:313] Batch 3, loss = 0.807916
I0523 16:25:41.757275 2712679360 caffe.cpp:313] Batch 4, accuracy = 0.71
I0523 16:25:41.757313 2712679360 caffe.cpp:313] Batch 4, loss = 0.797028
I0523 16:25:41.855583 2712679360 caffe.cpp:313] Batch 5, accuracy = 0.84
I0523 16:25:41.855613 2712679360 caffe.cpp:313] Batch 5, loss = 0.422262
I0523 16:25:41.953912 2712679360 caffe.cpp:313] Batch 6, accuracy = 0.73
I0523 16:25:41.953946 2712679360 caffe.cpp:313] Batch 6, loss = 0.696204
I0523 16:25:42.052671 2712679360 caffe.cpp:313] Batch 7, accuracy = 0.72
I0523 16:25:42.052705 2712679360 caffe.cpp:313] Batch 7, loss = 0.896313
I0523 16:25:42.155107 2712679360 caffe.cpp:313] Batch 8, accuracy = 0.73
I0523 16:25:42.155153 2712679360 caffe.cpp:313] Batch 8, loss = 0.862504
I0523 16:25:42.258592 2712679360 caffe.cpp:313] Batch 9, accuracy = 0.78
I0523 16:25:42.258627 2712679360 caffe.cpp:313] Batch 9, loss = 0.642714
I0523 16:25:42.362510 2712679360 caffe.cpp:313] Batch 10, accuracy = 0.75
I0523 16:25:42.362543 2712679360 caffe.cpp:313] Batch 10, loss = 0.827924
I0523 16:25:42.463922 2712679360 caffe.cpp:313] Batch 11, accuracy = 0.76
I0523 16:25:42.463953 2712679360 caffe.cpp:313] Batch 11, loss = 0.674977
I0523 16:25:42.567791 2712679360 caffe.cpp:313] Batch 12, accuracy = 0.7
I0523 16:25:42.567822 2712679360 caffe.cpp:313] Batch 12, loss = 0.717463
I0523 16:25:42.664435 2712679360 caffe.cpp:313] Batch 13, accuracy = 0.75
I0523 16:25:42.664469 2712679360 caffe.cpp:313] Batch 13, loss = 0.640668
I0523 16:25:42.759980 2712679360 caffe.cpp:313] Batch 14, accuracy = 0.78
I0523 16:25:42.760013 2712679360 caffe.cpp:313] Batch 14, loss = 0.62553
I0523 16:25:42.856386 2712679360 caffe.cpp:313] Batch 15, accuracy = 0.76
I0523 16:25:42.856417 2712679360 caffe.cpp:313] Batch 15, loss = 0.721462
I0523 16:25:42.954746 2712679360 caffe.cpp:313] Batch 16, accuracy = 0.73
I0523 16:25:42.954777 2712679360 caffe.cpp:313] Batch 16, loss = 0.858499
I0523 16:25:43.053562 2712679360 caffe.cpp:313] Batch 17, accuracy = 0.75
I0523 16:25:43.053593 2712679360 caffe.cpp:313] Batch 17, loss = 0.746772
I0523 16:25:43.155479 2712679360 caffe.cpp:313] Batch 18, accuracy = 0.74
I0523 16:25:43.155508 2712679360 caffe.cpp:313] Batch 18, loss = 0.893995
I0523 16:25:43.254688 2712679360 caffe.cpp:313] Batch 19, accuracy = 0.68
I0523 16:25:43.254716 2712679360 caffe.cpp:313] Batch 19, loss = 0.943102
I0523 16:25:43.364045 2712679360 caffe.cpp:313] Batch 20, accuracy = 0.7
I0523 16:25:43.364076 2712679360 caffe.cpp:313] Batch 20, loss = 0.786499
I0523 16:25:43.465351 2712679360 caffe.cpp:313] Batch 21, accuracy = 0.76
I0523 16:25:43.465384 2712679360 caffe.cpp:313] Batch 21, loss = 0.742349
I0523 16:25:43.560330 2712679360 caffe.cpp:313] Batch 22, accuracy = 0.8
I0523 16:25:43.560362 2712679360 caffe.cpp:313] Batch 22, loss = 0.707087
I0523 16:25:43.662050 2712679360 caffe.cpp:313] Batch 23, accuracy = 0.69
I0523 16:25:43.662077 2712679360 caffe.cpp:313] Batch 23, loss = 0.854361
I0523 16:25:43.760444 2712679360 caffe.cpp:313] Batch 24, accuracy = 0.74
I0523 16:25:43.760473 2712679360 caffe.cpp:313] Batch 24, loss = 0.844035
I0523 16:25:43.858397 2712679360 caffe.cpp:313] Batch 25, accuracy = 0.68
I0523 16:25:43.858425 2712679360 caffe.cpp:313] Batch 25, loss = 1.02302
I0523 16:25:43.959595 2712679360 caffe.cpp:313] Batch 26, accuracy = 0.82
I0523 16:25:43.959627 2712679360 caffe.cpp:313] Batch 26, loss = 0.493385
I0523 16:25:44.057914 2712679360 caffe.cpp:313] Batch 27, accuracy = 0.76
I0523 16:25:44.057942 2712679360 caffe.cpp:313] Batch 27, loss = 0.78877
I0523 16:25:44.157359 2712679360 caffe.cpp:313] Batch 28, accuracy = 0.78
I0523 16:25:44.157388 2712679360 caffe.cpp:313] Batch 28, loss = 0.709657
I0523 16:25:44.285976 2712679360 caffe.cpp:313] Batch 29, accuracy = 0.78
I0523 16:25:44.286007 2712679360 caffe.cpp:313] Batch 29, loss = 0.674438
I0523 16:25:44.390980 2712679360 caffe.cpp:313] Batch 30, accuracy = 0.79
I0523 16:25:44.391010 2712679360 caffe.cpp:313] Batch 30, loss = 0.65947
I0523 16:25:44.491211 2712679360 caffe.cpp:313] Batch 31, accuracy = 0.77
I0523 16:25:44.491241 2712679360 caffe.cpp:313] Batch 31, loss = 0.716022
I0523 16:25:44.593423 2712679360 caffe.cpp:313] Batch 32, accuracy = 0.73
I0523 16:25:44.593457 2712679360 caffe.cpp:313] Batch 32, loss = 0.805526
I0523 16:25:44.692994 2712679360 caffe.cpp:313] Batch 33, accuracy = 0.68
I0523 16:25:44.693023 2712679360 caffe.cpp:313] Batch 33, loss = 0.903316
I0523 16:25:44.795087 2712679360 caffe.cpp:313] Batch 34, accuracy = 0.72
I0523 16:25:44.795116 2712679360 caffe.cpp:313] Batch 34, loss = 0.834438
I0523 16:25:44.897828 2712679360 caffe.cpp:313] Batch 35, accuracy = 0.73
I0523 16:25:44.897874 2712679360 caffe.cpp:313] Batch 35, loss = 0.908751
I0523 16:25:44.996119 2712679360 caffe.cpp:313] Batch 36, accuracy = 0.74
I0523 16:25:44.996150 2712679360 caffe.cpp:313] Batch 36, loss = 0.981981
I0523 16:25:45.093991 2712679360 caffe.cpp:313] Batch 37, accuracy = 0.76
I0523 16:25:45.094023 2712679360 caffe.cpp:313] Batch 37, loss = 0.725703
I0523 16:25:45.195551 2712679360 caffe.cpp:313] Batch 38, accuracy = 0.78
I0523 16:25:45.195585 2712679360 caffe.cpp:313] Batch 38, loss = 0.686703
I0523 16:25:45.292881 2712679360 caffe.cpp:313] Batch 39, accuracy = 0.8
I0523 16:25:45.292912 2712679360 caffe.cpp:313] Batch 39, loss = 0.650689
I0523 16:25:45.397084 2712679360 caffe.cpp:313] Batch 40, accuracy = 0.79
I0523 16:25:45.397115 2712679360 caffe.cpp:313] Batch 40, loss = 0.755663
I0523 16:25:45.495128 2712679360 caffe.cpp:313] Batch 41, accuracy = 0.82
I0523 16:25:45.495160 2712679360 caffe.cpp:313] Batch 41, loss = 0.855221
I0523 16:25:45.597597 2712679360 caffe.cpp:313] Batch 42, accuracy = 0.81
I0523 16:25:45.597626 2712679360 caffe.cpp:313] Batch 42, loss = 0.552907
I0523 16:25:45.695441 2712679360 caffe.cpp:313] Batch 43, accuracy = 0.8
I0523 16:25:45.695472 2712679360 caffe.cpp:313] Batch 43, loss = 0.688889
I0523 16:25:45.796842 2712679360 caffe.cpp:313] Batch 44, accuracy = 0.8
I0523 16:25:45.796875 2712679360 caffe.cpp:313] Batch 44, loss = 0.713613
I0523 16:25:45.899427 2712679360 caffe.cpp:313] Batch 45, accuracy = 0.76
I0523 16:25:45.899462 2712679360 caffe.cpp:313] Batch 45, loss = 0.819739
I0523 16:25:46.003129 2712679360 caffe.cpp:313] Batch 46, accuracy = 0.77
I0523 16:25:46.003190 2712679360 caffe.cpp:313] Batch 46, loss = 0.79499
I0523 16:25:46.101080 2712679360 caffe.cpp:313] Batch 47, accuracy = 0.73
I0523 16:25:46.101112 2712679360 caffe.cpp:313] Batch 47, loss = 0.784097
I0523 16:25:46.199532 2712679360 caffe.cpp:313] Batch 48, accuracy = 0.82
I0523 16:25:46.199563 2712679360 caffe.cpp:313] Batch 48, loss = 0.509592
I0523 16:25:46.296840 2712679360 caffe.cpp:313] Batch 49, accuracy = 0.76
I0523 16:25:46.296872 2712679360 caffe.cpp:313] Batch 49, loss = 0.775396
I0523 16:25:46.399880 2712679360 caffe.cpp:313] Batch 50, accuracy = 0.77
I0523 16:25:46.399914 2712679360 caffe.cpp:313] Batch 50, loss = 0.61452
I0523 16:25:46.500458 2712679360 caffe.cpp:313] Batch 51, accuracy = 0.79
I0523 16:25:46.500488 2712679360 caffe.cpp:313] Batch 51, loss = 0.631971
I0523 16:25:46.599107 2712679360 caffe.cpp:313] Batch 52, accuracy = 0.78
I0523 16:25:46.599139 2712679360 caffe.cpp:313] Batch 52, loss = 0.613152
I0523 16:25:46.699442 2712679360 caffe.cpp:313] Batch 53, accuracy = 0.74
I0523 16:25:46.699475 2712679360 caffe.cpp:313] Batch 53, loss = 0.813763
I0523 16:25:46.802717 2712679360 caffe.cpp:313] Batch 54, accuracy = 0.69
I0523 16:25:46.802749 2712679360 caffe.cpp:313] Batch 54, loss = 0.79753
I0523 16:25:46.903400 2712679360 caffe.cpp:313] Batch 55, accuracy = 0.81
I0523 16:25:46.903430 2712679360 caffe.cpp:313] Batch 55, loss = 0.683275
I0523 16:25:47.007345 2712679360 caffe.cpp:313] Batch 56, accuracy = 0.78
I0523 16:25:47.007377 2712679360 caffe.cpp:313] Batch 56, loss = 0.785579
I0523 16:25:47.107044 2712679360 caffe.cpp:313] Batch 57, accuracy = 0.84
I0523 16:25:47.107076 2712679360 caffe.cpp:313] Batch 57, loss = 0.455638
I0523 16:25:47.204998 2712679360 caffe.cpp:313] Batch 58, accuracy = 0.7
I0523 16:25:47.205029 2712679360 caffe.cpp:313] Batch 58, loss = 0.685973
I0523 16:25:47.307816 2712679360 caffe.cpp:313] Batch 59, accuracy = 0.74
I0523 16:25:47.307848 2712679360 caffe.cpp:313] Batch 59, loss = 0.815847
I0523 16:25:47.409512 2712679360 caffe.cpp:313] Batch 60, accuracy = 0.79
I0523 16:25:47.409544 2712679360 caffe.cpp:313] Batch 60, loss = 0.694609
I0523 16:25:47.509786 2712679360 caffe.cpp:313] Batch 61, accuracy = 0.72
I0523 16:25:47.509819 2712679360 caffe.cpp:313] Batch 61, loss = 0.721049
I0523 16:25:47.608265 2712679360 caffe.cpp:313] Batch 62, accuracy = 0.76
I0523 16:25:47.608304 2712679360 caffe.cpp:313] Batch 62, loss = 0.649006
I0523 16:25:47.711271 2712679360 caffe.cpp:313] Batch 63, accuracy = 0.77
I0523 16:25:47.711302 2712679360 caffe.cpp:313] Batch 63, loss = 0.620039
I0523 16:25:47.812440 2712679360 caffe.cpp:313] Batch 64, accuracy = 0.71
I0523 16:25:47.812471 2712679360 caffe.cpp:313] Batch 64, loss = 0.706689
I0523 16:25:47.911661 2712679360 caffe.cpp:313] Batch 65, accuracy = 0.77
I0523 16:25:47.911694 2712679360 caffe.cpp:313] Batch 65, loss = 0.824431
I0523 16:25:48.011318 2712679360 caffe.cpp:313] Batch 66, accuracy = 0.73
I0523 16:25:48.011351 2712679360 caffe.cpp:313] Batch 66, loss = 0.739382
I0523 16:25:48.117573 2712679360 caffe.cpp:313] Batch 67, accuracy = 0.7
I0523 16:25:48.117606 2712679360 caffe.cpp:313] Batch 67, loss = 0.800725
I0523 16:25:48.214515 2712679360 caffe.cpp:313] Batch 68, accuracy = 0.68
I0523 16:25:48.214545 2712679360 caffe.cpp:313] Batch 68, loss = 0.807705
I0523 16:25:48.314254 2712679360 caffe.cpp:313] Batch 69, accuracy = 0.7
I0523 16:25:48.314283 2712679360 caffe.cpp:313] Batch 69, loss = 0.952385
I0523 16:25:48.412657 2712679360 caffe.cpp:313] Batch 70, accuracy = 0.74
I0523 16:25:48.412686 2712679360 caffe.cpp:313] Batch 70, loss = 0.781932
I0523 16:25:48.512931 2712679360 caffe.cpp:313] Batch 71, accuracy = 0.73
I0523 16:25:48.512964 2712679360 caffe.cpp:313] Batch 71, loss = 0.895561
I0523 16:25:48.608669 2712679360 caffe.cpp:313] Batch 72, accuracy = 0.8
I0523 16:25:48.608700 2712679360 caffe.cpp:313] Batch 72, loss = 0.615967
I0523 16:25:48.705847 2712679360 caffe.cpp:313] Batch 73, accuracy = 0.78
I0523 16:25:48.705878 2712679360 caffe.cpp:313] Batch 73, loss = 0.588951
I0523 16:25:48.803540 2712679360 caffe.cpp:313] Batch 74, accuracy = 0.72
I0523 16:25:48.803591 2712679360 caffe.cpp:313] Batch 74, loss = 0.784208
I0523 16:25:48.906528 2712679360 caffe.cpp:313] Batch 75, accuracy = 0.77
I0523 16:25:48.906565 2712679360 caffe.cpp:313] Batch 75, loss = 0.529825
I0523 16:25:49.007186 2712679360 caffe.cpp:313] Batch 76, accuracy = 0.77
I0523 16:25:49.007216 2712679360 caffe.cpp:313] Batch 76, loss = 0.794115
I0523 16:25:49.107000 2712679360 caffe.cpp:313] Batch 77, accuracy = 0.76
I0523 16:25:49.107033 2712679360 caffe.cpp:313] Batch 77, loss = 0.726804
I0523 16:25:49.205263 2712679360 caffe.cpp:313] Batch 78, accuracy = 0.77
I0523 16:25:49.205294 2712679360 caffe.cpp:313] Batch 78, loss = 0.919712
I0523 16:25:49.304277 2712679360 caffe.cpp:313] Batch 79, accuracy = 0.69
I0523 16:25:49.304309 2712679360 caffe.cpp:313] Batch 79, loss = 0.87618
I0523 16:25:49.404642 2712679360 caffe.cpp:313] Batch 80, accuracy = 0.77
I0523 16:25:49.404672 2712679360 caffe.cpp:313] Batch 80, loss = 0.704637
I0523 16:25:49.501708 2712679360 caffe.cpp:313] Batch 81, accuracy = 0.75
I0523 16:25:49.501739 2712679360 caffe.cpp:313] Batch 81, loss = 0.71787
I0523 16:25:49.599267 2712679360 caffe.cpp:313] Batch 82, accuracy = 0.76
I0523 16:25:49.599304 2712679360 caffe.cpp:313] Batch 82, loss = 0.613339
I0523 16:25:49.698971 2712679360 caffe.cpp:313] Batch 83, accuracy = 0.78
I0523 16:25:49.699002 2712679360 caffe.cpp:313] Batch 83, loss = 0.689216
I0523 16:25:49.803320 2712679360 caffe.cpp:313] Batch 84, accuracy = 0.72
I0523 16:25:49.803352 2712679360 caffe.cpp:313] Batch 84, loss = 0.817351
I0523 16:25:49.904433 2712679360 caffe.cpp:313] Batch 85, accuracy = 0.78
I0523 16:25:49.904467 2712679360 caffe.cpp:313] Batch 85, loss = 0.62069
I0523 16:25:50.005846 2712679360 caffe.cpp:313] Batch 86, accuracy = 0.75
I0523 16:25:50.005878 2712679360 caffe.cpp:313] Batch 86, loss = 0.680651
I0523 16:25:50.103121 2712679360 caffe.cpp:313] Batch 87, accuracy = 0.78
I0523 16:25:50.103153 2712679360 caffe.cpp:313] Batch 87, loss = 0.788875
I0523 16:25:50.200103 2712679360 caffe.cpp:313] Batch 88, accuracy = 0.8
I0523 16:25:50.200134 2712679360 caffe.cpp:313] Batch 88, loss = 0.620548
I0523 16:25:50.299957 2712679360 caffe.cpp:313] Batch 89, accuracy = 0.74
I0523 16:25:50.299989 2712679360 caffe.cpp:313] Batch 89, loss = 0.779962
I0523 16:25:50.399699 2712679360 caffe.cpp:313] Batch 90, accuracy = 0.75
I0523 16:25:50.399731 2712679360 caffe.cpp:313] Batch 90, loss = 0.70084
I0523 16:25:50.502117 2712679360 caffe.cpp:313] Batch 91, accuracy = 0.79
I0523 16:25:50.502148 2712679360 caffe.cpp:313] Batch 91, loss = 0.576651
I0523 16:25:50.599150 2712679360 caffe.cpp:313] Batch 92, accuracy = 0.71
I0523 16:25:50.599181 2712679360 caffe.cpp:313] Batch 92, loss = 0.9778
I0523 16:25:50.699782 2712679360 caffe.cpp:313] Batch 93, accuracy = 0.78
I0523 16:25:50.699813 2712679360 caffe.cpp:313] Batch 93, loss = 0.795732
I0523 16:25:50.802847 2712679360 caffe.cpp:313] Batch 94, accuracy = 0.77
I0523 16:25:50.802877 2712679360 caffe.cpp:313] Batch 94, loss = 0.803904
I0523 16:25:50.900668 2712679360 caffe.cpp:313] Batch 95, accuracy = 0.77
I0523 16:25:50.900702 2712679360 caffe.cpp:313] Batch 95, loss = 0.664654
I0523 16:25:50.902439 102174720 data_layer.cpp:73] Restarting data prefetching from start.
I0523 16:25:50.999625 2712679360 caffe.cpp:313] Batch 96, accuracy = 0.74
I0523 16:25:50.999656 2712679360 caffe.cpp:313] Batch 96, loss = 0.700099
I0523 16:25:51.100697 2712679360 caffe.cpp:313] Batch 97, accuracy = 0.66
I0523 16:25:51.100728 2712679360 caffe.cpp:313] Batch 97, loss = 0.937044
I0523 16:25:51.201591 2712679360 caffe.cpp:313] Batch 98, accuracy = 0.79
I0523 16:25:51.201622 2712679360 caffe.cpp:313] Batch 98, loss = 0.677679
I0523 16:25:51.299702 2712679360 caffe.cpp:313] Batch 99, accuracy = 0.76
I0523 16:25:51.299736 2712679360 caffe.cpp:313] Batch 99, loss = 0.687144
I0523 16:25:51.299741 2712679360 caffe.cpp:318] Loss: 0.742307
I0523 16:25:51.299762 2712679360 caffe.cpp:330] accuracy = 0.754
I0523 16:25:51.299773 2712679360 caffe.cpp:330] loss = 0.742307 (* 1 = 0.742307 loss)

The final test-set accuracy reaches accuracy = 0.754, which is simply the average of the 100 per-batch accuracies printed above.

This concludes the cifar10 exercise.
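
As an optional extra check, the trained weights can also be loaded from Python through pycaffe. The snippet below is only a minimal sketch under the assumption that pycaffe is built and importable; it reuses the same prototxt and HDF5 snapshot paths as the caffe.bin test command above, runs a single forward pass on one test batch, and prints that batch's accuracy and loss.

import caffe

# Assumed paths: the same files used by the `caffe.bin test` command above.
model   = 'examples/cifar10/cifar10_quick_train_test.prototxt'
weights = 'examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5'

caffe.set_mode_cpu()                         # CPU-only build, as in this post
net = caffe.Net(model, weights, caffe.TEST)  # TEST phase uses cifar10_test_lmdb

# One forward pass reads one batch (100 images) from the test LMDB defined
# in the prototxt and fills the net's output blobs (accuracy and loss).
out = net.forward()
print('batch accuracy = %.4f, batch loss = %.4f'
      % (float(out['accuracy']), float(out['loss'])))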


Running the handwritten digit recognition example (MNIST) in the caffe framework

Setting up the caffe environment is not discussed here; I will write it up later when I have time (a CPU-only caffe setup on macOS).

LeNet-5 Model Description

The LeNet-5 model shipped with caffe differs from the original: the Sigmoid activation is replaced with ReLU. Its description file is examples/mnist/lenet_train_test.prototxt, and its main contents are:

name: "LeNet"           //网络(Net)的名称为LeNet
layer {                 //定义一个层(Layer)
  name: "mnist"         //层的名称为mnist
  type: "Data"          //层的类型为数据层
  top: "data"           //层的输出blob有两个:data和label
  top: "label"
  include {
    phase: TRAIN        //表明该层参数只在训练阶段有效
  }
  transform_param {
    scale: 0.00390625   //数据变换使用的数据缩放因子
  }
  data_param {          //数据层参数
    source: "examples/mnist/mnist_train_lmdb"       //LMDB的路径
    batch_size: 64      //批量数目,一次性读取64张图
    backend: LMDB       //数据格式为LMDB
  }
}
layer {                 //一个新的数据层,名字也叫mnist,输出blob也是data和Label,但是这里定义的参数只在分类阶段有效
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST         # only used in the test/classification phase
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"         //定义一个新的卷积层conv1,输入blob为data,输出blob为conv1
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1          //权值学习速率倍乘因子,1蓓表示与全局参数一致
  }
  param {
    lr_mult: 2          //bias学习速率倍乘因子,是全局参数的两倍
  }
  convolution_param {   //卷积计算参数
    num_output: 20      //输出feature map数目为20
    kernel_size: 5      //卷积核尺寸,5*5
    stride: 1           //卷积输出跳跃间隔,1表示连续输出,无跳跃
    weight_filler {     //权值使用Xavier填充器
      type: "xavier"
    }
    bias_filler {       //bias使用常熟填充器,默认为0
      type: "constant"
    }
  }
}
layer {                 # define a new pooling (downsampling) layer pool1; input blob is conv1, output blob is pool1
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {       # pooling parameters
    pool: MAX           # use max pooling
    kernel_size: 2      # pooling window size: 2*2
    stride: 2           # pooling stride: 2
  }
}
layer {                 # another convolution layer, similar to conv1
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {             # another pooling layer, similar to pool1
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {                 # a fully connected (InnerProduct) layer; input blob is pool2, output blob is ip1
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {     # fully connected layer parameters
    num_output: 500         # this layer outputs 500 elements
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {             # a non-linearity layer using ReLU
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {             # accuracy layer; inputs are ip2 and label, output blob is accuracy; it computes the classification accuracy
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {             # loss layer using SoftmaxWithLoss; inputs are ip2 and label, output blob is loss
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

The original LeNet network diagram: (figure not reproduced here)

Training Hyperparameters

The network structure and parameter definitions of the LeNet model were given above. Now we train it for real; this model can reach a classification accuracy above 99%.
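
One note before training: lenet_train_test.prototxt reads from examples/mnist/mnist_train_lmdb and examples/mnist/mnist_test_lmdb. If those LMDBs do not exist yet, they can be created first with the two scripts that ship with caffe, ./data/mnist/get_mnist.sh followed by ./examples/mnist/create_mnist.sh, both run from the caffe root directory.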

First, change into the caffe source directory and execute examples/mnist/train_lenet.sh.

The contents of train_lenet.sh are:

#!/usr/bin/env sh
set -e          # exit the script immediately if any command fails

./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

This invokes the previously built build/tools/caffe.bin binary with the argument --solver=examples/mnist/lenet_solver.prototxt $@, which specifies the solver hyperparameter file. Its contents are:

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy (how the learning rate decays over iterations)
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations (print a log line every 100 iterations)
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results (save a snapshot every 5000 iterations)
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU (CPU here, since this Mac has no NVIDIA GPU)
solver_mode: CPU
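
For reference, caffe's "inv" policy computes lr = base_lr * (1 + gamma * iter)^(-power). A few lines of plain Python (not part of caffe, just a sanity check) reproduce the lr values printed later in the training log:

base_lr, gamma, power = 0.01, 0.0001, 0.75

def inv_lr(it):
    # lr_policy "inv": base_lr * (1 + gamma * iter) ^ (-power)
    return base_lr * (1.0 + gamma * it) ** (-power)

print(inv_lr(0))    # 0.01        -> "Iteration 0,   lr = 0.01"
print(inv_lr(100))  # 0.00992565  -> "Iteration 100, lr = 0.00992565"
print(inv_lr(500))  # 0.00964069  -> "Iteration 500, lr = 0.00964069"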

Training Log

Running examples/mnist/train_lenet.sh produces log output like the following (annotated here with // comments):

// Run in CPU mode
I0513 11:18:42.330993 3659862976 caffe.cpp:211] Use CPU.
I0513 11:18:42.331964 3659862976 solver.cpp:44] Initializing solver from parameters:
// Print the parsed contents of the solver hyperparameter file lenet_solver.prototxt
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: CPU
net: "examples/mnist/lenet_train_test.prototxt"
train_state {
  level: 0
  stage: ""
}
I0513 11:18:42.332221 3659862976 solver.cpp:87] Creating training net from net file: examples/mnist/lenet_train_test.prototxt
// Parse the network parameters from the CNN network description file and create the training net
I0513 11:18:42.332438 3659862976 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
I0513 11:18:42.332453 3659862976 net.cpp:294] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0513 11:18:42.332459 3659862976 net.cpp:51] Initializing net from parameters:
// Print the description of the training network's parameters
name: "LeNet"
state {
  phase: TRAIN
  level: 0
  stage: ""
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
//........ (middle layers omitted)
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0513 11:18:42.332698 3659862976 layer_factory.hpp:77] Creating layer mnist
I0513 11:18:42.332906 3659862976 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb
I0513 11:18:42.332963 3659862976 net.cpp:84] Creating Layer mnist
// Two outputs are produced: data holds the image data, label holds the labels
I0513 11:18:42.332970 3659862976 net.cpp:380] mnist -> data
I0513 11:18:42.332989 3659862976 net.cpp:380] mnist -> label
// The training-set LMDB is then opened; data is a four-dimensional array (a blob) of shape 64,1,28,28
I0513 11:18:42.333026 3659862976 data_layer.cpp:45] output data size: 64,1,28,28
I0513 11:18:42.337728 3659862976 net.cpp:122] Setting up mnist
I0513 11:18:42.337738 3659862976 net.cpp:129] Top shape: 64 1 28 28 (50176)
I0513 11:18:42.337759 3659862976 net.cpp:129] Top shape: 64 (64)
// Track the memory required for data; it accumulates layer by layer
I0513 11:18:42.337762 3659862976 net.cpp:137] Memory required for data: 200960
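// e.g. 200960 bytes = (64*1*28*28 data values + 64 label values) * 4 bytes per float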
// Put up the first floor of the building: conv1
I0513 11:18:42.337769 3659862976 layer_factory.hpp:77] Creating layer conv1
I0513 11:18:42.337776 3659862976 net.cpp:84] Creating Layer conv1
// conv1 takes one input, data (from the previous layer, mnist), and produces one output, conv1 (fed to the next layer)
I0513 11:18:42.337780 3659862976 net.cpp:406] conv1 <- data
I0513 11:18:42.337785 3659862976 net.cpp:380] conv1 -> conv1
I0513 11:18:42.337836 3659862976 net.cpp:122] Setting up conv1
// conv1's output shape is (64,20,24,24)
I0513 11:18:42.337842 3659862976 net.cpp:129] Top shape: 64 20 24 24 (737280)
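// 24 = (28 - 5)/1 + 1 with kernel_size 5, stride 1 and no padding; 20 is num_output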
// The memory count keeps accumulating layer by layer
I0513 11:18:42.337847 3659862976 net.cpp:137] Memory required for data: 3150080
I0513 11:18:42.337853 3659862976 layer_factory.hpp:77] Creating layer pool1
// The intermediate layers are created in the same way
I0513 11:18:42.337877 3659862976 net.cpp:84] Creating Layer pool1
I0513 11:18:42.337882 3659862976 net.cpp:406] pool1 <- conv1
I0513 11:18:42.337887 3659862976 net.cpp:380] pool1 -> pool1
I0513 11:18:42.337895 3659862976 net.cpp:122] Setting up pool1
I0513 11:18:42.337899 3659862976 net.cpp:129] Top shape: 64 20 12 12 (184320)
I0513 11:18:42.337904 3659862976 net.cpp:137] Memory required for data: 3887360
I0513 11:18:42.337908 3659862976 layer_factory.hpp:77] Creating layer conv2
I0513 11:18:42.337913 3659862976 net.cpp:84] Creating Layer conv2
I0513 11:18:42.337916 3659862976 net.cpp:406] conv2 <- pool1
I0513 11:18:42.337921 3659862976 net.cpp:380] conv2 -> conv2
I0513 11:18:42.338141 3659862976 net.cpp:122] Setting up conv2
I0513 11:18:42.338146 3659862976 net.cpp:129] Top shape: 64 50 8 8 (204800)
I0513 11:18:42.338162 3659862976 net.cpp:137] Memory required for data: 4706560
I0513 11:18:42.338167 3659862976 layer_factory.hpp:77] Creating layer pool2
I0513 11:18:42.338174 3659862976 net.cpp:84] Creating Layer pool2
I0513 11:18:42.338178 3659862976 net.cpp:406] pool2 <- conv2
I0513 11:18:42.338182 3659862976 net.cpp:380] pool2 -> pool2
I0513 11:18:42.338210 3659862976 net.cpp:122] Setting up pool2
I0513 11:18:42.338215 3659862976 net.cpp:129] Top shape: 64 50 4 4 (51200)
I0513 11:18:42.338220 3659862976 net.cpp:137] Memory required for data: 4911360
I0513 11:18:42.338224 3659862976 layer_factory.hpp:77] Creating layer ip1
I0513 11:18:42.338232 3659862976 net.cpp:84] Creating Layer ip1
I0513 11:18:42.338235 3659862976 net.cpp:406] ip1 <- pool2
I0513 11:18:42.338240 3659862976 net.cpp:380] ip1 -> ip1
I0513 11:18:42.341404 3659862976 net.cpp:122] Setting up ip1
I0513 11:18:42.341413 3659862976 net.cpp:129] Top shape: 64 500 (32000)
I0513 11:18:42.341418 3659862976 net.cpp:137] Memory required for data: 5039360
I0513 11:18:42.341424 3659862976 layer_factory.hpp:77] Creating layer relu1
I0513 11:18:42.341433 3659862976 net.cpp:84] Creating Layer relu1
I0513 11:18:42.341435 3659862976 net.cpp:406] relu1 <- ip1
I0513 11:18:42.341440 3659862976 net.cpp:367] relu1 -> ip1 (in-place)
I0513 11:18:42.341444 3659862976 net.cpp:122] Setting up relu1
I0513 11:18:42.341449 3659862976 net.cpp:129] Top shape: 64 500 (32000)
I0513 11:18:42.341451 3659862976 net.cpp:137] Memory required for data: 5167360
I0513 11:18:42.341455 3659862976 layer_factory.hpp:77] Creating layer ip2
I0513 11:18:42.341470 3659862976 net.cpp:84] Creating Layer ip2
I0513 11:18:42.341473 3659862976 net.cpp:406] ip2 <- ip1
I0513 11:18:42.341478 3659862976 net.cpp:380] ip2 -> ip2
I0513 11:18:42.341531 3659862976 net.cpp:122] Setting up ip2
I0513 11:18:42.341536 3659862976 net.cpp:129] Top shape: 64 10 (640)
I0513 11:18:42.341539 3659862976 net.cpp:137] Memory required for data: 5169920
// Put up the last layer, loss
I0513 11:18:42.341544 3659862976 layer_factory.hpp:77] Creating layer loss
I0513 11:18:42.341550 3659862976 net.cpp:84] Creating Layer loss
// This layer takes two inputs, ip2 and label, and produces one output, loss
I0513 11:18:42.341554 3659862976 net.cpp:406] loss <- ip2
I0513 11:18:42.341557 3659862976 net.cpp:406] loss <- label
I0513 11:18:42.341563 3659862976 net.cpp:380] loss -> loss
I0513 11:18:42.341572 3659862976 layer_factory.hpp:77] Creating layer loss
I0513 11:18:42.341583 3659862976 net.cpp:122] Setting up loss
// The loss output has shape (1), with a loss weight of 1
I0513 11:18:42.341586 3659862976 net.cpp:129] Top shape: (1)
I0513 11:18:42.341590 3659862976 net.cpp:132]     with loss weight 1
I0513 11:18:42.341598 3659862976 net.cpp:137] Memory required for data: 5169924
// Going from back to front, work out which layers need backward (BP) computation
I0513 11:18:42.341601 3659862976 net.cpp:198] loss needs backward computation.
I0513 11:18:42.341606 3659862976 net.cpp:198] ip2 needs backward computation.
I0513 11:18:42.341609 3659862976 net.cpp:198] relu1 needs backward computation.
I0513 11:18:42.341614 3659862976 net.cpp:198] ip1 needs backward computation.
I0513 11:18:42.341616 3659862976 net.cpp:198] pool2 needs backward computation.
I0513 11:18:42.341620 3659862976 net.cpp:198] conv2 needs backward computation.
I0513 11:18:42.341624 3659862976 net.cpp:198] pool1 needs backward computation.
I0513 11:18:42.341627 3659862976 net.cpp:198] conv1 needs backward computation.
I0513 11:18:42.341631 3659862976 net.cpp:200] mnist does not need backward computation.
I0513 11:18:42.341655 3659862976 net.cpp:242] This network produces output loss
// The building is finished
I0513 11:18:42.341662 3659862976 net.cpp:255] Network initialization done.
// A test net still has to be created, so a second building goes up
I0513 11:18:42.341949 3659862976 solver.cpp:172] Creating test net (#0) specified by net file: examples/mnist/lenet_train_test.prototxt
I0513 11:18:42.341986 3659862976 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist
I0513 11:18:42.341996 3659862976 net.cpp:51] Initializing net from parameters:
// Similar to the first building, except the foundation (mnist) switches to the test LMDB source and output size, and an accuracy attic is added on top
name: "LeNet"
state {
  phase: TEST
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

//.... (repeated middle layers omitted)
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
// The construction proceeds just like the training net
I0513 11:18:42.342216 3659862976 layer_factory.hpp:77] Creating layer mnist
I0513 11:18:42.342300 3659862976 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I0513 11:18:42.342319 3659862976 net.cpp:84] Creating Layer mnist
I0513 11:18:42.342329 3659862976 net.cpp:380] mnist -> data
I0513 11:18:42.342335 3659862976 net.cpp:380] mnist -> label
I0513 11:18:42.342345 3659862976 data_layer.cpp:45] output data size: 100,1,28,28
I0513 11:18:42.343029 3659862976 net.cpp:122] Setting up mnist
I0513 11:18:42.343037 3659862976 net.cpp:129] Top shape: 100 1 28 28 (78400)
I0513 11:18:42.343057 3659862976 net.cpp:129] Top shape: 100 (100)
I0513 11:18:42.343061 3659862976 net.cpp:137] Memory required for data: 314000
I0513 11:18:42.343065 3659862976 layer_factory.hpp:77] Creating layer label_mnist_1_split
I0513 11:18:42.343073 3659862976 net.cpp:84] Creating Layer label_mnist_1_split
I0513 11:18:42.343077 3659862976 net.cpp:406] label_mnist_1_split <- label
I0513 11:18:42.343082 3659862976 net.cpp:380] label_mnist_1_split -> label_mnist_1_split_0
I0513 11:18:42.343087 3659862976 net.cpp:380] label_mnist_1_split -> label_mnist_1_split_1
I0513 11:18:42.343093 3659862976 net.cpp:122] Setting up label_mnist_1_split
I0513 11:18:42.343097 3659862976 net.cpp:129] Top shape: 100 (100)
I0513 11:18:42.343101 3659862976 net.cpp:129] Top shape: 100 (100)
I0513 11:18:42.343106 3659862976 net.cpp:137] Memory required for data: 314800
I0513 11:18:42.343109 3659862976 layer_factory.hpp:77] Creating layer conv1
I0513 11:18:42.343137 3659862976 net.cpp:84] Creating Layer conv1
I0513 11:18:42.343144 3659862976 net.cpp:406] conv1 <- data
I0513 11:18:42.343152 3659862976 net.cpp:380] conv1 -> conv1
I0513 11:18:42.343175 3659862976 net.cpp:122] Setting up conv1
I0513 11:18:42.343181 3659862976 net.cpp:129] Top shape: 100 20 24 24 (1152000)
I0513 11:18:42.343186 3659862976 net.cpp:137] Memory required for data: 4922800
I0513 11:18:42.343196 3659862976 layer_factory.hpp:77] Creating layer pool1
I0513 11:18:42.343206 3659862976 net.cpp:84] Creating Layer pool1
I0513 11:18:42.343214 3659862976 net.cpp:406] pool1 <- conv1
I0513 11:18:42.343219 3659862976 net.cpp:380] pool1 -> pool1
I0513 11:18:42.343228 3659862976 net.cpp:122] Setting up pool1
I0513 11:18:42.343232 3659862976 net.cpp:129] Top shape: 100 20 12 12 (288000)
I0513 11:18:42.343236 3659862976 net.cpp:137] Memory required for data: 6074800
I0513 11:18:42.343240 3659862976 layer_factory.hpp:77] Creating layer conv2
I0513 11:18:42.343245 3659862976 net.cpp:84] Creating Layer conv2
I0513 11:18:42.343250 3659862976 net.cpp:406] conv2 <- pool1
I0513 11:18:42.343253 3659862976 net.cpp:380] conv2 -> conv2
I0513 11:18:42.343482 3659862976 net.cpp:122] Setting up conv2
I0513 11:18:42.343488 3659862976 net.cpp:129] Top shape: 100 50 8 8 (320000)
I0513 11:18:42.343503 3659862976 net.cpp:137] Memory required for data: 7354800
I0513 11:18:42.343509 3659862976 layer_factory.hpp:77] Creating layer pool2
I0513 11:18:42.343513 3659862976 net.cpp:84] Creating Layer pool2
I0513 11:18:42.343518 3659862976 net.cpp:406] pool2 <- conv2
I0513 11:18:42.343521 3659862976 net.cpp:380] pool2 -> pool2
I0513 11:18:42.343526 3659862976 net.cpp:122] Setting up pool2
I0513 11:18:42.343530 3659862976 net.cpp:129] Top shape: 100 50 4 4 (80000)
I0513 11:18:42.343534 3659862976 net.cpp:137] Memory required for data: 7674800
I0513 11:18:42.343538 3659862976 layer_factory.hpp:77] Creating layer ip1
I0513 11:18:42.343564 3659862976 net.cpp:84] Creating Layer ip1
I0513 11:18:42.343569 3659862976 net.cpp:406] ip1 <- pool2
I0513 11:18:42.343575 3659862976 net.cpp:380] ip1 -> ip1
I0513 11:18:42.346873 3659862976 net.cpp:122] Setting up ip1
I0513 11:18:42.346884 3659862976 net.cpp:129] Top shape: 100 500 (50000)
I0513 11:18:42.346889 3659862976 net.cpp:137] Memory required for data: 7874800
I0513 11:18:42.346895 3659862976 layer_factory.hpp:77] Creating layer relu1
I0513 11:18:42.346901 3659862976 net.cpp:84] Creating Layer relu1
I0513 11:18:42.346905 3659862976 net.cpp:406] relu1 <- ip1
I0513 11:18:42.346909 3659862976 net.cpp:367] relu1 -> ip1 (in-place)
I0513 11:18:42.346915 3659862976 net.cpp:122] Setting up relu1
I0513 11:18:42.346917 3659862976 net.cpp:129] Top shape: 100 500 (50000)
I0513 11:18:42.346921 3659862976 net.cpp:137] Memory required for data: 8074800
I0513 11:18:42.346925 3659862976 layer_factory.hpp:77] Creating layer ip2
I0513 11:18:42.346931 3659862976 net.cpp:84] Creating Layer ip2
I0513 11:18:42.346935 3659862976 net.cpp:406] ip2 <- ip1
I0513 11:18:42.346938 3659862976 net.cpp:380] ip2 -> ip2
I0513 11:18:42.346987 3659862976 net.cpp:122] Setting up ip2
I0513 11:18:42.346992 3659862976 net.cpp:129] Top shape: 100 10 (1000)
I0513 11:18:42.346997 3659862976 net.cpp:137] Memory required for data: 8078800
// Note: ip2_ip2_0_split is not written explicitly in the network description; caffe adds it automatically when parsing
I0513 11:18:42.347002 3659862976 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0513 11:18:42.347007 3659862976 net.cpp:84] Creating Layer ip2_ip2_0_split
// ip2_ip2_0_split takes one input, ip2, and produces two outputs, ip2_ip2_0_split_0 and ip2_ip2_0_split_1, which are copies of it
I0513 11:18:42.347010 3659862976 net.cpp:406] ip2_ip2_0_split <- ip2
I0513 11:18:42.347014 3659862976 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0513 11:18:42.347019 3659862976 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0513 11:18:42.347024 3659862976 net.cpp:122] Setting up ip2_ip2_0_split
I0513 11:18:42.347028 3659862976 net.cpp:129] Top shape: 100 10 (1000)
I0513 11:18:42.347033 3659862976 net.cpp:129] Top shape: 100 10 (1000)
I0513 11:18:42.347036 3659862976 net.cpp:137] Memory required for data: 8086800
// ip2_ip2_0_split_0 feeds the accuracy layer
I0513 11:18:42.347039 3659862976 layer_factory.hpp:77] Creating layer accuracy
I0513 11:18:42.347069 3659862976 net.cpp:84] Creating Layer accuracy
I0513 11:18:42.347074 3659862976 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0513 11:18:42.347077 3659862976 net.cpp:406] accuracy <- label_mnist_1_split_0
I0513 11:18:42.347082 3659862976 net.cpp:380] accuracy -> accuracy
I0513 11:18:42.347088 3659862976 net.cpp:122] Setting up accuracy
// The accuracy layer outputs a single value: the classification accuracy
I0513 11:18:42.347091 3659862976 net.cpp:129] Top shape: (1)
I0513 11:18:42.347095 3659862976 net.cpp:137] Memory required for data: 8086804
// ip2_ip2_0_split_1 feeds the loss layer
I0513 11:18:42.347100 3659862976 layer_factory.hpp:77] Creating layer loss
I0513 11:18:42.347103 3659862976 net.cpp:84] Creating Layer loss
I0513 11:18:42.347107 3659862976 net.cpp:406] loss <- ip2_ip2_0_split_1
I0513 11:18:42.347111 3659862976 net.cpp:406] loss <- label_mnist_1_split_1
I0513 11:18:42.347115 3659862976 net.cpp:380] loss -> loss
I0513 11:18:42.347121 3659862976 layer_factory.hpp:77] Creating layer loss
I0513 11:18:42.347131 3659862976 net.cpp:122] Setting up loss
I0513 11:18:42.347133 3659862976 net.cpp:129] Top shape: (1)
I0513 11:18:42.347137 3659862976 net.cpp:132]     with loss weight 1
I0513 11:18:42.347143 3659862976 net.cpp:137] Memory required for data: 8086808
I0513 11:18:42.347147 3659862976 net.cpp:198] loss needs backward computation.
I0513 11:18:42.347151 3659862976 net.cpp:200] accuracy does not need backward computation.
I0513 11:18:42.347156 3659862976 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0513 11:18:42.347159 3659862976 net.cpp:198] ip2 needs backward computation.
I0513 11:18:42.347162 3659862976 net.cpp:198] relu1 needs backward computation.
I0513 11:18:42.347167 3659862976 net.cpp:198] ip1 needs backward computation.
I0513 11:18:42.347169 3659862976 net.cpp:198] pool2 needs backward computation.
I0513 11:18:42.347173 3659862976 net.cpp:198] conv2 needs backward computation.
I0513 11:18:42.347177 3659862976 net.cpp:198] pool1 needs backward computation.
I0513 11:18:42.347180 3659862976 net.cpp:198] conv1 needs backward computation.
I0513 11:18:42.347184 3659862976 net.cpp:200] label_mnist_1_split does not need backward computation.
I0513 11:18:42.347189 3659862976 net.cpp:200] mnist does not need backward computation.
I0513 11:18:42.347193 3659862976 net.cpp:242] This network produces output accuracy
I0513 11:18:42.347196 3659862976 net.cpp:242] This network produces output loss
// The second building is finished
I0513 11:18:42.347203 3659862976 net.cpp:255] Network initialization done.
// The renovation plan is settled (solver scaffolding done)
I0513 11:18:42.347247 3659862976 solver.cpp:56] Solver scaffolding done.
// Start the renovation (optimization begins)
I0513 11:18:42.347271 3659862976 caffe.cpp:248] Starting Optimization
I0513 11:18:42.347275 3659862976 solver.cpp:272] Solving LeNet
I0513 11:18:42.347278 3659862976 solver.cpp:273] Learning Rate Policy: inv
// Run a test first to get the initial classification accuracy and loss
I0513 11:18:42.348048 3659862976 solver.cpp:330] Iteration 0, Testing net (#0)
I0513 11:18:44.611253 57593856 data_layer.cpp:73] Restarting data prefetching from start.
I0513 11:18:44.703907 3659862976 solver.cpp:397]     Test net output #0: accuracy = 0.077
I0513 11:18:44.703938 3659862976 solver.cpp:397]     Test net output #1: loss = 2.41516 (* 1 = 2.41516 loss)
// As expected the classifier is still poor: accuracy is only 0.077 and the loss is about 2.42, close to -ln(0.1) ≈ 2.30, which is what a 10-class softmax gives before any training
I0513 11:18:44.741230 3659862976 solver.cpp:218] Iteration 0 (0 iter/s, 2.393s/100 iters), loss = 2.42047
// At iteration 0 it is still poor; note the training net outputs only loss, with no accuracy
I0513 11:18:44.741261 3659862976 solver.cpp:237]     Train net output #0: loss = 2.42047 (* 1 = 2.42047 loss)
I0513 11:18:44.741287 3659862976 sgd_solver.cpp:105] Iteration 0, lr = 0.01
// After 100 iterations the effect shows: loss has already dropped to 0.215 (from 2.42)
I0513 11:18:47.874459 3659862976 solver.cpp:218] Iteration 100 (31.9183 iter/s, 3.133s/100 iters), loss = 0.215375
I0513 11:18:47.874493 3659862976 solver.cpp:237]     Train net output #0: loss = 0.215375 (* 1 = 0.215375 loss)
I0513 11:18:47.874500 3659862976 sgd_solver.cpp:105] Iteration 100, lr = 0.00992565
I0513 11:18:50.998973 3659862976 solver.cpp:218] Iteration 200 (32.0102 iter/s, 3.124s/100 iters), loss = 0.144389
I0513 11:18:50.999003 3659862976 solver.cpp:237]     Train net output #0: loss = 0.144389 (* 1 = 0.144389 loss)
I0513 11:18:50.999011 3659862976 sgd_solver.cpp:105] Iteration 200, lr = 0.00985258
I0513 11:18:54.100409 3659862976 solver.cpp:218] Iteration 300 (32.2477 iter/s, 3.101s/100 iters), loss = 0.192488
I0513 11:18:54.100476 3659862976 solver.cpp:237]     Train net output #0: loss = 0.192488 (* 1 = 0.192488 loss)
I0513 11:18:54.100483 3659862976 sgd_solver.cpp:105] Iteration 300, lr = 0.00978075
I0513 11:18:57.210686 3659862976 solver.cpp:218] Iteration 400 (32.1543 iter/s, 3.11s/100 iters), loss = 0.0663644
I0513 11:18:57.210728 3659862976 solver.cpp:237]     Train net output #0: loss = 0.0663644 (* 1 = 0.0663644 loss)
I0513 11:18:57.210737 3659862976 sgd_solver.cpp:105] Iteration 400, lr = 0.00971013
//迭代500次之后,进行一次测试。
I0513 11:19:00.279249 3659862976 solver.cpp:330] Iteration 500, Testing net (#0)
I0513 11:19:02.608597 57593856 data_layer.cpp:73] Restarting data prefetching from start.
//发现准确度accuracy已经显著提升到0.9744了,loss为0.08
I0513 11:19:02.703658 3659862976 solver.cpp:397]     Test net output #0: accuracy = 0.9744
I0513 11:19:02.703694 3659862976 solver.cpp:397]     Test net output #1: loss = 0.0836155 (* 1 = 0.0836155 loss)
I0513 11:19:02.735476 3659862976 solver.cpp:218] Iteration 500 (18.1028 iter/s, 5.524s/100 iters), loss = 0.0916289
I0513 11:19:02.735512 3659862976 solver.cpp:237]     Train net output #0: loss = 0.0916288 (* 1 = 0.0916288 loss)
I0513 11:19:02.735520 3659862976 sgd_solver.cpp:105] Iteration 500, lr = 0.00964069
I0513 11:19:05.931562 3659862976 solver.cpp:218] Iteration 600 (31.2891 iter/s, 3.196s/100 iters), loss = 0.0844364
I0513 11:19:05.931597 3659862976 solver.cpp:237]     Train net output #0: loss = 0.0844363 (* 1 = 0.0844363 loss)
I0513 11:19:05.931604 3659862976 sgd_solver.cpp:105] Iteration 600, lr = 0.0095724
I0513 11:19:09.116649 3659862976 solver.cpp:218] Iteration 700 (31.3972 iter/s, 3.185s/100 iters), loss = 0.134004
I0513 11:19:09.116684 3659862976 solver.cpp:237]     Train net output #0: loss = 0.134004 (* 1 = 0.134004 loss)
I0513 11:19:09.116691 3659862976 sgd_solver.cpp:105] Iteration 700, lr = 0.00950522
//中间是训练过程。。。。。。
I0513 11:22:17.536756 3659862976 solver.cpp:218] Iteration 4800 (19.3311 iter/s, 5.173s/100 iters), loss = 0.0179583
I0513 11:22:17.536806 3659862976 solver.cpp:237]     Train net output #0: loss = 0.0179581 (* 1 = 0.0179581 loss)
I0513 11:22:17.536818 3659862976 sgd_solver.cpp:105] Iteration 4800, lr = 0.00745253
I0513 11:22:22.731861 3659862976 solver.cpp:218] Iteration 4900 (19.2493 iter/s, 5.195s/100 iters), loss = 0.00556874
I0513 11:22:22.731927 3659862976 solver.cpp:237]     Train net output #0: loss = 0.00556857 (* 1 = 0.00556857 loss)
I0513 11:22:22.731940 3659862976 sgd_solver.cpp:105] Iteration 4900, lr = 0.00741498
//每迭代5000次保存一次快照,生成lenet_iter_5000.caffemodel和lenet_iter_5000.solverstate
I0513 11:22:28.143353 3659862976 solver.cpp:447] Snapshotting to binary proto file examples/mnist/lenet_iter_5000.caffemodel
I0513 11:22:28.167670 3659862976 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_5000.solverstate
I0513 11:22:28.171842 3659862976 solver.cpp:330] Iteration 5000, Testing net (#0)
I0513 11:22:32.514833 57593856 data_layer.cpp:73] Restarting data prefetching from start.
I0513 11:22:32.699314 3659862976 solver.cpp:397]     Test net output #0: accuracy = 0.9888
I0513 11:22:32.699359 3659862976 solver.cpp:397]     Test net output #1: loss = 0.0334435 (* 1 = 0.0334435 loss)
I0513 11:22:32.754936 3659862976 solver.cpp:218] Iteration 5000 (9.97705 iter/s, 10.023s/100 iters), loss = 0.0241056
I0513 11:22:32.754987 3659862976 solver.cpp:237]     Train net output #0: loss = 0.0241055 (* 1 = 0.0241055 loss)
I0513 11:22:32.754999 3659862976 sgd_solver.cpp:105] Iteration 5000, lr = 0.00737788
//中间继续训练。。。。。
I0513 11:26:53.808578 3659862976 solver.cpp:218] Iteration 9900 (21.097 iter/s, 4.74s/100 iters), loss = 0.00466773
I0513 11:26:53.808624 3659862976 solver.cpp:237]     Train net output #0: loss = 0.00466757 (* 1 = 0.00466757 loss)
I0513 11:26:53.808635 3659862976 sgd_solver.cpp:105] Iteration 9900, lr = 0.00596843
//最后一次打印快照
I0513 11:26:58.671659 3659862976 solver.cpp:447] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel
I0513 11:26:58.688323 3659862976 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate
I0513 11:26:58.715297 3659862976 solver.cpp:310] Iteration 10000, loss = 0.00293942
I0513 11:26:58.715337 3659862976 solver.cpp:330] Iteration 10000, Testing net (#0)
I0513 11:27:02.099313 57593856 data_layer.cpp:73] Restarting data prefetching from start.
//最终分类准确率为99%
I0513 11:27:02.230465 3659862976 solver.cpp:397]     Test net output #0: accuracy = 0.991
//最终loss值为0.03
I0513 11:27:02.230509 3659862976 solver.cpp:397]     Test net output #1: loss = 0.0304018 (* 1 = 0.0304018 loss)
I0513 11:27:02.230518 3659862976 solver.cpp:315] Optimization Done.
I0513 11:27:02.230525 3659862976 caffe.cpp:259] Optimization Done.
//装修结束

用训练好的模型对数据进行预测

从上面的输出结果可以看到,最终训练好的模型权值保存在lenet_iter_10000.caffemodel中,之后就可以用它对测试数据集进行预测,运行如下命令即可:

➜  caffe git:(master) ✗ ./build/tools/caffe.bin test \
-model examples/mnist/lenet_train_test.prototxt \
-weights examples/mnist/lenet_iter_10000.caffemodel \
    -iterations 100

上述命令解释:
./build/tools/caffe.bin test ,表示只做预测(前向传播计算),不进行参数更新(BP反向传播计算)

-model examples/mnist/lenet_train_test.prototxt ,指定模型描述文本文件

-weights examples/mnist/lenet_iter_10000.caffemodel ,指定模型预先训练好的权值文件
-iterations 100 , 指定测试迭代次数。参与测试的样例数目为 iterations*batch_size,其中batch_size在model prototxt中设定,此处为100,因此100*100=10000,刚好覆盖全部10000个测试样本。

我们运行上述命令得到:

I0513 11:37:08.827889 3659862976 caffe.cpp:284] Use CPU.
I0513 11:37:08.830747 3659862976 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist
I0513 11:37:08.830780 3659862976 net.cpp:51] Initializing net from parameters:
name: "LeNet"
state {
  phase: TEST
  level: 0
  stage: ""
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0513 11:37:08.831130 3659862976 layer_factory.hpp:77] Creating layer mnist
I0513 11:37:08.831360 3659862976 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_test_lmdb
I0513 11:37:08.831418 3659862976 net.cpp:84] Creating Layer mnist
I0513 11:37:08.831425 3659862976 net.cpp:380] mnist -> data
I0513 11:37:08.831444 3659862976 net.cpp:380] mnist -> label
I0513 11:37:08.831480 3659862976 data_layer.cpp:45] output data size: 100,1,28,28
I0513 11:37:08.836457 3659862976 net.cpp:122] Setting up mnist
I0513 11:37:08.836468 3659862976 net.cpp:129] Top shape: 100 1 28 28 (78400)
I0513 11:37:08.836488 3659862976 net.cpp:129] Top shape: 100 (100)
I0513 11:37:08.836491 3659862976 net.cpp:137] Memory required for data: 314000
I0513 11:37:08.836498 3659862976 layer_factory.hpp:77] Creating layer label_mnist_1_split
I0513 11:37:08.836505 3659862976 net.cpp:84] Creating Layer label_mnist_1_split
I0513 11:37:08.836509 3659862976 net.cpp:406] label_mnist_1_split <- label
I0513 11:37:08.836513 3659862976 net.cpp:380] label_mnist_1_split -> label_mnist_1_split_0
I0513 11:37:08.836519 3659862976 net.cpp:380] label_mnist_1_split -> label_mnist_1_split_1
I0513 11:37:08.836525 3659862976 net.cpp:122] Setting up label_mnist_1_split
I0513 11:37:08.836529 3659862976 net.cpp:129] Top shape: 100 (100)
I0513 11:37:08.836534 3659862976 net.cpp:129] Top shape: 100 (100)
I0513 11:37:08.836539 3659862976 net.cpp:137] Memory required for data: 314800
I0513 11:37:08.836542 3659862976 layer_factory.hpp:77] Creating layer conv1
I0513 11:37:08.836550 3659862976 net.cpp:84] Creating Layer conv1
I0513 11:37:08.836555 3659862976 net.cpp:406] conv1 <- data
I0513 11:37:08.836558 3659862976 net.cpp:380] conv1 -> conv1
I0513 11:37:08.836611 3659862976 net.cpp:122] Setting up conv1
I0513 11:37:08.836616 3659862976 net.cpp:129] Top shape: 100 20 24 24 (1152000)
I0513 11:37:08.836639 3659862976 net.cpp:137] Memory required for data: 4922800
I0513 11:37:08.836648 3659862976 layer_factory.hpp:77] Creating layer pool1
I0513 11:37:08.836653 3659862976 net.cpp:84] Creating Layer pool1
I0513 11:37:08.836658 3659862976 net.cpp:406] pool1 <- conv1
I0513 11:37:08.836661 3659862976 net.cpp:380] pool1 -> pool1
I0513 11:37:08.836671 3659862976 net.cpp:122] Setting up pool1
I0513 11:37:08.836675 3659862976 net.cpp:129] Top shape: 100 20 12 12 (288000)
I0513 11:37:08.836680 3659862976 net.cpp:137] Memory required for data: 6074800
I0513 11:37:08.836683 3659862976 layer_factory.hpp:77] Creating layer conv2
I0513 11:37:08.836691 3659862976 net.cpp:84] Creating Layer conv2
I0513 11:37:08.836695 3659862976 net.cpp:406] conv2 <- pool1
I0513 11:37:08.836700 3659862976 net.cpp:380] conv2 -> conv2
I0513 11:37:08.836917 3659862976 net.cpp:122] Setting up conv2
I0513 11:37:08.836923 3659862976 net.cpp:129] Top shape: 100 50 8 8 (320000)
I0513 11:37:08.836971 3659862976 net.cpp:137] Memory required for data: 7354800
I0513 11:37:08.837033 3659862976 layer_factory.hpp:77] Creating layer pool2
I0513 11:37:08.837041 3659862976 net.cpp:84] Creating Layer pool2
I0513 11:37:08.837045 3659862976 net.cpp:406] pool2 <- conv2
I0513 11:37:08.837049 3659862976 net.cpp:380] pool2 -> pool2
I0513 11:37:08.837059 3659862976 net.cpp:122] Setting up pool2
I0513 11:37:08.837062 3659862976 net.cpp:129] Top shape: 100 50 4 4 (80000)
I0513 11:37:08.837067 3659862976 net.cpp:137] Memory required for data: 7674800
I0513 11:37:08.837070 3659862976 layer_factory.hpp:77] Creating layer ip1
I0513 11:37:08.837076 3659862976 net.cpp:84] Creating Layer ip1
I0513 11:37:08.837080 3659862976 net.cpp:406] ip1 <- pool2
I0513 11:37:08.837085 3659862976 net.cpp:380] ip1 -> ip1
I0513 11:37:08.840445 3659862976 net.cpp:122] Setting up ip1
I0513 11:37:08.840461 3659862976 net.cpp:129] Top shape: 100 500 (50000)
I0513 11:37:08.840467 3659862976 net.cpp:137] Memory required for data: 7874800
I0513 11:37:08.840476 3659862976 layer_factory.hpp:77] Creating layer relu1
I0513 11:37:08.840487 3659862976 net.cpp:84] Creating Layer relu1
I0513 11:37:08.840492 3659862976 net.cpp:406] relu1 <- ip1
I0513 11:37:08.840497 3659862976 net.cpp:367] relu1 -> ip1 (in-place)
I0513 11:37:08.840504 3659862976 net.cpp:122] Setting up relu1
I0513 11:37:08.840507 3659862976 net.cpp:129] Top shape: 100 500 (50000)
I0513 11:37:08.840512 3659862976 net.cpp:137] Memory required for data: 8074800
I0513 11:37:08.840517 3659862976 layer_factory.hpp:77] Creating layer ip2
I0513 11:37:08.840523 3659862976 net.cpp:84] Creating Layer ip2
I0513 11:37:08.840528 3659862976 net.cpp:406] ip2 <- ip1
I0513 11:37:08.840533 3659862976 net.cpp:380] ip2 -> ip2
I0513 11:37:08.840591 3659862976 net.cpp:122] Setting up ip2
I0513 11:37:08.840597 3659862976 net.cpp:129] Top shape: 100 10 (1000)
I0513 11:37:08.840601 3659862976 net.cpp:137] Memory required for data: 8078800
I0513 11:37:08.840606 3659862976 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0513 11:37:08.840612 3659862976 net.cpp:84] Creating Layer ip2_ip2_0_split
I0513 11:37:08.840616 3659862976 net.cpp:406] ip2_ip2_0_split <- ip2
I0513 11:37:08.840623 3659862976 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0513 11:37:08.840631 3659862976 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0513 11:37:08.840637 3659862976 net.cpp:122] Setting up ip2_ip2_0_split
I0513 11:37:08.840641 3659862976 net.cpp:129] Top shape: 100 10 (1000)
I0513 11:37:08.840646 3659862976 net.cpp:129] Top shape: 100 10 (1000)
I0513 11:37:08.840649 3659862976 net.cpp:137] Memory required for data: 8086800
I0513 11:37:08.840653 3659862976 layer_factory.hpp:77] Creating layer accuracy
I0513 11:37:08.840659 3659862976 net.cpp:84] Creating Layer accuracy
I0513 11:37:08.840663 3659862976 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0513 11:37:08.840668 3659862976 net.cpp:406] accuracy <- label_mnist_1_split_0
I0513 11:37:08.840672 3659862976 net.cpp:380] accuracy -> accuracy
I0513 11:37:08.840678 3659862976 net.cpp:122] Setting up accuracy
I0513 11:37:08.840708 3659862976 net.cpp:129] Top shape: (1)
I0513 11:37:08.840714 3659862976 net.cpp:137] Memory required for data: 8086804
I0513 11:37:08.840718 3659862976 layer_factory.hpp:77] Creating layer loss
I0513 11:37:08.840724 3659862976 net.cpp:84] Creating Layer loss
I0513 11:37:08.840728 3659862976 net.cpp:406] loss <- ip2_ip2_0_split_1
I0513 11:37:08.840733 3659862976 net.cpp:406] loss <- label_mnist_1_split_1
I0513 11:37:08.840737 3659862976 net.cpp:380] loss -> loss
I0513 11:37:08.840746 3659862976 layer_factory.hpp:77] Creating layer loss
I0513 11:37:08.840759 3659862976 net.cpp:122] Setting up loss
I0513 11:37:08.840762 3659862976 net.cpp:129] Top shape: (1)
I0513 11:37:08.840767 3659862976 net.cpp:132]     with loss weight 1
I0513 11:37:08.840776 3659862976 net.cpp:137] Memory required for data: 8086808
I0513 11:37:08.840780 3659862976 net.cpp:198] loss needs backward computation.
I0513 11:37:08.840785 3659862976 net.cpp:200] accuracy does not need backward computation.
I0513 11:37:08.840790 3659862976 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0513 11:37:08.840793 3659862976 net.cpp:198] ip2 needs backward computation.
I0513 11:37:08.840798 3659862976 net.cpp:198] relu1 needs backward computation.
I0513 11:37:08.840802 3659862976 net.cpp:198] ip1 needs backward computation.
I0513 11:37:08.840806 3659862976 net.cpp:198] pool2 needs backward computation.
I0513 11:37:08.840811 3659862976 net.cpp:198] conv2 needs backward computation.
I0513 11:37:08.840814 3659862976 net.cpp:198] pool1 needs backward computation.
I0513 11:37:08.840818 3659862976 net.cpp:198] conv1 needs backward computation.
I0513 11:37:08.840822 3659862976 net.cpp:200] label_mnist_1_split does not need backward computation.
I0513 11:37:08.840827 3659862976 net.cpp:200] mnist does not need backward computation.
I0513 11:37:08.840831 3659862976 net.cpp:242] This network produces output accuracy
I0513 11:37:08.840836 3659862976 net.cpp:242] This network produces output loss
I0513 11:37:08.840843 3659862976 net.cpp:255] Network initialization done.
I0513 11:37:08.843325 3659862976 caffe.cpp:290] Running for 100 iterations.
I0513 11:37:08.871536 3659862976 caffe.cpp:313] Batch 0, accuracy = 1
I0513 11:37:08.871567 3659862976 caffe.cpp:313] Batch 0, loss = 0.0085843
I0513 11:37:08.894382 3659862976 caffe.cpp:313] Batch 1, accuracy = 1
I0513 11:37:08.894414 3659862976 caffe.cpp:313] Batch 1, loss = 0.00573037
I0513 11:37:08.918002 3659862976 caffe.cpp:313] Batch 2, accuracy = 0.99
I0513 11:37:08.918031 3659862976 caffe.cpp:313] Batch 2, loss = 0.0333053
I0513 11:37:08.943091 3659862976 caffe.cpp:313] Batch 3, accuracy = 0.99
I0513 11:37:08.943127 3659862976 caffe.cpp:313] Batch 3, loss = 0.0271862
I0513 11:37:08.967147 3659862976 caffe.cpp:313] Batch 4, accuracy = 0.99
I0513 11:37:08.967177 3659862976 caffe.cpp:313] Batch 4, loss = 0.0571239
I0513 11:37:08.989929 3659862976 caffe.cpp:313] Batch 5, accuracy = 0.99
I0513 11:37:08.989961 3659862976 caffe.cpp:313] Batch 5, loss = 0.0569953
I0513 11:37:09.015426 3659862976 caffe.cpp:313] Batch 6, accuracy = 0.98
I0513 11:37:09.015463 3659862976 caffe.cpp:313] Batch 6, loss = 0.0698283
I0513 11:37:09.039398 3659862976 caffe.cpp:313] Batch 7, accuracy = 0.99
I0513 11:37:09.039432 3659862976 caffe.cpp:313] Batch 7, loss = 0.0349087
I0513 11:37:09.063937 3659862976 caffe.cpp:313] Batch 8, accuracy = 1
I0513 11:37:09.063967 3659862976 caffe.cpp:313] Batch 8, loss = 0.0115442
I0513 11:37:09.086630 3659862976 caffe.cpp:313] Batch 9, accuracy = 0.99
I0513 11:37:09.086663 3659862976 caffe.cpp:313] Batch 9, loss = 0.0361095
I0513 11:37:09.111706 3659862976 caffe.cpp:313] Batch 10, accuracy = 0.98
I0513 11:37:09.111735 3659862976 caffe.cpp:313] Batch 10, loss = 0.0702643
I0513 11:37:09.135445 3659862976 caffe.cpp:313] Batch 11, accuracy = 0.97
I0513 11:37:09.135478 3659862976 caffe.cpp:313] Batch 11, loss = 0.0508112
I0513 11:37:09.159065 3659862976 caffe.cpp:313] Batch 12, accuracy = 0.95
I0513 11:37:09.159097 3659862976 caffe.cpp:313] Batch 12, loss = 0.148118
I0513 11:37:09.181542 3659862976 caffe.cpp:313] Batch 13, accuracy = 0.98
I0513 11:37:09.181607 3659862976 caffe.cpp:313] Batch 13, loss = 0.036772
I0513 11:37:09.205440 3659862976 caffe.cpp:313] Batch 14, accuracy = 1
I0513 11:37:09.205476 3659862976 caffe.cpp:313] Batch 14, loss = 0.00694412
I0513 11:37:09.228198 3659862976 caffe.cpp:313] Batch 15, accuracy = 0.99
I0513 11:37:09.228229 3659862976 caffe.cpp:313] Batch 15, loss = 0.0389514
I0513 11:37:09.251550 3659862976 caffe.cpp:313] Batch 16, accuracy = 0.98
I0513 11:37:09.251581 3659862976 caffe.cpp:313] Batch 16, loss = 0.0298825
I0513 11:37:09.275153 3659862976 caffe.cpp:313] Batch 17, accuracy = 1
I0513 11:37:09.275182 3659862976 caffe.cpp:313] Batch 17, loss = 0.0170967
I0513 11:37:09.298004 3659862976 caffe.cpp:313] Batch 18, accuracy = 0.99
I0513 11:37:09.298035 3659862976 caffe.cpp:313] Batch 18, loss = 0.0189575
I0513 11:37:09.321348 3659862976 caffe.cpp:313] Batch 19, accuracy = 0.99
I0513 11:37:09.321379 3659862976 caffe.cpp:313] Batch 19, loss = 0.0455956
I0513 11:37:09.344025 3659862976 caffe.cpp:313] Batch 20, accuracy = 0.98
I0513 11:37:09.344058 3659862976 caffe.cpp:313] Batch 20, loss = 0.108723
I0513 11:37:09.368069 3659862976 caffe.cpp:313] Batch 21, accuracy = 0.98
I0513 11:37:09.368101 3659862976 caffe.cpp:313] Batch 21, loss = 0.0780955
I0513 11:37:09.390791 3659862976 caffe.cpp:313] Batch 22, accuracy = 0.99
I0513 11:37:09.390823 3659862976 caffe.cpp:313] Batch 22, loss = 0.0368689
I0513 11:37:09.414577 3659862976 caffe.cpp:313] Batch 23, accuracy = 0.97
I0513 11:37:09.414621 3659862976 caffe.cpp:313] Batch 23, loss = 0.0296016
I0513 11:37:09.437597 3659862976 caffe.cpp:313] Batch 24, accuracy = 0.97
I0513 11:37:09.437628 3659862976 caffe.cpp:313] Batch 24, loss = 0.0589915
I0513 11:37:09.460636 3659862976 caffe.cpp:313] Batch 25, accuracy = 0.99
I0513 11:37:09.460669 3659862976 caffe.cpp:313] Batch 25, loss = 0.0754509
I0513 11:37:09.483229 3659862976 caffe.cpp:313] Batch 26, accuracy = 0.99
I0513 11:37:09.483261 3659862976 caffe.cpp:313] Batch 26, loss = 0.118656
I0513 11:37:09.508059 3659862976 caffe.cpp:313] Batch 27, accuracy = 0.98
I0513 11:37:09.508092 3659862976 caffe.cpp:313] Batch 27, loss = 0.0222734
I0513 11:37:09.530911 3659862976 caffe.cpp:313] Batch 28, accuracy = 0.99
I0513 11:37:09.530943 3659862976 caffe.cpp:313] Batch 28, loss = 0.0315118
I0513 11:37:09.555687 3659862976 caffe.cpp:313] Batch 29, accuracy = 0.97
I0513 11:37:09.555721 3659862976 caffe.cpp:313] Batch 29, loss = 0.129427
I0513 11:37:09.579476 3659862976 caffe.cpp:313] Batch 30, accuracy = 1
I0513 11:37:09.579507 3659862976 caffe.cpp:313] Batch 30, loss = 0.0196561
I0513 11:37:09.602957 3659862976 caffe.cpp:313] Batch 31, accuracy = 1
I0513 11:37:09.602993 3659862976 caffe.cpp:313] Batch 31, loss = 0.00242798
I0513 11:37:09.626893 3659862976 caffe.cpp:313] Batch 32, accuracy = 0.99
I0513 11:37:09.626924 3659862976 caffe.cpp:313] Batch 32, loss = 0.0169622
I0513 11:37:09.650236 3659862976 caffe.cpp:313] Batch 33, accuracy = 1
I0513 11:37:09.650270 3659862976 caffe.cpp:313] Batch 33, loss = 0.00425847
I0513 11:37:09.673212 3659862976 caffe.cpp:313] Batch 34, accuracy = 0.99
I0513 11:37:09.673243 3659862976 caffe.cpp:313] Batch 34, loss = 0.0726783
I0513 11:37:09.696039 3659862976 caffe.cpp:313] Batch 35, accuracy = 0.95
I0513 11:37:09.696071 3659862976 caffe.cpp:313] Batch 35, loss = 0.173234
I0513 11:37:09.719209 3659862976 caffe.cpp:313] Batch 36, accuracy = 1
I0513 11:37:09.719241 3659862976 caffe.cpp:313] Batch 36, loss = 0.0126433
I0513 11:37:09.741852 3659862976 caffe.cpp:313] Batch 37, accuracy = 0.99
I0513 11:37:09.741884 3659862976 caffe.cpp:313] Batch 37, loss = 0.0380185
I0513 11:37:09.766039 3659862976 caffe.cpp:313] Batch 38, accuracy = 1
I0513 11:37:09.766072 3659862976 caffe.cpp:313] Batch 38, loss = 0.0161337
I0513 11:37:09.788811 3659862976 caffe.cpp:313] Batch 39, accuracy = 0.98
I0513 11:37:09.788844 3659862976 caffe.cpp:313] Batch 39, loss = 0.0317039
I0513 11:37:09.812556 3659862976 caffe.cpp:313] Batch 40, accuracy = 1
I0513 11:37:09.812587 3659862976 caffe.cpp:313] Batch 40, loss = 0.0283054
I0513 11:37:09.835418 3659862976 caffe.cpp:313] Batch 41, accuracy = 0.98
I0513 11:37:09.835450 3659862976 caffe.cpp:313] Batch 41, loss = 0.0595546
I0513 11:37:09.858765 3659862976 caffe.cpp:313] Batch 42, accuracy = 0.98
I0513 11:37:09.858793 3659862976 caffe.cpp:313] Batch 42, loss = 0.033258
I0513 11:37:09.881479 3659862976 caffe.cpp:313] Batch 43, accuracy = 1
I0513 11:37:09.881510 3659862976 caffe.cpp:313] Batch 43, loss = 0.00560485
I0513 11:37:09.906558 3659862976 caffe.cpp:313] Batch 44, accuracy = 1
I0513 11:37:09.906590 3659862976 caffe.cpp:313] Batch 44, loss = 0.0164246
I0513 11:37:09.932261 3659862976 caffe.cpp:313] Batch 45, accuracy = 0.99
I0513 11:37:09.932294 3659862976 caffe.cpp:313] Batch 45, loss = 0.047733
I0513 11:37:09.957159 3659862976 caffe.cpp:313] Batch 46, accuracy = 1
I0513 11:37:09.957190 3659862976 caffe.cpp:313] Batch 46, loss = 0.00406718
I0513 11:37:09.979852 3659862976 caffe.cpp:313] Batch 47, accuracy = 0.99
I0513 11:37:09.979883 3659862976 caffe.cpp:313] Batch 47, loss = 0.0176224
I0513 11:37:10.003631 3659862976 caffe.cpp:313] Batch 48, accuracy = 0.95
I0513 11:37:10.003666 3659862976 caffe.cpp:313] Batch 48, loss = 0.0918992
I0513 11:37:10.027333 3659862976 caffe.cpp:313] Batch 49, accuracy = 1
I0513 11:37:10.027365 3659862976 caffe.cpp:313] Batch 49, loss = 0.00535747
I0513 11:37:10.050904 3659862976 caffe.cpp:313] Batch 50, accuracy = 1
I0513 11:37:10.050935 3659862976 caffe.cpp:313] Batch 50, loss = 0.000293352
I0513 11:37:10.076280 3659862976 caffe.cpp:313] Batch 51, accuracy = 1
I0513 11:37:10.076314 3659862976 caffe.cpp:313] Batch 51, loss = 0.00675426
I0513 11:37:10.099964 3659862976 caffe.cpp:313] Batch 52, accuracy = 1
I0513 11:37:10.099993 3659862976 caffe.cpp:313] Batch 52, loss = 0.0113504
I0513 11:37:10.123363 3659862976 caffe.cpp:313] Batch 53, accuracy = 1
I0513 11:37:10.123394 3659862976 caffe.cpp:313] Batch 53, loss = 0.00080642
I0513 11:37:10.146338 3659862976 caffe.cpp:313] Batch 54, accuracy = 1
I0513 11:37:10.146368 3659862976 caffe.cpp:313] Batch 54, loss = 0.0119724
I0513 11:37:10.170075 3659862976 caffe.cpp:313] Batch 55, accuracy = 1
I0513 11:37:10.170106 3659862976 caffe.cpp:313] Batch 55, loss = 9.95353e-05
I0513 11:37:10.192754 3659862976 caffe.cpp:313] Batch 56, accuracy = 1
I0513 11:37:10.192785 3659862976 caffe.cpp:313] Batch 56, loss = 0.00792123
I0513 11:37:10.215930 3659862976 caffe.cpp:313] Batch 57, accuracy = 1
I0513 11:37:10.215963 3659862976 caffe.cpp:313] Batch 57, loss = 0.0106224
I0513 11:37:10.238731 3659862976 caffe.cpp:313] Batch 58, accuracy = 1
I0513 11:37:10.238765 3659862976 caffe.cpp:313] Batch 58, loss = 0.00865888
I0513 11:37:10.261700 3659862976 caffe.cpp:313] Batch 59, accuracy = 0.98
I0513 11:37:10.261731 3659862976 caffe.cpp:313] Batch 59, loss = 0.0758659
I0513 11:37:10.284554 3659862976 caffe.cpp:313] Batch 60, accuracy = 1
I0513 11:37:10.284585 3659862976 caffe.cpp:313] Batch 60, loss = 0.00406362
I0513 11:37:10.310072 3659862976 caffe.cpp:313] Batch 61, accuracy = 1
I0513 11:37:10.310102 3659862976 caffe.cpp:313] Batch 61, loss = 0.00472714
I0513 11:37:10.332813 3659862976 caffe.cpp:313] Batch 62, accuracy = 1
I0513 11:37:10.332845 3659862976 caffe.cpp:313] Batch 62, loss = 0.00013836
I0513 11:37:10.356101 3659862976 caffe.cpp:313] Batch 63, accuracy = 1
I0513 11:37:10.356132 3659862976 caffe.cpp:313] Batch 63, loss = 0.000318341
I0513 11:37:10.378556 3659862976 caffe.cpp:313] Batch 64, accuracy = 1
I0513 11:37:10.378587 3659862976 caffe.cpp:313] Batch 64, loss = 0.000235923
I0513 11:37:10.402688 3659862976 caffe.cpp:313] Batch 65, accuracy = 0.94
I0513 11:37:10.402724 3659862976 caffe.cpp:313] Batch 65, loss = 0.174556
I0513 11:37:10.426704 3659862976 caffe.cpp:313] Batch 66, accuracy = 0.98
I0513 11:37:10.426736 3659862976 caffe.cpp:313] Batch 66, loss = 0.0710799
I0513 11:37:10.450608 3659862976 caffe.cpp:313] Batch 67, accuracy = 0.99
I0513 11:37:10.450641 3659862976 caffe.cpp:313] Batch 67, loss = 0.0471492
I0513 11:37:10.474786 3659862976 caffe.cpp:313] Batch 68, accuracy = 1
I0513 11:37:10.474853 3659862976 caffe.cpp:313] Batch 68, loss = 0.00714237
I0513 11:37:10.497565 3659862976 caffe.cpp:313] Batch 69, accuracy = 1
I0513 11:37:10.497596 3659862976 caffe.cpp:313] Batch 69, loss = 0.00141993
I0513 11:37:10.520592 3659862976 caffe.cpp:313] Batch 70, accuracy = 1
I0513 11:37:10.520623 3659862976 caffe.cpp:313] Batch 70, loss = 0.00206052
I0513 11:37:10.543385 3659862976 caffe.cpp:313] Batch 71, accuracy = 1
I0513 11:37:10.543418 3659862976 caffe.cpp:313] Batch 71, loss = 0.000801532
I0513 11:37:10.567934 3659862976 caffe.cpp:313] Batch 72, accuracy = 0.99
I0513 11:37:10.567965 3659862976 caffe.cpp:313] Batch 72, loss = 0.0175235
I0513 11:37:10.591750 3659862976 caffe.cpp:313] Batch 73, accuracy = 1
I0513 11:37:10.591784 3659862976 caffe.cpp:313] Batch 73, loss = 0.000181734
I0513 11:37:10.617092 3659862976 caffe.cpp:313] Batch 74, accuracy = 1
I0513 11:37:10.617122 3659862976 caffe.cpp:313] Batch 74, loss = 0.00376508
I0513 11:37:10.639822 3659862976 caffe.cpp:313] Batch 75, accuracy = 1
I0513 11:37:10.639853 3659862976 caffe.cpp:313] Batch 75, loss = 0.00211647
I0513 11:37:10.664058 3659862976 caffe.cpp:313] Batch 76, accuracy = 1
I0513 11:37:10.664090 3659862976 caffe.cpp:313] Batch 76, loss = 0.000218412
I0513 11:37:10.686815 3659862976 caffe.cpp:313] Batch 77, accuracy = 1
I0513 11:37:10.686847 3659862976 caffe.cpp:313] Batch 77, loss = 0.000203503
I0513 11:37:10.710923 3659862976 caffe.cpp:313] Batch 78, accuracy = 1
I0513 11:37:10.710953 3659862976 caffe.cpp:313] Batch 78, loss = 0.0013391
I0513 11:37:10.733860 3659862976 caffe.cpp:313] Batch 79, accuracy = 1
I0513 11:37:10.733891 3659862976 caffe.cpp:313] Batch 79, loss = 0.00335708
I0513 11:37:10.758643 3659862976 caffe.cpp:313] Batch 80, accuracy = 0.99
I0513 11:37:10.758677 3659862976 caffe.cpp:313] Batch 80, loss = 0.0256179
I0513 11:37:10.781409 3659862976 caffe.cpp:313] Batch 81, accuracy = 1
I0513 11:37:10.781440 3659862976 caffe.cpp:313] Batch 81, loss = 0.0023732
I0513 11:37:10.805886 3659862976 caffe.cpp:313] Batch 82, accuracy = 0.99
I0513 11:37:10.805920 3659862976 caffe.cpp:313] Batch 82, loss = 0.0162458
I0513 11:37:10.828743 3659862976 caffe.cpp:313] Batch 83, accuracy = 1
I0513 11:37:10.828775 3659862976 caffe.cpp:313] Batch 83, loss = 0.00678432
I0513 11:37:10.852507 3659862976 caffe.cpp:313] Batch 84, accuracy = 0.99
I0513 11:37:10.852538 3659862976 caffe.cpp:313] Batch 84, loss = 0.0189542
I0513 11:37:10.875788 3659862976 caffe.cpp:313] Batch 85, accuracy = 0.99
I0513 11:37:10.875819 3659862976 caffe.cpp:313] Batch 85, loss = 0.0198986
I0513 11:37:10.899011 3659862976 caffe.cpp:313] Batch 86, accuracy = 1
I0513 11:37:10.899040 3659862976 caffe.cpp:313] Batch 86, loss = 0.000146087
I0513 11:37:10.921692 3659862976 caffe.cpp:313] Batch 87, accuracy = 1
I0513 11:37:10.921723 3659862976 caffe.cpp:313] Batch 87, loss = 0.000129989
I0513 11:37:10.944453 3659862976 caffe.cpp:313] Batch 88, accuracy = 1
I0513 11:37:10.944484 3659862976 caffe.cpp:313] Batch 88, loss = 4.1275e-05
I0513 11:37:10.968449 3659862976 caffe.cpp:313] Batch 89, accuracy = 1
I0513 11:37:10.968482 3659862976 caffe.cpp:313] Batch 89, loss = 4.4345e-05
I0513 11:37:10.994932 3659862976 caffe.cpp:313] Batch 90, accuracy = 0.97
I0513 11:37:10.994962 3659862976 caffe.cpp:313] Batch 90, loss = 0.0680957
I0513 11:37:11.018280 3659862976 caffe.cpp:313] Batch 91, accuracy = 1
I0513 11:37:11.018312 3659862976 caffe.cpp:313] Batch 91, loss = 2.29651e-05
I0513 11:37:11.044423 3659862976 caffe.cpp:313] Batch 92, accuracy = 1
I0513 11:37:11.044457 3659862976 caffe.cpp:313] Batch 92, loss = 0.000162702
I0513 11:37:11.068132 3659862976 caffe.cpp:313] Batch 93, accuracy = 1
I0513 11:37:11.068163 3659862976 caffe.cpp:313] Batch 93, loss = 0.000582345
I0513 11:37:11.090775 3659862976 caffe.cpp:313] Batch 94, accuracy = 1
I0513 11:37:11.090806 3659862976 caffe.cpp:313] Batch 94, loss = 0.000352066
I0513 11:37:11.115216 3659862976 caffe.cpp:313] Batch 95, accuracy = 1
I0513 11:37:11.115247 3659862976 caffe.cpp:313] Batch 95, loss = 0.00453322
I0513 11:37:11.115762 84811776 data_layer.cpp:73] Restarting data prefetching from start.
I0513 11:37:11.137984 3659862976 caffe.cpp:313] Batch 96, accuracy = 0.97
I0513 11:37:11.138017 3659862976 caffe.cpp:313] Batch 96, loss = 0.0792528
I0513 11:37:11.162164 3659862976 caffe.cpp:313] Batch 97, accuracy = 0.98
I0513 11:37:11.162194 3659862976 caffe.cpp:313] Batch 97, loss = 0.106678
I0513 11:37:11.184717 3659862976 caffe.cpp:313] Batch 98, accuracy = 1
I0513 11:37:11.184751 3659862976 caffe.cpp:313] Batch 98, loss = 0.0035934
I0513 11:37:11.208353 3659862976 caffe.cpp:313] Batch 99, accuracy = 0.99
I0513 11:37:11.208385 3659862976 caffe.cpp:313] Batch 99, loss = 0.0180797
I0513 11:37:11.208390 3659862976 caffe.cpp:318] Loss: 0.0304018
I0513 11:37:11.208411 3659862976 caffe.cpp:330] accuracy = 0.991
I0513 11:37:11.208425 3659862976 caffe.cpp:330] loss = 0.0304018 (* 1 = 0.0304018 loss)

最后accuracy为0.991,loss为0.03
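除了上面的命令行工具,也可以用 pycaffe 在 Python 里直接加载这份训练好的权值做预测。下面是一个示意性的小片段(非原文内容),其中假设已经编译好 pycaffe,examples/mnist/lenet.prototxt 是部署用(deploy)的网络描述文件,且输出 blob 名为 prob;若文件名或 blob 名不同,请按实际情况替换:

# 示意:用 pycaffe 加载训练好的 LeNet 权值,对一张 28x28 的灰度图做前向预测
import numpy as np
import caffe

caffe.set_mode_cpu()
net = caffe.Net('examples/mnist/lenet.prototxt',              # 部署用的网络结构(假设存在)
                'examples/mnist/lenet_iter_10000.caffemodel', # 训练好的权值
                caffe.TEST)

img = np.random.rand(28, 28).astype(np.float32)  # 这里用随机数据代替真实图片;真实图片应按训练时的方式缩放(乘以0.00390625)
net.blobs['data'].reshape(1, 1, 28, 28)
net.blobs['data'].data[0, 0, ...] = img
out = net.forward()
print 'predicted digit:', out['prob'][0].argmax()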

总结

通过上述内容,我们可以初步了解一个完整的深度学习系统最核心的两个方面:数据和模型.数据是带标签的图片集,分训练集和测试集;模型是描述CNN结构的有向无环图(DAG),表示对原始数据的处理方式.

Caffe并不直接处理原始数据,由预处理程序将原始数据存储为LMDB格式,来保持较高的IO效率.模型通常用ProtoBuffer文本格式表述,训练结果保存为ProtoBuffer二进制文件或HDF5格式文件.深度学习的过程就是利用训练数据对模型进行训练,将数据中蕴藏的大量信息通过机器学习算法不断收集到模型中,利用训练好的模型对现实世界中相似数据进行特定处理(如分类,识别,检测,定位).
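为了更直观地理解上面说的 LMDB 与 ProtoBuffer,下面给出一个示意性的小片段(非原文内容),假设已安装 python 的 lmdb 包并编译好 pycaffe,用它读取前面生成的 examples/mnist/mnist_test_lmdb 中的第一条 Datum 记录:

# 示意:打开 LMDB,取出第一条记录并反序列化为 Datum
import lmdb
from caffe.proto import caffe_pb2

env = lmdb.open('examples/mnist/mnist_test_lmdb', readonly=True)
with env.begin() as txn:
    for key, value in txn.cursor():
        datum = caffe_pb2.Datum()
        datum.ParseFromString(value)   # ProtoBuffer 反序列化
        print 'key:', key, 'label:', datum.label, 'shape:', datum.channels, datum.height, datum.width
        break                          # 只看第一条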

2017/5/13 posted in  caffe框架学习 基础知识

CNN卷积神经网络

简介

卷积神经网络是近年发展起来,并引起广泛重视的一种高效识别方法。20世纪60年代,Hubel和Wiesel在研究猫脑皮层中用于局部敏感和方向选择的神经元时发现其独特的网络结构可以有效地降低反馈神经网络的复杂性,继而提出了卷积神经网络(Convolutional Neural Networks-简称CNN)。现在,CNN已经成为众多科学领域的研究热点之一,特别是在模式分类领域,由于该网络避免了对图像的复杂前期预处理,可以直接输入原始图像,因而得到了更为广泛的应用

一般地,CNN的基本结构包括两层:

  1. 其一为特征提取层:每个神经元的输入与前一层的局部接受域相连,并提取该局部的特征。一旦该局部特征被提取后,它与其它特征间的位置关系也随之确定下来;

  2. 其二是特征映射层:网络的每个计算层由多个特征映射组成,每个特征映射是一个平面,平面上所有神经元的权值相等。特征映射结构采用影响函数核小的sigmoid函数作为卷积网络的激活函数,使得特征映射具有位移不变性。此外,由于一个映射面上的神经元共享权值,因而减少了网络自由参数的个数。卷积神经网络中的每一个卷积层都紧跟着一个用来求局部平均与二次提取的计算层,这种特有的两次特征提取结构减小了特征分辨率。

CNN主要用来识别位移、缩放及其他形式扭曲不变性的二维图形。由于CNN的特征检测层通过训练数据进行学习,所以在使用CNN时,避免了显式的特征抽取,而是隐式地从训练数据中进行学习;

再者由于同一特征映射面上的神经元权值相同,所以网络可以并行学习,这也是卷积网络相对于神经元彼此相连网络的一大优势。卷积神经网络以其局部权值共享的特殊结构在语音识别和图像处理方面有着独特的优越性,其布局更接近于实际的生物神经网络,权值共享降低了网络的复杂性,特别是多维输入向量的图像可以直接输入网络这一特点避免了特征提取和分类过程中数据重建的复杂度

卷积神经网络

在图像处理中,往往把图像表示为像素的向量,比如一个1000×1000的图像,可以表示为一个1000000的向量。在上一节中提到的神经网络中,如果隐含层数目与输入层一样,即也是1000000时,那么输入层到隐含层的参数数据为:\(1000000×1000000=10^{12}\),这样就太多了,基本没法训练。所以图像处理要想练成神经网络大法,必先减少参数加快速度。就跟辟邪剑谱似的,普通人练得很挫,一旦自宫后内力变强剑法变快,就变得很牛了。

局部感知

卷积神经网络有两种神器可以降低参数数目,第一种神器叫做局部感知野。一般认为人对外界的认知是从局部到全局的,而图像的空间联系也是局部的像素联系较为紧密,而距离较远的像素相关性则较弱。因而,每个神经元其实没有必要对全局图像进行感知,只需要对局部进行感知,然后在更高层将局部的信息综合起来就得到了全局的信息。网络部分连通的思想,也是受启发于生物学里面的视觉系统结构。视觉皮层的神经元就是局部接受信息的(即这些神经元只响应某些特定区域的刺激)。如下图所示:左图为全连接,右图为局部连接。

在上右图中,假如每个神经元只和10×10个像素值相连,那么权值数据为1000000×100个参数,减少为原来的千分之一。而那10×10个像素值对应的10×10个参数,其实就相当于卷积操作。

参数共享

但其实这样的话参数仍然过多,那么就启动第二级神器,即权值共享。在上面的局部连接中,每个神经元都对应100个参数,一共1000000个神经元,如果这1000000个神经元的100个参数都是相等的,那么参数数目就变为100了。
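先用一小段 python 粗算一下上面提到的三个数量级(仅作示意):

full_connect  = (1000 * 1000) * (1000 * 1000)   # 全连接:10^6 个神经元,每个连接 10^6 个像素,共 10^12 个参数
local_connect = (1000 * 1000) * (10 * 10)       # 局部连接:每个神经元只连 10x10 的局部区域,共 10^8 个参数
weight_share  = 10 * 10                         # 再加上权值共享:所有神经元共用同一个 10x10 卷积核,只剩 100 个参数
print full_connect, local_connect, weight_share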

怎么理解权值共享呢?我们可以把这100个参数(也就是卷积操作)看成是提取特征的方式,该方式与位置无关。这其中隐含的原理则是:图像的一部分的统计特性与其他部分是一样的。这也意味着我们在这一部分学习的特征也能用在另一部分上,所以对于这个图像上的所有位置,我们都能使用同样的学习特征。

更直观一些,当从一个大尺寸图像中随机选取一小块,比如说 8×8 作为样本,并且从这个小块样本中学习到了一些特征,这时我们可以把从这个 8×8 样本中学习到的特征作为探测器,应用到这个图像的任意地方中去。特别是,我们可以用从 8×8 样本中所学习到的特征跟原本的大尺寸图像作卷积,从而对这个大尺寸图像上的任一位置获得一个不同特征的激活值。

如下图所示,展示了一个3×3的卷积核在5×5的图像上做卷积的过程。每个卷积都是一种特征提取方式,就像一个筛子,将图像中符合条件(激活值越大越符合条件)的部分筛选出来。
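下面用 numpy 写一个示意性的小例子(非原文内容),演示 3×3 卷积核在 5×5 图像上做卷积、得到 3×3 特征图的过程:

import numpy as np

image = np.arange(25, dtype=np.float32).reshape(5, 5)   # 5x5 的输入图像(示例数据)
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=np.float32)        # 3x3 的卷积核

feature_map = np.zeros((3, 3), dtype=np.float32)
for i in range(3):
    for j in range(3):
        # 取出与卷积核对齐的 3x3 局部区域,对应元素相乘再求和
        feature_map[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)
print feature_map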

多卷积核

上面所述只有100个参数时,表明只有1个10×10的卷积核,显然,特征提取是不充分的,我们可以添加多个卷积核,比如32个卷积核,可以学习32种特征。在有多个卷积核时,如下图所示:

上图右,不同颜色表明不同的卷积核。每个卷积核都会将图像生成为另一幅图像。比如两个卷积核就可以生成两幅图像,这两幅图像可以看做是一张图像的不同的通道。如下图所示,下图有个小错误,即将w1改为w0,w2改为w1即可。下文中仍以w1和w2称呼它们。
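同样可以用 numpy 简单示意多卷积核的效果(非原文内容):每个卷积核各自在输入上滑动,产生一幅特征图,32 个卷积核就产生 32 个通道:

import numpy as np

image = np.random.rand(5, 5).astype(np.float32)
kernels = np.random.rand(32, 3, 3).astype(np.float32)    # 32 个 3x3 卷积核(随机示例)
feature_maps = np.zeros((32, 3, 3), dtype=np.float32)
for k in range(32):
    for i in range(3):
        for j in range(3):
            feature_maps[k, i, j] = np.sum(image[i:i+3, j:j+3] * kernels[k])
print feature_maps.shape    # (32, 3, 3):每个卷积核对应一幅特征图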

2017/5/6 posted in  基础知识

MNIST机器学习入门

MNIST是一个入门级的计算机视觉数据集,它包含各种手写数字图片:

它也包含每一张图片对应的标签,告诉我们这个是数字几。比如,上面这四张图片的标签分别是5,0,4,1。

这里我们将试着训练一个机器学习模型用于预测图片里面的数字。这次我们的主要目的是研究如何使用TensorFlow。所以,我们这里会从一个很简单的数学模型开始,它叫做Softmax Regression。

对应这个教程的实现代码很短,而且真正有意思的内容只包含在三行代码里面。但是,去理解包含在这些代码里面的设计思想是非常重要的:TensorFlow工作流程和机器学习的基本概念。因此,此次会很详细地介绍这些代码的实现原理。

MNIST数据集

MNIST数据集的官网是Yann LeCun's website。在这里,我们提供了一份python源代码(见附录)用于自动下载和安装这个数据集。你可以下载这份代码,然后用下面的代码导入到你的项目里面,也可以直接复制粘贴到你的代码文件里面。

import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

下载下来的数据集被分成两部分:60000行的训练数据集(mnist.train)和10000行的测试数据集(mnist.test)。这样的切分很重要,在机器学习模型设计时必须有一个单独的测试数据集不用于训练而是用来评估这个模型的性能,从而更加容易把设计的模型推广到其他数据集上(泛化)。

正如前面提到的一样,每一个MNIST数据单元由两部分组成:一张包含手写数字的图片和一个对应的标签。我们把这些图片设为"xs",把这些标签设为"ys"。训练数据集和测试数据集都包含xs和ys,比如训练数据集的图片是 mnist.train.images ,训练数据集的标签是 mnist.train.labels。

每一张图片包含28像素X28像素。我们可以用一个数字数组来表示这张图片:


我们把这个数组展开成一个向量,长度是 28x28 = 784。如何展开这个数组(数字间的顺序)不重要,只要保持各个图片采用相同的方式展开。从这个角度来看,MNIST数据集的图片就是在784维向量空间里面的点, 并且拥有比较复杂的结构 (提醒: 此类数据的可视化是计算密集型的)。

展平图片的数字数组会丢失图片的二维结构信息。这显然是不理想的,最优秀的计算机视觉方法会挖掘并利用这些结构信息,我们会在后续教程中介绍。但是在这个教程中我们忽略这些结构,所介绍的简单数学模型,softmax回归(softmax regression),不会利用这些结构信息。

因此,在MNIST训练数据集中,mnist.train.images 是一个形状为 [60000, 784] 的张量,第一个维度数字用来索引图片,第二个维度数字用来索引每张图片中的像素点。在此张量里的每一个元素,都表示某张图片里的某个像素的强度值,值介于0和1之间。

相对应的MNIST数据集的标签是介于0到9的数字,用来描述给定图片里表示的数字。为了用于这个教程,我们使标签数据是"one-hot vectors"。一个one-hot向量除了某一位的数字是1以外其余各维度数字都是0。所以在此教程中,数字n将表示成一个只有在第n维度(从0开始)数字为1的10维向量。比如,标签0将表示成 [1,0,0,0,0,0,0,0,0,0]。因此,mnist.train.labels 是一个 [60000, 10] 的数字矩阵。
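导入数据之后,可以用下面几行代码直观地验证上述形状和 one-hot 标签(示意性质;实际上 read_data_sets 会另外划分出验证集,所以训练集行数可能是 55000 而不是 60000):

print mnist.train.images.shape    # 形如 (55000, 784) 的图片张量
print mnist.train.labels.shape    # 形如 (55000, 10) 的 one-hot 标签矩阵
print mnist.train.labels[0]       # 只有一个维度为 1,其余为 0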

现在,我们准备好可以开始构建我们的模型啦!

Softmax回归介绍

我们知道MNIST的每一张图片都表示一个数字,从0到9。我们希望得到给定图片代表每个数字的概率。比如说,我们的模型可能推测一张包含9的图片代表数字9的概率是80%但是判断它是8的概率是5%(因为8和9都有上半部分的小圆),然后给予它代表其他数字的概率更小的值。

这是一个使用softmax回归(softmax regression)模型的经典案例。softmax模型可以用来给不同的对象分配概率。即使在之后,我们训练更加精细的模型时,最后一步也需要用softmax来分配概率。

softmax回归(softmax regression)分两步:第一步

为了得到一张给定图片属于某个特定数字类的证据(evidence),我们对图片像素值进行加权求和。如果这个像素具有很强的证据说明这张图片不属于该类,那么相应的权值为负数,相反如果这个像素拥有有利的证据支持这张图片属于这个类,那么权值是正数。

下面的图片显示了一个模型学习到的图片上每个像素对于特定数字类的权值。红色代表负数权值,蓝色代表正数权值。

我们也需要加入一个额外的偏置量(bias),因为输入往往会带有一些无关的干扰量。因此对于给定的输入图片 x,它代表的是数字 i 的证据可以表示为 \(\text{evidence}_i = \sum_j W_{i,j}\,x_j + b_i\)

其中 \(W_{i,j}\) 代表权重,\(b_i\) 代表数字 i 类的偏置量,j 代表给定图片 x 的像素索引用于像素求和。然后用softmax函数可以把这些证据转换成概率 y:\(y = \text{softmax}(\text{evidence})\)

这里的softmax可以看成是一个激励(activation)函数或者链接(link)函数,把我们定义的线性函数的输出转换成我们想要的格式,也就是关于10个数字类的概率分布。因此,给定一张图片,它对于每一个数字的吻合度可以被softmax函数转换成为一个概率值。softmax函数可以定义为:\(\text{softmax}(x) = \text{normalize}(\exp(x))\)

展开等式右边的子式,可以得到:\(\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}\)

但是更多的时候把softmax模型函数定义为前一种形式:把输入值当成幂指数求值,再正则化这些结果值。这个幂运算表示,更大的证据对应更大的假设模型(hypothesis)里面的乘数权重值。反之,拥有更少的证据意味着在假设模型里面拥有更小的乘数系数。假设模型里的权值不可以是0值或者负值。Softmax然后会正则化这些权重值,使它们的总和等于1,以此构造一个有效的概率分布。(更多的关于Softmax函数的信息,可以参考Michael Nieslen的书里面的这个部分,其中有关于softmax的可交互式的可视化解释。)
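按上面的定义,可以用 numpy 写一个示意性的 softmax 实现(非教程原文),验证它确实把任意一组证据归一化成和为 1 的概率分布:

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))    # 先减去最大值避免数值溢出,不影响结果
    return e / np.sum(e)

evidence = np.array([2.0, 1.0, 0.1])
print softmax(evidence)          # 各分量都在 0 到 1 之间,且总和为 1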

对于softmax回归模型可以用下面的图解释,对于输入的xs加权求和,再分别加上一个偏置量,最后再输入到softmax函数中:

如果把它写成一个等式,我们可以得到:\(y_i = \text{softmax}\big(\sum_j W_{i,j}\,x_j + b_i\big)\)

我们也可以用向量表示这个计算过程:用矩阵乘法和向量相加。这有助于提高计算效率。(也是一种更有效的思考方式)

更进一步,可以写成更加紧凑的方式:\(y = \text{softmax}(Wx + b)\)

实现回归模型

为了用python实现高效的数值计算,我们通常会使用函数库,比如NumPy,会把类似矩阵乘法这样的复杂运算使用其他外部语言实现。不幸的是,从外部计算切换回Python的每一个操作,仍然是一个很大的开销。如果你用GPU来进行外部计算,这样的开销会更大。用分布式的计算方式,也会花费更多的资源用来传输数据。

TensorFlow也把复杂的计算放在python之外完成,但是为了避免前面说的那些开销,它做了进一步完善。Tensorflow不单独地运行单一的复杂计算,而是让我们可以先用图描述一系列可交互的计算操作,然后全部一起在Python之外运行。(这样类似的运行方式,可以在不少的机器学习库中看到。)

使用TensorFlow之前,首先导入它:

import tensorflow as tf

我们通过操作符号变量来描述这些可交互的操作单元,可以用下面的方式创建一个:

x = tf.placeholder("float", [None, 784])

x不是一个特定的值,而是一个占位符placeholder,我们在TensorFlow运行计算时输入这个值。我们希望能够输入任意数量的MNIST图像,每一张图展平成784维的向量。我们用2维的浮点数张量来表示这些图,这个张量的形状是[None,784 ]。(这里的None表示此张量的第一个维度可以是任何长度的。)

我们的模型也需要权重值和偏置量,当然我们可以把它们当做是另外的输入(使用占位符),但TensorFlow有一个更好的方法来表示它们:Variable 。 一个Variable代表一个可修改的张量,存在在TensorFlow的用于描述交互性操作的图中。它们可以用于计算输入值,也可以在计算中被修改。对于各种机器学习应用,一般都会有模型参数,可以用Variable表示。
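这里先按官方教程的写法,把权重 W 和偏置 b 创建为 Variable,初值设为全零:

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

定义好 W 和 b 之后,模型本身只需要一行代码: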

y = tf.nn.softmax(tf.matmul(x,W) + b)

首先,我们用tf.matmul(X,W)表示x乘以W,对应之前等式里面的 \(Wx\),这里x是一个2维张量拥有多个输入。然后再加上b,把和输入到tf.nn.softmax函数里面。

至此,我们先用了几行简短的代码来设置变量,然后只用了一行代码来定义我们的模型。TensorFlow不仅仅可以使softmax回归模型计算变得特别简单,它也用这种非常灵活的方式来描述其他各种数值计算,从机器学习模型对物理学模拟仿真模型。一旦被定义好之后,我们的模型就可以在不同的设备上运行:计算机的CPU,GPU,甚至是手机!

训练模型

为了训练我们的模型,我们首先需要定义一个指标来评估这个模型是好的。其实,在机器学习中,我们通常定义指标来表示一个模型是坏的,这个指标称为成本(cost)或损失(loss),然后尽量最小化这个指标。但是,这两种方式是相同的。

一个非常常见的,非常漂亮的成本函数是"交叉熵"(cross-entropy)。交叉熵产生于信息论里面的信息压缩编码技术,但是它后来演变成为从博弈论到机器学习等其他领域里的重要技术手段。它的定义如下:\(H_{y'}(y) = -\sum_i y'_i \log(y_i)\)

y 是我们预测的概率分布, y' 是实际的分布(我们输入的one-hot vector)。比较粗糙的理解是,交叉熵是用来衡量我们的预测用于描述真相的低效性。更详细的关于交叉熵的解释超出本次试验的范畴,但是很有必要好好理解它

为了计算交叉熵,我们首先需要添加一个新的占位符用于输入正确值:

y_ = tf.placeholder("float", [None,10])

然后我们可以用 \(-\sum_i y'_i \log(y_i)\) 计算交叉熵:

cross_entropy = -tf.reduce_sum(y_*tf.log(y))

首先,用 tf.log 计算 y 的每个元素的对数。接下来,我们把 y_ 的每一个元素和 tf.log(y) 的对应元素相乘。最后,用 tf.reduce_sum 计算张量的所有元素的总和。(注意,这里的交叉熵不仅仅用来衡量单一的一对预测和真实值,而是所有100幅图片的交叉熵的总和。对于100个数据点的预测表现,比单一数据点的表现能更好地描述我们的模型的性能。)
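可以用 numpy 写一个小例子(示意性质)验证交叉熵的计算方式:

import numpy as np

y_true = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], dtype=np.float32)   # one-hot 真实标签
y_pred = np.full(10, 0.1, dtype=np.float32)                           # 假设模型输出均匀分布的预测
print -np.sum(y_true * np.log(y_pred))                                 # 约等于 2.3026,即 -log(0.1)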

现在我们知道我们需要我们的模型做什么啦,用TensorFlow来训练它是非常容易的。因为TensorFlow拥有一张描述你各个计算单元的图,它可以自动地使用反向传播算法(backpropagation algorithm)来有效地确定你的变量是如何影响你想要最小化的那个成本值的。然后,TensorFlow会用你选择的优化算法来不断地修改变量以降低成本。

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

在这里,我们要求TensorFlow用梯度下降算法(gradient descent algorithm)以0.01的学习速率最小化交叉熵。梯度下降算法(gradient descent algorithm)是一个简单的学习过程,TensorFlow只需将每个变量一点点地往使成本不断降低的方向移动。当然TensorFlow也提供了其他许多优化算法:只要简单地调整一行代码就可以使用其他的算法。

TensorFlow在这里实际上所做的是,它会在后台给描述你的计算的那张图里面增加一系列新的计算操作单元用于实现反向传播算法和梯度下降算法。然后,它返回给你的只是一个单一的操作,当运行这个操作时,它用梯度下降算法训练你的模型,微调你的变量,不断减少成本。

现在,我们已经设置好了我们的模型。在运行计算之前,我们需要添加一个操作来初始化我们创建的变量:

init = tf.initialize_all_variables()

现在我们可以在一个Session里面启动我们的模型,并且初始化变量:

sess = tf.Session()
sess.run(init)

然后开始训练模型,这里我们让模型循环训练1000次!

for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

该循环的每个步骤中,我们都会随机抓取训练数据中的100个批处理数据点,然后我们用这些数据点作为参数替换之前的占位符来运行train_step

使用一小部分的随机数据来进行训练被称为随机训练(stochastic training)- 在这里更确切的说是随机梯度下降训练。在理想情况下,我们希望用我们所有的数据来进行每一步的训练,因为这能给我们更好的训练结果,但显然这需要很大的计算开销。所以,每一次训练我们可以使用不同的数据子集,这样做既可以减少计算开销,又可以最大化地学习到数据集的总体特性。

评估我们的模型

那么我们的模型性能如何呢?

首先让我们找出那些预测正确的标签。tf.argmax是一个非常有用的函数,它能给出某个tensor对象在某一维上其数据最大值所在的索引值。由于标签向量是由0,1组成,因此最大值1所在的索引位置就是类别标签,比如tf.argmax(y,1)返回的是模型对于任一输入x预测到的标签值,而 tf.argmax(y_,1) 代表正确的标签,我们可以用 tf.equal 来检测我们的预测是否与真实标签匹配(索引位置一样表示匹配)。

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

这行代码会给我们一组布尔值。为了确定正确预测项的比例,我们可以把布尔值转换成浮点数,然后取平均值。例如,[True, False, True, True] 会变成 [1,0,1,1] ,取平均值后得到 0.75.

accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

最后,我们计算所学习到的模型在测试数据集上面的正确率。

print sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})

这个最终结果值应该大约是91%。

这个结果好吗?嗯,并不太好。事实上,这个结果是很差的。这是因为我们仅仅使用了一个非常简单的模型。不过,做一些小小的改进,我们就可以得到97%的正确率。最好的模型甚至可以获得超过99.7%的准确率!(想了解更多信息,可以看看这个关于各种模型的性能对比列表。)
附件:input_data

2017/2/23 posted in  基础知识