Caffe模型

我们之前学习过,一个完整的深度学习系统最核心的两个方面是数据和模型。今天我们主要关注模型。一个深度学习模型通常由三部分参数组成:

  • 可学习参数(Learnable Parameter),又称可训练参数、神经网络权系数、权重,其数值由模型初始化参数、误差反向传播过程控制,一般不可人工干预。
  • 结构参数(Architecture Parameter),包括卷积层/全连接层/下采样层数目、卷积核数目、卷积核大小等描述网络结构的参数,一旦设定好,在网络训练阶段不能更改;值得注意的是,训练阶段的网络结构参数和预测阶段的结构参数很可能不同。
  • 训练超参数(Hyper-Parameter),用来控制网络训练收敛的参数,训练阶段可以自动或手动调节以获得更好的效果,预测阶段则不需要该参数。

在Caffe中,一个模型的三部分参数分别由不同模块定义和实现:

  • 可学习参数在内存中使用Blob对象保持,必要时以二进制ProtoBuffer文件(*.caffemodel)形态序列化并存储于磁盘上,便于进一步微调(finetune,又称精调)、共享(例如参数服务器Parameter Server, PS)、性能评估(benchmark)。
  • 结构参数使用ProtoBuffer文本格式(*.prototxt)描述,网络初始化时通过该描述文件构建Net对象、Layer对象形成有向无环图结构,在Layer与Layer之间、Net输入源和输出阱均为持有数据和中间结果的Blob对象。
  • 训练超参数同样使用ProtoBuffer文本格式(*.prototxt)描述,训练阶段利用该描述文件构建求解器(Solver)对象,该对象按照一定规则在训练网络时自动调节这些超参数值。
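把上面三部分与代码对应起来,可以写出如下一个最小的C++示意(仅作示意,文件名沿用本文MNIST例子,并假设已编译好Caffe并正确链接):

#include <caffe/caffe.hpp>
using namespace caffe;

int main() {
  // 结构参数:从*.prototxt文本描述构建Net对象
  Net<float> net("examples/mnist/lenet_lr.prototxt", TEST);

  // 可学习参数:从*.caffemodel二进制文件反序列化权值,填入Net中的权值Blob
  net.CopyTrainedLayersFrom("examples/mnist/lenet_iter_10000.caffemodel");

  // 训练超参数:由求解器描述文件解析得到SolverParameter,训练时交给Solver使用
  SolverParameter solver_param;
  ReadSolverParamsFromTextFileOrDie("examples/mnist/lenet_lr_solver.prototxt",
                                    &solver_param);
  return 0;
}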

我们在MNIST例子中对LeNet-5模型稍微修改一下,变成逻辑回归(Logistic Regression, LR)分类器。

复制一份examples/mnist/lenet_train_test.prototxt,重命名为 lenet_lr.prototxt,修改内容如下:


name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "ip"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 20
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}

复制一份examples/mnist/lenet_solver.prototxt,重命名为lenet_lr_solver.prototxt,修改内容
如下:


# The train/test net protocol buffer definition
net: "examples/mnist/lenet_lr.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU

然后运行训练命令,在命令行输入:

./build/tools/caffe train --solver=examples/mnist/lenet_lr_solver.prototxt

但是发现报错了:

通过上述错误描述,发现是缺少lmdb数据文件的问题。

运行./examples/mnist/create_mnist.sh脚本,将之前下载过的数据转换成lmdb格式,中间的报错和解决如截图所示:

这样我们就成功获得了lmdb文件。

再次执行训练命令:

master) ✗ ./build/tools/caffe train --solver=examples/mnist/lenet_lr_solver.prototxt

然后就发现已经开始训练了.

最后得到结果如图所示:

经过训练,可以获得在测试集上分类准确率为0.9908的模型。相比LeNet-5而言准确率降低了,这也符合直觉,因为将模型简化后参数变少,层数变少,网络表达能力变差。我们今天不关注准确率,只关注模型的表达方式。

内存中的表示

从运行的log可以追踪模型是如何从prototxt描述变为内存中的表示的。

注意看这几行:


 Creating training net from net file:
examples/mnist/lenet_lr.prototxt

// ...不要在意这些细节
Initializing net from parameters:

追踪solver.cpp的第87行,看到如下代码:


//前面省略..
//在solver.hpp中声明了SolverParameter param_
//它是ProtoBuffer工具生成的结构体,用来解析lenet_lr_solver.prototxt
if (param_.has_net()) {
    LOG_IF(INFO, Caffe::root_solver()) //打印log
        //这里param_.net()会返回examples/mnist/lenet_lr.prototxt 
        << "Creating training net from net file: " << param_.net();
    ReadNetParamsFromTextFileOrDie(param_.net(), &net_param);
  }
  

磁盘上的表示

Caffe使用的ProtoBuffer二进制文件具有最小的文件尺寸,并由ProtoBuffer工具自动生成高效的序列化/反序列化接口(支持多种语言,包括C++、Java、Python);同时还提供可读性好、与二进制文件兼容的文本格式。
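借助这些自动生成的接口,可以很方便地在二进制与文本两种格式之间转换。下面是一个最小示意(文件名仅作示意),把一个*.caffemodel读入内存后,分别写成文本格式和二进制格式:

#include <caffe/caffe.hpp>
using namespace caffe;

int main() {
  NetParameter net_param;
  // 从二进制ProtoBuffer文件读入(反序列化)
  ReadNetParamsFromBinaryFileOrDie("lenet_iter_10000.caffemodel", &net_param);
  // 写成可读性好的文本格式,便于人工查看(权值很多时文件会很大)
  WriteProtoToTextFile(net_param, "lenet_iter_10000.prototxt.dump");
  // 也可以再写回尺寸最小的二进制格式
  WriteProtoToBinaryFile(net_param, "lenet_iter_10000_copy.caffemodel");
  return 0;
}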

我们仍然从运行log查找线索:

Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel

Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate

其中,.caffemodel文件是在特定训练间隙保存的二进制文件,包含当前网络各层的权值状态;而.solverstate是与.caffemodel一起产生的二进制文件,包含从上次停止点恢复训练模型所需的信息。我们具体看下列代码:

追踪solver.cpp的第445行,上下文信息如下所示:


template <typename Dtype>
string Solver<Dtype>::SnapshotToBinaryProto() {
//得到模型文件名
  string model_filename = SnapshotFilename(".caffemodel");
  LOG(INFO) << "Snapshotting to binary proto file " << model_filename;
  NetParameter net_param;
  //将net_转换为NetParameter
  net_->ToProto(&net_param, param_.snapshot_diff());
  //写入ProtoBuffer二进制文件,这里是lenet_iter_10000.caffemodel
  WriteProtoToBinaryFile(net_param, model_filename);
  return model_filename;
}

追踪sgd_solver.cpp的第259行:


template <typename Dtype>
void SGDSolver<Dtype>::SnapshotSolverStateToBinaryProto(
    const string& model_filename) {
  SolverState state;    //创建一个序列化对象
  state.set_iter(this->iter_);  //记录当前的迭代次数
  state.set_learned_net(model_filename); //记录对应的*.caffemodel文件名
  state.set_current_step(this->current_step_);  //记录当前步进值
  state.clear_history();    //清空容器,准备接纳新内容
  for (int i = 0; i < history_.size(); ++i) {
    // Add history 记录权值的历史信息
    BlobProto* history_blob = state.add_history();
    history_[i]->ToProto(history_blob);
  }
  string snapshot_filename = Solver<Dtype>::SnapshotFilename(".solverstate");
  LOG(INFO)
    << "Snapshotting solver state to binary proto file " << snapshot_filename;
    //将SolverState对象写入二进制文件(*.solverstate)
  WriteProtoToBinaryFile(state, snapshot_filename.c_str());
}

从磁盘上将模型、求解器状态文件载入内存的过程与上面代码刚好相反,我们可自行跟踪阅读。
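作为补充,下面给出“反过来”从磁盘恢复训练的大致写法(文件名仅作示意),命令行工具caffe train的--snapshot参数内部做的就是类似的事情:

#include <caffe/caffe.hpp>
using namespace caffe;

int main() {
  // 解析求解器描述文件
  SolverParameter solver_param;
  ReadSolverParamsFromTextFileOrDie("examples/mnist/lenet_lr_solver.prototxt",
                                    &solver_param);
  // 创建求解器对象
  Solver<float>* solver = SolverRegistry<float>::CreateSolver(solver_param);
  // 从*.solverstate恢复迭代次数、历史梯度等信息(其中记录了对应的*.caffemodel)
  solver->Restore("examples/mnist/lenet_iter_5000.solverstate");
  // 在恢复出的状态上继续训练
  solver->Solve();
  delete solver;
  return 0;
}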

Caffe Model Zoo

对于前面我们运行的简单模型,可以从头训练(from scratch)。然而,对于规模更大、结构更复杂的模型,从头训练需要解决两个问题。首先是硬件计算能力:模型训练十分消耗计算资源,使用普通计算机需要相当长的时间,不经济;而且如果世界上每个研究机构都从头训练,重复性工作太多,也不环保。其次是调参能力:同样的模型设计,可能每个人训练出的结果都不一致,中间调参是项技术活,控制不当会引起训练发散或训练不充分,无法达到理想的分类效果。

为了解决上述问题,Caffe Model Zoo提供了一个分享模型的平台,世界各地的研究人员都可以把自己的训练成果共享给社区中更多的人使用,节省人力、物力。

今天我们也站在前人的肩膀上,运行一个基于已训练模型的图片分类例程。我们首先需要下载几个文件。

下载meta数据到当前目录:

➜  caffe git:(master) ✗ cd data/ilsvrc12

➜  ilsvrc12 git:(master) ✗ ./get_ilsvrc_aux.sh

Downloading...
--2017-06-29 10:54:55--  http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz
Resolving dl.caffe.berkeleyvision.org... 169.229.222.251
Connecting to dl.caffe.berkeleyvision.org|169.229.222.251|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://202.114.49.110/cache/9/02/berkeleyvision.org/6b5ff42be9dd0690a814318a14401a7f/caffe_ilsvrc12.tar.gz [following]
--2017-06-29 10:54:56--  http://202.114.49.110/cache/9/02/berkeleyvision.org/6b5ff42be9dd0690a814318a14401a7f/caffe_ilsvrc12.tar.gz
Connecting to 202.114.49.110:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17858008 (17M) [application/octet-stream]
Saving to: ‘caffe_ilsvrc12.tar.gz’

caffe_ilsvrc12.tar. 100%[===================>]  17.03M  9.67MB/s    in 1.8s

2017-06-29 10:54:58 (9.67 MB/s) - ‘caffe_ilsvrc12.tar.gz’ saved [17858008/17858008]

Unzipping...
Done.

下载caffenet模型:

➜  ilsvrc12 git:(master) ✗ cd ../../models/bvlc_reference_caffenet

➜  bvlc_reference_caffenet git:(master) ✗ wget http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel
--2017-06-29 11:14:10--  http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel
Resolving dl.caffe.berkeleyvision.org... 169.229.222.251
Connecting to dl.caffe.berkeleyvision.org|169.229.222.251|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 243862418 (233M) [application/octet-stream]
Saving to: ‘bvlc_reference_caffenet.caffemodel’

bvlc_reference_caffen 100%[=========================>] 232.56M   129KB/s    in 21m 50s

2017-06-29 11:36:01 (182 KB/s) - ‘bvlc_reference_caffenet.caffemodel’ saved [243862418/243862418]

回到根目录执行:


➜  caffe git:(master) ✗ ./build/examples/cpp_classification/classification.bin \
models/bvlc_reference_caffenet/deploy.prototxt \
models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
data/ilsvrc12/imagenet_mean.binaryproto \
data/ilsvrc12/synset_words.txt \
examples/images/cat.jpg


发现报错:

执行:


➜  caffe git:(master) ✗ install_name_tool -add_rpath '/Users/liangzhonghao/anaconda2/lib'  /usr/local/Cellar/caffe/./build/examples/cpp_classification/classification.bin

再次运行上面命令,得出结果:

命令行解释如下:

➜  caffe git:(master) ✗ ./build/examples/cpp_classification/classification.bin \        //二进制程序名
models/bvlc_reference_caffenet/deploy.prototxt \    //模型描述文件
models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \        //*.caffemodel模型权值文件
data/ilsvrc12/imagenet_mean.binaryproto \       //图像均值文件
data/ilsvrc12/synset_words.txt \    //图像类别标签信息
examples/images/cat.jpg   //输入待分类图像

打开输入图像examples/images/cat.jpg:

cat

命令行输出的预测结果为:

可见给出了5个预测结果,按照概率从高到低的顺序排列。这种预测结果称为Top-5预测结果,对当前样本而言,分类正确的概率为这5项之和。除Top-5预测结果之外,还有Top-3、Top-1等预测结果,对当前样本的分类正确率分别为0.6749、0.3134。
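为了说明Top-K预测是怎样从网络输出的概率向量中得到的,下面给出一个与Caffe无关的小例子(纯属示意,classification.bin内部的思路与此类似):按概率做部分排序,取出前K个类别。

#include <algorithm>
#include <cstddef>
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// 从概率向量中取出概率最高的前k个(概率, 类别下标)对
std::vector<std::pair<float, int> > TopK(const std::vector<float>& probs,
                                         std::size_t k) {
  std::vector<std::pair<float, int> > scored;
  for (std::size_t i = 0; i < probs.size(); ++i)
    scored.push_back(std::make_pair(probs[i], static_cast<int>(i)));
  k = std::min(k, scored.size());
  // 只需部分排序出前k个,按概率从高到低
  std::partial_sort(scored.begin(), scored.begin() + k, scored.end(),
                    std::greater<std::pair<float, int> >());
  scored.resize(k);
  return scored;
}

int main() {
  // 假设的5类输出概率,仅作演示
  float p[] = {0.05f, 0.6749f, 0.02f, 0.3134f, 0.01f};
  std::vector<float> probs(p, p + 5);
  std::vector<std::pair<float, int> > top = TopK(probs, 3);
  for (std::size_t i = 0; i < top.size(); ++i)
    std::cout << "class " << top[i].second << " : " << top[i].first << std::endl;
  return 0;
}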

分类准确率不仅与验证数据集有关,与模型的关系也非常密切。我们在Caffe Model Zoo上找到几个模型在ILSVRC 2012验证数据集上的分类效果,如图所示。

可见单模型分类性能最好的是BVLC GoogLeNet。

掌握了上面的内容之后,就可以进一步学习其他更多深度学习模型的设计和训练方法了。

2017/6/28 posted in  Caffe模型

数据转换器

Caffe的数据变换器(DataTransformer)主要提供了对原始输入图像的预处理方法,包括随机切块、随机镜像、幅度缩放、去均值、灰度/色度变换等。相信熟悉图像处理、OpenCV的读者对上述操作并不陌生。

数据结构描述

message TransformationParameter {
  // For data pre-processing, we can do simple scaling and subtracting the
  // data mean, if provided. Note that the mean subtraction is always carried
  // out before scaling.
  //像素幅度缩放参数,默认为1,即不缩放
  optional float scale = 1 [default = 1];
  // Specify if we want to randomly mirror data.
  //图像随机镜像开关,默认为false,即不进行镜像操作
  optional bool mirror = 2 [default = false];
  // Specify if we would like to randomly crop an image.
  //图像随机切块的大小,默认为0,即不进行切块操作
  optional uint32 crop_size = 3 [default = 0];
  // mean_file and mean_value cannot be specified at the same time(存储图像均值的文件)
  optional string mean_file = 4;
  // if specified can be repeated once (would subtract it from all the channels)
  // or can be repeated the same number of times as channels
  // (would subtract them from the corresponding channel)
  //均值数值,无须读取文件。若数目与图像通道数相等,则每个图像通道分别减去对应的均值;如果只给出一个值,则每个图像通道都减去同一个均值
  repeated float mean_value = 5;
  // Force the decoded image to have 3 color channels.
  //强制为三通道彩色图像输入
  optional bool force_color = 6 [default = false];
  // Force the decoded image to have 1 color channels.
  //强制为单通道灰度图像输入
  optional bool force_gray = 7 [default = false];
}

数据变换器的实现

数据变换器的声明头文件位于include/caffe/data_transformer.hpp中,如果需要单独使用该模块,应包含这个头文件。文件内容如下:

#ifndef CAFFE_DATA_TRANSFORMER_HPP
#define CAFFE_DATA_TRANSFORMER_HPP

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

/**
 * @brief Applies common transformations to the input data, such as
 * scaling, mirroring, substracting the image mean...
 */
 //DataTransformer类声明
template <typename Dtype>
class DataTransformer {
 public:
 //显式构造函数
  explicit DataTransformer(const TransformationParameter& param, Phase phase);
  //析构函数
  virtual ~DataTransformer() {}

  /**
   * @brief Initialize the Random number generations if needed by the
   *    transformation.
   */
   //初始化随机数种子函数
  void InitRand();

  /**
   * @brief Applies the transformation defined in the data layer's
   * transform_param block to the data.
   *
   * @param datum
   *    Datum containing the data to be transformed.
   * @param transformed_blob
   *    This is destination blob. It can be part of top blob's data if
   *    set_cpu_data() is used. See data_layer.cpp for an example.
   */
   //下面几种函数重载,以适应多种输入数据源
  void Transform(const Datum& datum, Blob<Dtype>* transformed_blob);

  /**
   * @brief Applies the transformation defined in the data layer's
   * transform_param block to a vector of Datum.
   *
   * @param datum_vector
   *    A vector of Datum containing the data to be transformed.
   * @param transformed_blob
   *    This is destination blob. It can be part of top blob's data if
   *    set_cpu_data() is used. See memory_layer.cpp for an example.
   */
  void Transform(const vector<Datum> & datum_vector,
                Blob<Dtype>* transformed_blob);

#ifdef USE_OPENCV
  /**
   * @brief Applies the transformation defined in the data layer's
   * transform_param block to a vector of Mat.
   *
   * @param mat_vector
   *    A vector of Mat containing the data to be transformed.
   * @param transformed_blob
   *    This is destination blob. It can be part of top blob's data if
   *    set_cpu_data() is used. See memory_layer.cpp for an example.
   */
  void Transform(const vector<cv::Mat> & mat_vector,
                Blob<Dtype>* transformed_blob);

  /**
   * @brief Applies the transformation defined in the data layer's
   * transform_param block to a cv::Mat
   *
   * @param cv_img
   *    cv::Mat containing the data to be transformed.
   * @param transformed_blob
   *    This is destination blob. It can be part of top blob's data if
   *    set_cpu_data() is used. See image_data_layer.cpp for an example.
   */
  void Transform(const cv::Mat& cv_img, Blob<Dtype>* transformed_blob);
#endif  // USE_OPENCV

  /**
   * @brief Applies the same transformation defined in the data layer's
   * transform_param block to all the num images in a input_blob.
   *
   * @param input_blob
   *    A Blob containing the data to be transformed. It applies the same
   *    transformation to all the num images in the blob.
   * @param transformed_blob
   *    This is destination blob, it will contain as many images as the
   *    input blob. It can be part of top blob's data.
   */
  void Transform(Blob<Dtype>* input_blob, Blob<Dtype>* transformed_blob);


 //获取执行Transform后的输出Blob形状
  /**
   * @brief Infers the shape of transformed_blob will have when
   *    the transformation is applied to the data.
   *
   * @param datum
   *    Datum containing the data to be transformed.
   */
  vector<int> InferBlobShape(const Datum& datum);
  /**
   * @brief Infers the shape of transformed_blob will have when
   *    the transformation is applied to the data.
   *    It uses the first element to infer the shape of the blob.
   *
   * @param datum_vector
   *    A vector of Datum containing the data to be transformed.
   */
  vector<int> InferBlobShape(const vector<Datum> & datum_vector);
  /**
   * @brief Infers the shape of transformed_blob will have when
   *    the transformation is applied to the data.
   *    It uses the first element to infer the shape of the blob.
   *
   * @param mat_vector
   *    A vector of Mat containing the data to be transformed.
   */
#ifdef USE_OPENCV
  vector<int> InferBlobShape(const vector<cv::Mat> & mat_vector);
  /**
   * @brief Infers the shape of transformed_blob will have when
   *    the transformation is applied to the data.
   *
   * @param cv_img
   *    cv::Mat containing the data to be transformed.
   */
  vector<int> InferBlobShape(const cv::Mat& cv_img);
#endif  // USE_OPENCV

 protected:
   /**
   * @brief Generates a random integer from Uniform({0, 1, ..., n-1}).
   *
   * @param n
   *    The upperbound (exclusive) value of the random number.
   * @return
   *    A uniformly random integer value from ({0, 1, ..., n-1}).
   */
   //产生取值{0, 1, ..., n-1}的随机整数,服从均匀分布
  virtual int Rand(int n);

  void Transform(const Datum& datum, Dtype* transformed_data);
  // Tranformation parameters(变换参数,该数据结构由ProtoBuffer工具自动生成)
  TransformationParameter param_;

//随机数生成器,声明在include/caffe/common.hpp中
  shared_ptr<Caffe::RNG> rng_;
//当前运行阶段,可能为TRAIN或TEST。阶段不同,执行变换会有差异
  Phase phase_;
//均值图像,用于从均值文件中读取
  Blob<Dtype> data_mean_;
//均值数值,用于从param_中提取
  vector<Dtype> mean_values_;
};

}  // namespace caffe

#endif  // CAFFE_DATA_TRANSFORMER_HPP_
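前面提到,如果需要单独使用该模块,只需包含上面这个头文件。下面给出一个最小的使用示意(参数取值和输入数据都是假设的):构造变换参数与变换器,对一个伪造的Datum做切块、镜像和幅度缩放。

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/data_transformer.hpp"
using namespace caffe;

int main() {
  // 组装变换参数:幅度缩放到[0,1)、28x28随机切块、随机镜像(取值仅作示意)
  TransformationParameter param;
  param.set_scale(0.00390625f);
  param.set_crop_size(28);
  param.set_mirror(true);

  // 以TRAIN阶段构造变换器,并初始化随机数种子
  DataTransformer<float> transformer(param, TRAIN);
  transformer.InitRand();

  // 伪造一个1 x 32 x 32的Datum作为输入
  Datum datum;
  datum.set_channels(1);
  datum.set_height(32);
  datum.set_width(32);
  datum.mutable_data()->assign(32 * 32, static_cast<char>(128));

  // 先推断输出Blob形状,再执行变换
  std::vector<int> shape = transformer.InferBlobShape(datum);
  Blob<float> out(shape);
  transformer.Transform(datum, &out);
  return 0;
}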

数据变化器的实现文件位于src/caffe/data_transformer.cpp,我们来深入阅读一下。

#ifdef USE_OPENCV
#include <opencv2/core/core.hpp>
#endif  // USE_OPENCV

#include <string>
#include <vector>

#include "caffe/data_transformer.hpp"
#include "caffe/util/io.hpp"
#include "caffe/util/math_functions.hpp"
#include "caffe/util/rng.hpp"

namespace caffe {
//构造函数
template<typename Dtype>
DataTransformer<Dtype>::DataTransformer(const TransformationParameter& param,
    Phase phase)
    : param_(param), phase_(phase) {        //初始化param_和phase_
  // check if we want to use mean_file(查看是否使用均值文件)
  if (param_.has_mean_file()) {
  //如果既指定了均值文件,又指定了均值数值,则报错,两者只能选其一
    CHECK_EQ(param_.mean_value_size(), 0) <<
      "Cannot specify mean_file and mean_value at the same time";
    const string& mean_file = param.mean_file();    //获取均值文件名
    if (Caffe::root_solver()) {
      LOG(INFO) << "Loading mean file from: " << mean_file;
    }
    //从均值文件中读取数据到blob_proto对象中
    BlobProto blob_proto;
    ReadProtoFromBinaryFileOrDie(mean_file.c_str(), &blob_proto);
    //从blob_proto将均值反序列化到data_mean_内存中 
    data_mean_.FromProto(blob_proto);
  }
  // check if we want to use mean_value(均值数值)
  if (param_.mean_value_size() > 0) {
    CHECK(param_.has_mean_file() == false) <<
      "Cannot specify mean_file and mean_value at the same time";
    for (int c = 0; c < param_.mean_value_size(); ++c) {
      mean_values_.push_back(param_.mean_value(c));//从param_中读取均值数值,不再读取均值文件
    }
  }
}
//变换函数,从众多重载函数中,我们选择一个重点讲解,其他的计算流程都类似
//下面函数使用了Datum作为输入,这个结构体我们可以从caffe.proto中一窥究竟
/*
    // Datum用来从LMDB/LEVELDB中读取数据,或将数据写入LMDB/LEVELDB,和BlobProto有相似的功能;只是BlobProto用于模型权值的序列化/反序列化,而Datum专为数据或特征图(feature map)提供序列化/反序列化服务。
message Datum {
 //数据维度信息,channels × height × width
  optional int32 channels = 1;
  optional int32 height = 2;
  optional int32 width = 3;
  // the actual image data, in bytes(图像数据,以字节类型存储)
  optional bytes data = 4;
  //标签数据,统一用int32类型存储
  optional int32 label = 5;
  // Optionally, the datum could also hold float data.(可选,图像数据也可以用float类型存储 )
  repeated float float_data = 6;
  // If true data contains an encoded image that need to be decoded(是否为编码数据,默认不是)
  optional bool encoded = 7 [default = false];
}
*/

//下面函数输入为Datum,输出为数据指针 
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const Datum& datum,
                                       Dtype* transformed_data) {
  //获得datum数据字串、维度信息
  const string& data = datum.data();
  const int datum_channels = datum.channels();
  const int datum_height = datum.height();
  const int datum_width = datum.width();

//读取预处理参数,如切块大小、幅度缩放、随机镜像、图像均值等
  const int crop_size = param_.crop_size();
  const Dtype scale = param_.scale();
  const bool do_mirror = param_.mirror() && Rand(2);
  const bool has_mean_file = param_.has_mean_file();
  const bool has_uint8 = data.size() > 0;
  const bool has_mean_values = mean_values_.size() > 0;

  CHECK_GT(datum_channels, 0);  //保证输入数据通道数大于0
  CHECK_GE(datum_height, crop_size);    //保证输入数据的宽和高不小于切块大小
  CHECK_GE(datum_width, crop_size);
//获得图像均值
  Dtype* mean = NULL;
  if (has_mean_file) {  //若指定了图像均值文件
  //保证图像均值的维度与输入图像数据的维度完全相同 
    CHECK_EQ(datum_channels, data_mean_.channels());
    CHECK_EQ(datum_height, data_mean_.height());
    CHECK_EQ(datum_width, data_mean_.width());
    mean = data_mean_.mutable_cpu_data(); //夺取图像均值数据控制权
  }
  if (has_mean_values) {    //若没有指定图像均值文件,而是直接给出数值
  //保证均值数值维度为1,或与输入图像数据的channels数目相同
    CHECK(mean_values_.size() == 1 || mean_values_.size() == datum_channels) <<
     "Specify either 1 mean_value or as many as channels: " << datum_channels;
    if (datum_channels > 1 && mean_values_.size() == 1) {
      // Replicate the mean_value for simplicity(若均值数值维度为1,而输入数据channels数目大于1,则重复该值channels次 )
      for (int c = 1; c < datum_channels; ++c) {
        mean_values_.push_back(mean_values_[0]);
      }
    }
  }
  //输入图像宽和高
  int height = datum_height;
  int width = datum_width;
  //开始图像切块
  int h_off = 0;
  int w_off = 0;
  if (crop_size) { //crop_size不为0,则进行切块;若为0表示不切块
    height = crop_size;
    width = crop_size;
    // We only do random crop when we do training.
    if (phase_ == TRAIN) {
      h_off = Rand(datum_height - crop_size + 1); //切块的 height偏移量
      w_off = Rand(datum_width - crop_size + 1);  //切块的 width 偏移量
    } else {
      h_off = (datum_height - crop_size) / 2;
      w_off = (datum_width - crop_size) / 2;
    }
  }

  Dtype datum_element;      //存放输入图像的像素值
  int top_index, data_index;    //分别存放输出index,输入index
  for (int c = 0; c < datum_channels; ++c) {
    for (int h = 0; h < height; ++h) {
      for (int w = 0; w < width; ++w) {
        data_index = (c * datum_height + h_off + h) * datum_width + w_off + w;
        if (do_mirror) {    //若需要镜像操作,则对输出index设置width反向
          top_index = (c * height + h) * width + (width - 1 - w);
        } else {
          top_index = (c * height + h) * width + w;
        }
        if (has_uint8) {    //若datum中使用uint8存储图像数据,需要转换为float
          datum_element =
            static_cast<Dtype>(static_cast<uint8_t>(data[data_index]));
        } else {
          datum_element = datum.float_data(data_index);
        }
        if (has_mean_file) {    //若指定了均值文件
          transformed_data[top_index] =
            (datum_element - mean[data_index]) * scale; // 执行去均值、幅度缩放
        } else {
          if (has_mean_values) {    //若指定了均值数值
            transformed_data[top_index] =
              (datum_element - mean_values_[c]) * scale;    //去均值,幅度缩放
          } else {
            transformed_data[top_index] = datum_element * scale;    //不去均值,只做幅度缩放
          }
        }
      }
    }
  }
}

//与上面函数类似,只是输出变为Blob
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const Datum& datum,
                                       Blob<Dtype>* transformed_blob) {
  // If datum is encoded, decode and transform the cv::image.(如果datum是经过编码的图像,则先解码 )
  if (datum.encoded()) {
#ifdef USE_OPENCV
    CHECK(!(param_.force_color() && param_.force_gray()))
        << "cannot set both force_color and force_gray";
    cv::Mat cv_img;
    if (param_.force_color() || param_.force_gray()) {
    // If force_color then decode in color otherwise decode in gray.
      cv_img = DecodeDatumToCVMat(datum, param_.force_color());
    } else {
      cv_img = DecodeDatumToCVMatNative(datum);
    }
    // Transform the cv::image into blob.(将cv::image变为Blob)
    return Transform(cv_img, transformed_blob);
#else
    LOG(FATAL) << "Encoded datum requires OpenCV; compile with USE_OPENCV.";
#endif  // USE_OPENCV
  } else {
    if (param_.force_color() || param_.force_gray()) {
      LOG(ERROR) << "force_color and force_gray only for encoded datum";
    }
  }

  const int crop_size = param_.crop_size();
  const int datum_channels = datum.channels();
  const int datum_height = datum.height();
  const int datum_width = datum.width();

  // Check dimensions.
  const int channels = transformed_blob->channels();
  const int height = transformed_blob->height();
  const int width = transformed_blob->width();
  const int num = transformed_blob->num();

  CHECK_EQ(channels, datum_channels);
  CHECK_LE(height, datum_height);
  CHECK_LE(width, datum_width);
  CHECK_GE(num, 1);

  if (crop_size) {
    CHECK_EQ(crop_size, height);
    CHECK_EQ(crop_size, width);
  } else {
    CHECK_EQ(datum_height, height);
    CHECK_EQ(datum_width, width);
  }

  Dtype* transformed_data = transformed_blob->mutable_cpu_data();
  Transform(datum, transformed_data);   //参数变换完毕,调用现有函数
}

//对一组datum进行变换
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const vector<Datum> & datum_vector,
                                       Blob<Dtype>* transformed_blob) {
  const int datum_num = datum_vector.size();
  const int num = transformed_blob->num();
  const int channels = transformed_blob->channels();
  const int height = transformed_blob->height();
  const int width = transformed_blob->width();

  CHECK_GT(datum_num, 0) << "There is no datum to add";
  CHECK_LE(datum_num, num) <<
    "The size of datum_vector must be no greater than transformed_blob->num()";
  Blob<Dtype> uni_blob(1, channels, height, width); //临时Blob
  //依次对每个datum进行变换,放入对应的Blob中
  for (int item_id = 0; item_id < datum_num; ++item_id) {
    int offset = transformed_blob->offset(item_id);
    uni_blob.set_cpu_data(transformed_blob->mutable_cpu_data() + offset);
    Transform(datum_vector[item_id], &uni_blob);
  }
}
//对一组输入cv::Mat对象进行变换,放入Blob中
#ifdef USE_OPENCV
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const vector<cv::Mat> & mat_vector,
                                       Blob<Dtype>* transformed_blob) {
  const int mat_num = mat_vector.size();
  const int num = transformed_blob->num();
  const int channels = transformed_blob->channels();
  const int height = transformed_blob->height();
  const int width = transformed_blob->width();

  CHECK_GT(mat_num, 0) << "There is no MAT to add";
  CHECK_EQ(mat_num, num) <<
    "The size of mat_vector must be equals to transformed_blob->num()";
  Blob<Dtype> uni_blob(1, channels, height, width);
  for (int item_id = 0; item_id < mat_num; ++item_id) {
    int offset = transformed_blob->offset(item_id);
    uni_blob.set_cpu_data(transformed_blob->mutable_cpu_data() + offset);
    Transform(mat_vector[item_id], &uni_blob);
  }
}

//对一个cv::Mat对象进行变换
template<typename Dtype>
void DataTransformer<Dtype>::Transform(const cv::Mat& cv_img,
                                       Blob<Dtype>* transformed_blob) {
  const int crop_size = param_.crop_size();
  const int img_channels = cv_img.channels();
  const int img_height = cv_img.rows;
  const int img_width = cv_img.cols;

  // Check dimensions.
  const int channels = transformed_blob->channels();
  const int height = transformed_blob->height();
  const int width = transformed_blob->width();
  const int num = transformed_blob->num();

  CHECK_EQ(channels, img_channels);
  CHECK_LE(height, img_height);
  CHECK_LE(width, img_width);
  CHECK_GE(num, 1);

  CHECK(cv_img.depth() == CV_8U) << "Image data type must be unsigned byte";

  const Dtype scale = param_.scale();
  const bool do_mirror = param_.mirror() && Rand(2);
  const bool has_mean_file = param_.has_mean_file();
  const bool has_mean_values = mean_values_.size() > 0;

  CHECK_GT(img_channels, 0);
  CHECK_GE(img_height, crop_size);
  CHECK_GE(img_width, crop_size);

  Dtype* mean = NULL;
  if (has_mean_file) {
    CHECK_EQ(img_channels, data_mean_.channels());
    CHECK_EQ(img_height, data_mean_.height());
    CHECK_EQ(img_width, data_mean_.width());
    mean = data_mean_.mutable_cpu_data();
  }
  if (has_mean_values) {
    CHECK(mean_values_.size() == 1 || mean_values_.size() == img_channels) <<
     "Specify either 1 mean_value or as many as channels: " << img_channels;
    if (img_channels > 1 && mean_values_.size() == 1) {
      // Replicate the mean_value for simplicity(复制均值数值,便于操作)
      for (int c = 1; c < img_channels; ++c) {
        mean_values_.push_back(mean_values_[0]);
      }
    }
  }

  int h_off = 0;
  int w_off = 0;
  cv::Mat cv_cropped_img = cv_img;
  if (crop_size) {
    CHECK_EQ(crop_size, height);
    CHECK_EQ(crop_size, width);
    // We only do random crop when we do training.
    if (phase_ == TRAIN) {
      h_off = Rand(img_height - crop_size + 1);
      w_off = Rand(img_width - crop_size + 1);
    } else {
      h_off = (img_height - crop_size) / 2;
      w_off = (img_width - crop_size) / 2;
    }
    cv::Rect roi(w_off, h_off, crop_size, crop_size);
    cv_cropped_img = cv_img(roi);
  } else {
    CHECK_EQ(img_height, height);
    CHECK_EQ(img_width, width);
  }

  CHECK(cv_cropped_img.data);

  Dtype* transformed_data = transformed_blob->mutable_cpu_data();
  int top_index;
  for (int h = 0; h < height; ++h) {
    const uchar* ptr = cv_cropped_img.ptr<uchar>(h);
    int img_index = 0;
    for (int w = 0; w < width; ++w) {
      for (int c = 0; c < img_channels; ++c) {
        if (do_mirror) {
          top_index = (c * height + h) * width + (width - 1 - w);
        } else {
          top_index = (c * height + h) * width + w;
        }
        // int top_index = (c * height + h) * width + w;
        Dtype pixel = static_cast<Dtype>(ptr[img_index++]);
        if (has_mean_file) {
          int mean_index = (c * img_height + h_off + h) * img_width + w_off + w;
          transformed_data[top_index] =
            (pixel - mean[mean_index]) * scale;
        } else {
          if (has_mean_values) {
            transformed_data[top_index] =
              (pixel - mean_values_[c]) * scale;
          } else {
            transformed_data[top_index] = pixel * scale;
          }
        }
      }
    }
  }
}
#endif  // USE_OPENCV

//输入是Blob,输出也是Blob
template<typename Dtype>
void DataTransformer<Dtype>::Transform(Blob<Dtype>* input_blob,
                                       Blob<Dtype>* transformed_blob) {
  const int crop_size = param_.crop_size();
  const int input_num = input_blob->num();
  const int input_channels = input_blob->channels();
  const int input_height = input_blob->height();
  const int input_width = input_blob->width();

  if (transformed_blob->count() == 0) {
    // Initialize transformed_blob with the right shape.(初始化变换后的Blob的形状)
    if (crop_size) {
      transformed_blob->Reshape(input_num, input_channels,
                                crop_size, crop_size);
    } else {
      transformed_blob->Reshape(input_num, input_channels,
                                input_height, input_width);
    }
  }

  const int num = transformed_blob->num();
  const int channels = transformed_blob->channels();
  const int height = transformed_blob->height();
  const int width = transformed_blob->width();
  const int size = transformed_blob->count();

  CHECK_LE(input_num, num);
  CHECK_EQ(input_channels, channels);
  CHECK_GE(input_height, height);
  CHECK_GE(input_width, width);


  const Dtype scale = param_.scale();
  const bool do_mirror = param_.mirror() && Rand(2);
  const bool has_mean_file = param_.has_mean_file();
  const bool has_mean_values = mean_values_.size() > 0;

  int h_off = 0;
  int w_off = 0;
  if (crop_size) {
    CHECK_EQ(crop_size, height);
    CHECK_EQ(crop_size, width);
    // We only do random crop when we do training.
    if (phase_ == TRAIN) {
      h_off = Rand(input_height - crop_size + 1);
      w_off = Rand(input_width - crop_size + 1);
    } else {
      h_off = (input_height - crop_size) / 2;
      w_off = (input_width - crop_size) / 2;
    }
  } else {
    CHECK_EQ(input_height, height);
    CHECK_EQ(input_width, width);
  }

  Dtype* input_data = input_blob->mutable_cpu_data();
  if (has_mean_file) {
    CHECK_EQ(input_channels, data_mean_.channels());
    CHECK_EQ(input_height, data_mean_.height());
    CHECK_EQ(input_width, data_mean_.width());
    for (int n = 0; n < input_num; ++n) {
      int offset = input_blob->offset(n);
      caffe_sub(data_mean_.count(), input_data + offset,
            data_mean_.cpu_data(), input_data + offset);
    }
  }

  if (has_mean_values) {
    CHECK(mean_values_.size() == 1 || mean_values_.size() == input_channels) <<
     "Specify either 1 mean_value or as many as channels: " << input_channels;
    if (mean_values_.size() == 1) {
      caffe_add_scalar(input_blob->count(), -(mean_values_[0]), input_data);
    } else {
      for (int n = 0; n < input_num; ++n) {
        for (int c = 0; c < input_channels; ++c) {
          int offset = input_blob->offset(n, c);
          caffe_add_scalar(input_height * input_width, -(mean_values_[c]),
            input_data + offset);
        }
      }
    }
  }

  Dtype* transformed_data = transformed_blob->mutable_cpu_data();

  for (int n = 0; n < input_num; ++n) {
    int top_index_n = n * channels;
    int data_index_n = n * channels;
    for (int c = 0; c < channels; ++c) {
      int top_index_c = (top_index_n + c) * height;
      int data_index_c = (data_index_n + c) * input_height + h_off;
      for (int h = 0; h < height; ++h) {
        int top_index_h = (top_index_c + h) * width;
        int data_index_h = (data_index_c + h) * input_width + w_off;
        if (do_mirror) {
          int top_index_w = top_index_h + width - 1;
          for (int w = 0; w < width; ++w) {
            transformed_data[top_index_w-w] = input_data[data_index_h + w];
          }
        } else {
          for (int w = 0; w < width; ++w) {
            transformed_data[top_index_h + w] = input_data[data_index_h + w];
          }
        }
      }
    }
  }
  if (scale != Dtype(1)) {
    DLOG(INFO) << "Scale: " << scale;
    caffe_scal(size, scale, transformed_data);
  }
}

//获得数据变换输出Blob尺寸
template<typename Dtype>
vector<int> DataTransformer<Dtype>::InferBlobShape(const Datum& datum) {
  if (datum.encoded()) {
#ifdef USE_OPENCV
    CHECK(!(param_.force_color() && param_.force_gray()))
        << "cannot set both force_color and force_gray";
    cv::Mat cv_img;
    if (param_.force_color() || param_.force_gray()) {
    // If force_color then decode in color otherwise decode in gray.
      cv_img = DecodeDatumToCVMat(datum, param_.force_color());
    } else {
      cv_img = DecodeDatumToCVMatNative(datum);
    }
    // InferBlobShape using the cv::image.
    return InferBlobShape(cv_img);
#else
    LOG(FATAL) << "Encoded datum requires OpenCV; compile with USE_OPENCV.";
#endif  // USE_OPENCV
  }
  const int crop_size = param_.crop_size();
  const int datum_channels = datum.channels();
  const int datum_height = datum.height();
  const int datum_width = datum.width();
  // Check dimensions.
  CHECK_GT(datum_channels, 0);
  CHECK_GE(datum_height, crop_size);
  CHECK_GE(datum_width, crop_size);
  // Build BlobShape.
  vector<int> shape(4);
  shape[0] = 1;
  shape[1] = datum_channels;
  shape[2] = (crop_size)? crop_size: datum_height;
  shape[3] = (crop_size)? crop_size: datum_width;
  return shape;
}


template<typename Dtype>
vector<int> DataTransformer<Dtype>::InferBlobShape(
    const vector<Datum> & datum_vector) {
  const int num = datum_vector.size();
  CHECK_GT(num, 0) << "There is no datum to in the vector";
  // Use first datum in the vector to InferBlobShape.
  vector<int> shape = InferBlobShape(datum_vector[0]);
  // Adjust num to the size of the vector.
  shape[0] = num;
  return shape;
}

#ifdef USE_OPENCV
template<typename Dtype>
vector<int> DataTransformer<Dtype>::InferBlobShape(const cv::Mat& cv_img) {
  const int crop_size = param_.crop_size();
  const int img_channels = cv_img.channels();
  const int img_height = cv_img.rows;
  const int img_width = cv_img.cols;
  // Check dimensions.
  CHECK_GT(img_channels, 0);
  CHECK_GE(img_height, crop_size);
  CHECK_GE(img_width, crop_size);
  // Build BlobShape.
  vector<int> shape(4);
  shape[0] = 1;
  shape[1] = img_channels;
  shape[2] = (crop_size)? crop_size: img_height;
  shape[3] = (crop_size)? crop_size: img_width;
  return shape;
}

template<typename Dtype>
vector<int> DataTransformer<Dtype>::InferBlobShape(
    const vector<cv::Mat> & mat_vector) {
  const int num = mat_vector.size();
  CHECK_GT(num, 0) << "There is no cv_img to in the vector";
  // Use first cv_img in the vector to InferBlobShape.
  vector<int> shape = InferBlobShape(mat_vector[0]);
  // Adjust num to the size of the vector.
  shape[0] = num;
  return shape;
}
#endif  // USE_OPENCV

//初始化随机数种子
template <typename Dtype>
void DataTransformer<Dtype>::InitRand() {
//如果在初始化参数中要求对输入进行随机镜像操作,或者在训练阶段需要随机切块,那么需要初始化随机数种子
  const bool needs_rand = param_.mirror() ||
      (phase_ == TRAIN && param_.crop_size());
  if (needs_rand) {
    const unsigned int rng_seed = caffe_rng_rand();
    rng_.reset(new Caffe::RNG(rng_seed));
  } else {
    rng_.reset();
  }
}

//生成0~n-1之间的随机数
template <typename Dtype>
int DataTransformer<Dtype>::Rand(int n) {
  CHECK(rng_);
  CHECK_GT(n, 0);
  caffe::rng_t* rng =
      static_cast<caffe::rng_t*>(rng_->generator());
  return ((*rng)() % n);
}

INSTANTIATE_CLASS(DataTransformer);

}  // namespace caffe

2017/6/26 posted in  Caffe I/O模块

Caffe I/O模块

这里我们讨论学习Caffe的I/O模块,即与数据打交道的模块。

我们在运行Caffe例程前,首先需要将原始数据转换为LMDB格式,训练网络时则由数据读取层(DataLayer)不断地从LMDB读取数据,送入后续的卷积、下采样等计算层。作为基础模块,Caffe I/O的效率直接影响整体的处理速度。
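顺便看一下“将原始数据转换为LMDB”在代码层面大致是什么样子。下面是一个借助Caffe自带db封装写入LMDB的最小示意(路径和数据内容都是假设的,convert_mnist_data等转换工具的流程与此类似):

#include <string>

#include "caffe/proto/caffe.pb.h"
#include "caffe/util/db.hpp"
using namespace caffe;

int main() {
  // 新建一个LMDB数据库(路径仅作示意)
  db::DB* lmdb = db::GetDB("lmdb");
  lmdb->Open("examples/demo_lmdb", db::NEW);
  db::Transaction* txn = lmdb->NewTransaction();

  // 把一幅1 x 28 x 28的图像和标签打包成Datum
  Datum datum;
  datum.set_channels(1);
  datum.set_height(28);
  datum.set_width(28);
  datum.set_label(7);
  datum.mutable_data()->assign(28 * 28, static_cast<char>(0));

  // 序列化后以key-value形式写入,并提交事务
  std::string value;
  datum.SerializeToString(&value);
  txn->Put("00000000", value);
  txn->Commit();

  delete txn;
  lmdb->Close();
  delete lmdb;
  return 0;
}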

数据读取层

Caffe数据读取层(DataLayer)是Layer的派生类。除了读取LMDB、LEVELDB之外,也可以从原始图像直接读取(ImageDataLayer)。

数据结构描述

我们在caffe.proto中可以找到关于该层数据结构的描述:

message DataParameter {
//输入数据使用的DB类型
  enum DB {
    LEVELDB = 0;    //使用 LEVELDB
    LMDB = 1;       //使用 LMDB
  }
  // Specify the data source.(源数据的路径)
  optional string source = 1;
  // Specify the batch size.( 一个批量数据包含的图片数)
  optional uint32 batch_size = 4;
  // The rand_skip variable is for the data layer to skip a few data points
  // to avoid all asynchronous sgd clients to start at the same point. The skip
  // point would be set as rand_skip * rand(0,1). Note that rand_skip should not
  // be larger than the number of keys in the database.
  // DEPRECATED. Each solver accesses a different subset of the database.
  //随机跳过若干图片,跳跃数为rand_skip * rand(0, 1)
  optional uint32 rand_skip = 7 [default = 0];
  //输入数据使用的DB类型,默认为LEVELDB
  optional DB backend = 8 [default = LEVELDB];
  // DEPRECATED. See TransformationParameter. For data pre-processing, we can do
  // simple scaling and subtracting the data mean, if provided. Note that the
  // mean subtraction is always carried out before scaling.
  //scale、mean_file、crop_size、mirror 均为旧版参数,现已转移到 TransformationParameter
  optional float scale = 2 [default = 1];
  optional string mean_file = 3;
  // DEPRECATED. See TransformationParameter. Specify if we would like to randomly
  // crop an image.
  optional uint32 crop_size = 5 [default = 0];
  // DEPRECATED. See TransformationParameter. Specify if we want to randomly mirror
  // data.
  optional bool mirror = 6 [default = false];
  //强制编码图像为三通道彩色图像
  optional bool force_encoded_color = 9 [default = false];
  // Prefetch queue (Increase if data feeding bandwidth varies, within the
  // limit of device memory for GPU training)
  //预取队列(预先放到主机内存中的批量数,默认为4个Batch)
  optional uint32 prefetch = 10 [default = 4];
}

数据读取层的实现

数据读取层的声明位于include/caffe/layers/base_data_layer.hpp中,如果需要单独使用该层,则应包含这个头文件。

namespace caffe {

/**
 * @brief Provides base for data layers that feed blobs to the Net.
 *
 * TODO(dox): thorough documentation for Forward and proto params.
 */
template <typename Dtype>
class BaseDataLayer : public Layer<Dtype> {
 public:
 //显式构造函数
  explicit BaseDataLayer(const LayerParameter& param);
  // LayerSetUp: implements common data layer setup functionality, and calls
  // DataLayerSetUp to do special data layer setup for individual layer types.
  // This method may not be overridden except by the BasePrefetchingDataLayer.
  //层配置,实现通用层配置功能,之后调用DataLayerSetUp进行数据读取层的特别配置 
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {}
  // Data layers have no bottoms, so reshaping is trivial.(数据读取没有Bottom Blob,变形操作很简单 )
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {}
//反向传播函数不需要做任何操作
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {}
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {}

 protected:
 //数据预处理变换器参数
  TransformationParameter transform_param_;
  //数据预处理变换器
  shared_ptr<DataTransformer<Dtype> > data_transformer_;
  //是否输出标签数据
  bool output_labels_;
};
//批量数据,用于存放数据读取层输出
template <typename Dtype>
class Batch {
 public:
 //包含两个Blob: data_用于存放图片数据,label_用于存放标签
  Blob<Dtype> data_, label_;
};

//带预取功能的数据读取层,派生于BaseDataLayer和InternalThread
template <typename Dtype>
class BasePrefetchingDataLayer :
    public BaseDataLayer<Dtype>, public InternalThread {
 public:
 //显式构造函数
  explicit BasePrefetchingDataLayer(const LayerParameter& param);
  // LayerSetUp: implements common data layer setup functionality, and calls
  // DataLayerSetUp to do special data layer setup for individual layer types.
  // This method may not be overridden.
  //层设置函数
  void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
//前向传播
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

 protected:
  virtual void InternalThreadEntry();   //内部线程入口
  virtual void load_batch(Batch<Dtype>* batch) = 0; //载入批量数据,纯虚函数

  vector<shared_ptr<Batch<Dtype> > > prefetch_; //预取Buffer
  BlockingQueue<Batch<Dtype>*> prefetch_free_;  //空闲Batch队列
  BlockingQueue<Batch<Dtype>*> prefetch_full_;  //已加载Batch队列
  Batch<Dtype>* prefetch_current_;  //当前Batch(猜的)

  Blob<Dtype> transformed_data_;    //变换后的数据
};
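为了直观理解 prefetch_free_ 与 prefetch_full_ 这对队列的配合方式,下面用标准库线程写一个与Caffe无关的简化示意(并非Caffe源码,仅演示“空闲队列/已加载队列”轮转的生产者-消费者思路):

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Batch { std::vector<float> data; };  // 简化的Batch,只有一个数据缓冲

// 一个极简的阻塞队列,对应Caffe中的BlockingQueue
template <typename T>
class BlockingQueue {
 public:
  void push(T t) {
    { std::lock_guard<std::mutex> lk(m_); q_.push(t); }
    cv_.notify_one();
  }
  T pop() {
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [this] { return !q_.empty(); });
    T t = q_.front(); q_.pop();
    return t;
  }
 private:
  std::queue<T> q_;
  std::mutex m_;
  std::condition_variable cv_;
};

int main() {
  BlockingQueue<Batch*> free_q, full_q;     // 对应 prefetch_free_ / prefetch_full_
  Batch buffers[4];                         // 对应 prefetch_ 中默认的4个Batch
  for (int i = 0; i < 4; ++i) free_q.push(&buffers[i]);

  std::thread producer([&] {                // 对应 InternalThreadEntry 预取线程
    for (int i = 0; i < 8; ++i) {
      Batch* b = free_q.pop();              // 拿到一个空闲Batch
      b->data.assign(64, static_cast<float>(i));  // “载入”一批数据
      full_q.push(b);                       // 放入已加载队列
    }
  });

  for (int i = 0; i < 8; ++i) {             // 对应 Forward_cpu 消费数据
    Batch* b = full_q.pop();
    std::cout << "batch " << i << " first value = " << b->data[0] << std::endl;
    free_q.push(b);                         // 用完放回空闲队列
  }
  producer.join();
  return 0;
}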


数据读取层的实现位于src/caffe/layers/base_data_layer.cpp中,内容如下:

#include <boost/thread.hpp>
#include <vector>

#include "caffe/blob.hpp"
#include "caffe/data_transformer.hpp"
#include "caffe/internal_thread.hpp"
#include "caffe/layer.hpp"
#include "caffe/layers/base_data_layer.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/blocking_queue.hpp"

namespace caffe {
//构造函数,初始化Layer参数、数据变换器参数
template <typename Dtype>
BaseDataLayer<Dtype>::BaseDataLayer(const LayerParameter& param)
    : Layer<Dtype>(param),
      transform_param_(param.transform_param()) {
}
//BaseDataLayer层设置 
template <typename Dtype>
void BaseDataLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  if (top.size() == 1) {    //判断输出Blob个数,若为1只输出data,若为2则输出data和label
    output_labels_ = false;
  } else {
    output_labels_ = true;
  }
  //初始化数据变换器对象
  data_transformer_.reset(
      new DataTransformer<Dtype>(transform_param_, this->phase_));
  data_transformer_->InitRand();    //生成随机数种子
  // The subclasses should setup the size of bottom and top
  //子类负责设置Top Blob形状
  DataLayerSetUp(bottom, top);
}
//BasePrefetchingDataLayer 构造函数
template <typename Dtype>
BasePrefetchingDataLayer<Dtype>::BasePrefetchingDataLayer(
    const LayerParameter& param)
    : BaseDataLayer<Dtype>(param),
      prefetch_(param.data_param().prefetch()),
      prefetch_free_(), prefetch_full_(), prefetch_current_() {
  for (int i = 0; i < prefetch_.size(); ++i) {
    prefetch_[i].reset(new Batch<Dtype>());
    prefetch_free_.push(prefetch_[i].get());    //将Batch对象都放入空闲队列中
  }
}
//BasePrefetchingDataLayer层配置函数
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::LayerSetUp(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  BaseDataLayer<Dtype>::LayerSetUp(bottom, top);

  // Before starting the prefetch thread, we make cpu_data and gpu_data
  // calls so that the prefetch thread does not accidentally make simultaneous
  // cudaMalloc calls when the main thread is running. In some GPUs this
  // seems to cause failures if we do not so.
  //在开启数据预取线程前,通过调用Blob相应函数先进行cudaMalloc,避免在多线程情况下同时进行cudaMalloc而导致CUDA API调用失败
  for (int i = 0; i < prefetch_.size(); ++i) {
    prefetch_[i]->data_.mutable_cpu_data();
    if (this->output_labels_) {
      prefetch_[i]->label_.mutable_cpu_data();
    }
  }
  //如果编译选项没有CPU_ONLY,则需要编译GPU代码
#ifndef CPU_ONLY    
  if (Caffe::mode() == Caffe::GPU) {
    for (int i = 0; i < prefetch_.size(); ++i) {
      prefetch_[i]->data_.mutable_gpu_data();
      if (this->output_labels_) {
        prefetch_[i]->label_.mutable_gpu_data();    //功能同上
      }
    }
  }
#endif
  DLOG(INFO) << "Initializing prefetch";
  this->data_transformer_->InitRand();
  StartInternalThread();    //开启内部预取线程
  DLOG(INFO) << "Prefetch initialized.";
}
//内部线程入口
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::InternalThreadEntry() {
//创建CUDA Stream,非阻塞类型
#ifndef CPU_ONLY
  cudaStream_t stream;
  if (Caffe::mode() == Caffe::GPU) {
    CUDA_CHECK(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking));
  }
#endif

  try {
    while (!must_stop()) {  //循环载入批量数据
      Batch<Dtype>* batch = prefetch_free_.pop();   //拿到一个空闲Batch
      load_batch(batch);    //载入批量数据
#ifndef CPU_ONLY
      if (Caffe::mode() == Caffe::GPU) {
        batch->data_.data().get()->async_gpu_push(stream);
        if (this->output_labels_) {
          batch->label_.data().get()->async_gpu_push(stream);
        }
        CUDA_CHECK(cudaStreamSynchronize(stream));//同步到GPU
      }
#endif
      prefetch_full_.push(batch);   //加入到带负载的Batch队列中
    }
  } catch (boost::thread_interrupted&) {
    // Interrupted exception is expected on shutdown(捕获到异常,退出while循环)
  }
#ifndef CPU_ONLY
  if (Caffe::mode() == Caffe::GPU) {
    CUDA_CHECK(cudaStreamDestroy(stream));  //销毁CUDA Stream
  }
#endif
}
//前向传播函数
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
    //从带负载的Batch队列中取出一个Batch对象
  if (prefetch_current_) {
    prefetch_free_.push(prefetch_current_);
  }
  prefetch_current_ = prefetch_full_.pop("Waiting for data");
  // Reshape to loaded data.(Top Blob根据Batch形状进行变形)
  top[0]->ReshapeLike(prefetch_current_->data_);
  //将数据放到Top Blob中
  top[0]->set_cpu_data(prefetch_current_->data_.mutable_cpu_data());
  if (this->output_labels_) {
    // Reshape to loaded labels.(同上)
    top[1]->ReshapeLike(prefetch_current_->label_);
    top[1]->set_cpu_data(prefetch_current_->label_.mutable_cpu_data());
  }
}

#ifdef CPU_ONLY
STUB_GPU_FORWARD(BasePrefetchingDataLayer, Forward);
#endif

INSTANTIATE_CLASS(BaseDataLayer);
INSTANTIATE_CLASS(BasePrefetchingDataLayer);

}  // namespace caffe

2017/6/24 posted in  Caffe I/O模块

MAC下openBlas的安装

这里我们安装openblas主要有两种方法:

  1. 通过git代码到本地并安装
  2. 通过brew来安装

通过git下载源码到本地编译安装

  1. git clone代码到本地并编译安装:
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make -j4
make install

  2. 修改Caffe的Makefile.config:
BLAS := open
BLAS_INCLUDE :=  /opt/OpenBLAS/include
BLAS_LIB := /opt/OpenBLAS/lib
  3. caffe重新make:
make clean
make pycaffe
make all -j4
make test && make runtest

使用brew进行安装

  1. 运行brew uninstall openblas; brew install --fresh -vd openblas命令安装openblas
  2. 更改Makefile.config

# Homebrew puts openblas in a directory that is not on the standard search path
BLAS_INCLUDE := $(shell brew --prefix openblas)/include
BLAS_LIB := $(shell brew --prefix openblas)/lib

  3. 重新编译

如果遇到了下面这种错误:

In file included from src/caffe/util/blocking_queue.cpp:5:
In file included from ./include/caffe/layers/base_data_layer.hpp:9:
In file included from ./include/caffe/layer.hpp:12:
In file included from ./include/caffe/util/math_functions.hpp:11:
./include/caffe/util/mkl_alternate.hpp:14:10: fatal error: 'cblas.h' file not found
#include <cblas.h>
         ^
1 error generated.
make: *** [.build_release/src/caffe/util/blocking_queue.o] Error 1

可以试试这个命令:

cmake -DCMAKE_CXX_FLAGS=-I/usr/local/opt/openblas/include ..
2017/6/21 posted in  Caffe I/O模块

Caffe中的Net

Net在Caffe中代表一个完整的CNN模型,它包含若干Layer实例。前面我们已经在第5天内容中看到用ProtoBuffer文本文件(prototxt)描述的经典网络结构如LeNet、AlexNet,这些结构反映在Caffe代码实现上就是一个Net对象。Net是相对Blob、Layer更为复杂的设计,需要沉住气慢慢来看。

Net的基本用法

Net是一张图纸,对应的描述文件为*.prototxt,我们选择Caffe自带的CaffeNet模型描述文件,位于models/bvlc_reference_caffenet/deploy.prototxt。将该文件拷贝到当前工作目录下。

编写测试代码为:

#include <vector>
#include <iostream>
#include <caffe/net.hpp>
using namespace caffe;
using namespace std;

int main(void)
{
    std::string proto("deploy.prototxt");
    Net<float> nn(proto,caffe::TEST);
    vector<string> bn = nn.blob_names();    // 获取 Net 中所有 Blob 对象名
    vector<string> ln = nn.layer_names();   // 获取 Net 中所有 Layer 对象名
    for (int i = 0; i < bn.size(); i++)
    {
        cout<<"Blob #"<<i<<" : "<<bn[i]<<endl;
    }
    for (int i = 0; i < ln.size(); i++)
    {
        cout<<"layer #"<<i<<" : "<<ln[i]<<endl;
    }
    return 0;
}

编译(注意这里我们需要安装OpenBLAS,具体过程不在这里讲述):

g++ -o netapp net_demo.cpp -I /usr/local/Cellar/caffe/include -D CPU_ONLY -I /usr/local/Cellar/caffe/.build_release/src/ -L /usr/local/Cellar/caffe/build/lib -I /usr/local/Cellar/openblas/0.2.19_1/include  -lcaffe -lglog -lboost_system -lprotobuf

注意这里有一段 -I /usr/local/Cellar/openblas/0.2.19_1/include,这是为了链接到本地安装的OpenBLAS。

运行 ./netapp :

发现又报错了:

运行 install_name_tool -add_rpath '/usr/local/Cellar/caffe/build/lib/' /usr/local/Cellar/caffe/LZHcaffe/./netapp 命令,链接库文件。

运行成功后,输出为:

...省略上面部分,之前已经见过了
I0622 11:03:24.688719 3012948928 net.cpp:255] Network initialization done.
Blob #0 : data
Blob #1 : conv1
Blob #2 : pool1
Blob #3 : norm1
Blob #4 : conv2
Blob #5 : pool2
Blob #6 : norm2
Blob #7 : conv3
Blob #8 : conv4
Blob #9 : conv5
Blob #10 : pool5
Blob #11 : fc6
Blob #12 : fc7
Blob #13 : fc8
Blob #14 : prob
layer #0 : data
layer #1 : conv1
layer #2 : relu1
layer #3 : pool1
layer #4 : norm1
layer #5 : conv2
layer #6 : relu2
layer #7 : pool2
layer #8 : norm2
layer #9 : conv3
layer #10 : relu3
layer #11 : conv4
layer #12 : relu4
layer #13 : conv5
layer #14 : relu5
layer #15 : pool5
layer #16 : fc6
layer #17 : relu6
layer #18 : drop6
layer #19 : fc7
layer #20 : relu7
layer #21 : drop7
layer #22 : fc8
layer #23 : prob

通过上面的简单例子,我们发现Net中既包括Layer对象,又包括Blob对象。其中Blob对象用于存放每个Layer的输入/输出中间结果;Layer则根据Net描述对指定的输入Blob进行某些计算处理(卷积、下采样、全连接、非线性变换、计算代价函数等),输出结果放到指定的输出Blob中。输入Blob和输出Blob可能为同一个。所有的Layer和Blob对象都用名字区分,同名的Blob表示同一个Blob对象,同名的Layer表示同一个Layer对象;而Blob和Layer同名则不代表它们有任何直接关系。

我们可以通过has_blob()、has_layer()函数来查询当前Net对象是否包含指定名字的Blob或Layer对象;如果返回值为真,则可以进一步调用blob_by_name()、layer_by_name()函数直接获取相应的Blob或Layer指针,进行一些操作(如提取某层的计算输出特征或某个Blob中的权值)。
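在前面net_demo.cpp的基础上,下面给出一个按名字查询/获取Blob和Layer的小例子(Blob、Layer名称取自上面CaffeNet的输出,仅作示意):

#include <iostream>
#include <boost/shared_ptr.hpp>
#include <caffe/net.hpp>
using namespace caffe;
using namespace std;

int main() {
  // 沿用前面拷贝到当前目录的deploy.prototxt
  Net<float> nn("deploy.prototxt", caffe::TEST);
  if (nn.has_blob("prob")) {            // 先查询,避免拿到空指针
    boost::shared_ptr<Blob<float> > prob = nn.blob_by_name("prob");
    cout << "prob shape: " << prob->shape_string() << endl;
  }
  if (nn.has_layer("conv1")) {
    cout << "conv1 type: " << nn.layer_by_name("conv1")->type() << endl;
  }
  return 0;
}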

数据结构描述

我们这里先了解一下caffe.proto中对Net的描述:


message NetParameter {
  optional string name = 1; // consider giving the network a name
  // DEPRECATED. See InputParameter. The input blobs to the network.(网络的输入Blob名称,可以有多个Blob)
  repeated string input = 3;
  // DEPRECATED. See InputParameter. The shape of the input blobs.(输入Blob的形状)
  repeated BlobShape input_shape = 8;

  // 4D input dimensions -- deprecated.  Use "input_shape" instead.
  // If specified, for each input blob there should be four
  // values specifying the num, channels, height and width of the input blob.
  // Thus, there should be a total of (4 * #input) numbers.
  //(旧版的维度信息)
  repeated int32 input_dim = 4;

  // Whether the network will force every layer to carry out backward operation.
  // If set False, then whether to carry out backward is determined
  // automatically according to the net structure and learning rates.
  optional bool force_backward = 5 [default = false];
  // The current "state" of the network, including the phase, level, and stage.
  // Some layers may be included/excluded depending on this state and the states
  // specified in the layers' include and exclude fields.
  optional NetState state = 6;

  // Print debugging information about results while running Net::Forward,
  // Net::Backward, and Net::Update.
  optional bool debug_info = 7 [default = false];

  // The layers that make up the net.  Each of their configurations, including
  // connectivity and behavior, is specified as a LayerParameter.
  repeated LayerParameter layer = 100;  // ID 100 so layers are printed last.

  // DEPRECATED: use 'layer' instead.
  repeated V1LayerParameter layers = 2;
}

看似很短的proto描述,实际上对应的真实网络prototxt可以很长很长,关键在于可重复多次出现的LayerParameter类型的layer这个字段。其他字段的功能基本都是辅助网络运行的,在代码中会看到更多的细节。
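为了体会repeated LayerParameter layer这个字段,下面用ProtoBuffer生成的C++接口在内存中拼一个极小的NetParameter,再写成与*.prototxt相同的文本格式(网络内容纯属示意):

#include <iostream>

#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"
using namespace caffe;

int main() {
  NetParameter net_param;
  net_param.set_name("TinyNet");

  // 每调用一次add_layer(),repeated的layer字段就多出一个LayerParameter
  LayerParameter* ip = net_param.add_layer();
  ip->set_name("ip");
  ip->set_type("InnerProduct");
  ip->add_bottom("data");
  ip->add_top("ip");
  ip->mutable_inner_product_param()->set_num_output(10);

  // 序列化为文本格式,并打印到屏幕
  WriteProtoToTextFile(net_param, "tiny_net.prototxt");
  std::cout << net_param.DebugString();
  return 0;
}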

Net的形成

我们将Blob比作Caffe的砖石,Layer比作Caffe的墙面,那么Net更像是工匠手中的图纸,描述了每个墙面应当出现的位置,这样设计出的房屋才足够牢固、抗震。为了达到这个目的,Net实现时必然有一套用于记录Layer、Blob的数据结构。先在下表中公布一下这些数据结构的名字,后面还有很多与它们打交道的机会。

类对象 含义
layers_ 记录Net prototxt中出现的每个Layer
layer_names_ 记录Net prototxt中出现的每个Layer的名称
layer_names_index_ 记录Net prototxt中每个Layer名称与顺序索引的对应关系
layer_need_backward_ 记录Layer是否需要反向传播过程
blobs_ 记录Net中所有Blob
blob_names_ 记录每个Blob名称
blob_names_index_ 记录每个Blob名称与顺序索引的对应关系
blob_need_backward_ 记录每个Blob是否需要反向传播过程
bottom_vecs_ blobs_的影子,记录每个Layer的输入Blob
bottom_id_vecs_ 与bottom_vecs_关联,用于在blobs_中定位每个Layer的每个输入Blob
bottom_need_backward_ 与bottom_vecs_关联,标志每个Blob是否需要反向传播过程
top_vecs_ blobs_的影子,记录每个Layer的输出Blob
top_id_vecs_ 与top_vecs_关联,用于在blobs_中定位每个Layer的每个输出Blob
blob_loss_weights_ Net中每个Blob对损失函数的投票因子,一般损失层为1,其他层为0
net_input_blob_indices_ Net输入Blob在blobs_中的索引
net_output_blob_indices_ Net输出Blob在blobs_中的索引
net_input_blobs_ Net 输入 Blob
net_output_blobs_ Net 输出 Blob
params_ Net权值Blob,用于存储网络权值
param_display_names_ Net中权值Blob的名称
learnable_params_ Net中可训练的权值Blob
params_lr_ learnable_params_中每个元素是否有学习速率倍乘因子
has_params_lr_ 标志learnable_params_中每个元素是否有学习速率倍乘因子
params_weight_decay_ learnable_params_中每个元素的权值衰减倍乘因子
has_params_decay_ 标志learnable_params_中每个元素是否有权值衰减倍乘因子

可以看到上面有两类Blob:以param开头的权值Blob和以blob开头的Layer输入/输出Blob。它们虽然都是Blob类型,但在网络中的地位截然不同。权值Blob会随着学习过程而更新,归属于“模型”;Layer输入/输出Blob则只会随网络输入变化,归属于“数据”。深度学习的目的就是不断从“数据”中获取知识,存储到“模型”中,应用于后来的“数据”。
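下面用一小段代码体会这两类Blob各自的获取方式(模型文件沿用前文下载的CaffeNet,层名与Blob名仅作示意):

#include <iostream>
#include <vector>
#include <boost/shared_ptr.hpp>
#include <caffe/net.hpp>
using namespace caffe;
using namespace std;

int main() {
  // 构建网络并载入已训练权值
  Net<float> net("models/bvlc_reference_caffenet/deploy.prototxt", caffe::TEST);
  net.CopyTrainedLayersFrom(
      "models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel");

  // “模型”:conv1层的权值Blob,随训练更新,通过layer_by_name()->blobs()获取
  const vector<boost::shared_ptr<Blob<float> > >& conv1_params =
      net.layer_by_name("conv1")->blobs();
  cout << "conv1 weight: " << conv1_params[0]->shape_string() << endl;
  cout << "conv1 bias  : " << conv1_params[1]->shape_string() << endl;

  // “数据”:conv1层的输出Blob,只随网络输入变化,通过blob_by_name()获取
  cout << "conv1 output: " << net.blob_by_name("conv1")->shape_string() << endl;
  return 0;
}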

Net声明位于include/caffe/net.hpp中,内容如下:


template <typename Dtype>
class Net {
 public:
 //显式构造函数
  explicit Net(const NetParameter& param);
  explicit Net(const string& param_file, Phase phase,
      const int level = 0, const vector<string>* stages = NULL);
//析构函数
  virtual ~Net() {}

  /// @brief Initialize a network with a NetParameter.
  void Init(const NetParameter& param);

  //运行前向传播,输入Blob需要预先填充好
  const vector<Blob<Dtype>*>& Forward(Dtype* loss = NULL);
  /// @brief DEPRECATED; use Forward() instead.
  const vector<Blob<Dtype>*>& ForwardPrefilled(Dtype* loss = NULL) {
    LOG_EVERY_N(WARNING, 1000) << "DEPRECATED: ForwardPrefilled() "
        << "will be removed in a future version. Use Forward().";
    return Forward(loss);
  }
  
   /**
   * The From and To variants of Forward and Backward operate on the
   * (topological) ordering by which the net is specified. For general DAG
   * networks, note that (1) computing from one layer to another might entail
   * extra computation on unrelated branches, and (2) computation starting in
   * the middle may be incorrect if all of the layers of a fan-in are not
   * included.
   */
   //前向传播的几种形式
  Dtype ForwardFromTo(int start, int end);
  Dtype ForwardFrom(int start);
  Dtype ForwardTo(int end);
  /// @brief DEPRECATED; set input blobs then use Forward() instead.
  const vector<Blob<Dtype>*>& Forward(const vector<Blob<Dtype>* > & bottom,
      Dtype* loss = NULL);
    //清零所有权值的diff域,应在反向传播之前运行
    void ClearParamDiffs();
    //几种不同形式的Net反向传播,无须指定输入/输出Blob,因为在前向传播过程中已经建立连接
  void Backward();
  void BackwardFromTo(int start, int end);
  void BackwardFrom(int start);
  void BackwardTo(int end);
  //对Net中所有Layer自底向上进行变形,无须运行一次前向传播就可以计算各层所需的Blob尺寸
  void Reshape();
  //前向传播+反向传播,输入为Bottom Blob,输出为loss
  Dtype ForwardBackward() {
    Dtype loss;
    Forward(&loss);
    Backward();
    return loss;
  }
  //根据已经(由Solver)准备好的diff值更新网络权值
   void Update();
  /**
   * @brief Shares weight data of owner blobs with shared blobs.
   *
   * Note: this is called by Net::Init, and thus should normally not be
   * called manually.
   */
  void ShareWeights();

  /**
   * @brief For an already initialized net, implicitly copies (i.e., using no
   *        additional memory) the pre-trained layers from another Net.
   */
   //从一个已训练好的Net获取共享权值
  void ShareTrainedLayersWith(const Net* other);
  // For an already initialized net, CopyTrainedLayersFrom() copies the already
  // trained layers from another net parameter instance.
  /**
   * @brief For an already initialized net, copies the pre-trained layers from
   *        another Net.
   */
  void CopyTrainedLayersFrom(const NetParameter& param);
  void CopyTrainedLayersFrom(const string trained_filename);
  void CopyTrainedLayersFromBinaryProto(const string trained_filename);
  void CopyTrainedLayersFromHDF5(const string trained_filename);
  /// @brief Writes the net to a proto.
  // 序列化一个 Net 到 ProtoBuffer
  void ToProto(NetParameter* param, bool write_diff = false) const;
  /// @brief Writes the net to an HDF5 file.
  //序列化一个Net到HDF5
  void ToHDF5(const string& filename, bool write_diff = false) const;
  
  /// @brief returns the network name.
  inline const string& name() const { return name_; }
  /// @brief returns the layer names
  inline const vector<string>& layer_names() const { return layer_names_; }
  /// @brief returns the blob names
  inline const vector<string>& blob_names() const { return blob_names_; }
  /// @brief returns the blobs
  inline const vector<shared_ptr<Blob<Dtype> > >& blobs() const {
    return blobs_;
  }
  /// @brief returns the layers
  inline const vector<shared_ptr<Layer<Dtype> > >& layers() const {
    return layers_;
  }
  /// @brief returns the phase: TRAIN or TEST
  inline Phase phase() const { return phase_; }
  /**
   * @brief returns the bottom vecs for each layer -- usually you won't
   *        need this unless you do per-layer checks such as gradients.
   */
   //返回每个Layer的Bottom Blob
  inline const vector<vector<Blob<Dtype>*> >& bottom_vecs() const {
    return bottom_vecs_;
  }
  /**
   * @brief returns the top vecs for each layer -- usually you won't
   *        need this unless you do per-layer checks such as gradients.
   */
   //返回每个Layer的Top Blob
  inline const vector<vector<Blob<Dtype>*> >& top_vecs() const {
    return top_vecs_;
  }
  /// @brief returns the ids of the top blobs of layer i
  inline const vector<int> & top_ids(int i) const {
    CHECK_GE(i, 0) << "Invalid layer id";
    CHECK_LT(i, top_id_vecs_.size()) << "Invalid layer id";
    return top_id_vecs_[i];
  }
  /// @brief returns the ids of the bottom blobs of layer i
  inline const vector<int> & bottom_ids(int i) const {
    CHECK_GE(i, 0) << "Invalid layer id";
    CHECK_LT(i, bottom_id_vecs_.size()) << "Invalid layer id";
    return bottom_id_vecs_[i];
  }
  
  inline const vector<vector<bool> >& bottom_need_backward() const {
    return bottom_need_backward_;
  }
  inline const vector<Dtype>& blob_loss_weights() const {
    return blob_loss_weights_;
  }
  //返回每个Layer是否需要反向传播计算
  inline const vector<bool>& layer_need_backward() const {
    return layer_need_backward_;
  }
  /// @brief returns the parameters
  inline const vector<shared_ptr<Blob<Dtype> > >& params() const {
    return params_;
  }
  //返回所有可训练权值
  inline const vector<Blob<Dtype>*>& learnable_params() const {
    return learnable_params_;
  }
  /// @brief returns the learnable parameter learning rate multipliers(倍乘因子)
  inline const vector<float>& params_lr() const { return params_lr_; }
  inline const vector<bool>& has_params_lr() const { return has_params_lr_; }
  /// @brief returns the learnable parameter decay multipliers(返回可训练权值的衰减因子)
  inline const vector<float>& params_weight_decay() const {
    return params_weight_decay_;
  }
  inline const vector<bool>& has_params_decay() const {
    return has_params_decay_;
  }
  //返回权值名称与向量下标的映射表
  const map<string, int>& param_names_index() const {
    return param_names_index_;
  }
  //返回权值所有者
  inline const vector<int>& param_owners() const { return param_owners_; }
  inline const vector<string>& param_display_names() const {
    return param_display_names_;
  }
  /// @brief Input and output blob numbers
  inline int num_inputs() const { return net_input_blobs_.size(); }
  inline int num_outputs() const { return net_output_blobs_.size(); }
  //返回输入Blob
  inline const vector<Blob<Dtype>*>& input_blobs() const {
    return net_input_blobs_;
  }
  //返回输出Blob
  inline const vector<Blob<Dtype>*>& output_blobs() const {
    return net_output_blobs_;
  }
  //返回输入Blob下标
  inline const vector<int>& input_blob_indices() const {
    return net_input_blob_indices_;
  }
  //返回输出Blob下标
  inline const vector<int>& output_blob_indices() const {
    return net_output_blob_indices_;
  }
  //查找当前网络是否包含某一名称Blob
  bool has_blob(const string& blob_name) const;
  //如果包含,那么就把它找出来
  const shared_ptr<Blob<Dtype> > blob_by_name(const string& blob_name) const;
  //查找当前网络是否包含某一名称Layer
  bool has_layer(const string& layer_name) const;
  //如果包含,那么就把它找出来
  const shared_ptr<Layer<Dtype> > layer_by_name(const string& layer_name) const;
 //设置debug_info_
  void set_debug_info(const bool value) { debug_info_ = value; }

// Helpers for Init.(下面这些函数是Init的帮手)
  /**
   * @brief Remove layers that the user specified should be excluded given the current
   *        phase, level, and stage.
   */
   //过滤掉用户指定的在某个阶段、级别、状态下不应包含的Layer
  static void FilterNet(const NetParameter& param,
      NetParameter* param_filtered);
  /// @brief return whether NetState state meets NetStateRule rule
  //判断网络状态是否满足网络规则
  static bool StateMeetsRule(const NetState& state, const NetStateRule& rule,
      const string& layer_name);

  // Invoked at specific points during an iteration
  class Callback {
   protected:
    virtual void run(int layer) = 0;

    template <typename T>
    friend class Net;
  };
  const vector<Callback*>& before_forward() const { return before_forward_; }
  void add_before_forward(Callback* value) {
    before_forward_.push_back(value);
  }
  const vector<Callback*>& after_forward() const { return after_forward_; }
  void add_after_forward(Callback* value) {
    after_forward_.push_back(value);
  }
  const vector<Callback*>& before_backward() const { return before_backward_; }
  void add_before_backward(Callback* value) {
    before_backward_.push_back(value);
  }
  const vector<Callback*>& after_backward() const { return after_backward_; }
  void add_after_backward(Callback* value) {
    after_backward_.push_back(value);
  }
  
  protected:
  // Helpers for Init.
  /// @brief Append a new top blob to the net.
  //为网络追加一个Top Blob
  void AppendTop(const NetParameter& param, const int layer_id,
                 const int top_id, set<string>* available_blobs,
                 map<string, int>* blob_name_to_idx);
  /// @brief Append a new bottom blob to the net.
  //为网络追加一个Bottom Blob
  int AppendBottom(const NetParameter& param, const int layer_id,
                   const int bottom_id, set<string>* available_blobs,
                   map<string, int>* blob_name_to_idx);
  /// @brief Append a new parameter blob to the net.
  //为网络追加一个权值Blob
  void AppendParam(const NetParameter& param, const int layer_id,
                   const int param_id);

  /// @brief Helper for displaying debug info in Forward.
  void ForwardDebugInfo(const int layer_id);
  /// @brief Helper for displaying debug info in Backward.
  void BackwardDebugInfo(const int layer_id);
  /// @brief Helper for displaying debug info in Update.
  //显示权值更新调试信息
  void UpdateDebugInfo(const int param_id);
  
  /// @brief The network name
  string name_;
  /// @brief The phase: TRAIN or TEST
  Phase phase_;
  /// @brief Individual layers in the net(网络中的独立层)
  vector<shared_ptr<Layer<Dtype> > > layers_;
  vector<string> layer_names_; //层名称
  map<string, int> layer_names_index_;  //层名称与索引映射表
  vector<bool> layer_need_backward_;    //标记某个层是否需要BP
  /// @brief the blobs storing intermediate results between the layer.
  vector<shared_ptr<Blob<Dtype> > > blobs_; //层与层中间传递数据的管道
  vector<string> blob_names_;   //Blob名称
  map<string, int> blob_names_index_;   //Blob名称与索引映射表
  vector<bool> blob_need_backward_; //标记某个Blob是否需要BP
  /// bottom_vecs stores the vectors containing the input for each layer.
  /// They don't actually host the blobs (blobs_ does), so we simply store
  /// pointers.
  //bottom_vecs_存放每个层的输入Blob,实际上它并不是这些Blob的所有者(所有者为blobs_),只是存放了指针.
  vector<vector<Blob<Dtype>*> > bottom_vecs_;
  vector<vector<int> > bottom_id_vecs_;
  vector<vector<bool> > bottom_need_backward_;
  /// top_vecs stores the vectors containing the output for each layer
  //top_vecs_存放每个层的输出Blob,实际上它并不是这些Blob的所有者(所有者为blobs_),只是存放了指针.
  vector<vector<Blob<Dtype>*> > top_vecs_;
  vector<vector<int> > top_id_vecs_;
  /// Vector of weight in the loss (or objective) function of each net blob,
  /// indexed by blob_id.
  //每个Blob对全局损失函数的贡献权重
  vector<Dtype> blob_loss_weights_;
  vector<vector<int> > param_id_vecs_;
  vector<int> param_owners_;
  vector<string> param_display_names_;
  vector<pair<int, int> > param_layer_indices_;
  map<string, int> param_names_index_;
  /// blob indices for the input and the output of the net
  //网络输入/输出Blob的索引
  vector<int> net_input_blob_indices_;
  vector<int> net_output_blob_indices_;
  vector<Blob<Dtype>*> net_input_blobs_;
  vector<Blob<Dtype>*> net_output_blobs_;
  /// The parameters in the network.(网络权值)
  vector<shared_ptr<Blob<Dtype> > > params_;
  //可训练的网络权值
  vector<Blob<Dtype>*> learnable_params_;
  /**
   * The mapping from params_ -> learnable_params_: we have
   * learnable_param_ids_.size() == params_.size(),
   * and learnable_params_[learnable_param_ids_[i]] == params_[i].get()
   * if and only if params_[i] is an "owner"; otherwise, params_[i] is a sharer
   * and learnable_params_[learnable_param_ids_[i]] gives its owner.
   */
   //从params_到learnable_params_的映射
   //learnable_param_ids_.size() == params_.size()恒成立;当且仅当params_[i]为所有者时,learnable_params_[learnable_param_ids_[i]] == params_[i].get()成立
   //否则,params_[i]只是一个共享者,learnable_params_[learnable_param_ids_[i]]给出了它的所有者
  vector<int> learnable_param_ids_;
  /// the learning rate multipliers for learnable_params_
  vector<float> params_lr_;
  vector<bool> has_params_lr_;
  /// the weight decay multipliers for learnable_params_(权值衰减因子)
  vector<float> params_weight_decay_;
  vector<bool> has_params_decay_;
  /// The bytes of memory used by this net(记录网络占用的内存大小)
  size_t memory_used_;
  /// Whether to compute and display debug info for the net.(是否显示调试信息)
  bool debug_info_;
  // Callbacks
  vector<Callback*> before_forward_;
  vector<Callback*> after_forward_;
  vector<Callback*> before_backward_;
  vector<Callback*> after_backward_;
  //禁用拷贝构造函数和赋值运算符
  DISABLE_COPY_AND_ASSIGN(Net);
};


}  // namespace caffe

#endif  // CAFFE_NET_HPP_

关于Net头文件的学习就到这里,后续再学习相关的实现代码(net.cpp文件)。
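在进入实现代码之前,先给出一段使用Net公有接口的最小示意代码(假设已经通过训练得到了caffemodel文件,下面的文件名仅为示意,并非确切路径):

#include <caffe/caffe.hpp>
using namespace caffe;

int main() {
  Caffe::set_mode(Caffe::CPU);
  // 从prototxt构建Net对象,指定当前阶段为TEST
  Net<float> net("examples/mnist/lenet_lr.prototxt", TEST);
  // 从二进制ProtoBuffer文件中拷贝已训练好的权值(文件名仅为示意)
  net.CopyTrainedLayersFrom("examples/mnist/lenet_iter_10000.caffemodel");
  // 执行一次前向传播,loss值由参数带回
  float loss = 0;
  net.Forward(&loss);
  // 通过名称查找并取出某个Blob
  if (net.has_blob("ip")) {
    const shared_ptr<Blob<float> > ip = net.blob_by_name("ip");
    LOG(INFO) << "ip blob count: " << ip->count();
  }
  return 0;
}

可以看到,构建Net、载入权值、前向传播、按名称取Blob,正好对应上面头文件中声明的几个公有接口。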

机制和策略

Caffe中的Net/Layer/Blob体现了一种分层设计模式。

在我们生活中普遍存在但又最容易被忽视的两个概念是:机制和策略。

一般来说,对于某一客观事物,机制回答了“它能干啥”这个问题,策略则回答了“它怎么用”这个问题。

回到Caffe源码上,我们发现Blob提供了数据容器的机制;而Layer则通过不同的策略使用该数据容器,实现多元化的计算处理过程,同时又提供了深度学习各种基本算法(卷积、下采样、损失函数计算等)的机制;Net则利用Layer这些机制,组合为完整的深度学习模型,提供了更加丰富的学习策略。后面我们还会看到,Net也是一种机制。

在阅读源码时,要时刻记得当前的目标是想弄清高层策略,还是底层机制。

2017/6/21 posted in  Caffe 数据结构

Caffe中Layer的学习

Layer是Caffe的基本计算单元,至少有一个输入Blob(Bottom Blob)和一个输出Blob(Top Blob),部分Layer带有权值(Weight)和偏置项(Bias)。Layer有两个运算方向:前向传播(Forward)和反向传播(Backward)。前向传播对输入Blob进行某种处理(有权值和偏置项的Layer会利用它们参与计算),得到输出Blob;反向传播则对输出Blob的diff进行某种处理,得到输入Blob的diff(有权值和偏置项的Layer可能还会计算权值Blob、偏置项Blob的diff)。

Layer中的数据结构描述

我们可以在Caffe源码中搜索message LayerParameter的定义,来了解Layer的数据结构描述。
如果一开始不知道它定义在哪个文件中,可以用下面这个命令去搜索:

➜  caffe git:(master) ✗ grep -n -H -R "message LayerParameter" *


这样就能得到它所在的文件路径。

我们发现它定义在src/caffe/proto/caffe.proto中。Caffe使用Google Protocol Buffers(protobuf)定义的数据类型来声明Layer,关于protobuf的更多内容,之后可以再深入研究。
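在深入caffe.proto之前,可以先用下面这个小示意感受一下protobuf文本格式是如何被解析为C++对象的(这里假设使用caffe/util/io.hpp中的ReadProtoFromTextFileOrDie,prototxt路径仅为示意):

#include <iostream>
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"

int main() {
  caffe::NetParameter net_param;
  // 将prototxt文本解析为ProtoBuffer生成的NetParameter对象
  caffe::ReadProtoFromTextFileOrDie("examples/mnist/lenet_lr.prototxt", &net_param);
  // 遍历其中的每个LayerParameter,打印名称和类型
  for (int i = 0; i < net_param.layer_size(); ++i) {
    const caffe::LayerParameter& lp = net_param.layer(i);
    std::cout << lp.name() << " : " << lp.type() << std::endl;
  }
  return 0;
}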

下面回到caffe.proto,看一下LayerParameter相关的源码:

//注意:如果你增加了一个新的LayerParameter域,一定记得更新下一个可用ID
// LayerParameter 下一个layer-specific ID: 147 (last added: recurrent_param)
message LayerParameter {
  optional string name = 1; // the layer name
  optional string type = 2; // the layer type
  repeated string bottom = 3; // 输入Blob(bottom Blob)的名称
  repeated string top = 4; // 输出Blob(Top Blob)的名称

  // 当前计算阶段(TRAIN 或 TEST)
    optional Phase phase = 10;

  // 为每个Top Blob分配对损失函数的权重。每个Layer都有默认值:要么为0,表示不参与损失函数计算;要么为1,表示参与损失函数计算
  repeated float loss_weight = 5;

  // 指定训练参数(例如相对全局学习常数的倍乘因子,以及用于权值共享的名称或其他设置)
  repeated ParamSpec param = 6;

  // 承载了该层数值参数的Blob
  repeated BlobProto blobs = 7;
  //是否对每个Bottom Blob进行反向传播。若不指定,Caffe会自动推断;该字段的个数应为0或与Bottom Blob个数一致
  // Specifies whether to backpropagate to each bottom. If unspecified,
  // Caffe will automatically infer whether each input needs backpropagation
  // to compute parameter gradients. If set to true for some inputs,
  // backpropagation to those inputs is forced; if set false for some inputs,
  // backpropagation to those inputs is skipped.
  //
  // The size must be either 0 or equal to the number of bottoms.
  repeated bool propagate_down = 11;

 //基于当前NetState,控制某个层在某个时刻是否包含在网络中。可以为include或exclude指定若干条规则(两者不可同时指定)。如果没有任何规则,那么该层一直包含在网络中;如果当前NetState满足了任何一条指定规则,那么该层会被相应地包含或排除
  // Rules controlling whether and when a layer is included in the network,
  // based on the current NetState.  You may specify a non-zero number of rules
  // to include OR exclude, but not both.  If no include or exclude rules are
  // specified, the layer is always included.  If the current NetState meets
  // ANY (i.e., one or more) of the specified rules, the layer is
  // included/excluded.
  repeated NetStateRule include = 8;
  repeated NetStateRule exclude = 9;

  // Parameters for data pre-processing.数据预处理参数
  optional TransformationParameter transform_param = 100;

  // Parameters shared by loss layers.所有损失层共享的参数
  optional LossParameter loss_param = 101;
  
  
  //特定类型层的参数。注意一些层实现时可能有多于一种的计算引擎,这些层包含一个引擎类型和引擎参数用于选择实现。默认引擎是在编译阶段由ENGINE开关设置的
  // Layer type-specific parameters.
  //
  // Note: certain layers may have more than one computational engine
  // for their implementation. These layers include an Engine type and
  // engine parameter for selecting the implementation.
  // The default for the engine is set by the ENGINE switch at compile-time.
  optional AccuracyParameter accuracy_param = 102;
  optional ArgMaxParameter argmax_param = 103;
  optional BatchNormParameter batch_norm_param = 139;
  optional BiasParameter bias_param = 141;
  optional ConcatParameter concat_param = 104;
  optional ContrastiveLossParameter contrastive_loss_param = 105;
  optional ConvolutionParameter convolution_param = 106;
  optional CropParameter crop_param = 144;
  optional DataParameter data_param = 107;
  optional DropoutParameter dropout_param = 108;
  optional DummyDataParameter dummy_data_param = 109;
  optional EltwiseParameter eltwise_param = 110;
  optional ELUParameter elu_param = 140;
  optional EmbedParameter embed_param = 137;
  optional ExpParameter exp_param = 111;
  optional FlattenParameter flatten_param = 135;
  optional HDF5DataParameter hdf5_data_param = 112;
  optional HDF5OutputParameter hdf5_output_param = 113;
  optional HingeLossParameter hinge_loss_param = 114;
  optional ImageDataParameter image_data_param = 115;
  optional InfogainLossParameter infogain_loss_param = 116;
  optional InnerProductParameter inner_product_param = 117;
  optional InputParameter input_param = 143;
  optional LogParameter log_param = 134;
  optional LRNParameter lrn_param = 118;
  optional MemoryDataParameter memory_data_param = 119;
  optional MVNParameter mvn_param = 120;
  optional ParameterParameter parameter_param = 145;
  optional PoolingParameter pooling_param = 121;
  optional PowerParameter power_param = 122;
  optional PReLUParameter prelu_param = 131;
  optional PythonParameter python_param = 130;
  optional RecurrentParameter recurrent_param = 146;
  optional ReductionParameter reduction_param = 136;
  optional ReLUParameter relu_param = 123;
  optional ReshapeParameter reshape_param = 133;
  optional ScaleParameter scale_param = 142;
  optional SigmoidParameter sigmoid_param = 124;
  optional SoftmaxParameter softmax_param = 125;
  optional SPPParameter spp_param = 132;
  optional SliceParameter slice_param = 126;
  optional TanHParameter tanh_param = 127;
  optional ThresholdParameter threshold_param = 128;
  optional TileParameter tile_param = 138;
  optional WindowDataParameter window_data_param = 129;
}
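结合上面的定义,下面给出网络描述文件中一个layer的书写示意(取值仅为示意,并非出自某个具体网络),可以对照看出name/type/bottom/top、param(ParamSpec)以及特定类型层参数(这里是convolution_param)分别对应LayerParameter中的哪些域:

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # ParamSpec:权值与偏置项分别指定学习率倍乘因子
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  # 特定类型层参数:ConvolutionParameter
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
  }
}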

Layer是怎么炼成的

Layer头文件位于include/caffe/layer.hpp中,我们来解析一下:

#ifndef CAFFE_LAYER_H_
#define CAFFE_LAYER_H_

#include <algorithm>
#include <string>
#include <vector>

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer_factory.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/math_functions.hpp"

/**
 Forward declare boost::thread instead of including boost/thread.hpp
 to avoid a boost/NVCC issues (#1009, #1010) on OSX.
 */
namespace boost { class mutex; }

namespace caffe {

/**
 * @brief An interface for the units of computation which can be composed into a
 *        Net.
 *
 * Layer%s must implement a Forward function, in which they take their input
 * (bottom) Blob%s (if any) and compute their output Blob%s (if any).
 * They may also implement a Backward function, in which they compute the error
 * gradients with respect to their input Blob%s, given the error gradients with
 * their output Blob%s.
 */
template <typename Dtype>
class Layer {
 public:
  /**
   * You should not implement your own constructor. Any set up code should go
   * to SetUp(), where the dimensions of the bottom blobs are provided to the
   * layer.
   */
   //显式构造函数,从LayerParameter对象中加载配置
  explicit Layer(const LayerParameter& param)
    : layer_param_(param) {
      // Set phase(训练/预测) and copy blobs (if there are any).
      phase_ = param.phase();
      if (layer_param_.blobs_size() > 0) {
        //按 layer_param_设置本身Blob对象个数,并依次将每个Blob对象尺寸调整为与layer_param_中的Blob尺寸一致
        blobs_.resize(layer_param_.blobs_size());
        for (int i = 0; i < layer_param_.blobs_size(); ++i) {
          blobs_[i].reset(new Blob<Dtype>());
          blobs_[i]->FromProto(layer_param_.blobs(i));
        }
      }
    }
    //析构函数
  virtual ~Layer() {}

  /**
   * @brief Implements common layer setup functionality.
   *
   * @param bottom the preshaped input blobs
   * @param top
   *     the allocated but unshaped output blobs, to be shaped by Reshape
   *
   * Checks that the number of bottom and top blobs is correct.
   * Calls LayerSetUp to do special layer setup for individual layer types,
   * followed by Reshape to set up sizes of top blobs and internal buffers.
   * Sets up the loss weight multiplier blobs for any non-zero loss weights.
   * This method may not be overridden.
   */
   
   //配置函数,实现通用的层配置功能,该函数不应被覆盖
  void SetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    CheckBlobCounts(bottom, top);   //检查Blob
    LayerSetUp(bottom, top);        //  与层类型相关的配置过程
    Reshape(bottom, top);       //对Top Blob变形
    SetLossWeights(top);        //设置损失权值因子Blob
  }

  /**
   * @brief Does layer-specific setup: your layer should implement this function
   *        as well as Reshape.
   *
   * @param bottom
   *     the preshaped input blobs, whose data fields store the input data for
   *     this layer
   * @param top
   *     the allocated but unshaped output blobs
   *
   * This method should do one-time layer specific setup. This includes reading
   * and processing relevent parameters from the <code>layer_param_</code>.
   * Setting up the shapes of top blobs and internal buffers should be done in
   * <code>Reshape</code>, which will be called before the forward pass to
   * adjust the top blob sizes.
   */
   
   //层配置(虚)函数,做特定类型层相关的配置,由该类型层自己实现
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {}
   
   //变形(纯虚)函数,修改Top Blob以及内部Blob缓冲区的形状
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) = 0;

 
   //前向传播函数,给定Bottom Blob,计算Top Blob和loss,返回值为当前层loss
   //该函数会调用相应的设备包装函数,如Forward_cpu或Forward_gpu来实现真正的计算过程。如果该层有任意非零的loss_weight参数,那么包装函数会计算并返回loss
   //派生类应该实现Forward_cpu和Forward_gpu(后者可选)
  inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  //反向传播函数,给定Top Blob误差梯度,计算Bottom Blob误差梯度
  //参数说明:
  // top -- Top Blob,其diff域包含来自上一层的误差梯度
  // propagate_down -- 多路开关,与Bottom Blob向量维度相同,每个值表示是否将误差梯度传递到对应的Bottom Blob
  // bottom -- Bottom Blob,其diff域需要由该函数计算得到
  // 该函数会调用相应的设备包装函数,如Backward_cpu或Backward_gpu来实现真正的计算过程,由派生类负责实现
  inline void Backward(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom);

  //返回Layer内部可训练的权值、偏置项Blob向量
  vector<shared_ptr<Blob<Dtype> > >& blobs() {
    return blobs_;
  }

  //返回Layer初始化参数(由ProtoBuffer提供)
  const LayerParameter& layer_param() const { return layer_param_; }

  //将Layer初始化参数写入ProtoBuffer缓冲区
  virtual void ToProto(LayerParameter* param, bool write_diff = false);

  //返回与某个Top Blob相关的标量loss值
  inline Dtype loss(const int top_index) const {
    return (loss_.size() > top_index) ? loss_[top_index] : Dtype(0);
  }

  //设置与某个Top Blob相关的标量loss值
  inline void set_loss(const int top_index, const Dtype value) {
    if (loss_.size() <= top_index) {
      loss_.resize(top_index + 1, Dtype(0));
    }
    loss_[top_index] = value;
  }

  //返回层类型字符串,便于识别,由派生类负责实现
  virtual inline const char* type() const { return ""; }

 //返回该Layer需要的输入Blob数目,-1表示不关心。由派生类负责实现
  virtual inline int ExactNumBottomBlobs() const { return -1; }

  virtual inline int MinBottomBlobs() const { return -1; }
  
  virtual inline int MaxBottomBlobs() const { return -1; }
  //返回该Layer需要的输出Blob数目,-1表示不关心。由派生类负责实现
  virtual inline int ExactNumTopBlobs() const { return -1; }
 
  virtual inline int MinTopBlobs() const { return -1; }
  
  virtual inline int MaxTopBlobs() const { return -1; }
  
  //返回该Layer是否要求Bottom Blob与Top Blob数目相等,由派生类负责实现
  virtual inline bool EqualNumBottomTopBlobs() const { return false; }

  //返回是否允许匿名Top Blob,即由该Layer自动创建。若为真,Net::Init()函数中会创建足够多的匿名Top Blob来满足该Layer的ExactNumTopBlobs()或MinTopBlobs()需求
  virtual inline bool AutoTopBlobs() const { return false; }

  //返回某个Bottom Blob是否允许强制反向传播。如果AllowForceBackward(i) == false,将会忽略force_backward设定
  virtual inline bool AllowForceBackward(const int bottom_index) const {
    return true;
  }

  //返回该Layer是否计算相对权值或偏置项的梯度,具体相对谁由param_id指定
  inline bool param_propagate_down(const int param_id) {
    return (param_propagate_down_.size() > param_id) ?
        param_propagate_down_[param_id] : false;
  }
  
  //设置该Layer是否计算相对权值或偏置项的梯度,具体相对谁由param_id指定
  inline void set_param_propagate_down(const int param_id, const bool value) {
    if (param_propagate_down_.size() <= param_id) {
      param_propagate_down_.resize(param_id + 1, true);
    }
    param_propagate_down_[param_id] = value;
  }


 protected:
  /** The protobuf that stores the layer parameters */
  LayerParameter layer_param_;
  /** 当前所处阶段: TRAIN or TEST */
  Phase phase_;
  /** The vector that stores the learnable parameters as a set of blobs. */
  //Layer 内部权值或偏置项,以 Blob 方式组织
  vector<shared_ptr<Blob<Dtype> > > blobs_;
  /** Vector indicating whether to compute the diff of each param blob. */
  //标志位,是否计算对应参数的误差梯度
  vector<bool> param_propagate_down_;

  //记录每个Top Blob在目标函数中的loss权重,非零表示该Top Blob参与loss计算
  vector<Dtype> loss_;

//下面4个函数,我们会在各个Layer派生类中经常看到

  /** @brief Using the CPU device, compute the layer output. */
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) = 0;
  /**
   * @brief Using the GPU device, compute the layer output.
   *        Fall back to Forward_cpu() if unavailable.
   */
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // LOG(WARNING) << "Using CPU code as backup.";
    return Forward_cpu(bottom, top);
  }

  /**
   * @brief Using the CPU device, compute the gradients for any parameters and
   *        for the bottom blobs if propagate_down is true.
   */
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) = 0;
  /**
   * @brief Using the GPU device, compute the gradients for any parameters and
   *        for the bottom blobs if propagate_down is true.
   *        Fall back to Backward_cpu() if unavailable.
   */
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    // LOG(WARNING) << "Using CPU code as backup.";
    Backward_cpu(top, propagate_down, bottom);
  }

  /**
   * Called by the parent Layer's SetUp to check that the number of bottom
   * and top Blobs provided as input match the expected numbers specified by
   * the {ExactNum,Min,Max}{Bottom,Top}Blobs() functions.
   */
   //校验输入/输出Blob数目是否满足Layer要求
  virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
                               const vector<Blob<Dtype>*>& top) {
    if (ExactNumBottomBlobs() >= 0) {
      CHECK_EQ(ExactNumBottomBlobs(), bottom.size())
          << type() << " Layer takes " << ExactNumBottomBlobs()
          << " bottom blob(s) as input.";
    }
    if (MinBottomBlobs() >= 0) {
      CHECK_LE(MinBottomBlobs(), bottom.size())
          << type() << " Layer takes at least " << MinBottomBlobs()
          << " bottom blob(s) as input.";
    }
    if (MaxBottomBlobs() >= 0) {
      CHECK_GE(MaxBottomBlobs(), bottom.size())
          << type() << " Layer takes at most " << MaxBottomBlobs()
          << " bottom blob(s) as input.";
    }
    if (ExactNumTopBlobs() >= 0) {
      CHECK_EQ(ExactNumTopBlobs(), top.size())
          << type() << " Layer produces " << ExactNumTopBlobs()
          << " top blob(s) as output.";
    }
    if (MinTopBlobs() >= 0) {
      CHECK_LE(MinTopBlobs(), top.size())
          << type() << " Layer produces at least " << MinTopBlobs()
          << " top blob(s) as output.";
    }
    if (MaxTopBlobs() >= 0) {
      CHECK_GE(MaxTopBlobs(), top.size())
          << type() << " Layer produces at most " << MaxTopBlobs()
          << " top blob(s) as output.";
    }
    if (EqualNumBottomTopBlobs()) {
      CHECK_EQ(bottom.size(), top.size())
          << type() << " Layer produces one top blob as output for each "
          << "bottom blob input.";
    }
  }

  /**
   * Called by SetUp to initialize the weights associated with any top blobs in
   * the loss function. Store non-zero loss weights in the diff blob.
   */
   //该函数在Layer的SetUp函数中被调用,主要目的是初始化与Top Blob相关的loss权重,放到Top Blob的diff域,实际的loss由Forward()计算
   //loss_weight == 0,表示当前层不参与loss函数计算,大部分Layer属于这一类
   //loss_weight == 1,表示当前层参与loss函数计算,损失层(LossLayer)属于这一类
  inline void SetLossWeights(const vector<Blob<Dtype>*>& top) {
  //从ProtoBuffer对象中获得Layer参数,这里需要用loss_weight参数
    const int num_loss_weights = layer_param_.loss_weight_size();
    //如果 ProtoBuffer中存在至少一个loss_weight参数,loss_weight参数个数应当与Top Blob数目相同,或者不要loss_weight参数
    if (num_loss_weights) {
      CHECK_EQ(top.size(), num_loss_weights) << "loss_weight must be "
          "unspecified or specified once per top blob.";
    //遍历每个Top Blob
      for (int top_id = 0; top_id < top.size(); ++top_id) {
      // 从 ProtoBuffer 对象拿到 loss_weight 实际值(0 或者1)
        const Dtype loss_weight = layer_param_.loss_weight(top_id);
        //若为0,跳过
        if (loss_weight == Dtype(0)) { continue; }
        //若不为0,则对网络进行相关设置
        this->set_loss(top_id, loss_weight);    //本地记录loss_weight值
        const int count = top[top_id]->count();
        Dtype* loss_multiplier = top[top_id]->mutable_cpu_diff();
        //将loss_weight值写入Top Blob的diff域,传递到其他需要使用的地方
        caffe_set(count, loss_weight, loss_multiplier);
      }
    }
  }

 private:
 //禁用拷贝构造函数和赋值运算符
  DISABLE_COPY_AND_ASSIGN(Layer);
};  // class Layer

// Forward and backward wrappers. You should implement the cpu and
// gpu specific implementations instead, and should not change these
// functions.
//使用时只需在派生类中改写 Forward_cpu、Forward_gpu、Backward_cpu、Backward_gpu
template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  Dtype loss = 0;
  Reshape(bottom, top);
  switch (Caffe::mode()) {      //判断计算设备
  case Caffe::CPU:      //在CPU上执行Forward计算
    Forward_cpu(bottom, top);   //调用CPU版本的 Forward函数
    //还没完,要计算loss (如果有的话)
    for (int top_id = 0; top_id < top.size(); ++top_id) {
      if (!this->loss(top_id)) { continue; }
      const int count = top[top_id]->count();
      // 若为 LossLayer,则已经通过Forward函数计算出全局损失函数,放在Top Blob data域
      const Dtype* data = top[top_id]->cpu_data();
      // 若loss_weight不为0,则己经在SetLossWeights函数中将loss权重放在Top Blob diff域
      const Dtype* loss_weights = top[top_id]->cpu_diff();
      // 计算加权后的loss之和,得到标量loss值
      loss += caffe_cpu_dot(count, data, loss_weights);
    }
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
#ifndef CPU_ONLY
    for (int top_id = 0; top_id < top.size(); ++top_id) {
      if (!this->loss(top_id)) { continue; }
      const int count = top[top_id]->count();
      const Dtype* data = top[top_id]->gpu_data();
      const Dtype* loss_weights = top[top_id]->gpu_diff();
      Dtype blob_loss = 0;
      caffe_gpu_dot(count, data, loss_weights, &blob_loss);
      loss += blob_loss;
    }
#endif
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
  return loss;
}
//反向传播函数,直接调用对应设备函数
template <typename Dtype>
inline void Layer<Dtype>::Backward(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Backward_cpu(top, propagate_down, bottom);
    break;
  case Caffe::GPU:
    Backward_gpu(top, propagate_down, bottom);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}

//将层配置参数序列化为ProtoBuffer
template <typename Dtype>
void Layer<Dtype>::ToProto(LayerParameter* param, bool write_diff) {
  param->Clear();
  param->CopyFrom(layer_param_);
  param->clear_blobs();
  for (int i = 0; i < blobs_.size(); ++i) { //权值和偏置项也会保存
    blobs_[i]->ToProto(param->add_blobs(), write_diff);
  }
}

}  // namespace caffe

#endif  // CAFFE_LAYER_H_

Layer源文件位于src/caffe/layer.cpp中:

#include "caffe/layer.hpp"

namespace caffe {

INSTANTIATE_CLASS(Layer);

}  // namespace caffe
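其中INSTANTIATE_CLASS宏定义在include/caffe/common.hpp中,用于对模板类做float和double两种类型的显式实例化,其展开大致如下(示意):

#define INSTANTIATE_CLASS(classname) \
  char gInstantiationGuard##classname; \
  template class classname<float>; \
  template class classname<double>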

可见Layer的大部分函数并没有在基类中实现,只有虚函数声明,真正的实现都在派生类中。具体代码可以进一步阅读 src/caffe/layers/*.cpp
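为了更直观地理解派生类需要实现什么,下面给出一个最简派生层的骨架示意(假设的IdentityLayer,并非Caffe自带的层,只实现了必需的纯虚函数,将Bottom Blob原样拷贝到Top Blob):

#include "caffe/layer.hpp"

namespace caffe {

// 假设的IdentityLayer:把Bottom Blob原样拷贝到Top Blob,仅用于示意
template <typename Dtype>
class IdentityLayer : public Layer<Dtype> {
 public:
  explicit IdentityLayer(const LayerParameter& param) : Layer<Dtype>(param) {}
  virtual inline const char* type() const { return "Identity"; }
  virtual inline int ExactNumBottomBlobs() const { return 1; }
  virtual inline int ExactNumTopBlobs() const { return 1; }

 protected:
  // Top Blob形状与Bottom Blob保持一致
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);
  }
  // 前向传播:直接拷贝data域
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    caffe_copy(bottom[0]->count(), bottom[0]->cpu_data(),
               top[0]->mutable_cpu_data());
  }
  // 反向传播:直接拷贝diff域
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    if (propagate_down[0]) {
      caffe_copy(top[0]->count(), top[0]->cpu_diff(),
                 bottom[0]->mutable_cpu_diff());
    }
  }
};

}  // namespace caffe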

在使用Layer之前,需要先包含头文件#include <caffe/layer.hpp>,再通过using namespace caffe;使用命名空间caffe。如果代码中试图直接创建Layer对象,编译时会报错:

error: cannot declare variable 'a' to be of abstract type 'caffe::Layer<float>'

这是因为Layer是一个抽象类(含有纯虚函数),不能直接实例化对象。关于抽象类和纯虚函数,这里不再过多说明。
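如果确实需要在自己的代码中创建具体的Layer对象,通常借助layer_factory.hpp中的工厂类LayerRegistry,根据type字符串实例化对应的派生类,示意如下:

#include <caffe/layer.hpp>
#include <caffe/layer_factory.hpp>
using namespace caffe;

int main() {
  LayerParameter param;
  param.set_name("relu1");
  param.set_type("ReLU");   // 由工厂根据type字符串创建具体派生类
  shared_ptr<Layer<float> > layer = LayerRegistry<float>::CreateLayer(param);
  LOG(INFO) << "created layer type: " << layer->type();
  return 0;
}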

2017/6/15 posted in  Caffe 数据结构