Converting a Caffe Model with Model Optimizer

Intel

February 28, 2019

CPOL

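How to convert a trained Caffe model with the Model Optimizer, using both framework-agnostic and Caffe-specific command-line options

Introduction

The Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices.

The Model Optimizer process assumes you have a network model trained using a supported framework. The scheme below illustrates the typical workflow for deploying a trained deep learning model.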

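A summary of the steps for optimizing and deploying a model that was trained with Caffe*:

1. Configure the Model Optimizer for Caffe* (the framework that was used to train your model).
2. Convert the Caffe* model to produce an optimized Intermediate Representation (IR) of the model, based on the trained network topology, weights, and bias values.
3. Test the model in the Intermediate Representation format using the Inference Engine in the target environment, via the provided Inference Engine validation application or sample applications.
4. Integrate the Inference Engine into your application to deploy the model in the target environment.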

Model Optimizer Workflow

The Model Optimizer process assumes you have a network model that was trained with the Caffe* framework. The workflow is:

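1. Configure the Model Optimizer for the Caffe* framework by running the configuration bash script (Linux* OS) or batch file (Windows* OS) from the <INSTALL_DIR>/deployment_tools/model_optimizer/install_prerequisites directory:
- For Linux* OS:
```
install_prerequisites_caffe.sh
```
- For Windows* OS:
```
install_prerequisites_caffe.bat
```
For more details, see Configuring the Model Optimizer.
2. Provide as input a trained model that contains a specific topology, described in the .prototxt file, and the adjusted weights and biases, described in the .caffemodel file.
3. Convert the Caffe* model to an optimized Intermediate Representation.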
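
The choice of configuration script in step 1 can be sketched in Python. This is only an illustration: the function name `prerequisites_script` is hypothetical, and the install directory you pass in is whatever <INSTALL_DIR> is on your machine. The directory layout and script names are taken from the text above.

```python
import platform
from pathlib import Path


def prerequisites_script(install_dir, system=None):
    """Return the path of the Caffe prerequisites script for the given OS.

    `system` defaults to platform.system(); pass "Windows" or "Linux"
    explicitly to see what would be selected on another OS.
    """
    system = system or platform.system()
    script_dir = (Path(install_dir) / "deployment_tools"
                  / "model_optimizer" / "install_prerequisites")
    # Windows uses the batch file; the bash script is assumed otherwise.
    if system == "Windows":
        return script_dir / "install_prerequisites_caffe.bat"
    return script_dir / "install_prerequisites_caffe.sh"
```

The function only builds the path; running the script is left to the user.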

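The Model Optimizer produces an Intermediate Representation (IR) of the network, which can be read, loaded, and inferred with the Inference Engine. The Inference Engine API offers a unified API across a number of supported Intel® platforms. The Intermediate Representation is a pair of files that describe the whole model:

- .xml: Describes the network topology
- .bin: Contains the weights and biases binary data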
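
Because the IR is always a pair of files, a quick check after conversion can confirm that both were written. The sketch below assumes the files are named after the model name inside the output directory, as the --model_name and --output_dir descriptions later in this article suggest. The helper names `ir_files` and `ir_is_complete` are hypothetical.

```python
from pathlib import Path


def ir_files(output_dir, model_name):
    """Return the expected (.xml, .bin) paths for a generated IR."""
    out = Path(output_dir)
    # Append the extensions explicitly so dots in model_name are preserved.
    return out / (model_name + ".xml"), out / (model_name + ".bin")


def ir_is_complete(output_dir, model_name):
    """Return True only when both files of the IR pair exist."""
    xml_path, bin_path = ir_files(output_dir, model_name)
    return xml_path.is_file() and bin_path.is_file()
```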

Supported Topologies

Classification models:
- AlexNet
- VGG-16, VGG-19
- SqueezeNet v1.0, SqueezeNet v1.1
- ResNet-50, ResNet-101, ResNet-152
- Inception v1, Inception v2, Inception v3, Inception v4
- CaffeNet
- MobileNet
- Squeeze-and-Excitation Networks: SE-BN-Inception, SE-ResNet-101, SE-ResNet-152, SE-ResNet-50, SE-ResNeXt-101, SE-ResNeXt-50
- ShuffleNet v2
Object detection models:
- SSD300-VGG16, SSD500-VGG16
- Faster-RCNN
Face detection models:
- VGG Face
Semantic segmentation models:
- FCN8

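Note: Converting most Caffe* models with the Model Optimizer requires mean and scale values. The exact values must be determined individually for each model. For example, for Caffe* models trained on ImageNet, the mean values are usually 123.68, 116.779, and 103.939 for the red, green, and blue channels, respectively. The scale value is usually 127.5. For information on how to specify mean and scale values, see Using Framework-Agnostic Conversion Parameters.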

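Convert a Caffe* Model

To convert a Caffe* model:

1. Go to the <INSTALL_DIR>/deployment_tools/model_optimizer directory.
2. Use the mo.py script to convert the model. In the simplest case, the path to the input .caffemodel file is all you need to provide: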
```
python3 mo.py --input_model <INPUT_MODEL>.caffemodel
```
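
When conversions are scripted, it can help to assemble the mo.py arguments in Python before running them. The following is a minimal sketch and not part of the Model Optimizer: `build_mo_command` is a hypothetical helper that only builds an argument list, and it assumes mo.py is in the current working directory.

```python
import sys


def build_mo_command(input_model, **options):
    """Build the argument list for mo.py without running it.

    Keyword names map directly to long options, for example
    log_level="DEBUG" becomes "--log_level DEBUG". A value of True
    adds a bare switch; None or False skips the option.
    """
    cmd = [sys.executable, "mo.py", "--input_model", str(input_model)]
    for name, value in options.items():
        flag = "--" + name
        if value is True:
            cmd.append(flag)
        elif value is not None and value is not False:
            cmd.extend([flag, str(value)])
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the model_optimizer directory. Passing a list rather than a shell string avoids quoting problems with brackets and parentheses in shape arguments.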

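Two groups of parameters are available to convert your model:

- Framework-agnostic parameters: Used to convert a model trained with any supported framework.
- Caffe-specific parameters: Used to convert only Caffe* models.

Using Framework-Agnostic Conversion Parameters

To adjust the conversion process, you can use the general (framework-agnostic) parameters: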

```
optional arguments:
  -h, --help            show this help message and exit
  --framework {tf,caffe,mxnet,kaldi,onnx}
                        Name of the framework used to train the input model.

Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
                        Tensorflow*: a file with a pre-trained model (binary
                        or text .pb file after freezing). Caffe*: a model
                        proto file with model weights
  --model_name MODEL_NAME, -n MODEL_NAME
                        Model_name parameter passed to the final create_ir
                        transform. This parameter is used to name a network in
                        a generated IR and output .xml/.bin files.
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory that stores the generated IR. By default, it
                        is the directory from where the Model Optimizer is
                        launched.
  --input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.
  --scale SCALE, -s SCALE
                        All input values coming from original network inputs
                        will be divided by this value. When a list of inputs
                        is overridden by the --input parameter, this scale is
                        not applied for any input that does not match with the
                        original input of the model.
  --reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if a number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT         The name of the input operation of the given model.
                        Usually this is a name of the input placeholder of the
                        model.
  --output OUTPUT       The name of the output operation of the model. For
                        TensorFlow*, do not add :0 to this name.
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --data_type {FP16,FP32,half,float}
                        Data type for all intermediate tensors and weights. If
                        original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are quantized
                        to FP16.
  --disable_fusing      Turn off fusing of linear operations to Convolution
  --disable_resnet_optimization
                        Turn off resnet optimization
  --finegrain_fusing FINEGRAIN_FUSING
                        Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
  --disable_gfusing     Turn off fusing of grouped convolutions
  --move_to_preprocess  Move mean values to IR preprocess section
  --extensions EXTENSIONS
                        Directory or a comma separated list of directories
                        with extensions. To disable all extensions including
                        those that are placed at the default location, pass an
                        empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that
                        correspond to log level equals ERROR, that can be set
                        with the following option: --log_level. By default,
                        log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided
                        value, e.g.: "node_name->True"
  --generate_deprecated_IR_V2
                        Force to generate legacy/deprecated IR V2 to work with
                        previous versions of the Inference Engine. The
                        resulting IR may or may not be correctly loaded by
                        Inference Engine API (including the most recent and
                        old versions of Inference Engine) and provided as a
                        partially-validated backup option for specific
                        deployment scenarios. Use it at your own discretion.
                        By default, without this option, the Model Optimizer
                        generates IR V3.
```

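Note: The Model Optimizer does not reverse the input channels from RGB to BGR by default, as it did in the 2017 R3 Beta release. To perform the reversal, specify the --reverse_input_channels command-line parameter manually. For details, see the When to Reverse Input Channels section.

The sections below provide details on using particular parameters, together with examples of CLI commands.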

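When to Specify Mean and Scale Values

Neural network models are usually trained with normalized input data. This means that the input data values are converted to a specific range, for example [0, 1] or [-1, 1]. Sometimes the mean values (mean images) are subtracted from the input data as part of the pre-processing. Input data pre-processing is implemented in one of two ways:

1. The input pre-processing operations are a part of the topology. In this case, the application that uses the framework to infer the topology does not pre-process the input.
2. The input pre-processing operations are not a part of the topology, and the pre-processing is performed in the application that feeds the model with input data.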

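In the first case, the Model Optimizer generates an IR with the required pre-processing layers, and the Inference Engine samples can be used to infer the model.

In the second case, information about the mean and scale values should be provided to the Model Optimizer so that it can embed them into the generated IR. The Model Optimizer provides several command-line parameters for this purpose: --scale, --scale_values, --mean_values, and --mean_file.

If both mean and scale values are specified, the mean is subtracted first and the scale is applied afterwards. Input values are divided by the scale value(s).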
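
The order of operations described above (subtract the mean, then divide by the scale) can be written out as a small worked example. This is only an illustration of the arithmetic for a single pixel. The function name `preprocess_pixel` and the sample values are hypothetical and not tied to any particular model.

```python
def preprocess_pixel(pixel, mean_values=(0.0, 0.0, 0.0),
                     scale_values=(1.0, 1.0, 1.0)):
    """Subtract the per-channel mean, then divide by the per-channel scale."""
    return tuple((value - mean) / scale
                 for value, mean, scale in zip(pixel, mean_values, scale_values))


# Illustrative values only: a mean and scale of 127.5 map [0, 255] to [-1, 1].
normalized = preprocess_pixel((255.0, 127.5, 0.0),
                              mean_values=(127.5, 127.5, 127.5),
                              scale_values=(127.5, 127.5, 127.5))
print(normalized)  # (1.0, 0.0, -1.0)
```

Swapping the order (dividing first, then subtracting) would give different results, which is why the mean values must be expressed in the units of the original, unscaled input.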

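There is no universal recipe for determining the mean and scale values for a particular model. The following steps can help:

1. Read the model documentation. If pre-processing is required, the documentation usually describes the mean and scale values.
2. Open the example script or application that executes the model, and trace how the input data is read and passed to the framework.
3. Open the model in a visualization tool and check for layers that perform subtraction or multiplication on the input data (such as Sub, Mul, ScaleShift, or Eltwise). If such layers exist, the pre-processing is most probably a part of the model.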

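When to Specify Input Shapes

In some cases the input data shape of the model is not fixed, for example for fully convolutional neural networks. TensorFlow* models, for instance, then contain -1 values in the shape attribute of the Placeholder operation. The Inference Engine does not support input layers with undefined size, so if the input shapes are not defined in the model, the Model Optimizer fails to convert it.

The solution is to provide the input shapes for all inputs of the model with the --input_shape command-line parameter or, if the model has a single input and only the batch size is undefined, to provide the batch size with the -b command-line parameter. In the latter case, the Placeholder shape of a TensorFlow* model looks like this: [-1, 224, 224, 3].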
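
The effect of supplying a batch size for a shape such as [-1, 224, 224, 3] can be sketched as follows. This is not the Model Optimizer's own logic: the helper names are hypothetical, and the sketch assumes that the first dimension is the batch dimension.

```python
def resolve_input_shape(shape, batch=None):
    """Fill in an undefined batch dimension and reject other undefined ones."""
    dims = list(shape)
    # Assumption: the batch size is the first dimension of the shape.
    if batch is not None and dims and dims[0] == -1:
        dims[0] = batch
    if any(d is None or d < 0 for d in dims):
        raise ValueError("undefined dimension in shape {}".format(dims))
    return dims


def format_shape(dims):
    """Format a shape in the bracketed form that --input_shape accepts."""
    return "[" + ",".join(str(d) for d in dims) + "]"
```

For example, `format_shape(resolve_input_shape([-1, 224, 224, 3], batch=8))` gives `[8,224,224,3]`, while a shape with any other undefined dimension raises an error and must be given in full through --input_shape.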

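When to Reverse Input Channels

The Inference Engine samples load input images in the BGR channel order. However, the model may have been trained on images loaded in the RGB channel order. In this case, inference results obtained with the Inference Engine samples are incorrect. The solution is to provide the --reverse_input_channels command-line parameter. The Model Optimizer then modifies the weights of the first convolution (or other channel-dependent operation) so that the output is the same as if the image had been passed in the RGB channel order.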
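
Why modifying the weights is sufficient can be seen on a toy case. The sketch below reduces the first convolution to a 1x1 kernel with one output channel, so it becomes a dot product over the three input channels. All values are hypothetical, and the real tool operates on full weight tensors.

```python
def conv1x1(pixel, weights):
    """A 1x1 convolution with one output channel: a dot product over channels."""
    return sum(p * w for p, w in zip(pixel, weights))


rgb_pixel = (10.0, 20.0, 30.0)            # channel order used during training
bgr_pixel = tuple(reversed(rgb_pixel))    # same pixel as loaded in BGR order
rgb_weights = (0.5, -1.0, 2.0)            # hypothetical weights trained on RGB

expected = conv1x1(rgb_pixel, rgb_weights)                 # 45.0
wrong = conv1x1(bgr_pixel, rgb_weights)                    # 15.0
fixed = conv1x1(bgr_pixel, tuple(reversed(rgb_weights)))   # 45.0

print(expected, wrong, fixed)
```

Reversing the weights along the input-channel axis restores the original result for BGR input, so the application does not need to reorder the image data.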

Command-Line Interface (CLI) Examples Using Framework-Agnostic Parameters

Launching the Model Optimizer for the Caffe bvlc_alexnet model with the debug log level:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --log_level DEBUG
```
Launching the Model Optimizer for the Caffe bvlc_alexnet model with the output IR named result.* and saved in the specified output_dir:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --model_name result --output_dir /../../models/
```
Launching the Model Optimizer for the Caffe bvlc_alexnet model with one input and scale values:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59]
```

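Launching the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs and scale values:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5]
```
Launching the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs, where the scale and mean values are specified for particular nodes:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5]
```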

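Launching the Model Optimizer for the Caffe bvlc_alexnet model with a specified input layer, an overridden input shape, a scale of 5, a batch size of 8, and a specified output operation name:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --input data --input_shape [1,3,224,224] --output pool5 -s 5 -b 8
```
Launching the Model Optimizer for the Caffe bvlc_alexnet model with fusing disabled for linear operations to Convolution and for grouped convolutions:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --disable_fusing --disable_gfusing
```
Launching the Model Optimizer for the Caffe bvlc_alexnet model with the input channel order reversed between RGB and BGR, per-channel mean values specified for the input image, and the data type specified for input tensor values:
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16
```
Launching the Model Optimizer for the Caffe bvlc_alexnet model with extensions listed in the specified directories and a specified mean image binaryproto file. For more information about extensions, see the Extending the Model Optimizer with New Primitives page.
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto
```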
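Launching the Model Optimizer for the Caffe bvlc_alexnet model with a value passed for a placeholder tensor. The placeholder is replaced with a constant layer that contains the passed value.
Here the tensor is written in square brackets, with the values separated by spaces. If the data type is set in the model, the tensor is reshaped to the placeholder shape and cast to the placeholder data type. Otherwise, it is cast to the data type passed to the --data_type parameter (FP32 by default).
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --freeze_placeholder_with_value "<placeholder_layer_name>->[0.1 1.2 2.3]"
```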
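
The per-input form of --mean_values and --scale_values used in the examples above (an input name followed by a bracketed list) is easy to get wrong by hand when a model has several inputs. The following hypothetical helper, `format_channel_values`, sketches how such a string could be generated. It is not part of the Model Optimizer.

```python
def format_channel_values(values_by_input):
    """Format per-input channel values for --mean_values or --scale_values.

    {"data": [59, 59, 59], "rois": [5, 5, 5]} becomes
    "data[59,59,59],rois[5,5,5]".
    """
    parts = []
    for name, values in values_by_input.items():
        joined = ",".join(str(v) for v in values)
        parts.append("{}[{}]".format(name, joined))
    return ",".join(parts)
```

The dictionary order determines the order of the entries in the output string.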

Using Caffe*-Specific Conversion Parameters

The following list provides the Caffe*-specific parameters.

```
Caffe-specific parameters:
  --input_proto INPUT_PROTO, -d INPUT_PROTO
                        Deploy-ready prototxt file that contains a topology
                        structure and layer attributes
  -k K                  Path to CustomLayersMapping.xml to register custom
                        layers
  --mean_file MEAN_FILE, -mf MEAN_FILE
                        Mean image to be used for the input. Should be a
                        binaryproto file
  --mean_file_offsets MEAN_FILE_OFFSETS, -mo MEAN_FILE_OFFSETS
                        Mean image offsets to be used for the input
                        binaryproto file. When the mean image is bigger than
                        the expected input, it is cropped. By default, centers
                        of the input image and the mean image are the same and
                        the mean image is cropped by dimensions of the input
                        image. The format to pass this option is the
                        following: "-mo (x,y)". In this case, the mean file is
                        cropped by dimensions of the input image with offset
                        (x,y) from the upper left corner of the mean image
  --disable_omitting_optional
                        Disable omitting optional attributes to be used for
                        custom layers. Use this option if you want to transfer
                        all attributes of a custom layer to IR. Default
                        behavior is to transfer the attributes with default
                        values and the attributes defined by the user to IR.
  --enable_flattening_nested_params
                        Enable flattening optional params to be used for
                        custom layers. Use this option if you want to transfer
                        attributes of a custom layer to IR with flattened
                        nested parameters. Default behavior is to transfer the
                        attributes without flattening nested parameters.
```
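
The default cropping described for --mean_file_offsets (the centers of the mean image and the input image coincide) corresponds to a specific upper-left offset. The sketch below computes it under the assumption that odd differences are rounded down. The actual rounding used by the Model Optimizer is not stated here, and `default_mean_offset` is a hypothetical name.

```python
def default_mean_offset(mean_size, input_size):
    """Return the (x, y) upper-left corner of a centered crop.

    Both sizes are (width, height) tuples. The crop has the size of the
    input image and is centered inside the mean image.
    """
    mean_w, mean_h = mean_size
    in_w, in_h = input_size
    if in_w > mean_w or in_h > mean_h:
        raise ValueError("the mean image is smaller than the input image")
    # Assumption: integer division, so odd differences round down.
    return ((mean_w - in_w) // 2, (mean_h - in_h) // 2)


print(default_mean_offset((256, 256), (227, 227)))  # (14, 14)
```

Passing a different pair through "-mo (x,y)" moves the crop window away from the center.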

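Command-Line Interface (CLI) Examples Using Caffe*-Specific Parameters

Launching the Model Optimizer for bvlc_alexnet.caffemodel with a specified prototxt file. This is needed when the names of the Caffe* model and the .prototxt file differ, or when the files are placed in different directories. Otherwise, it is enough to provide only the path to the input model.caffemodel file.
```
python3 mo.py --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt
```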
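
The rule above (a .prototxt file with the same name, next to the .caffemodel file, is picked up automatically) can be checked before launching a conversion. A minimal sketch, assuming the rule is exactly "same directory, same base name"; `needs_input_proto` is a hypothetical helper.

```python
from pathlib import Path


def needs_input_proto(caffemodel_path):
    """Return True when no same-named .prototxt sits next to the model file."""
    model = Path(caffemodel_path)
    # with_suffix swaps only the final extension: x.caffemodel -> x.prototxt
    return not model.with_suffix(".prototxt").is_file()
```

When the function returns True, pass the topology file explicitly with --input_proto.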
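Launching the Model Optimizer for bvlc_alexnet.caffemodel with a specified CustomLayersMapping file. This is the legacy method of quickly enabling model conversion when the model contains custom layers. It requires a system installation of Caffe* on the computer. To learn more, see Legacy Mode for Caffe* Custom Layers.
Optional parameters that have no default values and are not specified in the .prototxt file are removed from the Intermediate Representation, and nested parameters are flattened.
```
python3 mo.py --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params
```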
This example shows a multi-input model with the input layers data and rois:
```
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape { dim: 1 dim: 3 dim: 224 dim: 224 }
  }
}
layer {
  name: "rois"
  type: "Input"
  top: "rois"
  input_param {
    shape { dim: 1 dim: 5 dim: 1 dim: 1 }
  }
}
```
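Launching the Model Optimizer for a multi-input model with two inputs, providing a new shape for each input in the order in which the inputs are passed to the Model Optimizer. In particular, the shape for data is set to 1,3,227,227 and the shape for rois is set to 1,6,1,1. The shape list is quoted because unquoted parentheses are a syntax error in bash.
```
python3 mo.py --input_model /path-to/your-model.caffemodel --input data,rois --input_shape "(1,3,227,227),[1,6,1,1]"
```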
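
Since the shapes must follow the order of the names given to --input, generating both arguments from one mapping keeps them in sync. The helper below is a hypothetical sketch and not part of the Model Optimizer. It emits the bracketed shape form, which the parameter description lists alongside parentheses.

```python
def input_shape_args(shapes_by_input):
    """Return matching --input and --input_shape arguments.

    The input names and their shapes are emitted in the same order,
    taken from the insertion order of the dictionary.
    """
    names = ",".join(shapes_by_input)
    shapes = ",".join(
        "[" + ",".join(str(d) for d in dims) + "]"
        for dims in shapes_by_input.values()
    )
    return ["--input", names, "--input_shape", shapes]
```

For the two-input model above, `input_shape_args({"data": (1, 3, 227, 227), "rois": (1, 6, 1, 1)})` yields `--input data,rois --input_shape [1,3,227,227],[1,6,1,1]`.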

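Custom Layer Definition

Internally, when you run the Model Optimizer, it loads the model, goes through the topology, and tries to find each layer type in a list of known layers. Custom layers are layers that are not included in the list of known layers. If your topology contains any layers that are not in this list, the Model Optimizer classifies them as custom. For more information about custom layers, see Caffe* Models with Custom Layers.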

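Supported Caffe* Layers

| Number | Layer Name in Caffe* | Layer Name in the Intermediate Representation |
|---|---|---|
| 1 | Input | Input |
| 2 | GlobalInput | Input |
| 3 | InnerProduct | FullyConnected |
| 4 | Dropout | Ignored, does not appear in the IR |
| 5 | Convolution | Convolution |
| 6 | Deconvolution | Deconvolution |
| 7 | Pooling | Pooling |
| 8 | BatchNorm | BatchNormalization |
| 9 | LRN | Norm |
| 10 | Power | Power |
| 11 | ReLU | ReLU |
| 12 | Scale | ScaleShift |
| 13 | Concat | Concat |
| 14 | Eltwise | Eltwise |
| 15 | Flatten | Flatten |
| 16 | Reshape | Reshape |
| 17 | Slice | Slice |
| 18 | Softmax | SoftMax |
| 19 | Permute | Permute |
| 20 | ROIPooling | ROIPooling |
| 21 | Tile | Tile |
| 22 | ShuffleChannel | Reshape + Split + Permute + Concat |
| 23 | Axpy | ScaleShift + Eltwise |
| 24 | BN | ScaleShift |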
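
The classification of layers as known or custom can be approximated outside the tool to get an early warning before conversion. The sketch below is a simplification: it scans the .prototxt text with a regular expression instead of parsing the protobuf, and it uses the Caffe* layer names from the table above as the known set. The real Model Optimizer logic may differ.

```python
import re

# Caffe layer names taken from the supported-layers table.
KNOWN_CAFFE_LAYERS = {
    "Input", "GlobalInput", "InnerProduct", "Dropout", "Convolution",
    "Deconvolution", "Pooling", "BatchNorm", "LRN", "Power", "ReLU",
    "Scale", "Concat", "Eltwise", "Flatten", "Reshape", "Slice",
    "Softmax", "Permute", "ROIPooling", "Tile", "ShuffleChannel",
    "Axpy", "BN",
}


def find_custom_layers(prototxt_text):
    """Return the sorted layer types that are not in the known set."""
    types = re.findall(r'type:\s*"([^"]+)"', prototxt_text)
    return sorted(set(types) - KNOWN_CAFFE_LAYERS)


sample = '''
layer { name: "data" type: "Input" top: "data" }
layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" }
layer { name: "mine" type: "MyCustomLayer" bottom: "conv1" top: "out" }
'''
print(find_custom_layers(sample))  # ['MyCustomLayer']
```

Here `MyCustomLayer` is a made-up layer type used only to show a non-empty result.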

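Refer to the Model Optimizer Developer Guide for information about:

- The Model Optimizer internal procedure for working with custom layers
- How to convert a model that contains custom layers
- Custom layer implementation details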

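Frequently Asked Questions (FAQ)

If the Model Optimizer cannot complete a run because of typographical errors, incorrectly used options, or other issues, it provides explanatory messages. Each message describes the potential cause of the problem and gives a link to the Model Optimizer FAQ. The FAQ has instructions on how to resolve most issues. It also includes links to the relevant sections of the Model Optimizer Developer Guide to help you understand what went wrong.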

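Summary

In this document, you learned:

- Basic information about how the Model Optimizer works with Caffe* models
- Which Caffe* models are supported
- How to convert a trained Caffe* model with the Model Optimizer, using both framework-agnostic and Caffe-specific command-line options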

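Legal Information

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at http://www.intel.com/ or from the OEM or retailer.

No computer system can be absolutely secure.

Intel, Arria, Core, Movidius, Pentium, Xeon, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

*Other names and brands may be claimed as the property of others.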