Input Normalization as a Separate Layer in CNTK with C#

Bahrudin Hrnjica

July 13, 2018

CPOL

How to implement data normalization as a regular neural network layer, which simplifies the training process and data preparation

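In the previous post, we saw how to calculate some basic parameters of descriptive statistics, and how to normalize data by calculating the mean and standard deviation. In this blog post, we are going to implement data normalization as a regular neural network layer, which simplifies the training process and data preparation.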

What Is Data Normalization?

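Simply put, data normalization is a set of tasks that transform the values of any feature in a data set into a predefined numeric range. Usually this range is [-1,1], [0,1], or some other specific range. Data normalization plays a very important role in machine learning, since it can significantly improve the training process and simplify the setting of network parameters.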

There are two main types of data normalization:

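MinMax normalization: transforms all values into the range [0,1].
Gauss normalization, or Z-score normalization: transforms the values so that the mean is zero and the standard deviation is 1.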
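
For reference, the two transformations are commonly written as shown below. These are the usual textbook definitions rather than anything specific to CNTK. Here $x$ is a raw feature value, $x_{\min}$ and $x_{\max}$ are the minimum and maximum of that feature in the training set, and $\mu$ and $\sigma$ are its mean and standard deviation.

```latex
x_{\mathrm{minmax}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}},
\qquad
z = \frac{x - \mu}{\sigma}
```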

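Besides these types, there are plenty of other methods that can be used. These two are usually used when the size of the data set is known; otherwise we should use some other method, such as log scaling or dividing every value by a constant. But why does data need to be normalized? This is an essential question in machine learning, and the simplest answer is: to give all features an equal influence on the output label. More about data normalization and scaling can be found at this link.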

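In this blog post, we are going to implement a CNTK neural network that contains a "normalization layer" between the input layer and the first hidden layer. A schematic of the network is shown in the following image: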

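As can be seen, the normalization layer is placed between the input layer and the first hidden layer. The normalization layer also contains the same number of neurons as the input layer and produces output with the same dimension as the input layer.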

In order to implement the normalization layer, the following requirements must be met:

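Calculate the mean $\mu$ and standard deviation $\sigma$ over the training data set, and find the maximum and minimum value of each feature.
This must be done before the neural network model is created, since those values are needed in the normalization layer.
When creating the network model, the normalization layer should be defined after the input layer is defined.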
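
To make the first requirement concrete, here is a minimal plain C# sketch that computes these statistics for each feature without CNTK. The class name `FeatureStatistics` and its `Compute` method are hypothetical helpers introduced only for illustration. The sketch assumes the population standard deviation (dividing by N); the article does not say which convention CNTK uses internally.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of CNTK): per-feature statistics of a data set.
public static class FeatureStatistics
{
    // rows[i][j] is the value of feature j in sample i.
    public static (double[] Mean, double[] Std, double[] Min, double[] Max) Compute(double[][] rows)
    {
        if (rows == null || rows.Length == 0)
            throw new ArgumentException("At least one sample is required.", nameof(rows));

        int featureCount = rows[0].Length;
        var mean = new double[featureCount];
        var std = new double[featureCount];
        var min = new double[featureCount];
        var max = new double[featureCount];

        for (int j = 0; j < featureCount; j++)
        {
            // Collect the values of feature j across all samples.
            double[] column = rows.Select(r => r[j]).ToArray();
            double m = column.Average();
            mean[j] = m;
            // Assumption: population standard deviation (divide by N, not N - 1).
            std[j] = Math.Sqrt(column.Select(v => (v - m) * (v - m)).Average());
            min[j] = column.Min();
            max[j] = column.Max();
        }
        return (mean, std, min, max);
    }
}
```

Calling `FeatureStatistics.Compute(rows)` on an array of samples returns four arrays, one entry per feature. In the rest of the article this work is delegated to CNTK itself.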

Calculating the Mean and Standard Deviation for the Training Data Set

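Before creating the network, we should prepare the mean and standard deviation parameters, which will be used as constants in the normalization layer. Fortunately, CNTK provides a static method on the minibatch source class for this purpose: "MinibatchSource.ComputeInputPerDimMeansAndInvStdDevs". The method takes the whole training data set defined in the minibatch source and calculates the parameters. As its name indicates, the second value it produces for each dimension is the inverse of the standard deviation.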

// Calculate the mean and inverse standard deviation over the minibatch source
// Prepare the training data
var d = new Dictionary<StreamInformation, Tuple<NDArrayView, NDArrayView>>();
using (var mbs = MinibatchSource.TextFormatMinibatchSource(
    trainingDataPath, streamConfig, MinibatchSource.FullDataSweep, false))
{
    // The (null, null) tuple is a placeholder that CNTK fills in
    d.Add(mbs.StreamInfo("feature"), new Tuple<NDArrayView, NDArrayView>(null, null));
    // Compute the per-dimension mean and inverse standard deviation of the input variables
    MinibatchSource.ComputeInputPerDimMeansAndInvStdDevs(mbs, d, device);
}

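Now that we have the mean and standard deviation values for each feature, we can create the network with the normalization layer. In this example, we define a simple feed-forward neural network with 1 input layer, 1 normalization layer, 1 hidden layer, and 1 output layer.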

private static Function createFFModelWithNormalizationLayer
(Variable feature, int hiddenDim, int outputDim, Tuple<NDArrayView, NDArrayView> avgStdConstants,
DeviceDescriptor device)
{
    // First, set up the parameter initializer
    var glorotInit = CNTKLib.GlorotUniformInitializer(
    CNTKLib.DefaultParamInitScale,
    CNTKLib.SentinelValueForInferParamInitRank,
    CNTKLib.SentinelValueForInferParamInitRank, 1);

    //******* The input layer is the feature variable
    var inputLayer = feature;

    //******* Normalization layer
    // Item1 is the per-dimension mean, Item2 the per-dimension inverse standard deviation
    var mean = new Constant(avgStdConstants.Item1, "mean");
    var invStd = new Constant(avgStdConstants.Item2, "invStd");
    var normalizedLayer = CNTKLib.PerDimMeanVarianceNormalize(inputLayer, mean, invStd);

    //***** Hidden layer creation
    // The weight shape is { hiddenDim, inputDim }; inputDim is 4 (the Iris features)
    var shape = new int[] { hiddenDim, 4 };
    var weightParam = new Parameter(shape, DataType.Float, glorotInit, device, "wh");
    var biasParam = new Parameter(new NDShape(1, hiddenDim), 0, device, "bh");
    var hidLay = CNTKLib.Times(weightParam, normalizedLayer) + biasParam;
    var hidLayerAct = CNTKLib.ReLU(hidLay);

    //****** Output layer creation
    // The weight shape is { outputDim, hiddenDim }
    var shapeOut = new int[] { outputDim, hiddenDim };
    var wParamOut = new Parameter(shapeOut, DataType.Float, glorotInit, device, "wo");
    var bParamOut = new Parameter(new NDShape(1, outputDim), 0, device, "bo");
    var outLay = CNTKLib.Times(wParamOut, hidLayerAct) + bParamOut;

    return outLay;
}
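
To illustrate what the normalization step in the function above does to a single sample, here is a small stand-alone sketch that does not call CNTK. It assumes that `PerDimMeanVarianceNormalize` subtracts the mean and multiplies by the inverse standard deviation element by element, which is what the name of `ComputeInputPerDimMeansAndInvStdDevs` suggests. The class `NormalizationSketch` and its `Normalize` method are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical stand-alone illustration; it does not call CNTK.
public static class NormalizationSketch
{
    // Applies (x - mean) * invStdDev to every feature of one sample.
    public static float[] Normalize(float[] sample, float[] mean, float[] invStdDev)
    {
        if (sample.Length != mean.Length || sample.Length != invStdDev.Length)
            throw new ArgumentException("All arrays must have the same length.");

        var result = new float[sample.Length];
        for (int i = 0; i < sample.Length; i++)
        {
            // Assumption: multiply by the inverse standard deviation instead of dividing.
            result[i] = (sample[i] - mean[i]) * invStdDev[i];
        }
        return result;
    }
}
```

Because the output has the same length as the input, this matches the earlier observation that the normalization layer has the same dimension as the input layer.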

Complete Source Code Example

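The complete source code for this example is listed below. The example shows how to normalize the input features of the well-known Iris data set. Note that with this approach we do not need to handle normalization of the validation or test data sets separately, because data normalization is part of the network model.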

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using CNTK;
namespace NormalizationLayerDemo
{
    class Program
    {
        static string trainingDataPath = "./data/iris_training.txt";
        static string validationDataPath = "./data/iris_validation.txt";
        static void Main(string[] args)
        {
            DeviceDescriptor device = DeviceDescriptor.UseDefaultDevice();

            //stream configuration to distinct features and labels in the file
            var streamConfig = new StreamConfiguration[]
               {
                   new StreamConfiguration("feature", 4),
                   new StreamConfiguration("flower", 3)
               };

            // build a NN model
            //define input and output variable and connecting to the stream configuration
            var feature = Variable.InputVariable(new NDShape(1, 4), DataType.Float, "feature");
            var label = Variable.InputVariable(new NDShape(1, 3), DataType.Float, "flower");

            // Calculate the mean and inverse standard deviation over the minibatch source
            // Prepare the training data
            var d = new Dictionary<StreamInformation, Tuple<NDArrayView, NDArrayView>>();
            using (var mbs = MinibatchSource.TextFormatMinibatchSource(
               trainingDataPath , streamConfig, MinibatchSource.FullDataSweep,false))
            {
                d.Add(mbs.StreamInfo("feature"), new Tuple<NDArrayView, NDArrayView>(null, null));
                // Compute the per-dimension mean and inverse standard deviation of the input variables
                MinibatchSource.ComputeInputPerDimMeansAndInvStdDevs(mbs, d, device);

            }

            // Build a simple feed-forward neural network with a normalization layer
            var ffnn_model = createFFModelWithNormalizationLayer
                                    (feature,5,3,d.ElementAt(0).Value, device);

            //Loss and error functions definition
            var trainingLoss = CNTKLib.CrossEntropyWithSoftmax
            (new Variable(ffnn_model), label, "lossFunction");
            var classError = CNTKLib.ClassificationError
            (new Variable(ffnn_model), label, "classificationError");

            // set learning rate for the network
            var learningRatePerSample = new TrainingParameterScheduleDouble(0.01, 1);

            //define learners for the NN model
            var ll = Learner.SGDLearner(ffnn_model.Parameters(), learningRatePerSample);

            //define trainer based on model, loss and error functions , and SGD learner
            var trainer = Trainer.CreateTrainer
                          (ffnn_model, trainingLoss, classError, new Learner[] { ll });

            //Preparation for the iterative learning process

            // create minibatch for training
            var mbsTraining = MinibatchSource.TextFormatMinibatchSource
            (trainingDataPath, streamConfig, MinibatchSource.InfinitelyRepeat, true);

            int epoch = 1;
            while (epoch < 20)
            {
                var minibatchData = mbsTraining.GetNextMinibatch(65, device);
                //pass to the trainer the current batch separated by the features and label.
                var arguments = new Dictionary<Variable, MinibatchData>
                {
                    { feature, minibatchData[mbsTraining.StreamInfo("feature")] },
                    { label, minibatchData[mbsTraining.StreamInfo("flower")] }
                };

                trainer.TrainMinibatch(arguments, device);

                //for each epoch report the training process
                if (minibatchData.Values.Any(a => a.sweepEnd))
                {
                    reportTrainingProgress(feature, label, streamConfig, trainer, epoch, device);
                    epoch++;
                }
            }
            Console.Read();
        }

        private static void reportTrainingProgress(Variable feature, Variable label, 
        StreamConfiguration[] streamConfig,  Trainer trainer, int epoch, DeviceDescriptor device)
        {
            // create minibatch for training
            var mbsTrain = MinibatchSource.TextFormatMinibatchSource
            (trainingDataPath, streamConfig, MinibatchSource.FullDataSweep, false);
            var trainD = mbsTrain.GetNextMinibatch(int.MaxValue, device);
            //
            var a1 = new UnorderedMapVariableMinibatchData();
            a1.Add(feature, trainD[mbsTrain.StreamInfo("feature")]);
            a1.Add(label, trainD[mbsTrain.StreamInfo("flower")]);
            var trainEvaluation = trainer.TestMinibatch(a1);

            // create minibatch for validation
            var mbsVal = MinibatchSource.TextFormatMinibatchSource
            (validationDataPath, streamConfig, MinibatchSource.FullDataSweep, false);
            var valD = mbsVal.GetNextMinibatch(int.MaxValue, device);

            //
            var a2 = new UnorderedMapVariableMinibatchData();
            a2.Add(feature, valD[mbsVal.StreamInfo("feature")]);
            a2.Add(label, valD[mbsVal.StreamInfo("flower")]);
            var valEvaluation = trainer.TestMinibatch(a2);

            Console.WriteLine(
                $"Epoch={epoch}, Train Error={trainEvaluation}, Validation Error={valEvaluation}");
        }

        private static Function createFFModelWithNormalizationLayer
        (Variable feature, int hiddenDim, int outputDim,
         Tuple<NDArrayView, NDArrayView> avgStdConstants, DeviceDescriptor device)
        {
            //First the parameters initialization must be performed
            var glorotInit = CNTKLib.GlorotUniformInitializer(
                    CNTKLib.DefaultParamInitScale,
                    CNTKLib.SentinelValueForInferParamInitRank,
                    CNTKLib.SentinelValueForInferParamInitRank, 1);

            //*******Input layer is indicated as feature
            var inputLayer = feature;

            //******* Normalization layer
            // Item1 is the per-dimension mean, Item2 the per-dimension inverse standard deviation
            var mean = new Constant(avgStdConstants.Item1, "mean");
            var invStd = new Constant(avgStdConstants.Item2, "invStd");
            var normalizedLayer = CNTKLib.PerDimMeanVarianceNormalize(inputLayer, mean, invStd);

            //***** Hidden layer creation
            // The weight shape is { hiddenDim, inputDim }; inputDim is 4 (the Iris features)
            var shape = new int[] { hiddenDim, 4 };
            var weightParam = new Parameter(shape, DataType.Float, glorotInit, device, "wh");
            var biasParam = new Parameter(new NDShape(1, hiddenDim), 0, device, "bh");
            var hidLay = CNTKLib.Times(weightParam, normalizedLayer) + biasParam;
            var hidLayerAct = CNTKLib.ReLU(hidLay);

            //******Output layer creation
            //the last action is creation of the output layer
            var shapeOut = new int[] { outputDim, hiddenDim };
            var wParamOut = new Parameter(shapeOut, DataType.Float, glorotInit, device, "wo");
            var bParamOut = new Parameter(new NDShape(1, outputDim), 0, device, "bo");
            var outLay = CNTKLib.Times(wParamOut, hidLayerAct) + bParamOut;
            return outLay;
        }
    }
}

The output window should look like this:

[Screenshot of the console output: 2018-07-13_18-20-58]

The data set files used in the example can be downloaded from here.