遗传算法在人工神经网络分类问题中的应用






4.76/5 (11投票s)
本文演示了遗传算法在人工神经网络分类问题中的应用。
引言
大自然为解决优化问题提供了内在的方法,称为遗传算法。生物体通过进化、适应不断变化的环境、交配并繁衍出比前代更优良的个体。个体的适应度表明其生存能力或对特定目的的适应程度。遗传算法 (GA) 是一种基于自然选择(驱动生物进化的过程)来解决优化问题的方法。遗传算法会反复修改个体解决方案的种群。在每一步,遗传算法会从当前种群中随机选择个体作为父代,并利用它们来产生下一代的子代。经过连续几代,种群会“进化”到最优解。你可以将遗传算法应用于解决许多不适合标准优化算法的优化问题,包括目标函数是离散的、不可微的、随机的或高度非线性的问题。在本文中,我将演示如何将遗传算法应用于训练人工神经网络以进行分类。它的优点在于它会搜索整个空间以找到最佳解决方案,但代价是速度不如数值方法。然而,后者往往会收敛到局部最小值。遗传算法和数值方法都有其优点和缺点,可以针对特定问题进行互换使用。
背景
你需要熟悉遗传算法。可以参考我学习该主题时使用的一些网站: 遗传算法、遗传编程、通俗易懂的遗传算法。要智能地使用我的代码,你需要理解交叉、变异和选择过程。关于神经网络的控制台接口和文件结构描述,可以在我之前的文章中找到:反向传播人工神经网络 (C++)。
使用代码
控制台应用程序的帮助信息允许你使用这些参数
argv[1] t-train, r-run
argv[2] network conf file
argv[3] cls1 files [0.9]
argv[4] cls2 files [0.1]
argv[5] epochs num
argv[6] [population size 50]
argv[7] [crossover chance 2]
argv[8] [mutation chance 1000]
argv[9] [validation class]
argv[10] [test class]
argv[12] [validation TH 0.5]
argv[12] [validation type [mse]]
argv[13] [normalization]: [0-no], 1-minmax, 2-zscore,
3-softmax, 4-energy
argv[14] [error tolerance cls] +- 0.05 default
argv[1] r-run
argv[2] network conf file
argv[3] cls files
argv[4] [validation TH 0.5]
argv[5] [normalization]: [0-no], 1-minmax, 2-zscore, 3-softmax, 4-energy
ann1dn.exe t net.nn cls1 cls2 3000 [50]
[crossover rate] [mutation rate]
[tst.txt][val.txt][TH [0.5]][vtype [mse]]
[normalization [0]] [err [0.05]]
ann1dn.exe r net.nn testcls [TH [0.5]] [normalization [0]]
与《反向传播人工神经网络 (C++)》项目相比,唯一的区别是增加了遗传算法特有的参数:6、7 和 8。其他方面都一样。它们分别表示初始种群大小、交叉概率和变异概率。默认值为 50、2 和 1000。交叉和变异率的含义是,对于每对父代的每个染色体,存在 1/2 的交叉概率,而对于配子过程中的特定基因位,变异概率是 1/1000。因此,你可以立即开始训练了。
>ann1dg.exe t iris.nn setosa.txt virgi.txt 200
在附加的 `run.bat` 文件中,你可以找到以下命令行来执行
>ann1dg.exe t iris.nn setosa.txt virgi.txt 200 50 2 100 void void 0.5 0 2
它使用 50 个个体的种群大小,交叉概率为 2,变异概率为 100。接下来的参数表示使用均方误差作为交叉验证指标,通过随机选择验证集和测试集从训练集中进行选择,以及对输入数据进行 ZScore 归一化。如果数据的范围与首选范围(对于神经网络分类,大约 -1.0 到 1.0 比较合适)差异很大,则需要进行归一化。
下面展示了一个典型的遗传算法过程,用于将 IRIS 数据分类为 setosa 和 versi 品种,以区分 virgi 品种
loading data...
cls1: 100 cls2: 50 files loaded. size: 4 samples
validaton size: 25 12
validaton size: 26 13
normalizing zscore...
training...
epoch: 1 0.454 0.450 pop age:0 mean[0.00] fit:0.00
bOrg: 0.715 0.546 fit 9.68 age 0
epoch: 2 0.642 0.631 pop age:1 mean[1.00] fit:238.01
bOrg: 0.715 0.546 fit 9.68 age 1
epoch: 3 0.630 0.617 pop age:2 mean[1.64] fit:334.49
bOrg: 0.665 0.341 fit 11.63 age 0
epoch: 4 0.638 0.617 pop age:3 mean[2.18] fit:288.08
bOrg: 0.665 0.341 fit 11.63 age 1
epoch: 5 0.607 0.577 pop age:4 mean[2.48] fit:395.51
bOrg: 0.665 0.341 fit 11.63 age 2
epoch: 6 0.636 0.602 pop age:5 mean[2.90] fit:456.78
bOrg: 0.685 0.341 fit 11.80 age 0
epoch: 7 0.638 0.583 pop age:6 mean[3.34] fit:390.31
bOrg: 0.677 0.452 fit 12.09 age 0
epoch: 8 0.651 0.562 pop age:7 mean[3.52] fit:425.81
bOrg: 0.764 0.513 fit 12.71 age 0
epoch: 9 0.649 0.535 pop age:8 mean[3.45] fit:415.64
bOrg: 0.826 0.455 fit 17.96 age 0
epoch: 10 0.608 0.445 pop age:9 mean[3.02] fit:408.99
bOrg: 0.826 0.455 fit 17.96 age 1 vld: 9.71 (epoch 10) se:68.00 sp:100.00 ac:78.38
epoch: 11 0.650 0.445 pop age:10 mean[2.98] fit:463.34
bOrg: 0.826 0.455 fit 17.96 age 2 vld: 9.71 (epoch 10) se:68.00 sp:100.00 ac:78.38
epoch: 12 0.615 0.405 pop age:11 mean[3.24] fit:467.19
bOrg: 0.826 0.455 fit 17.96 age 3 vld: 9.71 (epoch 10) se:68.00 sp:100.00 ac:78.38
epoch: 13 0.666 0.454 pop age:12 mean[2.98] fit:532.92
bOrg: 0.826 0.455 fit 17.96 age 4 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 14 0.662 0.401 pop age:13 mean[3.19] fit:627.14
bOrg: 0.826 0.455 fit 17.96 age 5 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 15 0.667 0.390 pop age:14 mean[3.35] fit:517.49
bOrg: 0.711 0.216 fit 22.49 age 0 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 16 0.662 0.372 pop age:15 mean[3.08] fit:508.93
bOrg: 0.711 0.216 fit 22.49 age 1 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 17 0.638 0.339 pop age:16 mean[2.65] fit:732.92
bOrg: 0.711 0.216 fit 22.49 age 2 vld: 15.42 (epoch 13) se:88.00 sp:100.00 ac:91.89
epoch: 18 0.676 0.366 pop age:17 mean[3.27] fit:793.93
bOrg: 0.711 0.216 fit 22.49 age 3 vld: 33.02 (epoch 18) se:100.00 sp:100.00 ac:100.00
epoch: 19 0.648 0.333 pop age:18 mean[3.35] fit:741.40
bOrg: 0.711 0.216 fit 22.49 age 4 vld: 33.02 (epoch 18) se:100.00 sp:100.00 ac:100.00
...
...
epoch: 187 0.747 0.213 pop age:186 mean[1.85] fit:2343.14
bOrg: 0.778 0.243 fit 42.80 age 0 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 188 0.747 0.213 pop age:187 mean[2.23] fit:1630.09
bOrg: 0.778 0.243 fit 42.80 age 1 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 189 0.747 0.213 pop age:188 mean[2.17] fit:2050.81
bOrg: 0.778 0.243 fit 42.80 age 2 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 190 0.748 0.214 pop age:189 mean[2.36] fit:1899.64
bOrg: 0.778 0.243 fit 42.81 age 0 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 191 0.751 0.217 pop age:190 mean[2.41] fit:1692.95
bOrg: 0.778 0.243 fit 42.81 age 1 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 192 0.752 0.223 pop age:191 mean[2.00] fit:1869.26
bOrg: 0.778 0.243 fit 42.81 age 0 vld: 65.97 (epoch 183) se:100.00 sp:100.00 ac:100.00
epoch: 193 0.755 0.221 pop age:192 mean[2.00] fit:1965.02
bOrg: 0.778 0.243 fit 42.82 age 0 vld: 81.72 (epoch 193) se:100.00 sp:100.00 ac:100.00
epoch: 194 0.762 0.227 pop age:193 mean[2.42] fit:2118.48
bOrg: 0.778 0.243 fit 42.82 age 1 vld: 81.72 (epoch 193) se:100.00 sp:100.00 ac:100.00
epoch: 195 0.766 0.232 pop age:194 mean[2.63] fit:2538.75
bOrg: 0.778 0.243 fit 42.82 age 2 vld: 81.72 (epoch 193) se:100.00 sp:100.00 ac:100.00
epoch: 196 0.778 0.244 pop age:195 mean[2.66] fit:2140.18
bOrg: 0.778 0.243 fit 42.82 age 3 vld: 81.77 (epoch 196) se:100.00 sp:100.00 ac:100.00
epoch: 197 0.778 0.243 pop age:196 mean[2.86] fit:2140.25
bOrg: 0.778 0.243 fit 42.82 age 4 vld: 81.77 (epoch 196) se:100.00 sp:100.00 ac:100.00
epoch: 198 0.778 0.243 pop age:197 mean[2.76] fit:2311.59
bOrg: 0.778 0.243 fit 42.82 age 0 vld: 81.81 (epoch 198) se:100.00 sp:100.00 ac:100.00
epoch: 199 0.778 0.243 pop age:198 mean[2.52] fit:2226.04
bOrg: 0.778 0.243 fit 42.82 age 1 vld: 81.81 (epoch 198) se:100.00 sp:100.00 ac:100.00
epoch: 200 0.777 0.242 pop age:199 mean[2.49] fit:1583.92
bOrg: 0.778 0.243 fit 42.82 age 2 vld: 81.81 (epoch 198) se:100.00 sp:100.00 ac:100.00
training time: 00:00:00:781
classification results: maxacur.nn
train set: 49 25
sensitivity: 97.96
specificity: 96.00
+predictive: 97.96
-predictive: 96.00
accuracy: 97.30
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 88.46
specificity: 100.00
+predictive: 100.00
-predictive: 81.25
accuracy: 92.31
单次迭代的输出,以最后 200 次迭代为例:在训练集上,种群对正类的平均输出为 0.777,对负类的平均输出为 0.242,种群年龄为 199,个体的平均年龄为 2.49,平均适应度为 1583.92(MSE 的倒数),bOrg 是最适应的个体,在训练集上对正类的平均输出为 0.788,对负类的平均输出为 0.243,年龄为 2,并且其在验证集上的分类结果。
为了进行比较,我运行了几次遗传算法训练,每次都随机划分 IRIS 训练数据,并选择了最佳分类率配置。同样的操作我也为我文章中的反向传播学习算法神经网络进行了:反向传播人工神经网络 (C++)。下面,我展示了网络的权重和分类率。我们可以看到,遗传算法取得了与反向传播算法非常接近的性能结果。
遗传算法结果
4
4 3 2 1
0
1
-44.000000 0.115305
-22.000000 0.257352
-10.000000 0.057181
-1.000000 0.136029
-1.018195
1.106349
-0.252558
-3.816144
0.060086
2.665148
0.106655
-0.719681
1.192948
-2.096924
-0.999024
0.572130
-1.355700
0.017045
1.615307
-2.008180
-8154.750000
-0.141530
7.478271
-1.202083
0.817744
1.999939
-2.613441
-0.727891
-1.999817
14.607610
train set: 49 25
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 96.15
specificity: 100.00
+predictive: 100.00
-predictive: 92.86
accuracy: 97.44
反向传播算法结果
4
4 3 2 1
0
1
-43.000000 0.109332
-23.000000 0.251434
-11.000000 0.055665
-1.000000 0.133365
7.644542
1.920880
0.046209
-4.151631
-2.031714
-8.133764
1.372707
-1.903027
2.587456
2.505832
1.232619
-0.139417
0.577583
1.139807
1.212673
-1.549961
-4.962042
4.220874
-1.322594
1.052067
2.433295
-2.301689
0.902922
0.755153
-4.881694
2.221259
train set: 49 25
sensitivity: 97.96
specificity: 100.00
+predictive: 100.00
-predictive: 96.15
accuracy: 98.65
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
遗传算法
项目中的遗传算法特有类是
Gene
染色体
Organism
Population
我将按此顺序描述它们。
通常,对于涉及浮点数的分类过程,单个基因用该数字表示,变异算子只需向该数字添加一个随机值。但在自然过程中,基因由 1 或 0 的单个位序列组成。变异只是翻转基因中的一个随机位。你无法对浮点数使用位翻转,因为这可能会导致 NaN 或非常非常大的数字,接近浮点数的极限,而对于神经网络分类,权重的典型范围大约在 -1.0 到 1.0 之间。为了接近基因的整数表示,但具有浮点值,我使用一个 32 位整数作为单个基因,通过将 HI 符号字除以 LO 符号字来获得其浮点值。唯一必要的条件是避免零字,因为这种情况可能导致除以零或神经网络中出现零系数,这可能在选择和变异过程中导致具有大多数零基因的个体,而没有实际信息。
下面展示了 `Gene` 类,其构造函数使用随机初始化基因或仅从另一个基因复制。
Gene::Gene()
{
//if MAX_RAND=0x7fff mult by 2*rand()
unsigned short lo = 0xFFFF & (2 * rand());
while (!lo)
lo = 0xFFFF & (2 * rand());
unsigned short hi = 0xFFFF & (2 * rand());
while (!hi)
hi = 0xFFFF & (2 * rand());
m_gene = lo | hi << 16;
}
Gene::Gene(unsigned int gene): m_gene(gene)
{
}
class Gene
{
friend class Population;
friend class Chromosome;
public:
Gene(); //random gene
Gene(unsigned int gene); //copy gene
inline float gene() const;
private:
Gene(const Gene& gene);
const Gene& operator=(const Gene& gene);
unsigned int m_gene; //32bit number lo/hi non-zero values only
};
inline float Gene::gene() const
{
short lo = 0xFFFF & m_gene;
short hi = 0xFFFF & (m_gene >> 16);
return (float)hi / (float)lo;
}
我使用 `friend`,因为在该库中不需要 `protected` 接口。
下一个难题是如何智能地组合由基因组成的染色体的生物体结构,即一个由浮点数权重组成的人工神经网络,以便两个父代神经网络配置的交配能够为给定的分类任务产生一个改进的后代。随机交换不同层的权重根本不起作用。我通过将网络中的每个单独神经元表示为一个染色体,其中基因代表神经元的输入连接(即浮点数权重)来解决这个问题。这样,在交叉过程中会使用相应的染色体(即神经元)。也就是说,选定的父代(神经网络)中的第一个染色体(神经元)与其第一个对应父代(神经网络)的第一个染色体交换权重,第二个染色体与该父代的相应第二个染色体交换,依此类推。这样,交叉是智能的,因为不会发生最后一层神经元与第一层神经元的权重交换。
如你所见,整个神经网络作为一个由 `Organism` 类表示的生物体,而单个神经元则表示为一个 `Chromosome` 对象,其基因数量代表了该神经元的输入权重连接数。这样,神经网络由 3 层组成,每层分别有 10、5 和 1 个神经元,相应地由 5 + 1 = 6 个染色体组成的生物体表示,总共有 5 * (10 + 1) + 1 * (5 + 1) = 55 + 6 = 61 个基因。增加的 1 是偏置连接。第一层的染色体由 11 个基因组成,最后一层的染色体有 6 个基因。这样,一个父代的第一层染色体与第二个父代的第一层类似染色体交换其基因。
下面展示了 `Chromosome` 类
Chromosome::Chromosome(int genes_number)
{
for (int g = 0; g < genes_number; g++)
m_genes.push_back(new Gene());
}
Chromosome::Chromosome(vector<Gene *> *genes)
{
for (int g = 0; g < (int)genes->size(); g++)
m_genes.push_back(new Gene(genes->at(g)->m_gene));
}
Chromosome::~Chromosome()
{
for (int g = 0; g < get_genes_number(); g++)
delete m_genes[g];
}
class Chromosome
{
friend class Population;
friend class Organism;
public:
Chromosome(int genes_number); //random genes constructor
Chromosome(vector<Gene *> *pgenes); //copy genes constructor
~Chromosome();
inline int get_genes_number() const;
inline const Gene *get_gene(int g) const;
private:
Chromosome(const Chromosome& chrom);
const Chromosome& operator=(const Chromosome& chrom);
vector<Gene *> m_genes;
};
inline int Chromosome::get_genes_number() const
{
return (int) m_genes.size();
}
inline const Gene *Chromosome::get_gene(int g) const
{
if (g > get_genes_number() - 1 || g < 0)
return 0;
return m_genes[g];
}
它包含基因的向量,你可以创建随机基因或从另一个染色体复制它们。
`Organism` 对象就像一个生物体,包含不同数量基因的染色体。该生物体有年龄、其存活的种群数量以及解决分类任务的适应度,通常是训练集、验证集或测试集上的均方误差 (MSE) 的倒数。你可以使用特定的神经网络配置、层数和每层神经元数量来创建它,或者通过复制现有生物体的基因值来创建一个副本。
Organism::Organism(int layers_number, const int *neurons_per_layer):
m_chromosomes_number(0), m_genes_number(0), m_age(0), m_fitness(-1.0f)
{
for (int i = 0; i < USERDATA_SIZE; i++)
user_data[i] = 0.0f;
//e.g.: 10 5 1 5chr of 10+1genes, 1chr of 5+1genes
for (int l = 0; l < layers_number - 1; l++) { //loop thru layers
for (int c = 0; c < neurons_per_layer[l+1]; c++) {
//init with number of genes+1 [including 1 bias neuron!]
Chromosome *pchr = new Chromosome(neurons_per_layer[l] + 1);
m_chromosomes.push_back(pchr);
for (int g = 0; g < neurons_per_layer[l] + 1; g++) {
m_genes_number++;
CHROM_GENE_INDEX pnt;
pnt.c = m_chromosomes_number;
pnt.g = g;
m_gene_index.push_back(pnt);
}
m_chromosomes_number++;
}
}
}
Organism::Organism(const Organism& org) : m_chromosomes_number(org.m_chromosomes_number),
m_genes_number(org.m_genes_number),
m_age(0), m_fitness(-1.0f)
{
for (int i = 0; i < USERDATA_SIZE; i++)
user_data[i] = org.user_data[i];
//copy chr/gene indices
for (int i = 0; i < m_genes_number; i++)
m_gene_index.push_back(org.m_gene_index[i]);
//copy genes from template
for (int c = 0; c < m_chromosomes_number; c++)
m_chromosomes.push_back(new Chromosome(&org.m_chromosomes[c]->m_genes));
}
typedef struct _chrom_gene_index {
int c; //chromosome index
int g; //gene index
} CHROM_GENE_INDEX, *PCHROM_GENE_INDEX;
#define USERDATA_SIZE 2
class Organism
{
friend class Population;
public:
//create organism with certain config
Organism(int layers_number, const int *neurons_per_layer);
Organism(const Organism& org); //copy constructor
~Organism();
inline int get_chromosomes_number() const;
inline const Chromosome *get_chromosome(int c) const;
inline CHROM_GENE_INDEX get_gene_index(int i) const;
inline float fitness() const;
inline void fitness(float fitness);
inline int age() const;
inline void age(int age);
//outputs on training set
float user_data[USERDATA_SIZE];
private:
const Organism& operator=(const Organism& org);
//chromosome array
vector<Chromosome *> m_chromosomes;
//index of individual gene m_gene_index.size() = m_genes_number
vector<CHROM_GENE_INDEX> m_gene_index;
int m_chromosomes_number; //total number of chromosomes
int m_genes_number; //total number of genes
float m_fitness; //fitness of organizm
int m_age; //organism age
};
`m_gene_index` 是 `CHROM_GENE_INDEX` 结构的数组。它允许在变异过程中随机选择单个基因进行变异。它只包含特定染色体中单个基因的坐标。
`Population` 类是生物体的容器。
#define PSZVAR 30+1 //variation of population size 30% from m_meansize
////////////////Population////////////////////////////////////////////////////
class Population
{
enum CROSOVER_TYPE {ONEPOINT, TWOPOINT, UNIFORM, ARITHMETIC};
public:
Population(int mean_size, int layers_number, const int *neurons_per_layer);
~Population();
void populate(); //remove unfit, mate parents, produce siblings
inline int crossover_rate() const;
inline void crossover_rate(int c_rate);
inline int mutation_rate() const;
inline void mutation_rate(int m_rate);
inline int population_size() const;
inline Organism *get_organism(int i);
inline const Organism *get_organism(int i) const;
inline int get_age() const;
inline float get_fitness() const;
inline float get_fittest_meanage() const;
private:
Population(const Population& pop);
const Population& operator=(const Population& pop);
vector<Organism *> m_organisms;
//crossover rate 1 chance from m_crossover_rate for a chromosome
int m_crossover_rate;
//mutation rate 1 chance from m_mutation_rate max=0x7FFF
int m_mutation_rate;
int m_mean_size; //hard wired initial fittest parents size
int m_age; //population age
float m_fittest_meanage; //fittest organisms mean age
float m_fitness; //current population total fitness
int m_parents_limit; //current pop size of parents organisms
int m_offsprings_limit; //current pop size of offsprings organisms
int m_fittest_limit; //size for fittest ones from
//[m_parents_limit + m_offsprings_limit]
//get random fittest
const Organism *run_roulette_wheel(float population_fitness) const;
//mate_parents 2 organisms
Organism *mate_parents(const Organism *X, const Organism *Y) const;
};
Population::Population(int mean_size, int layers_number,
const int *neurons_per_layer): m_mean_size(mean_size),
m_age(0), m_fittest_meanage(0.0f), m_fitness(0.0f),
m_crossover_rate(2), m_mutation_rate(1000)
{
srand((unsigned int)time(0));
//max size parents/offsprings
//meansize +/- N% parents
float a = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_parents_limit = m_mean_size + int(a * (float)m_mean_size);
// offsprings
float b = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_offsprings_limit = m_mean_size + int(b * (float)m_mean_size);
// fittest size
float f = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
m_fittest_limit = m_mean_size + int(f * (float)m_mean_size);
//init population including parents/offsprings organisms
for (int i = 0; i < (m_parents_limit + m_offsprings_limit); i++) {
Organism *porg = new Organism(layers_number, neurons_per_layer);
m_organisms.push_back(porg);
}
}
你需要这样初始化它
//load ANN network/////////////////////////////
ann = new ANNetwork(argv[2]);
if (ann->status() < 0) {
wprintf(L"failed to load network: %s", argv[2]);
exit(1);
}
//...
//init GA code/////////////////////////////////
int *cnf = new int[ann->get_layers_number()];
for (int i = 0; i < ann->get_layers_number(); i++)
cnf[i] = ann->get_layer(i)->get_neurons_number();
//parents_number individuals of ANN config
gpop = new Population(parents_number, ann->get_layers_number(), cnf);
gpop->crossover_rate(crossover_rate);
gpop->mutation_rate(mutation_rate);
delete[] cnf;
在构建过程中,它会创建父代和子代的初始种群,具有随机基因,并定义可以传递到下一代的适应度最高的个体数量上限,其余的则会灭绝。对于 50 的种群大小,它将创建 50±15 个父代和子代。并设置适应度最高的种群大小上限为 50±15。例如,60 个父代和 48 个子代,总共 108 个个体。如果适应度最高的上限设置为 55,那么在评估完训练数据上每个个体的适应度后,将从种群中移除不适应的 108 - 55 = 53 个个体。而适应度最高的 55 个个体将存活到下一代。然后重复这个过程,产生新的子代,评估训练数据的适应度,并移除不适应的个体。
void Population::populate()
//remove unfit, mate population
{
m_age++;
//selection process
random_shuffle(m_organisms.begin(), m_organisms.end());
while (population_size() > m_fittest_limit) {
//while num of orgs exceeds fittest limit
int ind = 0;
Organism *unfit = m_organisms[ind];
for (int i = 1; i < population_size(); i++) {
if (m_organisms[i]->m_fitness <= unfit->m_fitness) {
ind = i;
unfit = m_organisms[i];
}
}
delete unfit;
m_organisms.erase(m_organisms.begin() + ind);
}
m_parents_limit = m_fittest_limit; //offs becomes parents
m_fittest_meanage = 0.0f;
m_fitness = 0.0f;
//increase age of fittest individuals
for (int i = 0; i < population_size(); i++) {
m_organisms[i]->m_age++;
m_fitness += m_organisms[i]->m_fitness;
m_fittest_meanage += m_organisms[i]->m_age;
}
m_fittest_meanage /= (float)population_size();
//mate fittest individuals
vector<Organism *> siblings;
float b = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
//offsprings size
m_offsprings_limit = m_mean_size + int(b * (float)m_mean_size);
float f = float(short(0xFFFF & (2 * rand())) % PSZVAR) / 100.0f;
//fittest size limit
m_fittest_limit = m_mean_size + int(f * (float)m_mean_size);
for (int s = 0; s < m_offsprings_limit; s++) {
//roulette wheel selection
const Organism *X = run_roulette_wheel(m_fitness);
const Organism *Y = run_roulette_wheel(m_fitness);
while (X == Y)
Y = m_organisms[rand()%population_size()];
//up to 5 childrens
int cnum = rand() % 5;
cnum++;
for (int i = 0; i < cnum; i++) {
siblings.push_back(mate_parents(X, Y));
s++;
if (!(s < m_offsprings_limit))
break;
}
}
//introduce siblings to population
for (int s = 0; s < (int)siblings.size(); s++)
m_organisms.push_back(siblings[s]);
}
对于选择,我使用轮盘赌法。
inline const Organism *Population::run_roulette_wheel(float population_fitness) const
//get random fittest
{
float psum = 0.0f;
float ind = float(rand() % 100); //0...99
for (int i = 0; i < population_size(); i++) {
//prcnt of the roulette wheel
psum += 100.0f * (m_organisms[i]->m_fitness / population_fitness);
if (psum > ind)
return m_organisms[i];
}
return m_organisms[population_size()-1];
}
而交配和随后的变异由该函数定义
//mate_parents 2 organisms: 1,2 point, uniform, arithmetic crossover
inline Organism *Population::mate_parents(const Organism *X,
const Organism *Y) const
{
Organism *child = new Organism(*X);
int type = rand() % 2;//4;
//choose crossover type
switch (type) {
case ONEPOINT:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (m_crossover_rate > 1 && rand() % m_crossover_rate) continue;
int gnum = child->m_chromosomes[c]->get_genes_number();
if (gnum == 1) continue;
//swap at random point
int p = (rand() % (gnum - 1)) + 1;
for (int g = 0; g < p; g++)
child->m_chromosomes[c]->m_genes[g]->m_gene =
Y->m_chromosomes[c]->m_genes[g]->m_gene;
//or swap(child->...->m_genes[g],
// Y->...->m_genes[g]) Gene class adresses
}
break;
case TWOPOINT:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (m_crossover_rate > 1 && rand() % m_crossover_rate) continue;
int gnum = child->m_chromosomes[c]->get_genes_number();
if (gnum <= 2) continue;
//swap at 2 random points only for 3 gene chromosomes
int p1 = (rand() % (gnum - 1)) + 1;
int p2 = (rand() % (gnum - 1)) + 1;
while (p1 == p2)
p2 = (rand() % (gnum - 1)) + 1;
if (p1 > p2) swap(p1, p2);
for (int g = 0; g < p1; g++)
child->m_chromosomes[c]->m_genes[g]->m_gene =
Y->m_chromosomes[c]->m_genes[g]->m_gene;
for (int g = p2; g < gnum; g++)
child->m_chromosomes[c]->m_genes[g]->m_gene =
Y->m_chromosomes[c]->m_genes[g]->m_gene;
}
break;
//bad crossovers
case UNIFORM:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (rand() % m_crossover_rate) continue;
for (int g = 0; g < child->m_chromosomes[c]->get_genes_number(); g++) {
unsigned int res = 0;
unsigned int gX = child->m_chromosomes[c]->m_genes[g]->m_gene;
unsigned int gY = Y->m_chromosomes[c]->m_genes[g]->m_gene;
for (int i = 0; i < 32; i++) { //32 bit per gene
if (rand() % 2)
res |= (1 << i) & gY;
else
res |= (1 << i) & gX;
}
if ((0xFFFF&res) && (res >> 16))
child->m_chromosomes[c]->m_genes[g]->m_gene = res;
else
child->m_chromosomes[c]->m_genes[g]->m_gene = gY;
}
}
break;
case ARITHMETIC:
for (int c = 0; c < child->m_chromosomes_number; c++) {
if (m_crossover_rate > 1 && rand() % m_crossover_rate) continue;
for (int g = 0; g < child->m_chromosomes[c]->get_genes_number(); g++) {
unsigned int gX = child->m_chromosomes[c]->m_genes[g]->m_gene;
unsigned int gY = Y->m_chromosomes[c]->m_genes[g]->m_gene;
if ((0xFFFF&(gX&gY)) && ((gX&gY) >> 16))
child->m_chromosomes[c]->m_genes[g]->m_gene = gX & gY;
else
child->m_chromosomes[c]->m_genes[g]->m_gene = gY;
}
}
break;
}
//mutate offspring
if (rand() % m_mutation_rate == 0) {
CHROM_GENE_INDEX *pnt;
unsigned int gene, mask;
//get rnd Num of chromosomes (genes) to mutate
int N = rand() % (child->m_chromosomes_number);
for (int n = 0; n < N + 1; n++) {
//get rnd CHROM_GENE_INDEX in chr/gene indices array
pnt = &child->m_gene_index[rand()%child->m_genes_number];
gene = child->m_chromosomes[pnt->c]->m_genes[pnt->g]->m_gene;
mask = 1 << (rand() % 32);
if ((0xFFFF&(gene ^ mask)) && ((gene ^ mask) >> 16))
//no ZERO hi/lo parts in gene
child->m_chromosomes[pnt->c]->m_genes[pnt->g]->m_gene = gene ^ mask;
//(dw &= ~mask; clever op)
}
}
return child;
}
虽然,我只使用 1 点和 2 点交叉,因为其他方法在分类任务上往往效果不好,但你可以自己实验:只需取消注释 4 并替换为 2。
int type = rand() % 4;