Generating New Fashion Designs with Generative Adversarial Networks (GANs)

Abdulkader Helwan


March 19, 2021

CPOL


In this article, we show how to build a generative adversarial network (GAN) for fashion design generation.

Download source code - 120.7 MB

Introduction

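The availability of datasets like DeepFashion opens up new possibilities for the fashion industry. In this series of articles, we'll showcase an AI-powered deep learning system that can revolutionize the fashion design industry by helping us better understand customers' needs.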

In this project, we'll use:

Jupyter Notebook as the IDE
Libraries
A custom subset of the DeepFashion dataset, kept relatively small to reduce computational and memory overhead

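We assume that you are familiar with the concepts of deep learning, as well as with Jupyter Notebooks and TensorFlow. If you are new to Jupyter Notebooks, start with this tutorial. You are welcome to download the project code.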

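In the previous article, we evaluated and improved the performance of our deep network. In this article, we'll set about building, training, and testing a generative adversarial network (GAN), which we'll then use to generate new clothing images and designs.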

The Power of Predicting New Fashion Images

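AI can not only help us predict the category of a garment, but also create computer-generated images that look similar to real ones. This is very useful for retailers and fashion designers who aim to create personalized clothing or predict broader fashion trends.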

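Before GANs, generating realistic fashion images was a daunting task because of the sheer volume of image data involved. Images are often high resolution, which means a large number of pixels, and each pixel carries three channel values: red, green, and blue (RGB). GANs gave researchers a practical way to generate and validate all of that data.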

Building a GAN

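A GAN is a popular unsupervised machine learning model in which two neural networks, a generator and a discriminator, interact with each other. The generator's job is to produce images from random input noise. The discriminator's job is to decide whether an image is real or fake by comparing generated images against images from the dataset. This process runs for a number of epochs until the discriminator's loss on fake and real images reaches a minimum. As that loss settles, the generator becomes proficient at producing images that resemble those in the original dataset.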
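
The interaction between the two networks is often written as a two-player minimax game. The article itself does not state a formula, so the expression below is the standard textbook objective, given here only as a reference point. The symbols are assumptions introduced for illustration: D(x) is the discriminator's probability that x is real, G(z) is the image generated from noise z, p_data is the distribution of real images, and p_z is the noise distribution.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to make both terms large by scoring real images near 1 and generated images near 0, while the generator tries to make the second term small by producing images that the discriminator scores as real.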

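Building the GAN involves the following stages:

Initializing the network parameters and loading the data
Building the generator
Building the discriminator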

We'll use the PyTorch library to build our GAN. It is fast and doesn't require a great deal of computing power.

Installing PyTorch with CUDA 10 Using Conda

# CUDA 10.0
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch

Initializing GAN Parameters and Loading Data

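The GAN is made up of two convolutional neural networks (CNNs): the discriminator, built from convolution, batch normalization, and LeakyReLU layers, and the generator, built from transposed convolution (deconvolution), batch normalization, and ReLU layers.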

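Before we start building the generator and discriminator networks, let's set some parameters and load the fashion image dataset that will be used to train and test the networks.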

First, we import some dependencies.

from __future__ import print_function
#%matplotlib inline
import argparse
import os
import random
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.optim as optim
import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.utils as vutils
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML

Next, we set random seeds for reproducibility.

# Set random seed for reproducibility
manualSeed = 999
#manualSeed = random.randint(1, 10000) # use if you want new results
print("Random Seed: ", manualSeed)
random.seed(manualSeed)
torch.manual_seed(manualSeed)
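
To see why fixing the seed matters, here is a minimal sketch that uses only Python's built-in random module. The helper name draw is hypothetical and not part of the article's code. torch.manual_seed plays the same role for PyTorch's own random number generator, although results on a GPU can still vary slightly depending on cuDNN settings.

```python
import random


def draw(seed, count=3):
    """Seed Python's RNG, then draw `count` integers in [1, 10000]."""
    random.seed(seed)
    return [random.randint(1, 10000) for _ in range(count)]


first = draw(999)
second = draw(999)

# Re-seeding with the same value replays exactly the same sequence.
print(first == second)  # True
```

Switching to the commented-out random.randint seed in the code above gives a different sequence, and therefore different generated images, on every run.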

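Then we set some important parameters, such as the number of feature maps, the input image size, the batch size, the number of epochs, and the learning rate.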

# Root directory for dataset
dataroot = r"C:\Users\abdul\Desktop\ContentLab\P2\DeepFashion\Train"

# Number of workers for dataloader
workers = 2

# Batch size during training
batch_size = 128

# Spatial size of training images. All images will be resized to this
#   size using a transformer.
image_size = 64

# Number of channels in the training images. For color images this is 3
nc = 3

# Size of z latent vector (i.e. size of generator input)
nz = 100

# Size of feature maps in generator
ngf = 64

# Size of feature maps in discriminator
ndf = 64

# Number of training epochs
num_epochs = 40

# Learning rate for optimizers
lr = 0.0002

# Beta1 hyperparam for Adam optimizers
beta1 = 0.5

# Number of GPUs available. Use 0 for CPU mode.
ngpu = 1

Now we can load the data with a dataloader and display some sample data.

dataset = dset.ImageFolder(root=dataroot,
                           transform=transforms.Compose([
                               transforms.Resize(image_size),
                               transforms.CenterCrop(image_size),
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                           ]))
# Create the dataloader
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                         shuffle=True, num_workers=workers)

# Decide which device we want to run on
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")

# Plot some training images
real_batch = next(iter(dataloader))
plt.figure(figsize=(8,8))
plt.axis("off")
plt.title("Training Images")
plt.imshow(np.transpose(vutils.make_grid(real_batch[0].to(device)[:64], padding=2, normalize=True).cpu(),(1,2,0)))
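
The transform pipeline above does three things to each image: it scales the shorter side to image_size, crops a centered square, and maps pixel values from [0, 1] to [-1, 1]. The pure-Python sketch below reproduces that arithmetic for a single hypothetical 300 x 450 image. The helper names are made up for illustration, and the rounding is an approximation of what torchvision does rather than a copy of its implementation.

```python
def resize_shorter_side(width, height, size=64):
    """Scale so the shorter side equals `size`, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width, height, size=64):
    """Return (left, top, right, bottom) of a centered size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization: (value - mean) / std."""
    return (value - mean) / std


w, h = resize_shorter_side(300, 450)  # hypothetical portrait-shaped image
print((w, h))                         # (64, 96)
print(center_crop_box(w, h))          # (0, 16, 64, 80)
print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
```

Normalizing to [-1, 1] matches the range of the Tanh activation at the end of the generator, so real and generated images share the same value range when they reach the discriminator.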

Finally, we'll use the function below to initialize the weights of the generator and discriminator networks.

# custom weights initialization 
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)
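
weights_init decides what to do by searching each module's class name for a substring. The sketch below mirrors only that branching logic in plain Python, so it is easy to see which layer types are affected. init_rule is a hypothetical helper that returns a description instead of touching any weights.

```python
def init_rule(classname):
    """Mirror the branches of weights_init, returning a description only."""
    if classname.find('Conv') != -1:
        return "weight ~ N(mean=0.0, std=0.02)"
    elif classname.find('BatchNorm') != -1:
        return "weight ~ N(mean=1.0, std=0.02), bias = 0"
    return "left unchanged"


for name in ["Conv2d", "ConvTranspose2d", "BatchNorm2d", "ReLU", "Tanh"]:
    print(name, "->", init_rule(name))
```

Because the test is a substring match, 'Conv' covers both Conv2d in the discriminator and ConvTranspose2d in the generator, while activation layers are skipped.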

Building the Generator from Scratch

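The generator CNN consists of transposed convolution layers, batch normalization layers, and ReLU activations. Its input is a latent vector z drawn from a standard normal distribution, and its output is a 3 x 64 x 64 RGB image.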

class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        return self.main(input)
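
The "state size" comments in the generator can be checked by hand. For a transposed convolution with no dilation and no output padding, the output size is (input - 1) * stride - 2 * padding + kernel. The sketch below applies that formula to the five ConvTranspose2d layers, using the kernel, stride, and padding values from the class above. The helper name is an assumption for illustration.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Spatial output size of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# (kernel, stride, padding) for each ConvTranspose2d layer in the generator
generator_layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1  # the latent vector z enters as an nz x 1 x 1 tensor
sizes = [size]
for kernel, stride, padding in generator_layers:
    size = conv_transpose_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [1, 4, 8, 16, 32, 64]
```

The first layer expands the 1 x 1 input to 4 x 4, and each later layer doubles the spatial size until it reaches 64 x 64.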

Now we create the netG generator and display its structure.

# Create the generator
netG = Generator(ngpu).to(device)

# Handle multi-gpu if desired
if (device.type == 'cuda') and (ngpu > 1):
    netG = nn.DataParallel(netG, list(range(ngpu)))

# Apply the weights_init function to randomly initialize all weights
#  to mean=0, stdev=0.02.
netG.apply(weights_init)

# Print the model
print(netG)

Building the Discriminator from Scratch

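Our discriminator, netD, consists of strided convolution layers, LeakyReLU activations, and batch normalization layers. Its input is a 3 x 64 x 64 image, and its output is a scalar probability that the input came from the real dataset.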

class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)

# Create the Discriminator
netD = Discriminator(ngpu).to(device)

# Handle multi-gpu if desired
if (device.type == 'cuda') and (ngpu > 1):
    netD = nn.DataParallel(netD, list(range(ngpu)))

# Apply the weights_init function to randomly initialize all weights
#  to mean=0, stdev=0.02.
netD.apply(weights_init)

# Print the model
print(netD)
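
The discriminator runs the generator's size progression in reverse. For an ordinary convolution without dilation, the output size is floor((input + 2 * padding - kernel) / stride) + 1. The sketch below traces a 64 x 64 input through the five Conv2d layers, using the kernel, stride, and padding values from the class above. The helper name is an assumption for illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a standard convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) for each Conv2d layer in the discriminator
discriminator_layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

size = 64  # input images are nc x 64 x 64
sizes = [size]
for kernel, stride, padding in discriminator_layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [64, 32, 16, 8, 4, 1]
```

Each strided layer halves the spatial size, and the last layer collapses the 4 x 4 map to a single value that the Sigmoid turns into a probability.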

Initializing the Loss and Optimizers for the GAN

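Before we start training our GAN, we set up its loss function and optimizers. GANs typically use binary cross-entropy as the loss function because the output has two classes: fake (0) and real (1). We'll use Adam optimizers with a learning rate of 0.0002 and a Beta1 of 0.5.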

# Initialize BCELoss function
criterion = nn.BCELoss()

# Create batch of latent vectors that we will use to visualize
#  the progression of the generator
fixed_noise = torch.randn(64, nz, 1, 1, device=device)

# Establish convention for real and fake labels during training
real_label = 1.
fake_label = 0.

# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
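
To make the loss concrete, here is a minimal pure-Python sketch of binary cross-entropy for a single prediction. It is not how nn.BCELoss is implemented: the bce helper and its eps clamp are assumptions added for illustration, and nn.BCELoss by default averages this quantity over the whole batch.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() away from zero
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


real_label, fake_label = 1.0, 0.0

# The discriminator outputs 0.9 ("probably real") in the first two cases.
print(round(bce(0.9, real_label), 4))  # 0.1054, confident and correct
print(round(bce(0.9, fake_label), 4))  # 2.3026, confident but wrong
print(round(bce(0.5, real_label), 4))  # 0.6931, undecided
```

A confident correct answer costs little, a confident wrong answer costs a lot, and an undecided output of 0.5 costs ln 2 regardless of the label.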

Next Steps

In the next article, we'll show you how to train the GAN to generate fashion designs. Stay tuned!