Preparing Images for AI Model Training

Sergio Virahonda

January 26, 2021

CPOL

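In this article, we prepare face mask images for AI model training.

Introduction

In the previous article of this series, we discussed the different approaches to creating a face mask detector. In this article, we'll prepare a dataset for the mask detector solution.

The process of gathering images, preprocessing them, and augmenting the resulting dataset is essentially the same for any image dataset. We'll take the long route to cover a real-world scenario where data is scarce. I obtained images from two different sources, and I'll show you how to normalize and augment them for future labeling.

Although there are automated tools that make this process painless, we'll do it the hard way to learn more.

We'll use a Roboflow dataset that contains 149 photos of people wearing face masks, all with black padding and "the same size", along with a second set from a completely different source on Kaggle that contains only human faces (without masks). With these two datasets representing the two classes (faces with masks and faces without masks), let's go through the steps to normalize and augment the data.

Roboflow Dataset Normalization

I'll use Kaggle notebooks to run the code in this article because they offer easy access to computing power and come preconfigured with all the tools we need, so we don't have to install Python, TensorFlow, or anything else. They aren't mandatory, though; if you prefer, you can get the same results by running a Jupyter Notebook locally.

In this case, I manually downloaded the dataset, zipped it, and uploaded it to a Kaggle Notebook. To launch a Kaggle Notebook, go to https://kaggle.com, log in, go to Notebooks in the left panel, and click New notebook. Once it's running, upload the zip file and run the following cells.

Basic library imports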

import os # to explore directories
import matplotlib.pyplot as plt #to plot images
#import matplotlib.image as mpimg
import cv2 #to make image transformations
from PIL import Image,ImageOps #for images handling

Let's explore the dimensions of the images. We'll read each image, get its shape, and collect the unique shapes found in the dataset:

#Image size exploration
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)
 
print('Unique shapes at imageset: ',set(shapes))

Here I got something I didn't expect. This is the output:

Unique shapes at imageset:  {(415, 415, 3), (415, 416, 3), (416, 415, 3), (416, 416, 3)}
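
The set above shows which shapes exist, but not how many images have each one. A minimal sketch of a per-shape count with collections.Counter could look like the following. The shapes list here is a small made-up sample standing in for the list built in the previous cell, not the real dataset.

```python
from collections import Counter

# Hypothetical sample standing in for the `shapes` list collected above
shapes = [(416, 416, 3), (415, 416, 3), (416, 416, 3), (416, 415, 3), (416, 416, 3)]

# Count how many images share each (height, width, channels) shape
shape_counts = Counter(shapes)

# Print the shapes from most to least common
for shape, count in shape_counts.most_common():
    print(shape, count)
```

If one shape dominates, that is a reasonable hint about which target size needs the least resizing.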

As you may know, a model generally can't be trained on images with mixed dimensions. Let's normalize them to a single size (415x415):

def make_square(image, minimum_size=256, fill=(0, 0, 0)):
    # Pad the image with `fill` (black by default) so it becomes a centered square
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGB', (size, size), fill)
    new_image.paste(image, ((size - x) // 2, (size - y) // 2))
    return new_image
 
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
            new_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(new_image)
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/"+str(counter)+"-roboflow.jpg")
            if counter == 150:
                break
    if counter == 150:
        break  # also stop walking the remaining directories
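
To make the padding arithmetic inside make_square easier to follow, here is a small sketch that reproduces only the geometry, without touching any image. The helper name square_padding is hypothetical and introduced purely for illustration; it mirrors the canvas size and paste offsets that make_square computes.

```python
def square_padding(width, height, minimum_size=256):
    """Return the square canvas size and the (left, top) paste offsets."""
    size = max(minimum_size, width, height)
    left = (size - width) // 2   # horizontal padding on the left side
    top = (size - height) // 2   # vertical padding on the top side
    return size, (left, top)

# A 415x416 image only needs a canvas that is one pixel wider
print(square_padding(415, 416))
# A small 100x50 image is centered on the 256x256 minimum canvas
print(square_padding(100, 50))
```

Because the image is padded to a square before it is resized to 415x415, its aspect ratio is preserved and only black borders are added.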

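A convenient directory for saving files in Kaggle and getting them as output is /kaggle/working.

Before downloading the normalized dataset, run this cell to zip all the images so that you can find the final archive more easily: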

!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

Now you can look for the output.zip file in the directory browser on the right.
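
If you run the code outside Kaggle, or simply prefer to stay in Python rather than use shell commands, the same archive could be built with the standard library. The sketch below is one possible alternative, not what the notebook above does. It uses temporary folders and empty placeholder files in place of the real image folder, and it writes the archive to a separate folder so that the zip file does not end up inside the folder being archived.

```python
import os
import shutil
import tempfile
import zipfile

# Throwaway folders standing in for the image folder and the download location
image_dir = tempfile.mkdtemp()
output_dir = tempfile.mkdtemp()

# Create two empty placeholder files in place of real normalized images
for name in ("1-roboflow.jpg", "2-roboflow.jpg"):
    open(os.path.join(image_dir, name), "wb").close()

# Build output.zip from the contents of image_dir
archive_path = shutil.make_archive(
    os.path.join(output_dir, "output"), "zip", root_dir=image_dir
)

# List what ended up in the archive
with zipfile.ZipFile(archive_path) as archive:
    archived_names = sorted(archive.namelist())
print(archived_names)
```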

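Human Faces Dataset Normalization

The approach for this task is slightly different from the one we used for the Roboflow dataset above. This time, the dataset contains more than 4,000 images with widely varying dimensions. Go to the dataset link and launch a Jupyter Notebook from there. We'll take the first 150 images.

Basic imports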

import os # to explore directories
import matplotlib.pyplot as plt #to plot images
import cv2 #to make image transformations
from PIL import Image #for images handling

If you want to explore the dataset:

#How many images do we have?
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
print('Images in directory: ',counter)
 
#Let's explore an image
%matplotlib inline
plt.figure()
image = cv2.imread('/kaggle/input/human-faces/Humans/1 (719).jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(image)
plt.show()
 
 
#Image size exploration
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)
 
print('Unique shapes at imageset: ',set(shapes))

The last cell returned a wide variety of dimensions, so normalization is necessary. Let's resize all the images to 415x415 with black padding:

def make_square(image, minimum_size=256, fill=(0, 0, 0, 0)):
    # Pad the image with `fill` so it becomes a centered square
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGBA', (size, size), fill)
    new_image.paste(image, ((size - x) // 2, (size - y) // 2))
    return new_image
 
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
            test_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(test_image)
            new_image = new_image.convert("RGB")
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/"+str(counter)+"-kaggle.jpg")
            if counter == 150:
                break
    if counter == 150:
        break  # also stop walking the remaining directories
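
Taking only the first 150 images from a nested directory walk can also be expressed with a generator and itertools.islice, which stops the walk as soon as enough files have been found. The following is a sketch under stated assumptions: the helper names iter_jpgs and first_n_jpgs are hypothetical, and the demo runs on a temporary folder with empty placeholder files rather than on /kaggle/input.

```python
import itertools
import os
import tempfile

def iter_jpgs(root):
    """Yield the path of every .jpg file found under root."""
    for dirname, _, filenames in os.walk(root):
        for filename in sorted(filenames):
            if filename.endswith('.jpg'):
                yield os.path.join(dirname, filename)

def first_n_jpgs(root, n=150):
    """Return at most n image paths, stopping once n have been found."""
    return list(itertools.islice(iter_jpgs(root), n))

# Demo on a throwaway folder with five placeholder images and one other file
demo_dir = tempfile.mkdtemp()
for name in ("a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg", "notes.txt"):
    open(os.path.join(demo_dir, name), "wb").close()

selected = first_n_jpgs(demo_dir, n=3)
print([os.path.basename(path) for path in selected])
```

Sorting the file names makes the selection repeatable between runs, which os.walk alone does not guarantee.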

To download the dataset:

!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

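Now you can easily find it in the right panel.

Dataset Augmentation

Once you have both normalized datasets, it's time to join the data and augment the resulting set. Data augmentation gives us a way to artificially generate more training data from a relatively small dataset. Augmentation is often necessary because models generally need a large amount of data to achieve good results during training.

Unzip both files on your computer, put all the images in the same folder, zip them, launch a new Kaggle Notebook (mine is here), and upload the resulting file.

Next, let's see what you have to do to augment the data. We could take some shortcuts by using an automated service, but we've decided to do everything ourselves in order to learn more.

Basic imports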

import numpy as np
from numpy import expand_dims
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import cv2
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from PIL import Image

Let's jump straight into augmentation. We'll use the ImageDataGenerator class from Keras, which is widely used in the computer vision community:

def data_augmentation(filename):
    
    """
    This function performs data augmentation: for each image, it creates
    zoomed, brightness-adjusted, and rotated versions, 5 for every modification type.
    In total, it creates 15 extra images for every image in the original dataset.
    """
    
    image_data = []
    #reading the image
    image = cv2.imread(filename,3)
    #image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    #expanding the image dimension to one sample
    samples = expand_dims(image, 0)
    # creating the image data augmentation generators
    datagen1 = ImageDataGenerator(zoom_range=[0.5,1.2])
    datagen2 = ImageDataGenerator(brightness_range=[0.2,1.0])
    datagen3 = ImageDataGenerator(rotation_range=20)
      
    # preparing iterators
    it1 = datagen1.flow(samples, batch_size=1)
    it2 = datagen2.flow(samples, batch_size=1)
    it3 = datagen3.flow(samples, batch_size=1)
    image_data.append(image)
    for i in range(5):
        # generating batch of images
        batch1 = next(it1)
        batch2 = next(it2)
        batch3 = next(it3)
        # convert to unsigned integers
        image1 = batch1[0].astype('uint8')
        image2 = batch2[0].astype('uint8')
        image3 = batch3[0].astype('uint8')
        #appending to the list of images
        image_data.append(image1)
        image_data.append(image2)
        image_data.append(image3)
        
    return image_data
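
As a quick sanity check on how much data this produces, the arithmetic can be worked out directly. The sketch below assumes that all 149 Roboflow images and all 150 selected Kaggle images make it into the combined folder; the variable names are illustrative only.

```python
# Image counts taken from the text above; assumes every image is kept
roboflow_images = 149
kaggle_images = 150

variants_per_type = 5    # images generated per modification type
modification_types = 3   # zoom, brightness, rotation

# Each original yields itself plus 15 augmented copies
images_per_original = 1 + variants_per_type * modification_types
total_images = (roboflow_images + kaggle_images) * images_per_original
print(images_per_original, total_images)
```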

To put it to work, let's iterate over every image in the /kaggle/input directory and save all the results in /kaggle/working for future download:

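counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if not filename.endswith('.jpg'):
            continue  # skip anything that is not a JPEG image
        print(os.path.join(dirname, filename))
        result = data_augmentation(os.path.join(dirname, filename))
        for augmented in result:
            # number the files so that each image gets its own output file
            counter += 1
            cv2.imwrite('/kaggle/working/'+str(counter)+'.jpg', augmented)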

Again, before downloading, just run the next two lines to find the files more easily in the right panel:

!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

Now you can download the output.zip file.

Next Steps

In the next article, we'll see how to properly label the resulting images in order to train a YOLO model. Stay tuned!