Object Detection with an IP Camera using Python and CodeProject.AI Server, Part 2

Chris Maunder

May 23, 2023

CPOL

The second in a two-part series on detecting objects and evil rodents.

Scheming Racoon

Introduction

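In our previous article, Detecting raccoons using CodeProject.AI Server, Part 1, we showed how to connect to the video stream from a Wyze camera and send it to CodeProject.AI Server in order to detect objects.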

In this article, we will train our own raccoon-specific model and set up a simple alert that tells us when one of these "trash pandas" is on a mission.

Training a Model for CodeProject.AI Server Object Detection

CodeProject.AI Server comes, out of the box, with several object detection modules. For simplicity we'll focus on the YOLOv5 6.2 module, which means training a YOLOv5 PyTorch model.

Setup

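This article is based on Matthew Dennis' far more comprehensive article, How to Train a Custom YOLOv5 Model to Detect Objects. We'll summarise the setup quickly so you can get up and running.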

Starting in Visual Studio Code, install the Jupyter notebook extension and use the Jupyter notebook provided in Matthew's article. In that notebook, a Python virtual environment is created:

!python -m venv venv

Clone the Ultralytics YOLOv5 repository

!git clone https://github.com/ultralytics/yolov5

and set the virtual environment that VS Code will use by selecting "venv" from the list at the top right of the notebook.

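The download that contains the Jupyter notebook also includes the requirements files (`requirements-cpu.txt` and `requirements-gpu.txt`, used below) that list the dependencies to install when setting up the Python environment.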

Install the dependencies

%pip install fiftyone
%pip install -r requirements-cpu.txt 
%pip install ipywidgets

If you have an NVIDIA GPU, use `requirements-gpu.txt` instead.

Training Data

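To train a custom model we need images to build the model from. An important point when choosing training data for a model like ours is that we only want to detect raccoons, so we need to be sure the model learns raccoons and not squirrels, cats, or very, very small bears. To do this, we'll train a model that includes raccoons, dogs, cats, squirrels, and skunks, so that it learns to tell the look-alikes apart.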

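We'll use the excellent fiftyone package to grab images from the extensive Open Images dataset. First we create a `critters` dataset that contains only raccoons, then we progressively add cats, dogs, and the other animals to that collection.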

    import fiftyone as fo
    import fiftyone.zoo as foz

    splits = ["train", "validation", "test"]
    numSamples = 10000
    seed = 42

    # Get up to 10,000 images (possibly in total, possibly per split) from
    # FiftyOne. We'll ask FiftyOne to use images from the open-images-v6
    # dataset and store information about this download in the dataset named
    # "open-images-critters".

    # The data that's downloaded will include the images, annotations, and
    # a summary of what's been downloaded. That summary will be stored in
    # /Users/<username>/.FiftyOne in a MongoDB database. The images and
    # annotations will be in /Users/<username>/FiftyOne.

    if fo.dataset_exists("open-images-critters"):
        fo.delete_dataset("open-images-critters")

    dataset = foz.load_zoo_dataset(
        "open-images-v6",
        splits=splits,
        label_types=["detections"],
        classes="Raccoon",
        max_samples=numSamples,
        seed=seed,
        shuffle=True,
        dataset_name="open-images-critters")

    # Take a quick peek to see what's there
    print(dataset)

    # Do the same for cats, dogs, squirrels, and skunks, but after each
    # download we'll merge the new downloaded dataset with the existing 
    # open-images-critters dataset so we can build up one large, 
    # multi-class set

    if fo.dataset_exists("open-images-cats"):
        fo.delete_dataset("open-images-cats")

    cats_dataset = foz.load_zoo_dataset(
        "open-images-v6",
        splits=splits,
        label_types=["detections"],
        classes="Cat",
        max_samples=numSamples,
        seed=seed,
        shuffle=True,
        dataset_name="open-images-cats")

    # Now merge this new set with the existing open-images-critters set
    dataset.merge_samples(cats_dataset)

    if fo.dataset_exists("open-images-dogs"):
        fo.delete_dataset("open-images-dogs")

    dogs_dataset = foz.load_zoo_dataset(
        "open-images-v6",
        splits=splits,
        label_types=["detections"],
        classes="Dog",
        max_samples=numSamples,
        seed=seed,
        shuffle=True,
        dataset_name="open-images-dogs")

    dataset.merge_samples(dogs_dataset)

    if fo.dataset_exists("open-images-squirrels"):
        fo.delete_dataset("open-images-squirrels")

    squirrels_dataset = foz.load_zoo_dataset(
        "open-images-v6",
        splits=splits,
        label_types=["detections"],
        classes="Squirrel",
        max_samples=numSamples,
        seed=seed,
        shuffle=True,
        dataset_name="open-images-squirrels")

    dataset.merge_samples(squirrels_dataset)

    if fo.dataset_exists("open-images-skunks"):
        fo.delete_dataset("open-images-skunks")

    skunks_dataset = foz.load_zoo_dataset(
        "open-images-v6",
        splits=splits,
        label_types=["detections"],
        classes="Skunk",
        max_samples=numSamples,
        seed=seed,
        shuffle=True,
        dataset_name="open-images-skunks")

    dataset.merge_samples(skunks_dataset)

    # For whenever you want to see what's been loaded.
    print(fo.list_datasets())

    # Uncomment the following line if you wish to explore the 
    # resulting datasets in the FiftyOne UI
    # session = fo.launch_app(dataset, port=5151)
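The five download-and-merge steps above differ only in the class name and the dataset name, so they can be folded into a loop. The following is a minimal sketch under that assumption, not the notebook's own code: the helper names `zoo_kwargs` and `build_critters_dataset` are hypothetical, while the `fiftyone` calls and arguments are the ones already used above.

```python
from typing import Any, Dict

SPLITS = ["train", "validation", "test"]

# Classes merged into the raccoon dataset, with the dataset name used for each
EXTRA_CLASSES = {
    "Cat": "open-images-cats",
    "Dog": "open-images-dogs",
    "Squirrel": "open-images-squirrels",
    "Skunk": "open-images-skunks",
}

def zoo_kwargs(class_name: str, dataset_name: str,
               max_samples: int = 10000, seed: int = 42) -> Dict[str, Any]:
    """Keyword arguments for load_zoo_dataset, matching the calls above."""
    return {
        "splits": SPLITS,
        "label_types": ["detections"],
        "classes": class_name,
        "max_samples": max_samples,
        "seed": seed,
        "shuffle": True,
        "dataset_name": dataset_name,
    }

def build_critters_dataset():
    """Download each class and merge everything into open-images-critters."""
    # Imported here so the helper above can be used without FiftyOne installed
    import fiftyone as fo
    import fiftyone.zoo as foz

    def fresh_load(class_name: str, dataset_name: str):
        # Start from a clean slate, as the original code does
        if fo.dataset_exists(dataset_name):
            fo.delete_dataset(dataset_name)
        return foz.load_zoo_dataset("open-images-v6",
                                    **zoo_kwargs(class_name, dataset_name))

    dataset = fresh_load("Raccoon", "open-images-critters")
    for class_name, dataset_name in EXTRA_CLASSES.items():
        dataset.merge_samples(fresh_load(class_name, dataset_name))
    return dataset

# In the notebook: dataset = build_critters_dataset()
```

Adding another look-alike animal then means adding one entry to `EXTRA_CLASSES` (and to the `classes` list used when exporting).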

The next step is to export this training data in the format required by the YOLOv5 trainer:

    import fiftyone as fo

    export_dir = "datasets/critters"
    label_field = "detections"  # for example

    # The splits to export
    splits = ["train", "validation","test"]

    # All splits must use the same classes list
    classes = ["Raccoon", "Cat", "Dog", "Squirrel", "Skunk"]

    # The dataset or view to export
    # We assume the dataset uses sample tags to encode the splits to export
    dataset_or_view = fo.load_dataset("open-images-critters")

    # Export the splits
    for split in splits:
        split_view = dataset_or_view.match_tags(split)
        split_view.export(
            export_dir=export_dir,
            dataset_type=fo.types.YOLOv5Dataset,
            label_field=label_field,
            split=split,
            classes=classes,
        )

During this process a `datasets\critters\dataset.yaml` file is created. We need to tweak it slightly by renaming the `validation` key to `val`. Your file should look like this:

names: 
- Raccoon 
- Cat 
- Dog 
- Squirrel 
- Skunk 
nc: 5 
path: c:\Dev\YoloV5_Training\datasets\critters 
train: .\images\train\ 
test: .\images\test\ 
val: .\images\validation\

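`nc` is the number of classes, which is `5` (Raccoon, Cat, Dog, Squirrel, Skunk), `path` is the path to the images, and `train`, `test` and `val` are the folders containing the training, test, and validation data used while training our model.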
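If you'd rather not edit the file by hand, the key rename can be scripted. This is a small sketch that treats the file as plain text so it needs no YAML library; the function name `rename_validation_key` and the sample text (written with forward slashes for brevity) are illustrative assumptions.

```python
import re

def rename_validation_key(yaml_text: str) -> str:
    """Rename a top-level `validation:` key to `val:`.

    Only a key at the start of a line is changed, so the folder name
    inside the path (images/validation) is left alone.
    """
    return re.sub(r"(?m)^validation:", "val:", yaml_text)

# Hypothetical sample of an exported dataset.yaml
sample = "\n".join([
    "nc: 5",
    "train: ./images/train/",
    "test: ./images/test/",
    "validation: ./images/validation/",
])

fixed = rename_validation_key(sample)
print(fixed)

# To apply it to the real file:
#   from pathlib import Path
#   path = Path("datasets/critters/dataset.yaml")
#   path.write_text(rename_validation_key(path.read_text()))
```

Running it twice is harmless, since a file that already uses `val:` is returned unchanged.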

A Note on the Number of Images

Two ways to improve a model's accuracy are:

Train for longer (more "epochs", or training iterations)
Train with more data (more images)

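You may need to adjust the number of images to suit your setup. Resource use can be considerable, and the more images you use, the longer training takes. With 50 epochs and 1,000 images, training took about 50 minutes on an NVIDIA 3060 GPU. With 25,000 images and 300 epochs it took around 30 hours.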
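To get a feel for how these two data points compare, we can work out how many "image-epochs" (images multiplied by epochs) were processed per minute in each run. This is plain arithmetic on the approximate figures quoted above, not a benchmark, and the function name is just for illustration.

```python
def image_epochs_per_minute(images: int, epochs: int, minutes: float) -> float:
    """Total image passes (images x epochs) divided by wall-clock minutes."""
    return images * epochs / minutes

small = image_epochs_per_minute(1_000, 50, 50)         # 50 minutes
large = image_epochs_per_minute(25_000, 300, 30 * 60)  # 30 hours

print(f"small run: {small:.0f} image-epochs per minute")
print(f"large run: {large:.0f} image-epochs per minute")
```

On these figures the larger run got through roughly four times as many image-epochs per minute. Scaling the small run linearly would have predicted about 125 hours rather than 30, so a short trial run is likely to overestimate the time a long one needs. The article's numbers are approximate, so treat this as a rough guide only.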

Training the Model

To start training the model from within our Jupyter notebook, we use the `!` syntax to launch an external process that runs the `yolov5/train.py` Python module.

!python yolov5/train.py --batch 24 --weights yolov5s.pt --data datasets/critters/dataset.yaml --project train/critters --name epochs300 --epochs 300

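We set the `batch` parameter to `24` simply to make sure we didn't run out of memory. Our machine has 16GB of system memory and 12GB of dedicated GPU memory. A batch size of 32 was fine for the smaller (1,000 image) dataset, but too high for the larger image set. You may need to experiment to find the best batch size for your machine.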

Interrupting and Resuming Training

You can stop training at any time and restart it using the `--resume` flag.

!python yolov5/train.py --resume train/critters/epochs300/weights/last.pt
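The folder that `--resume` reads from is determined by the `--project` and `--name` values given to the original training run, so the two commands need to agree. One way to keep them in step is to build both from the same values. The sketch below is a hypothetical helper, not part of YOLOv5; it uses only the flags shown above.

```python
import shlex
from typing import List

PROJECT = "train/critters"
DATA = "datasets/critters/dataset.yaml"

def run_name(epochs: int) -> str:
    """Name the run after its epoch count, e.g. epochs300."""
    return f"epochs{epochs}"

def train_command(epochs: int, batch: int = 24) -> List[str]:
    """Arguments for a fresh training run."""
    return ["python", "yolov5/train.py",
            "--batch", str(batch),
            "--weights", "yolov5s.pt",
            "--data", DATA,
            "--project", PROJECT,
            "--name", run_name(epochs),
            "--epochs", str(epochs)]

def resume_command(epochs: int) -> List[str]:
    """Arguments to resume the run created by train_command(epochs)."""
    last = f"{PROJECT}/{run_name(epochs)}/weights/last.pt"
    return ["python", "yolov5/train.py", "--resume", last]

print(shlex.join(train_command(300)))
print(shlex.join(resume_command(300)))
```

The printed lines can be pasted into a notebook cell after a `!`, or the lists can be passed to `subprocess.run` directly.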

Using Our Model

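Take the `critters.pt` file created by our training (YOLOv5 saves its best weights as `best.pt` in the run's `weights` folder, so copy that file and rename it `critters.pt`) and place it in `C:\Program Files\CodeProject\AI\modules\ObjectDetectionYolo\custom-models`. CodeProject.AI Server will immediately be able to use this new model, with no changes or restarts, via the route `vision/custom/critters`, **provided you are using the YOLO 6.2 module**. Each module has its own custom model location.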
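Copying the model into place can also be scripted. This is a minimal sketch, assuming the trained weights are the `best.pt` file that YOLOv5 writes to the run's `weights` folder; the function name `install_custom_model` is hypothetical, and the demonstration below uses a temporary folder and a fake file rather than a real model.

```python
import shutil
import tempfile
from pathlib import Path

def install_custom_model(weights_file: Path, custom_models_dir: Path,
                         model_name: str) -> Path:
    """Copy trained weights into a custom-models folder as <model_name>.pt."""
    if not weights_file.is_file():
        raise FileNotFoundError(f"Trained weights not found: {weights_file}")
    custom_models_dir.mkdir(parents=True, exist_ok=True)
    target = custom_models_dir / f"{model_name}.pt"
    shutil.copyfile(weights_file, target)
    return target

# Demonstration with a stand-in file in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    fake_weights = Path(tmp) / "best.pt"
    fake_weights.write_bytes(b"not a real model")
    installed = install_custom_model(fake_weights,
                                     Path(tmp) / "custom-models", "critters")
    installed_name = installed.name
    installed_ok = installed.is_file()

print(installed_name)

# For the real thing (paths from this article; writing under Program Files
# will probably need an elevated prompt):
#   install_custom_model(
#       Path("train/critters/epochs300/weights/best.pt"),
#       Path(r"C:\Program Files\CodeProject\AI\modules\ObjectDetectionYolo\custom-models"),
#       "critters")
```

The model name becomes the last segment of the route, so `critters.pt` is served at `vision/custom/critters`.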

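We can test this by opening the CodeProject.AI Server Explorer that is installed as part of CodeProject.AI. Select the **Vision** tab, choose an image of a raccoon next to the **Custom Detect** button, select "critters" as the **Model**, and run the test.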

Gotcha

Updating Our Wyze Cam Code to Use This New Model

We'll modify the code from Detecting raccoons using CodeProject.AI Server, Part 1 to add two things:

We'll use our new model.
We'll trigger an alert when a raccoon is detected.

Using the Model

Using the model is straightforward. We'll modify the `do_detection` method to use the new model by changing this line:

        response = session.post(opts.endpoint("vision/detection"),

to

        response = session.post(opts.endpoint("vision/custom/critters"),

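However, to set up an alert we need to know what to look for, and whether we found it. We'll add a parameter that accepts a list of "intruders" to watch for, and return a comma-separated list of the intruders that were found.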

model_name = "critters"             # Model we'll use
intruders  = [ "Raccoon", "Skunk" ] # Things we care about. Must match the model's class names

def do_detection(image: Image, intruders: List[str]) -> "(Image, str)":

    """
    Performs object detection on an image and returns an image with the objects
    that were detected outlined, as well as a de-duped list of objects detected.
    If nothing detected, image and list of objects are both returned as None
    """

    # Convert to format suitable for a POST
    buf = io.BytesIO()
    image.save(buf, format='JPEG')
    buf.seek(0)

    # Better to have a session object created once at the start and closed at
    # the end, but we keep the code simpler here for demo purposes    
    with requests.Session() as session:
        response = session.post(opts.endpoint("vision/custom/" + model_name),
                                files={"image": ('image.jpg', buf, 'image/jpeg') },
                                data={"min_confidence": 0.5}).json()

    # Get the predictions. The key may be missing or null if the call failed,
    # so fall back to an empty list
    predictions = response.get("predictions") or []

    detected_list = []

    if predictions:
        # Draw each bounding box that was returned by the AI engine
        # font = ImageFont.load_default()
        font_size = 25
        padding   = 5
        font = ImageFont.truetype("arial.ttf", font_size)
        draw = ImageDraw.Draw(image)

        for obj in predictions:     # 'obj' avoids shadowing the built-in 'object'
            label = obj["label"]
            conf  = obj["confidence"]
            y_max = int(obj["y_max"])
            y_min = int(obj["y_min"])
            x_max = int(obj["x_max"])
            x_min = int(obj["x_min"])

            draw.rectangle([(x_min, y_min), (x_max, y_max)], outline="red", width=5)
            draw.rectangle([(x_min, y_min - 2*padding - font_size), 
                            (x_max, y_min)], fill="red", outline="red")
            draw.text((x_min + padding, y_min - padding - font_size),
                       f"{label} {round(conf*100.0,0)}%", font=font)

            # We're looking for specific objects. Build a deduped list
            # containing only the objects we're interested in.
            if label in intruders and label not in detected_list:
                detected_list.append(label)

    # All done. Did we find any objects we were interested in?
    if detected_list:
        return image, ', '.join(detected_list)

    return None, None
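The filtering step in `do_detection` can be pulled out into a small function that is easy to test without a camera or a running server. The sketch below is an illustration rather than the article's code: `find_intruders` is a hypothetical name, the sample predictions are made up, and the match is made case-insensitive so that a watch list of `raccoon` still matches a model label of `Raccoon`.

```python
from typing import Dict, List, Optional

def find_intruders(predictions: Optional[List[Dict]],
                   intruders: List[str]) -> List[str]:
    """Return the de-duplicated labels in predictions that are on the watch list."""
    wanted = {name.lower() for name in intruders}
    found: List[str] = []
    for prediction in predictions or []:
        label = str(prediction.get("label", ""))
        # Compare in lower case, but report the label as the model spells it
        if label.lower() in wanted and label not in found:
            found.append(label)
    return found

# Made-up predictions in the shape the server returns
sample_predictions = [
    {"label": "Raccoon", "confidence": 0.91},
    {"label": "Cat",     "confidence": 0.77},
    {"label": "Raccoon", "confidence": 0.64},
]

result = find_intruders(sample_predictions, ["raccoon", "skunk"])
print(", ".join(result))  # Raccoon
```

The cat is ignored because it isn't on the watch list, and the second raccoon is dropped as a duplicate.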

Next, we'll modify the `main` method so that it triggers an alert when a raccoon is detected.

secs_between_checks = 5   # Min secs between sending a frame to CodeProject.AI
last_check_time = datetime(1999, 11, 15, 0, 0, 0)
recipient       = "alerts@acme_security.com"    # Sucker who deals with reports

def main():

    # Open the RTSP stream
    vs = VideoStream(opts.rtsp_url).start() 

    while True:

        # Grab a frame at a time
        frame = vs.read()
        if frame is None:
            continue

        objects_detected = ""

        # Let's not send an alert *every* time we see an object, otherwise we'll
        # get an endless stream of emails, fractions of a second apart
        global last_check_time
        seconds_since_last_check = (datetime.now() - last_check_time).total_seconds()

        if seconds_since_last_check >= secs_between_checks:
            # You may need to convert the colour space.
            # image: Image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            image: Image = Image.fromarray(frame)
            (image, objects_detected) = do_detection(image, intruders)

            # Replace the webcam feed's frame with our image that includes
            # object bounding boxes
            if image:
                frame = np.asarray(image)

            last_check_time = datetime.now()

        # Resize and display the frame on the screen
        if frame is not None:
            frame = imutils.resize(frame, width = 1200)
            cv2.imshow('WyzeCam', frame)

            if objects_detected:
                # Shrink the image to reduce email size
                frame = imutils.resize(frame, width = 600)
                image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                report_intruder(image, objects_detected, recipient)

        # Wait for the user to hit 'q' for quit
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break

    # Clean up and we're outta here.
    cv2.destroyAllWindows()
    vs.stop()

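Note that we're not sending every frame to CodeProject.AI. That would use a fair amount of processor time, and it isn't necessary. A Wyze camera runs at 15 frames per second, but for practical purposes checking one frame every few seconds is enough. Adjust as needed.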
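The same "only act if enough time has passed" check is used for both frame checks and alerts, so it can be wrapped in a small helper. This is a sketch of one way to do that, not the code in the demo: the `Throttle` class is hypothetical, and the clock is injectable so the behaviour can be simulated without waiting.

```python
import time
from typing import Callable, Optional

class Throttle:
    """Allows an action at most once every min_interval_secs seconds."""

    def __init__(self, min_interval_secs: float,
                 clock: Callable[[], float] = time.monotonic):
        self._min_interval = min_interval_secs
        self._clock = clock
        self._last: Optional[float] = None

    def ready(self) -> bool:
        """Return True, and restart the timer, if the interval has elapsed."""
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval:
            self._last = now
            return True
        return False

# Simulate 10 seconds of a 15 frames-per-second stream using a fake clock
fps = 15
fake_now = 0.0
check = Throttle(5, clock=lambda: fake_now)

frames_sent = 0
for frame_number in range(10 * fps):
    fake_now = frame_number / fps
    if check.ready():
        frames_sent += 1

print(frames_sent)  # 2
```

Of the 150 simulated frames, only the first and the one at the five second mark are sent for detection. In the main loop, `if check.ready():` would take the place of the `seconds_since_last_check` comparison.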

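The last piece of the puzzle is the `report_intruder` method. We'll write the list of detected intruders to the console and also send an email to whoever needs to know. For the email, we're using a Gmail account.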

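To enable this, use or create a Gmail account and use the Windows `setx` command to store the account's email address and password in environment variables. **This is not secure**, but it's better than committing your password to a Git repository. Please use a test email account for this, not your real email account.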

setx CPAI_EMAIL_DEMO_FROM "me@gmail.com"
setx CPAI_EMAIL_DEMO_PWD  "password123"
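On the Python side, these values can be read back from the environment. The demo's own `opts` object isn't shown in this article, so the following is only a sketch of how such a lookup might work; `load_email_credentials` is a hypothetical name, and the values passed in below are the placeholders from the `setx` example.

```python
import os
from typing import Mapping, Tuple

def load_email_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the sender account and password, failing clearly if either is missing."""
    account  = env.get("CPAI_EMAIL_DEMO_FROM", "")
    password = env.get("CPAI_EMAIL_DEMO_PWD", "")

    missing = [name for name, value in (("CPAI_EMAIL_DEMO_FROM", account),
                                        ("CPAI_EMAIL_DEMO_PWD", password))
               if not value]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))

    return account, password

# Demonstration using a plain dict in place of the real environment
demo = load_email_credentials({
    "CPAI_EMAIL_DEMO_FROM": "me@gmail.com",
    "CPAI_EMAIL_DEMO_PWD":  "password123",
})
print(demo[0])  # me@gmail.com
```

Note that `setx` only affects processes started after it runs, so open a new terminal (or restart VS Code) before running the script.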

Our `report_intruder` method, and the `send_email` method it uses, are as follows:

last_alert_time = datetime(1999, 11, 15, 0, 0, 0)
secs_between_alerts = 300 # Min secs between sending alerts (don't spam!)

def report_intruder(image: Image, objects_detected: str, recipient: str) -> None:

    # time since we last sent an alert
    global last_alert_time
    seconds_since_last_alert = (datetime.now() - last_alert_time).total_seconds()

    # Only send an alert if there's been sufficient time since the last alert
    if seconds_since_last_alert > secs_between_alerts:

        # Simple console output
        timestamp = datetime.now().strftime("%d %b %Y %I:%M:%S %p")
        print(f"{timestamp} Intruder or intruders detected: {objects_detected}")

        # Send an email alert as well
        with io.BytesIO() as buffered:
            image.save(buffered, format="JPEG")
            img_dataB64_bytes : bytes = base64.b64encode(buffered.getvalue())
            img_dataB64 : str = img_dataB64_bytes.decode("ascii")

        message_html = "<p>An intruder was detected. Please review this image</p>" \
                     + f"<img src='data:image/jpeg;base64,{img_dataB64}'>"
        message_text = "An intruder was detected. We're all doomed!"

        send_email(opts.email_acct, opts.email_pwd, recipient, "Intruder Alert!", 
                   message_text, message_html)

        # Could send an SMS or a tweet. Whatever takes your fancy...

        last_alert_time = datetime.now()

def send_email(sender, pwd, recipient, subject, message_text, message_html):

    msg = MIMEMultipart('alternative')
    msg['From']    = sender
    msg['To']      = recipient
    msg['Subject'] = subject

    text = MIMEText(message_text, 'plain')
    html = MIMEText(message_html, 'html')
    msg.attach(text)
    msg.attach(html)

    try:
        # The context manager closes the connection even if a step fails, and
        # avoids touching `server` if the connection was never established
        with smtplib.SMTP(opts.email_server, opts.email_port) as server:
            server.ehlo()
            server.starttls()
            server.ehlo()
            server.login(sender, pwd)
            server.send_message(msg, sender, [recipient])
    except Exception as ex:
        print(f"Error sending email: {ex}")
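The code above embeds the image in the HTML as a base64 `data:` URI. Some mail clients may not display images embedded that way, so an alternative worth knowing is to attach the image as a related part and reference it by content ID. The sketch below swaps in the standard library's `EmailMessage` API for this and is not the demo's method: `build_alert_message` is a hypothetical name, and the addresses and image bytes are placeholders.

```python
from email.message import EmailMessage
from email.utils import make_msgid

def build_alert_message(sender: str, recipient: str, subject: str,
                        text: str, jpeg_bytes: bytes) -> EmailMessage:
    """Build a plain text + HTML email with the JPEG attached inline by content ID."""
    msg = EmailMessage()
    msg["From"]    = sender
    msg["To"]      = recipient
    msg["Subject"] = subject
    msg.set_content(text)                        # plain-text version

    image_cid = make_msgid(domain="localhost")   # looks like <...@localhost>
    html = ("<p>An intruder was detected. Please review this image</p>"
            f"<img src='cid:{image_cid[1:-1]}'>")  # strip the <> for the src
    msg.add_alternative(html, subtype="html")    # HTML version

    # Attach the image to the HTML part so the cid: reference resolves
    msg.get_payload()[1].add_related(jpeg_bytes, "image", "jpeg", cid=image_cid)
    return msg

demo = build_alert_message("me@gmail.com", "alerts@example.com",
                           "Intruder Alert!", "An intruder was detected.",
                           b"\xff\xd8\xff placeholder, not a real JPEG")
print(demo.get_content_type())  # multipart/alternative
```

A message built this way can be passed to `server.send_message(msg)` in place of the `MIMEMultipart` message.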

Conclusion

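We've walked through taking a stock Wyze camera and updating its firmware so that we can access its RTSP stream for processing. We then used the Open Images dataset to create a custom YOLOv5 model that detects critters. By adding this model to the `custom-models` folder of the YOLOv5 6.2 object detection module in CodeProject.AI Server, we have our very own raccoon detector. With a little more Python, we can use this detector to check our Wyze camera feed regularly and email us whenever one of these masked little bandits comes into view.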

The code is included in the CodeProject.AI Server source code (in `Demos/Python/ObjectDetect/racoon_detect.py`).

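We wrote CodeProject.AI Server to take away the pain of setting up AI systems and projects. We deal with the runtimes, the packages, and getting all the pieces in place so that we can skip straight to the fun parts, such as detecting trash pandas.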

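Please download CodeProject.AI and give it a go. Add your own modules, integrate it with your apps, train some custom models, and use it to learn a little about artificial intelligence.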