Object Detection with an IP Camera using Python and CodeProject.AI Server, Part 2





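The second in a two-part series on detecting objects and evil rodents.
Introduction
In our previous article, "Detecting raccoons using CodeProject.AI Server, Part 1", we showed how to hook up to the video stream from a Wyze camera and send it to CodeProject.AI Server to detect objects.
In this article, we'll train our own raccoon-specific model and set up a simple alert that notifies us when one of these trash pandas is on a mission.
Training a Model for CodeProject.AI Server Object Detection
CodeProject.AI Server comes with several object detection modules out of the box. To keep things simple, we'll focus on the YOLOv5 6.2 module, which means training a YOLOv5 PyTorch model.
Setup
This article is based on Matthew Dennis' more comprehensive article, "How to Train a Custom YOLOv5 Model to Detect Objects". We'll quickly summarize the setup so you can get up and running fast.
Starting with Visual Studio Code, install the Jupyter notebook extension and use the Jupyter notebook provided with Matthew's article. In that notebook, create a Python virtual environment: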
!python -m venv venv
Clone the Ultralytics YOLOv5 repository:
!git clone https://github.com/ultralytics/yolov5
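Then set the virtual environment VS Code will use by selecting "venv" from the list at the top right of the notebook.
The download containing the Jupyter notebook also includes requirements files (`requirements-cpu.txt` and `requirements-gpu.txt`) that list the dependencies to install when setting up the Python environment.
Install the dependencies: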
%pip install fiftyone
%pip install -r requirements-cpu.txt
%pip install ipywidgets
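If you have an NVIDIA GPU, use `requirements-gpu.txt` instead.
Training Data
To train a custom model, we need images to build the model from. An important consideration when choosing training data and building a model like ours is that we only want to detect raccoons, and we want to be sure that what we detect is a raccoon and not a squirrel, a cat, or a very, very small bear. To do this, we'll train a model that includes raccoons, dogs, cats, squirrels, and skunks.
We'll use the excellent fiftyone package to grab images from the extensive Open Images dataset. First we create a `critters` dataset containing only raccoons, then progressively add cats, dogs, and the other animals to that collection.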
import fiftyone as fo
import fiftyone.zoo as foz
splits = ["train", "validation", "test"]
numSamples = 10000
seed = 42
# Get up to 10,000 images from fiftyone (the limit applies to each split).
# We'll ask FiftyOne to use images from the open-images-v6 dataset and
# store information about this download in the dataset named
# "open-images-critters".
# The data that's downloaded will include the images, annotations, and
# a summary of what's been downloaded. That summary will be stored in
# /Users/<username>/.FiftyOne in a mongoDB database. The images /
# annotations will be in /Users/<username>/FiftyOne.
if fo.dataset_exists("open-images-critters"):
    fo.delete_dataset("open-images-critters")
dataset = foz.load_zoo_dataset(
    "open-images-v6",
    splits=splits,
    label_types=["detections"],
    classes="Raccoon",
    max_samples=numSamples,
    seed=seed,
    shuffle=True,
    dataset_name="open-images-critters")
# Take a quick peek to see what's there
print(dataset)
# Do the same for cats, dogs, squirrels, and skunks, but after each
# download we'll merge the new downloaded dataset with the existing
# open-images-critters dataset so we can build up one large,
# multi-class set
if fo.dataset_exists("open-images-cats"):
    fo.delete_dataset("open-images-cats")
cats_dataset = foz.load_zoo_dataset(
    "open-images-v6",
    splits=splits,
    label_types=["detections"],
    classes="Cat",
    max_samples=numSamples,
    seed=seed,
    shuffle=True,
    dataset_name="open-images-cats")
# Now merge this new set with the existing open-images-critters set
dataset.merge_samples(cats_dataset)
if fo.dataset_exists("open-images-dogs"):
    fo.delete_dataset("open-images-dogs")
dogs_dataset = foz.load_zoo_dataset(
    "open-images-v6",
    splits=splits,
    label_types=["detections"],
    classes="Dog",
    max_samples=numSamples,
    seed=seed,
    shuffle=True,
    dataset_name="open-images-dogs")
dataset.merge_samples(dogs_dataset)
if fo.dataset_exists("open-images-squirrels"):
    fo.delete_dataset("open-images-squirrels")
squirrels_dataset = foz.load_zoo_dataset(
    "open-images-v6",
    splits=splits,
    label_types=["detections"],
    classes="Squirrel",
    max_samples=numSamples,
    seed=seed,
    shuffle=True,
    dataset_name="open-images-squirrels")
dataset.merge_samples(squirrels_dataset)
if fo.dataset_exists("open-images-skunks"):
    fo.delete_dataset("open-images-skunks")
skunks_dataset = foz.load_zoo_dataset(
    "open-images-v6",
    splits=splits,
    label_types=["detections"],
    classes="Skunk",
    max_samples=numSamples,
    seed=seed,
    shuffle=True,
    dataset_name="open-images-skunks")
dataset.merge_samples(skunks_dataset)
# For whenever you want to see what's been loaded.
print(fo.list_datasets())
# Uncomment the following line if you wish to explore the
# resulting datasets in the FiftyOne UI
# session = fo.launch_app(dataset, port=5151)
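The five downloads above differ only in the class name and the dataset name, so they could be driven from a list. The sketch below is one possible way to do that. The helper names `zoo_kwargs` and `build_critters_dataset` are hypothetical; the FiftyOne calls are the same ones used above.

```python
from typing import Any, Dict, List

CLASSES: List[str] = ["Raccoon", "Cat", "Dog", "Squirrel", "Skunk"]


def zoo_kwargs(class_name: str, dataset_name: str,
               num_samples: int = 10000, seed: int = 42) -> Dict[str, Any]:
    """Build the keyword arguments passed to foz.load_zoo_dataset above."""
    return {
        "splits": ["train", "validation", "test"],
        "label_types": ["detections"],
        "classes": class_name,
        "max_samples": num_samples,
        "seed": seed,
        "shuffle": True,
        "dataset_name": dataset_name,
    }


def build_critters_dataset(classes: List[str] = CLASSES):
    """Download each class and merge everything into open-images-critters."""
    # Imported here so zoo_kwargs can be used without FiftyOne installed
    import fiftyone as fo
    import fiftyone.zoo as foz

    merged = None
    for index, class_name in enumerate(classes):
        # The first class seeds the merged dataset, as in the code above
        name = ("open-images-critters" if index == 0
                else f"open-images-{class_name.lower()}s")
        if fo.dataset_exists(name):
            fo.delete_dataset(name)
        dataset = foz.load_zoo_dataset("open-images-v6",
                                       **zoo_kwargs(class_name, name))
        if merged is None:
            merged = dataset
        else:
            merged.merge_samples(dataset)
    return merged
```

Calling `build_critters_dataset()` in the notebook would replace the repeated blocks; adding another animal then only means adding its Open Images class name to `CLASSES`.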
The next step is to export this training data in the format the YOLOv5 trainer expects:
import fiftyone as fo
export_dir = "datasets/critters"
label_field = "detections" # for example
# The splits to export
splits = ["train", "validation","test"]
# All splits must use the same classes list
classes = ["Raccoon", "Cat", "Dog", "Squirrel", "Skunk"]
# The dataset or view to export
# We assume the dataset uses sample tags to encode the splits to export
dataset_or_view = fo.load_dataset("open-images-critters")
# Export the splits
for split in splits:
    split_view = dataset_or_view.match_tags(split)
    split_view.export(
        export_dir=export_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        label_field=label_field,
        split=split,
        classes=classes,
    )
This process creates a `datasets\critters\dataset.yaml` file. We need to tweak it slightly, renaming `validation` to `val`. Your file should look like this:
names:
- Raccoon
- Cat
- Dog
- Squirrel
- Skunk
nc: 5
path: c:\Dev\YoloV5_Training\datasets\critters
train: .\images\train\
test: .\images\test\
val: .\images\validation\
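`nc` is the number of classes, which is `5` (Raccoon, Cat, Dog, Squirrel, Skunk); `path` is the path to the images; and `train`, `test` and `val` are the folders containing the training, test and validation data used while training our model.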
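The rename can also be scripted. Here is a minimal sketch, assuming the exported file contains a top-level `validation:` key. The function name `rename_validation_key` is hypothetical, and plain string handling is used so that no YAML library is needed.

```python
def rename_validation_key(yaml_text: str) -> str:
    """Rename a top-level 'validation:' key to 'val:', leaving values alone."""
    lines = []
    for line in yaml_text.splitlines():
        if line.startswith("validation:"):
            line = "val:" + line[len("validation:"):]
        lines.append(line)
    return "\n".join(lines) + "\n"


sample = "nc: 5\ntrain: ./images/train/\nvalidation: ./images/validation/\n"
print(rename_validation_key(sample))
```

To apply it to the real file, read `datasets/critters/dataset.yaml` with `pathlib.Path.read_text()`, pass the text through the function, and write the result back with `write_text()`. Only the key is renamed; the folder it points to stays `validation`.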
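A Note on the Number of Images
Two ways to improve a model's accuracy are:
- Train for longer (more "epochs", or training iterations)
- Train with more data (more images)
You may want to adjust the number of images to suit your own setup. Resource usage can be considerable, and the more images you use, the longer training takes. With 50 epochs and 1,000 images, training took about 50 minutes on an NVIDIA 3060 GPU. With 25,000 images and 300 epochs, it took around 30 hours.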
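To compare the two timings above, it can help to convert them into a throughput figure. The sketch below is just arithmetic on the numbers quoted in this article. `image_epochs_per_minute` is a hypothetical helper, and the result is only a rough guide because real throughput depends on batch size, image size and hardware.

```python
def image_epochs_per_minute(images: int, epochs: int, minutes: float) -> float:
    """Images processed per minute, counting each epoch's pass separately."""
    return images * epochs / minutes


small = image_epochs_per_minute(images=1_000, epochs=50, minutes=50)
large = image_epochs_per_minute(images=25_000, epochs=300, minutes=30 * 60)

print(f"small run: {small:.0f} image-epochs/minute")
print(f"large run: {large:.0f} image-epochs/minute")
```

The two quoted runs do not scale linearly, so it is worth timing a short run on your own machine before extrapolating to a longer one.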
Training the Model
To start training the model from our Jupyter notebook, we run the `yolov5/train.py` Python module using the `!` syntax, which launches an external process.
!python yolov5/train.py --batch 24 --weights yolov5s.pt --data datasets/critters/dataset.yaml --project train/critters --name epochs300 --epochs 300
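We set the `batch` parameter to `24` simply to make sure we don't run out of memory. We have 16GB of system memory and 12GB of dedicated GPU memory. A batch size of 32 was fine for the smaller (1,000 image) dataset, but too high for the larger image set. You may need to experiment to find the best batch size for your machine.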
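Since the batch size and epoch count are the values most likely to change between experiments, one option is to assemble the training command in Python. This sketch uses only the `train.py` flags shown above. `build_train_command` is a hypothetical helper, and naming the run after the epoch count is an assumption made for convenience.

```python
import sys
from typing import List


def build_train_command(batch: int, epochs: int,
                        data: str = "datasets/critters/dataset.yaml",
                        project: str = "train/critters") -> List[str]:
    """Return the argument list for yolov5/train.py with a matching run name."""
    return [
        sys.executable, "yolov5/train.py",
        "--batch", str(batch),
        "--weights", "yolov5s.pt",
        "--data", data,
        "--project", project,
        "--name", f"epochs{epochs}",
        "--epochs", str(epochs),
    ]


command = build_train_command(batch=24, epochs=300)
print(" ".join(command))
# To launch it outside a notebook cell: subprocess.run(command, check=True)
```

If training runs out of GPU memory, lowering `batch` and rebuilding the command is a quick way to retry with the same settings otherwise.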
Interrupting and Resuming Training
You can stop training at any time and restart it using the `--resume` flag.
!python yolov5/train.py --resume train/critters/epochs300/weights/last.pt
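Using Our Model
Take the `critters.pt` file our training produced and place it in `C:\Program Files\CodeProject\AI\modules\ObjectDetectionYolo\custom-models`. CodeProject.AI Server will immediately be able to use this new model, with no changes or restart needed, via the route `vision/custom/critters`, **provided you are using the YOLOv5 6.2 module**. Each module has its own custom models location.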
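Copying the weights can be scripted too. This is a minimal sketch, assuming training left its best weights at `train/critters/epochs300/weights/best.pt` (alongside the `last.pt` used in the `--resume` example) and that the server is installed at the path above. `install_custom_model` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def install_custom_model(weights: Path, custom_models_dir: Path,
                         model_name: str = "critters") -> Path:
    """Copy a trained .pt file into a module's custom-models folder."""
    if not weights.is_file():
        raise FileNotFoundError(f"No weights file at {weights}")
    if not custom_models_dir.is_dir():
        raise NotADirectoryError(f"No custom-models folder at {custom_models_dir}")
    # The file name (minus .pt) becomes the model name in the route
    target = custom_models_dir / f"{model_name}.pt"
    shutil.copyfile(weights, target)
    return target


# Example call (both paths are assumptions; adjust them to your machine):
# install_custom_model(
#     Path("train/critters/epochs300/weights/best.pt"),
#     Path(r"C:\Program Files\CodeProject\AI\modules\ObjectDetectionYolo\custom-models"))
```

Writing under `C:\Program Files` usually requires an elevated prompt on Windows.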
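We can test this by opening the CodeProject.AI Server Explorer that is installed as part of CodeProject.AI. Select the **Vision** tab, choose an image of a raccoon next to the **Custom Detect** button, choose "critters" as the **Model**, and run the test.
Updating Our Wyze Cam Code to Use the New Model
We'll modify the code from "Detecting raccoons using CodeProject.AI Server, Part 1" by adding two things:
- We'll use our new model.
- We'll trigger an alert when a raccoon is detected.
Using the Model
Using the model is easy. We modify the `do_detection` method to use the new model by changing the following line: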
response = session.post(opts.endpoint("vision/detection"),
to
response = session.post(opts.endpoint("vision/custom/critters"),
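However, to set up an alert we need to know what to look for and whether it was found. We'll add a parameter that accepts a list of "intruders" to watch for, and return a comma-separated list of the intruders that were found.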
import io
from typing import List

import requests
from PIL import Image, ImageDraw, ImageFont

model_name = "critters"            # Model we'll use
intruders  = ["Raccoon", "Skunk"]  # Things we care about, spelled as in training

def do_detection(image: Image, intruders: List[str]) -> "(Image, str)":
    """
    Performs object detection on an image and returns an image with the objects
    that were detected outlined, as well as a de-duped list of objects detected.
    If no intruders are detected, image and list of objects are both returned
    as None
    """

    # Convert to format suitable for a POST
    buf = io.BytesIO()
    image.save(buf, format='JPEG')
    buf.seek(0)

    # Better to have a session object created once at the start and closed at
    # the end, but we keep the code simpler here for demo purposes
    with requests.Session() as session:
        response = session.post(opts.endpoint("vision/custom/" + model_name),
                                files={"image": ('image.jpg', buf, 'image/jpeg') },
                                data={"min_confidence": 0.5}).json()

    # Get the predictions (but be careful of a null or missing return)
    predictions = response.get("predictions")

    detected_list = []
    if predictions:
        # Draw each bounding box that was returned by the AI engine
        # font = ImageFont.load_default()
        font_size = 25
        padding   = 5
        font = ImageFont.truetype("arial.ttf", font_size)
        draw = ImageDraw.Draw(image)

        for prediction in predictions:
            label = prediction["label"]
            conf  = prediction["confidence"]
            y_max = int(prediction["y_max"])
            y_min = int(prediction["y_min"])
            x_max = int(prediction["x_max"])
            x_min = int(prediction["x_min"])

            draw.rectangle([(x_min, y_min), (x_max, y_max)], outline="red", width=5)
            draw.rectangle([(x_min, y_min - 2*padding - font_size),
                            (x_max, y_min)], fill="red", outline="red")
            draw.text((x_min + padding, y_min - padding - font_size),
                      f"{label} {round(conf*100.0,0)}%", font=font)

            # We're looking for specific objects. Build a deduped list
            # containing only the objects we're interested in.
            if label in intruders and label not in detected_list:
                detected_list.append(label)

    # All done. Did we find any objects we were interested in?
    if detected_list:
        return image, ', '.join(detected_list)

    return None, None
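The label matching inside `do_detection` can be pulled out into a small pure function, which makes it easy to test without a camera or a server. This is a sketch, assuming each entry in the response's `predictions` carries a `label` and a `confidence` as in the code above. `find_intruders` is a hypothetical helper, and the case-insensitive comparison is an added safeguard, since labels come back exactly as they were named at training time (for example `Raccoon`).

```python
from typing import Dict, Iterable, List, Optional


def find_intruders(predictions: Optional[Iterable[Dict]],
                   intruders: List[str]) -> List[str]:
    """Return the de-duplicated labels from predictions that are in intruders."""
    wanted = {name.lower() for name in intruders}
    found: List[str] = []
    for prediction in predictions or []:
        label = prediction["label"]
        if label.lower() in wanted and label not in found:
            found.append(label)
    return found


sample_predictions = [
    {"label": "Raccoon", "confidence": 0.91},
    {"label": "Cat", "confidence": 0.77},
    {"label": "Raccoon", "confidence": 0.64},
]
print(find_intruders(sample_predictions, ["raccoon", "skunk"]))  # ['Raccoon']
```

Passing `None` (a response with no predictions) simply returns an empty list, so the caller does not need a separate null check.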
Next, we'll modify the `main` method so that it triggers an alert when a raccoon is detected.
secs_between_checks = 5   # Min secs between sending a frame to CodeProject.AI
last_check_time = datetime(1999, 11, 15, 0, 0, 0)
recipient = "alerts@acme_security.com"  # Sucker who deals with reports

def main():
    # Open the RTSP stream
    vs = VideoStream(opts.rtsp_url).start()

    while True:
        # Grab a frame at a time
        frame = vs.read()
        if frame is None:
            continue

        objects_detected = ""

        # Let's not send an alert *every* time we see an object, otherwise we'll
        # get an endless stream of emails, fractions of a second apart
        global last_check_time
        seconds_since_last_check = (datetime.now() - last_check_time).total_seconds()

        if seconds_since_last_check >= secs_between_checks:
            # You may need to convert the colour space.
            # image: Image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            image: Image = Image.fromarray(frame)
            (image, objects_detected) = do_detection(image, intruders)

            # Replace the webcam feed's frame with our image that includes object
            # bounding boxes
            if image:
                frame = np.asarray(image)

            last_check_time = datetime.now()

        # Resize and display the frame on the screen
        if frame is not None:
            frame = imutils.resize(frame, width = 1200)
            cv2.imshow('WyzeCam', frame)

            if objects_detected:
                # Shrink the image to reduce email size
                frame = imutils.resize(frame, width = 600)
                image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                report_intruder(image, objects_detected, recipient)

        # Wait for the user to hit 'q' for quit
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break

    # Clean up and we're outta here.
    cv2.destroyAllWindows()
    vs.stop()
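Note that we don't send every frame to CodeProject.AI. That would use up a fair bit of processor time, and it isn't necessary. A Wyze cam runs at 15 frames per second, but for practical purposes we can check a frame every few seconds. Adjust as needed.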
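The same time-based gating appears twice in this article (once for checks, once for alerts), so it could be factored into a small helper. This is a sketch: the `Throttle` class is hypothetical, and it uses `time.monotonic`, which is unaffected by system clock changes, in place of `datetime.now()`.

```python
import time
from typing import Optional


class Throttle:
    """Allow an action at most once every min_interval seconds."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last: Optional[float] = None

    def ready(self, now: Optional[float] = None) -> bool:
        """Return True (and restart the timer) if enough time has passed."""
        now = time.monotonic() if now is None else now
        if self._last is None or now - self._last >= self.min_interval:
            self._last = now
            return True
        return False


check_gate = Throttle(5)          # plays the role of secs_between_checks
print(check_gate.ready(now=0.0))  # True: the first call always passes
print(check_gate.ready(now=3.0))  # False: only 3 seconds have passed
print(check_gate.ready(now=5.0))  # True: 5 seconds since the last pass
```

In the main loop, `if check_gate.ready():` would stand in for the `seconds_since_last_check` comparison, with no global variable to update.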
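The final piece of the puzzle is the `report_intruder` method. We'll write the list of detected intruders to the console, and also send an email to whoever needs to know. For email, we're using a Gmail account.
To enable this, use or create a Gmail account and use the Windows `setx` command to store your account's email address and password in environment variables. **This is not secure**, but it beats committing your password to a Git repo. Please use a test email account for this, not your actual email account.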
setx CPAI_EMAIL_DEMO_FROM "me@gmail.com"
setx CPAI_EMAIL_DEMO_PWD "password123"
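On the Python side, these variables can be read from `os.environ`. This is a sketch, assuming the variable names set above. `load_email_credentials` is a hypothetical helper, and how the demo itself populates `opts.email_acct` and `opts.email_pwd` is not shown here. Note that `setx` only affects processes started afterwards, so restart your terminal or VS Code after running it.

```python
import os
from typing import Mapping, Tuple


def load_email_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (account, password) from the environment, or raise if missing."""
    account = env.get("CPAI_EMAIL_DEMO_FROM", "")
    password = env.get("CPAI_EMAIL_DEMO_PWD", "")
    if not account or not password:
        raise RuntimeError(
            "Set CPAI_EMAIL_DEMO_FROM and CPAI_EMAIL_DEMO_PWD before running")
    return account, password


# Example: account, password = load_email_credentials()
```

Failing early with a clear message is easier to diagnose than a login error from the mail server later on.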
Our `report_intruder` method, and the `send_email` method it uses, are as follows:
import base64
import io
from datetime import datetime

last_alert_time = datetime(1999, 11, 15, 0, 0, 0)
secs_between_alerts = 300   # Min secs between sending alerts (don't spam!)

def report_intruder(image: Image, objects_detected: str, recipient: str) -> None:
    # time since we last sent an alert
    global last_alert_time
    seconds_since_last_alert = (datetime.now() - last_alert_time).total_seconds()

    # Only send an alert if there's been sufficient time since the last alert
    if seconds_since_last_alert > secs_between_alerts:
        # Simple console output
        timestamp = datetime.now().strftime("%d %b %Y %I:%M:%S %p")
        print(f"{timestamp} Intruder or intruders detected: {objects_detected}")

        # Send an email alert as well
        with io.BytesIO() as buffered:
            image.save(buffered, format="JPEG")
            img_dataB64_bytes : bytes = base64.b64encode(buffered.getvalue())
            img_dataB64 : str = img_dataB64_bytes.decode("ascii")

        message_html = "<p>An intruder was detected. Please review this image</p>" \
                     + f"<img src='data:image/jpeg;base64,{img_dataB64}'>"
        message_text = "An intruder was detected. We're all doomed!"

        send_email(opts.email_acct, opts.email_pwd, recipient, "Intruder Alert!",
                   message_text, message_html)

        # Could send an SMS or a tweet. Whatever takes your fancy...
        last_alert_time = datetime.now()
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def send_email(sender, pwd, recipient, subject, message_text, message_html):
    msg = MIMEMultipart('alternative')
    msg['From']    = sender
    msg['To']      = recipient
    msg['Subject'] = subject

    text = MIMEText(message_text, 'plain')
    html = MIMEText(message_html, 'html')
    msg.attach(text)
    msg.attach(html)

    try:
        # The context manager closes the connection even if sending fails,
        # and avoids touching `server` if the connection was never made
        with smtplib.SMTP(opts.email_server, opts.email_port) as server:
            server.ehlo()
            server.starttls()
            server.ehlo()
            server.login(sender, pwd)
            server.send_message(msg, sender, [recipient])
    except Exception as ex:
        print(f"Error sending email: {ex}")
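Some mail clients do not display images embedded as `data:` URIs, which is how the HTML body above carries the snapshot. One alternative is to attach the JPEG as a related part and reference it by Content-ID. The sketch below only builds the message and does not send it. `build_alert_message` and the `snapshot` Content-ID are hypothetical names, not part of the original demo.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from html import escape


def build_alert_message(sender: str, recipient: str, subject: str,
                        message_text: str, jpeg_bytes: bytes) -> MIMEMultipart:
    """Build an email with plain-text and HTML bodies plus an inline JPEG."""
    msg = MIMEMultipart("related")
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject

    # Plain-text and HTML versions of the same alert
    body = MIMEMultipart("alternative")
    body.attach(MIMEText(message_text, "plain"))
    body.attach(MIMEText(
        f"<p>{escape(message_text)}</p><img src='cid:snapshot'>", "html"))
    msg.attach(body)

    # The HTML refers to the image through its Content-ID
    image = MIMEImage(jpeg_bytes, _subtype="jpeg")
    image.add_header("Content-ID", "<snapshot>")
    image.add_header("Content-Disposition", "inline", filename="snapshot.jpg")
    msg.attach(image)
    return msg
```

The returned message can be passed to `server.send_message` in the same way as the `msg` built in `send_email`; the JPEG bytes would come from `buffered.getvalue()` in `report_intruder`.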
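Conclusion
We've walked through taking a stock Wyze camera and updating its firmware so we could access the RTSP stream for processing. We then used the Open Images dataset to create a custom YOLOv5 model for detecting critters. By adding this model to the `custom-models` folder of CodeProject.AI Server's YOLOv5 6.2 Object Detection module, we have our very own raccoon detector. With a little more Python, we can use this detector to regularly check our Wyze camera feed and send us an email when one of these little masked bandits comes into view.
The code is included in the CodeProject.AI Server source code (under `Demos/Python/ObjectDetect/racoon_detect.py`).
We wrote CodeProject.AI Server to take away the pain of setting up AI systems and projects. We deal with the runtimes, the packages, and getting all the pieces in place so we can skip straight to the fun parts, like detecting trash pandas.
Please download CodeProject.AI and give it a go. Add your own modules, integrate it with your apps, train some custom models, and use it to learn a little about artificial intelligence.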