iOS Object Detection with Live Camera Preview

Jarek Szczegielniak


November 27, 2020

CPOL


In the next article, we will start developing an iOS application that uses the model.

Download iOS preview - 64.1 KB

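Introduction

This series assumes that you are familiar with Python, Conda, and ONNX, and that you have some experience developing iOS applications in Xcode. You are welcome to download the source code for this project. We will run the code using macOS 10.15+, Xcode 11.7+, and iOS 13+.

Handling a live camera feed in an iOS application can be a bit overwhelming. We will keep things as simple as possible, favoring code readability over performance. In addition, to reduce the number of scaling options we need to consider, we will use a fixed portrait orientation.

The code for this article was initially inspired by this application.

This demo application was written in Xcode 11.7 and should work on any iPhone 7 or newer running iOS 13 or later.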

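Application Layout

From the storyboard perspective, our application is very simple. It contains a single View Controller with a single Preview View control. We will use this view to display the live camera feed.

Capturing the Camera Feed

All the code responsible for handling the camera input and the video preview lives in the Controllers/VideoCapture class, which implements AVCaptureVideoDataOutputSampleBufferDelegate.

The following members store its configuration: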

private let captureSession = AVCaptureSession()
private var videoPreviewLayer: AVCaptureVideoPreviewLayer! = nil
private let videoDataOutput = AVCaptureVideoDataOutput()
private let videoDataOutputQueue = DispatchQueue(label: "VideoDataOutput", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
private var videoFrameSize: CGSize = .zero

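The setupPreview method, called by the class initializer, ties all these elements together.

First, it obtains the first available back camera as the input device: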

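var deviceInput: AVCaptureDeviceInput!

let videoDevice = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: .video, position: .back).devices.first
do {
    // Assumed completion of the truncated listing: wrap the camera in a capture input
    deviceInput = try AVCaptureDeviceInput(device: videoDevice!)
} catch {
    print("Could not create video device input: \(error)")
    return
}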

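Next, it starts the configuration process, forcing the camera to output 640 x 480 frames. This resolution is more than enough for our YOLO v2 model, because the model works on images scaled to 416 x 416 pixels. Note that, because of the fixed portrait orientation, we store 480 x 640 as the input dimensions in the videoFrameSize variable for future use: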

captureSession.beginConfiguration()
captureSession.sessionPreset = .vga640x480
self.videoFrameSize = CGSize(width: 480, height: 640)
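
To make these numbers concrete, here is a minimal sketch of the arithmetic involved. The helper names portraitSize and scaleFactors are hypothetical and not part of the article's project, and the sketch assumes the frame is simply stretched onto the 416 x 416 model input; the article does not say which scaling mode is actually used.

```swift
import Foundation

/// Swaps width and height, which is the effect of rotating a landscape frame to portrait.
func portraitSize(ofLandscape size: CGSize) -> CGSize {
    return CGSize(width: size.height, height: size.width)
}

/// Per-axis scale factors for stretching a frame onto a square model input.
/// Assumption: a plain stretch that ignores the aspect ratio.
func scaleFactors(from frame: CGSize, toSquareSide side: CGFloat) -> (x: CGFloat, y: CGFloat) {
    return (x: side / frame.width, y: side / frame.height)
}

let sensorFrame = CGSize(width: 640, height: 480)      // what the .vga640x480 preset delivers
let portraitFrame = portraitSize(ofLandscape: sensorFrame)
let factors = scaleFactors(from: portraitFrame, toSquareSide: 416)

print("Portrait frame: \(portraitFrame.width) x \(portraitFrame.height)")
print("Scale factors: x = \(factors.x), y = \(factors.y)")
```

Both factors are below 1, so under this assumption the model always sees a downscaled image, which is why a higher capture resolution would add processing cost without adding detail.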

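The configuration continues by establishing a single-element queue for the frames to be processed (the alwaysDiscardsLateVideoFrames flag). This means that frames arriving while the current frame is still being processed are discarded.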

captureSession.addInput(deviceInput)
if captureSession.canAddOutput(videoDataOutput) {
    captureSession.addOutput(videoDataOutput)
    videoDataOutput.alwaysDiscardsLateVideoFrames = true
    videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)]
    videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
} else {
    print("Could not add video data output to the session")
    captureSession.commitConfiguration()
    return
}

let captureConnection = videoDataOutput.connection(with: .video)
captureConnection?.isEnabled = true
captureConnection?.videoOrientation = .portrait
captureSession.commitConfiguration()
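
The effect of discarding late frames can be pictured with a small simulation. This is a simplified model written for illustration, not how AVFoundation is implemented: the processedFrames function is hypothetical, and the 30 fps frame rate and 80 ms processing time are assumed values.

```swift
/// Returns the indices of the frames that get processed when the consumer handles
/// one frame at a time and every frame arriving while it is busy is discarded.
func processedFrames(arrivalTimes: [Double], processingTime: Double) -> [Int] {
    var busyUntil = -Double.infinity
    var processed: [Int] = []
    for (index, time) in arrivalTimes.enumerated() {
        if time >= busyUntil {
            // The consumer is idle: take this frame and stay busy for processingTime.
            processed.append(index)
            busyUntil = time + processingTime
        }
        // Otherwise the frame is late and is dropped.
    }
    return processed
}

// Ten frames from an assumed 30 fps camera, with an assumed 80 ms of work per frame.
let arrivals = (0..<10).map { Double($0) / 30.0 }
let kept = processedFrames(arrivalTimes: arrivals, processingTime: 0.08)
print(kept)  // prints [0, 3, 6, 9]
```

In this model only every third frame is processed, but the preview never falls behind the camera, which is the trade-off the flag makes.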

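The fixed portrait orientation will make it easier to process and draw the object detection predictions.

Camera Feed Preview

In the next step, the setupPreview method creates a videoPreviewLayer instance and adds it as a sublayer of our application's view (viewLayer below):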

self.videoPreviewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
self.videoPreviewLayer.videoGravity = .resizeAspectFill
        
videoPreviewLayer.frame = viewLayer.bounds
viewLayer.addSublayer(videoPreviewLayer)

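Using the .resizeAspectFill value for videoGravity ensures that the video fills the entire available screen. Because no iPhone has a screen with a 1.33:1 aspect ratio (the ratio implied by the 640 x 480 resolution), each frame is cropped on both sides in the portrait view. If we used .resizeAspect instead, the entire frame would be visible, but with blank bars above and below it.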
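
The difference between the two gravity modes comes down to which scale factor is chosen. The following sketch reproduces that calculation; scaledVideoSize is a hypothetical helper, and the 375 x 667 point screen (a 4.7-inch iPhone such as the iPhone 7) is an assumed example.

```swift
import Foundation

/// Size of the video after scaling it to cover the view (fill == true) or to fit
/// inside the view (fill == false), keeping the aspect ratio in both cases.
func scaledVideoSize(video: CGSize, view: CGSize, fill: Bool) -> CGSize {
    let scaleX = view.width / video.width
    let scaleY = view.height / video.height
    // Covering the view needs the larger factor; fitting inside it needs the smaller one.
    let scale = fill ? max(scaleX, scaleY) : min(scaleX, scaleY)
    return CGSize(width: video.width * scale, height: video.height * scale)
}

let video = CGSize(width: 480, height: 640)   // portrait frame size used in the article
let screen = CGSize(width: 375, height: 667)  // assumed example screen, in points

let filled = scaledVideoSize(video: video, view: screen, fill: true)
let fitted = scaledVideoSize(video: video, view: screen, fill: false)

// Aspect fill: the scaled video is wider than the screen, so both sides are cropped.
let croppedPerSide = (filled.width - screen.width) / 2
// Aspect fit: the scaled video is shorter than the screen, so bars appear above and below.
let barHeight = (screen.height - fitted.height) / 2

print("Fill crops about \(croppedPerSide) points per side")
print("Fit leaves bars of about \(barHeight) points")
```

On this example screen, aspect fill hides roughly 63 points of video on each side, while aspect fit would leave bars of about 84 points at the top and bottom.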

Completing the Camera Preview Configuration

We need to add three methods to the VideoCapture class:

public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // We will handle frame(s) here
}

public func captureOutput(_ captureOutput: AVCaptureOutput, didDrop didDropSampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // Dropped frame(s) can be handled here
}

public func startCapture() {
    if !captureSession.isRunning {
        captureSession.startRunning()
    }
}

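The first two methods are the callbacks defined in AVCaptureVideoDataOutputSampleBufferDelegate, the protocol our VideoCapture class implements. The protocol declares both as optional, but we will need them to receive frames and to notice dropped ones. For now, empty implementations are fine. We need the last method, startCapture, to start processing the video feed.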
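
Before adding any real processing, it can be useful to confirm that frames are actually arriving. The sketch below is one way to do that, not the article's implementation: FrameLogger is a hypothetical stand-alone delegate that prints the dimensions of each frame and counts dropped ones. It is wrapped in a canImport check because AVFoundation is only available on Apple platforms.

```swift
#if canImport(AVFoundation)
import AVFoundation

/// Hypothetical diagnostic delegate: logs frame sizes and counts dropped frames.
final class FrameLogger: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private(set) var droppedFrames = 0

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        // A sample buffer from a video data output carries its image as a pixel buffer.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)
        print("Frame: \(width) x \(height)")
    }

    func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        // Called when a frame is discarded, for example because it arrived late.
        droppedFrames += 1
    }
}
#endif
```

With the portrait orientation set on the capture connection, the logged size would be expected to match the 480 x 640 stored in videoFrameSize.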

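With the VideoCapture implementation complete, cameraView linked in Main.storyboard, and an instance variable in place to hold the VideoCapture instance, we create a new VideoCapture instance in the viewDidLoad method of the main ViewController: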

self.videoCapture = VideoCapture(self.cameraView.layer)
self.videoCapture.startCapture()

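Conclusion

We now have a simple iOS application configured to capture and preview the live camera stream. In the next (and last) article in this series, we will extend the application to use our YOLO v2 model for object detection.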