How to process images in real time from the iOS camera
Learn how to process real-time camera feed images for computer vision features

With every passing day, more and more apps use the iPhone camera for more than just taking photos, recording videos or making video calls. Apps such as Snapchat process images in real time to preview filters. Facebook makes video calls more fun with filters too. Many apps use the camera to scan barcodes and QR codes. Some scan credit cards and other documents.
In this post we’ll learn how to use the iOS device camera in an app. Furthermore, we’ll learn how to receive images from the camera feed in real time so that we can process them. We won’t be processing the images themselves in this post.
For this post I assume you have intermediate knowledge of Swift and iOS development. You should be familiar with Grand Central Dispatch.
I have used Swift 5.2 and Xcode 11.4.1 for this article.
Getting started
The steps we’ll take in this post are:
- Create a new app
- Display the camera feed
- Process the camera feed
Let’s dive in!
1. Create a new app
Let’s start by opening Xcode. From the menu select File > New > Project…

When prompted “Choose a template for your new project:” search and select Single View App. Click Next.

When prompted “Choose options for your new project:” type ProcessingCameraFeed for “Product Name:”. Select Swift for “Language:”. Select Storyboard for “User Interface:”. Uncheck all check boxes. Click Next.

Save the project wherever desired then click Create.
2. Display the camera feed
In this section we’ll display the feed from the camera to the user.
To access the camera on iOS devices, Apple provides the AVFoundation framework. AVFoundation offers APIs for working with various audiovisual media: the microphone, wireless playback to AirPlay-connected devices and, amongst other things, the camera. In this post we’ll only be making use of the camera.
First let’s open the controller for our blank screen. Open ViewController.swift. Add the following line of code right below import UIKit:
import AVFoundation
This will allow us to make use of the AVFoundation framework.
Next let’s create an instance of AVCaptureSession. Add the following property to the ViewController class:
private let captureSession = AVCaptureSession()
To output the camera feed we need to use AVCaptureSession to coordinate the device’s capture inputs and the output destination.

Next let’s add the back camera as a capture input to our captureSession. Add the following function that will allow us to do so:
private func addCameraInput() {
    let device = AVCaptureDevice.default(for: .video)!
    let cameraInput = try! AVCaptureDeviceInput(device: device)
    self.captureSession.addInput(cameraInput)
}
The code above takes the default video capture device, which is the built-in back camera, wraps it in a capture input and finally adds that input to our session’s inputs. As we’ll only be making use of the camera, this will be our only input.
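If you want to be explicit about which camera you get, rather than relying on the default, AVFoundation also lets you request a specific device type and position. Below is a minimal alternative sketch of addCameraInput (not part of the original steps) that asks for the built-in wide-angle back camera; the force unwraps are kept for brevity to match the rest of this post:
private func addCameraInput() {
    // Explicitly request the built-in wide-angle camera on the back of the device.
    let device = AVCaptureDevice.default(.builtInWideAngleCamera,
                                         for: .video,
                                         position: .back)!
    let cameraInput = try! AVCaptureDeviceInput(device: device)
    self.captureSession.addInput(cameraInput)
}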
Now let’s call the addCameraInput function. Add the following line of code to the viewDidLoad function:
self.addCameraInput()

Why do we call this at viewDidLoad? We’ll cover that in a bit.
Next let’s present the camera feed on screen. Add the following property to the view controller, just below the captureSession property:
private lazy var previewLayer: AVCaptureVideoPreviewLayer = {
    let preview = AVCaptureVideoPreviewLayer(session: self.captureSession)
    preview.videoGravity = .resizeAspect
    return preview
}()
The property above will create and configure an instance of AVCaptureVideoPreviewLayer the first time it is accessed; it’s lazily loaded. Additionally we specify how to display the image from the camera feed: it will resize to fit the preview layer’s bounds whilst maintaining the aspect ratio of the image. For the other resizing options check AVLayerVideoGravity.
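For example, if you would rather have the preview fill the whole layer and crop the overflow instead of letterboxing, you could swap the gravity value; a one-line variation:
// Fill the layer's bounds, cropping the image if necessary, instead of letterboxing it.
preview.videoGravity = .resizeAspectFill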
Next let’s add the camera preview layer to the screen. Add the following function to do so:
private func addPreviewLayer() {
    self.view.layer.addSublayer(self.previewLayer)
}
Notice that the preview layer is added as a sublayer of an existing view. That is why the earliest we can call this function is after the view has loaded, that is, in the viewDidLoad function.
Now let’s call the addPreviewLayer function and start the capture session so it begins coordinating the camera input with the preview output. Add the following two lines of code to viewDidLoad:
self.addPreviewLayer()
self.captureSession.startRunning()
There’s one more thing to do before we can successfully see the camera preview: we must update the frame of the preview layer whenever the view controller’s container view frame changes. Add the following function to do so:
override func viewDidLayoutSubviews() {
    super.viewDidLayoutSubviews()
    self.previewLayer.frame = self.view.bounds
}
CALayers require their frame to be set manually; AVCaptureVideoPreviewLayer is a subclass of CALayer. In this case we are setting the preview layer to cover the whole screen.
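Relatedly, if your app supports more than one interface orientation, the preview image will not rotate by itself. One hedged way to extend the function above is to keep the preview connection’s orientation in sync; the sketch below simply hard-codes portrait for illustration:
override func viewDidLayoutSubviews() {
    super.viewDidLayoutSubviews()
    self.previewLayer.frame = self.view.bounds
    // Keep the preview upright; a real app would map the current interface
    // orientation to the matching AVCaptureVideoOrientation case.
    if let connection = self.previewLayer.connection, connection.isVideoOrientationSupported {
        connection.videoOrientation = .portrait
    }
}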

One last thing before we can run the app and show the camera preview: the app requires permission from the user to access the camera. The camera is considered privacy sensitive. Accessing any privacy-sensitive data on iOS requires the app to:
- declare that it uses that privacy-sensitive capability
- request the user’s permission at runtime
We won’t be handling the permission request at runtime ourselves; the system prompts the user automatically the first time the app tries to use the camera. We’ll assume the user authorises use of the camera. However, if the user denies permission, our app will crash because of the force unwraps in addCameraInput.
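For completeness, here is a minimal sketch of what checking and requesting camera permission at runtime could look like; the helper name is hypothetical and this is not required to follow the rest of the post:
private func requestCameraPermissionIfNeeded() {
    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        break // access already granted
    case .notDetermined:
        // The system shows the permission alert using the Info.plist description below.
        AVCaptureDevice.requestAccess(for: .video) { granted in
            print("camera access granted: \(granted)")
        }
    default:
        // .denied or .restricted: guide the user to Settings instead of crashing.
        print("camera access unavailable")
    }
}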
For the first part we have to declare that the app makes use of the camera. We declare this in the app’s Info.plist file: add the key NSCameraUsageDescription with a string value such as This message will be displayed to the user when requesting camera permission.
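As a quick sanity check, you can verify at runtime that the key made it into the built app; a small, purely illustrative snippet:
// Prints the camera usage description if the Info.plist key is present.
let description = Bundle.main.object(forInfoDictionaryKey: "NSCameraUsageDescription") as? String
print("camera usage description:", description ?? "missing - add NSCameraUsageDescription to Info.plist")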

Once NSCameraUsageDescription is entered, Xcode will replace the key with a more user-friendly description for display in the editor: Privacy — Camera Usage Description.
Run the app on a device and accept the camera permission. You should now see the camera feed!
It is important to note that the code used in this step assumes it will run on a real device. All iOS devices have back cameras. However, the app will crash if run on a simulator, as simulators have no cameras and aren’t capable of using a camera connected to the Mac either.
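If you would rather the app fail gracefully than crash during development (for example when accidentally running on the simulator), a hedged sketch of a safer variant of addCameraInput could look like this:
private func addCameraInputSafely() {
    // AVCaptureDevice.default(for:) returns nil on the simulator, where no camera exists.
    guard let device = AVCaptureDevice.default(for: .video),
        let cameraInput = try? AVCaptureDeviceInput(device: device),
        self.captureSession.canAddInput(cameraInput) else {
            debugPrint("unable to add a camera input to the capture session")
            return
    }
    self.captureSession.addInput(cameraInput)
}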
3. Process the camera feed
Next let’s get the camera feed images for processing. For such a task we’ll need to make use of AVCaptureVideoDataOutput. Add the following property to ViewController.swift:
private let videoOutput = AVCaptureVideoDataOutput()
We can tell videoOutput to send the camera feed images to a handler of our choice. Let’s configure the videoOutput and add it to the capture session. Add the following function to do so:
private func addVideoOutput() {
    self.videoOutput.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString): NSNumber(value: kCVPixelFormatType_32BGRA)] as [String: Any]
    self.videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "my.image.handling.queue"))
    self.captureSession.addOutput(self.videoOutput)
}
In the code above we first set the pixel format of the images we want to receive. In this case we specify that each pixel should be in 32-bit Blue-Green-Red-Alpha format (32BGRA). Next we tell videoOutput to send the camera feed images to our ViewController instance on a serial background queue. Finally we add videoOutput to the capture session, from which it will receive the camera feed and forward it to us.
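32BGRA is a convenient format for most image processing frameworks, but it is not the only one the output can deliver. If you are unsure what a device supports, you could list the available formats first; the helper below is just for illustration:
// Prints the pixel formats this video output can deliver (as FourCharCode values).
private func logAvailablePixelFormats() {
    for format in self.videoOutput.availableVideoPixelFormatTypes {
        print("available pixel format: \(format)")
    }
}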
You might have noticed that Xcode complains that we can’t set our ViewController instance as the handler of the camera feed output.

That is because in order to become a video output handler the class must conform to AVCaptureVideoDataOutputSampleBufferDelegate.
Change the ViewController class declaration to the following:
class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {
Now we’re ready to receive the camera feed images. To do so, add the following AVCaptureVideoDataOutputSampleBufferDelegate protocol function:
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    guard let frame = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        debugPrint("unable to get image from sample buffer")
        return
    }
    print("did receive image frame")
    // process image here
}
The code above extracts the image buffer from the sample buffer. A CMSampleBuffer can contain different types of audiovisual media, thus we first make sure that the sample actually contains an image.
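As a small taste of processing, you could inspect the frame’s dimensions or wrap it in a CIImage where the // process image here comment sits; a minimal sketch:
// Inside captureOutput(_:didOutput:from:), after unwrapping `frame`:
let width = CVPixelBufferGetWidth(frame)
let height = CVPixelBufferGetHeight(frame)
print("frame size: \(width)x\(height)")

// Wrap the pixel buffer for further processing, e.g. with Core Image.
let ciImage = CIImage(cvPixelBuffer: frame)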
Before we can receive the camera feed images we must add the video output to the capture session. Add the following line to viewDidLoad, before self.captureSession.startRunning():
self.addVideoOutput()
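For reference, after all the additions in this post, your viewDidLoad should look roughly like this:
override func viewDidLoad() {
    super.viewDidLoad()
    self.addCameraInput()
    self.addPreviewLayer()
    self.addVideoOutput()
    self.captureSession.startRunning()
}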

Run the app on a device and watch those frames come in on the console (View > Debug Area > Activate Console)! 🎉

Summary
In this post we learnt:
- how to access the iOS device camera
- how to display the camera feed
- how to receive the camera feed images for processing
Final Notes
You can find the full source code for this post in the link below:
In this post we only briefly covered requesting camera permission, with the assumption that the user will grant access to the camera. Managing permissions was out of scope for this post. To learn more about handling permissions, check out “Requesting Authorization for Media Capture on iOS” on Apple’s documentation site.
In this post we learnt how to set up processing of the camera feed. But what can we do with those images?
Since iOS 11, Apple has included the Vision framework in iOS. I previously posted on how to do face detection using this framework.
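As a pointer, a minimal sketch of feeding a received frame into Vision for face detection could look like the following; treat it as an illustration rather than a drop-in implementation (the helper name is mine):
import Vision

// Detects face bounding boxes in a single camera frame.
private func detectFaces(in frame: CVImageBuffer) {
    let request = VNDetectFaceRectanglesRequest { request, _ in
        let faces = request.results as? [VNFaceObservation] ?? []
        print("detected \(faces.count) face(s)")
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: frame, options: [:])
    try? handler.perform([request])
}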
Alternatively, there are third-party libraries that offer tools and functionality to process live camera images, such as OpenCV. I previously posted on using OpenCV to run simple lane detection on iOS.
I will continue to post ways to process live images that can enhance the end user experience of your app.
Stay tuned for more posts on iOS development! Follow me on Twitter or Medium!