A very cool custom video camera with AVFoundation

AVFoundation is a very cool framework that lets you capture multimedia data from different input sources (camera, microphone, etc.) and redirect it to any output destination (screen, speakers, etc.). You can create custom playback and capture solutions for audio, video, and still images. The advantage of this framework over off-the-shelf solutions such as MPMoviePlayerController or UIImagePickerController is that it gives you access to the raw camera data, so you can apply effects to the input signal in real time.

I have prepared a small app to show you how to use this framework and create a very cool video camera.


AVFoundation is built around the concept of a session. A session controls the flow of data from an input device to an output device. Creating a session is really straightforward:
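A minimal sketch in Swift (the class name is the same in Objective-C):

```swift
import AVFoundation

// The session coordinates the flow of data from
// capture inputs to capture outputs.
let session = AVCaptureSession()
```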

The session lets you define the audio and video recording quality through the sessionPreset property of the AVCaptureSession class. For this example, low-quality data is fine (so we save some battery cycles):
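For example, assuming the session created above:

```swift
import AVFoundation

let session = AVCaptureSession()

// Ask for a low-quality preset; checking first avoids an
// exception on devices that don't support it.
if session.canSetSessionPreset(.low) {
    session.sessionPreset = .low
}
```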

Capture Device

After the capture session has been created, you need to define the capture device you want to use. It can be the camera or the microphone. In this case, I am going to use the AVMediaTypeVideo media type, which supports both video and still images:
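In Swift, the AVMediaTypeVideo constant is spelled AVMediaType.video; a sketch:

```swift
import AVFoundation

// The default video device is the back camera on an iPhone.
// This returns nil in the simulator or on hardware with no camera.
let videoDevice = AVCaptureDevice.default(for: .video)
```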

Capture Device Input

Next, you need to define the input of the capture device and add it to the session. Here you go:
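A sketch of that step, combining the device lookup with the input setup:

```swift
import AVFoundation

let session = AVCaptureSession()

// Wrap the capture device in a device input and, if the
// session accepts it, attach it.
if let camera = AVCaptureDevice.default(for: .video),
   let input = try? AVCaptureDeviceInput(device: camera),
   session.canAddInput(input) {
    session.addInput(input)
}
```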

You check whether you can add the device input to the session and, if you can, you add it.


Before defining the device output, I want to show you how to preview the camera buffer. This will be the viewfinder of your camera, i.e. the preview of what the input device is seeing.

We can quickly render the raw data collected by the camera on the screen using the AVCaptureVideoPreviewLayer. We can create this preview layer using the session we defined above and then add it to our main view layer:
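A sketch, with a helper standing in for the view controller code (rootLayer would be view.layer in UIKit):

```swift
import AVFoundation
import QuartzCore

// Attach a viewfinder to an existing layer tree.
// `rootLayer` stands in for your view's layer.
func attachPreview(to rootLayer: CALayer, session: AVCaptureSession) {
    let previewLayer = AVCaptureVideoPreviewLayer(session: session)
    previewLayer.frame = rootLayer.bounds
    previewLayer.videoGravity = .resizeAspectFill
    rootLayer.addSublayer(previewLayer)
}
```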

You don’t need to do any additional work. You can now display the camera signal on your screen.

If you instead want to do some cooler stuff, for example processing the camera signal to create nice video effects with Core Image or the Accelerate framework (take a look at this post), you need to collect the raw data generated by the camera, process it, and, if you like, display it on the screen.

Go baby, go!!!

We are ready to go. The last thing you need to do is to start the session:
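A sketch of that last step; note that startRunning() is a blocking call:

```swift
import AVFoundation

let session = AVCaptureSession()
// ...inputs, outputs and preview configured as above...

// startRunning() blocks until capture starts, so call it
// off the main thread to keep the UI responsive.
DispatchQueue.global(qos: .userInitiated).async {
    session.startRunning()
}
```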

Cool stuff

Since AVCaptureVideoPreviewLayer is a layer, you can obviously add animations to it. I am attaching here a very simple Xcode project demonstrating the previous concepts: it creates a custom video camera whose preview rotates in 3D space.
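The rotation boils down to a Core Animation transform; here is a sketch, with a plain CALayer standing in for the preview layer so it stays self-contained:

```swift
import QuartzCore

// `previewLayer` stands in for the AVCaptureVideoPreviewLayer
// created earlier.
let previewLayer = CALayer()

// Spin the layer around its y axis, forever.
let spin = CABasicAnimation(keyPath: "transform.rotation.y")
spin.fromValue = 0
spin.toValue = CGFloat.pi * 2
spin.duration = 4
spin.repeatCount = .infinity
previewLayer.add(spin, forKey: "spin")
```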

Real-time processing

If you want to do some image processing on the raw data captured by the camera and display the result on the screen, you need to collect that data, process it, and render it on the screen without using AVCaptureVideoPreviewLayer. Depending on what you want to achieve, you have two main strategies:

  1. Either you capture a still picture as soon as you need one; or
  2. You continuously capture the video buffer

Now, the first approach is the simplest one: whenever you need to know what the camera is looking at, you just shoot a picture. If you want to process the video buffer instead, things get trickier, especially if your image processing algorithm is slower than the camera's output framerate. Here, you need to evaluate which solution is more suitable for your case. Take into account that different devices provide different image resolutions. For example, the iPhone 4S can provide images of up to 8 megapixels. That's a lot of data to process in real time, so if you are doing real-time image processing, you need to settle for lower-resolution images.
But all these considerations are a topic for a future post.
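The second strategy can be sketched with AVCaptureVideoDataOutput. FrameProcessor is a hypothetical name, and the actual effect is left as a placeholder:

```swift
import AVFoundation

// A hypothetical delegate that receives raw frames from the camera.
final class FrameProcessor: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Grab the pixel buffer and run your effect here.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        _ = pixelBuffer
    }
}

let session = AVCaptureSession()
let output = AVCaptureVideoDataOutput()
let processor = FrameProcessor()

// Drop frames the processor can't keep up with, so a slow
// algorithm doesn't queue up stale video.
output.alwaysDiscardsLateVideoFrames = true
output.setSampleBufferDelegate(processor, queue: DispatchQueue(label: "camera.frames"))
if session.canAddOutput(output) {
    session.addOutput(output)
}
```

Setting alwaysDiscardsLateVideoFrames addresses exactly the slow-algorithm scenario above: instead of buffering old frames, the framework throws them away and hands you the freshest one.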

Keep coding,



