Custom Camera Video Capturing Step 2: Capturing video frames

  1. Custom Camera Video Capturing Step 1: Initializing capture
  2. Custom Camera Video Capturing Step 2: Capturing frames

To see the code for this sample, switch to the video-capturer-camera branch of the learning-opentok-ios repo:

git checkout video-capturer-camera

This page shows the differences between this branch and the video-capturer-basic branch, on which it was built.

This branch shows you how to use a custom video capturer using the device camera as the video source.

This sample code uses the Apple AVFoundation framework to capture video from a camera and publish it to a connected session. The ViewController class creates a session, instantiates subscribers, and sets up the publisher. The captureOutput:didOutputSampleBuffer:fromConnection: method creates a frame, copies the captured camera image into it, tags the frame with a timestamp, and passes it to the consumer. The publisher obtains its video frames through the consumer.

Note that because this sample needs to access the device's camera, you must test it on an iOS device. You cannot test it in the iOS simulator.

Initializing and configuring the video capturer

The [OTKBasicVideoCapturer initWithPreset:andDesiredFrameRate:] method is the initializer for the OTKBasicVideoCapturer class. It calls the sizeFromAVCapturePreset: method to determine the image resolution for the given preset, and it stores the image size and the desired frame rate. It also creates a separate dispatch queue for capturing images, so that capture work does not block the main (UI) queue.

- (id)initWithPreset:(NSString *)preset andDesiredFrameRate:(NSUInteger)frameRate
{
    self = [super init];
    if (self) {
        self.sessionPreset = preset;
        CGSize imageSize = [self sizeFromAVCapturePreset:self.sessionPreset];
        _imageHeight = imageSize.height;
        _imageWidth = imageSize.width;
        _desiredFrameRate = frameRate;

        // Frames are delivered on this serial queue, not on the main queue
        _captureQueue = dispatch_queue_create("com.tokbox.OTKBasicVideoCapturer",
                                              DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

The sizeFromAVCapturePreset: method takes an AVFoundation session preset string (such as AVCaptureSessionPreset640x480) and returns the corresponding image resolution as a CGSize.
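The mapping can be pictured as a small lookup table. The following is a minimal sketch in plain C, not the sample's implementation: the capture_size type, the kPresetSizes table, and the size_from_preset function are hypothetical names, and the string keys simply reuse the names of the AVFoundation preset constants (the real constants are NSString objects).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Width and height in pixels; a plain-C stand-in for CGSize (hypothetical). */
typedef struct {
    int width;
    int height;
} capture_size;

/* Hypothetical lookup table: preset name -> resolution. Only presets whose
   names state the resolution are listed. */
static const struct {
    const char *preset;
    capture_size size;
} kPresetSizes[] = {
    { "AVCaptureSessionPreset352x288",   {  352,  288 } },
    { "AVCaptureSessionPreset640x480",   {  640,  480 } },
    { "AVCaptureSessionPreset1280x720",  { 1280,  720 } },
    { "AVCaptureSessionPreset1920x1080", { 1920, 1080 } },
};

/* Returns the size for a known preset name, or {0, 0} if it is not listed. */
capture_size size_from_preset(const char *preset)
{
    capture_size unknown = { 0, 0 };
    assert(preset != NULL);
    for (size_t i = 0; i < sizeof kPresetSizes / sizeof kPresetSizes[0]; i++) {
        if (strcmp(preset, kPresetSizes[i].preset) == 0) {
            return kPresetSizes[i].size;
        }
    }
    return unknown;
}
```

Returning a zero size for an unknown preset lets the caller detect the case and fall back to a default resolution.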

The implementation of the [OTVideoCapture initCapture] method uses the AVFoundation framework to configure the camera for capturing images. The first part of the method sets up an AVCaptureVideoDataOutput instance, which produces the image frames:

- (void)initCapture
{
    NSError *error;
    self.captureSession = [[AVCaptureSession alloc] init];

    [self.captureSession beginConfiguration];

    // Set device capture
    self.captureSession.sessionPreset = self.sessionPreset;
    AVCaptureDevice *videoDevice =
      [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    self.inputDevice =
      [AVCaptureDeviceInput deviceInputWithDevice:videoDevice error:&error];
    [self.captureSession addInput:self.inputDevice];

    AVCaptureVideoDataOutput *outputDevice = [[AVCaptureVideoDataOutput alloc] init];
    outputDevice.alwaysDiscardsLateVideoFrames = YES;
    // Bi-planar 4:2:0 (NV12), matching the OTVideoFormat set later in this method
    outputDevice.videoSettings =
      @{(NSString *)kCVPixelBufferPixelFormatTypeKey:
          @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)};

    // Deliver frames to this object on the capture queue
    [outputDevice setSampleBufferDelegate:self queue:self.captureQueue];

    [self.captureSession addOutput:outputDevice];

    // See the next section ...
}

The frames captured with this method are accessed with the [AVCaptureVideoDataOutputSampleBufferDelegate captureOutput:didOutputSampleBuffer:fromConnection:] delegate method. The AVCaptureDevice object represents the camera and its properties. It provides captured images to an AVCaptureSession object.

The second part of the initCapture method calls the bestFrameRateForDevice method to obtain the best frame rate for image capture:

- (void)initCapture
{
    // See previous section ...

    // Set framerate
    double bestFrameRate = [self bestFrameRateForDevice];

    CMTime desiredMinFrameDuration = CMTimeMake(1, bestFrameRate);
    CMTime desiredMaxFrameDuration = CMTimeMake(1, bestFrameRate);

    // The device must be locked while its frame durations are changed
    if ([self.inputDevice.device lockForConfiguration:&error]) {
        self.inputDevice.device.activeVideoMaxFrameDuration = desiredMaxFrameDuration;
        self.inputDevice.device.activeVideoMinFrameDuration = desiredMinFrameDuration;
        [self.inputDevice.device unlockForConfiguration];
    }

    [self.captureSession commitConfiguration];

    self.format = [OTVideoFormat videoFormatNV12WithWidth:self.imageWidth
                                                   height:self.imageHeight];
}

The [self bestFrameRateForDevice] method returns the best frame rate for the capturing device:

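- (double)bestFrameRateForDevice
{
    double bestFrameRate = 0;
    for (AVFrameRateRange* range in
         self.inputDevice.device.activeFormat.videoSupportedFrameRateRanges)
    {
        // The shortest frame duration of a range corresponds to its highest frame rate
        CMTime currentDuration = range.minFrameDuration;
        // Cast to double to avoid integer division
        double currentFrameRate = (double)currentDuration.timescale / currentDuration.value;
        if (currentFrameRate > bestFrameRate && currentFrameRate < self.desiredFrameRate) {
            bestFrameRate = currentFrameRate;
        }
    }
    return bestFrameRate;
}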

The AVFoundation framework expresses the capture frame rate as a range, bounded by a minimum and a maximum frame duration. The sample sets this range through the activeVideoMinFrameDuration and activeVideoMaxFrameDuration properties of the capture device. For simplicity, both are derived from the same bestFrameRate value, but you may want to set different minimum and maximum frame rates, for example to adapt the image quality to the speed of your network. In this application, the frame rate and resolution are fixed.
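The selection logic can be tried out away from the device. The following is a minimal sketch in plain C, assuming a hypothetical frame_duration struct that mirrors the value and timescale fields of CMTime; frames_per_second and best_frame_rate are illustrative names, not part of AVFoundation or the OpenTok SDK. Unlike the sample method, this sketch accepts a rate equal to the desired rate (it compares with <= rather than <), which is a deliberate assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rational duration in seconds, value / timescale (hypothetical stand-in
   for the two CMTime fields used by the sample). */
typedef struct {
    int64_t value;
    int32_t timescale;
} frame_duration;

/* Converts a frame duration to frames per second. Both operands are cast
   to double so that rates such as 30000/1001 are not truncated. */
double frames_per_second(frame_duration d)
{
    assert(d.value > 0);
    return (double)d.timescale / (double)d.value;
}

/* Returns the highest rate among the supported minimum frame durations that
   does not exceed the desired rate, or 0 if no candidate qualifies. */
double best_frame_rate(const frame_duration *min_durations, size_t count,
                       double desired)
{
    double best = 0.0;
    for (size_t i = 0; i < count; i++) {
        double rate = frames_per_second(min_durations[i]);
        if (rate > best && rate <= desired) {
            best = rate;
        }
    }
    return best;
}
```

A return value of 0 means that no supported rate fits, so a caller would need to handle that case before building a frame duration from it.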

The [OTVideoCapture setVideoCaptureConsumer:] method sets the video capture consumer, defined by the OTVideoCaptureConsumer protocol:

- (void)setVideoCaptureConsumer:(id<OTVideoCaptureConsumer>)videoCaptureConsumer
{
    self.consumer = videoCaptureConsumer;
}

The [OTVideoCapture captureSettings] method sets the pixel format and size of the image used by the video capturer, by setting properties of the OTVideoFormat object.
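To make the relation between image size and pixel format concrete, the following minimal sketch in plain C computes how many bytes an NV12 image occupies. It assumes tightly packed rows (no padding) and even dimensions; the nv12_layout type and nv12_layout_for function are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Byte sizes of the two planes of an NV12 image (hypothetical helper type). */
typedef struct {
    size_t y_size;   /* luma (Y) plane: one byte per pixel */
    size_t uv_size;  /* chroma plane: interleaved Cb/Cr, subsampled 2x2 */
    size_t total;    /* both planes together */
} nv12_layout;

/* Computes plane sizes for a tightly packed NV12 image. */
nv12_layout nv12_layout_for(size_t width, size_t height)
{
    nv12_layout layout;
    assert(width % 2 == 0 && height % 2 == 0);
    layout.y_size = width * height;
    /* Each chroma row holds width/2 Cb/Cr pairs (width bytes), and there
       are height/2 chroma rows. */
    layout.uv_size = width * (height / 2);
    layout.total = layout.y_size + layout.uv_size;
    return layout;
}
```

In practice a pixel buffer may pad each row, so the bytes-per-row value reported for each plane should be preferred over the image width when copying data.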

The [OTVideoCapture currentDeviceOrientation] method queries the orientation of the image in the AVFoundation framework and returns its equivalent, as defined by the OTVideoOrientation enum in the OpenTok iOS SDK.

Capturing frames for the publisher's video

The [OTVideoCapture startCapture] method is called when the publisher starts capturing video to publish. The implementation calls the [AVCaptureSession startRunning] method to start the capture session:

- (int32_t)startCapture
{
    self.captureStarted = YES;
    [self.captureSession startRunning];

    return 0;
}

The [AVCaptureVideoDataOutputSampleBufferDelegate captureOutput:didOutputSampleBuffer:fromConnection:] delegate method is called when a new video frame is available from the camera.

- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    if (!self.captureStarted)
        return;

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    OTVideoFrame *frame = [[OTVideoFrame alloc] initWithFormat:self.format];

    NSUInteger planeCount = CVPixelBufferGetPlaneCount(imageBuffer);

    uint8_t *buffer = malloc(sizeof(uint8_t) * CVPixelBufferGetDataSize(imageBuffer));
    uint8_t *dst = buffer;
    uint8_t *planes[planeCount];

    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    // Copy each plane into one contiguous buffer and record where it starts
    for (int i = 0; i < planeCount; i++) {
        size_t planeSize = CVPixelBufferGetBytesPerRowOfPlane(imageBuffer, i)
          * CVPixelBufferGetHeightOfPlane(imageBuffer, i);

        planes[i] = dst;
        dst += planeSize;

        memcpy(planes[i],
               CVPixelBufferGetBaseAddressOfPlane(imageBuffer, i),
               planeSize);
    }

    CMTime minFrameDuration = self.inputDevice.device.activeVideoMinFrameDuration;
    frame.format.estimatedFramesPerSecond =
      (double)minFrameDuration.timescale / minFrameDuration.value;
    frame.format.estimatedCaptureDelay = 100;
    frame.orientation = [self currentDeviceOrientation];

    CMTime time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
    frame.timestamp = time;
    [frame setPlanesWithPointers:planes numPlanes:planeCount];

    [self.consumer consumeFrame:frame];

    // Release the temporary copy once the consumer has been handed the frame
    free(buffer);
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
}

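This method does the following: it returns immediately if capture has not started; it creates an OTVideoFrame instance using the video format of the capturer; it allocates a single memory buffer and copies each plane of the camera image into it (an NV12 image has two planes, one for Y data and one for interleaved UV data); it sets the estimated frame rate, capture delay, orientation, and timestamp of the frame; and it passes the frame to the consumer by calling the [OTVideoCaptureConsumer consumeFrame:] method.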
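The plane-copying step can be isolated from Core Video for experimentation. The following is a minimal sketch in plain C, assuming a hypothetical source_plane struct that carries the base address, bytes per row, and height that the sample reads from the pixel buffer; pack_planes is an illustrative name, not an SDK function. It adds a capacity check that the sample code does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Description of one source plane (hypothetical stand-in for the values
   queried from the pixel buffer). */
typedef struct {
    const uint8_t *base;   /* start of the plane data */
    size_t bytes_per_row;  /* row stride, including any padding */
    size_t height;         /* number of rows */
} source_plane;

/* Copies the planes back to back into dst and stores the start of each copy
   in out_planes. Returns the number of bytes written, or 0 if dst is too
   small to hold all planes. */
size_t pack_planes(const source_plane *src, size_t plane_count,
                   uint8_t *dst, size_t dst_capacity, uint8_t **out_planes)
{
    size_t offset = 0;
    assert(src != NULL && dst != NULL && out_planes != NULL);
    for (size_t i = 0; i < plane_count; i++) {
        size_t plane_size = src[i].bytes_per_row * src[i].height;
        if (plane_size > dst_capacity - offset) {
            return 0;  /* would overflow the destination buffer */
        }
        out_planes[i] = dst + offset;
        memcpy(out_planes[i], src[i].base, plane_size);
        offset += plane_size;
    }
    return offset;
}
```

Because the row stride is copied as is, any row padding in the source is preserved in the packed buffer, which matches how the sample computes each plane size.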

The [AVCaptureVideoDataOutputSampleBufferDelegate captureOutput:didDropSampleBuffer:fromConnection:] delegate method is called whenever a frame is dropped, for example because it arrived late. Dropping late frames lets the app keep publishing to the session without interruption. The implementation in this sample only logs the event:

- (void)captureOutput:(AVCaptureOutput *)captureOutput
  didDropSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    NSLog(@"Frame dropped");
}

Other notes on the app

The OTVideoCapture protocol includes other required methods, which are implemented by the OTKBasicVideoCapturer class. However, this sample does not do anything interesting in these methods, so they are not included in this discussion.
