Live Captions — Vonage Video macOS SDK

The OpenTok macOS SDK includes methods for macOS clients to publish and receive captions in an OpenTok session.

Live captioning must be enabled at the session level via the REST API.

Live captioning is only supported in routed sessions.

This topic covers publishing live captions, subscribing to live captions, and receiving your own live captions.

Publishing live captions

An OpenTok publisher can start or stop publishing real-time live captions by calling the otc_publisher_set_publish_captions() function. Pass OTC_TRUE to start publishing captions or OTC_FALSE to stop, as in this example:

otc_publisher_set_publish_captions(publisher, OTC_FALSE);

If the publisher does not include an audio track, the otc_publisher_callbacks.on_error() callback function is called with an error, with the error code set to OTC_PUBLISHER_MISSING_AUDIO_TRACK.

The Vonage macOS SDK does not support a publisher receiving events for its own captions. To render the speaker's own captions, create a hidden subscriber (to the publisher's stream) to listen for the caption events. (See the "Receiving your own live captions" section.)

Subscribing to live captions

A subscriber may start or stop receiving captions by calling the otc_subscriber_set_subscribe_to_captions() function:

otc_subscriber_set_subscribe_to_captions(subscriber, OTC_TRUE);

You can call this function regardless of whether the publisher of the stream is currently publishing live captions. The subscriber will start receiving captions data once the publisher begins publishing captions.

To stop receiving captions, pass OTC_FALSE into the function:

otc_subscriber_set_subscribe_to_captions(subscriber, OTC_FALSE);

Subscribers can verify whether they are actively subscribed to a stream's live captions using the otc_subscriber_get_subscribe_to_captions() function:

otc_bool isSubscribed = otc_subscriber_get_subscribe_to_captions(subscriber);

Subscribers receive captions via events. The OpenTok SDK does not display the text of the captions in the UI. Set the on_caption_text() member function of the otc_subscriber_callbacks instance to set up a listener for captions events:

static void on_caption_text(otc_subscriber *subscriber,
                            void *user_data,
                            const char *text,
                            otc_bool is_final) {
    // Display the text in the UI.
}

subscriber_callbacks.on_caption_text = on_caption_text;

The otc_stream_has_captions() function reports whether a stream has captions:

otc_bool hasCaptions = otc_stream_has_captions(stream);

Implement the on_stream_has_captions_changed() member function of the otc_session_callbacks instance to monitor when a stream has captions enabled and disabled:

static void on_stream_has_captions_changed(otc_session *session,
                                           void *user_data,
                                           otc_stream *stream,
                                           otc_bool has_captions) {
    // Adjust UI to indicate that captions are or are not available.
}

session_callbacks.on_stream_has_captions_changed = on_stream_has_captions_changed;
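
The callback reports a change for one stream at a time, so an app that shows a captions indicator per participant needs somewhere to keep the latest state. The following is a minimal sketch of such a table, keyed by stream ID. It is not part of the SDK: caption_registry, its functions, and both size limits are hypothetical names and values chosen for illustration. You would pass it the result of otc_stream_get_id() and the has_captions flag from the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TRACKED_STREAMS 16   /* arbitrary capacity for this sketch */
#define STREAM_ID_MAX 128        /* assumed bound, not an SDK constant */

typedef struct {
    char id[STREAM_ID_MAX];
    int has_captions;
    int in_use;
} caption_entry;

typedef struct {
    caption_entry entries[MAX_TRACKED_STREAMS];
} caption_registry;

/* Clears every slot. */
void caption_registry_init(caption_registry *reg) {
    assert(reg != NULL);
    memset(reg, 0, sizeof *reg);
}

/* Records the latest state for a stream. Returns 0 on success, or -1 if
   the ID is NULL, too long, or the table is full. */
int caption_registry_set(caption_registry *reg, const char *id, int has_captions) {
    assert(reg != NULL);
    if (id == NULL || strlen(id) >= STREAM_ID_MAX) return -1;

    caption_entry *free_slot = NULL;
    for (size_t i = 0; i < MAX_TRACKED_STREAMS; i++) {
        caption_entry *e = &reg->entries[i];
        if (e->in_use && strcmp(e->id, id) == 0) {
            e->has_captions = (has_captions != 0);   /* update existing entry */
            return 0;
        }
        if (!e->in_use && free_slot == NULL) free_slot = e;
    }
    if (free_slot == NULL) return -1;

    strcpy(free_slot->id, id);   /* safe: length checked above */
    free_slot->has_captions = (has_captions != 0);
    free_slot->in_use = 1;
    return 0;
}

/* Returns 1 if the stream is known to have captions, otherwise 0. */
int caption_registry_get(const caption_registry *reg, const char *id) {
    assert(reg != NULL);
    if (id == NULL) return 0;
    for (size_t i = 0; i < MAX_TRACKED_STREAMS; i++) {
        const caption_entry *e = &reg->entries[i];
        if (e->in_use && strcmp(e->id, id) == 0) return e->has_captions;
    }
    return 0;
}
```

A fixed-size array keeps the sketch free of dynamic allocation. A real app would also remove entries when streams are dropped, which is omitted here.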

Receiving your own live captions

The Vonage Video API does not support a publisher receiving events for its own captions. To render the speaker's own captions, create a hidden subscriber (to the local publisher's stream) to listen for caption events. Do not display this subscriber's view in the UI, and do not subscribe to its audio (to avoid echo). You can then add the captions text to the UI.

By default, a publisher does not publish captions, so call the otc_publisher_set_publish_captions() function, passing in OTC_TRUE:

    session_data->publisher = otc_publisher_new("opentok-macos-sdk-samples",
                                                NULL, /* Use WebRTC's video capturer. */
                                                &publisher_callbacks);
    otc_publisher_set_publish_captions(session_data->publisher, OTC_TRUE);

Subscribe to the stream in the publisher's on_publisher_stream_created() callback:

static void on_publisher_stream_created(otc_publisher *publisher,
                                        void *user_data,
                                        const otc_stream *stream) {
    NSLog(@"on_publisher_stream_created: streamId=%s", otc_stream_get_id(stream));
    subscriber_callbacks.on_caption_text = on_caption_text;
    // Create the hidden subscriber for the publisher's own stream.
    otc_subscriber *subscriber = otc_subscriber_new(stream, &subscriber_callbacks);

    if (otc_session_subscribe(session, subscriber) == OTC_SUCCESS) {
        session_data_local->subscriber = subscriber;
        session_data_local->sub_stream = stream;

        // Do not subscribe to audio, to avoid echo of your own voice.
        otc_subscriber_set_subscribe_to_audio(session_data_local->subscriber, OTC_FALSE);
        otc_subscriber_set_subscribe_to_video(session_data_local->subscriber, OTC_TRUE); // This is a workaround for an OpenTok bug.
        otc_subscriber_set_subscribe_to_captions(session_data_local->subscriber, OTC_TRUE);

        // Remember the stream ID so the hidden subscriber can be recognized later.
        self_publisher_stream_id = otc_stream_get_id(stream);
    }
}

As the comment in the code notes, the hidden subscriber must currently subscribe to video to receive captions. (This will be fixed in a future version.)

This example stores the publisher's stream ID in a self_publisher_stream_id variable, which is used here to prevent the hidden subscriber's view from being added to the UI:

static void on_subscriber_connected(otc_subscriber *subscriber,
                                    void *user_data,
                                    const otc_stream *stream) {
    NSLog(@"on_subscriber_connected: streamId=%s", otc_stream_get_id(stream));

    // Compare the IDs by content; == would only compare the pointers.
    if (strcmp(otc_stream_get_id(stream), self_publisher_stream_id) == 0) {
        return;
    }

    // Show other subscribers in the UI
}
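
Stream IDs are C strings, so the check above only works if the IDs are compared by content and the saved ID is still valid when the comparison runs. The sketch below shows one way to handle both, assuming (the SDK documentation quoted here does not say either way) that the pointer returned by otc_stream_get_id() may not outlive the stream object. The helpers copy_stream_id() and stream_ids_equal() are hypothetical names, not SDK functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copies id into dst so it remains valid after the callback returns.
   Returns 0 on success, or -1 if id is NULL or does not fit in dst. */
int copy_stream_id(char *dst, size_t dst_size, const char *id) {
    assert(dst != NULL && dst_size > 0);
    dst[0] = '\0';
    if (id == NULL) return -1;

    int n = snprintf(dst, dst_size, "%s", id);
    if (n < 0 || (size_t)n >= dst_size) {
        dst[0] = '\0';   /* reject truncated IDs rather than match a prefix */
        return -1;
    }
    return 0;
}

/* Returns 1 only when both IDs are non-empty and equal by content. */
int stream_ids_equal(const char *a, const char *b) {
    if (a == NULL || b == NULL || a[0] == '\0' || b[0] == '\0') return 0;
    return strcmp(a, b) == 0;
}
```

With these helpers, the stream-created callback would call copy_stream_id() on a buffer you own, and later callbacks would call stream_ids_equal() against that buffer. Treating an empty saved ID as "no match" means the check is safe to run before the publisher's stream exists.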

The hidden subscriber now receives the transcribed text of your own publisher's audio. You can capture it in the on_caption_text() callback, as shown below:

void on_caption_text(otc_subscriber *subscriber,
                     void *user_data,
                     const char *text,
                     otc_bool is_final) {
    // Adjust UI to show captions.
    otc_stream *stream = otc_subscriber_get_stream(subscriber);
    if (strcmp(otc_stream_get_id(stream), self_publisher_stream_id) == 0) {
        // text is the transcription of your own audio.
        NSLog(@"My own caption %s", text);
    } else {
        // Captions from other participants' streams.
    }
}
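
The callback above receives an is_final flag along with the text. The following sketch shows one way a UI layer might use it. It assumes that a non-final caption is an interim result that later text can replace, and that a final caption closes the current phrase. That interpretation is an assumption, not something stated in this topic. The caption_view type and its functions are hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CAPTION_MAX 256   /* arbitrary display limit for this sketch */

/* Hypothetical display state: one committed line plus one in-progress line. */
typedef struct {
    char committed[CAPTION_MAX];   /* last caption flagged as final */
    char pending[CAPTION_MAX];     /* latest non-final caption, may still change */
} caption_view;

void caption_view_init(caption_view *view) {
    assert(view != NULL);
    view->committed[0] = '\0';
    view->pending[0] = '\0';
}

/* Intended to be called from on_caption_text(); is_final is nonzero for a
   final caption. Text longer than CAPTION_MAX - 1 bytes is truncated. */
void caption_view_update(caption_view *view, const char *text, int is_final) {
    assert(view != NULL);
    if (text == NULL) return;

    if (is_final) {
        snprintf(view->committed, sizeof view->committed, "%s", text);
        view->pending[0] = '\0';   /* the interim text is now superseded */
    } else {
        snprintf(view->pending, sizeof view->pending, "%s", text);
    }
}
```

Keeping interim and final text in separate buffers lets the UI style them differently, for example by dimming text that may still change.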