Learn the basics of what the Vonage Video API (formerly TokBox OpenTok) is and how it works.
Looking for code? Understanding how the Vonage Video API platform works is extremely helpful for building Vonage Video API applications, but if you'd rather dive into some code and start building before learning the core concepts, check out our Tutorials.
The Vonage Video API platform makes it easy to embed real-time, high-quality interactive video, messaging, screen-sharing, and more into web and mobile apps. The platform includes client libraries for web, iOS, Android, Windows, macOS, Linux, and React Native, as well as server-side SDKs and a REST API. Vonage Video API uses WebRTC for audio-video communications.
All applications built with the Vonage Video API platform require two primary components:
Client:
App Server:
Session:
Every Vonage Video API video chat occurs within a session. You can think of a session as a “room” where clients can interact with one another in real time. Sessions are hosted on the Vonage Video API cloud and manage user connections, audio-video streams, and user events (such as a new user joining). Each session is associated with a unique session ID. To allow multiple clients to chat with one another, have them connect to the same session by using the same session ID.
The app server (which executes the server-side code) is responsible for creating new sessions and generating unique authentication tokens for each new client. The client then uses that token to establish a connection to the session. Clients connected to a session can publish streams to the session and subscribe to other clients’ streams. For more info on how clients connect to a session, see the next section.
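The division of labor described above can be sketched with a small in-memory model. This is not the Vonage Video API server SDK: the ToyAppServer class, its method names, and the ID and token formats are all hypothetical, chosen only to show that the server creates sessions, issues one unique token per client, and that a token is only good for the session it was issued for.

```python
import secrets


class ToyAppServer:
    """In-memory stand-in for an app server (hypothetical, not the real SDK)."""

    def __init__(self):
        # Maps each session ID to the set of tokens issued for that session.
        self.sessions = {}

    def create_session(self):
        session_id = secrets.token_hex(8)  # hypothetical ID format
        self.sessions[session_id] = set()
        return session_id

    def generate_token(self, session_id):
        if session_id not in self.sessions:
            raise KeyError("unknown session")
        token = secrets.token_hex(16)  # one unique token per client
        self.sessions[session_id].add(token)
        return token

    def can_connect(self, session_id, token):
        # A client may connect only with a token issued for that session.
        return token in self.sessions.get(session_id, set())


server = ToyAppServer()
room = server.create_session()
other_room = server.create_session()

# Two clients join the same "room" by sharing the session ID,
# but each one receives its own token.
alice_token = server.generate_token(room)
bob_token = server.generate_token(room)
```

In this sketch, server.can_connect(room, alice_token) is true, while the same token is rejected for other_room.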
Debugging Sessions with Inspector
If you want to view information about a specific session, you can use Inspector. This tool provides a complete breakdown of the session, including users, quality, events, errors, and more. This is useful for debugging and getting a better understanding of what's happening in your app's sessions.
This section walks through the steps required for two clients to start a video chat in a session.
Your app server, using code from a Vonage Video API server SDK, creates a session in the cloud via the Vonage Video REST API and receives the session ID. Think of the session as a “room” where the video chat will occur. At this point it is unoccupied.
When a user loads your client-side application, built with a Vonage Video API client SDK, the client (a web page or mobile app) gets session info from the server. This includes a unique authentication token (the client’s “key”) created by your app server.
The client uses the session ID and token to establish a connection with the session. The client can then publish an audio-video stream to the session and listen for important events (such as a new user joining the session). At this point, the client is the only participant in the session.
When a new user loads the client-side application in a separate web page or mobile device (Client 2), the new client receives the session ID and a unique token from your app server. The client uses that info to establish a connection to the session.
Now that it’s connected to the session, Client 2 can subscribe to Client 1’s stream. Client 2 then publishes its own video stream to the session, and Client 1 subscribes to it. Both clients are now subscribed to each other's stream in a one-to-one video chat, and both are “listening” for new events (such as a new user connecting to the session).
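The sequence above can be traced with a toy model of the session state. This is a sketch under stated assumptions, not the client SDK: ToySession and its connect, publish, and subscribe methods are hypothetical names, clients are represented by plain strings, and token checks are left out to keep the flow visible.

```python
class ToySession:
    """Toy model of a session "room" (hypothetical, not the client SDK)."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.connected = set()      # clients connected to the session
        self.streams = set()        # clients currently publishing a stream
        self.subscriptions = set()  # (subscriber, publisher) pairs

    def connect(self, client):
        self.connected.add(client)

    def publish(self, client):
        if client not in self.connected:
            raise RuntimeError("a client must connect before publishing")
        self.streams.add(client)

    def subscribe(self, subscriber, publisher):
        if subscriber not in self.connected:
            raise RuntimeError("a client must connect before subscribing")
        if publisher not in self.streams:
            raise RuntimeError("there is no such stream to subscribe to")
        self.subscriptions.add((subscriber, publisher))


# The app server creates the session; the room starts out unoccupied.
session = ToySession("demo-session")

# Client 1 connects and publishes. It is alone in the session for now.
session.connect("client1")
session.publish("client1")

# Client 2 connects, subscribes to Client 1, then publishes its own stream.
session.connect("client2")
session.subscribe("client2", "client1")
session.publish("client2")

# Client 1 subscribes back, which completes the one-to-one chat.
session.subscribe("client1", "client2")
```

At the end of the run, session.subscriptions holds both directions of the chat: ("client1", "client2") and ("client2", "client1").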
Vonage Video API supports one-to-one communication and group communication. Multiple clients can connect to a session, and each connected client can subscribe to each stream in the session. In addition to the basic functionality above, the Vonage Video API platform provides a variety of additional features such as screen-sharing, archiving, text chat, broadcasting, and more. To see the complete set of features offered, see the developer guides.
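To get a feel for how group sessions grow, the short calculation below counts streams and subscriptions. It assumes, for illustration only, that every client publishes exactly one stream and subscribes to every other client's stream; stream_counts is a hypothetical helper, not part of any SDK.

```python
def stream_counts(participants):
    """Return (published_streams, subscriptions) for a fully subscribed session.

    Assumes each participant publishes one stream and subscribes to the
    stream of every other participant.
    """
    if participants < 0:
        raise ValueError("participants must not be negative")
    published = participants
    subscriptions = participants * (participants - 1)
    return published, subscriptions


print(stream_counts(2))  # (2, 2): the one-to-one chat described earlier
print(stream_counts(5))  # (5, 20)
```

Under that assumption the number of subscriptions grows roughly with the square of the number of participants, which is one reason one-to-many layouts with few publishers are common for larger audiences.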
After reviewing the materials above, you should have a basic understanding of how the Vonage Video API works. If you're ready to start building your first application, check out the Basic Client Tutorial for web, iOS, Android, Windows, or Linux.
You can also find a list of key terms and definitions below, along with a listing of other Developer Center resources.
Session
You can think of a session as a “room” where clients can interact with one another in real time. Each session is associated with a unique session ID. New sessions are generated by your app server via the server SDKs. Clients obtain the session ID and a unique authentication token from the server to initialize and connect to the session, after which they can publish streams and subscribe to streams in the session, as well as listen for events dispatched by the session (such as a new user connecting).
Check out our developer guides to learn how to create a session with the server SDKs, join a session with the client SDKs, and explore more session-related functionality. For information on specific classes, methods, and events associated with sessions, review our API Reference docs for Web, iOS, Android, Windows, and Linux.
Token
A token is a unique authentication “key” that allows a client to join a session. When a client attempts to join a session, your app server generates a unique token and sends it to the client along with the session ID. Tokens have expiration dates (specified by the server), whereas sessions never expire. Each token can also be assigned a role (publisher, subscriber, or moderator), which determines the permissions of the client, such as publishing streams and subscribing to streams in the session.
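The two token properties described above, an expiration time set by the server and a role that controls permissions, can be modeled in a few lines. This is a minimal sketch, not the real token format: ToyToken, issue_token, the ttl_seconds parameter, and the ROLE_CAN_PUBLISH table are hypothetical, and the table covers only the publish permission.

```python
import time
from dataclasses import dataclass

# Hypothetical permission table for this sketch. The actual permissions
# behind each role are defined by the platform, not by this code.
ROLE_CAN_PUBLISH = {"subscriber": False, "publisher": True, "moderator": True}


@dataclass(frozen=True)
class ToyToken:
    session_id: str
    role: str
    expire_time: float  # Unix timestamp chosen by the server

    def is_valid(self, now=None):
        # A token stops working once its expiration time has passed.
        now = time.time() if now is None else now
        return now < self.expire_time

    def can_publish(self):
        return ROLE_CAN_PUBLISH[self.role]


def issue_token(session_id, role="publisher", ttl_seconds=3600, now=None):
    """Server-side helper: create a token for one client of one session."""
    if role not in ROLE_CAN_PUBLISH:
        raise ValueError(f"unknown role: {role}")
    now = time.time() if now is None else now
    return ToyToken(session_id, role, now + ttl_seconds)


# A viewer-only token that lasts one minute from a fixed reference time.
viewer = issue_token("demo-session", role="subscriber", ttl_seconds=60, now=1000.0)
```

Here viewer.is_valid(now=1030.0) is true and viewer.is_valid(now=1060.0) is false, while viewer.can_publish() is false because of the subscriber role. The session ID itself carries no expiration in this model, matching the statement that sessions never expire.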
Check out our developer guides to learn how to create a token with the server SDKs and join a session with the client SDKs. When building a demo or proof of concept, you can generate tokens in your Vonage Video API Account by creating a project, browsing to the Project Overview, and scrolling down to project tools, but a server component must be set up before going to production.
Client
A client is any browser or mobile app utilizing client-side code from the Vonage Video API client SDKs. The client is one of two key components that make up a Vonage Video API application (the other being the server), and is responsible for most Vonage Video API functionality, such as connecting to a session, publishing and subscribing to streams, listening for session events, and dispatching events to the session.
Before it can interact with a session, the client must receive the session ID and token generated by your app server. Check out our Basic Client Tutorial to learn how to implement client-side functionality, as well as our developer guides for more information.
Server
The app server, set up by the app developer using the Vonage Video API server SDKs, generates new sessions and tokens, which it then sends to the client. All Vonage Video API applications require both a client and server component. Check out our developer guides to learn how to create a session and create a token with the server SDKs, as well as other functionality handled by the server.
Connection
A client interacts with the session via a persistent event (or signaling) connection, which uses WebSockets to constantly listen for new events dispatched by the session. This connection is different from the WebRTC media connection(s) established when clients publish and subscribe to streams in the session.
In order to establish a connection to a session, the client must receive a session ID and token from your app server, which are then used for authentication. Each client usually opens a single connection to a given session, and each client connection to the session is associated with a unique connection ID. Check out our developer guides to learn how to connect to a session with the client SDKs.
Stream
A stream is a single audio-video signal, which includes a user's published camera and microphone feed. During a session, clients publish streams to the session and subscribe to other clients’ streams. When a new stream is created, an event is dispatched by both the Publisher object and Session object.
Publish
Once a client is connected to a session, it can publish an audio-video stream to the session using the device’s webcam and microphone. The client’s token role (publisher, subscriber, moderator) determines whether that client can publish to the session. This allows for sessions with only one or two publishers but many subscribers (one-to-many). Check out our developer guides to learn how to publish a stream to a session with the client SDKs.
Subscribe
Once a client is connected to a session, it can subscribe to any audio-video streams published by other clients in the session. Check out our developer guides to learn how to subscribe to streams in a session with the client SDKs.
Events
Once a client establishes a connection to a session, it is able to listen for events dispatched by the session. Events are dispatched for a variety of reasons, such as a new stream being created or a new client connecting or disconnecting from the session. If you want a client to react to a certain event, you must set up an event listener. For information on the various events that can occur during a session and how to handle them, check out the client SDK reference docs for web (JavaScript), iOS, Android, Windows, and Linux.
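The listener pattern described above looks roughly like this in miniature. The sketch is a generic event dispatcher, not the client SDK: ToyEventSource and its on and dispatch methods are hypothetical, and the event names and payload fields are illustrative strings only.

```python
from collections import defaultdict


class ToyEventSource:
    """Generic event dispatcher (hypothetical, not the client SDK)."""

    def __init__(self):
        # Maps an event name to the handlers registered for it.
        self._listeners = defaultdict(list)

    def on(self, event_name, handler):
        self._listeners[event_name].append(handler)

    def dispatch(self, event_name, payload):
        # Call every handler registered for this event, in order.
        for handler in list(self._listeners[event_name]):
            handler(payload)


session = ToyEventSource()
seen = []

# React only to new streams; other events have no listener in this example.
session.on("streamCreated", seen.append)

session.dispatch("streamCreated", {"streamId": "stream-1"})
session.dispatch("connectionCreated", {"connectionId": "conn-1"})
```

After both dispatches, seen contains only the stream payload. The second event is dropped because nothing registered a listener for it, which is why a client must set up a listener for every event it wants to react to.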
Client SDK
The client SDKs are a set of code libraries available for web (JavaScript), iOS, Android, Windows, and Linux, used to set up the client. The client-side code handles the majority of Vonage Video API functionality, including publishing and subscribing to streams in the session and listening for session events.
Server SDK
The server SDKs are a set of wrappers for the Vonage Video REST API available for Node, PHP, Java, .NET, Python, and Ruby. This code is set up on your app server and is used to generate new sessions and tokens for the client.
Vonage Video REST API
The Vonage Video REST API is an HTTP interface with the Vonage Video API Cloud used to create sessions and handle advanced features such as archiving and broadcast. The Vonage Video API server SDKs implement many of the methods of the REST API. Check out the REST API reference docs to learn the various methods available.
Vonage Video API Cloud
The Vonage Video API Cloud manages sessions, client connections, API calls, signaling, events, and just about everything else that’s not handled by the client SDKs or server SDKs.
The article you’re reading, along with all other Vonage Video API documentation, is part of the Vonage Video API Developer Center. The Developer Center is meant to provide everything a developer needs to successfully build and maintain a Vonage Video API application.
Here are some of the resources available in the Developer Center:
Hello World: a quick demonstration of the most basic Vonage Video API functionality
Tutorials: step-by-step walkthroughs on building a Vonage Video API application and adding advanced features
Code Samples: a listing of GitHub repos with sample apps to help you build faster
Video Chat Embeds: the fastest way to integrate Vonage Video API functionality into your website with minimal code
Developer Guides: thorough documentation on all Vonage Video API features and functionality
Client SDK Reference for Web, iOS, Android, Windows, Linux: info on specific classes, methods, and events used by the client SDKs
REST API Reference Docs: a guide to using the Vonage Video REST API and server SDKs
Developer Tools: helpful tools for debugging sessions, testing API calls, and more
Beta Programs: a page listing all Vonage Video API public betas and how to join them
If you have a question or run into issues and can’t find the answer in the Developer Center, you can visit our support center to contact our support team or browse frequently asked questions.