Live Face Recognition with OpenTok

Next weekend is our second HAPI Hack Weekend at our office in San Francisco, and we’re excited to have a bunch of interesting APIs there to hack with. One API that I’m particularly excited about gives developers a set of methods for easy face recognition.

I created a simple app to introduce the basics of using OpenTok with this face recognition API.

Try the application here.
Get the source code here.

To use the app, once the stream pops up, hit accept, then press the ‘Identify Person’ button. This will capture an image and then attempt to run face recognition on it. For the first few attempts it will not recognize you, since you haven’t trained it yet. When it asks, enter your first name to save the tag to the training set. After a couple of tries, it should recognize you and say hello (it’ll also tell you to cheer up if you’re not smiling!).

In addition to providing the source code, I wanted to give a run-through for those hackers interested in mashing up these APIs. While the code is too long to go through step by step, I will walk through the general process and endpoints I used to build the basic application.

1. Get the image data using OpenTok

You can use the getImgData() method on any OpenTok publisher or subscriber to get a base64 encoded image of the stream.
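A minimal sketch of this step, assuming the OpenTok JS library is loaded on the page and `publisher` is an already-initialized publisher (the stub publisher in the usage example below is purely for illustration):

```javascript
// getImgData() returns a base64-encoded PNG of the current video frame,
// without the data-URI prefix.
function captureFrame(publisher) {
  var base64Png = publisher.getImgData();
  // Prepend a data-URI header if you want to preview the frame in an <img> tag.
  return 'data:image/png;base64,' + base64Png;
}
```

The same call works on a subscriber object, so you could just as easily run recognition on the remote party’s stream.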

2. Save the image

Next you have to convert the base64 string into an image file. In my application, I used the Imgur Upload API to upload the base64 string and get a remote image URL. Alternatively, you could write a method on your server to do the same thing.
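Here is a sketch of the upload, written against Imgur’s current v3 anonymous upload endpoint; `clientId` is a placeholder you’d get by registering an Imgur application, and the exact endpoint and response shape may differ from what this app originally used:

```javascript
// Upload a raw base64 PNG string (no data-URI prefix) to Imgur and
// resolve with the remote image URL.
function uploadBase64Image(base64Png, clientId) {
  var body = new FormData();
  body.append('image', base64Png);
  body.append('type', 'base64');
  return fetch('https://api.imgur.com/3/image', {
    method: 'POST',
    headers: { Authorization: 'Client-ID ' + clientId },
    body: body
  })
    .then(function (res) { return res.json(); })
    .then(extractImageUrl);
}

// Pull the remote image URL out of Imgur's JSON response
// (the link lives under response.data.link).
function extractImageUrl(response) {
  return response && response.data ? response.data.link : null;
}
```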

3. Get the user ids to search for

Before you try to recognize an image, you must define a set of user ids to search against. You can create your own custom namespace or use Facebook / Twitter user ids.

In my application, I created my own namespace and then used the API’s account.users method to get a list of all user ids in that namespace. Alternatively, you could manage a list of user ids with your own database.
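However you fetch them, the recognize call wants the ids as one comma-separated string. A small sketch, assuming each user record carries a `uid` field and that ids follow a `name@namespace` pattern (both of which are this example’s assumptions, not a documented schema):

```javascript
// Turn a list of user records into the comma-separated id string
// that the recognize call expects.
function buildUidList(users) {
  return users.map(function (u) { return u.uid; }).join(',');
}
```

For example, `buildUidList([{ uid: 'adam@mynamespace' }, { uid: 'eve@mynamespace' }])` yields `'adam@mynamespace,eve@mynamespace'`.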

4. Try to recognize the image

To perform recognition on the image, pass the URL of the image and the list of user ids to the recognize method.

In the response you will get an array of tags. A tag is essentially data associated with a face that was found in the image. A tag contains all sorts of cool data about the face, including whether the person is smiling, wearing glasses, male or female, as well as location data about the eyes, ears, mouth, and nose.
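The shape of the request looks roughly like this. The host, path, and parameter names below are placeholders, not the real API’s; only the overall pattern (API key + image URL + comma-separated uids) follows the description above:

```javascript
// Build the query string for a recognize request. Parameter names are
// illustrative placeholders; check the API docs for the real ones.
function buildRecognizeQuery(imageUrl, uids, apiKey) {
  return [
    'api_key=' + encodeURIComponent(apiKey),
    'urls=' + encodeURIComponent(imageUrl),
    'uids=' + encodeURIComponent(uids)
  ].join('&');
}

// Issue the recognize call and resolve with the parsed JSON response.
function recognize(imageUrl, uids, apiKey) {
  var url = 'https://api.example.com/faces/recognize.json?' +
    buildRecognizeQuery(imageUrl, uids, apiKey);
  return fetch(url).then(function (res) { return res.json(); });
}
```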

5. Analyze and handle the image tags

At this step there are a few scenarios that could happen, each of which needs to be handled differently. Here are the three scenarios and how I handled them in my application:

  1. No tags were returned.

    This means there were no faces in the image. In this case we can’t really do anything other than tell the user.

  2. A tag was returned, but no users were recognized.

    This means there was a face tagged, but the face did not match any users. In this case, we ask the user for her name and then call the training method to associate the tag with the user. This gives the API data to use for future recognize calls.

  3. A tag was returned and a user was recognized.

    In this case we can get the user id of the recognized user and say “Hello userid!”.
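The three scenarios above boil down to one branch on the tag array. This sketch assumes each tag carries a `uids` array of `{ uid, confidence }` candidates, which is an assumption based on the description above rather than a documented schema:

```javascript
// Dispatch a recognize response to one of three handlers:
// noFace, unknownFace (train it), or recognized (greet the user).
function handleTags(tags, handlers) {
  if (!tags || tags.length === 0) {
    return handlers.noFace();                      // 1. no faces in the image
  }
  var candidates = tags[0].uids || [];
  if (candidates.length === 0) {
    return handlers.unknownFace(tags[0]);          // 2. face found, nobody matched
  }
  return handlers.recognized(candidates[0].uid);   // 3. say hello
}
```

In my app, the `unknownFace` handler is where the “what’s your name?” prompt and the training call live, and `recognized` is where the greeting (and the smile check) happens.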

While this application was meant to illustrate the basics of how OpenTok and face recognition can work together, I’m sure there are dozens of creative uses for face recognition in live video that have yet to be built.

If you have any clever ideas, please post them in the comments–or better yet, come to HAPI Hack on August 13 and 14 and build it!

