Building a Snapchat-like app with WebRTC in the browser

In light of the recent Snapchat IPO, I thought it would be interesting to see whether it is possible to build a Snapchat-like app using WebRTC in a browser. The good news is that, thanks to some new features in modern browsers (Firefox and Chrome), the answer is yes!

To see a demo of my app running, go to http://aullman.github.io/snapchat-killer. You can see the source code at https://github.com/aullman/snapchat-killer.

Note: this app only really works properly in Chrome and Firefox on the desktop.

Features

First of all, what are the features of a Snapchat-like app? In my app I wanted you to be able to:

  1. Take a snapshot (one click of the button).
  2. Record a video (hold down the button).
  3. Apply filters to the image/video of yourself.

Taking a Snapshot

When you click the capture button in the app, we want to take a snapshot of the current camera image and put it into an image tag. There is a slightly involved pipeline that you need to set up to take a snapshot from the camera.

First we get the video stream from our camera using getUserMedia() and then we pass that into a video element to display it.

let videoElement;
navigator.mediaDevices.getUserMedia({
  audio: true,
  video: true,
}).then(stream => {
  videoElement = document.createElement('video');
  videoElement.srcObject = stream;
  videoElement.muted = true; // mute local playback so we don't hear ourselves
  document.body.appendChild(videoElement); // show the camera preview on the page
  videoElement.play();
});

Then when the user clicks the button we draw the current frame of the video tag onto a canvas using the drawImage() function.

const canvas = document.createElement('canvas');
canvas.width = videoElement.videoWidth;
canvas.height = videoElement.videoHeight;
const ctx = canvas.getContext('2d');
ctx.drawImage(videoElement, 0, 0, canvas.width, canvas.height);

Now we can create an image tag and set its src to the contents of the canvas using the toDataURL() function.

const image = document.createElement('img');
image.setAttribute('src', canvas.toDataURL('image/png'));

Recording a Video

When you hold down the capture button in Snapchat, it records a video. We can recreate this feature in our app using the new MediaRecorder API, which lets you record any MediaStream.

We start recording by creating a new MediaRecorder object, passing in the MediaStream we got from getUserMedia() along with some options that tell it which codec we want to use. Then we listen for the ‘dataavailable’ event and collect the recorded blobs, each covering 10 ms of video.

// mediaStream is the MediaStream we got back from getUserMedia()
let recordedBlobs = [];
const options = { mimeType: 'video/webm;codecs=vp9' };
const mediaRecorder = new MediaRecorder(mediaStream, options);
mediaRecorder.addEventListener('dataavailable', event => {
  if (event.data && event.data.size > 0) {
    recordedBlobs.push(event.data);
  }
});
mediaRecorder.start(10); // collect 10ms of data
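
Two details worth calling out: not every browser can record VP9 in WebM, and we also need to stop the recorder when the user releases the button. Here is a minimal sketch of both (the pickMimeType helper and the captureButton element are illustrative names, not part of the original demo):

// Pick the first WebM variant this browser can actually record
const pickMimeType = () => [
  'video/webm;codecs=vp9',
  'video/webm;codecs=vp8',
  'video/webm',
].find(type => MediaRecorder.isTypeSupported(type)) || '';

// Stop collecting data when the user releases the capture button
captureButton.addEventListener('mouseup', () => {
  if (mediaRecorder.state === 'recording') {
    mediaRecorder.stop();
  }
});

You would then pass the result of pickMimeType() as the mimeType option instead of hard-coding VP9.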

Then, to play back the video, we create a video element and set a Blob built from our recordedBlobs as its src.

const recordedVideo = document.createElement('video');
recordedVideo.autoplay = true;
recordedVideo.loop = true;
const superBuffer = new Blob(recordedBlobs, { type: 'video/webm' });
recordedVideo.src = URL.createObjectURL(superBuffer);

Downloading Images and Videos

Now we have an image tag with a snapshot in it and a video tag with a recorded video in it, but how do we let the user download them? We can create an anchor tag with its href set to the src of the image or video. Then we set the anchor's download attribute to give the file a default name. Finally, we click the link programmatically, which triggers the download.

// captured is the <img> or <video> element holding the snap we want to download
const a = document.createElement('a');
a.style.display = 'none';
a.href = captured.getAttribute('src');
a.download = `snap.${captured.tagName === 'IMG' ? 'png' : 'webm'}`;
document.body.appendChild(a);
a.click();
setTimeout(() => {
  document.body.removeChild(a);
}, 100);
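
One small cleanup note: the recorded video's src is an object URL, so it is worth releasing it once the snap has been downloaded or discarded. A small sketch, assuming you keep a reference to the URL when you create it:

const objectUrl = URL.createObjectURL(superBuffer);
recordedVideo.src = objectUrl;
// Later, once the video element and download link are no longer needed
URL.revokeObjectURL(objectUrl);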

Applying Filters

Now we’re at the fun part. We don’t want to just take boring snaps and videos. We want to create some filters that make our photos more interesting. To do this we need to alter our video pipeline a little bit to include a filtering step.

We are still drawing the video onto a canvas, but now we first draw it onto a temporary canvas and get the image data out of that. The image data is an array of the RGBA values of every pixel. By manipulating these values before drawing them onto the visible canvas, we can do whatever we want to the resulting image.

let requestId, tmpCanvas, tmpCtx, ctx;
const drawFrame = () => {
  if (!ctx) {
    ctx = canvas.getContext('2d');
  }
  if (!tmpCanvas) {
    tmpCanvas = document.createElement('canvas');
    tmpCtx = tmpCanvas.getContext('2d');
    tmpCanvas.width = canvas.width;
    tmpCanvas.height = canvas.height;
  }
  tmpCtx.drawImage(videoElement, 0, 0, tmpCanvas.width, tmpCanvas.height);
  const imgData = tmpCtx.getImageData(0, 0, tmpCanvas.width, tmpCanvas.height);
  const data = filter(imgData);
  ctx.putImageData(data, 0, 0);
  requestId = requestAnimationFrame(drawFrame);
};

There are lots of libraries out there that can help you with the filter step. I ended up using tracking.js, but you could even use CSS filters if you only need the effect on screen (CSS filters won't show up in a recorded canvas stream). I wrote in a bit more detail about creating your own filters in another post.
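
As an example of what the filter function in drawFrame might look like, here is a minimal grayscale filter that just averages the RGB channels of each pixel (an illustrative sketch, not the tracking.js-based filters from the demo):

// A simple example filter: convert each pixel to grayscale
const filter = imgData => {
  const pixels = imgData.data; // flat array of RGBA values, four entries per pixel
  for (let i = 0; i < pixels.length; i += 4) {
    const average = (pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3;
    pixels[i] = average;     // red
    pixels[i + 1] = average; // green
    pixels[i + 2] = average; // blue
    // pixels[i + 3] is alpha; leave it untouched
  }
  return imgData;
};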

To record a filtered video we need a MediaStream that is already filtered. So we use requestAnimationFrame() to repeatedly draw filtered frames onto the canvas, and then use the canvas's new captureStream() function to get a MediaStream that we can pass to our MediaRecorder. We also want to keep the audio, so we add the audio track from our original MediaStream into the new canvas stream.

let filteredStream = canvas.captureStream();
if (stream.getAudioTracks().length) {
  filteredStream.addTrack(stream.getAudioTracks()[0]);
}

Now our MediaRecorder will record our filtered video stream.
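
Putting it together, the recording step looks just like before, only with the canvas stream instead of the raw camera stream. A sketch reusing drawFrame, options and recordedBlobs from above:

// Kick off the filtered drawing loop, then record the canvas stream
requestId = requestAnimationFrame(drawFrame);
const filteredRecorder = new MediaRecorder(filteredStream, options);
filteredRecorder.addEventListener('dataavailable', event => {
  if (event.data && event.data.size > 0) {
    recordedBlobs.push(event.data);
  }
});
filteredRecorder.start(10);

When the user releases the button, stop the recorder as before and call cancelAnimationFrame(requestId) to end the drawing loop.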

Face Detection

Snapchat's filters use all kinds of clever face detection to be able to morph your face. The great news is that there are JavaScript libraries that let us do the same thing. The library I used for this is clmtrackr.

clmtrackr tracks positions on the face and gives them to you as an array. You can then use this array to, for example, draw some silly glasses on top of the face. We do this by looking at the position and angle of the eyes and drawing our comedy-glasses image in the right place on our filtered canvas.
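
Before clmtrackr can give you those positions, the tracker has to be started on the video element. A minimal setup sketch, assuming the clmtrackr script and its bundled default face model are loaded on the page as the global clm:

// Start clmtrackr on the camera video so we can query face positions every frame
const clmtrackr = new clm.tracker();
clmtrackr.init();              // initialize with the bundled default face model
clmtrackr.start(videoElement); // begin tracking the webcam video element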

const distance = (point1, point2) => {
  return Math.sqrt(Math.pow(point2[0] - point1[0], 2) + Math.pow(point2[1] - point1[1], 2));
};
let glassesImage;
let glassesCtx;
const drawGlasses = (videoElement, canvas) => {
  // Cache the glasses image and canvas context between frames instead of recreating them every call
  if (!glassesCtx) {
    glassesCtx = canvas.getContext('2d');
  }
  if (!glassesImage) {
    glassesImage = document.createElement('img');
    glassesImage.src = 'https://aullman.github.io/opentok-camera-filters/images/comedy-glasses.png';
  }
  let positions = clmtrackr.getCurrentPosition();
  if (positions && positions.length > 20) {
    const width = distance(positions[15], positions[19]) * 1.1;
    const height = distance(positions[53], positions[20]) * 1.15;
    const y = positions[20][1] - (0.2 * height);
    const x = positions[19][0];
    // Calculate the angle to draw by looking at the position of the eyes
    // The opposite side is the difference in y
    const opposite = positions[32][1] - positions[27][1];
    // The adjacent side is the difference in x
    const adjacent = positions[32][0] - positions[27][0];
    // tan = opposite / adjacent
    const angle = Math.atan(opposite / adjacent);
    try {
      glassesCtx.translate(x, y);
      glassesCtx.rotate(angle);
      glassesCtx.drawImage(glassesImage, 0, 0, width, height);
      glassesCtx.rotate(-angle);
      glassesCtx.translate(-x, -y);
    } catch (err) {
      console.error(err);
    }
  }
};
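
Finally, drawGlasses needs to run inside the same requestAnimationFrame loop so that the glasses end up on the filtered canvas, and therefore in the captured stream. Roughly, inside drawFrame from earlier:

// ...after the filtered pixels have been drawn onto the canvas
ctx.putImageData(data, 0, 0);
drawGlasses(videoElement, canvas); // overlay the glasses on top of the filtered frame
requestId = requestAnimationFrame(drawFrame);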

These are just some simple comedy glasses, but the same idea would apply to an animated rainbow coming out of someone's mouth. clmtrackr also has examples of real-time face deformation and face substitution.

Conclusion

There is obviously a lot more to the Snapchat app than my simple application here, but it's great to see that the APIs are now in modern browsers to build an app like this. Hopefully in the future we'll see more of these popular social apps built on top of WebRTC and other browser technologies.