Is WebRTC Ready for H.264?

The long-running video codec debate has, without a doubt, been the biggest open issue in the WebRTC standards effort.

In a surprise announcement last week, Cisco introduced a mechanism through which H.264 could be used in WebRTC browser implementations free from MPEG-LA’s licensing burden.

Cisco’s maneuver was a master stroke from the playbook of open standards strategy.  The licensing deal they announced with MPEG-LA appears to cut the legs out from under the main pragmatic argument against H.264 (i.e., the royalty problem).  Mozilla’s support lent Cisco’s approach instant credibility with the ideological wing (i.e., the open source camp).  And by keeping this under wraps until a week before the upcoming IETF 88 meeting, at which the video codec debate is to be revisited, Cisco left no time for a coordinated response from the VP8 camp.

The reaction from the web bears witness to both the admiration for and the frustration with Cisco’s master stroke.

Here at TokBox, we’ve been taking a few days to work through the parts of this announcement that make our brains hurt, so we can figure out if it’s a good thing.

And we’ve concluded that while it’s a step in a good direction, it’s not quite good enough.

Putting it in black and white: while we take our hats off to Cisco for their generosity, we still believe that H.264 as the sole mandatory-to-implement (MTI) video codec for WebRTC is the wrong strategy.

Why?  Because we don’t think that the Cisco proposal goes far enough to make an H.264-based version of WebRTC work for the thousands of applications we’ve seen implement face-to-face video using OpenTok.

From our reading of the situation, there are too many open issues, and too many gaps in the support that Cisco proposes.  Here are five simple observations:

  1. Many of the most interesting use cases we have seen for WebRTC involve a mobile endpoint running a native application that embeds WebRTC.  It is not clear whether Cisco’s licensing scheme supports this class of applications (whether they are 100% native or PhoneGap-powered).  If it doesn’t, the garage developers who are the cornerstone of innovation in the mobile application universe will only be able to use WebRTC in their native apps if they’re ready to pay H.264 licensing fees.
  2. Cisco has not yet taken a definitive stand on extending their offer to iOS, one of the two dominant mobile operating systems.  Apple’s existing terms of service don’t allow apps to download plugins at runtime.  Whether you like Apple or hate them, TokBox’s point of view is that we should be doing everything we can to extend WebRTC to that platform, because we’re focussed on putting real-time communication in the hands of all end users.  Taken together with the previous point, Cisco’s proposal does not appear to bring us closer to that goal.
  3. Without a concrete implementation to benchmark, we have concerns about the impact of needing to dynamically download the H.264 codec plugin on overall startup times for WebRTC conversations, particularly in mobile contexts.  Our experience over the past several years is that customer satisfaction drops as delays in conversation startup times grow.  Insisting on a dynamically downloaded code component is taking us in the wrong direction.  WebRTC is supposed to be plugin-free, not plugin-full.
  4. To those who point to onboard H.264 encoding/decoding as an alternative path for supporting H.264 on mobile platforms, we simply observe that things aren’t as easy as you might like.  Some phones only have hardware-assisted decoding, not encoding.  Others that have hardware acceleration don’t make the relevant APIs publicly available to developers.  And even when they do, the hardware pipelines only support a single stream at a time, which means you’re back out in the cold the moment you want to interact with two live WebRTC streams at the same time on your phone or your tablet.
  5. And finally, H.264 licensing has always been a big fat ugly mess.  Everybody talks about MPEG-LA (the organization from which Cisco is licensing), but it is generally acknowledged that there are patent holders with claims over H.264 who aren’t a part of the MPEG-LA group and who consequently aren’t covered by the Cisco license.  Their point of view on Cisco’s proposal is unknown. (To be fair, the VP8 patent rights picture is not crystal clear either.)

These are just a few of the main issues.  If you want some more things to think about, read Tsahi’s blog post and you can find your own path to making your brain hurt.

So where does this leave us?  At this point, we still believe VP8 is the right MTI video codec for WebRTC.

With the Cisco proposal now on the table, we believe H.264 should be embraced as an optional alternative video codec, available in deployments when its use can lead to significant performance optimization or is needed for interop.
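To make the dual-codec idea concrete, here is a minimal sketch of how two endpoints might settle on a video codec when VP8 is MTI and H.264 is optional.  This is an illustration under our own assumptions, not the actual WebRTC negotiation code: the function names `with_mti` and `choose_video_codec` are hypothetical, and real negotiation happens through SDP offer/answer rather than plain lists of names.

```python
from typing import List, Optional, Sequence

# Assumption for illustration: VP8 is the mandatory-to-implement (MTI) codec,
# as this post argues, and H.264 is an optional extra.
MTI_CODEC = "VP8"


def with_mti(prefs: Sequence[str]) -> List[str]:
    """Return the preference list, appending the MTI codec if it is missing."""
    return list(prefs) if MTI_CODEC in prefs else [*prefs, MTI_CODEC]


def choose_video_codec(
    local_prefs: Sequence[str], remote_supported: Sequence[str]
) -> Optional[str]:
    """Return the first locally preferred codec the remote side also supports."""
    remote = set(remote_supported)
    for name in local_prefs:
        if name in remote:
            return name
    return None  # no codec in common


# A mobile endpoint that prefers hardware-friendly H.264 but still carries VP8.
mobile_prefs = with_mti(["H264"])

# Talking to a VP8-only endpoint falls back to the MTI codec.
vp8_only_choice = choose_video_codec(mobile_prefs, ["VP8"])

# Talking to an endpoint that also offers H.264 uses the optional codec.
dual_choice = choose_video_codec(mobile_prefs, ["VP8", "H264"])
```

Because every endpoint carries the MTI codec, the two sides always have something in common, and H.264 is used only when both sides offer it.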

But before anybody in the VP8 or H.264 camps gets too excited, we’d like to follow up our statement above with a call for everybody involved to take a step back and do what’s right for the users.

Google is advancing its one-size-fits-all video codec philosophy through the open source code it writes – and largely controls – in the WebRTC reference implementation that just about everybody else depends on.  The audio processing pipeline supports multiple codecs, and is remarkably easy to extend to include even more.  The video processing code supports only VP8, and makes it difficult to add optional support for another codec, like H.264.  Sure, it’s just code. Sure, it’s open source.  So you can do whatever you like, but the truth is that it is difficult and expensive to have to branch the code to support two codecs downstream of the reference implementation, and then to keep up to date with all the rest of the changes in the reference implementation.  What Google is doing is a form of evil.  So Google, knock it off.  Put some switch statements in the WebRTC video pipeline.
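For readers wondering what "switch statements" would look like, here is a rough sketch of a video pipeline that picks its encoder by codec name.  Everything in it is hypothetical: the class names, the registry, and the fake encoders that just tag a frame are ours, and none of it reflects how the reference implementation is actually structured.

```python
from typing import Callable, Dict


class VideoEncoder:
    """Hypothetical encoder interface shared by every codec in the pipeline."""

    def encode(self, frame: bytes) -> bytes:
        raise NotImplementedError


class FakeVp8Encoder(VideoEncoder):
    # Stand-in for a real VP8 encoder; it only tags the frame.
    def encode(self, frame: bytes) -> bytes:
        return b"VP8:" + frame


class FakeH264Encoder(VideoEncoder):
    # Stand-in for a real H.264 encoder; it only tags the frame.
    def encode(self, frame: bytes) -> bytes:
        return b"H264:" + frame


# The "switch statement": one place that maps a negotiated codec name to an
# encoder factory, so adding a codec does not mean branching the pipeline.
ENCODER_FACTORIES: Dict[str, Callable[[], VideoEncoder]] = {
    "VP8": FakeVp8Encoder,
    "H264": FakeH264Encoder,
}


def create_encoder(codec_name: str) -> VideoEncoder:
    """Build the encoder registered for the given codec name."""
    try:
        factory = ENCODER_FACTORIES[codec_name]
    except KeyError:
        raise ValueError(f"unsupported video codec: {codec_name}") from None
    return factory()
```

The point of the sketch is that the rest of the pipeline only ever talks to the shared interface, so a second codec becomes one more entry in the table rather than a downstream fork.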

Cisco is being quite “generous” with its licensing support of H.264.  But don’t kid yourself, this is self-interested generosity.  $65M, or whatever number you want to estimate as the cost of their WebRTC MPEG-LA licensing deal, is a drop in the bucket when you look at the size of Cisco’s H.264-dependent conferencing business.  We do think Cisco is being quite creative with their approach, and we truly appreciate it.  But we call upon Cisco to take their game to the next level.  WebRTC is about more than the Web.  It’s about mobile too.  Cisco, if you want us all to get behind this, bring Apple to the table and make your proposal work for any WebRTC browser or software endpoint, regardless of device.  That way, all mobile developers can play with H.264-powered WebRTC without fear.

MPEG-LA is doing its best to make sure that H.264 stays relevant in a real-time universe.  It has its eye on the ripple effect of VP8 hardware support, and presumably on risks to the future value of licensing for H.265.  You might think Cisco is being creative entirely on its own, without MPEG-LA’s blessing, but a rational assessment is that MPEG-LA is likely involved in this deal.  But if the goal for MPEG-LA is to counter the risk of VP8, why not just go big time?  If you want us to get 100% serious about H.264 for WebRTC, why not just make it free?  MPEG-LA, you’ll get our attention if you declare software implementations of H.264 codecs for WebRTC free and clear of royalties.

Are any of these things going to happen?  Of course not.  But seeing the needs of users come second is the most frustrating part of watching elephants dance around open standards.

Let there be no mistake.  Every major player at the WebRTC standards table has an agenda.  And while they will all portray their actions in the most positive of lights, smart observers will pay careful attention to the unspoken agendas at play.  This week’s rapid-fire developments are no exception.

We’d love for Cisco’s move to be a first step in a series of improvements that make the WebRTC standard even better – a series of steps that truly bring H.264 to the party.  In this blog post, we’ve told you some of what we’d like to see.

But in the meantime, for those focussed on this week’s IETF 88 meeting, let’s recap:

  1. We still stand behind VP8 as the MTI video codec for WebRTC.
  2. We support H.264 as an alternative secondary codec for optimization purposes, particularly in mobile contexts.
  3. We call upon Google to make it easier in the reference implementation for implementors to support a secondary video codec.
  4. We encourage Cisco to make their H.264 licensing deal work for all WebRTC browsers and WebRTC software endpoints, regardless of device or platform.

Is WebRTC ready for H.264?  Yes, we think it is.

Is WebRTC ready for H.264 ONLY?  In our humble opinion, not yet.


TokBox is a wholly-owned subsidiary of Telefónica.  With hundreds of millions of mobile customers, many of whom have H.264-equipped phones, Telefónica is strongly behind H.264 as the MTI video codec for WebRTC.  At TokBox, we are arguing for a dual strategy – something that will make the world great for all those mobile phones, while at the same time enabling developers, enterprises and institutions all over the world to easily and cost-effectively deploy applications that embed high-performance face-to-face video.