Why do we care about ORTC?

For those who might not know (and are still interested in the topic), ORTC (Object RTC) is an initiative that was started one year ago by a group of people who were not comfortable with the approach taken for the design of the WebRTC APIs. This group recently published the first official draft of an alternative API, with support from very relevant people at Google and Microsoft.

Everybody who has worked long enough with the existing WebRTC API will probably agree that, beyond the basic 1:1 call, the most important WebRTC API today is RegExp.

The problem is that the existing API was designed, to some extent, to mimic traditional telephony flows and protocols, which are based on the concept of offer and answer and on a textual format for representing configurations called SDP. Although this model ensures a very smooth transition with legacy SIP-based platforms, it also requires a complete offer/answer roundtrip and manipulation of the SDP blobs whenever you try to customize the behavior of the stack.

There are definitely some configurable parameters/flags that we call constraints, but they provide very limited customization. As soon as you need to disable a codec, enable stereo audio, configure bitrates or enable simulcast, you end up “fixing” the SDP using regular expressions. You can look at the Google reference implementation if you don’t believe it.
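To make the point concrete, here is a minimal sketch of what this kind of SDP “fixing” tends to look like. The function names `enableOpusStereo` and `setVideoBitrate` are hypothetical, and the code is not taken from the Google reference implementation; it only assumes the standard SDP line formats (`a=rtpmap`, `a=fmtp`, `c=`, `b=AS`) and CRLF line endings.

```typescript
// Hypothetical helper: append "stereo=1" to the Opus fmtp line of an SDP blob.
function enableOpusStereo(sdp: string): string {
  // Find the payload type that the rtpmap line assigns to Opus.
  const rtpmap = sdp.match(/^a=rtpmap:(\d+) opus\/48000/im);
  if (!rtpmap) return sdp; // Opus is not offered, nothing to change

  // Locate the fmtp line for that payload type (up to, not including, CRLF).
  const fmtp = new RegExp(`^(a=fmtp:${rtpmap[1]} [^\\r\\n]*)`, "m");
  const line = sdp.match(fmtp);
  // Assumption: if there is no fmtp line, or stereo is already set, leave the SDP alone.
  if (!line || /stereo=/.test(line[1])) return sdp;

  return sdp.replace(fmtp, "$1;stereo=1");
}

// Hypothetical helper: cap the video bitrate by inserting a "b=AS:<kbps>" line
// right after the c= line of the video m-section.
function setVideoBitrate(sdp: string, kbps: number): string {
  const videoHeader = /(m=video[^\r\n]*\r\nc=[^\r\n]*\r\n)/;
  return sdp.replace(videoHeader, `$1b=AS:${kbps}\r\n`);
}
```

In a browser these helpers would typically be applied to the `sdp` string of the offer or answer before it is handed back to the peer connection. Note how fragile this is: the regular expressions silently do nothing if the browser changes the layout of the SDP it generates.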

ORTC is important for TokBox because it opens up the possibility for us to do all of the above without having to use workarounds or hacks to get access to the internal capabilities of WebRTC browsers.

With ORTC there will be a clean way to do it. The new API model will make it possible to access, in an object-oriented way, internal capabilities of WebRTC that are not exposed today, and it will bring much more flexibility, as we can already see in the proposed simulcast/layering capabilities in ORTC.

One question that usually arises around ORTC is backwards compatibility. From our point of view WebRTC as a whole is composed of three different layers:

1) Protocols defined by the IETF.

2) Media engines provided by Google including coding, signal processing and buffering.

3) API defined by the W3C and exposed in the browser to access those capabilities.

ORTC only changes the API layer, which is normally only around 5% of the full stack, and it does this by providing an API on top of which any existing WebRTC use case or flow can be implemented. With that in mind, we think that we probably don’t need to call the new API ‘WebRTC 2.0’, and we’re confident that we can have a smooth transition with endpoints supporting both the new and the old style APIs.

That said, the new approach might have some drawbacks. We believe that ORTC is not making things easier for developers; however, it is making them simpler, given that the APIs are cleaner. At the end of the day we can always create wrappers with different semantics, like we do with the TokBox client SDKs, to simplify the development of applications on top of the raw WebRTC APIs.