CU-RTC-Web: Microsoft’s view on web real time communications

Microsoft has released an alternative proposal to the W3C WebRTC 1.0 Working Draft, named Customizable, Ubiquitous Real Time Communication over the Web (CU-RTC-Web). It is backed by a working prototype that demonstrates an interoperability scenario: a voice call between Google Chrome running on Mac OS X and Internet Explorer 10 on Microsoft Windows.

Microsoft's draft outlines a low-level API that gives developers more direct access to the underlying network and media delivery components. It exposes objects that represent network sockets and gives the application explicit control over the media transport.

Technically, from a functionality and interoperability standpoint, both approaches are equivalent. However, while WebRTC relies on the Session Description Protocol (SDP) for media negotiation, CU-RTC-Web redesigns that functionality around JavaScript. Its authors argue that endpoints should not be required to support SDP processing, and that simple, transparent objects should be provided instead.
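To give a sense of what "SDP processing" involves, the sketch below pulls the media sections and codec mappings out of a small, hand-written SDP fragment. The `parseSdpMedia` function, the `MediaSection` type and the sample SDP are assumptions made for illustration; they are not part of either proposal, and real session descriptions carry many more attributes.

```typescript
// Hypothetical types and helper, for illustration only.
interface MediaSection {
  kind: string;                 // e.g. "audio" or "video"
  port: number;
  protocol: string;             // e.g. "RTP/AVP"
  payloadTypes: number[];
  codecs: Map<number, string>;  // payload type -> "name/clockrate"
}

function parseSdpMedia(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | undefined;
  for (const raw of sdp.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <payload types...>
      const [kind, port, protocol, ...pts] = line.slice(2).split(" ");
      current = {
        kind,
        port: Number(port),
        protocol,
        payloadTypes: pts.map(Number),
        codecs: new Map<number, string>(),
      };
      sections.push(current);
    } else if (current !== undefined) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>
      const match = /^a=rtpmap:(\d+) (.+)$/.exec(line);
      if (match) {
        current.codecs.set(Number(match[1]), match[2]);
      }
    }
  }
  return sections;
}

// Hand-written sample, not taken from a real browser.
const exampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 49170 RTP/AVP 0 9",
  "a=rtpmap:0 PCMU/8000",
  "a=rtpmap:9 G722/8000",
  "m=video 51372 RTP/AVP 100",
  "a=rtpmap:100 VP8/90000",
].join("\r\n");

const media = parseSdpMedia(exampleSdp);
for (const section of media) {
  console.log(section.kind, section.port, Array.from(section.codecs.values()));
}
```

Even this toy parser has to know the line-oriented text format and its attribute conventions, which is the kind of burden the CU-RTC-Web authors argue should not fall on every endpoint.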

The proposal also touches on the ongoing discussions around video codecs in WebRTC, where concerns have been raised that the patents on H.264 are too restrictive for a core Web technology, and where Microsoft has not yet committed to supporting VP8. CU-RTC-Web leaves it up to the developer to select the codec, building flexibility into its core to support popular media formats and codecs while remaining open to future innovation.

Co-authored by Skype Senior Architect Martin Thomson, Lync Principal Architect Bernard Aboba and Microsoft Open Technologies Principal Program Manager Adalberto Foresti, Microsoft’s proposal shows a strong commitment to the technology, which could drive Skype to open up its walled garden and enable interoperability with third-party services.

Although Microsoft's strategy is not fully clear, WebRTC and CU-RTC-Web will most likely be debated over the coming months (remember the VHS vs. Betamax or Blu-ray vs. HD DVD standards wars?). WebRTC takes a simple approach, which Microsoft may genuinely believe is too restrictive. On the other hand, Microsoft may be playing a delaying game to protect Skype and its €6 billion investment.

Most likely this war will delay the time to market for a fully interoperable solution. It took SIP three years to get from the initial draft submission to standardization in RFC 2543, and the first commercial SIP phones and services took a couple more years. The first SIP-based 3GPP draft came five years after the protocol was introduced. WebRTC was introduced in May 2011. How long will it take before we see real WebRTC services?

Let me know your thoughts by commenting on this post or sending me an email!

Looking into WebRTC

WebRTC is an HTML5 standard being drafted by the World Wide Web Consortium (W3C) Web Real-Time Communications Working Group and the Internet Engineering Task Force (IETF) Real-Time Communication in WEB-browsers Working Group. The former is standardizing the interaction with HTML and the latter the underlying protocols. Google open sourced the framework in June 2011 under a royalty-free BSD (Berkeley Software Distribution) style license, after acquiring Global IP Solutions, the company that owned the intellectual property.

The WebRTC framework includes the iLBC (Internet Low Bitrate Codec), iSAC (Internet Speech and Audio Coder), G.711, and G.722 codecs for audio and VP8 for video. Together with the surrounding voice engine, these codecs provide capabilities such as packet loss concealment and echo cancellation, so they can cope robustly with a lack of guaranteed quality of service.

Although still under development, the WebRTC standards aim to provide simple access to a robust, state-of-the-art, real-time voice and video engine placed in the web browser, along with all the transport and security tools required to make it work. However, it is important to highlight that WebRTC is only a media tool and does not define a specific signaling channel. The technology relies on the developer to implement the session management mechanisms that establish and manage real-time communications between two or more parties.
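As a rough illustration of what that leaves to the developer, here is a minimal sketch of an in-memory signaling relay that forwards session messages between registered peers. The `SignalingHub` class, the `SignalMessage` shape and the message kinds are all hypothetical; a real service would run this logic on a server, carry the messages over a transport of its choosing, and add identity and authorization.

```typescript
// Hypothetical message envelope; WebRTC itself does not define one.
type SignalKind = "offer" | "answer" | "candidate" | "bye";

interface SignalMessage {
  kind: SignalKind;
  from: string;
  to: string;
  payload: string; // opaque to the relay, e.g. a session description
}

type Deliver = (message: SignalMessage) => void;

class SignalingHub {
  private readonly peers = new Map<string, Deliver>();

  register(peerId: string, deliver: Deliver): void {
    this.peers.set(peerId, deliver);
  }

  unregister(peerId: string): void {
    this.peers.delete(peerId);
  }

  // Forwards the message to its recipient; returns false if the recipient is unknown.
  send(message: SignalMessage): boolean {
    const deliver = this.peers.get(message.to);
    if (deliver === undefined) {
      return false;
    }
    deliver(message);
    return true;
  }
}

const hub = new SignalingHub();
const aliceInbox: SignalMessage[] = [];
const bobInbox: SignalMessage[] = [];
hub.register("alice", (m) => { aliceInbox.push(m); });
hub.register("bob", (m) => { bobInbox.push(m); });

// Alice calls Bob; Bob answers. The payloads are placeholders.
hub.send({ kind: "offer", from: "alice", to: "bob", payload: "<alice session description>" });
hub.send({ kind: "answer", from: "bob", to: "alice", payload: "<bob session description>" });
const deliveredToUnknown = hub.send({ kind: "bye", from: "alice", to: "carol", payload: "" });
```

The relay never inspects the payload, which mirrors the split described in this post: the application owns session management, while the browser owns the media.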

Traditionally, the roles of signaling and media in real-time communications are fairly straightforward:

  • Media, or bearer, is the channel, stream, or circuit that actually carries the voice or video image across the network;
  • Signaling is separate from media and is responsible for user management, including identity, authorization and authentication, charging, location management, and routing.

Although signaling and media are separate, they are connected at the core of any real-time communication service because the signaling must negotiate and establish the media sessions. WebRTC abstracts signaling by offering a signaling state machine that maps directly to PeerConnection.
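The idea of a signaling state machine can be sketched as a small transition table. This is a simplified model of an offer/answer exchange, not the API from the draft: the state names, event names and `nextState` function are assumptions chosen for illustration, and provisional answers and re-offers are left out.

```typescript
// Hypothetical, simplified offer/answer states.
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer" | "closed";

type SignalingEvent =
  | "set-local-offer"
  | "set-remote-offer"
  | "set-local-answer"
  | "set-remote-answer"
  | "close";

// For each state, the events it accepts and the state they lead to.
const transitions: Record<SignalingState, Partial<Record<SignalingEvent, SignalingState>>> = {
  "stable": {
    "set-local-offer": "have-local-offer",
    "set-remote-offer": "have-remote-offer",
    "close": "closed",
  },
  "have-local-offer": { "set-remote-answer": "stable", "close": "closed" },
  "have-remote-offer": { "set-local-answer": "stable", "close": "closed" },
  "closed": {},
};

function nextState(state: SignalingState, event: SignalingEvent): SignalingState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${event} in state ${state}`);
  }
  return next;
}

// Caller side: create an offer, then apply the remote answer.
let caller: SignalingState = "stable";
caller = nextState(caller, "set-local-offer");
caller = nextState(caller, "set-remote-answer");

// Callee side: apply the remote offer, then create an answer.
let callee: SignalingState = "stable";
callee = nextState(callee, "set-remote-offer");
callee = nextState(callee, "set-local-answer");

console.log(caller, callee);
```

Both sides end up back in the stable state once the exchange completes, regardless of how the offer and answer travelled between them.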

WebRTC also handles a number of practical issues: it includes and abstracts key NAT (Network Address Translator) and firewall traversal technologies such as STUN (Session Traversal Utilities for NAT), ICE (Interactive Connectivity Establishment), TURN (Traversal Using Relays around NAT) and RTP-over-TCP (Real-time Transport Protocol over Transmission Control Protocol), as well as support for proxies.
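To show how these pieces fit together, the sketch below ranks ICE candidates with the priority formula and recommended type preferences from the ICE specification (RFC 5245): direct host addresses first, STUN-discovered (server reflexive) addresses next, and TURN relays last. The `candidatePriority` function, the `Candidate` type and the sample addresses are assumptions for illustration; browsers perform this ranking internally.

```typescript
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Recommended type preferences from RFC 5245, section 4.1.2.2.
const typePreference: Record<CandidateType, number> = {
  host: 126,  // address on a local interface
  prflx: 110, // peer reflexive, learned during connectivity checks
  srflx: 100, // server reflexive, learned from a STUN server
  relay: 0,   // relayed through a TURN server
};

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535, // assumption: a single network interface
  componentId: number = 1          // component 1 is RTP
): number {
  return Math.pow(2, 24) * typePreference[type]
    + Math.pow(2, 8) * localPreference
    + (256 - componentId);
}

// Hypothetical candidate record with documentation-range addresses.
interface Candidate {
  type: CandidateType;
  address: string;
}

const candidates: Candidate[] = [
  { type: "relay", address: "203.0.113.10:3478" },
  { type: "host", address: "192.168.1.20:50000" },
  { type: "srflx", address: "198.51.100.7:50000" },
];

// Highest priority first.
const ranked = [...candidates].sort(
  (a, b) => candidatePriority(b.type) - candidatePriority(a.type)
);

for (const c of ranked) {
  console.log(c.type, c.address, candidatePriority(c.type));
}
```

Multiplication is used instead of bit shifts because the host priority exceeds the range of JavaScript's 32-bit signed shift operators.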

However, a deeper look into the WebRTC landscape shows that there are still a number of technical battlegrounds and topics to debate:

  • Codec choices: particularly VP8 vs. H.264 for video;
  • Current draft WebRTC vs. Microsoft’s proposed CU-RTC-Web vs. other proprietary approaches;
  • The role of WebSockets, PeerConnection, SPDY and other assorted protocols for creating browser or application connections suitable for real-time traffic;
  • Signaling protocols adopted along with WebRTC: SIP, XMPP…;
  • Specification support across major browsers and devices, including when and how.

Both Cisco and Ericsson favor making H.264 a mandatory video codec for WebRTC, as it is widespread on the Internet and on mobile devices and is acknowledged for its good quality and bandwidth efficiency. However, it is patent-encumbered and requires royalty payments. Google’s preferred VP8, on the other hand, is royalty-free but has limited support today.

From an end-to-end solution architecture perspective, several aspects still need to be defined and clarified, such as the roles of WebSockets (a browser-server protocol) and PeerConnection (a browser-browser protocol), as well as the role of SIP (server/gateway-centric). In a browser-to-browser communication scenario there is very little room for communications service providers. Yet without a server-side infrastructure able to connect and inter-work WebRTC applications with other endpoints, WebRTC will be the Internet equivalent of a pair of walkie-talkies blended into applications and web pages.

There is an exciting open space for network vendors to close the gap with SIP/IMS-based platforms that support WebRTC. It will be interesting to see how vendors design their solutions: Application Server vendors will present WebRTC-to-SIP gateways, but SBC vendors might have an edge, as they sit at the network border and can enforce security mechanisms that avoid exposing the network core.