A quick WebRTC tutorial to build secure P2P communications on our own!

Published May 31, 2022. Last updated Jun 06, 2022.

Background preparations for building a WebRTC app

Web Real-Time Communication (WebRTC) began in 2011, with an early implementation by Ericsson and an open-source release by Google, and quickly became an open standard. It is supported by Google, Opera and Mozilla, among others, and lets browsers, mobile platforms and the Web of Things (WoT) implement peer-to-peer data, video and audio communications without plugins. As of 2021, WebRTC 1.0 is the standard in use. It mandates encryption and specifies three JavaScript APIs that let browsers access local media devices and transmit media and generic application data over real-time protocols.

[Image: WebRTC signaling]

Many web services already offer real-time communication, but they often depend on downloads or plugins (Facebook, for example, relied on Skype for audio and video P2P communications), which are complicated and error-prone.

Hence, free-to-use voice and video communication that needs no special plugins is preferred.

In this blog, we will look at building a simple audio/video chat web application using an HTML5 web UI, a NodeJS WebSockets signaling server and client JavaScript code.

How does it work?

Before you dive into the tutorial, here are two important resources: “The Dawn of WebRTC”, for understanding WebRTC technologies, and “An Introduction to the getUserMedia API”, for peer-to-peer communications.

Principles of WebRTC P2P:

A WebRTC application needs to:

  • Obtain a media stream (audio, video or data)
  • Collect IP addresses and ports and exchange them with other clients
  • Issue ‘signals’ to report errors and to start and close sessions
  • Exchange information about the media in transmission, such as codecs and resolution
  • Stream the audio, video and data

To achieve the above functionalities, the following three APIs are used in WebRTC:

1) MediaStream

Allows the web browser to access the stream from microphones or webcams.

  • A MediaStream is a synchronized stream of media in which every stream has its own input and output. The getUserMedia call that requests it takes three parameters:
    • Constraints object
    • Success callback method
    • Failure callback method
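
As a minimal sketch of the call described above: current browsers expose a promise-based navigator.mediaDevices.getUserMedia(constraints), where the success and failure callbacks map onto the resolve and reject paths. The helper name getLocalStream and the mediaDevices parameter (passed in so the function can also be exercised outside a browser) are assumptions made for illustration and are not taken from the tutorial's code.

```javascript
// Constraints object: which kinds of media to request.
const defaultConstraints = { audio: true, video: true };

// Hypothetical helper: requests a MediaStream and routes the result to
// success/failure callbacks, mirroring the three-parameter shape above.
// In a browser, pass navigator.mediaDevices as the first argument.
function getLocalStream(mediaDevices, constraints, onSuccess, onFailure) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== "function") {
    // No capture API available: report it through the failure callback.
    onFailure(new Error("getUserMedia is not available"));
    return Promise.resolve();
  }
  return mediaDevices.getUserMedia(constraints).then(onSuccess, onFailure);
}

// Browser usage (assumes a <video id="localVideo"> element exists):
// getLocalStream(navigator.mediaDevices, defaultConstraints,
//   (stream) => { document.getElementById("localVideo").srcObject = stream; },
//   (err) => console.error("Could not get media:", err));
```

Passing the constraints { audio: true, video: false } instead would request a microphone-only stream.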

2) RTCPeerConnection

Enables encrypted audio or video data transfer, with bandwidth management, between the local device and a remote peer.

The caller and the callee each have to set up their own RTCPeerConnection instance. The RTCPeerConnection onaddstream event callback is used to attach the incoming audio/video stream to an HTML video element.

The initiator of the call (the caller) has to create an offer and send it to the callee through a signaling service (such as a NodeJS server application using WebSockets).

The callee receives the offer and ‘answers’ the call by creating an answer and sending it back to the caller.

The setLocalDescription method can take three parameters: a session description, a success callback method and an error callback method. It changes the local description associated with the connection. A description defines the properties of the connection, including the codec.
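
The caller's side of this exchange might look roughly like the sketch below. It uses the promise-based forms of createOffer and setLocalDescription that current browsers provide in place of the callback parameters. The sendSignal callback and the { sdp: ... } message shape are assumptions for illustration; any format works as long as both peers agree on it.

```javascript
// Hypothetical helper: creates an offer on an existing RTCPeerConnection,
// stores it as the local description, and hands it to a signaling function.
// `sendSignal` is an assumed callback, for example
//   (msg) => socket.send(JSON.stringify(msg))
async function createAndSendOffer(peerConn, sendSignal) {
  const offer = await peerConn.createOffer();
  // Changes the local description associated with the connection.
  await peerConn.setLocalDescription(offer);
  // Assumed message shape: the session description under an `sdp` key.
  sendSignal({ sdp: peerConn.localDescription });
  return offer;
}
```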

3) RTCPeerConnection and Servers

Servers needed by a real application are:

  • A server for managing users
  • A server for exchanging information between peers (signaling)
  • A server for exchanging data about the media, such as video resolution
  • Servers for traversing NAT gateways and firewalls

The ICE framework uses the STUN protocol and its extension TURN to enable RTCPeerConnection to handle NAT traversal and other details, using UDP to connect two video chat clients with minimum latency. A STUN server therefore has one job: to find the public address and port of a peer behind the NAT. Two commonly used STUN servers, run by Google and Mozilla, can be used to obtain ICE candidates to send to other peers.
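
The ICE servers are handed to RTCPeerConnection through a configuration object. Here is a small sketch of such a configuration. The STUN URL is Google's widely used public server and should be treated as an assumption, since public servers can change or disappear. The TURN URL and credentials in the usage comment are placeholders, and withTurnServer is a hypothetical helper.

```javascript
// Configuration for RTCPeerConnection: a list of ICE servers.
// Assumed public STUN server; substitute your own in production.
const peerConnCfg = {
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
};

// Hypothetical helper: returns a copy of the config with a TURN relay added.
// Unlike STUN, a TURN server relays media and normally needs credentials.
function withTurnServer(cfg, url, username, credential) {
  return {
    ...cfg,
    iceServers: [...cfg.iceServers, { urls: url, username, credential }],
  };
}

// Browser usage (placeholder TURN details):
// const cfg = withTurnServer(peerConnCfg, "turn:turn.example.org:3478", "user", "secret");
// const peerConn = new RTCPeerConnection(cfg);
```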

[Image: peers and STUN servers]

Thus, in theory, a simple WebRTC application can be built without any server components for signaling, but it will have no practical value because the data is only shared with the same peer.

4) RTCDataChannel

Enables P2P communication for generic data.
The RTCDataChannel interface represents a bi-directional data channel between the peers of a connection. Objects of this type can be created with RTCPeerConnection.createDataChannel() or received through the datachannel event on an existing RTCPeerConnection.
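
Both ways of obtaining a channel can be sketched as follows. The helper names, the "hello" greeting and the onMessage callback are assumptions for illustration. The peer connection is passed in, so nothing here depends on a particular signaling setup.

```javascript
// Hypothetical helper for the side that creates the channel.
// `label` is an arbitrary, human-readable channel name.
function openChatChannel(peerConn, label, onMessage) {
  const channel = peerConn.createDataChannel(label);
  // Data can only be sent once the channel has opened.
  channel.onopen = () => channel.send("hello");
  channel.onmessage = (event) => onMessage(event.data);
  return channel;
}

// Hypothetical helper for the remote side: the channel created by the
// other peer arrives through the datachannel event.
function acceptChatChannel(peerConn, onMessage) {
  peerConn.ondatachannel = (event) => {
    event.channel.onmessage = (msg) => onMessage(msg.data);
  };
}
```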

How to build a Simple Audio/Video-Chat Web Application?

In this section, we will learn to build a web application that performs a video call between two peers and displays the local and remote videos. A real application would have to deal with further complexities, such as managing users and capturing errors. In this tutorial, however, we keep the application simple and do not consider error handling.

We assume that the peers are in remote locations, use Google Chrome or Firefox, and can access the web application URL over a 3G or DSL internet connection. One of the users initiates the video call with the ‘video call’ button, both users' browsers are allowed to access the webcam and microphone, and the call is terminated with the ‘end call’ button.

The HTML5 Web UI

Only a handful of relevant elements are defined in the HTML5 code:

  • Two video elements for displaying the remote and local videos
  • Two input elements that create the ‘video call’ and ‘end call’ buttons
  • A script element at the end of the code that registers a load event listener, which runs once the page is fully loaded
  • A script element that includes the relevant client code

The NodeJS WebSockets-based Signaling Server

This server receives messages from one client and broadcasts them to all the others. The messages carry the signaling information that the peers need to initiate the P2P connection. For this purpose we use WebSockets, an API built into modern browsers; on the server side, NodeJS needs the ws module.
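
A rough sketch of such a relay is shown below. The broadcast logic is kept in a plain function, and the server class is passed in as a parameter so that the file loads even where the ws module is not installed. The function names and port 3434 are assumptions for illustration. The wiring assumes the ws version 8 API (a WebSocketServer export, a clients set, and a message event that receives the data and an isBinary flag).

```javascript
// readyState value of an open socket, in the ws module and in browsers.
const OPEN = 1;

// Sends `data` to every open client except the sender.
// Returns the number of clients the message was delivered to.
function broadcast(clients, sender, data) {
  let delivered = 0;
  for (const client of clients) {
    if (client !== sender && client.readyState === OPEN) {
      client.send(data);
      delivered += 1;
    }
  }
  return delivered;
}

// Hypothetical wiring into a ws server. In a real server:
//   const { WebSocketServer } = require("ws");
//   startSignalingServer(WebSocketServer, 3434);
function startSignalingServer(WebSocketServer, port) {
  const wss = new WebSocketServer({ port });
  wss.on("connection", (socket) => {
    socket.on("message", (data, isBinary) => {
      // ws hands text frames over as Buffers; convert so peers receive text.
      broadcast(wss.clients, socket, isBinary ? data : data.toString());
    });
  });
  return wss;
}
```

Because the server only relays messages and never inspects them, the same code carries offers, answers and ICE candidates alike.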

The Client JavaScript Code

This section focuses on the contents of the webrtc.js file. The file defines the global variables: a new WebSocket connection to the signaling server, and the configuration parameters used to initiate a new RTCPeerConnection with Mozilla's STUN service.

A reference to the local video stream is kept so that it can be released when the call ends. A callback method is assigned to the page's load event.

At this stage, browser compatibility is checked by testing for the existence of the getUserMedia method on the global navigator object, which guards against differences between browsers. If no such method is found, no calls can be initiated and an error message is shown. If WebRTC is supported, calls can be initiated and an event listener is assigned to the ‘end call’ button.
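
Such a compatibility check could be sketched as below. The helper name supportsWebRTC is hypothetical. The prefixed method names are the legacy ones older Chrome and Firefox versions used, and the global object is passed in (window in a browser) so the check can also be run against a stand-in object.

```javascript
// Hypothetical helper: reports whether the given global object exposes
// the APIs the app needs. In a browser, call supportsWebRTC(window).
function supportsWebRTC(globalObj) {
  const nav = globalObj.navigator;
  const hasGetUserMedia = Boolean(
    nav &&
      ((nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === "function") ||
        typeof nav.getUserMedia === "function" ||
        typeof nav.webkitGetUserMedia === "function" ||
        typeof nav.mozGetUserMedia === "function")
  );
  const hasPeerConnection = typeof globalObj.RTCPeerConnection === "function";
  return hasGetUserMedia && hasPeerConnection;
}

// Browser usage:
// if (!supportsWebRTC(window)) {
//   alert("Sorry, your browser does not support WebRTC!");
// }
```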

Finally, the Session Description Protocol (SDP) is used to signal the exchange of media configuration information.

1) Initiating a Call

Several steps lead to the call being initiated. First, the local video stream is obtained and assigned to a video element so that it is displayed on the page. Second, a connection offer is created and sent to the other peer with the createAndSendOffer method. The call-initiation code is also responsible for creating the RTCPeerConnection instance and assigning its event callbacks.

These callbacks forward ICE candidates to the signaling server, which sends them on to the other peers, and assign a received remote stream to the remote video element.
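
Creating the connection and wiring up those two callbacks might look like the sketch below. It uses ontrack, which current browsers offer in place of onaddstream. The helper name, the sendSignal callback and the { candidate: ... } message shape are assumptions. The constructor is passed in as a parameter, so in a browser you would pass RTCPeerConnection itself.

```javascript
// Hypothetical helper: creates a peer connection and assigns its callbacks.
// Browser usage:
//   createPeerConnection(RTCPeerConnection, cfg, sendSignal, remoteVideoElement)
function createPeerConnection(PeerConnection, config, sendSignal, remoteVideo) {
  const peerConn = new PeerConnection(config);

  // Forward each local ICE candidate to the other peer via signaling.
  peerConn.onicecandidate = (event) => {
    // A null candidate marks the end of gathering; nothing to forward.
    if (event.candidate) sendSignal({ candidate: event.candidate });
  };

  // Show the remote stream as soon as its first track arrives.
  peerConn.ontrack = (event) => {
    remoteVideo.srcObject = event.streams[0];
  };

  return peerConn;
}
```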

2) Answering a Call

As with call initiation, an RTCPeerConnection is created and the required event listeners are set up. The local stream is again obtained and assigned to the local video element. The answer is then prepared and sent with the createAndSendAnswer method: the WebSocket channel delivers it to the signaling server, which completes the loop by forwarding it to the other peer.
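
The answering side can be sketched as follows, together with a small dispatcher for incoming signaling messages. The handleSignal name, the sendSignal callback and the { sdp } / { candidate } message shapes are assumptions for illustration. The RTCPeerConnection methods are the standard promise-based ones.

```javascript
// Applies the received offer, creates an answer, stores it as the local
// description and sends it back through the (assumed) signaling callback.
async function createAndSendAnswer(peerConn, offer, sendSignal) {
  await peerConn.setRemoteDescription(offer);
  const answer = await peerConn.createAnswer();
  await peerConn.setLocalDescription(answer);
  sendSignal({ sdp: peerConn.localDescription });
  return answer;
}

// Hypothetical dispatcher for messages arriving from the signaling server.
async function handleSignal(peerConn, message, sendSignal) {
  if (message.sdp && message.sdp.type === "offer") {
    // We are being called: answer the offer.
    await createAndSendAnswer(peerConn, message.sdp, sendSignal);
  } else if (message.sdp) {
    // We are the caller and this is the callee's answer.
    await peerConn.setRemoteDescription(message.sdp);
  } else if (message.candidate) {
    await peerConn.addIceCandidate(message.candidate);
  }
}

// Browser usage (assumes `socket` is the signaling WebSocket):
// socket.onmessage = (evt) =>
//   handleSignal(peerConn, JSON.parse(evt.data), (m) => socket.send(JSON.stringify(m)));
```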

3) Ending a Call

In theory, a WebRTC call is ended by closing the peer connection with peerConn.close(). The assigned callback method then checks that the connection state is “closed”. But this approach has two problems. First, the process may fail in Google Chrome and Firefox. Second, a closed connection state can also arise erroneously when there is an internet outage or the connection between the peers breaks. Hence, the signaling server is critical to ensure that both peers receive the ‘real end call’ request.

Thus, after the RTCPeerConnection is closed, the stream sources of the remote and local video elements are reset. Finally, the ‘video call’ button is enabled so that a new call can be started, and the ‘end call’ button is disabled.
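
A sketch of that teardown is shown below. The state and ui objects, the function names and the { closeConnection: true } message are all assumptions for illustration. The point is that the request travels through the signaling server, and both peers then run the same local cleanup.

```javascript
// Local cleanup, run on both peers. `state` holds peerConn and localStream;
// `ui` holds the two video elements and the two buttons (assumed shapes).
function endCall(state, ui) {
  if (state.peerConn) {
    state.peerConn.close();
    state.peerConn = null;
  }
  if (state.localStream) {
    // Stopping the tracks releases the webcam and microphone.
    state.localStream.getTracks().forEach((track) => track.stop());
    state.localStream = null;
  }
  ui.localVideo.srcObject = null;
  ui.remoteVideo.srcObject = null;
  ui.videoCallButton.disabled = false;
  ui.endCallButton.disabled = true;
}

// Run when the local user presses 'end call': tell the other peer through
// the signaling server first, then clean up locally.
function requestEndCall(state, ui, sendSignal) {
  sendSignal({ closeConnection: true });
  endCall(state, ui);
}
```

The receiving peer would call endCall(state, ui) when a message with closeConnection set arrives from the signaling server.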

Conclusion

In conclusion, the above tutorial is a basic guide to writing your own video chat app using WebRTC technologies for secure P2P communications. For further features, you may need to go deeper into the workings under the hood, which takes additional time. Alternatively, you can use the pre-built platforms offered by leading vendors, which package these features efficiently and at low cost for complete WebRTC-led services!

Discover and read more posts from Parthibakumar Murugesan