Real-time messaging technology in web applications

In Internet applications, we often need real-time message interaction between the client and the server, for example in the following scenarios.

SNS site user interaction message notification (weibo/twitter)
Real-time scrolling news (breaking news), live text (sports events) on portals
Online chat rooms (online customer service)
Real-time data display (real-time stock prices, real-time commodity prices, real-time server monitoring, etc.)

Next, let’s look at the common technical solutions for implementing real-time messaging in web development. Each has its own advantages and disadvantages, and each suits different application scenarios.

Real-time push technology in the Web domain is also known as Realtime technology. Its goal is to let users get real-time updates without refreshing their browsers. It has a wide range of application scenarios, such as online chat rooms, online customer service systems, commenting systems, WebIM, etc.

Normal HTTP

The HTTP protocol has one defining characteristic: passivity. The server cannot initiate contact with the client; communication can only be initiated by the client. For example, to get some data, the client (such as a browser) has to send a request to the server, and the server returns the query result. HTTP does not let the server actively push information to the client, such as an alert that a new email has arrived. Because requests flow in only one direction, it is very troublesome for the client to learn about continuous state changes on the server.

The client requests a web page from the server
The server processes the request
The server returns the corresponding response to the client

Before the WebSocket protocol, there were three ways to implement two-way communication: polling, long-polling, and iframe streaming.

Common real-time messaging technologies

AJAX Polling (polling)

The principle of polling is very simple: the browser sends a request every few seconds to ask the server whether there is any new information.

The client requests the web page from the server using the normal http method
The client executes a JavaScript polling script in the web page and sends requests to the server in a regular loop (e.g. every 5 seconds) to get information
The server responds to each request and returns the appropriate information, just like a normal http request

The client regularly asks the server whether a new message has been generated. In this case the client has to establish an HTTP connection for each request, and the server has to generate a response message each time.

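An example in layman’s terms:

while True:
 Client: Hey, are you free for dinner? (Request)
 Server: No! (Response)
 Client: Hey, are you free for dinner? (Request)
 Server: No... (Response)
 Client: Hey, are you free for dinner? (Request)
 Server: You are so annoying. No... (Response)
 Client: Hey, are you free for dinner? (Request)
 Server: Fine, fine, I am free. (Response)
 Client: Hey, are you free for dinner? (Request)
 Server: ...N... n... no, I am not. (Response)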

As you can see, with polling the client keeps connecting to the server and asking again every so often. The disadvantage is obvious: there are many connections, each consisting of one request and one response. Every request also carries HTTP headers, which consumes a lot of traffic and CPU time.

Pros: Simple to implement, easy to use, and very cheap to develop; suitable for small applications that are just starting out, or as a fallback for other solutions.
Disadvantages: delayed messages, high network overhead (especially on mobile networks), and a server prone to request peaks.
Implementation: timed requests from JavaScript in the browser, or timed HTTP requests from a native app on mobile devices.
Scenario: suitable for small applications.
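
The polling loop described above can be sketched with Python's standard library. This is only an illustration under assumptions: the `/messages` path, the in-memory `messages` list, and the function names are invented for the example, and a real browser client would use a JavaScript timer instead of `time.sleep`.

```python
import http.client
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

messages = []  # hypothetical in-memory message store, for illustration only


class MessagesHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer at once, whether or not anything is new.
        body = json.dumps(messages).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in the demo


def poll(port, interval, rounds):
    """Ask the server `rounds` times, opening a new connection every time."""
    answers = []
    for _ in range(rounds):
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/messages")
        answers.append(json.loads(conn.getresponse().read()))
        conn.close()
        time.sleep(interval)
    return answers


def run_demo():
    messages.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), MessagesHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        answers = poll(port, interval=0.05, rounds=2)   # two wasted round trips
        messages.append("hello")                        # a message finally arrives
        answers += poll(port, interval=0.05, rounds=1)  # picked up by the next poll
        return answers
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns `[[], [], ['hello']]`: the first two requests come back empty, and the message is only seen on the poll after it was produced, which is where the delay and the wasted traffic come from.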

AJAX Long-Polling (long polling)

Long polling works on a principle similar to polling, but it adopts a blocking model (keep the call going and do not hang up until there is an answer). The client initiates a connection, and if there is no message, the server does not return a response. It returns only once there is a message; the client then establishes a connection again, and so on.

The client requests a web page from the server using the normal http method.
The client executes the JavaScript script in the web page to send data and request information to the server
The server does not respond to the client’s request immediately, but waits for a valid update
The server pushes the data to the client only when the information is a valid update
When the client receives a notification from the server, it immediately sends a new request and moves on to the next poll

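Or, in layman’s terms:

while True:
 Client: Hey, are you free for dinner? If not, get back to me once you are! (Request)
 Server: (Hmm... I am really busy right now. I will not answer yet, but I will stay on the line.)
 Server: I am free now. (Response)
 Client: Hey, are you free for dinner? If not, get back to me once you are! (Request)
 Server: (Hmm... I am really busy right now. I will not answer yet, but I will stay on the line.)
 Server: I am free now. (Response)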

Essentially, long polling is an improved version of polling: the client sends an HTTP request to the server to see whether there is a new message, and if there is none, the request waits. When a new message appears, it is returned to the client. To some extent this reduces the network bandwidth and CPU utilization problems of polling. HTTP packets often carry a large amount of header data (usually more than 400 bytes), while the data actually needed is very small (sometimes only about 10 bytes), so transmitting such packets periodically over the network, as plain polling does, inevitably wastes bandwidth.
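
To make the header overhead concrete, here is a rough back-of-the-envelope calculation. It uses the figures quoted in the text (about 400 bytes of headers, about 10 bytes of payload) and assumes the 5-second interval from the polling example; it counts one set of headers per request and ignores TCP and TLS overhead, so treat the result as an order-of-magnitude estimate.

```python
HEADER_BYTES = 400     # header size quoted in the text
PAYLOAD_BYTES = 10     # useful data quoted in the text
POLL_INTERVAL_S = 5    # assumed polling interval


def polling_overhead(header_bytes, payload_bytes, interval_s, hours):
    """Return (request count, total header bytes, header share of traffic)."""
    requests = int(hours * 3600 / interval_s)
    header_total = requests * header_bytes
    payload_total = requests * payload_bytes
    share = header_total / (header_total + payload_total)
    return requests, header_total, share


requests, header_total, share = polling_overhead(
    HEADER_BYTES, PAYLOAD_BYTES, POLL_INTERVAL_S, hours=1
)
print(requests, header_total, round(share * 100, 1))
```

Under these assumptions a single client makes 720 requests per hour and sends 288,000 bytes of headers, which is roughly 97.6% of the bytes transferred.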

Compared with the polling mode above:

Advantages: messages arrive at the client in a more timely manner, and the waste of constantly creating and closing HTTP requests is reduced. Long polling is supported by most current browsers.

Disadvantages: the server has to maintain a large number of connections, and maintaining HTTP connections is expensive; the connection has to be re-created each time a message is delivered.

Implementation: The client simply sends a request and waits for a response. The server side needs to do two things: one is to maintain a large number of connections (non-blocking I/O); the other is to read the background message updates (by asynchronous timed polling or triggered by events). It is a near-real-time asynchronous approach.

Examples: WebQQ, Hi web version, Facebook IM.
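
A minimal long-polling sketch, again using only Python's standard library. The `/wait` path, the `pending` queue, and the `HOLD_SECONDS` limit are assumptions made for the example. It uses one thread per held request for brevity, whereas a production server would rely on non-blocking I/O as noted above.

```python
import http.client
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

pending = queue.Queue()  # hypothetical source of new messages
HOLD_SECONDS = 5         # assumed upper bound on how long a request is held


class LongPollHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            # Hold the request open until a message exists or the hold time ends.
            body = pending.get(timeout=HOLD_SECONDS).encode("utf-8")
            status = 200
        except queue.Empty:
            body, status = b"", 204  # nothing happened; the client simply asks again
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in the demo


def long_poll(port, wanted):
    """Re-issue the request after every response until `wanted` messages arrive."""
    received, requests_made = [], 0
    while len(received) < wanted:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=HOLD_SECONDS + 5)
        conn.request("GET", "/wait")
        response = conn.getresponse()
        requests_made += 1
        data = response.read()
        if response.status == 200:
            received.append(data.decode("utf-8"))
        conn.close()
    return received, requests_made


def run_demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), LongPollHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Messages show up later; each one releases exactly one held request.
        threading.Timer(0.2, pending.put, args=["first"]).start()
        threading.Timer(0.4, pending.put, args=["second"]).start()
        return long_poll(server.server_address[1], wanted=2)
    finally:
        server.shutdown()
        server.server_close()
```

`run_demo()` returns `(['first', 'second'], 2)`: two messages cost two requests, with no empty responses in between, but a new connection is still needed after every message.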

Streaming-based mode (http-streaming / iframe-streaming)

In this case, the client maintains a persistent connection with the server, and new messages are continuously returned to the client through this connection as they are generated on the server side. This mode is similar to long polling above; the difference is that only one connection needs to be created. Also, note that the HTTP response needs to set the headers Connection: keep-alive and Transfer-Encoding: chunked. Compared with the long-polling mode above:

Advantages: messages can reach the client in real time; only one connection needs to be established between the client and the server.
Disadvantages: The server side also has to maintain a large number of connections, which is a lot of overhead.
Implementation: There are generally two ways on the client side: one is a hidden iframe whose src points to the server-side URL, with the DOM continually rendering the content it receives; the other is to use the XMLHttpRequest object from AJAX. The server side, as with long polling, has to maintain a large number of connections and handle background message updates.
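
The sketch below illustrates the streaming idea with the two headers mentioned above. It is an assumption-laden demo, not a production design: the `/stream` path and handler name are made up, the server stops after three updates so the example terminates (a real stream would stay open), and the chunk framing is written by hand because `http.server` does not do it automatically.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StreamHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked transfer encoding needs HTTP/1.1

    def write_chunk(self, data):
        # One chunk = size in hex, CRLF, the data, CRLF.
        self.wfile.write(b"%X\r\n%s\r\n" % (len(data), data))
        self.wfile.flush()

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Connection", "keep-alive")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for number in range(1, 4):
            self.write_chunk(b"update %d\n" % number)  # pushed on the same connection
            time.sleep(0.05)
        self.wfile.write(b"0\r\n\r\n")  # zero-length chunk ends the demo stream

    def log_message(self, format, *args):
        pass  # silence per-request logging in the demo


def read_stream(port):
    """Issue one request and consume updates as they arrive, line by line."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/stream")
    response = conn.getresponse()
    updates = []
    while True:
        line = response.readline()
        if not line:
            break
        updates.append(line.decode("utf-8").strip())
    conn.close()
    return updates


def run_demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), StreamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return read_stream(server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

`run_demo()` returns `['update 1', 'update 2', 'update 3']`, all delivered over a single request.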

HTML5 Server Sent Events (SSE) / EventSource

Traditionally, the server does not actively push messages to the client; the client usually takes the initiative and asks the server for the latest data. SSE is a technology that lets the server actively push messages.

In essence, SSE is a long-lived HTTP connection. Instead of sending a one-time packet to the client, the server sends a stream in the text/event-stream format, so the client does not close the connection and keeps waiting for the server to send more data. Video playback is an example of this kind of streaming.

Compared with WebSocket, SSE has the following characteristics.

SSE uses the HTTP protocol, which is supported by existing server software. WebSocket is a standalone protocol.
SSE is lightweight and simple to use; the WebSocket protocol is relatively complex.
SSE supports reconnection after a dropped connection by default; WebSocket requires you to implement it yourself.
SSE is generally used to send text only, and binary data has to be encoded before sending; WebSocket supports sending binary data by default.
SSE supports customizing the type of messages sent.

The client requests a web page from the server using the normal http method.
A connection is established between the client and the server by executing the JavaScript script in the web page
An event is sent to the client when there is a valid update on the server side

- Real-time push of data from server to client, which is what you need in most cases
- You need a server that can run an event loop
- Cross-domain connections are only possible with correct CORS settings
- The default delay before reconnecting is about 3 seconds, but it can be adjusted.
- Unlike long polling, Server-Sent Events does not have to close the connection after every response is sent.
- Supported by Chrome 9+, Firefox 6+, Opera 11+, Safari 5+
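
The text/event-stream format itself is plain text, which is easy to show in code. The sketch below formats and parses events using the standard field names (`data`, `event`, `id`, `retry`); the function names are invented for the example, and the parser is deliberately simplified (it ignores `id` and `retry` and assumes `\n` line endings), so it is not a full implementation of what a browser's EventSource does.

```python
def format_event(data, event=None, event_id=None, retry_ms=None):
    """Serialize one event; a blank line marks the end of the event."""
    lines = []
    if retry_ms is not None:
        lines.append("retry: %d" % retry_ms)  # asks the client to adjust its reconnect delay
    if event_id is not None:
        lines.append("id: %s" % event_id)
    if event is not None:
        lines.append("event: %s" % event)     # custom message type
    for part in data.split("\n"):
        lines.append("data: %s" % part)       # multi-line data uses one field per line
    return "\n".join(lines) + "\n\n"


def parse_stream(text):
    """Simplified parser: collects the event type and data of each event."""
    events = []
    current = {"event": "message", "data": []}
    for line in text.split("\n"):
        if line == "":
            if current["data"]:
                events.append(
                    {"event": current["event"], "data": "\n".join(current["data"])}
                )
            current = {"event": "message", "data": []}
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data":
            current["data"].append(value)
        elif field == "event":
            current["event"] = value
    return events


stream = format_event("price 42", event="tick") + format_event("line one\nline two")
print(parse_stream(stream))
```

An event without an `event` field is treated as type `message`, which is why the second event in the example is reported with that type.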

HTML5 WebSockets

WebSocket was born in 2008, became an international standard in 2011, and is now supported by all browsers (see Quick Start for Beginners: A Concise Tutorial on WebSocket for more details). It is a new application-layer protocol, a true full-duplex communication protocol designed specifically for web clients and servers. Comparing it with the HTTP protocol is a good way to understand it.

Their differences:

The protocol identifier is http for HTTP and ws for WebSocket.
HTTP requests can only be initiated by the client, and the server cannot actively push messages to the client, while WebSocket can.
HTTP requests are subject to same-origin restrictions, and communication between different origins requires cross-domain techniques, while WebSocket has no same-origin restrictions.

Their similarities:

Both are application-layer communication protocols.
The default ports are the same: 80 or 443.
Both can be used for communication between browsers and servers.
Both are based on the TCP protocol.

Diagram of the relationship between the two and TCP.

The client requests the web page from the server using the normal http method
The client executes the JavaScript script in the web page and establishes a connection with the server
The server and the client can send valid data to each other in both directions
- The server can send data to the client in real time, while the client can also send data to the server in real time
- You need a server that can run an event loop
- WebSocket connections can be established across domains
- Third-party hosted WebSocket servers, such as Pusher, are also available. With these you only need to care about the client implementation, which reduces the development effort.

Benefits: true real-time communication with two-way interaction, saving server resources and bandwidth.
Disadvantages: Insufficient support in older browsers.
Implementation: the client side relies on the HTML5 WebSocket API; on the server side, support is generally available in web servers.
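
Because WebSocket starts as an HTTP request and then switches protocols, the relationship to HTTP is easiest to see in the opening handshake. The sketch below computes the `Sec-WebSocket-Accept` value as defined in RFC 6455 and encodes a small unmasked text frame of the kind a server sends. The function names are made up for the example, and the frame encoder is intentionally limited to payloads of at most 125 bytes; a real application would use an existing WebSocket library rather than this code.

```python
import base64
import hashlib

WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed value from RFC 6455


def accept_key(sec_websocket_key):
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key header."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key):
    """The HTTP response that switches the connection to the WebSocket protocol."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        "Sec-WebSocket-Accept: %s\r\n\r\n" % accept_key(sec_websocket_key)
    )


def encode_text_frame(text):
    """Encode a short server-to-client text frame (FIN set, opcode 0x1, unmasked)."""
    payload = text.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("this sketch only handles payloads up to 125 bytes")
    return bytes([0x81, len(payload)]) + payload


print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
print(encode_text_frame("Hello"))
```

With the sample key from RFC 6455 the accept value is `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`, and the frame for "Hello" is only two bytes longer than its payload, which is where the bandwidth saving over repeated HTTP headers comes from.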

While the WebSocket protocol was still under development, browser support was uneven: Chrome and Safari supported WebSocket by default, Firefox and Opera turned it off by default for security reasons, and IE did not support it (up to and including IE 9). The latest version of the [WebSocket protocol](http://dev.w3.org/html5/websockets/) at that time was “Draft 76”.

Flash Socket

Embed in the page a Flash program that uses the Socket class. JavaScript communicates with the socket interface on the server side by calling the socket interface this Flash program provides, and controls the display of the page after receiving the information transmitted from the server.

Pros: Real instant communication rather than pseudo-instant communication.
Disadvantages: the Flash plug-in must be installed on the client side; the protocol is not HTTP and cannot automatically traverse firewalls. (Flash is dead)
Example: Web-based interactive games.

How to choose a solution?

In practical applications, how should we choose among the various schemes?

The polling mechanism is very simple and easy to use. It uses only short HTTP connections and is consistent with ordinary interface APIs in the application architecture, so it is well suited to small and newly launched applications and saves a lot of development cost.

When the number of users rises to a certain level (for example, a million daily active users) and the product becomes more popular, we have higher requirements for timely message delivery, and the polling mechanism puts the server under considerable pressure. At that point we can consider long polling or HTTP streaming. The two are similar in their server-side implementation: both mainly need to maintain a large number of long-lived connections and fetch messages asynchronously (or on an event trigger).

For keeping connections open, Java’s NIO has made things convenient, and mainstream application servers such as Tomcat and Jetty support it. However, for a server running on the JVM, garbage collection becomes a very serious problem when it holds a large number of socket connections. An alternative is to use an nginx push module to maintain connections and push messages in real time.

Module that supports long polling mode: https://pushmodule.slact.net/
Module that supports streaming mode: http://wiki.nginx.org/HttpPushStreamModule

These two modules run on nginx to maintain a large number of HTTP connections, and implement a pub/sub protocol to support message sending. The basic process is as follows: the sender (publisher) pushes a message to the nginx server, and the push module is then responsible for delivering it to the subscribers (clients). With the help of such an nginx push module, we save a lot of work: the application only needs to focus on business logic (that is, what the publisher does), which simplifies the application architecture, while nginx’s high performance takes care of the connections.
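
The publisher/subscriber flow can be illustrated with a tiny in-process broker. This is a conceptual sketch only: the `Broker` class and the channel names are hypothetical and have nothing to do with the actual interfaces of the nginx modules, which expose publish and subscribe endpoints over HTTP.

```python
import queue
from collections import defaultdict


class Broker:
    """Hypothetical stand-in for the push module: routes messages by channel."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel):
        # Each subscriber (client connection) gets its own inbox.
        inbox = queue.Queue()
        self._subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        # The publisher (application) hands the message over once;
        # the broker fans it out to every subscriber of the channel.
        for inbox in self._subscribers[channel]:
            inbox.put(message)
        return len(self._subscribers[channel])


broker = Broker()
alice = broker.subscribe("news")
bob = broker.subscribe("news")
carol = broker.subscribe("sports")

delivered = broker.publish("news", "breaking story")
print(delivered, alice.get_nowait(), bob.get_nowait(), carol.empty())
```

The application code only calls `publish`; holding the subscriber connections open is the broker's job, which is the division of labor the text describes.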

When our product grows to 10 million daily active users, we may have to consider better solutions, such as the WebSocket mentioned above, although browser compatibility issues keep it from being the mainstream choice. In that case we generally turn to long-lived TCP socket connections, using a message-push protocol (XMPP, MQTT, a self-defined protocol, etc.) to achieve real-time interaction between the client and the server.

Socket.IO

Socket.IO is a cross-platform, open source framework for real-time communication. It is implemented entirely in JavaScript, is based on Node.js, supports the WebSocket protocol, and includes client-side JavaScript and server-side Node.js. Socket.IO aims to make real-time applications possible on every browser and mobile device, blurring the differences between transport mechanisms. It gets its name from the fact that it uses the HTML5 WebSocket standard supported and adopted by browsers. Since not all browsers support WebSocket, the library supports a number of fallback mechanisms:

Websocket
Adobe® Flash® Socket
AJAX long polling
AJAX multipart streaming
Forever Iframe
JSONP Polling

In most contexts, you can use these mechanisms to maintain a similar long-lived connection to the browser.
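
The fallback idea can be sketched as picking the first transport in a preference list that the client supports. The function and the labels below are illustrative only (the labels are taken from the list above) and do not reflect Socket.IO's real negotiation code or option names.

```python
# Preference order taken from the list in the text; labels are illustrative.
TRANSPORTS = [
    "WebSocket",
    "Adobe Flash Socket",
    "AJAX long polling",
    "AJAX multipart streaming",
    "Forever Iframe",
    "JSONP Polling",
]


def choose_transport(client_supports, preference=TRANSPORTS):
    """Return the most preferred transport that the client also supports."""
    for name in preference:
        if name in client_supports:
            return name
    raise LookupError("client and server share no transport")


# A hypothetical old browser without WebSocket or Flash falls back to long polling.
print(choose_transport({"AJAX long polling", "JSONP Polling"}))
```

Application code written against such a library does not need to know which transport was picked, which is what "blurring the differences between transport mechanisms" amounts to.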

Socket.IO is designed to build real-time applications that work well on different browsers and mobile devices, such as real-time analytics systems, binary stream data processing applications, online chat rooms, online customer service systems, commenting systems, WebIM, etc. Socket.IO already has many powerful modules and extension APIs, such as session.socket.io (HTTP session middleware for session-related operations), socket.io-cookie (cookie parsing middleware), session-web-sockets (a secure way of passing sessions), socket-logger (a JSON-format logging tool), websocket.MQ (a reliable message queue), socket.io-mongo (an adapter for MongoDB), socket.io-redis (an adapter for Redis), socket.io-parser (the default protocol implementation module for server-side and client-side communication), etc.

Socket.IO implements a real-time, bi-directional, event-based communication mechanism, which solves the problem of real-time communication and unifies the way the server and the client are programmed. Once a socket is started, it is like a pipeline between the client and the server through which both sides can communicate with each other. It also works well alongside the traditional request methods provided by Express.js, i.e., it provides two types of connections on the same domain and port: request/response and websocket (flashsocket, ajax…).
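
The event-based, two-way "pipeline" can be imitated in a few lines. This is an in-process sketch under assumptions: the `Endpoint` class and `connect_pair` helper are invented for the example, the `on` and `emit` method names merely mirror the style of Socket.IO's API, and no network is involved.

```python
class Endpoint:
    """One end of a hypothetical in-process pipe with event handlers."""

    def __init__(self):
        self._handlers = {}
        self.peer = None

    def on(self, event, handler):
        # Register a callback for a named event arriving from the other side.
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event, *args):
        # Deliver the event to the handlers registered on the other end.
        for handler in self.peer._handlers.get(event, []):
            handler(*args)


def connect_pair():
    """Create two endpoints wired to each other, like client and server."""
    client, server = Endpoint(), Endpoint()
    client.peer, server.peer = server, client
    return client, server


client, server = connect_pair()
received = []
server.on("chat", lambda text: server.emit("echo", text.upper()))  # server replies
client.on("echo", received.append)                                 # client listens
client.emit("chat", "hi")
print(received)
```

Both sides use the same two calls, which is the sense in which the programming model is unified between client and server.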
