Byte Order

Discussions of byte order usually involve two major CPU families: the PowerPC series (developed by Apple, IBM, and Motorola) and Intel’s x86 series. PowerPC systems conventionally store data in Big Endian order, while the x86 series stores data in Little Endian order. So what exactly is Big Endian and what is Little Endian?

Big Endian means the most significant byte is stored at the lowest address, while Little Endian means the least significant byte is stored at the lowest address.

That definition is rather abstract, so here is an example that stores the 32-bit value 0x12345678.

Big Endian

Low Address              High Address
------------------------------------>
+--------+--------+--------+--------+
|   12   |   34   |   56   |   78   |
+--------+--------+--------+--------+

Little Endian

Low Address              High Address
------------------------------------>
+--------+--------+--------+--------+
|   78   |   56   |   34   |   12   |
+--------+--------+--------+--------+

From the two figures above, we can see that the Big Endian way of storing data matches the way people write numbers, most significant digit first, while the Little Endian…
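The two layouts can be reproduced in a few lines of Python. This is only a minimal sketch built on the standard int.to_bytes method; the variable names are illustrative and index 0 of each result stands for the lowest address.

```python
value = 0x12345678

# Serialize the same 32-bit value in both byte orders.
# Index 0 of each bytes object corresponds to the lowest address.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12
```

Reading the bytes back with int.from_bytes and the matching byteorder argument recovers the original value in both cases.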

Why should you pay attention to byte order? If your program runs only in a standalone environment and never exchanges data with other people’s programs, you can ignore byte order completely. But what if your program has to interact with other people’s programs? Two languages make a good example. Programs written in C/C++ store data in the byte order of the CPU they run on, while Java’s standard I/O classes (such as DataInputStream and DataOutputStream) always read and write data in Big Endian order, regardless of the platform.

Imagine that you write a C/C++ program on an x86 platform that interoperates with someone else’s Java program. Take the 0x12345678 example above. Your program sends the four bytes of 0x12345678 to the Java program exactly as they sit in memory: 78 56 34 12. Since Java reads them in Big Endian order, it interprets them as 0x78563412. What? It turns into another number? Yes, that is exactly what happens. Therefore, your C program must convert its data to the expected byte order before passing it to the Java program.
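The mix-up can be simulated without either a C or a Java program. The sketch below uses Python's struct module to stand in for both sides: "<I" plays the role of an x86 program dumping a 32-bit integer as raw bytes, and ">I" plays the role of a Big Endian reader. The variable names are made up for illustration.

```python
import struct

value = 0x12345678

# What an x86 program would emit if it wrote the integer's raw bytes
# ("<I" = little-endian unsigned 32-bit).
raw = struct.pack("<I", value)

# What a big-endian reader makes of those same four bytes.
misread = struct.unpack(">I", raw)[0]
print(hex(misread))  # 0x78563412

# The fix: convert to the agreed order before sending.
fixed = struct.pack(">I", value)
received = struct.unpack(">I", fixed)[0]
print(hex(received))  # 0x12345678
```

The important design point is that both sides agree on one order for the exchanged bytes; which side performs the conversion is a matter of convention.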

It is no coincidence that the Internet protocols (the TCP/IP suite) also transmit multi-byte fields in Big Endian order. For this reason, Big Endian is often called network byte order. When two hosts with different byte orders communicate, each must convert its data to network byte order before sending and back to its own host order after receiving.
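As a rough illustration of host-to-network conversion, Python exposes the classic htonl/ntohl functions in its socket module, and the struct format prefix "!" always means network order. The sketch below assumes a 32-bit unsigned value; what htonl returns depends on the host, so no fixed output is shown for it.

```python
import socket
import struct
import sys

value = 0x12345678

# "!" packs in network (big-endian) order on every host.
wire = struct.pack("!I", value)
print(wire.hex(" "))  # 12 34 56 78

# htonl converts a 32-bit integer from host order to network order.
# On a big-endian host it returns the value unchanged.
net = socket.htonl(value)
print(sys.byteorder, hex(net))

# ntohl is the inverse, so a round trip restores the original value.
roundtrip = socket.ntohl(net)
```

Packing with "!" is usually the simpler choice when building a message, because the result is the same on every machine.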

To summarize: in Big Endian, the most significant byte sits at the lowest address and the least significant byte at the highest address, so the bytes appear in memory in the order we write them. In Little Endian, the least significant byte sits at the lowest address and the most significant byte at the highest address, so the bytes appear in reverse order.

Endianness describes how logical units are arranged onto physical units when the smallest physical unit is smaller than the smallest logical unit. In memory, the smallest physical unit we normally deal with is the byte; in communications it is often the bit, but the principle is the same.

For example, if we write 0x1234abcd to memory starting at address 0x0000, the result is as follows.

Address   Big Endian   Little Endian
0x0000    0x12         0xcd
0x0001    0x34         0xab
0x0002    0xab         0x34
0x0003    0xcd         0x12
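To see which column your own machine matches, one option is to pack the value in the host's native order and look at the first byte. This is a small sketch using struct's "=" prefix (native byte order, standard sizes); the offsets printed are positions within the packed buffer, not real memory addresses.

```python
import struct
import sys

value = 0x1234ABCD

# "=" packs in the host's native byte order with standard sizes.
native = struct.pack("=I", value)

# Print one line per offset, in the same layout as the table above.
for offset, byte in enumerate(native):
    print(f"0x{offset:04x} 0x{byte:02x}")

# The first byte tells the two layouts apart:
# 0x12 means big-endian, 0xcd means little-endian.
detected = "big" if native[0] == 0x12 else "little"
print(detected, sys.byteorder)
```

In everyday code sys.byteorder already gives the answer directly; the manual check is shown only to connect it to the table.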

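Currently, Little Endian is the mainstream. One commonly cited advantage is that the address of a value does not change when it is converted to a narrower data type (especially through pointer conversions): the low-order byte always sits at the lowest address.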
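The pointer-conversion point can be imitated in Python by reading a narrower type from offset 0 of a packed buffer, which is roughly what a C cast of the pointer to a smaller integer type does. This is a simulation under that assumption, not actual pointer code, and the value 0x42 is chosen only because it fits in one byte.

```python
import struct

small = 0x42  # small enough to fit in a single byte

le = struct.pack("<I", small)  # bytes: 42 00 00 00
be = struct.pack(">I", small)  # bytes: 00 00 00 42

# Reinterpret the SAME starting offset (0) as a narrower type.
le_as_u8 = struct.unpack_from("<B", le, 0)[0]
le_as_u16 = struct.unpack_from("<H", le, 0)[0]
be_as_u8 = struct.unpack_from(">B", be, 0)[0]
be_as_u16 = struct.unpack_from(">H", be, 0)[0]

print(le_as_u8, le_as_u16)  # 66 66
print(be_as_u8, be_as_u16)  # 0 0
```

In the Big Endian buffer the narrow reads would have to start at offset 3 (one byte) or offset 2 (two bytes) to recover the value, which is the address adjustment that Little Endian avoids.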

Origin of the terms Big Endian & Little Endian

The word “endian” comes from Jonathan Swift’s Gulliver’s Travels. The civil war in Lilliput, the kingdom of the little people, stems from the question of whether to crack an egg at the big end (“Big-Endian”) or the little end (“Little-Endian”) before eating it. This led to six rebellions, in which one emperor died and another lost his throne. Swift was satirizing the ongoing conflict between England and France. Danny Cohen, an early pioneer of network protocols, first used these two terms to refer to byte order, and they became widely accepted.