FAT file system knowledge overview

A file is, in fact, data. Data is represented in the computer as 0/1, and the most basic unit is the bit. 8 bit = 1 Byte, 1024 Byte = 1 KB, 1024 KB = 1 MB, and so on. The content of a file is also a combination of several 01 strings. When reading/writing a file, we call the functions read()/write() in the kernel, which take the descriptor of the file and then read/write the data of the specified length. All data is also in the form of 0/1, but when we apply these functions, the data is converted to a more advanced representation, such as char, int or other types.

What is a disk

Now that hard disks have reached terabytes of capacity, what is their physical implementation and how do they work? I’ll briefly take a look at the details at How a hard disk work?, which explains very succinctly how it works.

The most important thing about a disk is the sector, which has a ring of magnetic tracks on it, and these tracks store information. How to read and write it? At the physical level, this is achieved by changing the polarity of each storage unit in the track with a magnetic head.

The physical storage unit in a disk is called a sector, and a storage unit in a file system is called a block (cluster in FAT systems), with each cluster corresponding to one or more sectors. A more detailed explanation is available on the wiki Disk Sector

FAT32 file storage

When you normally manipulate files, for example, you open a doc file, add some content and then save it, or delete a file to the recycle bin, how are they implemented internally? Different file systems have different ways of implementing them. But all operations cannot be performed without storage as a basis. The question arises: how to design a file system that can both read and write files efficiently and locate them quickly?

Let’s look at the most primitive idea: direct sequential addition, that is, adding files one by one to the storage space (hard disk). However, such an implementation is not conducive to either finding or deleting, adding and modifying. Think about it, if some files are deleted, gaps will be created and when files are added again, the separate gaps may not be enough to accommodate the new files, thus creating waste. And whenever you look for a particular file, you need to traverse all the file structures, and this is going to take quite a long time.

Let’s look at the way FAT32 is implemented: it divides the storage space into small blocks (clusters), and when storing files, they are cut into small blocks of corresponding length and then filled into the hard disk.

This way, we don’t have to worry about files that are too big to fit in the gaps, because we can put some of the chunks in one gap and others in another, which makes efficient use of disk space.

The second concept is that FAT32 uses a linked table data structure. In other words, each cluster on the disk is a node in a chain table, and they record the location of the next cluster (next pointer). In FAT32, only the cluster where each file starts is recorded, so we need to use a chain table to access the entire file.

The table used to store the information in this chain is called FAT ( FILE ALLOCATE TABLE ), the real place where the data is stored is separate from the FAT, the role of the FAT is to facilitate the search.

Next, let’s look at the delete operation. This will lead to another proprietary structure: FILE ENTRY

First of all, let’s think about deleting a file or writing a new file (e.g. copy and paste). Deleting. It’s almost an inverse process, so why the difference in time? Actually, when you delete a file, the file system doesn’t actually wipe the data off the disk (which is why we can hopefully recover the deleted file), but only modifies its FILE ENTRY information.

What is FILE ENTRY? In simple terms, it is a small structure that records the attributes of a file. They are stored uniformly in ROOT DIRECTORY. Let’s take a look at the overall appearance of a FAT32 disk

image

Let’s ignore the first few sectors and start with the FAT. A FAT system has several FAT structures (the number of nodes needed for the chain table varies depending on the size of the disk), next to the FAT sector is ROOT DIRECTORY, which is the directory structure of the entire disk, and what is stored in it is what we call FILE ENTRY, that is, the attributes of each file. DATA FIELD, which is used to store the real file content.

When we view information about a file instead of opening it, we don’t need to access the file’s data directly. The file system will find the corresponding FILE ENTRY in the ROOT DIRECTORY and display the relevant information. This includes: file name, creation and modification time, file size, location of the first cluster of the file, read-only/hidden, etc. Note that folders are also represented as a file in the file system and also have a corresponding FILE ENTRY, only they store a batch of files ( there is a corresponding flag in the FILE ENTRY indicating whether it is a folder or not).

Back to our topic of deleting files, when a file is deleted, the system will find the corresponding FILE ENTRY and change the first character of the file name to 0xE5 - done. It is that simple, just the file attributes are changed, no internal data is changed at all. At this time if we add another file into it, since the system will determine the available space by looking for ROOT DIRECTORY, so if some FILE ENTRY file name is found to be unallocated or deleted flag, then the corresponding cluster will be occupied. But until they are overwritten, these deleted files are still on your hard drive (you just lose access to their information). This is why deletion is faster.

BPB (BIOS Paramter Block)

One of the most important data structures in the FAT file system is the BPB (BIOS Parameter Block), which stores important configuration parameters for the FAT file volume. Private Boot Record), but it is only the first sector of the reserved area and data volume.

We use the following structure to describe the structure of the BPB.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
pub struct BiosParameterBlock {
    pub(crate) bytes_per_sector: u16,
    pub(crate) sectors_per_cluster: u16,
    pub(crate) reversed_sector: u16,
    pub(crate) fats: u8,
    pub(crate) root_entries: u16,
    pub(crate) total_sectors_16: u16,
    pub(crate) media: u8,
    pub(crate) sectors_per_fat_16: u16,
    pub(crate) sectors_per_track: u16,
    pub(crate) heads: u16,
    pub(crate) hidden_sectors: u32,
    pub(crate) total_sectors_32: u32,

    // Extended BIOS Paramter Block
    pub(crate) sectors_per_fat_32: u32,
    pub(crate) extended_flags: u16,
    pub(crate) fs_version: u16,
    pub(crate) root_dir_first_cluster: u32,
    pub(crate) fs_info_sector: u16,
    pub(crate) backup_boot_sector: u16,
    pub(crate) reserved_0: [u8;12],
    pub(crate) drive_num: u8,
    pub(crate) ext_sig: u8,
    pub(crate) volume_id: u32,
    pub(crate) volume_label: [u8;11],
    pub(crate) fs_type_label: [u8;8]
}

Since the boot sector is located in cluster 0 of the data volume, we can simply read it out and read the corresponding field according to the memory layout, without passing in the corresponding location.

BIOSParameterBlock Description of some fields:

Name Description
bytes_per_sector the number of bytes per sector, only the following values apply: 512/1024/2048/4096
sectors_per_cluster The number of sectors in each cluster, this value must be a power of 2, legal values are 2/4/8/16/32/64/128, not another byte per cluster greater than 32K
reserved_sector the number of reserved sectors in the voulme reserved area starting from the first sector of the volume, this field must not be 0, for FAT32, this value is usually 32.
fat_nums The number of FAT32 sectors on the volume, in theory any value greater than or equal to 1 is valid, but a value of 2 is recommended
root_entry_cnt This value should be set to 0 for FAT32
fat_size for FAT32 this value should be 32 (each FAT entry has 32 bits)
root_cluster The cluster number of the root cluster, which exists only on FAT32 filesystems and is usually set to 2, but is not mandatory
root_dir_sector 0
volume_id The volume id, which, along with volume_label, supports volume tracking on removable media. These values allow FAT file system drivers to detect erroneous disk insertions into removable drives, and this ID is typically generated by simply combining the current period and time into a 32-bit value
volume_label This field matches the 11-byte volume label in the root directory. the FAT file system driver should ensure that this field is updated when the volume label in the root directory changes or its name is created. When there is no volume label, the setting for this field is the string “No Name”

Root Directory

After getting the information about the boot sector, we can read the structure of the Root Directory. Since the boot sector already has information about the total number of FAT-occupied clusters and the total number of reserved clusters, we can go beyond the previous clusters to find the Root Directory, so the following method is defined in our implementation.

1
2
3
4
5
6
7
/// Get the first sector offset bytes of the cluster from the cluster number
pub(crate) fn offset(&self, cluster: u32) -> usize {
    ((self.reversed_sector as usize)
    + (self.fats as usize) * (self.sectors_per_fat_32 as usize)
    + (cluster as usize - 2) * (self.sectors_per_cluster as usize))
    * (self.bytes_per_sector as usize)
}

FAT and Cluster

The FAT (File Allocate Table) is another important component, which describes the extension information of a file using a chain table. The purpose of this structure is to define a chain of file ranges (cluster chains). Note that both directroy and file are contained in the file and are not different on the FAT. A directory is actually a file with special attributes indicating that its contents are a directory table.

The data area is divided into a certain number of sector blocks called clusters, and the data area is managed in this unit. Each item of the FAT is associated with each cluster in the data area, and the FAT value indicates the status of the corresponding cluster. However, the first two FAT entries, FAT[0] and FAT[1], are reserved and are not associated with any clusters. The third FAT entry, FAT[2], is the entry associated with the first cluster in the data area, with a valid cluster number starting at 2.

FATs are typically replicated for redundancy purposes, as corruption of any FAT sector can result in severe data loss. BPB_NumFATs indicates the number of FAT copies and the size of FAT sectors becomes fat_nums * sectors_per_fat_32. The FAT driver usually references only the first FAT copy, and any updates to FAT entries are reflected in each FAT copy.

The files on a FAT volume are managed by directories, which are arrays of 32-byte directory entry structures. Details of the directory entries are described below. A directory entry has the file name, file size, timestamp, and first cluster number of the file data. The cluster number is the entry point for following a chain of clusters of file data. If the file size is zero, the first cluster number is set to zero and no data clusters are assigned to the file.

As mentioned above, cluster numbers 0 and 1 are reserved, and valid cluster numbers start at 2. Cluster number 2 corresponds to the first cluster in the data area. Therefore, in a volume with N clusters, the valid cluster numbers are from 2 to N+1 and the number of FAT entries is N+2. The location of data cluster N is calculated as follows.

1
FirstSectorofCluster = DataStartSector + (N - 2) * BPB_SecPerClus;

If the file size is larger than the sector size, the file data spans multiple sectors in the cluster. If the file size is larger than the cluster size, the file data spans multiple clusters in the cluster chain. The value of the FAT entry indicates the backward cluster number (if present) so that any byte offsets in the file can be reached by following the cluster chain. Cluster chains cannot be tracked backwards. The FAT entry with the last link in the cluster chain has a special value (end-of-chain, EOC, flag) that never matches any valid cluster number. The EOC flags for each FAT type are as follows.

  • FAT12: 0xFF8 - 0xFFF (usually 0xFFF)
  • FAT16: 0xFFF8 - 0xFFFF (usually 0xFFFF)
  • FAT32: 0x0FFFFFF8 - 0x0FFFFFFF (usually 0xFFFFFFF)

There is also a special value, the bad cluster flag. The bad cluster flag indicates that there are defective sectors in the cluster that cannot be used. Bad clusters found during formatting, surface inspection or disk repair are recorded in the FAT as bad cluster flags. The value of the bad cluster flag is 0xFF7 for FAT12, 0xFFF7 for FAT16, and 0x0FFFFFF7 for FAT32.

The value of the bad cluster tag will never equal any valid cluster number on a FAT12/16 volume. However, it can equal any number of allocatable clusters, since the maximum number of clusters is not defined in FAT32. Such FAT32 volumes may confuse disk utilities, so you should avoid creating such FAT32 volumes. Therefore, the maximum number of clusters for a FAT32 volume is actually 268435445 (about 256 M clusters).

For implementation reasons, some systems have limits on the maximum number of clusters. For example, Windows 9X/Me supports a maximum FAT size of 16 MB, and it limits the number of clusters to a maximum of approximately 4 M clusters.

Each allocatable FAT entry FAT[2] and the initial value after it is zero, which means that the cluster is not in use and is available for new allocations. If the value is not zero, the cluster is in use or is corrupted. Idle cluster counts are not recorded anywhere in the FAT12/16 volumes and a full FAT scan is required to obtain this information. FAT32 supports FSInfo to store the idle cluster count to avoid a full FAT scan due to its very large FAT structure.

The first two FAT entries, FAT[0] and FAT[1], are reserved and are not associated with any clusters. These FAT entries are initialized when the volume is created, as follows.

  • FAT12: FAT[0] = 0xF?? ; FAT[1] = 0xFFF
  • FAT16: FAT[0] = 0xFF? ; FAT[1] = 0xFFFF
  • FAT32: FAT[0] = 0xFFFFFF? ; FAT[1] = 0xFFFFFFFF

?? The value of FAT[0] is the same as the value of BPB_Media, but this entry does not have any function. Some bits in FAT[1] record the error history.

Volume dirty flag: (FAT16: bit15, FAT32: bit31): cleared at boot and restored at clean shutdown. Cleared at boot indicates a dirty shutdown and possible logical errors in the volume. Hard error flags: (FAT16: bit14, FAT32: bit30): cleared on unrecoverable read/write errors, indicating a surface check is required. These flags indicate the possibility of errors on the volume. Some operating systems that support this feature check for these flags at boot time and automatically launch the disk checker tool. The Windows 9X series uses these flags. The Windows NT series does not use these flags, but uses the alternatives in the BPB.

There are two more important things to know about FAT regions. One is that the last sector of the FAT may not be fully used. In most cases, the FAT ends in the middle of the sector. The FAT driver should not have any assumptions about unused areas. It should be filled with zeros when the volume is formatted and should not be changed afterwards. Another one is that BPB_FATSz16/32 can indicate a value larger than the volume requirement. In other wards, unused sectors can follow each FAT. This may be the result of data sector alignment or other reasons. In addition, these sectors are padded with zeros when formatted.

The following table shows the range of FAT values and the meaning of each FAT type:

FAT12 FAT16 FAT32 Sense
0x000 0x000 0x00000000 free
0x001 0x0001 0x00000001 Reserve
0x002 - 0xFF6 0x0002 - 0xFFF6 0x00000002 - 0x0FFFFFF6 In Use (next linked value)
0xFF7 0xFFF7 0x0FFFFFF7 bad cluster
0xFF8 - 0xFFF 0xFFF8 - 0xFFFF 0x0FFFFFF8 - 0x0FFFFFF in use (end of chain)

Note:

1
2
data_sectors = total_sectors - (reserved_sectors_cnt + (fat_nums * fat_size)) -2 ?
count_of_clusters = data_sectors / sectors_per_clusters
1
2
3
fat_offset = N * 4(N为簇号)
fat_sectors_offset = reserved_sectors_count + (fat_offset / bytes_per_sectors)
fat_entry_offset = fat_offset % bytes_per_sectors

The upper 4 bits of the FAT32 Entry are not used.

We know that the first two clusters of the FAT32 file system data clusters are set as reserved clusters, so what should the first two entries of the FAT be set to? For a FAT32 file system, FAT[0] = 0x0FFFFFF8, the second reserved cluster FAT[1] is set by FORMAT as the EOC flag, and for FAT32, the file system driver can use the higher two bits of the FAT[1] entry as the dirty volume flag (all other bits are always set to 1).

clean_shut_bit_mask = 0x080000000;

  • a bit of 1 indicates clean
  • a bit of 0 indicates dirty

hard_err_bit_mask = 0x04000000.

  • a bit of 1 indicates no disk IO error
  • a bit of 0 indicates a disk IO error has occurred

FAT Volume Initialization

FAT Directory Structure

A FAT directory is a file that consists of a linear table of 32 bytes. The root directory is a special case.

For FAT32, the root directory is variable in size and is a chain of clusters, just like any other directory entry. The first cluster of the FAT32 volume root directory is stored in the root_cluster, and the root directory itself on any FAT type does not have any date or time stamp, does not have a file name (except for the implied file name “/”), and does not include the “.” and “…” and “…”. A special feature of the root directory is that its files are set with the flag bit of the volume_id_attr attribute.

domain offset size description
name 0 11
attribute 11 1
NT_reverse 12 1
creation_time_tenth 13 1
creation_time 14 2
creation_date 16 2
last_access_date 18 2
first_cluster_high 20 2
write_time 22 2
write_date 24 2
first_cluster_low 26 2
file_size 28 4

Note

name[0] = 0xE5, indicating that the directory entry is empty

name[0] = 0x00, again indicating that the directory entry is empty, and that the remainder of the directory entry is empty

name[0] = 0x05, indicating that the directory entry is the 0xE5 character

The following characters are not valid in dir_name.

  • less than 0x20 except 0x05
  • -0x22, 0x2A, 0x2B, 0x2C, 0x2E, 0x2F, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F, 0x5B, 0x5C, 0x5D, a and 0x7C.

When the attribute field is ATTR_DIRECTORY indicating that the entry is a directory (0x10), the file_size attribute should be set to 0 indicating that the file_size is not used by following its cluster until the EOC flag is found.

A cluster is assigned to that directory, then first_cluster_low and first_cluster_high are set to that cluster number, and an EOC flag is placed on that cluster entry in the FAT. Next, initialize all bytes of that cluster to 0. If the directory is not the root directory, you need to create two special entries in the first 32 bytes of the Directory Entry of that directory (i.e. the first 2 32 byte areas of the data area of the cluster you just assigned) if the directory is not the root directory.

These two directory entries are ‘.’ and ‘…’.

FAT Long Directory Entries

Each long directory entry must be used in conjunction with a short directory entry

domain offset size description
dir_order 0 1
dir_name1 1 10 first to fifth character of long directory item name
dir_attr 11 1 Must be Attr_Long_Name
dir_type 12 1 If 0, the directory entry is a subcomponent of the directory entry
dir_checksum 13 1 The checksum of the name of the short directory entry at the end of the long directory
dir_name2 14 12 6-11 characters
first_cluster_low 26 2 must be 0
dir_name3 28 4 12-13 characters

For a compatible design, long directory entries and short directory entries must appear in pairs and be physically preceded by and contiguous with short directory entries.

First, each member of a set of long directory entries is uniquely numbered, and the last member of the set is dir_ord with a flag indicating that it is the last member of the set. dir_ord field is used for this determination, with the value of dir_ord set to 1 for one member of the set and n or LAST_LONG_ENTRY for the nth. The file system always uses these to indicate the last directory entry of a “free” directory.

In addition, the 8-bit checksum is calculated for the name included in the short directory entry when creating the segment and long directory entries. all 11 characters in the segment name are used to calculate the checksum.

Checksum calculation algorithm.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
//-----------------------------------------------------------------------------
//	ChkSum()
//	Returns an unsigned byte checksum computed on an unsigned byte
//	array.  The array must be 11 bytes long and is assumed to contain
//	a name stored in the format of a MS-DOS directory entry.
//	Passed:	 pFcbName    Pointer to an unsigned byte array assumed to be
//                          11 bytes long.
//	Returns: Sum         An 8-bit unsigned checksum of the array pointed
//                           to by pFcbName.
//------------------------------------------------------------------------------
unsigned char ChkSum (unsigned char *pFcbName)
{
    short FcbNameLen;
    unsigned char Sum;

    Sum = 0;
    for (FcbNameLen=11; FcbNameLen!=0; FcbNameLen--) {
        // NOTE: The operation is an unsigned char rotate right
        Sum = ((Sum & 1) ? 0x80 : 0) + (Sum >> 1) + *pFcbName++;
    }
    return (Sum);
}