Accessing a virtual disk using a loop device

The loop device is a virtual device under Linux. It is widely used for working with files such as virtual disks and ISO images. Recently, I have been working on an ARM virtual system and needed the loop device to create a system image. It took a lot of research to get a reasonably complete understanding of it. To keep things clear, this article also introduces the basics of Linux disk management.

Disk devices

Linux exposes disks as files. The first SCSI/SATA disk usually appears as /dev/sda. What you may not know is that we can read and write this file directly. To a program it behaves much like a regular file (strictly speaking, it is a block device file). The main difference is that its size is fixed at the capacity of the disk, so we can’t append data at the end. But any byte within the file can be read or written. For example, we can write hello.

Don’t try this on your own system: it overwrites the start of the disk and can destroy the data on it!

`echo -n hello > /dev/sda`

Then we can read the five bytes back. The trailing % is not part of the data; zsh prints it when output does not end with a newline.

`head -c 5 /dev/sda`
hello%
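The same behaviour can be tried safely on a regular file standing in for the disk. This is only a sketch: the temporary file and its 1 MiB size are assumptions for illustration, and it relies on GNU coreutils (truncate, dd, stat).

```shell
# Stand-in for a disk: a regular file with a fixed size (temporary file).
img=$(mktemp)
truncate -s 1M "$img"

# Overwrite the first five bytes in place.
# conv=notrunc stops dd from truncating the file, so the size stays fixed.
printf 'hello' | dd of="$img" conv=notrunc status=none

# Read the same five bytes back and check that the size did not change.
first=$(head -c 5 "$img")
size=$(stat -c %s "$img")
echo "first bytes: $first, size: $size"

rm -f "$img"
```

Unlike the shell redirection used with /dev/sda, dd with conv=notrunc is needed here because redirecting into a regular file would truncate it.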

The Linux kernel driver converts these read and write calls into disk operations, and the five bytes of hello end up on the underlying disk. But working with a raw disk like this has problems: different programs can overwrite each other’s data, and each program has to keep track of where each named piece of content lives. The kernel provides solutions to these basic problems, which is where partitions and file systems come in.

Partition table

A partition divides the disk into several separate regions. Information such as the type, start, and end of each region is stored in a reserved area at the beginning of the disk, called the partition table. The mainstream partition table formats are DOS (MBR) and GPT. The common partitioning tools under Linux are fdisk and gdisk: recent versions of fdisk handle both formats, while gdisk is aimed at GPT.
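To make the partition table less abstract, here is a small sketch that writes one DOS/MBR partition entry into an image file by hand and decodes it again. The entry values (type 0x83, start sector 2048, 1024 sectors) and the helper name le32 are made up for illustration. In practice fdisk writes these bytes for you.

```shell
img=$(mktemp)
truncate -s 1M "$img"

# First MBR entry lives at byte 446 and is 16 bytes long:
# boot flag 0x80, CHS start (zeroed), type 0x83 (Linux), CHS end (zeroed),
# start sector 2048 and length 1024 sectors, both 32-bit little-endian.
printf '\200\000\000\000\203\000\000\000\000\010\000\000\000\004\000\000' |
    dd of="$img" bs=1 seek=446 conv=notrunc status=none
# The first sector ends with the signature bytes 0x55 0xAA at offset 510.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc status=none

# le32 FILE OFFSET: decode a 32-bit little-endian integer (hypothetical helper).
le32() {
    set -- $(od -An -tu1 -j"$2" -N4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

ptype=$(od -An -tx1 -j450 -N1 "$img" | tr -d ' \n')
start=$(le32 "$img" 454)
count=$(le32 "$img" 458)
sig=$(od -An -tx1 -j510 -N2 "$img" | tr -d ' \n')
echo "type=0x$ptype start=$start sectors=$count signature=$sig"

rm -f "$img"
```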

Once the partitioning is done, Linux automatically recognizes the partition table information and then creates the corresponding device files for each partition. For example, if we divide /dev/sda into two partitions, the kernel will create /dev/sda1 and /dev/sda2 files respectively.

These two files can be read and written directly, just like /dev/sda. The only difference is that reads and writes to /dev/sda1 do not affect /dev/sda2. This is the coarsest level of isolation the operating system provides. But reading and writing data by filename is still not supported.

File systems

To read and write data by filename, we also need to format each partition, i.e. create a file system on it.

Linux supports many file systems, such as EXT4, FAT, XFS, and BTRFS. The most common one is EXT4, which many Linux distributions use for the root filesystem. Each file system comes with its own tool. For example, to create an EXT4 file system, you use mkfs.ext4.

`mkfs.ext4 /dev/sda1`

Note that you need to specify the partition device file /dev/sda1 when creating the filesystem, not the disk device file. In theory, you can also use /dev/sda directly, but I haven’t tried it. Readers can try it at their own risk 😄

After creating the EXT4 file system, we still can’t use it directly. We first need to mount it.

`mount /dev/sda1 /mnt`

The first argument to mount is the partition device, and the second argument is the folder to mount it on, also called the mount point. /mnt is the traditional mount point under Linux and is usually used for temporarily mounting devices. You can also create your own mount directory. Whatever files were in the mount directory before, once the mount completes it shows the files and directories of the mounted device instead.

The command for unmounting is umount, not unmount.

`umount /mnt`

The mount point directory will re-display its original contents after unmounting.
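One way to see whether a directory currently has something mounted on it is to look at the kernel’s own mount list in /proc/self/mounts. The helper below is a minimal sketch; the name is_mounted is made up for this example, and it assumes paths without spaces or backslashes.

```shell
# is_mounted DIR: succeed if DIR appears as a mount point in the kernel's list.
# Field 2 of each line in /proc/self/mounts is the mount point.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' /proc/self/mounts
}

for d in /proc /mnt; do
    if is_mounted "$d"; then
        echo "$d: mounted"
    else
        echo "$d: not mounted"
    fi
done
```

Running it before mount, after mount, and after umount shows /mnt switching between the two states.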

Automounting

Actually, it is worth comparing mounting with the earlier /dev/sda1. The path of a partition device is decided by the kernel, so the kernel can create the partition devices automatically from the partition table. In a sense, that is already “mounting” the partition table. File systems are different: the kernel does not know which folder a file system should be mounted on, so it cannot mount it automatically and the user has to do it by hand. Because of this, early Linux systems were very unfriendly to users, and users who did not understand the reason concluded that Linux was hard to use. They had a point, because in the early days you even had to specify the file system type manually when mounting!

Linux was also designed in the UNIX tradition of trusting the user unconditionally. Even if the user types rm -rf /, it will execute! Because it trusts the user, it expects the user to know the system well and to take responsibility for their actions. The greater the power, the greater the responsibility. As a UNIX user, you can’t mindlessly run a command and then expect the system to bail you out. This is the philosophy that Linux inherited from UNIX.

I do like this design, but it is really not newbie friendly. As time went on, Linux gained automatic mounting built on the udev system. A typical example is the folder that pops up automatically when you plug in a USB drive. This is just a layer on top of what was described above, and it still essentially performs a normal mount. The new system simply adopts a common convention for where mount points go, which makes it possible to open a flash drive or CD automatically.

All of the above is background knowledge.

Virtual CD

As we said before, the kernel saves whatever is written to the /dev/sda file directly to the real hard disk. But sometimes we don’t want to work with a physical device at all, for example when reading an ISO image. An ISO image is a data file copied from a CD. Traditionally, to read the contents of such a file you would first burn it to a new CD and then read that CD with an optical drive. Under Linux the optical drive corresponds to the /dev/cdrom device file (the name may vary slightly between distributions), so what you read from /dev/cdrom is the content of the disc.

But nowadays, optical drives are rare and burners are even rarer. Even if you have all the equipment, what the application ends up reading is just another file, /dev/cdrom. In other words, after going through ISO file ➜ burner ➜ CD ➜ optical drive ➜ /dev/cdrom, the application is still just reading a file. It would be nice if all the intermediate steps could be skipped and the ISO file could be read directly.

A loop device is a virtual block device that does exactly this. The first one is the device file /dev/loop0, and further ones are numbered sequentially. Loop devices are managed with the losetup command.

For example, let’s start by associating the ISO file with a loop device.

`losetup /dev/loop0 alpine-virt-3.16.3-x86_64.iso`

You can check the correspondence after the association.

`losetup`
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                              DIO LOG-SEC
/dev/loop0         0      0         0  0 /home/ts/alpine-virt-3.16.3-x86_64.iso   0     512

At this point we are ready to mount the CD.

`mount /dev/loop0 /mnt`
`ls /mnt`
apks  boot  efi

This means that the /dev/loop0 device is now equivalent to the original /dev/cdrom device. Once mounted, you can now view the contents of the CD image in the /mnt directory.

These two steps can even be combined into one. With the loop option, mount picks a free loop device and attaches the file itself.

`mount -o loop alpine-virt-3.16.3-x86_64.iso /mnt`

When you are done, unmount /mnt and then detach the loop device.

`umount /mnt`
`losetup -d /dev/loop0`

CD images are a special case: they are usually not partitioned, so this is not the most typical usage scenario. But the key point is this: all read and write operations on the loop device are “forwarded” to the backing regular file, which in the example above is the ISO image. This means we can read the contents of an ISO image without a real CD-ROM drive. This is what is called a virtual CD-ROM drive.

Virtual Hard Disk

The same principle applies to regular disks. When we use virtual machines we create various virtual disks, and these are essentially image files as well. Unlike ISO files, virtual disks can be written to, and their capacity can be adjusted. However, the processing principle is the same and they all need to be associated via loop devices.

Not all virtual disks can be handled directly by the kernel. Formats such as qcow2 are QEMU’s own image format, with their own internal layout (and optional compression), so the loop device cannot interpret them. But we can convert them to so-called raw format files, which the kernel can handle.
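A quick way to tell whether an image needs converting is to look at its first four bytes: files in the qcow family start with the magic bytes Q, F, I, 0xfb. The sketch below checks for that. The helper name is_qcow and the file names in the printed hint are placeholders, and the test file is a fake four-byte header rather than a real image. The conversion itself is typically done with qemu-img convert.

```shell
# is_qcow FILE: succeed if FILE starts with the qcow magic bytes "QFI\xfb".
is_qcow() {
    [ "$(od -An -tx1 -N4 "$1" | tr -d ' \n')" = "514649fb" ]
}

img=$(mktemp)
printf 'QFI\373' > "$img"    # fake header, just enough for the check

if is_qcow "$img"; then
    verdict="qcow image: convert first, e.g. qemu-img convert -f qcow2 -O raw in.qcow2 out.img"
else
    verdict="no qcow header found"
fi
echo "$verdict"

rm -f "$img"
```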

Under Linux, creating a virtual disk is creating a normal file.

`dd if=/dev/zero of=disk1.img bs=1M count=256`
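With dd, count is a number of blocks, so the image size is bs × count. As a sanity check on the sizes, the sketch below builds a small 16 MiB image two ways: with dd, which really writes the zeros, and with truncate, which only sets the length and leaves the file sparse. The file names and the 16 MiB size are chosen for the example only.

```shell
dir=$(mktemp -d)

# 16 blocks of 1 MiB each: every byte is actually written.
dd if=/dev/zero of="$dir/disk1.img" bs=1M count=16 status=none

# Same apparent size, but no data blocks are written up front (sparse file).
truncate -s 16M "$dir/sparse.img"

size_dd=$(stat -c %s "$dir/disk1.img")
size_sparse=$(stat -c %s "$dir/sparse.img")
echo "dd image: $size_dd bytes, sparse image: $size_sparse bytes"

# Space actually allocated on disk; the result depends on the file system.
du -k "$dir/disk1.img" "$dir/sparse.img"

rm -rf "$dir"
```

Either file works as a backing file for a loop device. The sparse one just defers allocating space until data is written.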

Then associate it with the loop device.

`losetup /dev/loop0 disk1.img`

At this point /dev/loop0 becomes a disk device just like the /dev/sda device. The only difference is that /dev/sda actually saves the data to the hard disk, while /dev/loop0 saves it to the disk1.img file.

We can use fdisk to partition /dev/loop0. After partitioning, the kernel creates partition device files such as /dev/loop0p1. On my system this happened automatically; if the nodes do not appear, attaching the image with losetup -P, which enables partition scanning, should make them show up. Then we use mkfs.ext4 to create a filesystem on a partition such as /dev/loop0p1. Finally, we mount it on a folder. After that, all file operations within the mount point are recorded in the virtual disk disk1.img.
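Put together, the whole sequence might look like the sketch below. Because the real commands need root and modify devices, the script is a dry run by default: it only prints each step. Setting RUN=1 as root would execute them. The helper names run and image_workflow, the placeholder /dev/loopN, and the single-partition layout are assumptions for this example.

```shell
# run CMD...: print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

image_workflow() {
    img=$1
    mnt=$2
    run dd if=/dev/zero of="$img" bs=1M count=256
    if [ "${RUN:-0}" = 1 ]; then
        # --find picks a free loop device, --show prints its name,
        # --partscan asks the kernel to create the pN partition nodes.
        dev=$(losetup --find --show --partscan "$img")
    else
        dev=/dev/loopN    # placeholder name, used only in the dry run
        echo "+ losetup --find --show --partscan $img"
    fi
    run fdisk "$dev"              # interactive: create one partition
    run mkfs.ext4 "${dev}p1"
    run mount "${dev}p1" "$mnt"
    run umount "$mnt"
    run losetup -d "$dev"
}

image_workflow disk1.img /mnt
```

Using --find instead of a fixed /dev/loop0 avoids clashing with a loop device that is already in use.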

Some articles also say that you need to run kpartx -a /dev/loop0 before the system creates the partition device nodes, and that the nodes then appear under /dev/mapper. In my experiment this step was not needed; maybe newer systems are smarter. But if you do run this command, you need to run kpartx -d /dev/loop0 before detaching the loop device.
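Another route that avoids partition nodes altogether is to point the loop device at the partition’s byte offset inside the image. The offset is the partition’s start sector multiplied by the sector size. The values below are assumptions for illustration (a first partition starting at sector 2048 with 512-byte sectors); the real start sector can be read with fdisk -l disk1.img. The script only prints the resulting mount command, since running it needs root.

```shell
sector_size=512      # logical sector size assumed for the image
start_sector=2048    # hypothetical start of partition 1, as listed by fdisk -l

# Byte offset of the partition inside the image file.
offset=$((start_sector * sector_size))

# Command that would mount just that partition (printed, not executed).
echo "mount -o loop,offset=$offset disk1.img /mnt"
```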

If disk1.img was created by QEMU and has an operating system installed, then after mounting it we can also browse that system’s root directory and even modify files in it, but only if the current system supports the file system used.

That is all. I hope it helps.
