Linux filesystems

Disk storage: filesystems

A filesystem describes how data is organised on disk in files and directories. The traditional Linux filesystem is ext4, but alternative filesystems are available, most notably xfs and btrfs. Until recently btrfs was considered experimental, and it is not the default filesystem on most Linux distributions. Both xfs and btrfs have been reported to perform better than ext4 in some cases, especially under multi-client workloads, so they are often used as the underlying filesystems for more complex storage configurations (e.g. when using GlusterFS).

Filesystems exist on top of block devices. In Linux most hardware is made available via a device node in /dev. Thus a hard disk might appear as /dev/sda, or as /dev/vda on a virtual machine. If you examine such a file (here a partition, /dev/vda2) with ls -l, the leading b in the mode string shows that it is a block device:

$ ls -l /dev/vda2 
brw-rw----. 1 root disk 252, 2 Feb  7 10:33 /dev/vda2
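
The same check can be scripted with the shell's file tests, where -b is true for block devices and -c for character devices. The sketch below is only an illustration: describe_node is a hypothetical helper name, and the paths passed to it are arbitrary examples rather than anything the listing above depends on.

```shell
# describe_node is a hypothetical helper, not a standard command.
# It reports what kind of node each path is, using POSIX file tests.
describe_node() {
    for path in "$@"; do
        if [ -b "$path" ]; then
            echo "$path: block device"
        elif [ -c "$path" ]; then
            echo "$path: character device"
        elif [ -e "$path" ]; then
            echo "$path: not a device node"
        else
            echo "$path: does not exist"
        fi
    done
}

# Example paths only; substitute /dev/sda, /dev/vda2, etc. on your system.
describe_node /dev/null /etc
```

On a disk or partition node such as /dev/vda2 the first branch should match. The two numbers that ls -l prints in place of a file size (252, 2 above) are the device's major and minor numbers.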

Linux also allows ordinary files on disk to be presented as block devices using the loopback system. This is configured using the losetup command, which has to be run as root, and is useful for experimenting with filesystems and storage. Before using a file as a simulated disk, you need to create an empty file, which you can do with dd. For example, to create an empty file of roughly 1GB (1024 x 1000000 bytes):

dd if=/dev/zero of=/tmp/disk1.img bs=1024 count=1000000

This will create a file /tmp/disk1.img that is filled with zero bytes. The file can then be made available as a ‘disk’ using losetup, e.g.

$ sudo losetup -f /tmp/disk1.img 
$

To see what this created, use losetup -a:

$ sudo losetup -a
/dev/loop0: [fc00]:28591048 (/tmp/disk1.img)
$ ls -l /dev/loop0
brw-rw---- 1 root disk 7, 0 Feb  7 11:42 /dev/loop0
$

This shows that /tmp/disk1.img is made available as /dev/loop0 and the ls -l shows that /dev/loop0 is a block device. To remove a loopback device use losetup -d:

$ sudo losetup -d /dev/loop0
$ sudo losetup -a
$
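
The size arithmetic behind the dd command used earlier can be checked without root. The following is a minimal sketch, assuming a hypothetical helper called make_image and a throwaway path under /tmp. It writes a much smaller image than the one in the text so that it runs quickly, then prints the size in bytes.

```shell
# make_image PATH COUNT: write COUNT blocks of 1024 zero bytes to PATH
# and print the resulting file size in bytes.
# make_image is a hypothetical helper; path and count are illustrative.
make_image() {
    img=$1
    count=$2
    dd if=/dev/zero of="$img" bs=1024 count="$count" 2>/dev/null || return 1
    wc -c < "$img" | tr -d ' '
}

# A 100-block image: 100 * 1024 bytes.
make_image /tmp/demo-small.img 100
rm -f /tmp/demo-small.img

# The size dd produces for bs=1024 count=1000000, computed rather than written.
echo $((1024 * 1000000))
```

The last line shows that bs=1024 count=1000000 yields 1024000000 bytes, which is slightly more than 1GB in decimal units and slightly less than 1GiB.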

Filesystems are created on block devices using mkfs. For example, with the image attached as /dev/loop0 again:

$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42.9 (4-Feb-2014)
Discarding device blocks: done                            
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
62592 inodes, 250000 blocks
12500 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=260046848
8 block groups
32768 blocks per group, 32768 fragments per group
7824 inodes per group
Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376
Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

$
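
The figures in this output are consistent with each other and with the size of the image file, which can be confirmed with a little shell arithmetic. The sketch below simply copies the numbers from the mkfs.ext4 output above. The function name check_mkfs_figures is made up for the illustration.

```shell
# Figures copied from the mkfs.ext4 output above.
block_size=4096
blocks=250000
groups=8
inodes_per_group=7824
reserved_blocks=12500

# check_mkfs_figures is a hypothetical helper that derives totals
# from the per-block and per-group figures.
check_mkfs_figures() {
    echo "filesystem bytes: $((block_size * blocks))"
    echo "total inodes: $((groups * inodes_per_group))"
    echo "reserved percent: $((reserved_blocks * 100 / blocks))"
}

check_mkfs_figures
```

The byte total matches the 1024 x 1000000 bytes written by dd, the inode total matches the 62592 inodes reported, and the reserved blocks come to the 5% shown.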

The verbose output from mkfs.ext4 shows the structures being created on the ‘disk’: block groups, inode tables, the journal and backup copies of the superblock. The superblock backups provide redundancy in case the primary superblock is corrupted. Your filesystem can now be mounted:

$ sudo mount /dev/loop0 /mnt
$ mount
/dev/mapper/ubuntu--vg-root on / type ext4 (rw,errors=remount-ro)
...
/dev/loop0 on /mnt type ext4 (rw)
$ ls /mnt
lost+found

The new filesystem is empty apart from the lost+found directory, and you can now write to it. Unmount it with umount:

$ sudo umount /mnt
$ 
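
Before running tools that need the filesystem to be unmounted, it can help to confirm that the device no longer appears in the kernel's mount table. The sketch below is one possible way to do that, assuming /proc/mounts is available. is_mounted is a hypothetical helper, and its optional second argument exists only so that the function can be tried against a sample file.

```shell
# is_mounted DEVICE [TABLE]: succeed if DEVICE is listed as a mount source.
# is_mounted is a hypothetical helper; TABLE defaults to /proc/mounts and
# can be pointed at another file with the same layout for testing.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # The first field of each line in the table is the mounted device.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# /dev/loop0 is the device used in the text; adjust as needed.
if is_mounted /dev/loop0; then
    echo "/dev/loop0 is still mounted"
else
    echo "/dev/loop0 is not mounted"
fi
```

This only matches the device name exactly as it is recorded in the mount table, so a device mounted under a different name (for example via a /dev/mapper path) would need that name instead.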

Filesystem integrity can be checked (and optionally repaired) with fsck. This is run on the block device, not the filesystem mount point, and should only be done when the filesystem is not mounted. Typically the fsck -p command is used to automatically fix errors that can be repaired safely:

$ sudo fsck -p /dev/loop0 
fsck from util-linux 2.20.1
/dev/loop0: clean, 11/62592 files, 8345/250000 blocks
$

The “clean” message shows that the filesystem is marked clean, so it doesn’t need a full check. The clean flag is set when a filesystem is unmounted properly, so if a server crashes (e.g. due to a power fault) it will not be set, and the fsck output might look like this:

$ sudo fsck -p /dev/loop1 
fsck from util-linux 2.20.1
/dev/loop1: recovering journal
/dev/loop1: clean, 11/62592 files, 8345/250000 blocks
$

If fsck -p fails you might need to use fsck -y, which answers yes to every repair question, but this might result in data loss. It is a last-ditch attempt to recover a corrupt filesystem.
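
Whether fsck “failed” can be judged from its exit status, which the fsck manual page documents as a sum of flag values (1 for errors corrected, 4 for errors left uncorrected, and so on). The sketch below decodes such a status into text. describe_fsck_status is a hypothetical helper, and the wording of the messages paraphrases the manual page rather than quoting it.

```shell
# describe_fsck_status STATUS: print one line per flag set in an fsck
# exit status. describe_fsck_status is a hypothetical helper name.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for entry in \
        "1:errors corrected" \
        "2:system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:cancelled by user request" \
        "128:shared-library error"
    do
        bit=${entry%%:*}
        text=${entry#*:}
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
}

# In practice you would pass "$?" immediately after running fsck, e.g.
#   sudo fsck -p /dev/loop0; describe_fsck_status "$?"
describe_fsck_status 4
```

A status with the “errors left uncorrected” flag set after fsck -p is the situation in which a more aggressive repair might be considered.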
