Disk storage: filesystems
A filesystem describes how data is organised on disk in files and directories. The traditional Linux filesystem is ext4, but alternative filesystems are available, most notably xfs and btrfs. Until recently btrfs was considered experimental, and it is not the default filesystem on most Linux distributions. Both xfs and btrfs have nevertheless been reported to outperform ext4 on some workloads, especially multi-client workloads. As a result they are often used as the underlying filesystems for more complex storage configurations (e.g. when using GlusterFS).
Filesystems exist on top of block devices. In Linux most hardware is made available via a device node in /dev. Thus a hard disk might appear as /dev/vda on a virtual machine, with its partitions as /dev/vda1, /dev/vda2 and so on. If you examine one of these files with ls -l, the leading b in the mode string shows that it is a block device:
$ ls -l /dev/vda2
brw-rw----. 1 root disk 252, 2 Feb 7 10:33 /dev/vda2
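A script can make the same check without parsing ls output, because the shell's test command has a -b operator for block devices. The following is a minimal sketch for a POSIX shell; the function name device_kind is hypothetical and chosen only for illustration.

```shell
# Classify a path roughly the way the first character of `ls -l` does.
# device_kind is a hypothetical helper name, not a standard command.
device_kind() {
    if [ -b "$1" ]; then
        echo "block device"        # 'b' in ls -l
    elif [ -c "$1" ]; then
        echo "character device"    # 'c' in ls -l
    elif [ -d "$1" ]; then
        echo "directory"           # 'd' in ls -l
    elif [ -f "$1" ]; then
        echo "regular file"        # '-' in ls -l
    else
        echo "other or missing"
    fi
}
```

On the virtual machine shown above, device_kind /dev/vda2 would be expected to print "block device", whereas a disk image file that has not been attached to a loop device is reported as a regular file.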
Linux also allows simple files on disk to be presented as block devices using the loopback system. This is configured using the losetup command, which has to be run as root. This is useful for experimenting with filesystems and storage. Before using a file as a simulated disk, you need to create a file of the required size, which you can do with dd. For example, to create a zero-filled file of roughly 1 GB (1024 × 1,000,000 bytes):
dd if=/dev/zero of=/tmp/disk1.img bs=1024 count=1000000
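The size of the resulting file is simply bs multiplied by count. The sketch below checks this on a much smaller image so that it runs quickly. It assumes GNU coreutils (for stat -c %s), and the helper names make_image and expected_size are hypothetical.

```shell
# make_image is a hypothetical helper: create a zero-filled image of
# COUNT blocks of BS bytes each, then print its size in bytes.
# Assumes GNU coreutils (stat -c %s).
make_image() {
    img_path=$1
    img_bs=$2
    img_count=$3
    dd if=/dev/zero of="$img_path" bs="$img_bs" count="$img_count" 2>/dev/null
    stat -c %s "$img_path"
}

# expected_size is a hypothetical helper: the size is just bs * count.
expected_size() {
    echo $(( $1 * $2 ))
}
```

For the command above, expected_size 1024 1000000 prints 1024000000, which is slightly more than 1 GB in decimal units and about 977 MiB in binary units.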
The dd command above creates a file /tmp/disk1.img filled with zero bytes. It can then be made available as a ‘disk’ using losetup, where -f selects the first unused loop device, e.g.
$ sudo losetup -f /tmp/disk1.img
$
To see what this created, use losetup -a:
$ sudo losetup -a
/dev/loop0: [fc00]:28591048 (/tmp/disk1.img)
$ ls -l /dev/loop0
brw-rw---- 1 root disk 7, 0 Feb 7 11:42 /dev/loop0
$
This shows that /tmp/disk1.img is made available as /dev/loop0, and the ls -l output shows that /dev/loop0 is a block device. To remove a loopback device use losetup -d:
$ sudo losetup -d /dev/loop0
$ sudo losetup -a
$
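When experimenting it is easy to leave loop devices attached by accident. One way to avoid this is to wrap the attach and detach steps around a single command. The sketch below is one possible approach, not a standard tool: the function name with_loop_device is hypothetical, it assumes a util-linux losetup that supports the --find and --show options, and it assumes the user may run losetup through sudo.

```shell
# with_loop_device is a hypothetical helper: attach IMAGE to the first
# free loop device, run the given command with the device path appended
# as its last argument, then detach the device again.
# Usage: with_loop_device IMAGE COMMAND [ARGS...]
with_loop_device() {
    loop_image=$1
    shift
    if [ ! -f "$loop_image" ]; then
        echo "no such image: $loop_image" >&2
        return 1
    fi
    # --find picks a free device, --show prints its name (e.g. /dev/loop0)
    loop_dev=$(sudo losetup --find --show "$loop_image") || return 1
    "$@" "$loop_dev"
    loop_status=$?
    # Detach even if the command failed, then report the command's status
    sudo losetup -d "$loop_dev"
    return "$loop_status"
}
```

For example, with_loop_device /tmp/disk1.img ls -l would attach the image, list the resulting device node, and detach it again.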
Filesystems are created on block devices using mkfs. For example, with the image attached as /dev/loop0 (re-run losetup -f first if you detached it):
$ sudo mkfs.ext4 /dev/loop0
mke2fs 1.42.9 (4-Feb-2014)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
62592 inodes, 250000 blocks
12500 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=260046848
8 block groups
32768 blocks per group, 32768 fragments per group
7824 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
$
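The figures in this output are consistent with each other and with the dd command used to create the image. The sketch below cross-checks them with shell arithmetic. The input values are copied from the output above. The group numbers 1, 3, 5 and 7 used for the backup superblocks are an assumption based on the sparse_super layout described in the mke2fs documentation (backups in group 1 and in groups that are powers of 3, 5 and 7).

```shell
# Figures copied from the mke2fs output above.
blocks=250000
block_size=4096
blocks_per_group=32768
inodes_per_group=7824
reserved_blocks=12500

# Total size in bytes: the same 1024 * 1000000 bytes written by dd.
fs_bytes=$(( blocks * block_size ))

# Number of block groups, rounding up to cover the partial last group.
group_count=$(( (blocks + blocks_per_group - 1) / blocks_per_group ))

# Total inodes and the percentage reserved for the super user.
inodes=$(( group_count * inodes_per_group ))
reserved_pct=$(( reserved_blocks * 100 / blocks ))

# First block of groups 1, 3, 5 and 7 (assumed sparse_super layout).
backups=""
for g in 1 3 5 7; do
    backups="$backups $(( g * blocks_per_group ))"
done

echo "$fs_bytes bytes, $group_count groups, $inodes inodes, ${reserved_pct}% reserved"
echo "backup superblocks at:$backups"
```

This prints "1024000000 bytes, 8 groups, 62592 inodes, 5% reserved" followed by "backup superblocks at: 32768 98304 163840 229376", matching the mke2fs output.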
The verbose output from mkfs.ext4 shows the structures being created on the ‘disk’. Some of them, such as the backup copies of the superblock, provide redundancy in case part of the filesystem is corrupted. Your filesystem can now be mounted:
$ sudo mount /dev/loop0 /mnt
$ mount
/dev/mapper/ubuntu--vg-root on / type ext4 (rw,errors=remount-ro)
...
/dev/loop0 on /mnt type ext4 (rw)
$ ls /mnt
lost+found
The filesystem is empty apart from the lost+found directory, and you can now write to it. Unmount it with umount:
$ sudo umount /mnt
$
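Scripts often need to confirm that a device is really unmounted before doing anything further with it. On Linux, one way to do this is to look for the device in /proc/mounts, as in the sketch below. The function name is_mounted is hypothetical, and the check only matches the device name exactly as the kernel reports it.

```shell
# is_mounted is a hypothetical helper: succeed (exit status 0) if DEVICE
# appears as a mount source in /proc/mounts. Linux-specific.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' /proc/mounts
}
```

For example, is_mounted /dev/loop0 || echo "not mounted" prints "not mounted" once the umount above has completed.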
Filesystem integrity can be checked (and optionally repaired) with fsck. It is run on the block device, not the filesystem mount point, and should only be run when the filesystem is not mounted. Typically the fsck -p command is used to automatically fix errors that can be fixed safely:
$ sudo fsck -p /dev/loop0
fsck from util-linux 2.20.1
/dev/loop0: clean, 11/62592 files, 8345/250000 blocks
$
The “clean” message shows that the filesystem was marked clean and so did not need a full check. The clean flag is set when a filesystem is unmounted, so if a server crashes (e.g. due to a power fault) it will not be set, and the fsck output might look like this:
$ sudo fsck -p /dev/loop1
fsck from util-linux 2.20.1
/dev/loop1: recovering journal
/dev/loop1: clean, 11/62592 files, 8345/250000 blocks
$
If fsck -p fails you might need to use fsck -y, which answers yes to every repair prompt, but this might result in data loss. It is a last-ditch attempt to recover a corrupt filesystem.
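Whether fsck -p succeeded can be read from its exit status, which the fsck(8) manual page documents as a sum of bit values. The sketch below decodes that status into readable messages. The function name describe_fsck_status is hypothetical, and the messages paraphrase the manual page.

```shell
# describe_fsck_status is a hypothetical helper that decodes the exit
# status of fsck, a sum of the bit values listed in fsck(8).
describe_fsck_status() {
    fsck_status=$1
    if [ "$fsck_status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $(( fsck_status & 1 )) -ne 0 ] && echo "errors corrected"
    [ $(( fsck_status & 2 )) -ne 0 ] && echo "system should be rebooted"
    [ $(( fsck_status & 4 )) -ne 0 ] && echo "errors left uncorrected"
    [ $(( fsck_status & 8 )) -ne 0 ] && echo "operational error"
    [ $(( fsck_status & 16 )) -ne 0 ] && echo "usage or syntax error"
    [ $(( fsck_status & 32 )) -ne 0 ] && echo "checking cancelled by user request"
    [ $(( fsck_status & 128 )) -ne 0 ] && echo "shared-library error"
    return 0
}
```

A typical use is to run sudo fsck -p /dev/loop0 and then describe_fsck_status $? on the next line. A status that includes "errors left uncorrected" is the situation in which fsck -y becomes the remaining option.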