
BTRFS Balance Bug in Kernel 5.14.x

A bug in kernel 5.14.x causes a btrfs filesystem to go read-only when its metadata profile is converted from single to dup. Solving this problem requires an unconventional fix.
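Before attempting a metadata conversion, it may help to check which kernel series the machine is running. The following is a minimal sketch; the function name `is_kernel_5_14` is made up for illustration, and it only matches the 5.14 series named in this article, since the exact affected point releases are not stated here.

```shell
#!/bin/sh
# Succeed (exit status 0) if a kernel release string belongs to the 5.14 series.
# With no argument, the running kernel's release from `uname -r` is checked.
# Hypothetical helper: the article only says "5.14.x", so that is all it matches.
is_kernel_5_14() {
    krel="${1:-$(uname -r)}"
    case "$krel" in
        5.14|5.14.*|5.14-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
#   if is_kernel_5_14; then
#       echo "5.14.x kernel detected; consider converting under a different kernel"
#   fi
```

The `case` patterns deliberately require a dot or dash after `5.14`, so an unrelated version string such as `5.140.1` does not match.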

About BTRFS Profiles

One of the great features of btrfs is its support for different raid profiles. Because btrfs stores data and metadata separately, you can stripe the data across the disks as raid0 and mirror the metadata as raid1. The redundant metadata takes little extra space, and this layout is recommended for btrfs raid0 setups.

It has always been suggested to duplicate the metadata, even on a single disk, since losing the metadata means losing the data as well.

Btrfs is able to change a raid profile on a live system, converting the data and metadata to provide (or remove) redundancy. This is done with a balance. A btrfs balance rewrites the filesystem’s blocks and converts them to the new profiles as it goes.
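As an illustration of how a profile conversion is expressed, the sketch below prints (but does not run) a balance command that converts both data and metadata. The helper names are hypothetical, and the list of profile names is an assumption based on current btrfs-progs; check `man btrfs-balance` for the version you have installed.

```shell
#!/bin/sh
# Succeed if $1 is a profile name accepted by the convert filter.
# Assumed list; older btrfs-progs releases may not know raid1c3/raid1c4.
is_known_profile() {
    case "$1" in
        single|dup|raid0|raid1|raid1c3|raid1c4|raid10|raid5|raid6) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the balance command for a data profile, a metadata profile,
# and a mount point. Nothing is executed.
# Usage: print_convert_cmd DATA_PROFILE METADATA_PROFILE MOUNTPOINT
print_convert_cmd() {
    if ! is_known_profile "$1" || ! is_known_profile "$2"; then
        echo "unknown profile" >&2
        return 1
    fi
    printf 'sudo btrfs balance start -dconvert=%s -mconvert=%s %s\n' "$1" "$2" "$3"
}

# Example: the two-disk layout described above (striped data, mirrored metadata)
#   print_convert_cmd raid0 raid1 /mnt
```

Printing the command first gives you a chance to review it before running a balance, which can take a long time on a large filesystem.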

The Bug

I attempted to change the metadata profile on my single disk from single to dup for redundancy.

$ sudo btrfs balance start -mconvert=dup /mnt

The filesystem immediately went into a read-only state. The system went down. When it rebooted, I could not get past the recovery initramfs. No data could be written to the disk.

When a balance operation is interrupted on a btrfs filesystem, it automatically resumes the next time the filesystem is mounted. This can usually be prevented by mounting with the skip_balance option:

$ sudo mount -o skip_balance,rw /dev/sdX /mnt

Then, cancel if needed using the following command:

$ sudo btrfs balance cancel /mnt

However, the bug not only caused the balance to lock up; the mount option meant to stop it from resuming was ignored as well. Every time the filesystem was mounted, the balance attempted to resume, failed, and the filesystem went read-only again. If you encounter this, you must boot a distro that uses an older kernel. In my case, it was Arch with 4.18.

Mount the filesystem with the older kernel:

$ sudo mount -o skip_balance,rw /dev/sdX /mnt

Cancel the balance:

$ sudo btrfs balance cancel /mnt

Perform the balance again:

$ sudo btrfs balance start -mconvert=dup /mnt
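The three steps above can be sketched as a single script. This is only an illustration: the function names and the `RUN` variable are made up for this example, and by default the commands are printed rather than executed so that nothing touches the disk by accident.

```shell
#!/bin/sh
# Run a command only when RUN=1 is set; otherwise just print it.
run_step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Usage: rescue_balance DEVICE MOUNTPOINT
# Intended to be used from the rescue system with the older kernel.
rescue_balance() {
    dev="$1"
    mnt="$2"
    # 1. Mount read-write without resuming the interrupted balance.
    run_step sudo mount -o skip_balance,rw "$dev" "$mnt" || return 1
    # 2. Cancel the interrupted balance.
    run_step sudo btrfs balance cancel "$mnt" || return 1
    # 3. Start the metadata conversion again.
    run_step sudo btrfs balance start -mconvert=dup "$mnt" || return 1
}

# Example (prints the three commands):
#   rescue_balance /dev/sdX /mnt
# Example (actually runs them):
#   RUN=1 rescue_balance /dev/sdX /mnt
```

The function stops at the first command that fails, so a failed mount does not lead to a balance being started on the wrong path.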

Once the balance is complete, you can safely boot into a newer kernel, now with duplicated metadata in the filesystem. Check the profiles used by the filesystem. You will see that you have two copies of the filesystem’s metadata and only one copy of the data:

$ sudo btrfs fi usage <mountpoint>
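If you want to check the profile from a script, the shorter output of `btrfs filesystem df` is easier to parse than the usage report. The sketch below assumes that output has lines of the form `Metadata, DUP: total=..., used=...`; the helper name `profile_of` is hypothetical.

```shell
#!/bin/sh
# Read `btrfs filesystem df` output on stdin and print the profile of the
# block group type given as $1 (Data, Metadata, or System).
# Assumes lines shaped like "Metadata, DUP: total=256.00MiB, used=112.00KiB".
profile_of() {
    awk -F'[,:] *' -v want="$1" '$1 == want { print $2 }'
}

# Example use:
#   sudo btrfs filesystem df /mnt | profile_of Metadata
```

Every matching line is printed, so a filesystem that is only partly converted would show more than one profile for the same type.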

Conclusion

Btrfs is an amazing filesystem capable of many advanced options. However, when using btrfs, you should have working backups as well as a bootable kernel from an LTS distro for system rescue. Even though a balance can be run on a mounted root filesystem, it is not always advised to do this. There are still many bugs in the filesystem. You should be prepared for a filesystem rescue when the btrfs module hasn’t been fully tested against bleeding-edge kernels.

About the author

Joseph M Gant

Joseph M Gant is a professional poet, fiction writer, and cyber-activist. His creative work has appeared widely in small and academic press, and his technical writing has appeared in various cybersecurity blogs. He is an Arch Linux enthusiast of 10 years, EFF supporter, and self-confessed datahoarder.
Find him on Twitter or on GitHub