BTRFS: A Better FS?

By David Chen @TheEgghead27

Lately, I've been using BTRFS as my root filesystem for its deduplication and filesystem compression (all those Proton prefixes for my games add up...). It's worked flawlessly, despite all those warnings from forum posts about what happens when you have a full disk.

A friend of mine did run out of space on BTRFS: he lost a few files he had generated and ended up with an unbootable system, because bootloader updates were failing when they tried to write to disk.

Perhaps BTRFS melts like butter in this case, but it has a few helpful tricks to delay the inevitable fate of storage crunch!

Filesystem Compression

If Windows/NTFS can do it, why not Linux?

BTRFS, through the btrfs filesystem command, supports online defragmentation, which can also compress files as it rewrites them! You can Recursively Compress / with zstd using the following command (as root, such as with sudo or doas):

btrfs filesystem defragment -r -czstd /

There are a few compression algorithms to choose from: zlib, lzo, and zstd. Generally, zstd is the most performant option with great ratios; I went from using 134/150 GiB to 91/150 GiB on my main disk just by running this command!
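If you want to measure results like these yourself, the compsize tool (packaged separately from btrfs-progs on most distros) reports on-disk vs. uncompressed sizes for a BTRFS path. A minimal sketch, guarded so it's a no-op on systems without the tool:

```shell
# compsize prints per-algorithm disk usage vs. logical file size for a
# BTRFS path; it needs root to read extent info. Guarded so the snippet
# degrades gracefully where compsize isn't installed.
if command -v compsize >/dev/null 2>&1; then
    report=$(compsize / 2>&1 || true)
else
    report="compsize not available; install it to see compression ratios"
fi
echo "$report"
```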

You can also set a specific file to have a certain compression algorithm:

btrfs property set <file> compression <algorithm>

The algorithms are the same as above.
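As a sketch, you could wrap the set-then-verify pattern in a tiny helper (set_compression is my own name, not a btrfs-progs command; it assumes btrfs-progs is installed and the path lives on BTRFS):

```shell
# Hypothetical helper: set per-file (or per-directory) compression, then
# read the property back to confirm it stuck. Newly written data inherits
# the setting; existing data is only recompressed on rewrite or defrag.
set_compression() {
    btrfs property set "$1" compression "$2" &&
    btrfs property get "$1" compression
}
# Usage: set_compression ~/Games/prefix zstd
```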

Applying compression at mount

You can pass options to the mount command with the -o flag. For compression with zstd, you can call:

mount -o compress=zstd <device> <mountpoint>

Alternatively, if you would like this to always be applied when your filesystem is mounted, you can set this in your /etc/fstab file:

/dev/sda2 / btrfs defaults,compress=zstd 0 1

Deduplication

One of the biggest features of butterFS (BTRFS) is that it implements Copy-on-Write (CoW): you can make a copy of a file that uses no extra storage until one of the copies is written to, changing its data. This is done like so:

cp --reflink <source> <destination>

Even for large files, the "copy" happens nearly instantly, because only metadata is being written to disk.

This works similarly to a hard link (blogpost also pending!), where your filesystem has two file names pointing to the same data on disk; but unlike a hard link, you can safely write to one as if it were a distinct file, keeping the other one untouched!
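Here's a small demonstration of that behavior. Note the --reflink=auto mode shares extents on BTRFS but silently falls back to a plain copy on filesystems without reflink support, so this sketch runs anywhere; use --reflink=always if you'd rather the command fail loudly than copy:

```shell
# Make a file, reflink-copy it, then modify only the copy. On BTRFS the
# copy is near-instant (metadata only) and writes trigger new extents;
# the original stays byte-for-byte untouched either way.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/original" bs=1M count=4 2>/dev/null
cp --reflink=auto "$workdir/original" "$workdir/copy"
echo "changed" >> "$workdir/copy"
result=$(cmp -s "$workdir/original" "$workdir/copy" && echo identical || echo differ)
echo "after writing to the copy, the files: $result"
rm -r "$workdir"
```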

duperemove

In addition to making your own reflinks, the duperemove command can read filesystem extents (the blocks of data backing a file) and hash them to determine where there might be duplicates. Duplicates can come from identical files, but also from similar data such as headers, like the Windows PE headers in the executables of a Wine prefix.

To Recursively scan files in a folder for duplicated data and print file size numbers in a Human-readable format, use

duperemove -rh <folder>

and to actually Deduplicate any found extents (on BTRFS or XFS filesystems), add the -d argument:

duperemove -drh <folder>

If you intend to repeatedly run duperemove on a folder, such as your Steam files, you can even save the hashes duperemove makes into a hashfile, allowing it to only rehash files that have changed since the last time you ran it. When I deduplicate files in my personal Steam folder, which includes Proton prefixes, I usually run:

duperemove --hashfile=steam-hashfile -drh ~/.local/share/Steam/
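If you run this regularly, a small wrapper keeps the hashfile in a predictable place so you don't scatter them around. A sketch under my own naming (dedupe_dir and the cache path are assumptions, not anything duperemove provides; it requires duperemove and a BTRFS or XFS target):

```shell
# Hypothetical wrapper: one persistent hashfile per deduplicated folder,
# stored under ~/.cache, so repeat runs only rehash changed files.
dedupe_dir() {
    mkdir -p "$HOME/.cache/duperemove"
    duperemove --hashfile="$HOME/.cache/duperemove/$(basename "$1").hash" -drh "$1"
}
# Usage: dedupe_dir ~/.local/share/Steam
```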

Enjoy the extra space and delayed, but always looming threat of running out...
