Multidisk Filesystems: A Comparison

With the death of Kryder's law[1], hard drive density has been crawling along slowly in comparison to the exponential growth of yesterdecade. The largest 3.5" hard drives available for purchase are only 16TB, with drives up to 20TB slated for release later this year; a measly 1.3x per year from the 2TB drives of 2012. Data continues to grow, however, with the rise of Deep Learning on ever larger datasets, Big Data, and growing archival efforts[2].
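
To put that figure in context: 2TB to 16TB is an 8x increase, which compounds to roughly 1.3x per year over eight years. A quick back-of-the-envelope check (the 2020 endpoint is an assumption; the post only says "later this year"):

```python
# Back-of-the-envelope check of the ~1.3x/year figure quoted above.
start_tb, end_tb = 2, 16       # largest 3.5" drives in 2012 vs. today
years = 2020 - 2012            # assumed endpoint

growth_per_year = (end_tb / start_tb) ** (1 / years)
print(f"{growth_per_year:.2f}x per year")  # -> 1.30x per year
```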

As such, it makes sense to look at solutions for utilizing multiple hard drives for increased capacity, speed, and reliability. Traditionally, the choice has been between different levels of RAID implemented in software or hardware, but today there are many different filesystems built from the ground up to support multiple disks. This guide is meant to lay out the many different choices in a simple, easy-to-digest format.

This post is still a work in progress and will continue to be updated as the filesystem landscape changes.

At a Glance

         | Replication | Parity | Resizing² | Snapshots | Compression | Tiering  | Integrity | Stable?
ZFS      | Yes         | 1/2/3  | No        | Yes       | Yes         | Limited  | Yes       | Yes
mdadm    | Yes         | 1/2    | Yes       | Depends¹  | Depends¹    | Depends¹ | No        | Yes
bcachefs | Yes         | WIP    | No?       | WIP       | Yes         | Yes      | Yes       | Beta
btrfs    | Yes         | Buggy  | No?       | Yes       | Yes         | Yes      | Yes       | Yes

¹ When composed with other tools.
² Refers specifically to adding additional (identical) drives, one at a time, to an existing parity filesystem.

Details

ZFS

ZFS is arguably the most feature-rich single filesystem, and a stable one at that. It supports one to three disks of parity, snapshots, compression, and integrity checks. Many NAS systems, such as FreeNAS, use ZFS.
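
For a concrete picture of those features, here is a minimal sketch (not a production recipe) of creating a double-parity raidz2 pool, enabling compression, taking a snapshot, and scrubbing for integrity. The pool name tank and the /dev/sd* paths are placeholders; the script simply shells out to the standard zpool/zfs tools and must run as root against real devices.

```python
# Illustrative sketch only: a raidz2 pool with compression, a snapshot, and a scrub.
# Pool name and device paths are placeholders.
import subprocess

def run(*cmd):
    """Print and execute a command, stopping on failure."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

devices = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]  # placeholder disks

run("zpool", "create", "tank", "raidz2", *devices)  # pool with 2 disks of parity
run("zfs", "set", "compression=lz4", "tank")        # transparent compression
run("zfs", "snapshot", "tank@initial")              # cheap point-in-time snapshot
run("zpool", "scrub", "tank")                       # verify checksums of all data
```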

However, it does have several major weaknesses. ZFS parity vdevs cannot be expanded, forcing upgrades to either add more parity vdevs (striped together, essentially in RAID 0) or replace the entire pool at once, something that can become prohibitively expensive for home consumers. While the implementation of pool expansion is currently in progress, it has been in progress for several years now. ZFS also has very limited storage tiering: the main form of SSD caching is L2ARC, which places high demands on RAM. L2ARC also isn't persistent across reboots (yet).
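
To make the expansion constraint concrete: you cannot add a single drive to an existing raidz vdev, but you can attach a second, complete vdev, which ZFS then stripes alongside the first. A sketch, again with placeholder names:

```python
# Illustrative sketch: growing a ZFS pool today means adding a whole new vdev
# (here a second 4-disk raidz2), which the pool stripes across RAID 0-style.
# Pool name and device paths are placeholders.
import subprocess

subprocess.run(
    ["zpool", "add", "tank", "raidz2",
     "/dev/sde", "/dev/sdf", "/dev/sdg", "/dev/sdh"],
    check=True,
)
```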

mdadm

mdadm does RAID and just RAID. It works at the block level, so many other tools, like bcache and LVM, can be composed with it to provide additional features. This does add additional moving parts, though, increasing the chance of failure. Additionally, filesystems on top of mdadm have no way of coordinating with it, so if any bitrot occurs, filesystems with integrity checking will not be able to repair the damaged data using mdadm's redundancy.
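
For comparison, a minimal sketch of the composed setup described above: an mdadm RAID 6 array with an ordinary filesystem on top, plus the one-drive-at-a-time grow that the Resizing column in the table refers to. The device names, array name, and choice of ext4 are all placeholders.

```python
# Illustrative sketch: a RAID 6 array built with mdadm and a plain filesystem
# on top. mdadm itself knows nothing about the filesystem above it.
import subprocess

def run(*cmd):
    """Print and execute a command, stopping on failure."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

disks = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]  # placeholder disks

run("mdadm", "--create", "/dev/md0",
    "--level=6", f"--raid-devices={len(disks)}", *disks)   # 2 disks of parity
run("mkfs.ext4", "/dev/md0")                               # filesystem sees only /dev/md0

# Growing the array later, one identical drive at a time:
run("mdadm", "--add", "/dev/md0", "/dev/sde")              # add a placeholder spare
run("mdadm", "--grow", "/dev/md0", "--raid-devices=5")     # reshape onto it
```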

bcachefs

Bcachefs is the new challenger. Still under development, bcachefs is nonetheless shaping up to be a very promising next-gen filesystem with plans for almost all of the major features. Development is active, although nearly all of it seems to be done by Kent Overstreet himself.

btrfs

Btrfs is another feature-rich filesystem. However, btrfs parity is buggy, and there have been many reports of btrfs eating data. With proper backups (which you should be keeping anyways!) and avoidance of the troublesome features, though, this shouldn't be a big issue for non-production systems.
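
If you want btrfs redundancy while steering clear of the parity (raid5/6) code, the usual approach is the raid1 profile for both data and metadata. A minimal sketch, with placeholder devices and mount point:

```python
# Illustrative sketch: a btrfs filesystem using the raid1 profile for both data
# and metadata, avoiding the troublesome raid5/6 (parity) code paths.
# Device names and mount point are placeholders.
import subprocess

def run(*cmd):
    """Print and execute a command, stopping on failure."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run("mkfs.btrfs", "-d", "raid1", "-m", "raid1", "/dev/sda", "/dev/sdb")
run("mount", "-o", "compress=zstd", "/dev/sda", "/mnt/pool")  # transparent compression
run("btrfs", "scrub", "start", "/mnt/pool")                   # checksum-based integrity scan
```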


  1. Kryder's law is essentially the Moore's law of storage density, stipulating an exponential growth in areal storage density of magnetic disks. ↩︎

  2. Not to mention Linux ISO collectors, of whom there are a surprisingly large number. ↩︎

...