Choosing a Linux file system for your application is an important decision. This tutorial describes some of the major Linux file systems and provides recommendations on the right file system to suit your application.
What Is a Linux File System
Almost every bit of data and programming needed to boot a Linux system and keep it working is saved in the file system. This includes the operating system itself, compilers, application programs, shared libraries, configuration files, log files, and media mount points.
File systems operate in the background. Like the rest of an operating system’s kernel, they’re largely invisible in everyday use.
A Linux file system is a built-in layer of the operating system that manages data storage. It controls how data is stored and retrieved.
In addition, it tracks metadata for each file, such as its name, size, timestamps, and permissions.
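To make the idea of file metadata concrete, here is a minimal Python sketch that asks the file system for a few of these fields through the standard `os.stat` call. The helper name `describe` and the file name `example.txt` are made up for illustration. Note that it reports the modification time, because a true creation time is not exposed uniformly across Linux file systems.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return a few of the metadata fields the file system keeps for `path`."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": st.st_size,
        "modified": time.ctime(st.st_mtime),
        "permissions": stat.filemode(st.st_mode),
        "inode": st.st_ino,
    }


# Create a throwaway file (hypothetical name) and inspect its metadata.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("hello")
    for key, value in describe(path).items():
        print(f"{key}: {value}")
```

None of this information lives inside the file's contents; the file system stores it separately and returns it on request.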
Ext4 File System
In 1992, the Extended File System, or ext, was launched specifically for the Linux operating system. It has its roots in the Minix operating system. In 1993, an update called the Second Extended File System, or ext2, was released, and it remained the default file system in many Linux distros for many years.
In 2001, ext2 was upgraded to ext3, which introduced journaling to protect against corruption in crashes or power failures.
Ext4 (Fourth Extended Filesystem) was introduced in 2008, and it has been the default Linux filesystem since 2010. It was designed as a progressive revision of the ext3 file system and overcame some limitations in ext3.
As a result, ext4 has significant advantages over its predecessor, such as improved design, better performance, reliability, and new features.
Nowadays ext4 is the default file system on most Linux distributions. With the standard 4 KiB block size, it supports individual files of up to 16 tebibytes and file systems of up to 1 exbibyte.
It also supports an unlimited number of sub-directories (the ext3 file system only supports up to 32,000). Further, ext4 is backward compatible with ext3 and ext2, allowing these older versions to be mounted with the ext4 driver.
There is a reason ext4 is the default choice for most Linux distributions. It’s tried, tested, stable, performs well, and is widely supported. So if you are looking for stability, ext4 is a strong choice.
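Despite its features, ext4 does not support transparent compression or data deduplication. Transparent encryption is available only at the file and directory level, through the kernel's fscrypt interface (supported by ext4 since Linux 4.1).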
XFS File System
XFS is a highly scalable file system developed by Silicon Graphics and first deployed in the Unix-based IRIX operating system in 1994. It is a journaling file system and, as such, keeps track of changes in a log before committing the changes to the main file system.
The advantage is guaranteed consistency of the file system and expedited recovery in the event of power failures or system crashes.
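The journaling idea can be sketched in a few lines of Python. This is a toy model, not XFS code: the class `JournaledStore`, its methods, and the file names are all invented for illustration. It assumes a change counts only once its commit marker is in the log, and that recovery replays committed entries and discards incomplete ones.

```python
class JournaledStore:
    """Toy 'disk' with a write-ahead journal (illustrative only)."""

    def __init__(self):
        self.data = {}      # stands in for the main file system
        self.journal = []   # log of intended changes

    def log(self, changes):
        """Record the intended changes in the journal, then mark them committed."""
        entry = {"changes": dict(changes), "committed": False}
        self.journal.append(entry)
        entry["committed"] = True   # the commit record has reached the log
        return entry

    def checkpoint(self):
        """Apply committed journal entries to the main store, then drop the log."""
        for entry in self.journal:
            if entry["committed"]:
                self.data.update(entry["changes"])
        self.journal.clear()

    def recover(self):
        """After a crash: replay committed entries, discard incomplete ones."""
        self.checkpoint()


store = JournaledStore()
store.log({"report.txt": "v2"})   # committed, but only in the journal so far
# An entry that was still being written when the power failed:
store.journal.append({"changes": {"draft.txt": "partial"}, "committed": False})
# Simulated crash here: nothing has reached the main store yet.
store.recover()
print(store.data)
```

Recovery only has to read the journal rather than scan the whole disk, which is why it is fast.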
XFS was originally designed to support extremely large filesystems with sizes of up to 16 exabytes and file sizes of up to 8 exabytes (on 64-bit Linux, the limit for both is 8 exbibytes). As a result, it has a long history of running on large servers and storage arrays.
One notable feature of the original IRIX implementation of XFS is Guaranteed Rate I/O (GRIO). This allows applications to reserve bandwidth. The file system calculates the available performance and adjusts its operation according to the existing reservations.
XFS has a reputation for operating in environments that require high performance and scalability and hence is routinely measured as one of the highest performing file systems on large systems with enterprise workloads.
Today XFS is supported by most Linux distributions and has become the default filesystem on Red Hat Enterprise Linux, Oracle Linux, CentOS, and many other distributions.
Best Use Cases for XFS File System
So, do you have a large server? Do you have large storage requirements or have a local, slow SATA drive?
If both your server and your storage device are large, and you will not need to reduce the filesystem size later (XFS can be grown but is not designed to be shrunk), XFS is likely to be the best choice.
XFS is an excellent filesystem that scales well for large servers. But even with smaller storage arrays, XFS performs very well when the average file sizes are large, for example, hundreds of megabytes in size.
Btrfs File System
Btrfs is the next-generation general-purpose Linux file system that offers unique features like advanced integrated device management, scalability, and reliability. It is licensed under the GPL and open for contribution from anyone. Different names are used for the file system, including “Butter FS,” “B-tree FS,” and “Better FS.”
Btrfs development began at Oracle in 2007. It was merged into the mainline Linux kernel in 2009 and debuted in the Linux 2.6.29 release.
Btrfs is not a successor to the ext4 file system used by default in most Linux distributions. Instead, it is a copy-on-write (CoW) file system intended to address various weaknesses in current Linux file systems, and it offers better scalability and reliability.
Btrfs primarily focuses on fault tolerance, self-healing properties, and easy administration.
Btrfs supports partitions of up to 16 exbibytes and files of the same size. If the numbers are confusing, all you need to know is that a Btrfs volume can hold up to sixteen times as much data as an ext4 volume, which tops out at 1 exbibyte.
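For readers who want to check the comparison, the arithmetic can be worked through in Python. The sketch assumes the common 4 KiB block size, 48-bit block numbers for an ext4 volume, 32-bit block numbers within a single ext4 file, and 64-bit byte addressing in Btrfs. The variable names are only for illustration.

```python
KIB = 2**10
TIB = 2**40
EIB = 2**60

BLOCK_SIZE = 4 * KIB  # assumed block size

# ext4: 48-bit block numbers per volume, 32-bit block numbers per file.
ext4_max_volume = 2**48 * BLOCK_SIZE
ext4_max_file = 2**32 * BLOCK_SIZE

# Btrfs: 64-bit byte addresses.
btrfs_max_volume = 2**64

print("ext4 volume limit (EiB):", ext4_max_volume // EIB)
print("ext4 file limit (TiB):", ext4_max_file // TIB)
print("Btrfs volume limit (EiB):", btrfs_max_volume // EIB)
print("Btrfs / ext4 volume ratio:", btrfs_max_volume // ext4_max_volume)
```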
How Does Copy-on-Write Work and Why Would You Want it
On a traditional file system, modifying a file means reading the data, changing it, and writing it back to the same place. A copy-on-write file system reads the data, modifies it, and writes it to a new location. This prevents data loss during the read-modify-write cycle because the old data remains intact on disk until the new data is safely written.
Since you don’t “repoint” until the new block is entirely written out, if you lose power or crash in the middle of a write, you end up with either the old block or the new block, but not a half-written corrupted block. So you don’t need to fsck filesystems on startup, and you lower your risk of data corruption.
You can snapshot the filesystem at any point, creating a snapshot entry in the metadata with the current set of pointers.
This protects old blocks from being garbage collected later on and allows the filesystem to present a volume as it was during the snapshot. In other words, you have instant rollback capabilities. You can even clone that volume to make it a writable volume based on the snapshot.
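The mechanism described above can be sketched as a toy Python model. It is not how Btrfs is implemented. The class `CowVolume` and its methods are hypothetical, and a real file system works on fixed-size blocks and trees rather than a dictionary. It is only meant to show why repointing makes snapshots and rollback cheap.

```python
class CowVolume:
    """Toy copy-on-write volume: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes (only ever added to)
        self.pointers = {}    # file name -> block id (the live metadata)
        self.snapshots = {}   # snapshot name -> frozen copy of the pointers
        self._next_id = 0

    def write(self, name, data):
        # Write the new data to a fresh location first...
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        # ...and only then repoint the file to it.
        self.pointers[name] = block_id

    def read(self, name, snapshot=None):
        table = self.pointers if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]

    def snapshot(self, snap_name):
        # A snapshot is just a copy of the current set of pointers.
        self.snapshots[snap_name] = dict(self.pointers)

    def rollback(self, snap_name):
        self.pointers = dict(self.snapshots[snap_name])


vol = CowVolume()
vol.write("notes.txt", b"version 1")
vol.snapshot("before-edit")
vol.write("notes.txt", b"version 2")

print(vol.read("notes.txt"))                          # live data
print(vol.read("notes.txt", snapshot="before-edit"))  # data as of the snapshot
vol.rollback("before-edit")
print(vol.read("notes.txt"))                          # live data after rollback
```

Because the old block is still referenced by the snapshot, it is never reclaimed, and rolling back is just a matter of restoring the old pointers.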
Your other choice is ZFS on Linux, which may be more stable, but requires a few more steps to install on typical Linux distributions.
Btrfs Features
- Copy on Write (CoW) and snapshotting – Make incremental backups painless even from a “hot” filesystem or virtual machine (VM).
- Checksums – Btrfs stores checksums for data and metadata blocks and uses them to detect errors and, where a redundant copy exists, repair them.
- Compression – Files may be compressed and decompressed on the fly, which can speed up reads by reducing the amount of data fetched from disk.
- Auto defragmentation – A background thread tunes the filesystems while they are in use.
- Subvolumes – Subvolumes share a single pool of space instead of each being put into its own partition.
- RAID – Btrfs has its own RAID implementation, so LVM or mdadm is not required to have RAID. Currently, RAID 0, 1, and 10 are supported. RAID 5 and 6 are considered unstable.
- Partitions are optional – While Btrfs can work with partitions, it has the potential to use raw devices (/dev/<device>) directly.
- Data deduplication – There is limited data deduplication support; however, deduplication will eventually become a standard feature in Btrfs. This enables Btrfs to save space by comparing files via binary diffs.
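As a rough illustration of the compression feature in the list above, the following Python sketch compresses data on "write" and decompresses it on "read" with the standard `zlib` module. The functions `store` and `load` are invented stand-ins. Btrfs does this inside the kernel and per extent, so treat this only as a picture of the idea.

```python
import zlib


def store(data: bytes) -> bytes:
    """Compress data before it is 'written to disk'."""
    return zlib.compress(data)


def load(stored: bytes) -> bytes:
    """Decompress transparently when the data is read back."""
    return zlib.decompress(stored)


# Repetitive data, such as log files, tends to compress well.
text = b"log line: service started\n" * 1000
on_disk = store(text)

assert load(on_disk) == text  # the application sees the original bytes
print(len(text), "logical bytes stored as", len(on_disk), "bytes")
```

Fewer bytes on disk also means fewer bytes to read back, which is where the read-performance benefit comes from on slow storage.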
Btrfs is designed to need little administration once it is set up. In normal operation, you should not have to run the fsck command on it. When errors or inconsistencies arise, it is meant to detect and handle them itself.
Although parts of Btrfs are still considered experimental and it remains under active development, the time when Btrfs will become the default filesystem for Linux systems is getting closer. Some Linux distributions, such as openSUSE and Fedora, have already switched to it in their current releases.
If you aren’t afraid of dealing with a somewhat less mature ecosystem, Btrfs may be the better option for you.
ZFS File System
ZFS (Zettabyte File System) has remained one of the most technically advanced and feature-complete filesystems since it appeared in October 2005. Created by Sun Microsystems, it combines a local filesystem (like ext4) and a logical volume manager (like LVM).
ZFS was published under an open-source license, the CDDL. After Oracle bought Sun Microsystems in 2010, Oracle's own ZFS development became closed source, while the open-source code continues as the OpenZFS project.
You can think of ZFS as a volume manager and a RAID array in one: you can add extra disks to your ZFS volume, and the additional space becomes available to your file system. In addition, ZFS comes with some other features that traditional RAID doesn’t have.
ZFS depends heavily on memory for caching, so a common recommendation is to start with at least 8 GB. In practice, use as much as your hardware and budget allow.
ZFS is commonly used by data hoarders, NAS users, and other geeks who prefer to put their trust in a redundant storage system of their own rather than the cloud. It’s a fantastic file system to manage multiple disks of data and rivals some of the superb RAID setups.
ZFS is similar to other storage management approaches, but it’s radically different in some ways. For example, ZFS does not usually use the Linux Logical Volume Manager (LVM) or disk partitions, and it’s generally convenient to delete partitions and LVM structures before preparing media for a zpool.
The zpool is the analog of an LVM volume group. A zpool spans one or more storage devices, and members of a zpool may be of various types. The basic storage elements are single devices, mirrors, and raidz. All of these storage elements are called vdevs.
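The relationship between a zpool and its vdevs can be pictured with a small Python data model. This is a sketch, not a ZFS tool: the classes `Vdev` and `Zpool`, the pool name `tank`, and the disk sizes are all made up. The capacity figures are rough approximations that ignore metadata and padding overhead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vdev:
    kind: str                   # "single", "mirror", or "raidz1"
    disk_sizes_tb: List[float]

    def usable_tb(self) -> float:
        smallest = min(self.disk_sizes_tb)
        count = len(self.disk_sizes_tb)
        if self.kind == "single":
            return self.disk_sizes_tb[0]
        if self.kind == "mirror":
            return smallest                 # every disk holds the same data
        if self.kind == "raidz1":
            return smallest * (count - 1)   # about one disk's worth of parity
        raise ValueError(f"unknown vdev kind: {self.kind}")


@dataclass
class Zpool:
    name: str
    vdevs: List[Vdev]

    def usable_tb(self) -> float:
        # Data is spread across all vdevs, so their capacities add up.
        return sum(v.usable_tb() for v in self.vdevs)


# A hypothetical pool built from one two-disk mirror and one three-disk raidz1.
pool = Zpool("tank", [Vdev("mirror", [4, 4]), Vdev("raidz1", [4, 4, 4])])
print(pool.name, "usable capacity (TB, approximate):", pool.usable_tb())
```

The model also shows where redundancy lives: inside each vdev, not in the pool as a whole.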
ZFS can enforce storage integrity far better than any RAID controller, as it has intimate knowledge of the filesystem structure. As a result, data safety is an important design feature of ZFS. All blocks written in a zpool are aggressively checksummed to ensure the data’s consistency and correctness.
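Here is a minimal Python sketch of how checksums combined with redundancy allow silent corruption to be detected and repaired. It is illustrative only: the class `MirroredBlock` is invented, and SHA-256 is used simply because it is in the standard library, not because it mirrors ZFS's default checksum.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """One logical block kept on two 'disks', plus its expected checksum."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        # Find the first copy whose contents still match the checksum.
        good = None
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                break
        if good is None:
            raise IOError("all copies failed checksum verification")
        # Self-heal: overwrite any corrupted copy with the verified data.
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) != self.expected:
                self.copies[i] = bytearray(good)
        return good


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0xFF   # simulate a silent bit flip on the first disk
print(block.read())          # served from the healthy copy
print(bytes(block.copies[0]) == b"important data")  # first copy was repaired
```

A plain RAID controller sees only blocks and cannot tell which copy is the correct one. The checksum is what lets the file system decide.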
For server use where you want to eliminate almost entirely any possibility of data loss and stability is the name of the game, you may want to look into ZFS.
ZFS Features
- Endless scalability – Well, it’s not technically endless, but it’s a 128-bit filesystem capable of managing zettabytes (one zettabyte is one billion terabytes) of data. Therefore, no matter how much hard drive space you have, ZFS will be suitable for managing it.
- Maximum integrity – Everything you do inside of ZFS uses a checksum to ensure file integrity. As a result, you can rest assured that your files and their redundant copies will not encounter silent data corruption. Also, while ZFS is busy quietly checking your data for integrity, it will do automatic repairs anytime it can.
- Drive pooling – The creators of ZFS want you to think of it as being similar to how your computer uses RAM. When you need more memory in your computer, you put in another stick, and you’re done. Similarly, with ZFS, when you need more hard drive space, you put in another hard drive, and you’re done. There is no need to spend time partitioning, formatting, initializing, or doing anything else to your disks. When you need a bigger storage “pool,” just add disks.
- RAID – ZFS is capable of many different RAID levels while delivering performance comparable to that of hardware RAID controllers. This allows you to save money, make setup easier, and access superior RAID levels that ZFS has improved upon.
Reiser4 File System
ReiserFS is a general-purpose, journaled computer file system initially designed and implemented by a team at Namesys led by Hans Reiser. Introduced in version 2.4.1 of the Linux kernel, it was the first journaling file system included in the standard kernel.
With the exception of security updates and critical bug fixes, development on ReiserFS has ceased. Reiser4 is the successor to ReiserFS. It adds encryption, improved performance, and much more.
Reiser4 requires a patched kernel. It is still not included in the official Linux kernel, although patches for Linux 5.x are available. The reason usually given for its absence is that further testing is required.
According to its developers, Reiser4 provides very efficient disk space usage across a wide range of scenarios and workloads. Like ReiserFS, it offers advantages over other file systems, especially when handling a large number of small files.
It supports journaling for fast recovery in case of problems, and its file system structure is based on trees. On the downside, Reiser4 consumes a little more CPU than other filesystems.
Reiser4 has a unique ability to optimize disk space occupied by small files (less than one block). This is because they are stored entirely in their inode, without allocating blocks in the data area.
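A little arithmetic shows why this matters. The Python sketch below compares the space used when every small file occupies a whole block of its own with the logical size of the same files. It assumes a 4 KiB block size and an invented workload of ten thousand 100-byte files. The logical size is only a lower bound for packed storage, since metadata still takes some room.

```python
BLOCK_SIZE = 4096  # bytes; a common block size, assumed here


def block_allocated(file_size, block_size=BLOCK_SIZE):
    """Space consumed when a file must occupy whole blocks of its own."""
    blocks = (file_size + block_size - 1) // block_size  # round up
    return blocks * block_size


file_sizes = [100] * 10_000   # made-up workload: ten thousand 100-byte files

logical = sum(file_sizes)
whole_blocks = sum(block_allocated(size) for size in file_sizes)

print("logical data (bytes):", logical)
print("with one block per file (bytes):", whole_blocks)
print("overhead factor:", whole_blocks / logical)
```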
As well as implementing the traditional Linux filesystem functions, Reiser4 provides users with some additional features: transparent compression and encryption of files, full data journaling, and almost unlimited (with the help of plug-in architecture) extensibility.
However, there is no support for direct IO (work has begun on implementation), quotas, or POSIX ACLs.
Conclusion
Choosing the file system that satisfies your specific application needs requires researching and comparing various parameters.
This article outlines the benefits of the ext4, ZFS, XFS, Btrfs, and Reiser4 file systems to help you decide on the right file system for your application environment.
Thank you for spending your time here.