Fast non-volatile memories (NVMs) are appearing on the processor memory bus alongside DRAM, becoming non-volatile main memories (NVMMs). The resulting hybrid memory systems will provide software with low-latency, high-bandwidth access to persistent data. However, managing and accessing data stored in NVMM, keeping it consistent, and protecting it raise a host of challenges. Existing file systems built for spinning or solid-state disks introduce software overheads that would obscure the performance that NVMs should provide, but proposed NVMM file systems either incur similar overheads or fail to provide the strong consistency and integrity guarantees that applications require.
This thesis first presents NOVA, a log-structured file system designed to maximize performance on hybrid memory systems while providing strong consistency guarantees. NOVA adapts conventional log-structured file system techniques to exploit the fast random access that NVMs provide. In particular, it maintains separate logs for each inode to improve concurrency, appends fine-grained metadata to the log to provide low-overhead atomicity, and stores file data outside the log to minimize log size and reduce garbage collection costs. NOVA's logs provide metadata and data atomicity and focus on simplicity and reliability, keeping complex metadata structures in DRAM to accelerate lookup operations. For operations that span multiple logs, NOVA uses lightweight journaling to provide fast atomic transaction semantics. In case of system failure, the per-inode logging design provides vast parallelism and fast recovery.
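The per-inode logging idea described above can be illustrated with a toy, in-memory sketch. This is not NOVA's implementation: the names `ToyFS`, `Inode`, `WriteEntry`, `write_page`, and `recover` are hypothetical, Python lists and dictionaries stand in for NVMM, and the sketch ignores persistence ordering, garbage collection, and multi-log journaling. It only shows, under those assumptions, how a write can place data outside the log, append a small metadata entry, commit by advancing the log tail, and rebuild a DRAM index by replaying committed entries.

```python
from dataclasses import dataclass, field


@dataclass
class WriteEntry:
    """Fine-grained metadata record appended to one inode's log (hypothetical layout)."""
    page_no: int    # file page covered by this write
    data_page: int  # id of the data page, which lives outside the log


@dataclass
class Inode:
    log: list = field(default_factory=list)    # stands in for the persistent per-inode log
    tail: int = 0                              # committed log tail; entries at or past it are ignored
    index: dict = field(default_factory=dict)  # volatile DRAM index: page_no -> data_page


class ToyFS:
    def __init__(self):
        self.data_pages = {}  # stands in for NVMM data pages stored outside the logs
        self.next_page = 0
        self.inodes = {}

    def create(self, ino):
        self.inodes[ino] = Inode()

    def write_page(self, ino, page_no, data):
        inode = self.inodes[ino]
        # 1. Copy-on-write: the new data goes to a fresh page, never in place.
        pid = self.next_page
        self.next_page += 1
        self.data_pages[pid] = data
        # 2. Append a small metadata entry to this inode's own log.
        inode.log.append(WriteEntry(page_no, pid))
        # 3. Commit point: advancing the tail makes the entry visible.
        inode.tail = len(inode.log)
        # 4. Update the DRAM index used for fast lookups.
        inode.index[page_no] = pid

    def read_page(self, ino, page_no):
        inode = self.inodes[ino]
        pid = inode.index.get(page_no)
        return None if pid is None else self.data_pages[pid]

    def recover(self):
        """Rebuild every DRAM index by replaying committed log entries.

        Each inode's log is independent, so a real system could replay
        the logs in parallel; this sketch replays them sequentially.
        """
        for inode in self.inodes.values():
            del inode.log[inode.tail:]  # drop entries that were never committed
            inode.index = {}
            for entry in inode.log:
                inode.index[entry.page_no] = entry.data_page
```

In this sketch, an entry that was appended but whose tail update never happened is discarded during `recover`, so a reader sees either the old page or the new page, never a partial update.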
NOVA excels in metadata-intensive and write-intensive workloads. Experimental results show that NOVA provides 22% to 216× throughput improvement compared to state-of-the-art file systems, and 3.1× to 13.5× improvement compared to file systems that provide equally strong data consistency guarantees.
NVMM has different failure models from disks and SSDs. Disk I/O errors are not fatal to the operating system, but memory errors can hang the entire OS. Persistent memory makes the issue worse: the error is durable, and a system reboot cannot remove it. How to handle persistent memory errors is still an open question. This thesis presents NOVA-Fortis, a fault-tolerant file system that is based on NOVA and is both fast and resilient in the face of corruption due to media errors and software bugs. We identify and propose solutions for the unique challenges in adding fault tolerance and reliability techniques to an NVMM file system, and quantify the performance and storage overheads of these techniques. We find that NOVA-Fortis' reliability features consume 14.8% of the storage for redundancy and reduce application-level performance by between 2% and 38% compared to the same file system with the features removed. NOVA-Fortis outperforms DAX-aware file systems without reliability features by 1.5× on average. It outperforms reliable, block-based file systems running on NVMM by 3× on average.
Finally, the thesis evaluates existing applications and analyzes their file system access patterns. It finds that existing NVMM file systems such as ext4-DAX and xfs-DAX do not perform well under typical application access patterns such as write-ahead logging (WAL). The thesis addresses the issue from both the application side and the file system side. On the application side, it optimizes the access patterns and avoids the operations that result in high overhead. On the file system side, it proposes a new fine-grained, scalable journaling module design for ext4 that improves the WAL performance of databases and key-value stores. The thesis also analyzes file system scalability and NUMA effects on a multi-socket, multi-core machine, and proposes and implements several solutions to the scalability and NUMA issues.