The Design and Implementation of a Log-Structured File System

The Design and Implementation of a Log-Structured File System

1991 | Mendel Rosenblum and John K. Ousterhout
This paper presents a log-structured file system (LFS) designed to improve disk storage efficiency. The LFS writes all modifications sequentially to a log, which allows faster file writing and crash recovery. The log contains indexing information to efficiently read files. To maintain large free areas on disk, the log is divided into segments, and a segment cleaner compresses live data from fragmented segments. Simulations show that a simple cleaning policy based on cost and benefit is effective. A prototype LFS called Sprite LFS outperforms Unix file systems by an order of magnitude for small-file writes while matching or exceeding Unix performance for reads and large writes. Sprite LFS can use 70% of disk bandwidth for writing, whereas Unix systems typically use only 5-10%. The LFS is designed to handle the increasing disk I/O demands of modern applications. It assumes that files are cached in memory, making disk traffic dominated by writes. The LFS writes all new information to disk in a sequential log, eliminating almost all seeks and enabling faster crash recovery. Unlike traditional file systems, the LFS stores data permanently in the log, with no other structure on disk. The log contains indexing information to read files efficiently. To ensure large free space for writing, the LFS divides the log into segments and uses a segment cleaner to compress live data from fragmented segments. The cleaner segregates older, slowly changing data from young, rapidly-changing data and treats them differently during cleaning. Sprite LFS, a prototype LFS, is now in production use as part of the Sprite network operating system. Benchmark programs show that Sprite LFS is more than an order of magnitude faster than Unix for small-file writes, and at least as fast as Unix for reads and large writes. The LFS uses a segment-based approach to manage free space, dividing the disk into large fixed-size segments. The segment cleaner processes segments to compress live data and free up space. The LFS also uses a cost-benefit policy for segment cleaning, which selects segments based on the benefit of cleaning and the cost of cleaning. This policy allows cold segments to be cleaned at higher utilization than hot segments. The LFS also includes a crash recovery system that uses checkpoints and roll-forward to restore consistency after a crash. Checkpoints define consistent states of the file system, and roll-forward recovers information written since the last checkpoint. The LFS uses a two-phase process to create checkpoints, writing all modified information to the log and then writing a checkpoint region to a fixed position on disk. The LFS has been tested in production use and has shown improved performance compared to traditional file systems. It is used to manage five different disk partitions for about thirty users. The LFS is no more complicated than Unix FFS, with additional complexity for the segment cleaner compensated by the elimination of the bitmap and layout policies required by Unix FFS. The LFS also includes checkpointing and roll-forward code that is no more complicated than theThis paper presents a log-structured file system (LFS) designed to improve disk storage efficiency. The LFS writes all modifications sequentially to a log, which allows faster file writing and crash recovery. The log contains indexing information to efficiently read files. To maintain large free areas on disk, the log is divided into segments, and a segment cleaner compresses live data from fragmented segments. Simulations show that a simple cleaning policy based on cost and benefit is effective. A prototype LFS called Sprite LFS outperforms Unix file systems by an order of magnitude for small-file writes while matching or exceeding Unix performance for reads and large writes. Sprite LFS can use 70% of disk bandwidth for writing, whereas Unix systems typically use only 5-10%. The LFS is designed to handle the increasing disk I/O demands of modern applications. It assumes that files are cached in memory, making disk traffic dominated by writes. The LFS writes all new information to disk in a sequential log, eliminating almost all seeks and enabling faster crash recovery. Unlike traditional file systems, the LFS stores data permanently in the log, with no other structure on disk. The log contains indexing information to read files efficiently. To ensure large free space for writing, the LFS divides the log into segments and uses a segment cleaner to compress live data from fragmented segments. The cleaner segregates older, slowly changing data from young, rapidly-changing data and treats them differently during cleaning. Sprite LFS, a prototype LFS, is now in production use as part of the Sprite network operating system. Benchmark programs show that Sprite LFS is more than an order of magnitude faster than Unix for small-file writes, and at least as fast as Unix for reads and large writes. The LFS uses a segment-based approach to manage free space, dividing the disk into large fixed-size segments. The segment cleaner processes segments to compress live data and free up space. The LFS also uses a cost-benefit policy for segment cleaning, which selects segments based on the benefit of cleaning and the cost of cleaning. This policy allows cold segments to be cleaned at higher utilization than hot segments. The LFS also includes a crash recovery system that uses checkpoints and roll-forward to restore consistency after a crash. Checkpoints define consistent states of the file system, and roll-forward recovers information written since the last checkpoint. The LFS uses a two-phase process to create checkpoints, writing all modified information to the log and then writing a checkpoint region to a fixed position on disk. The LFS has been tested in production use and has shown improved performance compared to traditional file systems. It is used to manage five different disk partitions for about thirty users. The LFS is no more complicated than Unix FFS, with additional complexity for the segment cleaner compensated by the elimination of the bitmap and layout policies required by Unix FFS. The LFS also includes checkpointing and roll-forward code that is no more complicated than the
Reach us at info@study.space
[slides] The design and implementation of a log-structured file system | StudySpace