CS 111: Operating System Principles

Lecture 9: File Systems

Date Presented: Wednesday, February 13, 2013

Professor: Paul Eggert

Scribe: Alexander Lin

Introduction

Most programs need a place to access, process, and store data. Although RAM provides fast working storage, it is volatile and too small to meet the demands of most applications. File systems are implemented to manage persistent data on a much larger scale. An adequate file system must provide this capacity and persistence without sacrificing performance, despite the hardware trends described below.

CPU performance has increased at a non-linear (roughly exponential) rate over time, while disk performance has improved at only a steady, nearly constant rate. As a result, CPU performance has far surpassed disk performance, and file systems must be designed to mask this growing gap.

Figure 1: Performance Graph

The Disk Controller

Figure 2: The Disk Controller

A hard disk contains several platters that store data. An arm assembly carries a read/write head for each platter surface. As the disk rotates about its spindle, sectors of the platters spin under the heads and are read into a buffer. Because the heads are aligned and move together, the tracks under the heads at a given arm position form a cylinder, and a cylinder of data can be read per disk rotation. The arm selects which cylinder is accessed by moving the heads across the platters. To read a specific sector, the disk controller must seek the arm to the correct cylinder and then wait for the sector to rotate under the head.

Devices

Device Metrics

Hard Disk Drives (HDD)

A hard disk drive is a storage device that uses disks to store data. The drive contains a disk controller for managing data on a hard disk. Many of the metrics of the hard disk are influenced by the spinning disk.

Hard Disk Drive Example: Seagate Barracuda LP 2

Device Metric                     Seagate Barracuda LP 2
Total Storage                     2 TB
Cost                              $150
External Transfer Rate            300 MBits/s
Sustained Transfer Rate           90 MBits/s
Average Latency                   5.5 ms
Average Seek Time                 10 ms
Cache Size                        32 MB
Disk Rotation Speed               5900 rpm (98.33 Hz)
Contact Starts and Stops          50,000
Power (Operating)                 6.8 W
Power (Idle)                      5.5 W
Read Error Rate                   10^-14 nonrecoverable read errors per bit
Annualized Failure Rate (AFR)     0.32%
Operating Shock                   70 G
Non-Operating Shock               300 G
Noise (Operating)                 2.6 bels
Noise (Idle)                      2.5 bels

Solid-State Disks (SSD)

An SSD is a storage device that uses integrated circuits to store data. Data is stored electronically in the circuits instead of on a spinning disk. SSDs contain no moving mechanical parts, so they produce less noise, have lower access times, and are less fragile than hard disks.

SSD Example: Corsair Force GT

Device Metric                 Corsair Force GT
Total Storage                 60 GB
Cost                          $90
External Transfer Rate        3 GBits/s
Sequential Read Rate          280 MBits/s
Sequential Write Rate         270 MBits/s
Mean Time Between Failures    1 million hours
Shock Capacity                1000 G
Maximum Power                 20 W

Comparison

Device Metric             Seagate Barracuda LP 2    Corsair Force GT
Total Storage             2 TB                      60 GB
External Transfer Rate    300 MBits/s               3 GBits/s
Cost                      $150                      $90

Performance Strategies

Speculation

A file system uses speculation to improve performance by predicting requests ahead of time and processing them before the application makes them. By finishing work early, the file system masks the delay of the device.

Example: Prefetching

Suppose an application reads blocks in a predictable pattern. The file system can predict which blocks the application will request next and read them ahead of time. For example, if an application reads adjacent blocks in sequence, the file system can read the next adjacent block before the application requests it.

Example: Batching

Suppose an application makes many requests to read data from a given location. Instead of reading a single block for each request, the file system can read a large batch of blocks at once to prepare for future requests.

Dallying

When the application issues a request, the file system waits instead of processing it immediately. The delayed request can then be processed together with later requests, rather than each one individually. Time is saved when processing requests together avoids repeating similar work for each request.

Example: Batching

Data to be written is first placed in a buffer. Instead of performing a disk write for every write request, the file system delays writing until the buffer is full. Batching increases the amount of data per write but reduces the number of disk writes, which saves time.

Example: Reading

Suppose an application requests a read from a given location and may soon request more reads from the same location. The file system can delay processing and wait briefly for those future requests, then satisfy them together. This reduces the amount of seeking the file system must do.

Locality of Reference

In practice, data requests have patterns which can be analyzed to predict the application's next request.

Spatial Locality

The file system can take advantage of spatial locality if the application repeatedly requests data at adjacent locations. Speculation can be used to fetch data at adjacent locations before the application requests it. Dallying can be used to wait for all requests near the current location before accessing the disk.

Figure 3: Spatial Locality

Temporal Locality

An application may make several requests for data in a small location during a given time interval. Dallying can be used to wait for all requests from the current location.

Figure 4: Temporal Locality

Performance Metrics

Example Problem

Simple Problem

Consider a machine with the following properties:

	Sending a command to the disk: 5 microseconds
	Disk latency: 50 microseconds
	Reading one byte with a PIO (inbyte): 1 microsecond
	Processing one byte: 0.125 microseconds

The following code reads and processes one character at a time:

	for(;;) {
		char c;
		if(sys_read(&c) < 0) {
			break;
		}
		process(c);
	}
			

Analysis of Simple Problem

The latency per read is:

	  5 microseconds (send command)
	+ 50 microseconds (disk latency)
	+ 1 microsecond (inbyte)
	+ 0.125 microseconds (compute)
	= 56.125 microseconds
				

A machine that uses this simple strategy for reading and processing bytes experiences high latency and low throughput: each byte takes 56.125 microseconds, so throughput is only about 17,800 bytes per second. Utilization is also very low: the CPU spends only 0.125 of every 56.125 microseconds computing, about 0.2%.

Batching Problem

Instead of processing only one byte at a time, read a batch of 40 bytes per request to increase throughput.

	for(;;) {
		char buf[40];
		if(sys_read(buf, 40) < 0) {
			break;
		}
		process(buf, 40);
	}
			

Analysis of Batching Problem

The latency per read is:

	  5 microseconds (send command)
	+ 50 microseconds (disk latency)
	+ 40 microseconds (1 PIO/Byte * 40 Bytes)
	+ 5 microseconds (0.125 microseconds/computation * 40 computations)
	= 100 microseconds
				

A machine that uses batching experiences lower latency per byte (100/40 = 2.5 microseconds, versus 56.125 for the simple strategy) and higher throughput (40 bytes per 100 microseconds = 400,000 bytes per second). Utilization rises to 5/100 = 5% but is still small.

Overlapping Input and Computations

The elapsed time for processing requests can be improved with device interrupts, which allow the PIOs for one request to be handled in parallel with the computation for another.

The following pseudocode handles device interrupts:

	do {
		block until there is an interrupt
		handle_interrupt()
	} while (more data);

	handle_interrupt() {
		issue PIOs for next request
		compute
	}
			

Analysis of Overlapping Problem (Part 1)

The latency per 40-byte read is:

	  5 microseconds (send command)
	+ 50 microseconds (disk latency)
	+ 40 microseconds (PIO)
	+ "5 microseconds" (computation) (done in parallel with PIO ==> = 0 microseconds)
	= 95 microseconds
				

Analysis of Overlapping Problem (Part 2)

The latency per 400-byte read is:

	  5 microseconds (send command)
	+ 50 microseconds (disk latency)
	+ 400 microseconds (PIO)
	+ "50 microseconds" (computation) (done in parallel with PIO ==> = 0 microseconds)
	= 455 microseconds
				

Direct Memory Access (DMA)

Direct memory access allows the disk controller to transfer data to memory on its own, without the CPU issuing a PIO for each byte. While the DMA controller completes the transfer, the CPU is free to perform computation.

Analysis of DMA

The latency per read is:

	  5 microseconds (DMA setup)
	+ 50 microseconds (disk latency)
	+ "40 microseconds" (PIO) (done in parallel with other work ==> = 0 microseconds)
	+ "5 microseconds" (computation) (done in parallel with disk latency ==> = 0 microseconds)
	+ 5 microseconds (overhead of interrupt handler)
	= 60 microseconds
				

DMA and Polling

To avoid the extra overhead of interrupts, a polling strategy can be used instead of blocking. The file system sets up the data transfer for the DMA controller to complete, then performs other work while the transfer finishes. When its other work is done, the file system polls the disk controller for completion rather than being interrupted by the DMA controller.

Analysis of DMA and Polling

The latency per read is:

	  5 microseconds (DMA setup)
	+ 50 microseconds (disk latency)
	+ "40 microseconds" (PIO) (done in parallel with other work ==> = 0 microseconds)
	+ "5 microseconds" (computation) (done in parallel with disk latency ==> = 0 microseconds)
	= 55 microseconds
				

Multitasking

Multitasking allows threads to share the CPU through a scheduler. If the current thread cannot obtain the resources it needs to do work, it invokes the scheduler and transfers control of the CPU to another thread that can use it. While one thread waits for the disk, another can run:

	/* Poll the IDE status register (port 0x1f7): the disk is ready when
	   the busy bit is clear and the ready bit is set (status & 0xc0 == 0x40). */
	while ((inb(0x1f7) & 0xc0) != 0x40) {
		yield();	/* let another thread use the CPU while we wait */
	}