CS111 Lecture 11

Nov 9, 2009

by Fei Jia and Alex Fan

Eggert's final words on low level block access to disk

What is a "Hard Disk"?

Hard disk is non-volatile memory. It organizes data in the form of blocks. Typically a disk block is 8KB, and it supports random access.

The factors affecting disk speed:
1. Seek Time - The time needed for the system to move the head to the appropriate track or cylinder, to access a block on the disk.
2. Latency Time - After the head is on the right track, the time until the desird block rotates under the read-write head.
3. Transfer Time - The time for actual transfer of data between the disk and main memory.
In class, we assume a simple performance model to measure the seek time. The cost of seeking from block i to block j = | i - j | (the distance between i and j )

Disk Scheduling Algorithms

The disk controller serves one request at a time. Any other additional requests, normally from other processes, will need to be queued. Given that we want access the whole bunch of blocks: { }, how do we schedule the access order to optimize seek time?

1) Shortest-Seek-Time-First Scheduling. (SSTF)

Method: Select the request with the minimum seek time from the current head position. This means the disk controller services together all requests close to the current head position, before moving the head far away to service another request.
Disadvantage: Starvation of some requests.

2) First-Come, First-Served Scheduling. (FCFS)

Advantage: This algorithm is easy to program and is intrinsically fair.
Disadvantage: The disk head jumps large distances and cannot optimize seek time.
Given random requests, the average seek time is |R-H|.

Compromise Algorithms (performance of SSTF and starvation prevention of FCFS):

3) A Chunking Algorithm

Method: Grab a chunk of 10 elements, do SSTF on these 10 elements before grabbing next 10 elements.
This algorithm can increase the chunk size for performance or decrease the chunk size for fairness.

4) Elevator Algorithm.

Method: Keep moving in the same direction until there are no more outstanding requests in that direction, then they switch directions.
Advantage: no starvation.
Disadvantage: a little unfair. If a request arrives in the queue just in front of the head, it will be serviced almost immediately, whereas a request arriving just behind the head will have to wait until the head moves to the end of the disk, reverses direction, and returns, before being serviced.
The average wait time in the middle is t/2.
The average wait time at the top or bottom is t/4.

5) Circular Elevator Algorithm.

A slight modification to elevator algorithm: when the highest numbered cylinder with a pending request has been serviced, the head goes to the lowest-numbered cylinder with a pending request and then continues moving in an upward direction.

The average wait time is t/2 everywhere.

6) Anticipatory Scheduling.

Suppose we have two processes A and B, each reads disk sequentially.
A: 101, 102, 103, 104, ....
B: 501, 502, 503, 504, ....
A process appears to be finished reading from the disk when it is actually processing data in preparation of the next read operation. This will cause switching to an unrelated process.
The request order using SSTF: 101, 501, 102, 502, 103, 503
Seek time = 400 / block
Idea: Anticipatory scheduling waits for a short time after a read operation in anticipation of another close-by read requests. It yields a huge win in real-time systems.

Flash Memory (a better Hard Disk?)

Flash Memory is a non-volatile computer storage that can be electrically erased and reprogrammed. It offers faster read access times than hard disks.
Properties:
1) Random access is fast.
2) Writing is much slower than reading, even slower than sequential writes to disk.
3) Memory wear. Flash memory has a finite number of erase-write cycles. Most commercially available flash products are guaranteed to withstand around 100,000 write-erase-cycles.
4) More expensive than disk.
Some file system designed specifically for Flash, such as LogFS, UBIFS.

Eggert bought a 32GB single cell NAND with the following specs:

Read:35000/sec

Write:3300/sec

2.5W (max)

2.1W (typical)

0.1W (sleep)

MTBF (mean time between failure): 2 million hours

Cost: $800 (much more expensive than 32GB disk)

File System Design

What is a "file system"?

It is software that organizes computer files.

It manages the creation, deletion, and access to these files. It represents the files.

It is a data structure on a disk (also known as non-volatile memory).

What are some examples of File Systems?

The simplest possible file system

Protected Area (8 bytes)

Files

Unused space

This file system fits the definition - it organizes computer files and that's all.

It is functional; but obviously it has many drawbacks:

Fragmentation is very likely as previously created files are deleted.

Searching for a file requires checking the whole disk due to this fragmentation. Unused space exists between files and the file system does not know where to stop searching.

RT-11

RT-11 is a single-user, real-time operating system developed in the 1970s.

It has a very simple file system (so simple that Eggert wrote a similar file system in his college days).

It has aspects that can be seen in today's file systems.

The disk is partitioned into two parts:

UNIX

UNIX is a very widespread computer operating system.

Its file system shares many aspects with the FAT file system, yet replaces the FAT with two of its own unique elements.

The disk is partitioned into five parts.

Boot Sector

Superblock

Block

Bitmap

Inode

Table

Data Blocks

Like FAT, the Boot Sector contains the code for booting up the disk.

Like FAT, the Superblock contains metainformation about the file system such as the size of the file system and number of blocks in use.

The Block Bitmap contains a bit for each data block. The purpose of the block bitmap is to efficiently find a free block or clusters of free blocks.

The Inode Table contains information about every file.

Inode stands for "Index node" and there is 1 inode for every directory file or regular file.

Each inode stores the following information on its corresponding file:

Type of the file (regular file or directory)

Dates / Timestamp of last read, write, or change

Permissions (to read, write)

Link count (number of directory entries that point to this inode) ... When link count reaches 0, then the inode can be freed and this effectively deletes the file

Array of pointers to blocks ... Represents about 10 KB of data (10 blocks * 8 KB/block) because each pointer points to a block

Indirect block ... Represents about 16 MB of data (2000 blocks * 8 KB/block) because it is a pointer to a block of pointers and each of these pointers points to a data block

Double-indirect block ... Represents about 32 GB of data (2000 indirect blocks * 16 MB/indirect block) because it is a pointer to a block of pointers and each of these pointers points to an indirect block

For the UNIX file system, there are two kinds of files: 1) Regular files

The contents of their assigned blocks are the file data

2) Directories

The contents of their assigned blocks are directory entries

Directory entries consist of:

File name

Inode Number of corresponding file

It is important to note that regular file inodes do not contain the information about the file's name nor the file's containing directory. This enables:

Safe file renaming - The file inode cannot be accidentally lost.

The UNIX primitives 'link' & 'unlink' are used in this process. The link primitive creates a copy of the directory entry (with a different file name) and adds another link to the regular file. The unlink primitive removes a directory entry and reduces a link to the regular file.

i.e. link ("d/f", "e/g"); //Create a second directory entry

unlink( "d/f"); //Achieves file renaming safely

Easy, cheap change of directories - Copying directories involves adding a new directory inode and incrementing a link count for each of the files copied. Barely any new space is required.

Data Blocks are each typically 8 KiB and can be assigned to any file.

Pros of UNIX file system:

Easy to grow a file - Change a value in the block bitmap and add a pointer to the inode
Safe, cheap file renaming and directory copying (still requires more seeks than RT-11)
No external fragmentation

Cons of UNIX file system:

Still possible to have internal fragmentation.
Still possible to have FAT fragmentation.

Integer	Meaning of Integer
-1	Unused block
0	EOF - this is the last block of the file
> 0	Index of the next block in the file