CS 111: Operating Systems Lecture 13 - File System Robustness


Scribes: Jason Chen

Lecture Date: February 29, 2012


Part 0 File System Implementation Review

Last lecture, we left off by talking about using indirection to accomplish various tasks. Most notably, this is helpful in establishing symbolic links. However with indirection, we sacrifice speed. For example, lseek(fd_____) and read(_____) both suffer from speed issues when indirection is introduced. Three seeks will result in the program running three times slower. Furthermore, if bad things were to occur it would be difficult to fix the problem.

indirection_diagram

In order to deal with this issue, we attempt to design systems that are fault avoidant.

//this is an example of a bad boot script

mknod("/dev/null", magic numbers) //only root can run it

open("file", O_CREAT|O_RDWR|O_CONFIG,0644, 1024*1024*1024)

>/dev/null

chmod 666 /dev/null


//this is an example of a good boot script

mknod("/dev/null, c, 3 19) //this creates a special file

ls -l /dev/null

//which prints out


crw-rw-/.../dev/null


We also use named pipes

$mkfifo /temp/mypipe

$ls -l /tmp/mypipe

prw-rw-rw.../tmp/mypipe

$cat /tmp/mypipe > foo &

$sed s/a/b/.../etc/passwd >/tmp/mypipe

ino_t

This results in a relation with a read only file system in /usr/. However, this leads to some problems.

     1. Link counts don't work in a read only file system.

     2. And all filesystems are yoked together.


In order to deal with this problem, we consider the use of a mount table which divides the structures into various categories. Mounted tables are typically used to record currently active file systems.

mount_table

File System Robustness


Ultimately, in order for a file system to be considered robust, we have a few goals:

     1.Durability - the ability to survive limited hardware failure

          (e.g. loss of power)

     2. Atomicity - changes to files are either made or not made

     Performance - (has two competing goals)

          3. High throughput

          4. Low latency

It is very tempting to cheat and often very lucrative to do so as you can often save a lot of money.


Say we are attempting to create a text editor that will either save the changes we made to it or return the old information when it crashes. We first have some underlying assumptions about the structure of the program.


text_editor

GOLDEN RULE OF ATOMOCITY states that you NEVER overwrite your only copy on disk.


Our test phrase that we enter:

     Fore score and seven years

should have the Fore replaced by Four in this situation

     Four


We also have a few simple assumptions.

     Lampson-Sturgis assumptions

          1. Storage writes may fail but a later read will detect the bad block (a failed write to block x may corrupt block y)

          2. Stored blocks can decay spontaneously (later reads will detect)

          3. Errors are rare.


A way we are able to solve the problem is by using three copies of the write. Although two can speed up the process, we use three to be safer.

atomicity_assump

However a problem exists in the fourth row with the NEW, ?, and the OLD. Since there are two possibilities, we impose a few set of rules:

     1. We use majority rule to repair data

          Basically if there is unknown data we use the data that is the majority out of the three copies at the time.

     2. We use #1 if all three disagree

          Such as in row four.


Suppose we want rename("d/a", "e/b") to be atomic.

working_directory

     1.Increment link count - write inode of file

     2. Write dest dir data block

     3. Write ssrc dir data block

     (potential to crash at this point if there is a storage leak)

     4. Clear link count

In order to prevent crash we use the function detect fsck which we pass occasionally to catch leaks.


File system correctness invariants

(Disaster if compromised) 1. Every block is used for exactly one purpose.

(Disaster if compromised) 2. All reference blocks are initialized to data appropriate for its type.

(Disaster if compromised) 3. All referenced data blocks are marked used in bitmap.

(Just a mere leak)        4. All unreferenced data blocks are marked free in bitmap.