CS 111: Operating Systems Lecture 13 - File System Robustness

By Steven Pham
Lecture Data: February 27, 2013


For a robust file system, we have a few goals:
1. Performance: System with high throughput and low latency
2. Durability: Data survives failure and is enduring
3. Atomicity: Correctness of data - changes made to a file system are completed or not completed.


Durability Issues:

When considering durability, many possible issues could occur.  In some cases, the keyboard, CPU, RAM, etc. could die or screw up the underlying failure.

In an ideal world, we would want to address all such possible issues to ensure the complete durability of our system. Realistically, however, we must limit our analysis to a scope of limited failure. Mainly, we only want to consider the problems most likely to occur that will cause damage.


Battery Example

Must consider a model for failure that corresponds to reality and is simple:
- No battery
- Only failure to worry about: loss of power

Say we want to edit a file:
data_file

What if we lose power when attempting to edit the file and write to the disk sector?

In the event of a loss of power, a trap occurs telling processes to stop immediately.



Possibilities at this point:

1. The power supply has a small amount of power remaining to finish task
2. Bad sector results since task cannot finish

Under these assumptions, we can assume the data contents to follow this model:

data_contents


Atomicity Solutions?

- More file space
- Never overwrite your only or last copy
- Keep many backups


Backups example

In a file system with backups, we write to F2 first then overwrite the old F1. Using markers, we can keep track of which file is currently being modified.

F2 = Y
Marker = F2
F1 = Y
Marker = F1

data_backups

With markers:
data_markers
Using this approach, we can assure atomicity by checking the file marker and ensuing status of each file. While this method works, we have to utilize too many write sectors and might want a more efficient approach.


Time append example:
Alternatively, we can try appending the time stamp:

time_append

Using this approach, we write the data to the file and the nappend the current time.

The only problem is that the current time value could be completely garbage or a wrong value from a bad write and the resulting value could be incorrect.


Classic Solution:

In the classic solution approach, we use three file copies for safety and complete atomicity with staggered writing to all 3 copies:


classic_solution

If a crash happens, we run this code:

if F1 = F2
  return F1
else
  return F3

Utilizing this code, should a crash happen before fully copying the data to all 3 files, the files can be successfully reverted to old or new data accordingly.

This approach requires 3 copies to be held on disk, however, and this can be a lot of memory space just to simply assure atomicity for one file on the file system.<

Lampson-Sturgis Assumptions:

1. Storage writes may fail and can corrupt written or nearby sectors
2. Storage can decay spontaneously
3. A later read can detect the bad sector
4. Errors are relatively rare
5. Repairs can be done in time



Example: rename(“a”, “b”)



inode

1. Read inode 357 into memory
2. Read data block 52 into RAM
3. Change “a” to “b” in RAM
4. Write data block out to disk
5. Update inode time stamp
6. Write inode to disk

(Doesn’t exactly follow Lance-Sturgis assumptions)


Say instead we wanted to rename file a to bb - rename("a", "bb"):
In this case, if we can't find directory memory in allocated space, we have to go find empty space to copy the entire directovery over.



Rename ("d/a", "e/a")


inode
inode

Actions:

Read inodes 357 and 362 as well as data blocks 52 and 1012 into RAM

Write link count 2

Write updated data block 1012

Write link count 1

Write updated data block 52

Write updated inode 362

Write updated inode 357

If we crash between writes to data blocks, link count will not be updated. In this case, files must be removed or dangling pointers will remain.


Invariants for file systems:

1. Every block in the file system is used for exactly one purpose

2. All referenced blocks are properly initialized with data appropriate for their type

3. All referenced data blocks are marked as used in bitmap

4. All unreferenced data blocks are marked as free in bitmap


Consequences for violation of invariants:
1. Disaster
2. Disaster
3. Disaster
4. Memory leak


Valid HTML 4.01 Transitional