CS 111 Scribe Notes

May 22nd, 2012

By: Jun Isaji and Keon Vafai

 


 

Table of Contents

Miscellaneous File System Topics

File System Invariants

File System Checker (FSCK)

High/Low-level Aborts

Virtual Memory

Why Virtual Memory

Alternatives to VM

Virtual Memory in Hardware

 


 

Some Miscellaneous File System Topics

 

File System Invariants

For robustness purposes, we want our file system to exhibit certain properties that guarantee the correct operation of file system functions. We call these properties “File System Invariants”. We depend on these invariants to assure that the structure and correctness of our system is always maintained.

If invariants hold before an operation, and then operation is completed, invariants will hold afterwards.

Just like there are consequences to the law, there are consequences to violating our invariants. Lets see what happens when we violate each one.

 

File System Invariant

Consequences of Violating Invariant

1. Every block is used for exactly one purpose – a block is either a super block, a block in the free block bitmap, an inode block, or a data block
  • Disaster (Data Lost)
  • If a block is used for more than one purpose, when a function writes to a block, it will erase important data that another function depends on.
  • For example, if the file system is using a block as an inode block, and it gets written with data, all of the data of that file’s inode block will be lost.
2. All referenced blocks are initialized
  • Disaster (Data Lost)
  • If a block is not zeroed out, the file could have garbage values that the file system then depends on and makes a mistake.
  • For example, if we make a new file but don’t write anything to it, its size field could be left uninitialized with garbage in it.
    • Then when the file is written to, the size directs it to a block pointer that is also garbage, but points to a block in use.
    • This then results in data loss.
3. All referenced data blocks are marked as used in the bitmap
  • Disaster (Data Lost)
  • If a block is referenced but it is not marked in the free block bitmap, it could be used by another inode trying to find a free block.
  • That block could then be overwritten because its not marked as being used, and data is lost.
4. All unreferenced data blocks are marked free
  • Problem (Storage Leak)
  • If an unreferenced data block is marked as used, it will never be used again.
  • This is not a big problem, unlike the consequences to the other invariants.
  • This results in a data leak, where the system will run out of memory prematurely but no data will be lost

Back to Contents

 

File System Checker (FSCK)

Lets recall the last lectures’ implementation of rename. It has a problem: if power fails in middle of the operation, link count could become 2 when there is actually only 1 link pointing to it.
This is not a disaster, just a problem. These can be fixed later by fsck – the UNIX file system checker that will run after every reboot.

There are still some problems with this implementation:

Note: Lost and found – fsck puts orphan files here (orphan – inode exists but no directory points to it)

To preserve invariants, we interleave writing data and updating the inode. Ex: Write directory block 1, write inode entry, write directory block 2, write inode entry…

 

But our disk scheduling algorithms rearrange our careful writes! We could fix this by ignoring the scheduling algorithm and issue writes in order. But this is extremely slow. Or, we devise a system where the lower level is told which writes and reads depend on each other. Ordinary data can usually be done in any order, but metadata depends on each other. The overhead required to track these dependent writes is not too big, and with advances in file system technology it is worth doing despite this overhead costs.

Back to Contents

 

High/Low-level Aborts

Moving to a more general topic (not just file systems):

In our systems we may want a high level task to inspect low level transactions. Lets say that we have an application in which a high level task needs to know whether an operation it calls has aborted. There are two main methods to handling these aborts:

Cascading aborts – A high level task waits for confirmation that a low level operation has completed successfully. A low level abort causes a high level abort, which can cause yet another higher process to abort, hence the name cascading. This approach is safe, but the high level must wait for confirmation, which makes this slow.

Fast and furious approach – Compensating action approach. The high level task gets a later notification that the lower level transaction failed. This should happen rarely. The high level task can assume the lower level succeeded, and if it didn’t, the program will dial back any changes it made on the wrong assumption. This allows for better performance, but when problem occurs, applications must repair a failure.

Example:

write(fd, buf, 100); //Returns fine

close(fd) == -1;

errno == EIO; //Errors out based on the earlier write call

In this example, the write call immediately returns execution to the program, which continues normally until it reaches the call to close(). At this point, write() notifies us of the error, at which point we have to take care of fixing any now-incorrect actions.

Back to Contents

 


 

Virtual Memory

 

Why virtual memory?

Back to Contents

 

Lets focus on protection. What are the alternatives to Virtual Memory?

Base/Bounds

Back to Contents

 

Virtual Memory in Hardware

Virtual Memory - Lets stop insisting that segments are allocated contiguously, just like our file system. We then break these base-bounds segments into pages. Then, we map virtual memory into physical pages. Need a magic box (address translation table or Page table)

Page Table

Patterson, David A., and John L. Hennessy. *Computer Organization and Design: The Hardware/software Interface*. Waltham, MA: Morgan Kaufmann, 2012. Print.

 

A page table is an array of physical addresses, along with some other bookkeeping bits. Some of the upper bits of the virtual address is used as an index into the array, and the result is the physical address in RAM. It can also indicate whether or not the requested address is in RAM or on disk.

Virtual Addresses

Patterson, David A., and John L. Hennessy. *Computer Organization and Design: The Hardware/software Interface*. Waltham, MA: Morgan Kaufmann, 2012. Print.

 

C-ish explanation of Hardware:

Physical_page_no = pmap(virtual_page_no);

Int pmap(int vpn)
{

return PAGE_TABLE[vpn];

}

Since the page table itself needs to be in data, it must be in physical memory with a physical address. (It cant have a virtual address because we would need another page table to get the physical address)

Note: Register %cr3 is the page table base register. %cr3 points to physical memory. A page table is 2^22 bytes or 4MB per process.

Back to Contents