Scribe Notes Lecture 2

Patrick McKeown

Matt Miguel

Jason Tsao

3/31/2010

Water Bed Effect

Another common problem with systems: Trade-Offs

The idea of Trade-Offs can be construed by the Water Bed Effect.

If one pushes down on one part of a water bed, then the water bed tends to push back up in other areas.

Analogously, if one tries to optimize one aspect of a system, other aspects may suffer in performance.

An example of this in software design is performance vs. complexity.
A program that is completely optimized for performance might be extremely complicated and difficult for a human programmer to follow. On the other hand, a program that is very simple to follow may run much slower than it would have if it was more complicated. Another example of trade offs is memory usage vs. speed.

Trade Offs: “There is no such thing as a free lunch”

Sorting in RAM: 10 GiB with 1 kiB records

~ 10 million records

A quicksort or heapsort will require O(N log N) copies of 1kB records.

How can we further reduce the cost in memory and time?

Optimization Solution: Create a smaller array of pointers!

For copy overhead: 128x faster and will cost (1 + 1/128) RAM vs. original

There will be additional costs: dereferencing pointers during comparisons

These are examples of trade-offs. Part of design is dealing with trade-offs, and often answers will come directly from the users or customers.

We will consider a case of extreme trade offs: An elephant in one corner of a waterbed.

We will optimize paranoia.

Assume users do not trust software on computer.

Application: Paranoid Professor

This professor has a written proposal due Friday at noon, E.T.

The professor is paranoid that competitors will steal his ideas.

One requirement he must meet is that his proposal must have fewer than 20,000 words.

Your application must count the number of words in the proposal.

Simple UI

The professor does not trust Windows or Linux and asks that your program contains no operating system code at all.

There is a simple user interface:

Input is simply plugging the computer in.

Output is a single number displayed on the screen corresponding to the word count of the proposal.

The first step in the design process is to check requirements.

Is there any additional information that we require from the professor to accomplish our task?

-Where is the proposal store?

--In a hard disk connected to the computer. The computer is enclosed in a vault 20 meters underground.

-Is there a required limit on speed?

--The program should be plenty fast.

-How is the proposal maintained?

--The proposal is maintained by word processor from a separate team. Storage details on the disk are up to you.

-What is the architecture of the computer we are to use?

--Dell, 3 years old, 1GiB RAM, Core 2, 2GHz, 1 TB Hard Disk, 7200 rpm, Text VGA Output

-What should we consider as words in the proposal?

--A maximal sequence of alphabetic characters.

Let us consider the topics we will need to understand to solve this problem.

How is data read from a hard drive?

Disk Drives

A hard drive consist of spinning magnetic disks where magnetic bit patterns interact with read/write heads to transmit or receive information from the rest of the computer. Data is addressed by platter, track and sector. A track corresponds to a certain radius from the center of a track, and a sector corresponds to a particular portion of a track.

Transfer rates for our machine is 500 Mb/s.

Disks are slow because they need to wait until the right sector comes to the head.

The average rotational latency for our computer is 4.16 ms.

Another source of delay comes from the need for heads to move to different tracks. This is called seeking. Seeking and waiting can run in parallel. The average seek time is 10 ms.

To make our program plenty fast, our memory needs to have high spatial locality. Our data needs to be on the same disk and either the same or adjacent tracks.

Interface Problems:

The CPU communicates through a bus to another component called the Disk Controller (DC).

The Disk Controller than deals directly with the hard disks.

The CPU communicates to the DC by accessing DC registers and using special instructions and protocols.

The DC can be thought of as effectively another CPU devoted solely to managing the hard disk.

The CPU can read out of DC register 0x1F7 to check the status of the DC.

Disk Controllers can be busy, which they indicate by the status register.

Busy Disk Controllers will not properly respond to requests.

The CPU has a special instruction to query the status register of a DC.

Once a DC is ready, the CPU can talk to a disk.

The next step for the CPU is to store the number of sectors that it wants to read in register 0x1F2.

This is an 8 bit register, and thus a CPU can only read 256 sectors at a time.

The sector offset must be stored into several adjacent DC registers, starting at 0x1F3 and ending at 0x1F6. This gives 32 bits for offset, allowing the CPU to access about 2 TB of disk space since sectors are 512 bytes long.

The CPU must store a special read command into register 0x1F7 and once again wait until the DC is ready.

The Disk Controller returns the read memory to a special register: 0x1F0, and a special instruction serially loads all the data into RAM.

As far as the CPU is concerned, all the DC register addresses are simply bus addresses.

Computer Startup

What happens in a computer when it initially powers up? If all bits in RAM and CPU registers initialize to 0, then the computer could never boot. There must be a way for the computer to access and run start up instructions. This is accomplished through read-only memory (ROM). ROM data is always set to particular values for a given machine and cannot be overwritten. Most modern computers actually utilize electronically-erasable programmable read-only memory (EEPROM). EEPROM actually can be electronically modified in special circumstances.

In most machines, a cache is also utilized during computer run time, but for our purposes, we should view the cache only as a performance enhancement.

In addition to ROM/EEPROM, the program counter of the CPU must initialize to an address in read only memory.

The start up program written in ROM has a special name: BIOS (Basic Input/Output System).

To the BIOS, all other programs, including the operating system, are just applications.

Because the BIOS is not coded in ordinary memory, it is not even considered software, but rather firmware.

Assume that our paranoid professor trusts the BIOS.

The BIOS for our computer runs the traditional 20 year Intel standard boot procedure.

When the BIOS first runs, it performs various system tests. It then searches for devices (device probing) by attempting to load and store into device registers and seeing if it works.

Specifically, the BIOS searches the first sector of a device for a specific pattern of bits.

Master Boot Record

The first sector of any device is called the Master Boot Record (MBR) of the device. The BIOS looks at the last two bytes of the MBR for the pattern 0xAA55 (Little Endian). This pattern signifies that the device is bootable, and consequently, the BIOS will attempt to boot the device. The last two bytes of the MBR collectively up the MBR signature.

Immediately preceding the MBR signature in memory is a block of 64 bytes called the Table of Primary Partitions. The partition table is further divided into four sub-blocks of 16 bits each according to the IBM partition table scheme. Each of these sub-blocks contains information regarding a particular partition of the memory device, including a status field that indicates if the partition is bootable, the starting sector of the partition and the number of sectors in the partition.

The remaining 446 bytes in the MBR contain x86 machine code. If the device of interest is bootable, the BIOS takes the MBR machine code, loads it into memory at address 0x7C00 and executes it.

If our program contained less than 446 bytes of code, we could simply place our code in the MBR and we would be done. This is not the behavior a typical computer exhibits during start up obviously.

More generally, the MBR scans the partition table for a bootable partition. When it finds one, it loads the first sector of the partition into RAM and jumps to it in RAM. The first sector of a partition is called the Volume Boot Record (VBR) of the partition.

At this point, the VBR takes over control, and may call subsequent programs or even boot the operating system. In Linux distributions, the VBR calls a program called the GRUB (Grand Unified Bootloader). The GRUB subsequently boots the operating system by copying the application from the disk.

Memory Layout

How to copy data from disk to memory: Programmed I/O (PIO)

The following code is representative of how data can be copied from a hard disk into memory.

Void readsector(int s, char * a) {

//Wait until the device is ready

while( (inb(0x1F7) & (0xC0)) != 0x40)

Continue;

//inb loads a byte out of a register and returns the byte

//Similarly, outb writes data to a disk controller register

//Set the device registers of the disk controller

outb(0x1F2,1);

outb(0x1F3, s & 0xFF);

outb(0x1F4, (s>>8) & 0xFF);

outb(0x1F5, (s>>16) & 0xFF);

outb(0x1F6, (s>>24) & 0xFF);

//Set up the read instruction

outb(0x1F7, 0x20);

//Wait until the device is ready

while( (inb(0x1F7) & (0xC0)) != 0x40)

Continue;

//This command means, please read 128 words into our buffer

insl(0x1F0, buf, 128);

//insl is short for insert long which reads data into memory

//At this point, it is possible that the DC can fail, which can

//be checked for in the status register

}

VBR Code

//Assume that the main program is 10 sectors long at offset 9000

for (int i=1; i<=9 ; i++)

read_sector(9000+i,(char *) (0x100000 + (i*512));

Main Program

void main(void){

int nwords = 0;

bool in_word = FALSE;

int s = 1000000;

for(;;){

char buf[512];

read_sector(s++,buf);

for(int j=0; j<512; j++){

if( buf[j] == 0){

nwords += in_word;

write_out(nwords);

return;

}

bool alpha = isalpha( (unsigned char) buf[j]);

nwords += ~ alpha & in_word;

in_word = alpha;

}

Memory Mapped I/O

A portion of memory starting at 0xB8000 belongs to the output device (monitor).

The monitor memory space consists of an 80 x 25 grid of cells, where each cell contains 16 bits. The low order byte of a cell corresponds to an ASCII code, while the higher order byte corresponds to color and other display details.

Writing To Display

void write_out(unsigned nwords){

char* cursor = (char *)(0xB8000 + 2000);

do{

*cursor-- = '0' + (nwords % 10);

*cursor-- = 7; //gray on black

}while((nwords/=10)!=0);