CS111 Lecture 15 Scribe Notes

3/7/2012

by Stephen Kou and Dennis Chan

Victim Policies

Data must first be loaded into RAM before it can be used, and RAM is limited. So when a block we need is not in RAM and all of RAM is already full of other blocks, the system must choose some resident block to evict, called the victim block, in order to load what we want. There are many ways of choosing a victim, but which is best? The best policy is the one that produces the fewest page faults, so policies have to be measured on real machines and real workloads. Here are some policies:

A) Pick a random one to evict

This method could work if all memory accesses were random, but they are not: real programs exhibit locality of reference. There are two types (a small example follows the list).

1) temporal locality: if page P is accessed at time T, then P is likely to be accessed again at time T + ε

2) spatial locality: if page P is accessed at time T, then a nearby page (e.g., P + 1) is likely to be accessed soon
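For instance, a plain loop that sums an array (a minimal illustration, not code from the lecture) shows both kinds at once:

    /* Summing an array: the accumulator is reused on every iteration
       (temporal locality), and a[0], a[1], ... sit on the same or
       adjacent pages (spatial locality). */
    long sum_array(const int *a, int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }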

B) FIFO (first in, first out) - evict blocks in the order of which they came in


ref:   0   1   2   3   0   1   4   0   1   2   3   4
A:    ▲0   0   0  ▲3   3   3  ▲4   4   4   4   4   4
B:        ▲1   1   1  ▲0   0   0   0   0  ▲2   2   2
C:            ▲2   2   2  ▲1   1   1   1   1  ▲3   3

▲ = page fault

FIFO yields 9 page faults.

What if we increase the number of page frames from three to four?


ref:   0   1   2   3   0   1   4   0   1   2   3   4
A:    ▲0   0   0   0   0   0  ▲4   4   4   4  ▲3   3
B:        ▲1   1   1   1   1   1  ▲0   0   0   0  ▲4
C:            ▲2   2   2   2   2   2  ▲1   1   1   1
D:                ▲3   3   3   3   3   3  ▲2   2   2



This yields 10 page faults! Adding a page frame made things worse. This is Belady's anomaly: with FIFO, increasing the number of page frames can actually increase the number of page faults.
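Belady's anomaly is easy to reproduce with a few lines of code. The program below is a small illustrative simulator (not the lecture's code) that counts FIFO faults for the reference string above with 3 and then 4 frames:

    #include <stdio.h>

    /* Count page faults for FIFO replacement on a reference string. */
    static int fifo_faults(const int *ref, int nref, int nframes)
    {
        int frames[16], used = 0, next = 0, faults = 0;
        for (int i = 0; i < nref; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == ref[i]) { hit = 1; break; }
            if (hit)
                continue;
            faults++;
            if (used < nframes) {
                frames[used++] = ref[i];      /* free frame still available */
            } else {
                frames[next] = ref[i];        /* evict the oldest resident page */
                next = (next + 1) % nframes;
            }
        }
        return faults;
    }

    int main(void)
    {
        int ref[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
        int n = sizeof ref / sizeof ref[0];
        printf("3 frames: %d faults\n", fifo_faults(ref, n, 3));   /* prints 9  */
        printf("4 frames: %d faults\n", fifo_faults(ref, n, 4));   /* prints 10 */
        return 0;
    }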

C) Oracle policy - evict the page that will not be needed for the longest time into the future



ref:   0   1   2   3   0   1   4   0   1   2   3   4
A:    ▲0   0   0   0   0   0   0   0   0  ▲2  ▲3   3
B:        ▲1   1   1   1   1   1   1   1   1   1   1
C:            ▲2  ▲3   3   3  ▲4   4   4   4   4   4

▲ = page fault

This policy yields only 7 page faults. Of course, a real system cannot see its future references, so the oracle mainly serves as a lower bound against which to compare practical policies.

D) Least Recently Used (LRU) - evict the page whose most recent access is furthest in the past



ref:   0   1   2   3   0   1   4   0   1   2   3   4
A:    ▲0   0   0  ▲3   3   3  ▲4   4   4  ▲2   2   2
B:        ▲1   1   1  ▲0   0   0   0   0   0  ▲3   3
C:            ▲2   2   2  ▲1   1   1   1   1   1  ▲4

▲ = page fault

LRU gives us 10 page faults. This particular reference string happens to be a bad case for LRU; on typical workloads with good locality, LRU tends to do well.



Improving performance

A) Create better victim policies

This option is difficult because a great many policies have already been invented and studied.

B) Demand paging

Demand paging means we do not load a page until the program actually touches it. Its advantages and disadvantages are listed below, followed by a small illustration.

+ only the page containing the program's first instruction has to be read before execution starts, so programs start up faster

- more page-fault handling code runs, since pages are brought in one request at a time

- more disk arm movement when several processes are faulting at once
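A familiar place to watch demand paging happen is mmap(): the mapping is created immediately, but each page of the file is read from disk only when it is first touched. A minimal sketch, assuming some existing file (the name "bigfile" and the 4096-byte page stride are placeholders):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("bigfile", O_RDONLY);   /* "bigfile" is a placeholder name */
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0)
            return 1;

        /* The mapping is set up right away, but no file data is read yet... */
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
            return 1;

        /* ...each page is faulted in from disk only when first accessed. */
        long sum = 0;
        for (off_t i = 0; i < st.st_size; i += 4096)
            sum += p[i];

        munmap(p, st.st_size);
        close(fd);
        return (int) (sum & 0xff);
    }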

C) Dirty bit

We can improve performance by not writing out a victim page that has never been modified: the copy on disk is still identical, so writing back a carbon copy of what is already on disk would be a waste of time. To avoid this, each page table entry gets a bit that is set whenever the page is modified, indicating that it must be written back to disk before its frame is reused. We call this bit the dirty bit. It can be maintained in software or in hardware; hardware support is much faster.
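A rough sketch of the eviction path with a dirty bit. The struct and the disk I/O helpers here are made-up stand-ins for illustration, not a real kernel interface:

    /* Illustrative only: struct page, write_to_disk(), and read_from_disk()
       are invented names standing in for real kernel machinery. */
    struct page {
        int frame;      /* physical frame currently holding the page      */
        int dirty;      /* set (by hardware or software) on any write     */
    };

    void write_to_disk(struct page *p)            { (void) p; /* stub */ }
    void read_from_disk(struct page *p, int frame) { (void) p; (void) frame; /* stub */ }

    /* Evict `victim` and reuse its frame for `wanted`. */
    void evict_and_load(struct page *victim, struct page *wanted)
    {
        if (victim->dirty) {
            write_to_disk(victim);   /* only modified pages cost a disk write */
            victim->dirty = 0;
        }
        /* A clean victim is simply dropped: disk already holds an identical copy. */
        read_from_disk(wanted, victim->frame);
        wanted->frame = victim->frame;
    }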

Return Oriented Programming

Let's go off on a tangent and put on our black hats for a moment. Return-oriented programming builds on the fact that the stack is open to buffer overflows, which can lead to malicious attacks. The attacker only needs to overflow a buffer and overwrite the saved return address; when the return instruction executes, control transfers to another part of memory chosen by the attacker, where arbitrary code can effectively be executed.

[Figure: return-oriented programming]
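The classic opening is an unchecked copy into a fixed-size stack buffer. The fragment below is a deliberately unsafe sketch for illustration only: passing a string longer than 16 bytes to greet() keeps writing past buf, over the saved registers and the return address.

    #include <stdio.h>
    #include <string.h>

    void greet(const char *name)
    {
        char buf[16];
        strcpy(buf, name);    /* no length check: a long `name` overruns buf and
                                 clobbers the saved return address above it     */
        printf("hello, %s\n", buf);
    }   /* the ret here jumps wherever the attacker wrote */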


Differences between fork and vfork

fork() copies all of a process's address space, which can be expensive. That is why vfork() was introduced. vfork() fakes the address space copy, so both processes refer to the same physical memory. In addition, the vfork() parent is frozen until the child exits or calls exec().

pid_t p = vfork();
if (!p) {
        /* child: may fiddle around, but must not write to global RAM --
           this memory is shared with the (frozen) parent */
        execl("/bin/echo", "echo", "child", (char *) NULL);   /* any exec() variant */
        _exit(1);        /* use _exit(), not exit(), if the exec fails */
}



Distributed Systems and Remote Procedure Calls (RPC)

[Figure: remote procedure call]




RPCs differ from ordinary functions:

+body of function runs on another machine

-slower

+hard modularity (not sharing address space)

-no call by reference (always call by value)

+caller and callee may have different architectures

Example:

Suppose an Intel machine and a SPARC machine want to communicate. To do so, they send messages and data over the Internet in big-endian (network byte order) format. The Intel machine must swap its bytes before sending, because it is little-endian; the SPARC machine can feed its data through unchanged, because it is already big-endian.
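In C this conversion is usually done with the htonl()/ntohl() family, so the same code works on either machine:

    #include <arpa/inet.h>
    #include <stdint.h>

    /* Put a 32-bit value on the wire in network (big-endian) order. */
    uint32_t to_wire(uint32_t host_value)
    {
        return htonl(host_value);   /* no-op on SPARC, byte swap on Intel */
    }

    /* Recover the host-order value on the receiving side. */
    uint32_t from_wire(uint32_t wire_value)
    {
        return ntohl(wire_value);
    }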

The caller must also marshal its data. In other words, if A holds data in a tree structure and wants to send it to B, A must flatten it into a stream of bytes to send; B must then unmarshal the stream back into a structure it recognizes.
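A minimal marshaling sketch, using a made-up two-field message instead of a whole tree: the sender flattens the fields into a byte stream in network order, and the receiver rebuilds the struct from it.

    #include <arpa/inet.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical message type used only for this example. */
    struct point { uint32_t x, y; };

    /* Marshal: flatten the struct into an 8-byte big-endian stream. */
    void marshal_point(const struct point *p, unsigned char buf[8])
    {
        uint32_t x = htonl(p->x), y = htonl(p->y);
        memcpy(buf,     &x, 4);
        memcpy(buf + 4, &y, 4);
    }

    /* Unmarshal: rebuild the struct from the received bytes. */
    void unmarshal_point(const unsigned char buf[8], struct point *p)
    {
        uint32_t x, y;
        memcpy(&x, buf,     4);
        memcpy(&y, buf + 4, 4);
        p->x = ntohl(x);
        p->y = ntohl(y);
    }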
