Lecture 19: Scheduling/Cloud Computing

Anthony Pham, Huy Le

Scheduling

  1. Schedule because of limited resources

  2. Scheduling Metrics

    Like a car factory:
    arrival -> start building -> 1st model(1st output -> 2nd output -> ... -> done) -> next model
    For processes:
    fork -> exec -> 1st process (1st write -> 2nd write -> ... -> exit) -> context switch -> next model

Aggregate statistics

Maximize: utilization and throughput.
Utilization - % of CPU time doing useful work.
Throughput - rate of useful work being done (sum of run times/total time).
Minimize: latency, response time, wait time.
The two goals can conflict.

Average Scheduler times

Table of Jobs and their times. d is context switch time. | represents context switch.
Jobs Arrival Run time Wait time
A 0 5 0+d
B 1 2 4+d
C 2 3 5+2d
D 3 4 13+3d

Simple Scheduler - First come first serve

AAAAA|BB|CCCCCCCCC|DDDD
utilization almost 1
avg wait time = 5.5 +2.5d
avg TT = 10.5 + 2.5d

Shortest Job First

AAAA|BB|DDDD|CCCCCCCCC
utilization = 20/(20 + 4d)
avg wait = 4.25 + 2.5d

SJF + preemption

A|BB|DDDD|AAAA|CCCCCCCCC
avg wait = 2.25 + 2.25d
abg run = 2.25 + 2.5d + 5 + (6 + 5d/4)
avg wait time went down, avg run time went up
(std dev/worst case) of (wait/run) <= fairness metric

FCFS + preemption = round robin

(buggy)
|A|B|C|D|A|B|C|D|A|C|D|A|C|D|A|CCCCC
avg wait time = 2.5d
avg TT = 12.25 + 11.24d
need to put new jobs at end of the queue to prevent large job starvation

Priorities - set by user or by system
FCFS priority: -arrival time
SJF priority: -run time

Real time scheduling - constraints that must be hit, or else.

Mars rover had 2 priorities: high, medium, and low. There is a bug with its priorities where
if a low priority task had a lock, and a high priority task has to wait for it, if a medium task
runs forever, since it won't have to use the lock, it takes over, leaving the high priority task
to hang in a problem known as priority inversion.

We fix this by temporarily having the high priority task lend its priority to the low priority task
with the lock so that it can finish.

More Scheduling

1 CPU, 1 Disk, many I/O requests. Which to do first?
A simple abstraction of a disk is an array, where the disk head is in one element, and the where it wants
to go is represented by another. The cost to access the array is the distance in the array away from
where the head is. Average seek time is between 1/4 and 1/2 the size of the array.
Simplest disk scheduler: FCFS

Shortest seek time first (SSTF)

Combine FCFS (for batches) + SSTF (for within a batch).

Most stuff here won't work with flash.

Cloud Computing Intro

Advantages

Disadvantages

Trust is the biggest problem we face, because in the end we have to trust the people building it.