UCLA Computer Science 111, Winter 2012.
Operating Systems Principles
Lecture 7: Scheduling Algorithms

Scribed by Nevin Li, Nico Guiton, and Victor Kai



When multiple programs compete for a computer's resources, they must be executed so that resources are used efficiently while still delivering acceptable quality of service to the user. Scheduling is the technique for solving this problem. This lecture discusses the main goals of scheduling and several scheduling methods.

Event-Driven Programming

In this approach to programming, applications are divided into smaller chunks called callbacks. Callbacks are functions that are each triggered by a specific event. They must do their work and return quickly, so they cannot sit and wait for input or output. To work around this constraint, a callback instead schedules its request and arranges for further events, whose callbacks process the input or output once it is available.

The main advantage of callbacks is that they are very simple to implement and incur little overhead, so performance is good, at least for simple computations on a single CPU. The main disadvantage is that callbacks have to be cooperative: each callback must finish its work and return promptly, releasing the CPU so that other callbacks get a chance to run.
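
As a rough illustration (not from the lecture), the sketch below shows a minimal event loop in C that dispatches callbacks from a queue; the event names and queue structure are invented for the example. A callback that would otherwise block instead posts a follow-up event to be handled later.

    #include <stdio.h>

    typedef void (*callback_t)(void *arg);

    struct event {
        callback_t cb;          /* function to run when the event is dispatched */
        void *arg;              /* data passed to the callback */
    };

    #define MAX_EVENTS 16
    static struct event queue[MAX_EVENTS];
    static int head, tail;

    static void post_event(callback_t cb, void *arg)
    {
        queue[tail].cb = cb;    /* no overflow check in this sketch */
        queue[tail].arg = arg;
        tail = (tail + 1) % MAX_EVENTS;
    }

    static void on_read_done(void *arg)
    {
        printf("read finished: %s\n", (char *)arg);
    }

    static void on_request(void *arg)
    {
        (void)arg;
        /* Do not block here waiting for I/O; instead schedule a follow-up
         * event that runs once the (simulated) read has completed. */
        post_event(on_read_done, "hello");
    }

    int main(void)
    {
        post_event(on_request, NULL);
        while (head != tail) {                  /* the event loop */
            struct event e = queue[head];
            head = (head + 1) % MAX_EVENTS;
            e.cb(e.arg);                        /* callback must return quickly */
        }
        return 0;
    }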

Threads

Threads are essentially processes with more limited functionality. A process typically has multiple threads to handle different tasks. Like a process, each thread has an instruction pointer, registers, and a stack (the stack lives in the parent process's address space, with no protection from the other threads). Unlike processes, however, threads do not have their own owner, address space, or file descriptors; they share those of the parent process.
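
As a concrete, minimal sketch, the POSIX threads program below creates a second thread inside the same process; the two threads share the process's global memory and file descriptors, which is exactly the sharing described above. The variable and function names are just for illustration.

    #include <pthread.h>
    #include <stdio.h>

    static int shared_counter;      /* global data: every thread in the process sees it */

    static void *worker(void *arg)
    {
        (void)arg;
        shared_counter++;           /* touches the same variable main() reads below */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);   /* new thread, same address space */
        pthread_join(t, NULL);                    /* wait for it to finish */
        printf("counter = %d\n", shared_counter); /* prints 1 */
        return 0;
    }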

Cooperative Threading

Cooperative threading is when a thread that is currently using the CPU voluntarily yields it to the other threads in the program. This form of threading is relatively easy to implement, but it has issues that prevent it from being widely used. The biggest issue is that it depends on every single thread following the protocol: if a single thread never yields, the other threads will never get to run, and the system effectively locks up. Yield calls also need to be sprinkled frequently throughout the program, which can make the code harder to read and maintain.
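
Here is a minimal sketch of the cooperative idea, using POSIX threads and sched_yield() purely for illustration (a true cooperative user-level thread package would perform the context switch itself): each thread does a little work and then voluntarily gives up the CPU. If either thread skipped the yield and looped forever, the other could be starved on a single CPU.

    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    static void *polite_worker(void *arg)
    {
        for (int i = 0; i < 5; i++) {
            printf("thread %ld: step %d\n", (long)arg, i);
            sched_yield();          /* voluntarily hand the CPU to another thread */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, polite_worker, (void *)1L);
        pthread_create(&b, NULL, polite_worker, (void *)2L);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }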

Preemptive Threading

To solve the problem of uncooperative threads, we can use timer interrupts to force the CPU to switch threads every so often. In this form of threading, a hardware timer interrupts the CPU every X amount of time, and on each interrupt the system may switch to a different thread. The length of the interval must be chosen carefully. If it is too short, the CPU spends most of its time switching threads instead of actually running them. On the other hand, a lengthy interval can delay threads that need to run frequently.
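
The sketch below shows the mechanism behind preemption, using a POSIX interval timer to stand in for the hardware timer (the 10 ms quantum and the tick counter are arbitrary choices for the example): the timer signal arrives periodically, and the handler is the point where a real thread library would save the current thread's registers and switch to another thread.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/time.h>

    static volatile sig_atomic_t ticks;

    static void on_timer(int sig)
    {
        (void)sig;
        ticks++;                    /* a real scheduler would switch threads here */
    }

    int main(void)
    {
        struct sigaction sa;
        sa.sa_handler = on_timer;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGALRM, &sa, NULL);

        struct itimerval quantum;
        quantum.it_interval.tv_sec = 0;
        quantum.it_interval.tv_usec = 10000;   /* fire every 10 ms ("every X amount of time") */
        quantum.it_value = quantum.it_interval;
        setitimer(ITIMER_REAL, &quantum, NULL);

        while (ticks < 100)                    /* pretend this loop is the running thread */
            ;
        printf("received %d timer interrupts\n", (int)ticks);
        return 0;
    }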

Mechanism vs. Policy

Mechanisms are the parts of the system that authorize operations and distribute the system's resources. A system's policy decides which operations to run, when, and with what resources. It is important to distinguish between mechanism and policy in order to improve modularity: many programs can share a system's mechanism implementation without having to change their individual policies, and policies and mechanisms can be swapped independently to fit different user needs.
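
To make the distinction concrete, here is a small sketch (the names and structures are invented for the example) in which the dispatch mechanism is fixed while the policy that chooses the next job is a function pointer that can be swapped freely.

    #include <stddef.h>
    #include <stdio.h>

    struct job { int id; int run_time; };

    /* A policy examines the ready jobs and returns the index of the one to run. */
    typedef size_t (*policy_t)(const struct job *jobs, size_t n);

    /* Policy 1: first come first served (take the job at the head of the queue). */
    static size_t pick_fcfs(const struct job *jobs, size_t n)
    {
        (void)jobs; (void)n;
        return 0;
    }

    /* Policy 2: shortest job first. */
    static size_t pick_sjf(const struct job *jobs, size_t n)
    {
        size_t best = 0;
        for (size_t i = 1; i < n; i++)
            if (jobs[i].run_time < jobs[best].run_time)
                best = i;
        return best;
    }

    /* Mechanism: dispatch whichever job the policy selects; this code never
     * needs to change when the policy does. */
    static void dispatch_next(const struct job *jobs, size_t n, policy_t pick)
    {
        size_t i = pick(jobs, n);
        printf("dispatching job %d\n", jobs[i].id);   /* context switch would go here */
    }

    int main(void)
    {
        struct job ready[] = { {1, 5}, {2, 2}, {3, 9} };
        dispatch_next(ready, 3, pick_fcfs);   /* same mechanism... */
        dispatch_next(ready, 3, pick_sjf);    /* ...different policy */
        return 0;
    }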

Scheduling Metrics

Since there are many ways to implement scheduling, and many different qualities to optimize for, we first need to define some metrics by which we can judge the relative effectiveness of each scheduling algorithm. This section describes the metrics used to compare algorithms and gives examples of what each metric actually measures. We will use two running examples: the factory example, in which a car factory processes car orders from its customers, and the threading example, in which threads are run by the CPU.


Wait Time: The time from the arrival of the order to the start of processing that order.
Factory Example: The time from when someone places an order to when the factory starts producing cars for the order.
Threading Example: The time from when the thread requests to be run to when the thread first runs.
Response Time: The time from the arrival of the order to the time the first output is produced.
Factory Example: The time from when someone places an order to when the factory has finished producing the first car in the order.
Threading Example: The time from when the thread requests to be run to when the thread first produces output.
Turnaround Time (or Latency): The time from the arrival of the order to the completion of the entire order.
Factory Example: The time from when someone places an order to when the factory has finished producing the entire order.
Threading Example: The time from when the thread requests to be run to when the thread finishes running.
Context Switching: The time it takes to switch from producing one thing to producing another.
Factory Example: The time it takes for the factory to switch from producing one car to producing another.
Threading Example: The time it takes for the CPU to switch from one thread to another.

Other useful metrics:

Average Expected Time: The average time it takes for a thread to complete.
Worst-Case Time: The time it takes for a thread to finish in the worst-case scenario.
Utilization: The fraction of time that the CPU is doing useful work.
Throughput: The rate at which useful work is being done.

Ideally, we want the scheduling algorithm we choose to have good throughput, good utilization, and fairness to all processes. High throughput and utilization indicate that the scheduler is managing the CPU efficiently and getting lots of useful work done quickly, while fairness ensures that every thread gets access to the CPU at least once in a while. Fairness prevents starvation, which occurs when a thread is never allowed access to the CPU.

Scheduling Policies

The following table is used to explain several scheduling policies. Listed are arrival times and run times of each job. (Job A arrives at time 0 and takes 5 seconds to run, Job B arrives at time 1 and takes 2 seconds to run, etc.)

Job Name   Arrival Time   Run Time
A          0              5
B          1              2
C          2              9
D          3              4

In the following examples, a double bracket ][ signifies a context switch, and each letter represents one second of a process being run by the CPU. Each context switch requires the CPU to spend ε seconds to stop running one process and start the next. For example, ][ A A ][ B means that the CPU first spends ε seconds context switching to A, then runs A for two seconds, performs a context switch to B, and finally runs B for one second.


First Come First Served

][ A A A A A ][ B B ][ C C C C C C C C C ][ D D D D

Utilization = 20/(20 + 4ε)
Total wait time = ε + (4 + 2ε) + (5 + 3ε) + (13 + 4ε) = 22 + 10ε
Average wait time = 5.5 + 2.5ε
Average turnaround = 10.5 + 2.5ε

In this algorithm, arriving jobs are put at the end of the job queue, and the CPU executes the first job in the queue once it is done with its current job. A job runs until it either completes or blocks waiting for input or output.

First Come First Served inherently favors long jobs, since the CPU runs them to completion even though they take longer to execute than shorter jobs. Because the CPU does not frequently switch jobs, this algorithm has very high throughput compared to other algorithms. Unfortunately, First Come First Served suffers from what is known as the convoy effect: a single long job can block many shorter jobs that need to issue I/O requests. Once the long job finally finishes, the system must still wait for the shorter jobs' I/O, which is inefficient because that I/O could have been in progress while the long job was running.
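
As a sanity check on the numbers above, here is a small sketch that simulates First Come First Served on the job table, with ε = 0 for simplicity; it reproduces the average wait of 5.5 and average turnaround of 10.5.

    #include <stdio.h>

    struct job { char name; int arrival; int run; };

    int main(void)
    {
        struct job jobs[] = { {'A',0,5}, {'B',1,2}, {'C',2,9}, {'D',3,4} };
        int n = 4, t = 0;
        double total_wait = 0, total_turnaround = 0;

        for (int i = 0; i < n; i++) {           /* jobs are already in arrival order */
            if (t < jobs[i].arrival)            /* CPU idles until the job arrives */
                t = jobs[i].arrival;
            total_wait += t - jobs[i].arrival;  /* time spent waiting in the queue */
            t += jobs[i].run;                   /* run the job to completion */
            total_turnaround += t - jobs[i].arrival;
        }
        printf("average wait = %.2f, average turnaround = %.2f\n",
               total_wait / n, total_turnaround / n);   /* 5.50 and 10.50 */
        return 0;
    }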

Shortest Job First

][ A A A A A ][ B B ][ D D D D ][ C C C C C C C C C

Utilization = 20/(20 + 4ε)
Total wait time = ε + (4 + 2ε) + (4 + 3ε) + (9 + 4ε) = 17 + 10ε
Average wait time = 4.25 + 2.5ε
Average turnaround = 9.25 + 2.5ε

In this algorithm, jobs are placed in the queue ordered by their estimated run time, from shortest to longest, and are executed in that order. To use this algorithm, the run time of each job must be known (or estimated) before execution, which may be difficult to do. Because jobs are ordered by length before being executed, a newly arriving short job can be placed ahead of longer jobs that are already waiting, and a preemptive variant may even interrupt the job that is currently running. The main issue with this algorithm is that starvation is possible: if short jobs keep arriving faster than they can be completed, longer jobs may never get a chance at the CPU.
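
The same kind of sketch for non-preemptive Shortest Job First, again with ε = 0: whenever the CPU becomes free, it picks the shortest job that has already arrived, which reproduces the 4.25 average wait above.

    #include <stdio.h>

    struct job { char name; int arrival; int run; int done; };

    int main(void)
    {
        struct job jobs[] = { {'A',0,5,0}, {'B',1,2,0}, {'C',2,9,0}, {'D',3,4,0} };
        int n = 4, t = 0;
        double total_wait = 0;

        for (int finished = 0; finished < n; finished++) {
            int best = -1;
            for (int i = 0; i < n; i++)         /* shortest job that has arrived */
                if (!jobs[i].done && jobs[i].arrival <= t &&
                    (best < 0 || jobs[i].run < jobs[best].run))
                    best = i;
            /* best is always valid here: job A arrives at time 0, so the CPU
             * never sits idle with this particular table. */
            total_wait += t - jobs[best].arrival;
            t += jobs[best].run;
            jobs[best].done = 1;
        }
        printf("average wait = %.2f\n", total_wait / n);   /* 4.25 */
        return 0;
    }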

Round Robin Scheduling

][ A ][ B ][ C ][ D ][ A ][ B ][ C ][ D ][ A ][ C ][ D ][ A ][ C ][ D ][ A ][ C C C C C

Utilization = 20/(20 + 16ε)
Average wait time = 2.5ε
Average turnaround = 12.25 + 12.75ε
Throughput = 20/(20 + 16ε)

Pick a time interval (the quantum) and run jobs first come first served, but switch to a new job whenever the quantum expires, putting the previously running job back at the end of the queue. New jobs must also be placed at the end of the queue to avoid starvation.
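
Here is a sketch of round robin with a one-second quantum on the same job table (ε = 0). For simplicity it cycles through the job list in arrival order, skipping jobs that are finished or have not yet arrived, which is one possible round-robin implementation; it prints the same run order as the timeline above. (It assumes the CPU never goes idle, which holds for this table.)

    #include <stdio.h>

    struct job { char name; int arrival; int remaining; };

    int main(void)
    {
        struct job jobs[] = { {'A',0,5}, {'B',1,2}, {'C',2,9}, {'D',3,4} };
        int n = 4, t = 0, finished = 0, i = 0;

        while (finished < n) {
            struct job *j = &jobs[i];
            if (j->remaining > 0 && j->arrival <= t) {  /* runnable right now? */
                printf("%c ", j->name);                 /* run for one quantum */
                t++;
                if (--j->remaining == 0)
                    finished++;
            }
            i = (i + 1) % n;        /* move on to the next job in the rotation */
        }
        printf("\n");               /* prints: A B C D A B C D A C D A C D A C C C C C */
        return 0;
    }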

Priority Scheduling

An alternative way to schedule jobs is to give each job a number that represents its priority. There are two main ways to assign priorities: either the user assigns them directly (external scheduling), or the system assigns them (internal scheduling). A common tactic for internal scheduling is to factor in the job's arrival time and its length.
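
As a final sketch, here is one way an internal priority could be computed from arrival time and job length; the exact formula is an assumption made for illustration, not something given in the lecture. Shorter jobs get a better (lower) priority value, and time spent waiting gradually improves a job's priority so that long jobs are not starved.

    #include <stdio.h>

    struct job { char name; int arrival; int run_time; };

    /* Lower value = scheduled sooner.  "now" is the current time.
     * (Hypothetical formula: job length minus time already spent waiting.) */
    static int priority(const struct job *j, int now)
    {
        int waited = now - j->arrival;     /* aging: credit time spent waiting */
        return j->run_time - waited;
    }

    int main(void)
    {
        struct job a = { 'A', 0, 9 };
        struct job b = { 'B', 4, 3 };
        int now = 10;
        printf("A: %d  B: %d\n", priority(&a, now), priority(&b, now));
        return 0;
    }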