11/17/2022

BasicBites - Work Scheduling

Threads, queues, blocking, quanta

Modern processors have several cores, perhaps two or four. Code on each core can create and manage separate threads of execution. When a process starts, it is assigned to a core and loaded with a single main thread running. The program's code can create additional threads to work on some partition of the program's activities.

Figure 1. Windows Thread Scheduling

Windows schedules threads, not processes. A process is simply a container for resources shared by all the threads running in that process. Figure 1 illustrates that: each thread has an associated Thread Control Block (TCB), and each TCB holds a pointer to a shared Process Control Block (PCB) that keeps track of the process resources.

When a thread is created it has an assigned priority, and its TCB is placed in a "ready to run" queue of TCBs with the same priority. Windows supports seven priorities, three of which are shown in Figure 1. The Windows scheduler uses a preemptive First Come, First Served strategy: it goes to the highest-priority queue holding one or more TCBs, dequeues the TCB at the head of that queue, and starts a thread using the TCB contents. The thread is assigned a quantum, or time slice, that determines the maximum time it can run before being suspended. The quantum is determined by the thread's priority: higher priorities get larger quanta and so run longer. A lower-priority queue is serviced only if all higher-priority queues are empty. For that reason it is entirely possible, though unlikely, that a low-priority thread never runs. Windows is not a "fair" operating system.

When a running thread's time slice expires, its TCB is placed at the end of the queue for its priority, where it waits for service by the scheduler. There are three other reasons for a thread to stop: it may block waiting for a resource held by another thread, e.g. an I/O device or file; it may be suspended by another thread making a call to suspend it using the thread API; or it may abort due to an unhandled exception. When a blocked thread becomes unblocked, as signaled by the resource on which it waits, it is placed back on the queue for its priority and waits to be serviced. When a suspended thread is released by another thread, it too is placed back on its ready-to-run queue.

Linux and macOS use essentially the same strategies with some amendments. Linux, for example, maintains two arrays of prioritized queues: active and expired. These schedulers behave in a manner similar to Windows, but schedule processes instead of threads. Unlike Windows threads, Linux threads are simply child processes that share the same virtual address space as their parent process.
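The queueing behavior described above can be sketched as a toy simulation. The priority levels, quantum sizes, and thread names here are illustrative only, not real Windows values, and a real scheduler dispatches per core with far more machinery:

```python
from collections import deque

# Toy model of the multilevel "ready to run" queues described above.
# Priorities, quanta, and thread names are illustrative, not Windows values.
PRIORITIES = [2, 1, 0]           # serviced highest first
QUANTUM = {2: 30, 1: 20, 0: 10}  # higher priority -> larger quantum (ms)

ready = {p: deque() for p in PRIORITIES}

def enqueue(tcb):
    """Place a TCB at the tail of its priority's ready queue."""
    ready[tcb["priority"]].append(tcb)

def pick_next():
    """Dequeue the TCB at the head of the highest-priority non-empty queue."""
    for p in PRIORITIES:
        if ready[p]:
            return ready[p].popleft()
    return None

def run(max_dispatches):
    trace = []
    for _ in range(max_dispatches):
        tcb = pick_next()
        if tcb is None:
            break
        # Run for at most one quantum, or until the thread finishes.
        burst = min(QUANTUM[tcb["priority"]], tcb["remaining"])
        tcb["remaining"] -= burst
        trace.append((tcb["name"], burst))
        if tcb["remaining"] > 0:
            enqueue(tcb)  # time slice expired: back to the tail of its queue
    return trace

for name, prio, work in [("A", 2, 50), ("B", 2, 40), ("C", 0, 10)]:
    enqueue({"name": name, "priority": prio, "remaining": work})

trace = run(10)
print(trace)
```

Running this, the low-priority thread C is dispatched only after the higher-priority threads A and B have finished, which is exactly the starvation hazard noted above.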

Consequences:

The good:

  - A thread's average processing rate depends on its (possibly program-defined) priority, so each core works most often on the most important threads.
  - Blocked or suspended threads use almost no processor resources.
  - Each thread has access to a set of shared process resources.

The bad:

  - Low-priority threads may never run.
  - A high-priority thread may be blocked waiting for a resource held by a lower-priority thread.
  - Each thread has start-up latency that depends on its priority, the number of threads ahead of it in its ready-to-run queue, and the number of waiting threads with higher priority.
  - A fraction of a core's execution time is spent on thread context switching; if a large number of threads are running on a core, the average useful work performed may become very low.
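The second "bad" item, a high-priority thread stalled behind a lower-priority one, can be seen with an ordinary lock. This sketch uses Python threads, which have no real priorities (the names are only labels), so it shows only the blocking relationship, not actual scheduler behavior:

```python
import threading
import time

resource = threading.Lock()
order = []  # records the observable sequence of events

def low_priority_worker():
    # The "low priority" thread grabs the resource first and is slow with it.
    with resource:
        order.append("low: holding resource")
        time.sleep(0.1)
        order.append("low: about to release")

def high_priority_worker():
    # The "high priority" thread arrives later and must wait, however
    # important it is -- the scheduler cannot run it past the lock.
    time.sleep(0.02)
    order.append("high: blocked waiting")
    with resource:
        order.append("high: finally running")

low = threading.Thread(target=low_priority_worker)
high = threading.Thread(target=high_priority_worker)
low.start(); high.start()
low.join(); high.join()
print(order)
```

No matter how urgent the second thread's work is, it cannot proceed until the first releases the lock; real schedulers mitigate this with techniques such as priority boosting.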

References:

  1. Linux Scheduler (pdf)
  2. Windows Scheduler