///Processes/ProcessTalk.tioga
August 28, 1986
Mesa Processes
· Process qualities
· Language support
· System support
· Some observed statistics
· PrincOps data structures
· DragOps data structures
· DragOps primitives
· Open questions
Process qualities
Mesa processes are:
· "lightweight"
low-cost creation, switching, termination
little state information
· in single address space
· dynamic (not permanently assigned)
· coordinated by monitors [C. A. R. Hoare]
Language support
· PROCESS
FORK creates a process from a procedure invocation
RETURN terminates process
JOIN waits for process to finish
· MONITOR
Monitor locks (MLs) synchronize data access
ENTRY procedures
use MLs to synchronize execution
MONITORED RECORD
protects record instead of global frame
· CONDITION
Condition variables (CVs) protected by MLs
WAIT
suspends process until condition occurs
NOTIFY, BROADCAST
indicate that the condition has occurred
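A schematic Mesa sketch of these constructs; the Buffer monitor and all of its names are illustrative, not part of this talk:
Buffer: MONITOR = BEGIN
  nonEmpty: CONDITION;
  item: INT;
  full: BOOL ← FALSE;
  Put: ENTRY PROC [x: INT] = {
    item ← x; full ← TRUE;
    NOTIFY nonEmpty};  -- wake one waiter
  Get: ENTRY PROC RETURNS [x: INT] = {
    WHILE NOT full DO WAIT nonEmpty ENDLOOP;  -- ML is released while waiting
    x ← item; full ← FALSE};
  END.
-- client side: FORK/JOIN
reader: PROCESS RETURNS [INT] ← FORK Get[];  -- new process runs Get
Put[17];  -- producer eventually fills the buffer
result: INT ← JOIN reader;  -- wait for Get's RETURN, collect its result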
System support
· Priorities
8 levels
· Scheduler
runs highest priority ready process
time slicing within priorities
· Interrupts and notifications
device "interrupts" become "naked" notifies
· Timeouts
when a condition is overdue to occur
default is no timeout
· Detached processes
· Aborting processes
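A hedged sketch of the client-level view of this support, assuming the usual Cedar Process interface names (Detach, SetTimeout, MsecToTicks, Abort); Worker, c, and p are illustrative:
Process.Detach[FORK Worker[]];  -- run Worker independently; no JOIN will follow
Process.SetTimeout[@c, Process.MsecToTicks[50]];  -- WAITs on c now give up after ~50 msec (default is no timeout)
Process.Abort[p];  -- ask process p to raise ABORTED at its next WAIT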
Some observed statistics (Dorado)
Process switches per second in Cedar
minimum: ~ 250 per second
maximum: ~ 1000 per second
· Background: 240 per second
· Compilation: 330 per second
adds in I/O switching to background
· TiogaToInterpress: 350 - 850 per second
variance still mysterious
· MergeCedarMap: 660 per second
uses multiple processes
Things to note
most process switches due to CV wait
PrincOps
· Primitives in microcode
including most scheduling
· Single processor
· Very little state to switch
about 11 µsecs in basic cost
· 1024 processes maximum
Cedar boot file made for 300 maximum
100-150 typically in use
· Emphasis on small size
8 bytes per process
2 bytes per monitor lock
4 bytes per condition variable
DragOps
· Primitives in software
· Multi-processor
minimum: 1
maximum: < 100
· Moderate state to switch
up to 12 frames in registers
100-200 µsecs typical cost (estimate)
· No inherent process maximum
· No small size emphasis
DragOps design objectives
· Efficient primitives
especially cheap ML processing in normal case!
· Dynamic process to processor association
all processors are "equal"
avoid bottlenecks caused by dedicated processors
· Scalable
same mechanism works for 1 to 100 processors
no inherent limit on processes
· Debuggable
keep more than minimal info (e.g. ML owner)
Process data (PrincOps)
· PrincOps
PDABase: TYPE =
LONG BASE POINTER TO ProcessDataArea;
PDA: PDABase = LOOPHOLE[200000B];
PsbIndex: TYPE = CARDINAL [0..1024);
-- same as PROCESS
PsbHandle: TYPE = PDABase
RELATIVE POINTER TO ProcessStateBlock;
ProcessStateBlock: TYPE = RECORD [
link: PsbLink, -- to next waiting process
flags: PsbFlags, -- incl. cleanup link
context: Context, -- frame | state vector
timeout: Ticks]; -- ticks remaining
Process data (DragOps)
· DragOps
Process: TYPE = LONG POINTER TO ProcessRep;
ProcessRep: TYPE = RECORD [
queue: Queue, -- CV | ML being waited on
next: Process, -- next waiting
meta: Process, -- next on meta queue
when: Word, -- ticks to go | page
priority: Priority, -- process priority [0..7]
state: ProcessState, -- ready, waiting, ...
abort: AbortState, -- request & enable bits
euState: EUstate]; -- switched registers
Monitor locks
· PrincOps
Monitor: TYPE = RECORD [
reserved: [0..17B] ← 0,
tail: PsbIndex,
available: [0..1] ← 0,
locked: BOOL];
· DragOps
QueueRep: TYPE = RECORD [
busy: Process,
chain: Process];
MonitorLockRep: TYPE = RECORD [
queue: QueueRep,
owner: Process];
Condition variables
· PrincOps
Condition: TYPE = RECORD [
reserved: [0..17B] ← 0,
tail: PsbIndex,
abortable: BOOL,
wakeup: BOOL];
ConditionVariable: TYPE = RECORD [
condition: Condition,
timeout: Ticks]; -- CARDINAL
· DragOps
ConditionRep: TYPE = RECORD [
queue: QueueRep,
timeout: Ticks]; -- INT
Open question: retain abortable flag in CV?
Monitor Entry & Exit
· MonitorEntry [lock: MonitorLock]
self: Process = CurrentProcess[];
DO
IF CStoreProcess[
ptr: @lock.owner, old: NIL, new: self]
THEN RETURN;
Enqueue[@lock.queue, self];
IF lock.owner # NIL
THEN Yield[waitingML, lock.owner]
ELSE DequeueSelf[@lock.queue];
ENDLOOP;
· MonitorExit [lock: MonitorLock]
lock.owner ← NIL;
IF lock.queue.chain # NIL THEN {
next: Process = Dequeue[@lock.queue];
IF next # NIL THEN MakeReady[next]};
Wait & Notify
· Wait [cond: Condition, lock: MonitorLock]
self: Process = CurrentProcess[];
Enqueue[@cond.queue, self];
MonitorExit[lock];
Yield[waitingCV, NIL, cond.timeout];
IF self.next # NIL THEN
DequeueSelf[@cond.queue];
-- dequeue self from queue if necessary
MonitorEntry[lock];
· Notify [cond: Condition]
best: Process = Dequeue[@cond.queue];
IF best # NIL THEN MakeReady[best];
Enqueue
· Enqueue [q: Queue, p: Process]
tail: Process;
AcquireQueue[q];
tail ← q.chain;
IF tail = NIL
THEN p.next ← p
ELSE {p.next ← tail.next; tail.next ← p};
q.chain ← p;
p.queue ← q;
ReleaseQueue[q];
Note: priority not used to place in queue
Dequeue
· Dequeue [q: Queue] RETURNS [best: Process]
IF q.chain # NIL THEN {
AcquireQueue[q];
IF q.chain # NIL THEN {
tail: Process = q.chain;
this: Process ← best ← tail.next;
IF best = tail
THEN q.chain ← NIL
ELSE {
lag: Process ← tail;
WHILE this # tail DO
next: Process ← this.next;
IF next.priority > best.priority
THEN {lag ← this; best ← next};
this ← next;
ENDLOOP;
IF tail = best THEN q.chain ← lag;
lag.next ← best.next};
best.queue ← NIL;
best.next ← NIL};
ReleaseQueue[q]};
Note: highest priority process removed
AcquireQueue & ReleaseQueue
· AcquireQueue [q: Queue]
self: Process = CurrentProcess[];
WHILE NOT CStoreProcess[
ptr: @q.busy, old: NIL, new: self] DO
WHILE q.busy # NIL DO
Yield[ready, q.busy];
ENDLOOP;
ENDLOOP;
Note: Yield to owner gets around "livelock"
· ReleaseQueue [q: Queue]
q.busy ← NIL;
Yield
· Yield [nextState: ProcessState,
 nextProcess: Process,
 when: Word]
nextProcess # NIL =>
switch to nextProcess if it is ready
nextProcess = NIL =>
switch to best ready process
nextState = waitingCV, waitingML =>
wait until made ready
nextState = ready =>
put self at back of ready queue for its priority
when is used for
# of ticks to go for CV timeout
page # for page fault
Meta Queues
· Ready meta-queues
one for each priority
· Timeout meta-queue
scanned at each "tick" to update timeout words
relatively few processes use timeouts
-- will this stay true?
· Page fault meta-queue
can page fault while "waiting" for CV or ML
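A schematic sketch of the per-tick scan of the timeout meta-queue, using the ProcessRep fields above; CheckTimeouts and timeoutMetaQueue are illustrative names, not actual DragOps code:
timeoutMetaQueue: Process;  -- head of the timeout meta-queue
CheckTimeouts: PROC = {  -- run once per tick
  FOR p: Process ← timeoutMetaQueue, p.meta UNTIL p = NIL DO
    p.when ← p.when - 1;  -- ticks to go (ProcessRep.when)
    IF p.when = 0 THEN MakeReady[p];  -- the awakened Wait dequeues itself from the CV queue
    ENDLOOP};  -- removal from the meta-queue omitted in this sketch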
Reschedule
· Interrupts all Dragon processors
the lowest priority processor grabs the meta lock
meta lock ownership gives right to
run I/O processing
run scheduler
· I/O processing
interrupts turn into naked notifies
(all data must be resident)
· Scheduler
scheduler code chooses best processes to run
(which may force switching)
scheduler sets orders field in processor data
(reschedule interrupt used to deliver orders)
· Processor orders
tested at every reschedule interrupt
can force switch to designated process
(NIL => switch to "best" ready process)
can use to panic stop (debugging & stack tracing)
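A heavily hedged sketch of the per-processor reschedule path described above; all names here (Reschedule, GrabMetaLock, ProcessNakedNotifies, RunScheduler, ObeyOrders) are illustrative, not actual DragOps code:
Reschedule: PROC = {  -- entered by every processor on the interrupt
  IF GrabMetaLock[] THEN {  -- normally won by the lowest priority processor
    ProcessNakedNotifies[];  -- turn device interrupts into notifies
    RunScheduler[];  -- choose best processes; set each processor's orders field
    ReleaseMetaLock[]};
  ObeyOrders[];  -- test own orders: switch to designated process,
  };  -- to best ready process if NIL, or panic stop for debugging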
Miscellaneous
· Stack save/restore
uses nachos to save frames
same as stack overflow/underflow
20-30 µsecs / frame to save & restore
dominant cost for process switch
avg. # of frames not known
· Emergency limits on nacho allocation
per process limit
(soft limit - can override)
total limit
(hard limit - go to teledebugger)
· Timeout meta-queue
allows waiting for timeout & CV at same time
only need to scan processes actually waiting
timed-out processes remove self from CV queue
Open Questions
· Process switch cost
Effect of multiple processors on switching rate?
Average # of frames at process switch?
2 may be low, 10 probably high
frequent switching reduces average depth
Are nachos efficient enough?
Can modest extra hardware help?
· Process abort
enable: per CV or per process?
must be careful about interaction with UNWIND
· Timeout
how fine a grain?
avoid multiple wakeups at same instant?
how many processes will use timeouts?
· Bottlenecks
scheduler lock?