///Processes/ProcessTalk.tioga August 28, 1986 Mesa Processes · Process qualities · Language support · System support · Some observed statistics · PrincOps data structures · DragOps data structures · DragOps primitives · Open questions Process qualities Mesa processes are: · "lightweight" low-cost creation, switching, termination little state information · in single address space · dynamic (not permanently assigned) · coordinated by monitors [C. A. R. Hoare] Language support · PROCESS FORK creates processes from invocations RETURN terminates process JOIN waits for process to finish · MONITOR Monitor locks (MLs) synchronize data access ENTRY procedures use MLs to synchronize execution MONITORED RECORD protects record instead of global frame · CONDITION Condition variables (CVs) protected by MLs WAIT suspends process until condition occurs NOTIFY, BROADCAST indicates when condition has occurred System support · Priorities 8 levels · Scheduler runs highest priority ready process time slicing within priorities · Interrupts and notifications device "interrupts" become "naked" notifies · Timeouts when a condition is overdue to occur default is no timeout · Detached processes · Aborting processes Some observed statistics (Dorado) Process switches per second in Cedar minimum: ~ 250 per second maximum: ~ 1000 per second · Background: 240 per second · Compilation: 330 per second adds in I/O switching to background · TiogaToInterpress: 350 - 850 per second variance still mysterious · MergeCedarMap: 660 per second uses multiple processes Things to note most process switches due to CV wait PrincOps · Primitives in microcode including most scheduling · Single processor · Very little state to switch about 11 msecs in basic cost · 1024 processes maximum Cedar boot file made for 300 maximum 100-150 typically in use · Emphasis on small size 8 bytes per process 2 bytes per monitor lock 4 bytes per condition variable DragOps · Primitives in software · Multi-processor minimum: 1 maximum: < 100 · Moderate state to switch up to 12 frames in registers 100-200 msecs typical cost (estimate) · No inherent process maximum · No small size emphasis DragOps design objectives · Efficient primitives especially cheap ML processing in normal case! · Dynamic process to processor association all processors are "equal" avoid bottlenecks caused by dedicated processors · Scalable same mechanism works for 1 to 100 processors no inherent limit on processes · Debuggable keep more than minimal info (i.e. ML owner) Process data (PrincOps) · PrincOps PDABase: TYPE = LONG BASE POINTER TO ProcessDataArea; PDA: PDABase = LOOPHOLE[200000B]; PsbIndex: TYPE = CARDINAL [0..1024); -- same as PROCESS PsbHandle: PDABase RELATIVE POINTER TO ProcessStateBlock; ProcessStateBlock: TYPE = RECORD [ link: PsbLink, -- to next waiting process flags: PsbFlags, -- incl. cleanup link context: Context, -- frame | state vector timeout: Ticks]; -- ticks remaining Process data (DragOps) · DragOps Process: TYPE = LONG POINTER TO ProcessRep; ProcessRep: TYPE = RECORD [ queue: Queue, -- CV | ML being waited next: Process, -- next waiting meta: Process, -- next on meta queue when: Word, -- ticks to go | page priority: Priority, -- process priority [0..7] state: ProcessState, -- ready, waiting, ... abort: AbortState, -- request & enable bits euState: EUstate]; -- switched registers Monitor locks · PrincOps Monitor: TYPE = RECORD [ reserved: [0..17B] _ 0, tail: PsbIndex, available: [0..1] _ 0, locked: BOOL]; · DragOps QueueRep: TYPE = RECORD [ busy: Process, chain: Process]; MonitorLockRep: TYPE = RECORD [ queue: QueueRep, owner: Process]; Condition variables · PrincOps Condition: TYPE = RECORD [ reserved: [0..17B] _ 0, tail: PsbIndex, abortable: BOOL, wakeup: BOOL]; ConditionVariable: TYPE = RECORD [ condition: Condition, timeout: Ticks]; -- CARDINAL · DragOps ConditionRep: TYPE = RECORD [ queue: QueueRep, timeout: Ticks]; -- INT Monitor Entry & Exit · MonitorEntry [lock: MonitorLock] self: Process = CurrentProcess[]; DO IF CStoreProcess[ ptr: @lock.owner, old: NIL, new: self] THEN RETURN; Enqueue[@lock.queue, self]; IF lock.owner # NIL THEN Yield[waitingML, lock.owner] ELSE DequeueSelf[@lock.queue]; ENDLOOP; · MonitorExit [lock: MonitorLock] lock.owner _ NIL; IF lock.queue.chain # NIL THEN { next: Process = Dequeue[@lock.queue]; IF next # NIL THEN MakeReady[next]}; Wait & Notify · Wait [cond: Condition, lock: MonitorLock] self: Process = CurrentProcess[]; Enqueue[@cond.queue, self]; MonitorExit[lock]; Yield[waitingCV, NIL, cond.timeout]; IF self.next # NIL THEN DequeueSelf[@cond.queue]; -- dequeue self from queue if necessary MonitorEntry[lock]; · Notify [cond: Condition] best: Process = Dequeue[@cond.queue]; IF best # NIL THEN MakeReady[best]; Enqueue · Enqueue [q: Queue, p: Process] tail: Process; AcquireQueue[q]; tail _ q.chain; IF tail = NIL THEN p.next _ p ELSE {p.next _ tail.next; tail.next _ p}; q.chain _ p; p.queue _ q; ReleaseQueue[q]; Note: priority not used to place in queue Dequeue · Dequeue [q: Queue] RETURNS [best: Process] IF q.chain # NIL THEN { AcquireQueue[q]; IF q.chain # NIL THEN { tail: Process = q.chain; this: Process _ best _ tail.next; IF best = tail THEN q.chain _ NIL ELSE { lag: Process _ tail; WHILE this # tail DO next: Process _ this.next; IF next.priority > best.priority THEN {lag _ this; best _ next}; this _ next; ENDLOOP; IF tail = best THEN q.chain _ lag; lag.next _ best.next}; best.queue _ NIL; best.next _ NIL}; ReleaseQueue[q]}; Note: highest priority process removed AcquireQueue & ReleaseQueue · AcquireQueue [q: Queue] self: Process = CurrentProcess[]; WHILE NOT CStoreProcess[ ptr: @q.busy, old: NIL, new: self] DO WHILE q.busy # NIL DO Yield[ready, q.busy]; ENDLOOP; ENDLOOP; Note: Yield to owner gets around "livelock" · ReleaseQueue [q: Queue] q.busy _ NIL; Yield · Yield [nextState: ProcessState, nextProcess: Process, when: Word] nextProcess # NIL => switch to nextProcess if it is ready nextProcess = NIL => switch to best ready process nextState = waitingCV, waitingML => wait for being made ready nextState = ready => put self at back of ready queue for priority when used for # of ticks to go for CV timeout page # for page fault Meta Queues · Ready meta-queues one for each priority · Timeout meta-queue scanned at each "tick" to update timeout words relatively few processes use timeouts -- will this stay true? · Page fault meta-queue can page fault while "waiting" for CV or ML Reschedule · Interrupts all Dragon processors the lowest priority processor grabs the meta lock meta lock ownership gives right to run I/O processing run scheduler · I/O processing interrupts turn into naked notifies (all data must be resident) · Scheduler scheduler code chooses best processes to run (which may force switching) scheduler sets orders field in processor data (reschedule interrupt used to deliver orders) · Processor orders tested at every reschedule interrupt can force switch to designated process (NIL => switch to "best" ready process) can use to panic stop (debugging & stack tracing) Miscellaneous · Stack save/restore uses nachos to save frames same as stack overflow/underflow 20-30 msecs / frame to save & restore dominant cost for process switch avg. # of frames not known · Emergency limits on nacho allocation per process limit (soft limit - can override) total limit (hard limit - go to teledebugger) · Timeout meta-queue allows waiting for timeout & CV at same time only need to scan processes actually waiting timed-out processes remove self from CV queue Open Questions · Process switch cost Effect of multiple processors on switching rate? Average # of frames at process switch? 2 may be low, 10 probably high frequent switching reduces average depth Are nachos efficient enough? Can modest extra hardware help? · Process abort enable: per CV or per process? must be careful about interaction with UNWIND · Timeout how fine a grain? avoid multiple wakeups at same instant? how many processes will use timeouts? · Bottlenecks scheduler lock? 4Open question: retain abortable flag in CV? ΚΤ– PressFonts–"slides" style˜Iunleaded•Mark insideFooteršΡdis˜K– outsideFooterš˜title˜IraggedšΟb˜Mšž˜Mšž˜Mšž˜Mšž˜Mšž˜Mšž˜Mšž˜—˜šž˜šœ˜Mšœ)˜)Mšœ˜—Mšœ˜Mšœ$˜$Mšœ*˜*——˜šž ˜ Mšžœ#˜'Mšžœ˜Mšžœ˜ —šž ˜ M˜+šžœ ˜M˜ —šž˜M˜'——šž ˜ M˜*šž˜M˜'—šž˜M˜%———˜šž ˜ Mšœ˜—šž ˜ M˜#M˜—šž˜M˜+—šž ˜ M˜$M˜—Mšž˜Mšž˜—šœ!˜!Mšž$œ5˜YMšž˜šž˜M˜#—šž)˜)M˜—šž˜M˜—šž˜M˜$——˜šž˜M˜—Mšž˜šž˜Mšœ Οgœ˜—šž˜M˜$M˜—šž˜M˜M˜M˜——˜Mšž˜šž˜M˜ M˜—šž˜M˜MšœŸœ˜%—Mšž˜Mšž˜—˜šž˜M˜.—šž*˜*M˜M˜0—šž ˜ M˜,M˜—šž ˜ M˜+——šœžœ˜šž ˜ Mš œ Οkœ œ œ œ œ˜5Mš œ  œ ˜!Mšœ  œ œ Οc˜7Mšœ œ œ œ˜9šœ œ œ˜"Mšœ‘˜*Mšœ‘˜'Mšœ‘˜*Mšœ‘˜$———šœžœ˜šž ˜ Mš œ  œ œ œ œ ˜+–1.2 in tabStopsšœ  œ œ˜M–1.2 in tabStopsšœ‘˜%M–1.2 in tabStopsšœ‘˜M–1.2 in tabStopsšœ‘˜$M–1.2 in tabStopsšœ ‘˜!M–1.2 in tabStopsšœ‘˜.M–1.2 in tabStopsšœ‘˜+M–1.2 in tabStopsšœ‘˜+M–1.2 in tabStopsšœ‘˜(———˜ šž ˜ šœ  œ œ˜M˜M˜M˜Mšœ œ˜—M˜—šž ˜ šœ  œ œ˜M˜Mšœ˜—šœ œ œ˜Mšœ˜Mšœ˜———˜šž ˜ šœ  œ œ˜M˜M˜Mšœ  œ˜Mšœ œ˜—šœ œ œ˜"Mšœ˜Mšœ‘ ˜M˜——šž ˜ šœ œ œ˜Mšœ˜Mšœ‘˜——MšΟi,™,—˜šž"˜"Mšœ!˜!š ˜Mš œ+ œ  œ œ˜IMšœ˜Mš œ œ œ œ˜TMš œ˜——šž!˜!Mšœ  œ˜š œ œ œ˜ M˜%Mš œ œ œ˜$———˜ šž+˜+M˜!Mšœ˜Mšœ˜Mšœ œ˜$Mš œ  œ œ‘(˜ZMšœ˜—šž˜M˜%Mš œ œ œ˜#——˜šž ˜ Mšœ˜Mšœ˜Mšœ˜Mš œ œ œ  œ%˜GMšœ ˜ Mšœ ˜ Mšœ˜—Mš’)˜)—˜šžΠbkž˜,š œ  œ œ˜Mšœ˜Mš  œ  œ œ> œ œ  œ œ( œ  œ4 œ/ œA œ  œ  œ< œ œ˜έMšœ˜——Mš’&˜&—šœ˜šž˜Mšœ!˜!š œ œ8 œ  ˜Sšœ˜Mšœ˜Mšœ˜—Mšœ˜—Mš’+˜+—šž˜Mšœ  œ˜ ——˜šžE˜EMšΟz œ€ œ˜9Mš€ œ&˜1Mš€ œ4˜=Mš€ œ8˜AMš€œ?˜C——˜ šž˜M˜—šž˜M˜.Mšœ&‘˜=—šž˜M˜+——šœ ˜ šž"˜"M˜1M˜C—šž˜M˜?—šž ˜ M˜HMšœ[˜[—šž˜M˜$Mšœ( œ#˜NM˜1——˜ šž˜M˜;MšœŸœZ˜a—šž&˜&M˜-M˜-—šž˜M˜,M˜,M˜-——˜šž˜Mšœ0˜0šœ&˜&Mšœ’œ˜M˜(—Mšœ˜M˜—šž˜M˜Mšœ' ˜-—šž ˜ M˜M˜'M˜%—šž ˜ M˜———…—π(ψ