About processes:
1. All processes execute in a single address space, although we will want to extend our model to multiple address spaces some day (Dragon has address space switching support).
2. Processes are (relatively) lightweight. Most of the process switch time is devoted to saving and restoring registers. We estimate 100-300 msecs per process switch, depending on the amount of state.
3. Processes are dynamically assigned to processors by the scheduler, so that processes can be considered virtual processors. We think of the scheduler as executing in the cracks between processes.
4. The scheduler attempts to run higher priority processes in preference to lower priority processes. However, there is no guarantee that a higher priority process will preempt a lower priority process, even for single processor systems.
About locks:
1. Primitive spin locks (PSLs) are the most primitive locks. PSLs are built out of CStore, are only acquired with reschedule off, and must be pinned. A single processor system should never wait for PSLs, obviously. In a multi-processor system PSLs must never be held long, or they become a bottleneck. The most important (and currently only) PSL is the metaLock, which controls access to the global process state information (especially the meta queues).
2. General spin locks (GSLs) use CStore and directed Yield. If CStore cannot acquire the lock, then the process attempting to acquire the lock donates its cycles to the process holding the lock using directed Yield. If the process holding the lock cannot be run, directed Yield acts like undirected Yield. GSLs can be non-pinned.
3. General queues (GQs) use a GSL to serialize ownership of the queue, and a circular queue of processes to hold the processes that are waiting. See Enqueue and Dequeue for details.
4. Monitor locks (MLs) are built from a GQ and an ownership word, which is used to denote the process owning the lock. See MonitorEntry and MonitorExit for details.
5. Condition variables (CVs) do not assume that a monitor is held, although one is always present. CVs also support timeouts. CVs are built from a GQ and a timeout word that holds the initial value of the timeout period. A flags word contains a bit that can enable aborts to affect the CV.
6. Interrupt condition variables (ICVs) are the mechanism for turning device signals and interrupts into ICV notifications. ICVs do not assume that a monitor is held, since none is ever present. ICVs also support timeouts. ICVs are built from a CV and a request bit for interrupt requests so we do not lose them while not waiting for the ICV. ICVs must be pinned, to allow the scheduler to remove requests from them.
External procedures
Fork:
PUBLIC
PROC
[proc:
PROC
ANY
RETURNS
ANY, argPtr: WordSeq, nArgs:
INT, nRets:
INT]
RETURNS [new: Process ←
NIL] = {
... forks a new process, which will initially call the given procedure using the given arguments. The number of words of arguments and returns is also necessary.
In the calling process:
F1: allocate a new process. Return NIL if we can't do this. (Error??)
-- CHH: PrincOps ProcessImpl.Fork waits until a process is available. Do we really want to change the semantics? March 12, 1987
F2: newFX ← Allocate a FX of (IF nRets > maxStackRets THEN nRets ELSE 0)+(IF nArgs > maxStackArgs THEN nArgs ELSE 0) words.
F3: copy the arguments from argPtr^ for nArgs words to newFX
(the original argument storage belongs to the caller)
F4: setup the process block & make it ready to run Forkee, having set things up so that Forkee's arguments will be correct. Forkee[proc, newFX, nArgs, nRets]
{
-- The following SIM code works if there are no args or results ...
simProc: PROC ← LOOPHOLE [proc];
RDI: allocate storage, copy args to a safe place, etc.
DisableInterrupts[];
AcquireMetaLock[];
new ← NewProcessInner[];
new.priority ← CurrentProcess[].priority;
RDI: make the process a schedulable entity that will run Forkee when it's scheduled. For now ...
C7Process.Detach[ FORK SimForkee[new, simProc] ];
EnqueueReady[new];
ReleaseMetaLock[];
EnableInterrupts[]; -- automatic in RDI
};
};
SimForkee:
PROC [process: Process, proc:
PROC] ~ {
SimBackground[]; -- SIM only, because we just forked
Note at this point SimCurrentProcessor[] is NIL! This is okay, since we're about to suspend ourselves. When we get switched back to, the current processor will be filled in.
SimCallAndSuspend[process.sim.lock, process, NIL]; -- just suspend until we're switched to
RDI: get arguments from somewhere
proc[];
RDI: put results somewhere
MonitorEntry[@process.lock];
DO
SELECT process.joinState
FROM
none, exiting => {
process.joinState ← exiting;
Wait[@process.joinCondition, @process.lock] };
detached, joined => {
EXIT };
joining => {
process.joinState ← exiting;
Notify[@process.joinCondition];
Wait[@process.joinCondition, @process.lock] };
ENDCASE => ERROR;
ENDLOOP;
process.joinState ← none; -- not really necessary (will be done by CleanProcess)
MonitorExit[@process.lock];
RDI: Free the results
SIM: don't create a suspension here ... this process won't get switched back to!
RDIDirectedYield[free, NIL, 0]; -- RDI: replace by DirectedYield.
};
Forkee:
PROC [proc:
PROC
ANY
RETURNS
ANY, argPtr: WordSeq, nArgs:
INT, nRets:
INT]
RETURNS [
--doesn't--] ~ {
retRecord: WordSeq;
FF0: IF nRets > maxStackRets THEN
{ retRecord ← argPtr^;
argPtr^ ← argPtr + nArgs;
};
FF1: Depending on nArgs, either push argPtr or push nArgs from argPtr^;
FF2: Push appropriate descriptor information for proc
FF3: Call proc using an inline SFCI instruction; it eventually will return or not as it sees fit.
JJ1: proc's RETURN record may either be in the FX pointed to by argPtr at offset nArgs or on the stack.
JJ2: wait for JOIN or DETACH by another process.
JJ3: IF JOINed
THEN {
Copy return values in stack or FX to a Nacho;
Give the Nacho to the other process
}
JJ4: FrameAllocator.Free[argPtr];
JJ5: place the process in the free list (state ← free), free any remaining resources (possibly a FX) and quietly disappear
NotYetImplemented[];
};
Join:
PUBLIC
PROC [process: Process, retPtr: WordSeq] = {
... joins the specified process, which waits for the process to complete, then copies the return values to the address given by retPtr.
In the calling process:
J1: wait for process to be in the done state (wait on condition?)
J2: copy the return values to retPtr^
J3: place the process in the free list (state ← free)
J4: return to the caller
{
-- SIM only, no args or results
ValidateProcess[process];
MonitorEntry[@process.lock];
DO
SELECT process.joinState
FROM
none, joining => {
process.joinState ← joining;
Wait[@process.joinCondition, @process.lock] };
exiting => {
get the results
process.joinState ← joined;
Notify[@process.joinCondition];
EXIT };
detached, joined => {
This is a client error. What to do ????
ERROR;
};
ENDCASE => ERROR;
ENDLOOP;
MonitorExit[@process.lock];
};
};
SimDetach:
PROC [process: Process] ~ {
ValidateProcess[process];
MonitorEntry[@process.lock];
SELECT process.joinState
FROM
none => {
process.joinState ← detached };
exiting => {
process.joinState ← detached;
Notify[@process.joinCondition] };
joining => {
This is a client error. What to do ????
ERROR };
detached, joined => {
This is a client error, but probably benign. What to do ????
ERROR };
ENDCASE => ERROR;
MonitorExit[@process.lock];
};
DirectedYield:
PUBLIC
PROC
[nextState: ProcessState, nextProcess: Process, when: Ticks] =
TRUSTED {
SIM only ... this is a veneer around RDIDirectedYield, below.
DoDirectedYield: PROC ~ { RDIDirectedYield[nextState, nextProcess, when] };
SimCallAndSuspend[CurrentProcess[].sim.lock, CurrentProcess[], DoDirectedYield];
This is where the process will resume when it's rescheduled.
SimCheckForRescheduleTrap[];
};
RDIDirectedYield:
PUBLIC
PROC
[nextState: ProcessState, nextProcess: Process, when: Ticks] =
TRUSTED {
... yields the processor to the nextProcess, placing the current process in nextState. If nextProcess = NIL or is not ready, then the executing processor is yielded to the "best" ready process, placing the current process in nextState.. DirectedYield inhibits critical traps, and performs a ProcessSwitch. The when word is used for passing through timeout information.
SIM: This routine must be called only when critical traps are enabled.
RDI: change name to DirectedYield!
DisableInterrupts[];
CurrentProcess[].when ← when;
MarkFrame[];
This Indicates that the current frame should not be saved.
ProcessSwitch[nextState, nextProcess];
EnableInterrupts[]; -- SIM only. RDI: happens automatically.
};
Wait:
PUBLIC
PROC [cond: Condition, lock: MonitorLock] = {
... waits for the condition variable cond to be the target of a Notify, Broadcast, or timeout. If lock # NIL, then lock must be released during the possible pause, and lock must be re-acquired after the Wait has finished.
Place self on the condition variable; we do this before giving up the monitor to avoid missing a Notify, which would be wrong; we cannot get a Notify between now and releasing the monitor lock, since one must possess the monitor lock in order to do a Notify.
Enqueue[@cond.queue];
Release the monitor lock. After this the condition can be Notified by another process.
IF lock # NIL THEN MonitorExit[lock];
???? How test abort ????
IF (CurrentProcess[].abortState # requested)
OR (
NOT cond.flags.abortEnable)
THEN {
Go to sleep, choosing the "best" process to run next. If we get a Notify after exiting the monitor and before going to sleep, the Notify will remove this process from the condition, yet will NOT place this process on the ready queue (because its state is not yet waitingCV).
DirectedYield[waitingCV, NIL, cond.timeout];
};
Remove self from the condition queue, if still on it. DequeueSelf has no effect if the current process is not on a queue, so a race with some other process trying to dequeue this process is benign.
DequeueSelf[@cond.queue];
Reacquire the monitor lock.
IF lock # NIL THEN MonitorEntry[lock];
Final test for abort request while we were waiting on the CV.
???? How test abort ????
IF CurrentProcess[].abortState = requested AND cond.flags.abortEnable THEN TestAbort[];
};
WaitInterrupt:
PUBLIC
PROC [cond: Condition] = {
... waits for cond to be the target of an interrupt request, or timeout. Unlike Wait, there are no monitor locks involved, and we have to proceed immediately if interrupts have occurred since the last WaitInterrupt.
For now, we prohibit aborts on interrupt condition variables; comments suggest how aborts could be implemented if we wanted them.
IF
NOT cond.flags.condRequest
THEN {
Enqueue[@cond.queue];
IF
NOT cond.flags.condRequest
THEN {
We are now on the queue. A DelicateNotify is guaranteed either (i) to remove somebody from the queue, or (ii) to fail to acquire the queue. In case (i), if this process is the one removed from the queue, and it happens before the DirectedYield below, then the DirectedYield will return immediately. Otherwise, things happen normally. (See the comments in Wait.) In case (ii), the queue is held by some process in WaitInterrupt, which is guaranteed to examine cond.flags.condRequest shortly. In no case will an interrupt be lost.
ABORTS: IF we've been aborted THEN skip the DirectedYield ...
DirectedYield[waitingICV, NIL, cond.timeout];
};
DequeueSelf[@cond.queue];
};
ABORT: IF we should abort THEN raise ABORTED; Also, make sure to change the code below to preserve the value of cond.flags.abortEnable.
Clear cond.flags.condRequest. This is correct even if it's been set since we've examined it, since the current process (doing WaitInterrupt) is about to wake up.
cond.flags ← defaultCondFlags;
};
Notify:
PUBLIC
PROC [cond: Condition] = {
... notifies the "best" process waiting on cond that the condition has occurred. The protecting monitor lock is presumed to be held, but is not needed for this primitive.
best: Process = Dequeue[@cond.queue];
IF best #
NIL
THEN MakeReady[best];
There really was a process waiting on the CV. Although a process may be made ready, the process switch is not immediately required. The process switch may occur when the protecting monitor lock is released.
};
Broadcast:
PUBLIC
PROC [cond: Condition] = {
... notifies all processes waiting on cond that the condition has occurred. To eliminate race problems, all of the waiting processes are made ready while holding onto the queue. The protecting monitor lock is presumed to be held, but is not needed for this primitive.
tail: Process ← cond.queue.chain;
IF tail #
NIL
THEN {
AcquireQueue[@cond.queue];
IF (tail ← cond.queue.chain) #
NIL
THEN {
cond.queue.chain ← NIL;
DO
Remove all of the processes from the circular queue and make them ready. Although processes may be made ready, the process switch is not immediately required. The process switch may occur when the protecting monitor lock is released.
this: Process = tail.next;
next: Process ← this.next;
tail.next ← next;
this.queue ← NIL;
this.next ← NIL;
MakeReady[this];
IF this = next THEN EXIT; -- we just removed the last one
ENDLOOP;
};
ReleaseQueue[@cond.queue];
};
};
MonitorEntry:
PUBLIC
PROC [lock: MonitorLock] = {
... is performed at the entry to any monitored procedure to lock up the monitor for exclusive access. This operation is performed frequently, so it should be fast in the case where the monitor is not already owned.
self: Process = CurrentProcess[];
DO
Try for conditional store of current PSB index with 0 wait chain. If we succeed, then we are in, and can return.
IF CStoreProcess[ptr: @lock.owner] THEN RETURN;
Place self on the monitor queue.
Enqueue[@lock.queue];
If the monitor is still held, then go to sleep, yielding to best process, only to wake up when the monitor queue is OK. Otherwise, some process released the monitor while we were trying to Enqueue, so we should get off the monitor queue and try again.
{
owner: Process = lock.owner;
IF owner #
NIL
THEN {
THROUGH [0..spinTries)
WHILE owner.state = running
DO
SimYield[]; -- SIM only
ENDLOOP;
IF lock.owner # NIL THEN DirectedYield[waitingML, lock.owner, 0];
};
DequeueSelf[@lock.queue];
};
ENDLOOP;
};
MonitorExit:
PUBLIC
PROC [lock: MonitorLock] = {
... frees up the monitor lock. If there is a process waiting the monitor, dequeue the process from the monitor queue and place it in the ready queue. The current coding does nothing if we do not own the lock. Should we die instead of doing nothing?
IF lock.owner = CurrentProcess[]
THEN {
Give up ownership of the lock.
lock.owner ← NIL;
IF lock.queue.chain #
NIL
THEN {
Make ready the next process waiting for the monitor. There is a slight possibility that some other process will jump in ahead of this one while the monitor is free, but fairness is not guaranteed.
next: Process = Dequeue[@lock.queue];
IF next # NIL THEN MakeReady[next];
};
IF deferredWakeups
THEN {
deferredWakeups = TRUE indicates that we deferred a desirable wakeup while processing a Broadcast or Notify for some monitor. Statistically speaking, it was probably for this monitor. So we raise reschedule here to get the best processes running.
RaiseReschedule[];
};
};
};
TestAbort:
PUBLIC
PROC = {
... tests for a requested abort.
???? CHH says this is bogus. ????
self: Process = CurrentProcess[];
IF self.abortState = requested THEN { self.abortState ← none; ERROR ABORTED };
};
RequestAbort:
PUBLIC
PROC [process: Process] = {
... requests the given process to abort itself.
???? CHH says this is bogus. ????
SELECT process.abortState
FROM
none => {
process.abortState ← requested;
MakeReady[process];
RaiseReschedule[];
};
inhibited =>
process.abortState ← delayed;
ENDCASE;
};
SetPriority:
PUBLIC
SAFE
PROC [p: Priority] ~
TRUSTED {
... changes the priority of the current process. We need to recompute lowestRunningPriority; also, if our new priority is lower than that of some ready process we need to initiate a process switch.
self: Process ~ CurrentProcess[];
oldPriority: Priority;
doYield: BOOL;
DisableInterrupts[];
AcquireMetaLock[];
oldPriority ← self.priority;
self.priority ← p;
Recompute lowestRunningPriority ...
SELECT
TRUE
FROM
p <= lowestRunningPriority => {
lowestRunningPriority ← p };
oldPriority = lowestRunningPriority => {
lowest: Priority ← tailProcessor.running.priority;
FOR proc: Processor ← tailProcessor.next, proc.next
WHILE proc # tailProcessor
DO
lowest ← MIN[lowest, proc.running.priority];
ENDLOOP;
lowestRunningPriority ← lowest };
ENDCASE => NULL;
doYield ← (p < highestReadyPriority);
ReleaseMetaLock[];
IF doYield
THEN {
???? All that's really necessary here is { MarkFrame[]; ProcessSwitch[ready, NIL] }, but that won't work in the simulation environment! ????
EnableInterrupts[]; -- SIM only?
DirectedYield[ready, NIL, 0] }
ELSE {
EnableInterrupts[]; -- SIM only. Automatic in RDI.
};
};
Reschedule Interrupt
intHandlers: ARRAY RequestKind OF IntHandler ← ALL [StrayIntHandler];
StrayIntHandler: IntHandler
[requestKind: RequestKind]
~ TRUSTED {
SIM version. Obviously we need to do something better here for RDI.
ERROR };
RegisterIntHandler:
PUBLIC
SAFE
PROC [requestKind: RequestKind, intHandler: IntHandler]
RETURNS [old: IntHandler] ~
TRUSTED {
IF intHandler = NIL THEN intHandler ← StrayIntHandler;
DisableInterrupts[];
AcquireMetaLock[];
old ← intHandlers[requestKind];
intHandlers[requestKind] ← intHandler;
ReleaseMetaLock[];
EnableInterrupts[]; -- SIM only. Automatic in RDI.
};
SimCheckForRescheduleTrap:
PROC ~ {
SIM only.
On a real Dragon the current processor can change whenever interrupts are enabled! This code would be bogus on a real machine, but, after all, it's here to simulate a real machine's interrupt mechanism ...
IF NOT CurrentProcessor[].sim.interruptsEnabled THEN RETURN;
WHILE CurrentProcessor[].sim.rescheduleRequested
DO
CurrentProcessor[].sim.rescheduleRequested ← FALSE;
DisableInterrupts[]; -- so RescheduleTrap will be called with interrupts disabled
SimCallAndSuspend[CurrentProcess[].sim.lock, CurrentProcess[], RescheduleTrap];
This is where the current process resumes when it's rescheduled.
ENDLOOP;
};
RescheduleTrap:
PROC = {
... is called whenever the reschedule interrupt is enabled and either the reschedule line is asserted OR reschedule traps become enabled and there is a deferred reschedule pending. In either case, we start up with critical traps inhibited, and a special return (hand-coded in the RescheduleTrap vector) enables critical traps on exit. On entry, volatile EU state has been pushed in such a way that when the process is switched to later the volatile EU state will be restored correctly (again, hand-coded in the RescheduleTrap vector).
RescheduleTrap is primarily called in response to two causes: device interrupts and explicit condition variable wakeups. Device interrupts are turned into condition variable wakeups in this routine, so the appearance outside of this routine is that we only have condition variable wakeups. Statistically we only get to enter RescheduleTrap if there is a ready process with priority >= lowestRunningPriority, and we try to give the scheduler to a processor with priority = lowestRunningPriority. The major exception to this is time-slicing, which should be kept to a modest rate (the current system uses about 10 per second).
entryPriority: Priority ← CurrentProcess[].priority;
DO
First, try to grab ownership of the scheduler (if at lowest priority)
IF entryPriority.
ORD <= lowestRunningPriority.
ORD
AND schedulerLock =
NIL
THEN
This processor is one of the best to try the scheduler, and the scheduler is free.
IF CStoreProcess[ptr: @schedulerLock]
THEN {
At this point we have acquired the scheduler.
currentReq: RequestWord ← ReadAndClearDeviceRequests[];
sample & clear request word (currently protected by the schedulerLock)
IF currentReq # nullRequestWord
THEN {
Make maximum possible room on IFU stack for device interrupt handler. This is expensive enough to be worth doing before grabbing the metaLock.
MarkFrame[];
SaveProcessState[CurrentProcess[]];
};
AcquireMetaLock[];
don't allow changes to the ready queue by other processors
IF currentReq # nullRequestWord
THEN {
Service requests in the currentReq word by tweaking the appropriate interrupt condition queues. This loop is not the fastest way to do this, but it is not worth improving unless we find out that it is taking a lot of time.
FOR reqx: RequestKind
IN RequestKind
DO
IF currentReq[reqx]
THEN {
(intHandlers[reqx])[reqx];
currentReq[reqx] ← FALSE;
IF currentReq = nullRequestWord THEN EXIT;
};
ENDLOOP;
};
IF highestReadyPriority.
ORD > lowestRunningPriority.
ORD
THEN {
There are ready processes that are better than at least some processor, so go do the bulk reassignment.
AssignProcessors[];
};
Release locks (in the given order).
???? what does the order have to do with anything ???? - ajd
schedulerLock ← NIL;
ReleaseMetaLock[];
};
Normal exit, just follow orders.
DO
SELECT CurrentProcessor[].orders
FROM
noChange =>
EXIT;
If no orders to follow, then exit the inner loop.
switchToGiven => {
The executing processor has been asked either to switch to a particular process or to switch to the "best" available process.
nextProcess: Process ← CurrentProcessor[].switchTo;
CurrentProcessor[].orders ← noChange;
Reset the orders, so the executing processor is available for more orders
(like during the call to ProcessSwitch).
IF (nextProcess #
NIL
AND nextProcess.state = ready)
OR CurrentProcess[].priority <= highestReadyPriority
THEN {
It is possible that a switch is necessary. Otherwise keep running the current process.
MarkFrame[];
Indicates that only frames older than the current frame are to be saved.
ProcessSwitch[ready, nextProcess];
};
};
panicStop => {
This state occurs whenever we need to put the world to sleep in a hurry.
self: Process = CurrentProcess[];
Save the state so it can be examined/modified.
MarkFrame[];
Indicates that only frames older than the current frame are to be saved.
SaveProcessState[self];
Mark self as stopped and spin until no longer stopped ...
CurrentProcessor[].orders ← stopped;
WHILE (CurrentProcessor[].orders = stopped)
DO
SimYield[]; -- SIM only
ENDLOOP;
Restore the state in case some piece of it was modified while we were stopped.
RestoreProcess[self];
};
ENDCASE => ERROR;
ENDLOOP;
Test for residual unserviced interrupts
If some process already has the scheduler lock then we need not persist.
IF schedulerLock # NIL THEN EXIT;
If no device requests, then there is no point in going back for another try at the scheduler.
IF NOT DeviceRequestsPending[] THEN EXIT;
By this point we have given a lowestRunningPriority processor the first chance to get in and process the interrupt. However, lowestRunningPriority must be treated as a hint, not as the truth. So if there are still device requests pending and no one has the scheduler, we go back to try to service those requests using the current processor. We lower entryPriority to try to force an entry into the scheduler.
entryPriority ← Priority.FIRST; -- ???? was IF entryPriority.ORD > lowestRunningPriority.ORD THEN entryPriority ← entryPriority.PRED; ????
ENDLOOP;
EnableInterrupts[]; -- SIM only. Automatic in RDI.
};
Special external procedures
ShuffleReadyQueues:
PUBLIC
PROC = {
... is called by the timeslicer service. It advances all processes waiting in the ready queues by one, and directs all undirected processes to switch to the best available process(es).
DisableInterrupts[];
AcquireMetaLock[];
Just order each processor to switch to the best available process.
FOR processor: Processor ← tailProcessor.next, processor.next
DO
SELECT processor.orders
FROM
noChange => {
Force a switch to the best available process
processor.orders ← switchToGiven;
processor.switchTo ← NIL;
};
ENDCASE;
IF processor = tailProcessor THEN EXIT;
ENDLOOP;
Last, raise the reschedule interrupt to make all processors choose the best available. Even the executing processor will eventually respond.
RaiseReschedule[];
ReleaseMetaLock[];
EnableInterrupts[];
};
CausePanicStop:
PUBLIC
PROC = {
... causes all processors except the current one to come to a screeching halt with properly saved state. When we want to perform stack tracing, delicate debugging, or prepare for world-swap debugging, we use this routine to shut down the world. On entry, interrupts are assumed to be enabled and the metaLock is free. On exit, interrupts are disabled, the metaLock is held, and the panicStopLock is held. ???? is that possible ???? - ajd
self: Process = CurrentProcess[];
DO
DisableInterrupts[];
Inhibit interrupts while we try to acquire the panicStopLock.
IF CStoreProcess[ptr: @panicStopLock]
THEN
EXIT;
If panicStopLock is acquired, exit with interrupts disabled.
EnableInterrupts[];
Allow interrupts to occur while we are waiting. This allows the current processor to panic stop if it is requested to do so.
WHILE panicStopLock #
NIL
DO
Spin without using CStore.
SimYield[]; -- SIM only.
ENDLOOP;
ENDLOOP;
MarkFrame[];
Indicates that only frames older than the current frame are to be saved.
SaveProcessState[self];
We save the process state to make sure that the calling process is stable, and to make sure that we have enough stack registers to complete processing. We must have interrupts disabled to call this routine, but we don't need the metaLock.
DO
current: Processor ← CurrentProcessor[];
done: BOOL ← TRUE;
AcquireMetaLock[];
FOR processor: Processor ← current.next, processor.next
WHILE processor # current
DO
SELECT processor.orders
FROM
noChange => {
processor.orders ← panicStop;
processor.switchTo ← NIL;
done ← FALSE;
};
stopped => {};
ENDCASE => done ← FALSE;
ENDLOOP;
IF done THEN EXIT;
ReleaseMetaLock[];
RaiseReschedule[];
ENDLOOP;
At this point all of the processors except the current processor have entered the stopped state, and are waiting for different orders to be given. We return to the caller with interrupts off and the metaLock held.
???? How do we return to the caller with interrupts off ???? Isn't the interrupt enable state part of the status that gets restored on procedure return ???? - ajd
};
ReleasePanicStop:
PUBLIC
PROC = {
... allows the stopped processors to resume with normal processing. Call this procedure to release all processes from their stopped state. We assume that the metaLock is held and interrupts are off, and that the current process holds the panicStopLock. On exit, interrupts are off, and the current process has released the metaLock and the panicStopLock.
???? Isn't changing the interrupt enable state of your caller bogus ???? - ajd
FOR processor: Processor ← tailProcessor.next, processor.next
DO
SELECT processor.orders
FROM
stopped => processor.orders ← noChange;
ENDCASE;
IF processor = tailProcessor THEN EXIT;
ENDLOOP;
At this point all of the other processors have been released from bondage. All we have to do now is release the panicStopLock and the metaLock, enable interrupts, and return.
panicStopLock ← NIL;
ReleaseMetaLock[];
EnableInterrupts[]; -- ???? bogus ???? - ajd
};
SatisfyPageFault:
PUBLIC
PROC [startPage, endPage:
INT] = {
... for each process P in the page fault with startPage d P.page d endPage, make the process ready.
???? Why can't we use a trailing pointer and remove processes from pageFaultMetaQueue directly here, instead of calling DequeueMeta (which has to search the queue) ???? After all, we hold the metaLock! ???? - ajd
DisableInterrupts[];
AcquireMetaLock[];
{
tail: Process ← pageFaultMetaQueue;
p: Process ← tail;
WHILE p #
NIL
DO
next: Process ← p.meta;
SELECT p.state
FROM
waitingPage =>
IF WordToInt[p.page]
IN [startPage..endPage]
THEN {
DequeueMeta[@pageFaultMetaQueue, p];
EnqueueReady[p];
place the process on the ready queue (also maintains highestReadyPriority)
IF p.priority > lowestRunningPriority
THEN deferredWakeups ←
TRUE;
The new process might be worth waking, but we don't do it here. We just remember that it was desirable.
};
ENDCASE;
p ← next;
IF p = tail THEN EXIT;
ENDLOOP;
};
ReleaseMetaLock[];
EnableInterrupts[];
IF deferredWakeups THEN RaiseReschedule[];
};
Internal procedures
ProcessSwitch:
PROC [nextState: ProcessState, nextProcess: Process] = {
... saves the state of the current process. If the process is not waiting, then it is placed on the ready queue. If nextProcess # NIL then it is the next process to run. Otherwise, the "best" ready process is chosen. During this sequence, we assume that critical traps are disabled. Note that if nextProcess # NIL, we must switch to it if it is ready, even if the current process is "better" (see comments in AcquireQueue to see why we require this).
self: Process ~ CurrentProcess[];
selfPriority: Priority ~ self.priority;
SaveProcessState[self];
This is expensive enough to do outside of the region that holds the metaLock.
AcquireMetaLock[];
From now on, don't allow changes to the ready queue by other processors.
SELECT nextState
FROM
waitingML => {
Check to see if this process is trying to go to sleep on a monitor lock that has already been exited, which could happen if the MonitorExit occurred after this process got on the queue, but before the Yield. We have carefully arranged for this to be OK with Dequeue and friends.
IF self.next = NIL THEN GO TO bailOut;
self.state ← nextState;
};
waitingCV, waitingICV => {
Check to see if this process has already been removed from the condition queue, which could happen if the Notify occurred after this process got on the queue, but before the Yield. We have carefully arranged for this to be OK with Dequeue and friends.
IF self.next = NIL THEN GO TO bailOut;
IF self.when # noTimeout THEN EnqueueMeta[@timeoutMetaQueue, self];
self.state ← nextState;
};
waitingPage => {
A page fault has occured. Put the offending process to sleep on the page fault metaQueue, then notify the page fault handling process.
EnqueuePageFault[self];
self.state ← waitingPage; -- (done by EnqueuePageFault)
DelicateNotify[pageFaultCond];
};
ready => {
IF (nextProcess #
NIL)
AND (nextProcess.state = waitingPage)
THEN {
This is a directed yield to a process that is page faulting, so we should also wait for that page to be ready. It does us no good to proceed until that process is ready.
self.page ← nextProcess.page;
EnqueuePageFault[self];
self.state ← waitingPage; done by EnqueuePageFault ????
}
ELSE {
EnqueueReady[self];
self.state ← ready; done by EnqueueReady ????
};
};
free => {
Release the process structure here. This process will never be switched to again. We can't touch the process structure hereafter. We assume all resources (other than the Process structure itself) have already been released.
CleanProcess[self];
self.state ← free; done by CleanProcess ????
};
ENDCASE => ERROR;
IF nextProcess =
NIL
OR nextProcess.state # ready
THEN {
This is an undirected yield OR a directed yield to a process that is not ready for some peculiar (possibly transient) reason. Just choose the best process to run. There are as many idle processes as processors, and the idle processes are always ready to run, so there will always be a process ready to run.
nextProcess ← NIL; -- Part of sanity check, below ???? - ajd
FOR priority: Priority
DECREASING
IN [Priority.
FIRST .. highestReadyPriority]
DO
tail: Process = readyQueueArray[priority];
IF tail # NIL THEN { nextProcess ← tail.meta; EXIT };
ENDLOOP;
IF nextProcess = NIL THEN ERROR; -- Sanity check ???? - ajd
};
At this point nextProcess # NIL, and we're committed to switching to it.
DequeueMeta[@readyQueueArray[nextProcess.priority], nextProcess];
Note: this is currently the ONLY place that a process is removed from a ready queue!
Recompute the highestReadyPriority.
DO
pri: Priority ← highestReadyPriority;
IF pri = Priority.FIRST OR readyQueueArray[pri] # NIL THEN EXIT;
highestReadyPriority ← pri.PRED;
ENDLOOP;
Make nextProcess run on this processor.
nextProcess.state ← running;
SetCurrentProcess[nextProcess]; -- RDI: this loads an aux register. Is this the place for it?
CurrentProcessor[].running ← nextProcess;
Recompute the lowestRunningPriority.
IF selfPriority = lowestRunningPriority
THEN {
new: Priority ← tailProcessor.running.priority;
FOR each: Processor ← tailProcessor.next, each.next
WHILE each # tailProcessor
DO
IF new > each.running.priority THEN new ← each.running.priority;
ENDLOOP;
lowestRunningPriority ← new;
};
ReleaseMetaLock[];
IF nextProcess # self
THEN RestoreProcess[nextProcess];
Note that if we are the same process no state needs to be restored, since nothing should have changed except volatile EU state, which is restored later.
EXITS
bailOut => ReleaseMetaLock[];
We get here when we have NOT found a better process to run. Therefore there is no change necessary, and things get done more quickly.
};
MakeReady:
PROC [process: Process] = {
... places the given process on the ready queue using the meta link. Also removes process from the timeout queue if it is there.
SIM only: on entry, we require that interrupts NOT be disabled.
DisableInterrupts[];
AcquireMetaLock[];
SELECT process.state
FROM
waitingCV, waitingML, waitingICV => {
priority: Priority = process.priority;
DequeueMeta[@timeoutMetaQueue, process];
remove from timeout queue if present
EnqueueReady[process];
place the process on the ready queue (also maintains highestReadyPriority)
IF priority > lowestRunningPriority
THEN deferredWakeups ←
TRUE;
The new process might be worth waking, but we don't do it here. We just remember that it was desirable.
};
ENDCASE;
ReleaseMetaLock[];
EnableInterrupts[]; -- SIM only. Automatic in RDI.
};
Enqueue:
PROC [q: Queue, p: Process ← CurrentProcess[]] = {
... places the process on the queue in FIFO order. Priority processing is the responsibility of Dequeue.
AcquireQueue[q];
{
tail: Process = q.chain;
IF tail =
NIL
THEN p.next ← p
ELSE {p.next ← tail.next; tail.next ← p};
q.chain ← p;
p.queue ← q;
};
ReleaseQueue[q];
};
Dequeue:
PROC [q: Queue]
RETURNS [best: Process ←
NIL] = {
... performs dequeue by process priority. We perform a linear search of every process in the queue to find the best. The best process in the queue is the first one we find with the maximum priority. Performing the priority testing in Dequeue is more expensive for the case where there are several things on the queue, but it allows us to change process priorities without altering the queue. If the process we find has also timed out, we have to handle the case elsewhere, but it is OK for Dequeue to dequeue such a process. Dequeue does not remove such a process from the timeout queue.
lag: Process ← q.chain;
IF q.chain #
NIL
THEN {
AcquireQueue[q];
best ← DequeueInner[q];
ReleaseQueue[q];
};
};
DequeueInner:
PROC [q: Queue]
RETURNS [best: Process ←
NIL] =
--INLINE-- {
... is the inside part of Dequeue. While we still dequeue by process priority, we assume that we have already grabbed ownership of the queue. This allows us to have common code for this case and the more delicate notification required by interrupt requests & page faults.
???? BUT WE DON'T HAVE COMMON CODE! - ajd
lag: Process ← q.chain;
IF lag #
NIL
THEN {
tail: Process = lag;
best ← lag.next;
IF lag = best
THEN q.chain ←
NIL
only process in queue, so clear the chain
ELSE {
head: Process ← best; -- loop invariant: lag.next = best
this: Process ← head;
DO
next: Process ← this.next;
IF next = head THEN EXIT; -- we have wrapped
IF next.priority > best.priority
THEN {lag ← this; best ← next};
new candidate for the best process
this ← next;
ENDLOOP;
IF tail = best
THEN q.chain ← lag;
don't queue chain point to deleted process
lag.next ← best.next;
};
best.queue ← NIL;
best.next ← NIL; -- the last change, due to test in ProcessSwitch
};
};
DequeueSelf:
PROC [q: Queue] = {
... dequeues the current process from the given queue. This operation has no effect if the process was not in the queue.
IF CurrentProcess[].next #
NIL
THEN {
lag: Process ← q.chain;
IF lag #
NIL
THEN {
AcquireQueue[q];
IF (lag ← q.chain) #
NIL
THEN {
tail: Process = lag;
DO
p: Process = lag.next;
IF p = CurrentProcess[]
THEN {
IF p = lag
THEN q.chain ←
NIL
last one in queue, so show no chain
ELSE {
IF p = tail
THEN q.chain ← lag;
don't let deleted one be the queue head
lag.next ← p.next;
};
CurrentProcess[].queue ← NIL;
p.next ← NIL; -- the last change, due to test in ProcessSwitch
EXIT;
};
lag ← p;
IF lag = tail THEN EXIT;
ENDLOOP;
};
ReleaseQueue[q];
};
};
};
AcquireQueue:
PROC [q: Queue] = {
... grabs ownership of the given queue. Performs Yield[ready, owner] until queue is owned. This directed Yield is necessary to overcome nasty interactions between spin locks and preemptive scheduling.
WHILE
NOT CStoreProcess[ptr: @q.busy]
DO
owner: Process ← q.busy;
IF owner = NIL THEN LOOP;
IF owner.state # running
THEN DirectedYield[ready, owner, 0];
Try to donate our cycles to the owner of the queue. Note that if the owner is running on another processor that there is not need to Yield, since the queue should be given up quite soon.
SimYield[]; -- SIM only!
ENDLOOP;
};
ReleaseQueue:
PROC [q: Queue] =
--INLINE-- {
... releases ownership of the queue. Assumes that the current process owns the queue.
RDI: change this to an inline that just sets q.busy ← NIL.
SimStoreNilInLock[ptr: @q.busy]; -- SIM only.
};
Internal procedures (interrupts disabled, must not hold metaLock)
AcquireMetaLock:
PROC =
--INLINE-- {
... grabs ownership of all of the meta queues. On entry, we require that critical traps are disabled, of course. On exit, we hold the meta lock.
WHILE
NOT CStoreProcess[ptr: @metaLock]
DO
spin using local cache only to reduce bus contention
WHILE metaLock #
NIL
DO
SimYield[]; -- SIM only
ENDLOOP;
ENDLOOP;
};
ReleaseMetaLock:
PROC =
--INLINE-- {
... releases ownership of the meta lock. On entry, we require that critical traps are disabled and that the metaLock is held. On exit, the metaLock is not held, although critical traps are still disabled.
RDI: change this to an INLINE that directly sets metaLock ← NIL.
SimStoreNilInLock[ptr~@metaLock]; -- SIM only
};
SaveProcessState:
PROC [process: Process] =
--INLINE-- {
... saves frames and process info from the current processor for the given process. On entry, we require that critical traps are disabled. On exit, there are no frames in the IFU stack (so a return will provoke a stack underflow). Note that this procedure must be an INLINE to keep from putting another frame on the stack.
save stack frames
SaveStack[]; (RDI only)
save EU regs (RDI: fix this)
FOR which: EUstateIndex IN EUstateIndex DO
process.euState[which] ← EUFetchSpecial[which];
ENDLOOP;
check for local stack overflow (RDI: fix this)
IF WordToInt[process.euState[framesLeft]] <= 0 THEN
cause a stack overflow fault for the caller
NotYetImplemented[];
NULL };
RestoreProcess:
PROC [process: Process] =
--INLINE-- {
... restores the given process to a running state. Also changes the result of CurrentProcess. On entry, we require that critical traps are disabled, and process.state = running. Note that this procedure must be an INLINE to keep from putting another frame on the stack.
restore EU regs (RDI: fix this)
FOR which: EUstateIndex IN EUstateIndex DO
EUStoreSpecial[which, process.euState[which]];
ENDLOOP;
???? WHO should set the current process aux reg ???? I claim it should not be done here. In fact, I claim RestoreProcess doesn't need an argument; it always restores from CurrentProcess[]. - ajd
};
Internal procedures (interrupts disabled, must hold metaLock)
DelicateNotify:
PROC [cond: Condition] =
--INLINE-- {
... does an ICV notification, with the understanding that interrupts are disabled and the metaLock is held, so we just can't yield the processor. The condition must be resident, of course.
cond.flags.condRequest ← TRUE;
IF CStoreProcess[ptr: @cond.queue.busy]
THEN {
We got the queue, so we can make the best process in it ready. If we did not get the queue, then the owner of it (who's running WaitInterrupt) will examine the flags pretty soon and see the notification.
best: Process ← DequeueInner[@cond.queue];
IF best #
NIL
THEN {
The following is like MakeReady ... see the comments in Wait and WaitInterrupt.
SELECT best.state
FROM
waitingICV => {
remove from timeout queue if present
DequeueMeta[@timeoutMetaQueue, best];
place the process on the ready queue (also maintains highestReadyPriority)
EnqueueReady[best];
The new process might be worth waking, but we don't do it here. We just remember that it was desirable.
IF best.priority > lowestRunningPriority THEN deferredWakeups ← TRUE;
};
running, ready => NULL;
ENDCASE => ERROR; -- ???? I'm not sure of this at all ???? - ajd
};
ReleaseQueue[@cond.queue];
};
};
AssignProcessors:
PROC = {
... sets processor orders based on whether the a ready process is "better" than any running processes; we run through the ready processes in highest to lowest order. At the end of the reassignment, if we had to change the orders of any processor other than our own then we raise reschedule to get the other processors to do their switching. On entry we require that interrupts are disabled and the metaLock is held.
The cost of algorithm in this procedure has an upper bound of O(N*MIN[N,MAX[K,R]), where N is the number of processors, K is highestReadyPriority-lowestRunningPriority, and R is the number of ready processes with priority IN (lowestRunningPriority..highestReadyPriority]. In practice, the algorithm has somewhat better average behavior, since in most cases there will have been only one process made ready by a single device interrupt, an idle processor will answer the interrupt first, and immediately assign the new process to itself with no further reschedule activity. In this (anticipated) normal case the cost is O(K)+O(1). This algorithm could be made faster in the high cost cases by keeping the processors sorted by priority, but that would increase costs in ProcessSwitch so it might not be a net perfomance gain.
needReschedule: BOOL ← FALSE;
current: Processor = CurrentProcessor[];
firstAvail: Processor ← current;
deferredWakeups ←
FALSE;
At this point we are about to reassign all of the processors that need it, so nothing is really deferred.
FOR priority: Priority
DECREASING
IN (lowestRunningPriority..highestReadyPriority]
DO
tail: Process = readyQueueArray[priority];
IF tail #
NIL
THEN {
FOR process: Process ← tail.next, process.next
DO
best: Processor ← NIL; -- ???? Why "← NIL"? Isn't this dead? ???? - ajd
bestPriority: Priority ← process.priority;
{
DO
The first scan is for some available processor that could run the requested process. If none is found, then we exit all loops.
best ← firstAvail;
IF best.orders = noChange
THEN {
np: Priority = best.running.priority;
IF np < bestPriority THEN {bestPriority ← np; EXIT};
};
firstAvail ← best.next;
IF firstAvail = current THEN GO TO done;
IF best.switchTo = process AND best.orders = switchToGiven THEN
GO TO alreadyAssigned;
ENDLOOP;
IF bestPriority # lowestRunningPriority
THEN
If the first available processor found is not already at the minimum possible priority then it is worth scanning for another processor with a lower priority.
FOR p: Processor ← best.next, p.next
WHILE p # current
DO
SELECT p.orders
FROM
switchToGiven => IF p.switchTo = process THEN GO TO alreadyAssigned;
noChange => {
np: Priority = p.running.priority;
IF np < bestPriority
THEN {
best ← p;
bestPriority ← np;
IF np = lowestRunningPriority THEN EXIT;
};
};
ENDCASE;
ENDLOOP;
We have found the best choice for the given process.
best.switchTo ← process;
best.orders ← switchToGiven;
IF best # current
THEN needReschedule ←
TRUE;
If we are gave orders to another processor, then we rememebr the fact.
IF best = firstAvail
THEN {
The best processor was the first one we found. Therefore we advance firstAvail to the next processor (unless that happens to be the current processor, in which case we are completely done scanning).
firstAvail ← best.next;
IF firstAvail = current THEN GO TO done;
};
EXITS alreadyAssigned => {
In this case the process was already assigned to a processor. So we want to choose another processor (starting at firstAvail) and another process. This should be rare.
???? How can it happen at all ???? THINK! - ajd
};
};
IF process = tail THEN EXIT;
ENDLOOP;
EXITS done => EXIT;
};
ENDLOOP;
IF needReschedule THEN RaiseReschedule[];
};
EnqueueReady:
PROC [process: Process] =
--INLINE-- {
... places the given process on the ready queue using the meta link. On entry, we require that critical traps are disabled and that the metaLock is held. On exit, process.state = ready.
priority: Priority = process.priority;
EnqueueMeta[@readyQueueArray[priority], process];
IF priority > highestReadyPriority THEN highestReadyPriority ← priority;
process.state ← ready;
};
EnqueuePageFault:
PROC [process: Process] =
--INLINE-- {
... places the given process on the page fault queue using the meta link. On entry, we require that critical traps are disabled and that the metaLock is held. On exit, process.state = waitingPage.
EnqueueMeta[@pageFaultMetaQueue, process];
process.state ← waitingPage;
};
EnqueueMeta:
PROC [metaQueue: MetaQueue, process: Process] =
--INLINE-- {
... places the given process on the given meta queue using the meta link. On entry, we require that critical traps are disabled and that the metaLock is held.
IF process.meta =
NIL
THEN {
tail: Process = metaQueue^;
IF tail =
NIL
THEN process.meta ← process
ELSE { process.meta ← tail.meta; tail.meta ← process };
metaQueue^ ← process;
};
};
DequeueMeta:
PROC [metaQueue: MetaQueue, process: Process] =
--INLINE-- {
... removes the given process from the given meta queue (timeout, ready, pageFault). On entry, we require that critical traps must be disabled and that the metaLock is held. The process state is not changed except for the meta link.
IF process.meta #
NIL
THEN {
remove from meta queue if it is there
tail: Process = metaQueue^;
IF tail #
NIL
THEN {
lag: Process ← tail;
DO
p: Process = lag.meta;
IF p = process
THEN {
IF lag = p
THEN metaQueue^ ←
NIL
last one in queue, so show no chain
ELSE {
IF p = tail
THEN metaQueue^ ← lag;
don't let deleted one be the queue head
lag.meta ← p.meta;
};
p.meta ← NIL;
EXIT;
};
lag ← p;
IF lag = tail THEN EXIT;
ENDLOOP;
};
};
};
Low-level primitives
Conditional Store
simCStoreLock: SimLock ← NEW [SimLockRep ← []]; -- SIM only
CStoreProcess:
PROC
[ptr: ProcessPtr, old: Process ←
NIL, new: Process ← CurrentProcess[]]
RETURNS [stored:
BOOL] =
-- MACHINE CODE -- {
... atomically performs {IF (stored ← ptr^ = old) THEN ptr^ ← new}.
SIM: all CStores are atomic wrt one another, and wrt setting lock to nil, below.
DoIt:
ENTRY
PROC [lock: SimLock] ~ {
ENABLE UNWIND => NULL;
IF (stored ← (ptr^ = old)) THEN ptr^ ← new };
DoIt[simCStoreLock];
};
CStoreWord:
PROC
[ptr: WordPtr, old, new: Word]
RETURNS [stored:
BOOL] =
-- MACHINE CODE -- {
... atomically performs {IF (stored ← ptr^ = old) THEN ptr^ ← new}.
SIM: all CStores are atomic wrt one another, and wrt setting lock to nil, below.
DoIt:
ENTRY
PROC [lock: SimLock] ~ {
ENABLE UNWIND => NULL;
IF (stored ← (ptr^ = old)) THEN ptr^ ← new };
DoIt[simCStoreLock];
};
SimStoreNilInLock:
ENTRY PROC [lock: SimLock ← simCStoreLock, ptr: ProcessPtr] ~ {
SIM only. In RDI just do the assignment inline.
ptr^ ← NIL;
};
Interrupts and I/O
simIOLock: SimLock ← NEW [SimLockRep ← []]; -- SIM only
simIORequestWord: RequestWord ← nullRequestWord; -- SIM only
DisableInterrupts:
PROC [] = {
... inhibits interrupts (must be a procedure call for RDI)
SimThisProcessor[].sim.interruptsEnabled ← FALSE; -- Change in RDI
};
EnableInterrupts:
PROC [] = {
... enables interrupts (must be a procedure call for RDI)
SimThisProcessor[].sim.interruptsEnabled ← TRUE; -- Change in RDI
SimCheckForRescheduleTrap[]; -- SIM only
};
ReadAndClearDeviceRequests:
ENTRY
PROC [lock: SimLock ← simIOLock]
RETURNS [rw: RequestWord] ~ {
... atomically reads and clears the device requests. This is done by reading the appropriate EU I/O location (Interrupt Request Word) then clearing the bits read.
RDI: this should be MACHINE CODE to read EU I/O location, and not ENTRY.
ENABLE UNWIND => NULL;
rw ← simIORequestWord;
FOR k: RequestKind
IN RequestKind
DO
IF rw[k] THEN simIORequestWord[k] ← FALSE;
ENDLOOP;
};
DeviceRequestsPending:
ENTRY
PROC [lock: SimLock ← simIOLock]
RETURNS [
BOOL] =
-- MACHINE CODE -- {
... Determines whether or not device requests are pending, but does not alter them. This must agree with ReadAndClearDeviceRequests.
RDI: this should be MACHINE CODE to read EU I/O location, and not ENTRY.
ENABLE UNWIND => NULL;
FOR k: RequestKind
IN RequestKind
DO
IF simIORequestWord[k] THEN RETURN [TRUE];
ENDLOOP;
RETURN [FALSE];
};
RaiseReschedule:
PROC =
-- MACHINE CODE -- {
... raises the Reschedule interrupt, which, if disabled, will occur when it becomes enabled again.
How to do this in RDI?
deferredWakeups ← FALSE; -- ???? I'm not completely sure about this ???? - ajd
SimRaiseReschedule[]; -- Wrong for RDI.
};
SimRaiseReschedule:
PROC ~ {
SIM only.
This is a crock: it ought to be done atomically.
FOR i:
CARDINAL
IN [0..maxProcessors)
DO
processors[i].sim.rescheduleRequested ← TRUE;
ENDLOOP;
};
WordToInt:
PROC [x: Word]
RETURNS [
INT] =
MACHINE
CODE { PrincOps.zEXCH };
On a Dorado we must swap (16-bit) words. This is wrong for RDI, where the body should just be RETURN [LOOPHOLE[x]]
???? Should be in Basics or something like that ???? - ajd
EUFetchSpecial:
PROC
[which: EUstateIndex]
RETURNS [w: Word ← wordZero] =
-- MACHINE CODE -- {
... fetches a word from the given special EU register.
NotYetImplemented[noOp];
};
EUStoreSpecial:
PROC [which: EUstateIndex, x: Word] =
-- MACHINE CODE -- {
... stores a word into the given special EU register.
NotYetImplemented[noOp];
};
SaveVolatileEUState:
PROC ~ {
RDI: put carry and field register into current process block.
NotYetImplemented[noOp];
};
RestoreVolatileEUState:
PROC ~ {
RDI: set carry and field register from current process block.
NotYetImplemented[noOp];
};
SaveStack:
PROC =
--INLINE-- {
... saves the stack (updates the current value of the hook EU register), returning the number of frames saved. The stack is saved up to (and including) the eldest marked frame. Also updates the framesLeft EU register. For excruciating details, see GenStack.mesa.
NotYetImplemented[noOp];
};
MarkFrame:
PROC =
--INLINE--
-- MACHINE CODE -- {
... marks the youngest frame on the IFU stack as being the last one to save. The eldest frame marked is actually the last one saved by SaveStack. This procedure probably needs to be called when interrupts are disabled.
NotYetImplemented[noOp];
};
Initialization
initialized: BOOL ← FALSE;
Init:
PROC ~ {
... called by each processor when it starts, with interrupts disabled and with enough room on the EU stack that a stack overflow is guaranteed not to occur. The first processor to execute this procedure creates the Nachos and Process objects, and is responsible for running SystemStartup[]. Every other process just creates itself Processor and Process objects for the scheduler to manipulate and then runs Idle[] forever.
processor: Processor;
process: Process;
first: BOOL;
AcquireMetaLock[];
If I am the initializer, set up the Nachos and Process objects.
first ← (NOT initialized);
IF first
THEN {
CreateNachos[];
CreateProcesses[];
pageFaultCond ← @pageFaultCondRep;
InitializeCondition[pageFaultCond, noTimeout];
initialized ← TRUE };
Allocate myself Processor and Process objects, initialize them so the scheduler can manipulate them.
processor ← CreateProcessor[];
process ← NewProcessInner[];
process.state ← running;
processor.running ← process;
InsertProcessor[processor];
IF process.priority < lowestRunningPriority THEN lowestRunningPriority ← process.priority;
SetCurrentProcess[process];
SetCurrentProcessor[processor];
ReleaseMetaLock[];
EnableInterrupts[];
IF first THEN SystemStartup[];
Idle[];
};
Idle:
PROC ~ {
RDI this is just a simple DO NULL ENDLOOP, and should be INLINE.
DO
SimCheckForRescheduleTrap[];
SimYield[];
ENDLOOP;
};
SystemStartup:
PROC ~ {
process: Process;
SetPriority[normal];
[] ← RegisterIntHandler[0, HandleTick];
process ← Fork[CountA, NIL, 0, 0];
SimDetach[process];
process ← Fork[CountB, NIL, 0, 0];
SimDetach[process];
SetPriority[Priority.FIRST];
};
HandleTick: IntHandler
[requestKind: RequestKind]
SIM only - this is copied from the body of ShuffleReadyQueues.
~ TRUSTED {
IF requestKind # 0 THEN ERROR;
FOR processor: Processor ← tailProcessor.next, processor.next
DO
SELECT processor.orders
FROM
noChange => {
Force a switch to the best available process
processor.orders ← switchToGiven;
processor.switchTo ← NIL;
};
ENDCASE;
IF processor = tailProcessor THEN EXIT;
ENDLOOP;
RaiseReschedule[];
};
Tick: PROC ~ { EntryTick[] };
EntryTick:
ENTRY
PROC [lock: SimLock ← simIOLock] ~ {
simIORequestWord[0] ← TRUE;
SimRaiseReschedule[] };
counterA: CARD ← 0;
counterB: CARD ← 0;
CountOrders: TYPE ~ { init, go, quit };
countOrders: CountOrders ← init;
CountA:
PROC ~ {
target: CARD ← 1;
WHILE countOrders = init
DO
SimYield[];
SimCheckForRescheduleTrap[];
ENDLOOP;
WHILE countOrders = go
DO
WHILE counterB < target
DO
SimYield[];
SimCheckForRescheduleTrap[];
ENDLOOP;
target ← target.SUCC;
counterA ← counterA.SUCC;
ENDLOOP;
};
CountB:
PROC ~ {
target: CARD ← 1;
WHILE countOrders = init
DO
SimYield[];
SimCheckForRescheduleTrap[];
ENDLOOP;
WHILE countOrders = go
DO
WHILE counterA < target
DO
SimYield[];
SimCheckForRescheduleTrap[];
ENDLOOP;
target ← target.SUCC;
counterB ← counterB.SUCC;
ENDLOOP;
};
Process Storage
Mostly just dummied-up for simulation purposes.
maxProcesses: CARDINAL = 40;
processes: ARRAY [0..maxProcesses) OF ProcessRep;
InvalidProcess: PUBLIC ERROR [process: Process] ~ CODE;
???? RDI: these definitions mean "addressing units" where they say "WORD" ????
processRepWords: INT ~ (BITS[ProcessRep] + BITS[WORD] - 1) / BITS[WORD];
processesWords: INT ~ maxProcesses * processRepWords;
ValidateProcess:
PUBLIC
SAFE
PROC [process: Process] ~
TRUSTED {
RDI: Obviously this has to change. Note we want ValidateProcess to be cheap.
delta: INT ← LOOPHOLE[process, INT] - LOOPHOLE[LONG[@processes], INT]; -- RDI: "LONG" is pornographic
IF (delta < 0) OR (delta >= processesWords) OR ((delta MOD processRepWords) # 0) THEN ERROR InvalidProcess[process];
};
SimCreateProcesses:
PROC [] ~ {
Sim only.
p: Process;
FOR i:
CARDINAL
IN [0..maxProcesses)
DO
p ← @processes[i];
processes[i].sim ← NEW [ SimProcessRep ← [lock~NEW[SimLockRep ← []], awakened~FALSE, processor~NIL] ];
ENDLOOP;
};
CreateProcesses:
PROC [] ~ {
... create all the process objects. On entry we assume interrupts are disabled and the metaLock is held.
RDI: obviously this has to change. Note we want ValidateProcess to be cheap.
FOR i:
CARDINAL
IN [0..maxProcesses)
DO
CleanProcess[@processes[i], FALSE];
ENDLOOP;
};
CleanProcess:
PROC [p: Process, check:
BOOL ←
TRUE]
RETURNS [] ~ {
... put a Process object on the free list, restoring its fields to a "clean" state. On entry we require that interrupts are disabled and the metalock is held.
RDI: remove initialization of sim fields; do something about euState?
p.queue ← NIL;
p.next ← NIL;
p.when ← noTimeout;
p.page ← wordZero;
p.priority ← Priority.FIRST;
p.state ← free;
RDI: do something about the EUState here? Probably not necessary. - ajd
InitializeMonitorLock[@p.lock];
p.abortState ← none;
p.joinState ← none;
InitializeCondition[@p.joinCondition, noTimeout];
{
-- SIM only
p.sim.awakened ← FALSE;
p.sim.processor ← NIL;
};
IF tailFreeProcess =
NIL
THEN p.meta ← tailFreeProcess ← p
ELSE { p.meta ← tailFreeProcess.meta; tailFreeProcess.meta ← p };
};
NewProcessInner:
PROC
RETURNS [p: Process] ~ --INLINE-- {
... get a process object from free process list. On entry we require that interrupts are disabled and the metalock is held.
IF tailFreeProcess = NIL THEN ERROR; -- too horrible to contemplate
p ← tailFreeProcess.meta;
IF tailFreeProcess = p THEN tailFreeProcess ← NIL ELSE tailFreeProcess.meta ← p.meta;
p.meta ← NIL; -- for sanity when debugging
};
NewProcess:
PROC
RETURNS [p: Process] ~ {
... get a process object from free process list.
SIM: On entry we require that interrupts are enabled.
DisableInterrupts[];
AcquireMetaLock[];
p ← NewProcessInner[];
ReleaseMetaLock[];
EnableInterrupts[]; -- SIM only (happens automatically in RDI).
};
Differences Between PrincOps and DragOps
Note: this section requires knowledge of the PrincOps architecture. I freely admit that my knowledge of said architecture is spotty at best. This section is also incomplete.
Multiple processors
With current PrincOps machines atomicity is acheived through a combination of locks, inhibition of interrupts, priority levels, and non-interruptibility of non-faulting microcode routines. In the DragOps arhcitecture, only locks (based on memory bus arbitration) and inhibition of interrupts for a single processor (based on IFU inhibition of the Reschedule interrupt) can be used as sources of atomicity. This complicates life, you betcha.
Memory vs. time
For PrincOps, memory was assumed to be expensive, and cycles were assumed to be expendable if it saved memory. For example, a process block costs only 8 bytes when no state vector is used, and only costs 36 additional bytes when a state vector is used (the unusual case).
The DragOps, memory is assumed to be relatively cheap, and cycles should only be spent to avoid gross memory costs. For example, a process block costs at least 40 bytes in the normal case, and there is no exceptional case.
The ready queue
For PrincOps, the head of the ready queue is the currently running process. For DragOps, running processes are not on the ready queue at all. Instead, running processes are located through the current ring of processors.
The DragOps ready queue is special in that it is actually an array of metaqueues, one for each priority level. This bounds the search for the best ready process to a time proportional to the number of priority levels (small), yet also allows us to place processes on the ready queue in unit time.
Queues and meta queues
For PrincOps, all queues are circular queues of processes linked through a particular word in the PSB (Process State Block). Condition variables, monitor locks, and even the ready queue are all queues where the queue pointer points at the tail of the processor queue. For DragOps, there are Queues and MetaQueues. Although both are circular queues, and both point at the tail, the link words are different. A DragOps process can be queued at the same time on both a queue and a metaqueue.
DragOps queues are used by "normal" Mesa monitors and condition variables. MetaQueues are used for more specialized reasons, like the ready queue (waiting for processor), the pageFault queue (waiting for a page to be brought into memory), and the timeout queue (waiting for some time to expire). Queues need not be pinned, while MetaQueues must be pinned. The scheduler is permitted to move processes between MetaQueues, but is not permitted to touch queues. All of this is done in the name of simplicity, you betcha.
Monitor locks & condition variables
For PrincOps, monitor locks and condition variables are managed by microcode that has carefully been constructed to not take interrupts at awkward times.
For DragOps, managing monitor lock and condition queues is too complex to be performed in a single instruction, so each monitor lock and condition has a spin lock that must be acquired before the queue can be examined or modified. A potential problem with this scheme is that a higher-priority process may attempt to acquire the spin lock while a lower-priority process possess the lock. Therefore, each spin lock has the address of the owning process, so a process that fails to acquire the lock can yiled control directly to the owning process. Although this is technically a violation of preemptive scheduling, it is necessary to prevent livelock (where no progress can be made because of the higher-priority process spinning).
State vectors
For PrincOps, the eval stack and other cleanup information is contained in a state vector. Minimal stack restrictions on monitor entry and condition wait ensure that the eval stack is empty when such processes go to sleep, which tends to keep the number of state vectors low.
For DragOps, there are no state vectors. When a process is about to go to sleep, it saves the frames in the IFU and EU stacks to memory, saves the appropriate EU registers to the process block, and puts itself on the appropriate metaqueue (if any). The only problem is ensuring that there are sufficient nachos to accomodate the stack of the process. The current thinking is to have an emergency-level process put every other process to sleep when frames are about to run out. When that has been done, a world-swap debug can take place.
An improvement on this scheme can be made if there is an error generated for a particular process when it has used "too many" frames. Since this is typically due to a process executing unbounded recursion, we can tolerate at least a few runaways without having to world-swap.
In both cases, we have to have some mechanism for keeping track of how many frames are in use by a given process, and how many are left in the global pool of frames. Details are yet to be specified.
Costs
For PrincOps, a process switch is on the order of 5 procedure call times. A procedure call (and return) is on the order of 5 microseconds (on the Dorado).
For DragOps, a process switch is on the order of 100 procedure call times (very rough estimate). Partly, this is because procedure calls are about 10 times faster, and partly this is because there is more per process state to save and restore.
Timeout
For PrincOps, a timeout occurs when the clock ticks and some PSB that is awaiting timeout has its wait count go to 0. All PSBs have to be touched at every tick, although this is done via microcode.
For DragOps, at each tick every process in timeoutMetaQueue has its when field decremented. If that field reaches 0, the process is placed in the ready queue. It then becomes the responsibility of the process to remove itself from the appropriate condition variable. This makes page faults (if any) due to touching the condition variable occur in the "right" process. Only PSBs waiting for timeout have to be touched at every tick.
Aborts
For PrincOps (as implemented for Cedar), aborts are enabled/inhibited on a per condition variable basis, and aborts are tested only by CVs & ICVs with aborts enabled and by explicit tests.
For DragOps, aborts are enabled/inhibited on a per process basis and on a CV basis. Abort requests are tested when waiting for CVs, ICVs, and MLs, as well as the normal explcit tests. Is it worth having the enable/disable on a per-process basis? (it costs rather little)
Things to do
This section is a list of things that still need some design work.
Fork / join
These need to be coded in greater detail.
Page faults
How will the fault handler work and place things to sleep waiting for page fault? For an EU page fault, the offending memory address is in the MAR register in the EU. For an IFU page fault the offending PC is for the start of the instruction, so the memory address to page in may be the next page (since this fits well with page clustering anyway, we may want to always page in the following page on an IFU page fault whenever the following page is also valid).
Soft and hard emergency limits on frames
These need support in the scheduler, as well as in stack save/restore. Reaching the soft limit (per process) should raise an error. This is just an attempt to restrain runaway processes. Reaching the hard limit will cause system shutdown and a call to the world-swap debugger. This probably will only happen if a significant number of runaways occur.
Timeout mechanism
In current Cedar relatively few condition variables have timeouts (the default is not to have timeouts). The current design for the timeout queue is based on the assumption that this situation will continue on Dragon. The current design has O(n) insertion and deletion of entries from the queue, O(n) updating for every tick, and O(1) testing, where n is the number of processes waiting for timeout. A heapsort-style priority queue has O(n log n) insertion and deletion, and O(1) testing, but it also has more complicated code, and needs an auxilliary pinned data structure (1 word / process). The current simple design will be used until it becomes a bottleneck.
Should the timeout "process" just be part of the scheduler? How will elapsed time be determined, and to what accuracy? Just for the record, there are three clocks of current interest to Cedar:
1: Time of day clock. This need not have resolution of better than a second, and it need not be locally stable, but it should have very good long-term accuracy.
2: Timeout clock. This need not have great accuracy, but it must generate interrupts, and it should have resolution of 10-100 milliseconds. More resolution is OK provided that timeout interrupts don't occur more often than every 10 milliseconds.
3: Elapsed time clock. This need not have great long-term accuracy, but it should have very good short-term accuracy, have roughly microsecond resolution, and it must be quite cheap to read. This clock can easily synthesize the Timeout clock.
Stack trace
SafeStorage needs a way to trace out the process stacks atomically. That is, all possible references in the process stacks must be discovered without any reference containing processes making headway. Dump frames are already pinned, and the panicStop mechanism can be used to stop all processes except for the current process (we may also want to have an stopRC command that would stop all processes that had not declared themselves to be non-RC). We can allocate frame extensions out of a special pool of memory that allows scanning. Possibly we should also make this pool pinned?
Reschedule & page fault
Setting L at procedure essentially transfers registers from the called frame to the calling frame, as far as stack save and restore are concerned. It will only work properly if both the calling frame and the called frame are in the EU (and IFU) registers. If the various compilers (and assembly language coders) guarantee that no calls are made before setting L, then there are only two sources of having the calling frame migrated into memory - reschedule & page fault. For these traps, if the youngest frame below the trap is P, and the next youngest is Q, and Q has zero registers, then both P and Q have to be restored before returning from the trap.