DragonStackSave.mesa
Russ Atkinson, March 19, 1984 9:18:22 pm PST
DragonStackSave.All[] generates code for stack saving and restoring. The basic assumptions are:
0. Stack frames are saved/restored with traps disabled.
1. Stack frames (from the euStack and ifuStack) are saved in dump frames.
2. Dump frames are fixed-size (20 words) blocks of resident memory.
3. A dump frame never contains more than one stack frame, but a stack frame may require several dump frames.
4. A dummy frame is maintained at the eldest position in the stack to handle stack underflow, since there is no trap for stack underflow.
DIRECTORY
DragonProcessOffsets,
DragOpsCross,
HandCoding,
HandCodingPseudos;
DragonStackSave: CEDAR PROGRAM
IMPORTS HandCoding, HandCodingPseudos
= BEGIN OPEN DragonProcessOffsets, HandCoding, HandCodingPseudos;
Word: TYPE = DragOpsCross.Word;
StackLog: NAT = 7;
Note: StackSize = 2^StackLog
RegArray: TYPE = ARRAY Reg OF Word;
DumpFrame: TYPE = LONG POINTER TO DumpFrameRep;
DumpFrameRep: TYPE = RECORD [
link: DumpFrame, -- link to previous frame
lastPC: Word, -- continuation PC for this frame
nRegs: Word, -- number of regs used for this frame at last save
others: DumpFrame, -- if # NIL, then it is a pointer to an aux frame
regs: RegArray -- addressable by local regs
];
linkOffset: NAT = 0;
constLink: ConstSpec = const0;
lastPCOffset: NAT = 1;
constLastPC: ConstSpec = const1;
nRegsOffset: NAT = 2;
constNRegs: ConstSpec = const2;
othersOffset: NAT = 3;
constOthers: ConstSpec = const3;
regOff: CARDINAL = 4; -- offset of DumpBlock.regs
LocalAllocationRep: TYPE = RECORD [
lockPtr: GlobalAllocationPtr, -- pointer to global lock for frame allocation
ptrs: LocalArray -- start of frame pointers
];
LocalArray: TYPE = ARRAY [0..16] OF DumpFrame;
type of holder for frames per processor, limit is quite arbitrary (need not be fixed)
NIL is stored at the very end to make test for empty easy
GlobalAllocationPtr: TYPE = LONG POINTER TO GlobalAllocationRep;
GlobalAllocationRep: TYPE = RECORD [
lockPtr: Word, -- pointer to global lock for frame allocation
gFree: Word, -- pointer to the next free frame in frames
emergency: Word, -- pointer to the emergency limit for frames (not yet used)
frames: GlobalArray -- space for the global pool of dump frames
];
GlobalArray: TYPE = ARRAY [0..1024] OF DumpFrame;
type of holder for frames, limit is quite arbitrary (need not be fixed)
NIL is stored at the very end to make test for empty easy
StackMargin: NAT ← 8;
Number of words in EU stack to reserve to handle stack overflow. Will this really be enough?
framesToTransfer: NAT ← 4;
Number of frames to transfer between local and global frame pools.
StackUnderflowTrap: PROC = {
This code is used whenever the caller needs to transfer control cleanly to the process stack kept in the hook. There is never any return from this transfer, and the contents of the register stack for the caller are aggressively discarded (popped off the stack). This code also initializes the dummy frame kept on the ifuStack to have a return point within this code. The overflow routine must preserve the contents of this dummy frame, but does not need to know what those contents need to be.
setupLabel: Label = GenLabel[];
restoreLabel: Label = GenLabel[];
entryLabel: Label = GenLabelHere[];
freeRoutineLabel: Label = GenLabel[];
flushLabel: Label = GenLabel[];
MakeLabelGlobal["DragonStack.StackUnderflowTrap", entryLabel];
DisableTraps[];
This allows this routine to be called from anywhere normal that has managed to setup hook properly. Traps must stay disabled during the remainder of this routine to make sure that our manipulations of the stack model stay kosher.
drLFC[UseLabel16[flushLabel]];
Our first call is to an internal routine that will flush everything on the ifuStack up to the point where the stack is completely empty. We never actually return from this call.
SetLabel[flushLabel];
At this point the youngest frame must be that of our caller, and its PC is unique in the stack, since we disabled traps prior to the call. This gives us a simple method for popping all of the frames off of the stack by popping the eldest until it is equal to the youngest. Isn't science wonderful?
GetYoungestPC[];
Push the youngest PC, which is the return PC for this local call
GetEldestPC[];
Push the eldest PC, which also flushes it from the ifuStack
drRJNEBJ[TRUE, popSrc, UseLabel8[flushLabel]];
Jump if they are not equal (predict not equal)
ProcedureEntry[NIL, 0];
When we are starting up from ground zero we don't make any assumptions about the agreement between L and S. So we make L agree with S under the assumption that nothing is on the EU stack.
{loopLabel: Label = GenLabelHere[];
At this point there are NO frames on the register stack. This leaves us free to setup for restoring the youngest frame in the chain maintained by the hook register.
drLFC[UseLabel16[restoreLabel]];
Call our local routine which will restore the stack frame referenced by hook, then "call" the "continuation PC" of the restored frame, leaving the address of the following jump as the "restoration PC". The next underflow will then loop around, restoring the stack, and get to this call again. This means that we do not have to know the return address outside of this routine. Further, provided that stack save/restore is working correctly, there are no frames on the stack when we return here.
DisableTraps[];
When we "return" to this routine we need to disable further traps, since the frame hacking stuff after this cannot tolerate being interrupted. Of course, this means that it is just barely possible for interrupts to occur between the return and the trap disabling. The only way that this can happen is if we see a reschedule interrupt before we disable traps. This means that the reschedule routine must be prepared to recover from this unsavory situation. This code has not yet been designed, but it may take the form of a very careful entry to stack saving. Can we make it fall out of stack save for almost free?
drJB[UseLabel8[loopLabel]];
This instruction is the normal place to return to when the stack is completely empty. This means that we need to restore another frame from the frames saved in memory, then transfer to that frame.
};
SetLabel[freeRoutineLabel];
This is an excellent place to put the internal routine to free up the frame referenced by temp. This way we keep the code common, and keep jump distances short where they count. The only penalty is a few cycles for the call/return.
Note that at this point we have 3 frames on the IFU stack.
FreeDumpFrame[];
alloc the dump frame pointed to by temp
drRETN[];
return without S adjustment
SetLabel[setupLabel];
This is a utility routine that we need in order to set L, the new stack limit, and so forth. It needs to be a procedure because we have to set L (sigh). After return we have the following:
([S] = L = (oldL - #regs) MOD StackSize; [S-1] = #regs; [S-2] = 16)
(sLimit = (L - StackMargin) MOD StackSize)
(temp = hook'; hook = temp^)
drLIB[16]; LRegI[hook, constNRegs];
Push 16, then put the # of registers in [S]
([S] = #regs, [S-1] = 16)
GetYoungestL[]; drRVSUB[c: topDst, a: topSrc, b: belowSrc];
calculate the new L
([S] = oldL - #regs; [S-1] = #regs; [S-2] = 16)
ExtractField[first: 32-StackLog, bits: StackLog];
mask off nasty bits
([S] = (oldL - #regs) MOD StackSize; [S-1] = #regs; [S-2] = 16)
drDUP[]; SetYoungestL[];
This will make L have the right value on return
([S] = newL; [S-1] = #regs; [S-2] = 16).
MoveReg[temp, hook];
Put the first frame to restore in temp
MoveRegI[hook, hook, constLink];
Make the hook point to the next frame (hook ← (hook+0)^)
drDUP[]; drSUBB[StackMargin];
Calculate the new stack limit
([S] = newL-StackMargin; [S-1] = newL; [S-2] = #regs; [S-3] = 16)
ExtractField[first: 32-StackLog, bits: StackLog]; SetSPLimit[];
mask off nasty bits and set the new stack limit
(sLimit = (newL-StackMargin) MOD StackSize)
([S] = newL; [S-1] = #regs; [S-2] = 16)
drRETN[];
return without S adjustment
([S] = newL; [S-1] = #regs; [S-2] = 16)
{shortLabel: Label = GenLabel[];
At this point there is precisely one frame on the IFU stack, which is the dummy frame. This is actually a very handy place to keep the dummy frame. However, since we need a place to store the frame we want to restore, we reserve one more frame by using a call.
SetLabel[restoreLabel];
drLFC[3];
call the next instruction, just to make a new frame
drLFC[UseLabel16[setupLabel]];
After this call, L = [S], and is at the proper point for restoring the registers. Also, we have unhooked this bunch of save blocks from hook. The start of our save block chain is now in temp.
([S] = L = (oldL - #regs) MOD StackSize; [S-1] = #regs; [S-2] = 16)
SetYoungestL[];
Set the youngest L, so it will have the proper value when we return to the restored frame. At this point we have:
([S] = #regs; [S-1] = 16)
LRegI[hook, constLastPC]; SetYoungestPC[];
set the return PC to be that of the frame we want to restore.
drRJGEBJ[popLeft: FALSE, right: topSrc, dist: UseLabel8[shortLabel]];
If #regs <= 16 (the usual case), skip the code that dumps extra blocks.
{loopLabel: Label = GenLabelHere[];
At this point, there nregs > 16, so dump the first 16.
Now do 16 reads into the stack registers from the dump block
FOR i: NAT IN [0..15] DO
drRAI[reg1: [reg[i]], reg2: temp, disp: regOff+i];
ENDLOOP;
LRegI[temp, constOthers];
push the next dump frame onto the stack
([S] = nextDumpFrame; [S-1] = #regs; [S-2] = 16)
drLFC[UseLabel16[freeRoutineLabel]];
free the dump frame pointed to by temp (no change to S)
SReg[temp];
temp ← nextDumpFrame
([S] = #regs; [S-1] = 16)
drARL[16]; drSUBB[16];
adjust L and #regs ([S] = #regs left; [S-1] = 16)
drRJLB[popLeft: FALSE, right: topSrc, dist: UseLabel8[loopLabel]];
go do another frame (if necessary), do not change S
};
SetLabel[shortLabel];
This is point where we restore frames without extensions, and also restore the remnant of frames with extensions. [S] = #regs; [S-1] = 16.
{jumpLabel: Label = GenLabel[];
The best way to quickly restore a variable # of regs is to jump into a table of read instructions. This takes roughly 6 cycles to do the jump, and we stand to save at least one cycle for every store we save (not counting avoiding extra dirty cache entries).
drRVADD[c: belowDst, a: topSrc, b: topSrc];
([S] = #regs; [S-1] = 2*#regs)
drRVADD[c: belowDst, a: belowSrc, b: belowSrc];
([S] = #regs; [S-1] = 4*#regs)
drRVSUB[c: belowDst, a: popSrc, b: belowSrc];
([S] = -3*#regs)
IndexedJump[dest: jumpLabel];
jump into the table
FOR i: NAT DECREASING IN [0..15] DO
drRAI[reg1: [reg[i]], reg2: temp, disp: regOff+i];
ENDLOOP;
SetLabel[jumpLabel];
};
drLFC[UseLabel16[freeRoutineLabel]];
free the dump frame pointed to by temp
At this point all of the registers have been restored, all of the dump blocks have been freed, and there are two frames on the IFU stack. The top frame on the IFU stack is for the restored frame, and the frame under that is the dummy frame. We resume the restored frame by returning to it with traps enabled.
drRETT[];
"Return" to the restored frame, enabling traps as we go. S is not changed.
};
};
IFUStackOverflowTrap: PROC = {
Note: assume that there are sufficient words on the EU stack to complete the saving of one frame. The stack overflow trap is disabled on entry to this routine, and must be reenabled on the way out. If the IFU stack has overflowed, then saving just one frame will be sufficient.
entryLabel: Label = GenLabelHere[];
callLabel: Label = GetGlobalLabel["DragonStack.InternalSaveEldest"];
MakeLabelGlobal["DragonStack.IFUStackOverflowTrap", entryLabel];
GetEldestPC[];
Flush the dummy frame, saving it on the stack (L is not important, so is not saved). This PC is used to restore the dummy eldest frame at the end of this procedure.
drLFC[UseLabel16[callLabel]];
Call to save one frame
At this point, [S] = the PC of the dummy frame, which must go back on the IFU stack. We also have to make the dummy L = eldest L and calculate the new stack limit.
GetEldestL[]; PReg[temp]; drSUBB[StackMargin];
Calculate the new stack limit. save the eldest L for further use.
ExtractField[first: 32-StackLog, bits: StackLog]; SetSPLimit[];
mask off any garbage and set the new stack limit
SetEldestPC[]; LReg[temp]; SetEldestL[];
Replace the dummy eldest frame. The dummy eldest L must be inherited from the real eldest L in order to properly preserve returned values on the stack!
drRETT[];
This return enables traps, and does not disturb S, but does restore L.
};
EUStackOverflowTrap: PROC = {
Note: assume that there are sufficient words on the EU stack to complete the saving of one frame. Traps are automatically disabled on entry to this routine, and must be reenabled on the way out. If we save a frame, but don't get any more space in the EU stack, then we have to keep saving frames until a frame is saved that has at least one EU register saved with it.
entryLabel: Label = GenLabelHere[];
callLabel: Label = GetGlobalLabel["DragonStack.InternalSaveEldest"];
MakeLabelGlobal["DragonStack.EUStackOverflowTrap", entryLabel];
GetEldestPC[];
Flush the dummy frame, saving it on the stack (L is not important, so is not saved). This PC is used to restore the dummy eldest frame at the end of this procedure.
{loopLabel: Label = GenLabelHere[];
drLFC[UseLabel16[callLabel]];
Call to save one frame
LRegI[hook, constNRegs];
drRJNEB[right: const0, popLeft: TRUE, dist: UseLabel8[loopLabel]];
If nregs = 0, then loop to the start. We must get at least one word saved to terminate. Of course, at least one frame must have more than 0 words in order to get the EU stack overflow trap in the first place, so termination is assured.
};
At this point, [S] = the PC of the dummy frame, which must go back on the IFU stack. We also have to make the dummy L = eldest L and calculate the new stack limit.
GetEldestL[]; PReg[temp]; drSUBB[StackMargin];
Calculate the new stack limit. save the eldest L for further use.
ExtractField[first: 32-StackLog, bits: StackLog]; SetSPLimit[];
mask off any garbage and set the new stack limit
SetEldestPC[]; LReg[temp]; SetEldestL[];
Replace the dummy eldest frame. The dummy eldest L must be inherited from the real eldest L in order to properly preserve returned values on the stack!
drRETT[];
This return enables traps, and does not disturb S, but does restore L.
};
RescheduleTrap: PROC = {
This is the place where we trap to when the reschedule interrupt is handled. Further traps are disabled so we don't get into too much trouble.
entryLabel: Label = GenLabel[];
exitLabelLabel: Label = GenLabel[];
callLabel: Label = GetGlobalLabel["DragonStack.StackOverflowCall"];
schedBase: RegSpec = reg0;
junkReg: RegSpec = reg1;
nastyLabel: Label = GenLabel[];
trySched: Label = GenLabel[];
noSched: Label = GenLabel[];
callSched: Label = GenLabel[];
followOrders: Label = GenLabel[];
ProcedureEntry[entryLabel, 0];
MakeLabelGlobal["DragonStack.RescheduleTrap", entryLabel];
There is one nasty problem: what do we do if there is only one frame on the stack? In such a case, the stack underflow routine must complete before we can look at the stack safely. Luckily, that routine is guaranteed to finish. So we just return to it without turning on traps, and particularly without turing off the reschedule request.
GetYoungestPC[];
This acquires the youngest saved PC in the IFU stack
([S] == schedBase = youngest PC)
GetEldestPC[];
This acquires the eldest saved PC in the IFU stack
([S] == junkReg = eldest PC; [S-1] = youngest PC)
drRJEB[popLeft: FALSE, right: schedBase, dist: UseLabel8[nastyLabel]];
Compare the two (predict different), and jump if same
([S] == junkReg = garbage; [S-1] == schedBase = garbage)
drRAI[schedBase, global, pdSchedulerBase];
By convention, the first page (at least) of the global frame table is resident, and the scheduler base is in the first 256 words.
drRAI[junkReg, process, pdProcessPriority];
junkReg = priority of our process
First thing to do is to determine if we have anything to do. If not (a frequent case), we can just continue with normal processing. The first test is to see if the priority of this processor is equal to or lower than the priority of the lowest priority running process. If so, we try to get the scheduler. If not, we go to test for any orders for this processor.
drLRIn[schedBase, pdSchedulerPriority];
[S] = lowest priority running process
drRJGB[popLeft: TRUE, right: junkReg, dist: UseLabel8[trySched]];
SetLabel[noSched];
At this point we either did not try for or could not get the scheduler. So we just look at our processor directions to see if this processor has been asked to do something.
drRAI[junkReg, processor, pdProcessorOrders];
Load the orders
drJNEBB[UseLabel8[followOrders], LOOPHOLE[DragOpsCross.ProcessorOrders[reset]]];
If orders are present, try to follow them
At this point there is only one word on the stack, which we can discard. Then we just proceed as though nothing had happened.
drDIS[]; -- clear off the stack
drRETT[];
This return enables traps, and does not disturb S, but does restore L.
SetLabel[nastyLabel];
At this point we have discovered that we are trying to interrupt in the middle of the dummy frame! Although somewhat rare, this really can happen!
([S] = eldest PC; [S-1] = youngest PC)
SetEldestPC[];
The two are the same, so we have interrupt the dummy frame, which is not really allowed. We have to restore the dummy PC. We assume that the eldest L should have been undisturbed by the preceding madness (if not, we will have to restore that, too).
drJ1[]; drJ1[];
Stall for two instructions to let the restored PC "take"
drRET[0];
Return to the stack underflow routine with traps disabled. When it has completed the frame restoration, it will reenable traps and again take the reschedule interrupt. At that time we hopefully will find nothing to do, because some other processor will have taken the bait.
SetLabel[trySched];
At this point we have determined that we must try for the scheduler. First try for the scheduler lock.
LReg[schedBase]; IF pdSchedulerLock # 0 THEN drADDB[pdSchedulerLock];
([S] = addr of sched lock)
LReg[process]; drLC0[];
([S] = 0; [S-1] = process addr; [S-2] = addr of sched lock)
drCST[];
Try to acquire the scheduler lock
([S] = sample; [S-1] = 0; [S-2] = process addr; [S-3] = addr of sched lock)
drRJEB[popLeft: TRUE, right: popSrc, dist: UseLabel8[callSched]];
Test for success, popping both sample and 0 words
drAS[2];
Could not get the lock, so discard the leftovers
drJB[UseLabel8[noSched]];
And go try for orders
SetLabel[callSched];
Well, what do you know! We are now the new scheduler processor!
drAS[2];
flush the junk left over from CST
drRRI[junkReg, schedBase, pdSchedulerOwnerCall]; drSFC[];
call the scheduling routine (with schedBase as argument)
drRETT[];
When we get back here, return to the poor interrupted beastie
SetLabel[followOrders];
At this point we have determined that we must follow orders.
drLRIn[schedBase, pdSchedulerOrdersCall]; drSFC[];
call the orders following routine (with schedBase as argument)
drRETT[];
When we get back here, return to the poor interrupted beastie
};
InternalSaveEldest: PROC = {
This routine saves the eldest frame. It is the caller's responsability to save and/or restore the dummy frame. This routine is intended for use ONLY by the various overflow and reschedule trap routines.
setupLabel: Label = GenLabel[];
allocRoutineLabel: Label = GenLabel[];
{
SetLabel[allocRoutineLabel];
This is an excellent place to put the internal routine to alloc a new dump block. This way we keep the code common, and keep jump distances short where they count. The only penalty is a few cycles for the call/return.
AllocDumpFrame[];
free the dump frame pointed to by temp
drLC0[]; drPSB[othersOffset];
always clears the others field on allocation
drRETN[];
return without S adjustment
};
{
This is an internal routine that we use to setup stuff (especially L) prior to saving the frame. It needs to be a procedure because we have to set L (sigh).
before entry:
[S] = newBlock; [S-1] = PC for dummy frame; hook = frame chain
after exit:
[S] = #regs; [S-1] = 16; L = L of frame to save
localHook: RegSpec = reg0;
localNRegs: RegSpec = reg1;
ProcedureEntry[setupLabel, 1];
Make our local L good for a few locals, we get one sent to us on the stack
LReg[hook]; drPSB[linkOffset];
store the hook into the new dump block
([S] = localHook = newBlock; (localHook+linkOffset)^ = hook)
PReg[hook]; PReg[temp];
store the new dump block addr into hook and temp
([S] = localHook = hook = temp = newBlock)
GetEldestL[]; drDUP[]; SetYoungestL[];
Get the L for the frame we wish to save, put it into localNRegs and youngest L
([S] = localNRegs = L of frame to save; [S-1] = localHook)
GetEldestPC[]; drSRIn[localHook, lastPCOffset];
Store the continuation PC and remove the eldest PC from the ifuStack.
((localHook+lastPCOffset)^ = PC of frame to save; [S-1] = localHook)
GetEldestL[]; drRVSUB[c: topDst, a: topSrc, b: localNRegs];
Calculate the number of regs in this frame
([S] = localNRegs = eldestL - localNRegs'; [S-1] = localHook)
ExtractField[first: 32-StackLog, bits: StackLog];
determine the # of regs to store, using modulo arithmetic
([S] = localNRegs = #regs = (eldestL - localNRegs') MOD StackSize)
drWRI[localNRegs, localHook, nRegsOffset];
Save #regs into the save block
(localHook+nRegsOffset)^ = localNRegs
drLIB[16]; SReg[localHook];
([S] = #regs; [S-1] = 16)
drRETN[];
return without S adjustment
};
{entryLabel: Label = GenLabelHere[];
shortLabel: Label = GenLabel[];
MakeLabelGlobal["DragonStack.InternalSaveEldest", entryLabel];
drLFC[UseLabel16[allocRoutineLabel]];
Allocate the next save block to use
([S] = newBlock; [S-1] = PC of dummy frame)
drLFC[UseLabel16[setupLabel]];
Init the new save block. Its address is left in hook. L gets set to the L for the frame we want to save. In addition:
([S] = #regs; [S-1] = 16)
drRJGEBJ[popLeft: FALSE, right: topSrc, dist: UseLabel8[shortLabel]];
If #regs <= 16 (the usual case), skip the code that dumps extra blocks.
{loopLabel: Label = GenLabelHere[];
At this point, there nregs > 16, so dump the first 16. The save block to use has already been allocated, and its address is in temp.
FOR i: NAT IN [0..15] DO
drWAI[reg1: [reg[i]], reg2: temp, disp: regOff+i];
ENDLOOP;
LReg[temp];
Push the address of the completed save block
([S] = temp = oldBlock; [S-1] = #regs; [S-2] = 16)
drLFC[UseLabel16[allocRoutineLabel]];
Allocate the next save block to use (it gets left on the stack)
([S] = newBlock; [S-1] = temp = oldBlock; [S-2] = #regs; [S-3] = 16)
PReg[temp]; drWSB[othersOffset];
set temp to the new dump block addr
also store the new block addr into the old block
(temp = (temp'+othersOffset)^ = newBlock; [S] = #regs; [S-1] = 16)
drARL[16]; drSUBB[16];
adjust L and #regs
([S] = #regs; [S-1] = 16)
drRJLB[popLeft: FALSE, right: topSrc, dist: UseLabel8[loopLabel]];
go do another frame (if necessary)
};
SetLabel[shortLabel];
(#regs IN [0..16]; [S] = #regs; [S-1] = 16)
{jumpLabel: Label = GenLabel[];
The best way to quickly restore a variable # of regs is to jump into a table of read instructions. This takes roughly 6 cycles to do the jump, and we stand to save at least one cycle for every store we save (not counting avoiding extra dirty cache entries).
drRVADD[c: belowDst, a: topSrc, b: topSrc];
([S] = #regs; [S-1] = 2*#regs)
drRVADD[c: belowDst, a: belowSrc, b: belowSrc];
([S] = #regs; [S-1] = 4*#regs)
drRVSUB[c: belowDst, a: popSrc, b: belowSrc];
([S] = -3*#regs)
IndexedJump[jumpLabel];
jump into the table
FOR i: NAT DECREASING IN [0..15] DO
Generate 16 stores of locals to the save block
drWAI[reg1: [reg[i]], reg2: temp, disp: regOff+i];
ENDLOOP;
SetLabel[jumpLabel];
};
drRETN[];
This return does not enable traps, and does not disturb S, but does restore L.
};
};
AllocDumpFrame: PROC = {
Allocate a dump frame using the local array of frames, defaulting to the global array of frames if not immediately successful. The address of the allocated frame is left on the stack. There is no provision for failure from the global array at this time.
exitLabel: Label = GenLabel[];
enterLabel: Label = GenLabelHere[];
LRegI[free, const0];
Push the next frame address. If none in our local cache, a 0 will be pushed.
drRJNEBJ[right: const0, popLeft: FALSE, dist: UseLabel8[exitLabel]];
Jump if we got the frame; leave 0 on stack if not. Predict success.
drRRX[c: topDst, a: base, b: const0];
Replace top of stack (was 0) with the addr of the lock.
LReg[process];
Push the desired new value for the lock field
drLC0[];
Push the assumed old value of the lock field (i.e. free)
{retryLabel: Label = GenLabelHere[];
drCST[0];
Try for the global lock for the frame allocator. We store our process ID into the lock if the lock was free, and return the sampled value of the lock at the start of the instruction in any case. Note that there are 4 words on the stack (lockAddr, new, old, sample).
drRJEB[right: popSrc, popLeft: FALSE, dist: UseLabel8[retryLabel]];
Jump back to retry if we did not get it (predict success), and pop the sampled value. If this loop causes us to hold the bus too often, we can always insert a spin loop that does not use CST.
drAS[2];
On success, flush the old & new values, since we don't need them.
};
SubReg[free, const4];
Adjust the free pointer to accomodate the next transfers.
LRegI[base, const0]; drRSB[1];
Get the global free pointer (gFree) on the stack (lock addr is also left on stack)
FOR i: NAT IN [0..4) DO
drRSB[i]; LReg[free]; drWB[i];
(free+i)^ ← (gFree+i)^
ENDLOOP;
drADDB[framesToTransfer];
adjust gFree for the number of frames allocated
drPSB[1];
Write gFree back, leaving the addr of the lock on the stack.
drLC0[]; drWSB[0];
Clear out the lock, which lets other processors get their chance.
drJB[UseLabel8[enterLabel]];
Go to retry the allocation from the start (stack is back to ground level).
SetLabel[exitLabel];
This is the successful exit point. The allocated frame is on the stack.
AddReg[free, const1];
Adjust the free pointer, since we got a frame
};
FreeDumpFrame: PROC [] = {
Free a dump frame (in temp) using the local array of frames, defaulting to the global array of frames if not immediately successful. There is no provision for failure from the global array at this time.
exitLabel: Label = GenLabel[];
enterLabel: Label = GenLabelHere[];
drRVSUB[c: pushDst, a: free, b: const1];
push free - 1
drRJNEBJ[popLeft: FALSE, right: base, dist: UseLabel8[exitLabel]];
Determine if free-1 will collide with base, jump if not. Free-1 is left on the stack for use in storing away the freed frame. Predict success.
drRRX[c: topDst, a: base, b: const0];
Put the addr of the lock on the stack. We get to have a pop for free here.
LReg[process];
Push the desired new value for the lock field
drLC0[];
Push the assumed old value of the lock field (i.e. free)
{retryLabel: Label = GenLabelHere[];
drCST[0];
Try for the global lock for the frame allocator. We store our process ID into the lock if the lock was free, and return the sampled value of the lock at the start of the instruction in any case. Note that there are 4 words on the stack (lockAddr, new, old, sample).
drRJEB[right: popSrc, popLeft: FALSE, dist: UseLabel8[retryLabel]];
Jump back to retry if we did not get it (predict success), and pop the sampled value. If this loop causes us to hold the bus too often, we can always insert a spin loop that does not use CST.
drAS[2];
On success, flush the old & new values, since we don't need them. The lock address is left on the stack.
};
LRegI[base, const1]; drRSB[1];
get the global free pointer (gFree) on the stack (leaving the lock addr under it)
drSUBB[4];
gFree ← gFree - 4; leave gFree on stack
FOR i: NAT IN [0..4) DO
LReg[free]; drRB[i]; drPSB[i];
(gFree+i)^ ← (free+i)^
ENDLOOP;
drPSB[1];
write gFree back, leaving the lock addr on the stack
drLC0[]; drWSB[0];
clear out the lock, which lets other processors get theirs
drRVADD[c: pushDst, a: free, b: const3];
push free + (4-1)
SetLabel[exitLabel];
This is the successful exit point. There is room between free and base.
At this point the address of the slot to hold the frame is on the stack. First we store that address to free, then we load the frame to be freed and store it into the slot.
PReg[free]; LReg[temp]; drWSB[0];
(free ← [S]-)^ ← temp
};
All: PROC = {
InternalSaveEldest[];
generate this first to define the label "DragonStack.InternalSaveEldest"
IFUStackOverflowTrap[];
EUStackOverflowTrap[];
RescheduleTrap[];
StackUnderflowTrap[];
};
END.