:TITLE[Cedar]; %Ed Fiala 7 December 1983: Rewrite for Cedar 5.0 per Rovner's spec. Thoughts: 1) Can save 3 pages in ZCTObject by advancing bsiToFreeList from 400b to 10b, bsiToSize to 110b, sizeToBSI to 310b, and fosTable to 1000b. By making a page fault on the bsiToSize table impossible, this would make @FreeObject 4 cycles faster as well. 2) Half of the sizes in sizeToBSI seem to be wasted because the restriction on NHPs and Refs being even-aligned implies sizes should be even. Could have @AllocateObject fetch the BSI at (Size+1)/2 in the sizeToBSI table, saving 220b words. 3) Making MaxSmallBlockSize a value that is describable by a single microcode constant (i.e., 0 to 377b or (0 to 377b)*400b) saves 4 cycles in AllocateObject. 4) Storing BSI*2 in the SizeToBSI table saves 2 cycles (because BSI does not then require shifting left 1 before adding it to a constant displacement when determining the displacement of the long pointer in the bsiToFreeList array). 5) If NHP's were quadword-aligned, then the two NHP words and the link to the next free block could be obtained with a single PFetch4, and the words could be initialized with a single PStore4 in @AllocateObject, rather than PFetch1, PFetch2, PStore2, and PStore2. In addition to saving ~26 of the ~160 cycles in @AllocateObject and ~20+(Size-4)*5 cycles in @FreeObject, this change would allow Size/4 to be compressed to 8 bits in the bsiToSize table, saving 100b words; in conjunction with (1) and (2) above, all of the *bsi* tables would fit in the first page of ZCTObject, saving a page. Also, block-zeroing becomes simpler and faster in @FreeObject because of quadword alignment and block lengths. However, objects would then be limited to sizes that are 2+2, 2+6, 2+10, etc., excluding the 2+4, 2+8, etc. sizes. 6) CRefType, BlkZ, and LocalBlkZ should be renumbered so that alpha is 14xb, and these opcodes should be put out by the Cedar5 compiler. AssignRefNew is useless and should be deimplemented. 7) @FreeObject's check for microcode disabled should be moved after the block zeroing. Zeroing the block should be harmless (?), and if this is not done, then software has to re-zero words already zeroed. 8) Require all opcodes except AssignRef and CRefType to be minimal stack. To run Cedar5, RTMoveStatus, EnableMicrocode, DisableMicrocode, and CRefType must be defined. Other opcodes are optional, and debugging can take place if the opcode traps with parameter 0 when unimplemented. The Cedar compiler only produces the CRefType and AssignRef opcodes; all others are produced only in inlines. Cedar 5.0 initialization executes RTMoveStatus so that 5.0 debugging can take place within a 4.4 environment, so this old useless opcode must be defined; and CRefType clears the top two bits of the type word for backward compatibility. Here, RTMoveStatus puts 3 in gcZCT so that AssignRef will always trap with parameter 0. The pointer to a normal object header is called an NHP, and it always points at an even address; the associated Ref points at NHP+2; the object is an even number of words long. Normal object header format: Word - 2: 0..0 inZCT 1..1 maybeOnStack 2..7 BlockSizeIndex 8..8 f 0 => reclaim object if refCount=0 1 => queue object for finalization if refCount=0 9..14 refCount 15..15 rcOverflowed Word -1: 0..1 temporary for compatibility 2..15 type CRefType will be able to assume that the 1st two bits of the type word are 0 after Cedar5 conversion is complete. ZCTObject format 2^n pages; starts on a page boundary; first page is locked in storage. Word 0-1 wp lp to next ZCT entry into which to store an NHP or to a zctBlock link; block link has NIL=0 (empty) or pointer to next block; wp is EVEN. Word 2-3 rp lp to next ZCT entry to be examined by collector. Word 4-5 lastNP lp to lp ??? Word 6[15d..15d] markingDecrements Boolean. unused words to page boundary Word 400b-577b bsiToFreeList 100b FNHeaderP's (lp's to normal free Headers) Word 1000b-1077b bsiToSize 100b positive integers unused words to page boundary Word 1400b-2036b sizeToBSI 1076b bytes of block size indices in the range 0..77b unused words to page boundary Word 2400b-12377b fosTable 4k-word array of residues described below. FOSTableObject = 4k-word array of residues. Valid residues are: 0..777b 22-bit address space 0..3777b 24-bit address space 0..77777b 28-bit address space. fosWildCard=177777b fosEmpty=100000b Note that sizes obtained from bsiToSize INCLUDE the two NHP overhead words, while the size used to index the sizeToBSI table DOES NOT INCLUDE the two overhead words. % MC[MicVersion,3]; *Must agree with declaration in CedarMicrocode.Mesa *Trap parameters MC[RCOverflowTrap,1]; MC[MicDisabled,2]; *Microcode disabled. MC[RCUnderflowTrap,3]; MC[ZCTFullTrap,4]; MC[RCFinalizeTrap,6]; *Finalize the Ref. MC[NormalFreeListEmpty,6]; *Free list for AllocateObject is empty *TrapParm=5 seems to mean "disaster." MC[RCUnderBugTrap,5]; *About to decrement a refCount that is already 0. MC[RefUneven,5]; *Ref doesn't point at even word. MC[BadSize,5]; *Size from BSItoSize table not even. MC[BadBSI,5]; *BSI from SizeToBSI table > 77b. MC[NotMinimalStack,5]; *Opcode not minimal stack that should be. MC[BadCreateRefCount,5]; MC[BadZCTAlignmentTrap,5]; MC[ZCTWPUnevenTrap,5]; MC[BSItoFreeList,400]; *Displacement from gcZCT to bsiToFreeList table MC[BSItoSize,1000]; *Displacement from gcZCT to bsiToSize table MC[SizetoBSI,1400]; *Displacement from gcZCT to sizeToBSI table MC[FOSTable,2400]; *Displacement from gcZCT to fosTable Set[MaxSmallBlockSize,1076]; MC[BSIEscape,77]; *= Last[BlockSizeIndex] *Since gcZCTWP is only used with a displacement of 0, bits 8..15d in *gcZCTWPhi don't affect the function as a base register; bit 15d in *gcZCTWPhi is used for the "markingDecrements" flag. RV2[gcZCTWP,gcZCTWPhi,64]; *Permanent base register (always even) RV2[gcZCT,gcZCThi,24]; *Permanent page-aligned base register except *gcZCT[15d..15d]=1 disables microcode *Temporaries 44-47, 50-53, 56-57, 66-73 *RTemp (RM 52) holds 400b+alpha; must preserve it until trap is impossible. *RM 66-67 are LP and LPhi, which are used. ***Note: gcTemp0/1 were the same registers as gcLPR/gcLPRhi until I coded ***@AllocateObject. RV4[gcH0,gcH1,gcTemp0,gcTemp1,44]; RV[gcTemp2,50]; *Must be register after gcTemp1 for unaligned PStore4 RV[gcRLink,51]; RV2[gcNNHP,gcNNHPhi,70]; RV2[gcLPR,gcLPRhi,72]; *56-57 are unused temporaries. Set[gcPage,2]; *~400b mi Loca[Misc140d,gcPage,0]; *20b-way dispatch Set[gcPage2,10]; *Stuff that won't fit on gcPage gcRefill: PFetch4[PCB,IBuf,4], GoToP[MesaRefill], At[LShift[gcPage,10],377]; gcBugTrap: T _ SStkP, Skip; *Distinct from gcTrap for breakpoints gcTrap: T _ SStkP; gcTemp0 _ T, LoadPage[opPage0]; StkP _ gcTemp0, GoToP[UndefTrap]; OnPage[gcPage2]; gcP2Trap: T _ SStkP; gcTemp0 _ T, LoadPage[opPage0]; StkP _ gcTemp0, GoToP[UndefTrap]; gcRet2: Return; %Esc dispatch. Timing = 18.5 cycles (including buffer refill) to the 1st working mi for the opcode. T _ CNextData[IBuf], Call[.+2], Opcode[364]; LU _ NextInst[IBuf], CallX[P7Tailx]; RTemp _ T, LoadPage[xfPage1], Skip[H2Bit8']; TrapParm _ 0C, GoToP[MiscUnimp]; Dispatch[RTemp[11,3], GoToP[.+1]; OnPage[xfPage1]; Dispatch[RTemp,14,4], Disp[.+1], At[MiscDisp0,0]; ... % *GCSetup = RTMoveStatus = Esc 63b. *CRefType = Esc 72b. T _ (R400) + T, LoadPage[gcPage], At[MiscDisp0,3]; LU _ (RTemp) xor (72C), GoToP[.+1]; OnPage[gcPage]; LU _ (RTemp) xor (63C), GoTo[gcCRefType,ALU=0]; RTemp _ T, LoadPage[opPage0], Skip[ALU=0]; gcUnimp: TrapParm _ 0C, GoToP[UndefTrap]; *So that the old microcode will work in trap mode, make RTMoveStatus *(= GTSetup) disable the new microcode, pop the stack once, and return. *gcZCT is set to 3 to make AssignRef trap with TrapParm=0, and the fact *that gcZCT is odd disables all the new opcodes. gcZCT _ 3C, GoToP[.+1]; OnPage[opPage0]; LU _ NextInst[IBuf]; Stack&-1, NIRet; T _ (R400) + T, LoadPage[gcPage], At[MiscDisp0,6]; Dispatch[RTemp,14,4], RTemp _ T, NoRegILockOK, GoToP[.+1]; OnPage[gcPage]; Disp[@ReclaimedRef]; *Esc140 = ReclaimedRef *Esc141 = EnableMicrocode *Esc142 = DisableMicrocode *Esc143 = CreateRef *Esc144 = ReclaimableRef *Esc145 = AllocateObject *Esc146 = FreeObject @Esc147: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,7]; *undef @Esc150: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,10]; *undef @Esc151: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,11]; *undef @Esc152: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,12]; *undef @Esc153: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,13]; *undef @Esc154: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,14]; *undef @Esc155: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,15]; *undef @Esc156: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,16]; *undef @Esc157: LoadPage[opPage0], CallX[gcUnimp], At[Misc140d,17]; *undef OnPage[gcPage]; gcZCTEven: gcZCThi _ (RSh[gcZCThi,10]) + T + 1, GoToP[.+1]; OnPage[gcPage2]; T _ Stack&-1, Skip[R Odd]; gcZCT _ T, Return; TrapParm _ BadZCTAlignmentTrap, GoTo[gcP2Trap]; %Argument is long pointer to ZCTObject. Result is microcode version. This opcode loads the gcZCTWP/hi and gcZCT/hi base registers, the markingDecrements flag, and the MicEnabled flag. Timing = 18.5 + 42 cycles. % @EnableMicrocode: *Esc 141b T _ LSh[Stack&-1,10], At[Misc140d,1]; *Setup long pointer from [S]/[S-1] into gcZCT/gcZCThi. gcZCThi _ T, LoadPage[gcPage2], Call[gcZCTEven]; *Since the references below can't page fault, there is no problem with *clobbering the value of 1 or 3 and then faulting. PFetch2[gcZCT,gcZCTWP,0], Call[gcRet]; PFetch1[gcZCT,gcTemp0,6], Call[gcRet]; LU _ gcZCTWP, Skip[R Even]; TrapParm _ ZCTWPUnevenTrap, CallX[gcBugTrap]; *Put markingDecrements flag in bit 17b of gcZCTWPhi (Since the displacement *for gcZCTWP references is always 0, right-half of gcZCTWPhi is unused). Stack&+1 _ MicVersion; T _ LdF[gcTemp0,17,1]; gcZCTWPhi _ (LSh[gcZCTWPhi,10]) + T, GoTo[gcTail]; %Argument is long pointer to ZCTObject. Store the current value of gcZCTWP/hi back into the ZCTObject, disable microcode, and exit. Timing = 18.5 + 26 cycles. % @DisableMicrocode: *Esc 142b T _ LSh[Stack&-1,10], At[Misc140d,2]; *Setup long pointer from [S]/[S-1] into LP/LPhi; gcZCT odd=MicDisabled. gcZCThi _ T, LoadPage[gcPage2], Call[gcZCTEven]; *Convert gcZCTWP base register into long pointer. gcZCTWPhi _ RSh[gcZCTWPhi,10]; PStore2[gcZCT,gcZCTWP,0], Call[gcRet]; *Indicate disabled. gcZCT _ 1C, GoTo[gcTail]; %Insert the NHP in LP/LPhi in the ZCT and set the inZCT bit in gcH0. Advance the write pointer gcZCTWP/gcZCTWPhi after the store; if this advances to the block blink, follow the block link, trapping if there are no more free blocks. Timing = 21 cycles beginning at gcOnZa. % gcCR0: gcH0 _ (gcH0) or T, Skip[ALU=0]; TrapParm _ BadCreateRefCount, CallX[gcBugTrap]; gcOnZb: LPhi _ RSh[LPHi,10]; gcOnZa: PStore2[gcZCTWP,LP,0]; gcOnZ: gcTemp0 _ HiA[3774]; T _ (gcTemp0) or (LoA[3774]); LU _ (LdF[gcZCTWP,5,13]) xor T; gcH0 _ (gcH0) or (100000C), GoTo[gcOnZ1,ALU=0]; *4 cycle hold here. Can't do gcZCTWP_ first because the mi after the *Return uses LP,,LPhi as a base register. LPhi _ LSh[LPhi,10]; *Carry into gcZCTWPhi is impossible here. gcZCTWP _ (gcZCTWP) + (2C), Return; *When the doubleword at an address whose low-order bits are 3774b has just *been written, the next doubleword is the block link. Advance to the next *ZCT block, preserving markingDecrements. Final block has block link = 0. gcOnZ1: LoadPage[gcPage2]; PFetch2[gcZCTWP,gcTemp0,2], GoToP[.+1]; OnPage[gcPage2]; UseCTask; T _ APCTask&APC, Task; gcRLink _ T; T _ LSh[gcTemp1,10]; LU _ (gcTemp0) or T; LPhi _ LSh[LPhi,10], Skip[ALU#0]; TrapParm _ ZCTFullTrap, CallX[gcP2Trap]; gcZCTWPhi _ (LdF[gcZCTWPhi,17,1]) + T, Task; T _ gcTemp0; APCTask&APC _ gcRLink; gcZCTWP _ T, Return; OnPage[gcPage]; gcFixLP: T _ Stack&-1, Skip[R Even]; TrapParm _ RefUneven, CallX[gcBugTrap]; LP _ T; *Fetch 1st normal header word PFetch1[LP,gcH0,0]; gcRecSetup: T _ 377C; LU _ (NStkP) xor T; T _ LdF[LP,0,3], Skip[ALU=0]; TrapParm _ NotMinimalStack, CallX[gcBugTrap]; gcZCT, LoadPage[gcPage2], GoTo[gcMDisabled,R Odd]; *Setup for possible FOSTable access later during dead time of PFetch1. *22-bit NHP handled as follows: * High-order 9 bits are the residue R. * Low-order bit is discarded because Refs are even => NHPs are even. * 12 low-order bits (excluding lowest) are xor'ed with R to get X. *Note that LPhi is not a full base register--right-half is 0. T _ (RSh[LPhi,5]) or T, GoToP[.+1]; *T _ R OnPage[gcPage2]; T _ (LdF[LP,3,14]) xor T; *T _ X gcTemp1 _ T, Return; *gcTemp1 _ X OnPage[gcPage]; gcMDisabled: Stack&+2, GoToP[.+1]; *Clear flag bit which @FreeObject leaves set in high pointer word. OnPage[gcPage2]; Stack _ (Stack) and not (100000C); TrapParm _ MicDisabled, CallX[gcP2Trap]; OnPage[gcPage]; gcMakeFOSRes: T _ LdF[LP,0,3]; T _ (RSh[LPhi,5]) or T, Return; gcPopToHiBase: T _ LSh[Stack&-1,10], Return; %Argument is Ref already checked for non-NIL and plausibility. Result is NIL, if no further action is necessary, or the original Ref, if it should be reclaimed or has finalization. NIL is returned if Ref-2 is already in the ZCT, or if refCount#0, or if maybeOnStack=1, or if Ref-2 or the WildCard is found in the FOSTable. Timing = 18.5 + 46 + ( 0 or 1 if decremented RefCnt > 0 else 3 if Ref already in ZCT else 27 if put in ZCT because maybeOnStack=1 else 46 if put in ZCT because found in FOSTable else 49 if put in ZCT because WildCard in FOSTable else 26 or 29 if Ref should be reclaimed ) % @ReclaimedRef: T _ LSh[Stack&-1,10], At[Misc140d,0]; *Esc 140b LPhi _ T; *Software has already checked the Ref for validity before this opcode *is executed. T _ (Stack&-1) - (2C); LP _ T, Skip[Carry]; LPhi _ (LPhi) - (400C), Call[gcRet]; *Fetch the first word of the normal header. PFetch1[LP,gcH0,0], Call[gcRecSetup]; *Timing to here = 18.5+23 cycles. LU _ LdF[gcH0,11,5]; LU _ Dispatch[gcH0,16,2], Skip[ALU=0]; *Skip if RefCnt=0 or 1 *Done if old refCnt .gr. 1, in which case new refCnt will be .gr. 0. gcH0 _ (gcH0) - (2C), GoTo[gcRRInZCT]; gcH0 _ (gcH0) - (2C), Disp[.+1]; *Illegal underflow (decrementing RefCnt that is already 0) TrapParm _ RCUnderBugTrap, CallX[gcBugTrap], DispTable[4]; *Underflow after previous overflow TrapParm _ RCUnderflowTrap, CallX[gcTrap]; *RefCount going to 0. Test maybeOnStack. LU _ (LdF[gcH0,1,1]) - 1, DblGoTo[gcRRTo0ButInZCT,gcRRGoing0,R<0]; *RefCount being decremented to 0 after previous overflow; like RefCnt>0. PStore1[LP,gcH0,0], GoTo[gcNILExit]; *Going to 0 but already in ZCT. gcRRTo0ButInZCT: PStore1[LP,gcH0,0], GoTo[gcNILExit]; *Timing to here = 18.5+31 cycles. *RefCount being decremented to 0, no previous overflow, not in ZCT. gcRRGoing0: T _ (gcTemp1) + (FOSTable), Skip[ALU<0]; LPhi _ RSh[LPhi,10], GoTo[gcRROnZ]; *maybeOnStack=1 *Look up NHP in FOSTable (4k words long). PFetch1[gcZCT,gcTemp0], Call[gcMakeFOSRes]; ***7 cycle wait here. LU _ (gcTemp0) xor T, Skip[R>=0]; LU _ (gcTemp0) xnor (0C); *Know WildCard=177777b LU _ LdF[gcH0,10,1], Skip[ALU#0]; LPhi _ RSh[LPhi,10], GoTo[gcRROnZ]; *FOS match or WildCard Stack&+2, Skip[ALU#0]; *Reclaim Ref => return Ref on stack PStore1[LP,gcH0,0], GoTo[gcTail]; *Trap for finalization. TrapParm _ RCFinalizeTrap, CallX[gcTrap]; gcRROnZ: PStore2[gcZCTWP,LP,0], Call[gcOnZ]; gcRRInZCT: PStore1[LP,gcH0,0]; gcNILExit: Stack&+1 _ 0C; *Entries from @AllocateObject, @ReclaimableRef gcZeroExit: LU _ NextInst[IBuf]; *Entry from @ReclaimableRef. gcZeroExitx: Stack&+1 _ 0C, NIRet; %Argument is an NHP. Set maybeOnStack_markingDecrements and put the NHP into the ZCT. Timing = 18.5 + 68 cycles % @CreateRef: T _ LSh[Stack&-1,10], At[Misc140d,3]; *Esc 143b LPhi _ T, Call[gcFixLP]; ***NHP.maybeOnStack _ ZCT.markingDecrements (actually assume existing value *is 0 and set value to 1 if markingDecrements=1). T _ LSh[gcZCTWPhi,16]; LU _ LdF[gcH0,11,7], Call[gcCR0]; PStore1[LP,gcH0,0], GoTo[gcTail]; %Argument is an NHP. If refCount#0 or inZCT=1 then return "continue"; elseif maybeOnStack=1 or FoundInFHSnapshot then OnZ & return "continue"; elseif f=1 then return "finalizeIt" else return "reclaimIt"; Timing = 18.5 + 29 + ( 0 if in ZCT already (no action--rare race case) 2 if decremented refCnt#0 else 39 if maybeOnStack=1 else 58 if found in FOSTable else 61 if WildCard in FOSTable else 23 or 26 to reclaimIt or finalizeIt return ) % @ReclaimableRef: T _ LSh[Stack&-1,10], At[Misc140d,4]; *Esc 144b LPhi _ T, Call[gcFixLP]; *Timing to here = 18.5 + 22 cycles. LU _ LdF[gcH0,11,7], GoTo[gcZeroExit,R<0]; LU _ (LdF[gcH0,1,1]) - 1, Skip[ALU=0]; LU _ NextInst[IBuf], CallX[gcZeroExitx]; *Know continue=0 *refCount=0 & inZCT=0 T _ (gcTemp1) + (FOSTable), Skip[ALU<0]; LPhi _ RSh[LPhi,10], GoTo[gcOnZx]; *maybeOnStack=1 PFetch1[gcZCT,gcTemp0], Call[gcMakeFOSRes]; *Timing to here = 18.5 + 22 + 21 cycles. LU _ (gcTemp0) xor T, Skip[R>=0]; LU _ (gcTemp0) xnor (0C); *WildCard=177777b T _ (LdF[gcH0,10,1]) + 1, Skip[ALU#0]; LPhi _ RSh[LPhi,10], GoTo[gcOnZx]; *FOS match or WildCard *Return reclaimIt=1 if f=0 else finalizeIt=2. gcPushT: LU _ NextInst[IBuf]; Stack&+1 _ T, NIRet; gcOnZx: PStore2[gcZCTWP,LP,0], Call[gcOnZ]; *Return continue = 0 PStore1[LP,gcH0,0], GoTo[gcZeroExit]; %Accepts: [S]/ type [S-1]/ requested size (EXCLUDING the two NHP words). Trap if the microcode is disabled. Else return NIL if requested Size .gr. MaxSmallBlockSize (=1076b). Else extract the 8-bit BSI (Block Size Index) from ZCT.sizeToBSI. Finally, fetch the pointer to the free list for objects of this BSI from ZCT.BSItoFreeList. If the free list is NIL, trap. Otherwise, the free list points at the NHP of a block and the Ref returned will be NHP+2. Fetch the long pointer to the next free list object from [NHP+2]^ and store it into the free list header. Zero the two words which contained that long pointer (which leaves the object entirely zero). The maybeOnStack, F, RefCnt, RCOverflowed fields of a free list object are known to contain 0, and the body of the object is zero except for the two-word link to the next free list object; F cannot be initialized to 1 here when the Ref will have finalization because the Ref must be put in other data structures first. Timing: 18.5 + 137 cycles (assumes a reference begins the next opcode). % *Trap if microcode is disabled. gcAlloc1: T _ (RSh[Stack,1]) + T; gcZCT, Skip[R Even]; TrapParm _ MicDisabled, CallX[gcTrap]; PFetch1[gcZCT,gcTemp2]; LP _ HiA[MaxSmallBlockSize]; T _ (LP) or (LoA[MaxSmallBlockSize]), Return; @AllocateObject: *Esc 145b T _ LdF[Stack&-1,2,16], At[Misc140d,5]; *Save type in 2nd word of gcH0,,gcH1 pair for later. gcH1 _ T; *Displacement of BSI for this size block = SizeToBSI+(Size/2) because *the 6-bit BSIs are stored in 8-bit bytes. T _ SizeToBSI, Call[gcAlloc1]; *Since block size is a cardinal, sign bit test is ok. LU _ (Stack) - T - 1; T _ BSItoFreeList, Skip[ALU<0]; Stack _ 0C, GoTo[gcZeroExit]; *Save displacement of free list = BSItoFreeList+(BSI*2) in gcTemp2. LU _ Stack&-1, Skip[R Even]; gcTemp2 _ RHMask[gcTemp2], Skip; gcTemp2 _ RSh[gcTemp2,10]; gcTemp2 _ T _ (LSh[gcTemp2,1]) + T, Task; *Fetch free list long pointer into LP,,LPhi and trap if it is NIL. PFetch2[gcZCT,LP]; *A full base register is needed in case word 2 of the object is in the *next 64k storage bank. T _ LPhi _ LSh[LPhi,10]; LPhi _ (RSh[LPhi,10]) + T + 1; LU _ (LP) or T; Skip[ALU#0]; TrapParm _ NormalFreeListEmpty, CallX[gcTrap]; *Since the header, first two data words, and two ZCT words may be on *different pages and get page faults, both doublewords must be referenced *and gcOnZ called before smashing either storage or stack arguments. *Fetch [NHP+0]^. Timing to here 18.5+51 cycles. PFetch1[LP,gcH0,0], Call[gcRet]; *Fetch the next free list link = double word at [NHP+2]^ PFetch2[LP,gcLPR,2], Call[gcOnZb]; *Store the new free list pointer. Timing to here = 18.5+51+34 T _ gcTemp2, LoadPage[gcPage2]; PStore2[gcZCT,gcLPR], GoToP[.+1]; **Now committed--must not page fault now. *Push NHP+2 = the Ref onto stack. OnPage[gcPage2]; T _ (LP) + (2C); Stack&+1 _ T, Skip[Carry']; T _ (RSh[LPhi,10]) + 1, Skip; T _ RSh[LPhi,10]; Stack&+1 _ T, Task; *maybeOnStack_MarkingDecrements in header. T _ LSh[gcZCTWPhi,16]; gcH0 _ (gcH0) or T, LoadPage[gcPage]; PStore2[LP,gcH0,0], GoToP[.+1]; *Store NHP OnPage[gcPage]; T _ RSh[LPhi,10]; LPhi _ (LPhi) + T + 1; *Zero the two words which contained the free list link. T _ gcTemp0 _ 0C, Call[gcTtoTemp1]; PStore2[LP,gcTemp0,2], GoTo[gcTail]; %@FreeObject traps with parameter 2 if microcode is disabled; otherwise, it accepts an NHP on the stack and fetches 1st header word to obtain the BSI. It returns false=0 if the BSI for the block is 77b (BSIEscape). In all other cases, the object size (INCLUDING the 2 header words) is obtained from the BSItoSize table in the ZCT. Then the object is zeroed except for the first 4 words; the 2nd header word is zeroed. Finally, the current free list pointer for the object's BSI is fetched from the ZCT, put in words 2 and 3 of the object, and the NHP is put in the free list cell. Since up to 1074b words may have to be zeroed, it seems necessary to retain state during block zeroing for restart from page faults and interrupts. Since other processes could conceivably run and change the free list before @FreeObject completes, the free list update must be atomic, must be done only once, and must be done after block zeroing to prevent a race with @AllocateObject. Note that the register assignments are setup so that gcH1, gcTemp0, gcTemp1, and gcTemp2 can be used for quadword block-zeroing via an unaligned PStore4. Then the header and 1st few data words can be fixed with either a PStore4 (using gcH0, gcH1, gcTemp0, and gcTemp1) or a PStore2, depending on the quadword alignment. This works as follows: Initially both header alignment and block length are known to be even, and block size including the header is known to be at least 4 words. If there are 6 or more words left, zero the odd doubleword at the end of the block, if any. Otherwise or afterwards, zero quadwords until only 4 or 6 words are left. Finally, finish up with a PFetch2 of the free list link and two PStore2/4's in the most efficient order. Timing: 18.5 + 32 if BSI .eq. BSIEscape else 18.5 + 91 + (14 if block doesn't end at a quadword boundary) + (16 if block doesn't begin at a quadword boundary) + 14x(block size/4) cycles. % @FreeObject: *Esc 146b T _ LSh[Stack&-1,10], At[Misc140d,6]; LPhi _ T; *Setup complete LP,,LPhi base register for block zeroing; fetch 1st header *word into gcH0, and check for microcode disabled. LPhi _ (RSh[LPhi,10]) + T + 1, Call[gcFixLP]; T _ BSIEscape; *Time to here = 18.5+24 cycles LU _ (LdF[gcH0,2,6]) xor T; Stack&+2, GoTo[.+3,ALU#0]; LU _ NextInst[IBuf]; Stack&-1 _ 0C, NIRet; *Prepare for interrupt or page fault during block zeroing by setting the *sign bit of the NHP on the stack as a flag meaning "continue" and putting *the displacement to the next double-word to zero above the stack args. *Go if continuing from interrupt or page fault. Stack, T _ BSItoSize, Skip[R>=0]; *Continue from interrupt or page fault Stack&+1, GoTo[gcFreeContinue]; *Fetch cardinal size (including 2 header words) of blocks at this BSI; this *size will be at least 4 words, and the block begins at an even word. T _ (LdF[gcH0,2,6]) + T; PFetch1[gcZCT,Stack]; gcFreeContinue: T _ gcH1 _ 0C, Task; gcTemp0 _ T; gcTemp2 _ T, Call[gcTtoTemp1]; *At least 4 words are in the block initially (including the header), and at *least 4 remain after resumption from an interrupt or fault. If the final *2 words are quadaligned, they must be zeroed with a PStore2 before the *quadword inner loop can be entered. If the block is exactly 4 words long *in this case, the inner loop cannot be entered because its exit assumes *4 or 6 words are left, but only 2 will be left after the PStore2. T _ LdF[LP,16,1]; *Timing to here = 18.5+24+23 cycles LU _ (LdF[Stack,16,1]) xor T, Call[gcFEndChk]; *Return here to zero next quadword knowing that LP+Stack is quadword *aligned and that at least 4 words are left in the block (including the *header). Have remaining word count in Stack. LU _ (Stack) - (7C); T _ (Stack) - (4C), GoTo[gcFNoQ,ALU<0]; *At least 10b words remain in block--zero 4 of them. PStore4[LP,gcH1], NonQuadOK, GoTo[gcFInt,IntPending]; Nop; *Page faults will happen before this mi can complete. Stack _ (Stack) - (4C), Return; *If 4 words are left, finish with PF2/PS4; if 6 are left, with PF2/PS4/PS1. gcFNoQ: T _ BSItoFreeList, GoTo[gcFFinal4,ALU=0]; *The Finalize bit is known to be 0, so fetch a field that is 1 bit larger *than the BSI to effect x2. T _ (LdF[gcH0,2,7]) + T; *T _ 2*BSI + BSItoFreeList *Fetch the free list link for this BSI into the next two registers. PFetch2[gcZCT,gcH0], Call[gcRet]; PStore4[LP,gcH0,2], Call[gcRet]; *Clear the type in the 2nd header word; 1st header word not modified. gcFFinish1: PStore1[LP,RZero,1]; gcFFinish: LPhi _ RSh[LPhi,10], Call[gcRet]; PStore2[gcZCT,LP]; LU _ NextInst[IBuf]; Stack&-2 _ 1C, NIRet; *Return true=1 gcFFinal4: T _ (LdF[gcH0,2,7]) + T; *T _ 2*BSI + BSItoFreeList *Fetch the free list link for this BSI into the next two registers. PFetch2[gcZCT,gcTemp0], Call[gcRet]; PStore4[LP,gcH0,0], GoTo[gcFFinish]; gcFInt: Stack&-1, LoadPage[opPage0]; T _ 2C, GoToP[NOPint]; gcFEndChk: LU _ (Stack) - (5C), GoTo[gcFEndOK,ALU=0]; *Double-word at end of block. T _ (Stack) - (2C), GoTo[gcFEndChk1,ALU<0]; *Zero non-quadword-aligned doubleword at end of block PStore2[LP,gcTemp0]; *Delay so that Stack_ below won't happen until PStore2 can't fault. Nop; Nop; Stack&-1, GoTo[gcFEndOK1]; *4-word block crossing quadword boundary special case. gcFEndChk1: T _ BSIToFreeList; T _ (LdF[gcH0,2,7]) + T; *T _ 2*BSI + BSItoFreeList PFetch2[gcZCT,gcTemp0], Call[gcRet]; PStore2[LP,gcTemp0,2], GoTo[gcFFinish1]; *Now have complete quadword at end of block gcFEndOK: T _ Stack&-1; gcFEndOK1: Stack _ (Stack) or (100000C); Stack&+1 _ T, Return; gcTtoTemp1: gcTemp1 _ T, Return; %Obsolete PStore2 block zeroing--has bug on page fault of PS2. Timing = 18.5 + 117 + 17x(no. words zeroed/2) cycles @FreeObject: *Esc 146b T _ LSh[Stack&-1,10], At[Misc140d,6]; LPhi _ T; *Complete LP,,LPhi base register setup (need full base register for *block zeroing below), fetch 1st header word into gcH0, and check for *microcode disabled. LPhi _ (RSh[LPhi,10]) + T + 1, Call[gcFixLP]; T _ BSIEscape; *Time to here = 18.5+24 cycles LU _ (LdF[gcH0,2,6]) xor T; Stack&+2, GoTo[.+3,ALU#0]; LU _ NextInst[IBuf]; Stack&-1 _ 0C, NIRet; *Prepare for interrupt or page fault during block zeroing by setting the *sign bit of the NHP on the stack as a flag meaning "continue" and putting *the displacement to the next double-word to zero above the stack args. *Go if continuing from interrupt or page fault. Stack, T _ BSItoSize, GoTo[gcFreeContinue,R<0]; *Fetch cardinal size (including 2 header words) of blocks at this BSI; this *size will be at least 4 words, and the block begins at an even word. T _ (LdF[gcH0,2,6]) + T; PFetch1[gcZCT,Stack]; T _ gcH1 _ 0C, Call[gcFreeInitSub]; LU _ Stack&-1; *Don't set flag until after page fault on fetch of size is impossible. Stack _ (Stack) or (100000C), GoTo[gcFreeContinue1]; gcFreeContinue: *Continue from interrupt or page fault T _ gcH1 _ 0C, Call[gcFreeInitSub]; gcFreeContinue1: Stack&+1, Call[.+2]; *Block zeroing loop returns here. PStore2[LP,gcTemp0], GoTo[gcFreeInt,IntPending]; LU _ (Stack) - (4C); T _ Stack _ (Stack) - (2C), Skip[ALU=0]; gcRet: Return; *Loop *Fetch the free list link for this BSI into the next two registers. T _ BSItoFreeList; T _ (LdF[gcH0,2,7]) + T; *T _ 2*BSI + BSItoFreeList *Timing to here: 18.5+24+38+17*(no. words zeroed/2) cycles PFetch2[gcZCT,gcTemp0], Call[gcRet]; *Clear the type in the 2nd header word; 1st header word not modified. PStore1[LP,RZero,1], Call[gcRet]; PStore2[LP,gcTemp0,2], Task; LPhi _ RSh[LPhi,10]; PStore2[gcZCT,LP]; LU _ NextInst[IBuf]; Stack&-2 _ 1C, NIRet; *Return true=1 gcFreeInitSub: gcTemp0 _ T; gcTemp2 _ T; gcTtoTemp1: gcTemp1 _ T, Return; gcFreeInt: Stack&-1, LoadPage[opPage0]; T _ 2C, GoToP[NOPint]; *T _ no. bytes by which to backup PC % %Replace the Ref at [S]..[S-1] by its canonical type from the low-order 14d bits of (Ref-1)^. The two high bits of the word containing the type must be zeroed now. After Cedar 5.0 is in general use, then the full word can be used for the type. Timing = 18.5 + (10 if NIL else 28) cycles. % gcCRefType: T _ LSh[Stack&-1,10]; *Esc 72b LU _ (Stack) or T, LoadPage[gcPage2]; LPhi _ T, GoToP[gcGRT2,ALU=0]; *If Ref=NIL, return type=0. OnPage[gcPage2]; T _ (Stack&-1) - 1; LP _ T, Skip[Carry]; LPhi _ (LPhi) - (400C), Call[gcRet2]; PFetch1[LP,Stack,0], Call[gcRet2]; gcGRT2: T _ LdF[Stack&-1,2,16]; LU _ NextInst[IBuf]; Stack&+1 _ T, NIRet; %AssignRef[refNew,ptrRef] returns nothing; implements (([S],,[S-1])+alpha)^ _ ([S-2],,[S-3]). RefCnt for the previous Ref in that location (refOld) is decremented and that for the new Ref (refNew) is incremented (but no update for Ref=NIL or for refOld=refNew). When decrementing RefCnt for refOld, maybeOnStack will be set to 1 if a collection is in progress, and Ref-2 will be entered in the ZCT if necessary. Code is unsafe against write protect faults. Timing (including buffer refill): 4.5 + 53 if refOld=refNew else 4.5 + 63 + [25 if PStore2 crosses quadword boundary] + [34 if refNew#NIL] + [(31 or 32) + ( 0 if refOld-2 not put in ZCT because refCnt>0 else 2 if refCnt goes to 0 but rcOverflowed=1 1 if already in ZCT else 24 if must put in ZCT) if refOld#NIL] Register usage: RTemp/ opcode number for possible trap. MNBR/ alpha gcLPR,hi/ pointer to double-word from which the old Ref will be fetched and into which the new Ref will be stored; the double-word is not necessarily on an even word boundary and might cross 64k word boundaries. LP,hi/ long pointer containing refOld; later base register to refOld-2. gcNNHP,hi/ base register to refNew-2 gcH0/ contents of 1st normal header word for refOld gcH1/ contents of 1st normal header word for refNew gcTemp0/ temporary for gcOnZ gcTemp1/ temporary for gcOnZ gcRLink/ temporary for gcOnZ % OnPage[gcPage]; gcARrefOldFix: LU _ LdF[gcH0,11,5], Skip[ALU#0]; *maybeOnStack_markingDecrements gcH0 _ (gcH0) and not (40000C), DblGoTo[.+2,.+3,ALU#0]; gcH0 _ (gcH0) or T, Skip[ALU=0]; gcH0 _ (gcH0) - (2C), Return; *Decremented refCnt > 0. *refCnt is 0 or 1; dispatch on low bit of refCnt and RCOverflowed bit. LU _ Dispatch[gcH0,16,2], GoTo[gcARrefOldFix1,R<0]; gcH0 _ (gcH0) - (2C), Disp[.+1]; *refCnt is 0 and never overflowed--error. TrapParm _ RCUnderBugTrap, CallX[gcBugTrap], DispTable[4]; *refCnt is 0 and previously overflowed--trap. TrapParm _ RCUnderflowTrap, CallX[gcTrap]; *refCnt going to 0--put NHP in ZCT. LPhi _ RSh[LPhi,10], GoTo[gcOnZa]; *refCnt going to 0 after previous overflow--same as refCnt>0. Return; gcARrefOldFix1: gcH0 _ (gcH0) - (2C), Disp[.+1]; *refCnt being decremented is 0 and never overflowed--error. TrapParm _ RCUnderBugTrap, CallX[gcBugTrap], DispTable[4]; *refCnt being decremented is 0 and previously overflowed--trap. TrapParm _ RCUnderflowTrap, CallX[gcTrap]; *refCnt going to 0, but already in ZCT. gcRet: Return; *refCnt going to 0 after previous overflow--same as refCnt>0. Return; @AssignRef: LU _ LdF[gcZCT,16,1], GoTo[gcARNo,R Odd], Opcode[76]; gcARGo: T _ LdF[Stack&-1,12,6]; gcLPRhi _ T, Task; gcLPRhi _ (LSh[gcLPRhi,10]) + T + 1; T _ Stack&-1, LoadPage[gcPage]; gcLPR _ T, GoToP[.+1]; *Have ptrRef in base register format in gcLPR,,gcLPRhi. *Fetch refOld = (ptrRef+alpha)^ into LP/LPhi. OnPage[gcPage]; T _ (MNBR _ NextData[IBuf]) + 1; PFetch1[gcLPR,LPhi]; RTemp _ 76C, Task; T _ MNBR; PFetch1[gcLPR,LP], Call[gcPopToHiBase]; *Put refNew into gcNNHP,,gcNNHPhi base register while simultaneously *checking for refNew=NIL. gcNNHPhi _ T; *Timing = 4.5 + 28 cycles to here. T _ (Stack&+1) - (2C), GoTo[gcAR1a,ALU#0]; gcNNHP _ T, Skip[Carry']; PFetch1[gcNNHP,gcH1,0], GoTo[gcAR1b]; *refNew is NIL. LPhi _ LSh[LPhi,10]; LP _ (LP) - (2C), GoTo[gcAR2]; *refNew can't be NIL. gcAR1a: gcNNHP _ T, Skip[Carry]; gcNNHPhi _ (gcNNHPhi) - (400C), Call[gcRet]; *refNew is non-NIL; have refNew-2 base register in gcNNHP/gcNNHPhi. PFetch1[gcNNHP,gcH1,0]; gcAR1b: T _ LPhi _ LSh[LPhi,10]; LU _ (gcNNHPhi) xor T; T _ (gcNNHP) + (2C), Skip[ALU=0]; T _ 77C, GoTo[gcAR1d]; LU _ (LP) xor T; T _ 77C, Skip[ALU#0]; Stack&-2, GoTo[gcTail]; *refNew=refOld => NOOP *refNew # refOld (2 cycle hold here usually) gcAR1d: LU _ (LdF[gcH1,11,6]) xor T; LP _ (LP) - (2C), Skip[ALU#0]; *Trap if refNew's refCnt will overflow. TrapParm _ RCOverflowTrap, CallX[gcTrap]; *Timing = 4.5 + 36 + (17 if refNew#NIL). gcAR2: gcH1 _ (gcH1) + (2C), GoTo[gcAR3,Carry]; LPhi _ (LPhi) - (400C); T _ LSh[Stack&-1,10], GoTo[gcAR5,Carry']; Stack&+1; gcAR3: PFetch1[LP,gcH0,0], Call[gcRet]; *Position markingDecrements flag over maybeOnStack in T bit 1. T _ LSh[gcZCTWPhi,16], Call[gcARrefOldFix]; *Finally, all results have been computed; refOld-2 has been entered in *the ZCT if refOld#NIL and decremented RefCnt=0; gcH1 holds the updated *header word at (refNew-2)^ if refNew#NIL; gcH0 holds the updated header *word at (refOld-2)^ if refOld#NIL; StkP points at refNew on the Stack. PStore1[LP,gcH0,0], Call[gcPopToHiBase]; gcAR5: LU _ (Stack&+1) or T; T _ MNBR, GoTo[.+3,ALU=0]; *refNew#NIL header update; clear maybeOnStack in refNew's header. gcH1 _ (gcH1) and not (40000C); PStore1[gcNNHP,gcH1,0], Call[gcRet]; *Timing to here: 4.5 + 46 + (34 if refNew#NIL) *+ [(31 or 32) + * ( 0 if not put in ZCT because refCnt>0 else * 2 if not put in ZCT because refCnt previously overflowed else * 5 if refOld already in ZCT else * 24 if must put in ZCT) * if refOld#NIL] PStore2[gcLPR,Stack]; GoTo[gcTail,QuadOvf']; Stack&+2; T _ (MNBR) + 1; PStore1[gcLPR,Stack], Task; T _ MNBR; PStore1[gcLPR,Stack]; gcTail: LU _ NextInst[IBuf], CallX[gcTailx]; gcTailx: NIRet; *AssignRefNew now same as AssignRef--will deimplement. @AssignRefNew: LU _ LdF[gcZCT,16,1], GoTo[gcARGo,R Even], Opcode[77]; gcARNo: RTemp _ 76C, Skip[ALU#0]; *Cedar 5 mode, but microcode disabled. TrapParm _ MicDisabled, GoTo[UndefTrap]; *Cedar 4 mode trap. TrapParm _ 0C, GoTo[UndefTrap]; :END[Cedar];e24(1792)\f2