:IF[AltoMode];
	TITLE[MesaJ.Alto.Mode];
:ELSEIF[CedarMode];
	TITLE[MesaJ.Cedar.Mode];
:ELSE;
	TITLE[MesaJ.Pilot.Mode];
:ENDIF;

%
Edit by Fiala 1 February 1982: Change MesaRefill to avoid race against
	page fault; avoid bypass kludge after various PFetch4's which cause
	trouble when error correction transport advances page faults.
Edit by Fiala 12 June 1981: Fixed @JB and JB problem analogous to J2/J3;
	faster MUL and DIV algorithms; MesaRefillLoc.
Edit by Fiala 23 April 1981: save 1 mi in MesaRefill.
Edit by Neely April 2, 1981 5:57 PM Create a J2: and J3: separate from
	@J2: and @J3: because can not call or goto an opcode because of
	possible page faults in the instruction stream.
Edit by Fiala 30 March 1981 Fix harmless bug in JW saving 3 mi.
Edit by Fiala 5 March 1981: Fix JB 0 and JB 1; TrapParm_0 at UnImpP6.
Edit by Fiala February 12, 1981 9:55 AM Fix UnImpP6 for Pilot; implement
	Cedar undefined opcode trap change; tiny improvement to DIV.
Edit by Jim Frandeen January 5, 1981 4:33 PM: Delete CallX Macro
	definition for new D0Lang.
Edit by Johnsson September 29, 1980 8:46 AM; trap fixup
Edit by Fiala August 25, 1980 6:31 PM Incorporate Alto Mesa improvements
	since September 1979: bum out 25b mi net; speed up AND, OR, XOR, Jx,
	JEQx, JNEx, JIB, JIW, DIV, LDIV, SHIFT, DCOMP, and DUCOMP; remove
	range restrictions on two byte jumps (-128 to +127 now legal).
Edit by Johnsson July 28, 1980 5:38 PM: trap fixup
Edit by Jim Frandeen January 22, 1980 11:24 AM: cleanup for D0Lang Version 6.
Edit by Jim Frandeen December 22, 1979 9:43 PM: cleanup for D0Lang Version 5.

	MESA JUMP AND ARITHMETIC INSTRUCTIONS

PCB,,PCBhi is a base register pointing to the current instruction quadword.
PCB[14:15] are 0, and the low 3 bits of the PC (which point to a byte within
the quadword) are kept in PCF.
Since code segments cannot cross 64K boundaries and are limited to
32K words in length, the two bytes of PCBhi are forced to be equal, rather
than having the least significant byte differ from the msb by 1 as is the
normal case for base registers.

Refill occurs when, at the onset of a NextInst or NextData, PCF contains a
value greater than 7b. In this case the mi is aborted and the trap mi at
location 0, 'LoadPage[0], GoTo[377]', is executed, sending control to
location 377 on the page that caused refill. MesaRefill6 below is the mi for
page 6; identical mi exist on other pages from which refill might occur.

Refill timing is as follows: The aborted mi, trap mi at 0, and PFetch4 use
6 cycles; in Pilot mode, memory wait uses 12 more, totalling 18 cycles; in
Alto mode, memory wait and byte swapping add 20, totalling 26 cycles. This
time is distributed over the 8 bytecodes in a quadword; any additional
refills caused by jumps are charged to the jump bytecodes themselves. Hence,
Pilot mode refill accounts for 2.25 and Alto mode for 3.25 cycles/non-jump
bytecode.
%

*NOTE: MicroD almost can't place page 6 due to shortage of branch condition
*target pairs. This occurs as follows: of the 128 branch target pairs,
*opcode entries use 64; opcode entries which do a call use 13 more, leaving
*only 51 for everything else, which is barely enough.

MesaRefill:	*Called from many places
	PCF _ RZero, At[MesaRefillLoc];
:IF[AltoMode]; ****************************************
	PCB _ (PCB) + (4C);
SwapBytes:	*Called from MesaX
	IBuf _ LCy[IBuf,10];
SwapBytes1:	*Called from MesaX
	IBuf1 _ LCy[IBuf1,10];
	IBuf2 _ LCy[IBuf2,10];
	IBuf3 _ LCy[IBuf3,10], Return;
:ELSE; ************************************************
*Hold page fault on page 0 and do not advance PCB until a fault is impossible;
*otherwise, there is a race between the fault and the PCB_PCB+4; this race
*is particularly bad if error correction happens.
	Nop;
	Nop;
	PCB _ (PCB) + (4C), Return;
%
	xBuf _ T;
	Nop;
	UseCTask;
	T _ APCTask&APC, LoadPage[opPage2];
	xBuf1 _ T;
OnPage[opPage2];
	T _ (SStkP&NStkP) xor (377C), Task;
	xBuf2 _ IP[FFault]C;
	StkP _ xBuf2, xBuf2 _ T, NoRegILockOK, Call[.+1];
	LU _ (Stack) and (40000C);
	T _ xBuf, Skip[ALU=0];
	Return;
	StkP _ xBuf2;
	APCTask&APC _ xBuf1;
	PCB _ (PCB) + (4C), Return;
%
:ENDIF; ***********************************************

MesaRefill6:
	PFetch4[PCB,IBuf,4], GoToP[MesaRefill], At[3377];	*Page 6 refill

%NOTE: after any PFetch4 on page 6 faults, the fault handler will resume
after filling IBuf with 377b bytes; this means that it is inadvisable to use
the bypass kludge after the PFetch4 and must remain on page 6. If the bypass
kludge were used, the transport for a preceding PFetch4, such as the one in
RDC.Mc, could experience error correction and advance the time of a page
fault so that the mi containing the bypass kludge was aborted, and this
would execute incorrectly.
%
JnComO:	PFetch4[PCB,IBuf];
JnComO1:
	Nop;
	T _ PCB _ (PCB) + T;
	PCB _ (LSh[PCB,1]) + 1;
JnFin:	PCF _ PCB, PCB _ T, NoRegILockOK;	*Only low 3 bits of PCF loaded
:IF[AltoMode]; ****************************************
JSwapx:	IBuf _ LCy[IBuf,10];
	IBuf1 _ LCy[IBuf1,10];
	IBuf2 _ LCy[IBuf2,10];
	IBuf3 _ LCy[IBuf3,10];
P6Tail:	LU _ NextInst[IBuf];	*Odd; paired with JJmp
:ELSE; ************************************************
	Nop;	**NextInst is illegal in the mi after PCF_.
JSwapx:
P6Tail:	LU _ NextInst[IBuf];	*Odd; paired with JJmp
:ENDIF; ***********************************************
P6Tailx:
	PCB _ (PCB) and not (3C), NIRet;

JnComZ:	PFetch4[PCB,IBuf];
	Nop;
	T _ PCB _ (PCB) + T;
	PCB _ LSh[PCB,1], GoTo[JnFin];

JJmp:	T _ (PCF.word) + T, DblGoTo[JnComO,JnComZ,R Odd];	*Even; paired with P6Tail

JnEe:	T _ (PCF.word) + T, GoTo[JnComO];
JnEo:	T _ (PCF.word) + T + 1, GoTo[JnComZ];

%Jn, n=2-8.
PCF points to the byte beyond the opcode when execution starts
(i.e., 1<=PCF<=10b), so if PCF is odd, the opcode is the even byte of the
current word, else the odd byte of the previous word. PCX is loaded from PCF
at T2 of the first mi executed. The word displacement of the target from
PCF[0:2] and the final lsb of the PC are:
	n	PCF	PCF
		even	odd
	2	0,1	1,0
	3	1,0	1,1
	4	1,1	2,0
	5	2,0	2,1
	6	2,1	3,0
	7	3,0	3,1
	8	3,1	4,0
	9	4,0	4,1
%

*Note: cannot use @J2, @J3, or @JB because, after page faults,
*PCF and PCX are wrong for continuation.
J2:	SkipData, CallX[P6Tail];
J3:	SkipData, CallX[J2];
JB:	T _ NextData[IBuf], CallX[JBr];	*Paired with Enojmp

@J2:	SkipData, CallX[P6Tail], Opcode[200];	*Time = 6+(18*(2/8)) = 10.50
@J3:	SkipData, CallX[J2], Opcode[201];	*Time = 8+(18*(3/8)) = 14.75
@J4:	SkipData, CallX[J3], Opcode[202];	*Time =10+(18*(4/8)) = 19.00
@J5:	T _ 2C, GoTo[JJmp], Opcode[203];	*Time =22 to 23
@J6:	T _ 2C, Cycle&PCXF, DblGoTo[JnEo,JnEe,R Odd], Opcode[204];
@J7:	T _ 3C, GoTo[JJmp], Opcode[205];
@J8:	T _ 3C, Cycle&PCXF, DblGoTo[JnEo,JnEe,R Odd], Opcode[206];
@J9:	PCB _ (PCB) + (4C), Opcode[207];	**NOT PFetch4 (bypass kludge)
	PFetch4[PCB,IBuf,0], GoTo[JSwapx];	*PCF unchanged (Time = 21.50)

%Jump Byte: alpha is a signed displacement from the opcode.
AllOnes is used as a temporary, restored when done.
Note that RH[PCBhi] = LH[PCBhi] since code can't cross 64k boundary.
%
@JB:	T _ NextData[IBuf], Opcode[210];
JBr:	T _ (PCFReg) + T, Skip[H2Bit8'];
	PCB _ (PCB) - (200C);	*Offset by 400b bytes if negative displacement
:IF[AltoMode]; ****************************************
*T has displacement relative to LAST byte of opcode.
**JB 0 is a problem here with PCF .eq. 0.
JBx:	AllOnes _ (AllOnes) + T;
:ELSE; ************************************************
*T has signed BYTE displacement relative to FIRST byte of opcode.
**JB 0 with PCF .eq. 0 or 1 and JB 1 with PCF .eq. 0 are problems.
	AllOnes _ (Form-2[AllOnes]) + T;
:ENDIF; ***********************************************
JBy:	T _ RSh[AllOnes,1], Skip[R>=0];
	T _ (LSh[R400,7]) or T;
*Fault handler fills IBuf with 377b bytes and resumes if this PFetch4 faults.
	PFetch4[PCB,IBuf];
	PCF _ AllOnes;
	PCB _ (PCB) + T;
	AllOnes _ (Zero) - 1, GoTo[JSwapx];

*Jump Word: alpha,,beta is a 2's complement displacement.
:IF[AltoMode]; ****************************************
@JW:	Cycle&PCXF, Skip[R Even], Opcode[211];
	CSkipData;
	T _ NextData[IBuf];	*get beta
	LU _ CycleControl _ NextData[IBuf];	*get alpha
	T _ (LHMask[Cycle&PCXF]) or T;	*CycleControl is in bits 0:7
*T has 2's complement displacement relative to LAST byte of the opcode
JWx:	T _ (PCFReg) + T, GoTo[JBx];
:ELSE; ************************************************
@JW:	LU _ CycleControl _ NextData[IBuf], Opcode[211];	*get alpha
	T _ NextData[IBuf];	*get beta
	T _ (LHMask[Cycle&PCXF]) or T;	*CycleControl is in bits 0:7
*T has 2's complement displacement relative to the FIRST byte of the opcode
JWx:	T _ (PCFReg) + T;
	AllOnes _ (Form-4[AllOnes]) + T + 1, GoTo[JBy];	*-4+T+1 = T-3
:ENDIF; ***********************************************

stkdif:	LU _ (Stack&-1) - T, Return;

Ejmp:	T _ NextData[IBuf], CallX[JBr];	*Even; paired with J2
Enojmp:	SkipData, CallX[P6Tail];	*Even; paired with JB

JNEjO:	T _ (Zero) + T + 1, GoTo[JnComZ];	*Odd
JNEjE:	PFetch4[PCB,IBuf], GoTo[JnComO1];	*Even

*APC[13:14] contains 1, 2, or 3 for JEQ2, JEQ4, or JEQ6, respectively,
*as left by the last NIRet.
JEQev:	LU _ (Stack&-1) - T, UseCTask;
	T _ (LdF[APCTask&APC,13,2]) - 1, Skip[ALU=0];
P6Taila:
	LU _ NextInst[IBuf], Call[P6Tailx];	*Odd
JJmpa:	T _ (PCF.word) + T, DblGoTo[JNEjO,JNEjE,R Odd];	*Even

JEQod:	LU _ (Stack&-1) - T, UseCTask;
	T _ LdF[APCTask&APC,13,2], DblGoTo[JJmp,P6Tail,ALU=0];

*Jump Equal n, n=2..9:
@JEQ2:	T _ Stack&-1, CallX[JEQev], Opcode[212];	*CIA=3051b
@JEQ3:	T _ Stack&-1, CallX[JEQod], Opcode[213];	*CIA=3055b
@JEQ4:	T _ Stack&-1, CallX[JEQev], Opcode[214];
@JEQ5:	T _ Stack&-1, CallX[JEQod], Opcode[215];
@JEQ6:	T _ Stack&-1, CallX[JEQev], Opcode[216];
@JEQ7:	T _ Stack&-1, CallX[JEQod], Opcode[217];
@JEQ8:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[220];
	T _ 3C, DblGoTo[JJmpa,P6Taila,ALU=0];
@JEQ9:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[221];
	T _ 4C, DblGoTo[JJmp,P6Tail,ALU=0];

*Jump Equal Byte
@JEQB:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[222];
JEQBx:	DblGoTo[Ejmp,J2,ALU=0];

*APC[12:14] contains 1,2,2,3,3,4,4,5 for JNE2 to JNE9.
JNEev:	LU _ (Stack&-1) - T, UseCTask;
	T _ (LdF[APCTask&APC,12,3]) - 1, Skip[ALU#0];
	LU _ NextInst[IBuf], Call[P6Tailx];
	T _ (PCF.word) + T, DblGoTo[JNEjO,JNEjE,R Odd];	*Odd

JNEod:	LU _ (Stack&-1) - T, UseCTask;
	T _ (LdF[APCTask&APC,12,3]) - 1, Skip[ALU#0];
	LU _ NextInst[IBuf], Call[P6Tailx];	*Even
	T _ (PCF.word) + T, DblGoTo[JnComO,JnComZ,R Odd];	*Odd

*Jump Not Equal n, n=2..9:
@JNE2:	T _ Stack&-1, CallX[JNEev], Opcode[223];	*CIA=3115b
@JNE3:	T _ Stack&-1, CallX[JNEod], Opcode[224];
@JNE4:	T _ Stack&-1, CallX[JNEev], Opcode[225];
@JNE5:	T _ Stack&-1, CallX[JNEod], Opcode[226];
@JNE6:	T _ Stack&-1, CallX[JNEev], Opcode[227];
@JNE7:	T _ Stack&-1, CallX[JNEod], Opcode[230];
@JNE8:	T _ Stack&-1, CallX[JNEev], Opcode[231];
@JNE9:	T _ Stack&-1, CallX[JNEod], Opcode[232];

*Jump Not Equal Byte
@JNEB:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[233];
JNEBx:	DblGoTo[JB,Enojmp,ALU#0];

JLBpos:	DblGoTo[J2,Ejmp,Ovf'];	*Even
JLBneg:	DblGoTo[JB,Enojmp,Ovf'];	*Odd
JGEBpos:
	DblGoTo[JB,Enojmp,Ovf'];	*Even
JGEBneg:
	DblGoTo[J2,Ejmp,Ovf'];	*Odd

stksw:	T _ Stack&+1, Return;

*Jump Less Byte - jump if (TOS-1) < TOS
@JLB:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[234];
JLBx:	FreezeResult, DblGoTo[JLBpos,JLBneg,ALU>=0];

*Jump Greater Equal Byte
@JGEB:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[235];
JGEBx:	FreezeResult, DblGoTo[JGEBpos,JGEBneg,ALU>=0];

*Jump Greater Byte
@JGB:	Stack&-1, UseCTask, Call[stksw], Opcode[236];
	LU _ (Stack&-2) - T, GoTo[JLBx];

*Jump Less Equal Byte
@JLEB:	Stack&-1, UseCTask, Call[stksw], Opcode[237];
	LU _ (Stack&-2) - T, GoTo[JGEBx];

*Jump Unsigned Less Byte
@JULB:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[240];
JULBx:	DblGoTo[J2,Ejmp,Carry];

*Jump Unsigned Greater Equal Byte
@JUGEB:	T _ Stack&-1, UseCTask, Call[stkdif], Opcode[241];
JUGEBx:	DblGoTo[JB,Enojmp,Carry];

*Jump Unsigned Greater Byte
@JUGB:	Stack&-1, UseCTask, Call[stksw], Opcode[242];
	LU _ (Stack&-2) - T, GoTo[JULBx];

*Jump Unsigned Less Equal Byte
@JULEB:	Stack&-1, UseCTask, Call[stksw], Opcode[243];
	LU _ (Stack&-2) - T, GoTo[JUGEBx];

*Jump Zero Equal Byte
@JZEQB:	LU _ Stack&-1, GoTo[JEQBx], Opcode[244];

*Jump Zero Not Equal Byte
@JZNEB:	LU _ Stack&-1, GoTo[JNEBx], Opcode[245];

*Jump Indexed Byte and Jump Indexed Word
CODEToRTemp:
	PFetch1[CODE,RTemp], Return;

:IF[AltoMode]; ****************************************
@JIB:	LoadPage[opPage3], GoTo[UnImpP6], Opcode[246];

@JIW:	LU _ Cycle&PCXF, Skip[R Even], Opcode[247];
	CSkipData;	*Odd byte can't cause refill
	T _ Stack&-1;
	LU _ (Stack) - T;
	T _ Stack&-1, Skip[Carry'];
	SkipData, Call[J2];
	T _ (NextData[IBuf]) + T;	*get beta
	LU _ CycleControl _ NextData[IBuf];	*get alpha
:ELSE; ************************************************
P6PopComp:
	T _ Stack&-1, UseCTask;
	LU _ (Stack) - T, Return;

@JIB:	LU _ CycleControl _ CNextData[IBuf], Call[P6PopComp], Opcode[246];
	T _ RSh[Stack,1], Skip[Carry'];
	SkipData, Call[P6Pop];	*Exit to next opcode
	T _ (NextData[IBuf]) + T;	*add beta
	T _ (LHMask[Cycle&PCXF]) + T, Call[CODEToRTemp];
	Stack&-1, Skip[R Odd];
	T _
		LdF[RTemp,0,10], GoTo[JWx];
	T _ RHMask[RTemp], GoTo[JWx];

@JIW:	LU _ CycleControl _ CNextData[IBuf], Call[P6PopComp], Opcode[247];
	T _ Stack&-1, Skip[Carry'];
	SkipData, Call[P6Tail];	*Flush beta and exit
	T _ (NextData[IBuf]) + T;	*add beta
:ENDIF; **********************************************
	T _ (LHMask[Cycle&PCXF]) + T, Call[CODEToRTemp];
	T _ RTemp, GoTo[JWx];

@ADD:	T _ Stack&-1, Opcode[250];
Addx:	LU _ NextInst[IBuf];
	Stack _ (Stack) + T, NIRet;

@SUB:	T _ Stack&-1, Opcode[251];
Subx:	LU _ NextInst[IBuf];
	Stack _ (Stack) - T, NIRet;

%Multiply--high half of 32-bit product is left above the top of the Stack
	product low in Stack, hi in RTemp1
	multiplicand in T (from the argument at TOS-1)
	multiplier in RTemp (from the argument at TOS)
The first loop flushes leading 0's in the multiplier with timing 2 cycles/0;
The second loop processes 0's in 6 cycles and 1's in 10 or 11 cycles.
Note how a low-order 1 in the multiplier serves as an end flag.
Total time is (14 to 17) + 2*LZ + (16-LZ)*6 + (4 or 5)*(NOnes).

NOTE: For random numbers, this algorithm averages about 35 cycles faster
than the algorithm commented on the next page. However, when the multiplier
has many leading + trailing zeroes, it is worse than the other.

This algorithm has 10 to 13 cycles less setup overhead and the same loop time
if the product is less than 16d bits; for larger products it has 15 to 18
cycles less setup overhead and 3 to 4 cycles less per multiplier 1; however,
in all cases, it requires 2 more cycles per leading or trailing multiplier 0
than the other algorithm. In the most unfavorable case (1 x 1), the other
algorithm is 17 cycles faster; for all other bad cases, it is no more than
13 cycles faster.

Since 87 percent of all multiplies are preceded by small constant pushes, the
other algorithm may average faster than this one, but this one is 5b mi
smaller, so we use it.
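
Worked timing example. The following is a minimal sketch in Python, not part
of the microcode, that evaluates the total-time formula above for a given
16-bit multiplier. It assumes that LZ is the count of leading zero bits and
NOnes the count of one bits in the multiplier, and that a zero multiplier
(which exits early through mdPush) is outside the formula. The name
mul_cycle_bounds is hypothetical.

```python
WORD_BITS = 16  # the multiplier is one 16-bit word


def mul_cycle_bounds(multiplier):
    """Return (low, high) cycle estimates from the comment's formula.

    Formula: (14 to 17) + 2*LZ + (16-LZ)*6 + (4 or 5)*NOnes.
    Assumption: LZ = leading zero bits, NOnes = one bits of the multiplier.
    """
    if not 0 < multiplier < (1 << WORD_BITS):
        raise ValueError("multiplier must be a nonzero 16-bit value")
    lz = WORD_BITS - multiplier.bit_length()
    n_ones = bin(multiplier).count("1")
    loop = 2 * lz + (WORD_BITS - lz) * 6
    return 14 + loop + 4 * n_ones, 17 + loop + 5 * n_ones


if __name__ == "__main__":
    for m in (1, 0o100000, 0o177777):
        print(oct(m), mul_cycle_bounds(m))
```

Under these assumptions a multiplier of 1 gives 54 to 58 cycles, and a
multiplier of 177777b gives 174 to 193 cycles.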
%
@MUL:	RTemp1 _ T _ 30C, Call[MulSU], Opcode[252];
*2nd loop shifts the product RTemp1/Stack left 1 and conditionally adds the
*multiplicand T based upon sign of the multiplier RTemp, which is left-shifted
*until the right-most 1 bit is seen.
	RTemp _ (RTemp) SALUFOP T, GoTo[Mul1,R<0];
Mul0:	Stack _ (Stack) SALUFOP T;
	RTemp1 _ (RTemp1) SALUFOP T, UseCOutAsCIn, Return;
Mul1:	RTemp1 _ (RTemp1) SALUFOP T, UseCOutAsCIn, GoTo[MulLast,ALU=0];
	Stack _ (LSh[Stack,1]) + T, Skip[R<0];
	RTemp1 _ (RTemp1) - 1, UseCOutAsCIn, Return;
	RTemp1 _ (RTemp1) + 1, UseCOutAsCIn, Return;

*Force the low bit of multiplier RTemp to 1 for the end test.
*Initialize the high product word (RTemp1) to 0; low product word (TOS-1)
*already contains the multiplicand, so we don't zero it and add the
*multiplicand on the 1st multiplier 1 (but we have to test multiplier for 0).
MulSU:	RTemp1 _ (RTemp1) - (SALUF _ T);	*SALUF = 30b is LU _ 2A
	T _ (Stack&-1) SALUFOP T, Skip[R>=0];	*Multiplier*2 from TOS
	RTemp _ (Zero) + T + 1, GoTo[MulSUX];	*Multiplier .ls. 0
	RTemp _ (Zero) + T + 1, Skip[ALU#0];	*Multiplier .ge. 0
	T _ Stack _ 0C, GoTo[mdPush];	*Multiplier .eq. 0
*One mi loop shifts off leading 0's in multiplier.
	RTemp _ (RTemp) SALUFOP T, Skip[R<0];
	RTemp _ (RTemp) SALUFOP T, GoTo[.,R>=0];
MulSUX:	T _ Stack, Return;	*Multiplicand from TOS-1

MulLast:
	T _ RSh[RTemp1,1], GoTo[mdPush,Carry'];
	T _ (LSh[AllOnes,17]) or T, GoTo[MdPush];

mdPush:	Stack&+1 _ T;
P6Pop:	LU _ NextInst[IBuf];
	Stack&-1, NIRet;

PopToT:	T _ Stack&-1, FreezeResult, Return;

@DBL:	T _ LSh[Stack&-1,1], Opcode[253];
PushTP6:
	LU _ NextInst[IBuf];
	Stack&+1 _ T, NIRet;

%Multiply--high half of 32-bit product is left above the top of the Stack
	product low in Stack, hi in RTemp1
	multiplicand low in RTemp, hi in xfMX
	multiplier in xfMY
The first loop, which runs until the multiplicand being left-shifted 1 each
step overflows into the high word, has timing of 4 cycles on 0's, 10 on 1's;
this loop doesn't task on 0's (worst case without tasking ~ 70 cycles on a
multiplier of 100000b and multiplicand of 1). Note that the end test need be
made only when processing a multiplier 1.

The second loop, which runs until the last multiplier 1 is processed, has
timing of 6 cycles on 0's, 14 on 1's.

Total time 26 cycles + (14 per 1) + (6 per 0 between the leftmost and
rightmost 1's) + (4 per 0 to the right of the rightmost 1).

@MUL:	RTemp1 _ T _ 30C, Opcode[252];	*SALUF = 30b is an LU _ 2A
	RTemp1 _ (RTemp1) - (SALUF _ T), Call[PopToT];	*RTemp1 _ 0
	xfMY _ T, UseCTask, Call[PopToT];
	Stack&+1 _ 0C, Skip[ALU#0];	*tests xfMY _ T
	T _ Stack&+1 _ 0C, GoTo[P6Pop];
	RTemp _ T, Call[.+1];
*1st loop
	xfMY _ RSh[xfMY,1], GoTo[MulZ,R Even];
MulO:	Stack _ (Stack) + T, GoTo[MulLast,ALU=0];
	T _ RTemp _ (RTemp) SALUFOP T, FreezeResult, Skip[R<0];
	RTemp1 _ (RTemp1) + 1, UseCOutAsCIn, Return;
	RTemp1 _ (RTemp1) + 1, UseCOutAsCIn, GoTo[MulL];
MulZ:	T _ RTemp _ (RTemp) SALUFOP T, Skip[R<0];
*Must replicate the mi at MulO-1 because the opcode dispatch locations are
*only four apart on this page.
	xfMY _ RSh[xfMY,1], DblGoTo[MulO,MulZ,R Odd];
MulL:	xfMX _ 1C, Call[.+1];
*2nd loop
	xfMY _ RSh[xfMY,1], GoTo[MulLZ,R Even];
MulLO:	Stack _ (Stack) + T, GoTo[MulLLast,ALU=0];
	T _ xfMX, FreezeResult;
	RTemp1 _ (RTemp1) + T + 1, UseCOutAsCIn;
MulLZ:	T _ RTemp _ (RTemp) SALUFOP T;	*Double the multiplicand
	xfMX _ (xfMX) SALUFOP T, UseCOutAsCIn, Return;

MulLLast:
	T _ xfMX, FreezeResult;
	T _ RTemp1 _ (RTemp1) + T + 1, UseCOutAsCIn, GoTo[mdPush];
MulLast:
	T _ (RTemp1) + 1, UseCoutAsCin;

	Old divide inner loop
	LU _ (Zero) + T;	*Carry_0
	Stack _ (Stack) SALUFOP T, UseCOutAsCIn, Call[DvStart];
*Loop time: 13 to 15 cycles/bit.
	LU _ RTemp1;
	RTemp _ (RTemp) - T, Skip[ALU>=0];	*subtract divisor
	Stack _ (Stack) SALUFOP T, GoTo[div1];	*No carry test--q bit 1
	Stack _ (Stack) SALUFOP T, UseCOutAsCIn, Skip[Carry];	*Subtract ok?
	RTemp _ (RTemp) + T, FreezeResult;	*No--undo
div1:	xfTemp _ (xfTemp) - 1, FreezeResult, Skip[R>=0];
	T _ RTemp, GoTo[mdPush];
DvStart:
	RTemp _ (RTemp) SALUFOP T, UseCOutAsCIn, Skip[R<0];
	RTemp1 _ Zero, Return;
	RTemp1 _ (RTemp1) or not (0C), Return;
%

*Divide--(0,,TOS-1)/TOS. Single word dividend and divisor, no overflow check.
*The remainder is left above the Stack.
@DIV:	MNBR _ Stack&-1, Opcode[254];
	RTemp _ 0C, GoTo[LDIVx];

%Long Divide--(TOS-1,,TOS-2)/TOS. Double-word dividend, single-word divisor,
no overflow check. The remainder is left above the Stack.
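
Illustrative model. The following Python sketch, not part of the microcode,
shows the arithmetic this opcode performs: a restoring shift-and-subtract
division of a double-word dividend by a single-word divisor, one quotient bit
per step. It is a functional model only; register use, carry handling, and
the per-mode reaction to a zero divisor or an oversized quotient are not
modeled. The names ldiv and ldiv_cycle_estimate are hypothetical, and the
cycle estimate simply evaluates the approximate timing figure quoted in this
comment.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1


def ldiv(hi, lo, divisor):
    """Divide the 32-bit value hi,,lo by divisor; return (quotient, remainder).

    Assumption of this sketch: 0 < divisor and hi < divisor, so the
    quotient fits in one 16-bit word.
    """
    if divisor == 0 or hi >= divisor:
        raise ValueError("zero divisor or quotient too large for one word")
    remainder, quotient = hi, 0
    for step in range(WORD_BITS):
        next_bit = (lo >> (WORD_BITS - 1 - step)) & 1
        remainder = (remainder << 1) | next_bit  # shift in next dividend bit
        quotient <<= 1
        if remainder >= divisor:  # subtract succeeds: quotient bit is 1
            remainder -= divisor
            quotient |= 1
    return quotient & WORD_MASK, remainder


def ldiv_cycle_estimate(quotient):
    """Evaluate ~30 + 9 per quotient 1 + 12 per quotient 0 over 16 bits."""
    ones = bin(quotient & WORD_MASK).count("1")
    return 30 + 9 * ones + 12 * (WORD_BITS - ones)
```

For example, ldiv(0, 100, 7) returns (14, 2). DIV corresponds to the case
hi = 0, since @DIV clears RTemp before joining LDIVx.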
	dividend low in Stack; hi in RTemp
	divisor in T
	quotient appears in Stack; remainder in RTemp
Timing ~ 30 + (9/quotient 1) + (12/quotient 0) cycles
%
@LDIV:	MNBR _ Stack&-1, Call[PopToT], Opcode[255];
	RTemp _ T;
LDIVx:	T _ 31C;	*SALUFOP = A+A+1
	SALUF _ T;
	T _ MNBR;	*T _ divisor
:IF[AltoMode]; ****************************************
	LU _ (RTemp) - T;	*no checks in Alto mode
	xfTemp _ 16C, Skip[Carry'];
	T _ 0C, GoTo[mdPush];	*0 divisor or divide check
:ELSE; ************************************************
	LU _ (RTemp) - T, GoTo[DvsNZ,ALU#0];
	RTemp _ sZeroDivisor;
DvTrap:	LoadPage[opPage0];
	T _ SStkP, GoToP[BackSPPCandTrap];
DvsNZ:	xfTemp _ 16C, Skip[Carry'];
	RTemp _ sDivideCheck, GoTo[DvTrap];
:ENDIF; ***********************************************
	RTemp1 _ (Zero) - T;	*Save minusdivisor
*The 1st bit shifted into stack is "Don't care."
	Stack _ (Stack) SALUFOP T, Call[DvStart];
*Shift the high dividend while simultaneously subtracting the divisor and
*adding in the bit shifted out of the low word in the previous step.
	T _ RTemp _ (LSh[RTemp,1]) + T + 1, Skip[R>=0];
*Shift the low dividend while bringing in the quotient bit
	Stack _ (Stack) SALUFOP T, GoTo[DvSub];	*q bit 1
*Subtract ok?
	Stack _ (Stack) SALUFOP T, UseCOutAsCIn, GoTo[DvSub,Carry];
*No--undo by adding back the divisor. Would like to simply add in the next
*step (i.e., ((X+divs) lshift 1) - divs .eq. (X lshift 1) + divs), but the
*carry gets screwed up.
	T _ RTemp1, FreezeResult;
	T _ RTemp _ (RTemp) - T, FreezeResult;
DvSub:	xfTemp _ (xfTemp) - 1, FreezeResult, Skip[R>=0];
	Stack&+1 _ T, GoTo[P6Pop];
*Get minusdivisor + bit shifted out of low word
DvStart:
	T _ (RTemp1) - 1, UseCOutAsCIn, Return;

@NEG:	T _ Stack&-1, Opcode[256];
	T _ (Zero) - T, GoTo[PushTP6];

@INC:	T _ (Stack&-1) + 1, GoTo[PushTP6], Opcode[257];

@AND:	T _ Stack&-1, Opcode[260];
	LU _ NextInst[IBuf];
	Stack _ (Stack) and T, NIRet;

@OR:	T _ Stack&-1, Opcode[261];
	LU _ NextInst[IBuf];
	Stack _ (Stack) or T, NIRet;

@XOR:	T _ Stack&-1, Opcode[262];
	LU _ NextInst[IBuf];
	Stack _ (Stack) xor T, NIRet;

@SHIFT:	T _ Stack&-1, Opcode[263];
	RTemp _ T, DblGoTo[ShiftRight,ShiftLeft,ALU<0];

ShiftRight:
	RTemp _ (RTemp) + (17C);
	CycleControl _ RTemp, Skip[Carry];
	Stack _ Zero, GoTo[P6Tail];	*shift count > 17, use zero
	LU _ NextInst[IBuf];
	Stack _ RF[Stack], NIRet;

ShiftLeft:
	LU _ (RTemp) and not (17C);
	RTemp _ (RTemp) xor (17C), Skip[ALU=0];
	Stack _ Zero, GoTo[P6Tail];	*shift count > 17, use zero
*T has positive count.
*form 0,,-count, then use WFA
	CycleControl _ RTemp;
	LU _ NextInst[IBuf];
	Stack _ WFA[Stack], NIRet;

@DADD:	MNBR _ Stack&-1, Call[GetTDecStk2], Opcode[264];	*point to lsb of top doubleword
	Stack _ (Stack) + T;	*add low bits
	Stack&+1, Skip[Carry];
	T _ MNBR, GoTo[Addx];	*pick up high bits of top doubleword
	T _ (MNBR) + 1, GoTo[Addx];	*pick up high bits of top doubleword

@DSUB:	MNBR _ Stack&-1, Call[GetTDecStk2], Opcode[265];	*point to lsb of top doubleword
	Stack _ (Stack) - T;	*subtract low bits
	Stack&+1, Skip[Carry'];	*point to msb of second doubleword
	T _ MNBR, GoTo[Subx];	*remember msb of top doubleword (TOS)
	T _ (MNBR) + 1, GoTo[Subx];

GetTDecStk2:
	T _ Stack&-2, Return;	*grab it, point to lsb of second doubleword

*Double Compare:
*If (TOS-2),,(TOS-3) < TOS,,(TOS-1), push -1
*If (TOS-2),,(TOS-3) = TOS,,(TOS-1), push 0
*If (TOS-2),,(TOS-3) > TOS,,(TOS-1), push 1

*Comparisons are signed
@DCOMP:	T _ (Stack&-2) + (100000C), Opcode[266];
	Stack _ (Stack) + (100000C), GoTo[DUCOMPy];

*Comparisons are unsigned
@DUCOMP:
	T _ Stack&-2, Opcode[267];
*Compare msb's, point at lsb of high doubleword
*grab lsb of top doubleword, point at lsb of second doubleword
DUCOMPy:
	LU _ (Stack&+1) - T;
	T _ Stack&-2, FreezeResult, Skip[ALU=0];
	Stack _ (Zero) + 1, DblGoTo[DUCompL,DUCompG,Carry'];
	Stack _ (Stack) - T;	*compare low words
	FreezeResult, Skip[ALU#0];
	LU _ NextInst[IBuf], Call[P6Tailx];
	Stack _ (Zero) + 1, DblGoTo[DUCompL,DUCompG,Carry'];
DUCompL:
	Stack _ (Stack) or not (0C), GoTo[P6Tail];
DUCompG:
	LU _ NextInst[IBuf], Call[P6Tailx];

*ADD01 - on D0, equivalent to ADD
@ADD01:	T _ Stack&-1, GoTo[Addx], Opcode[270];

*Unused opcodes on page 6
:IF[AltoMode]; ****************************************
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[271];
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[272];
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[273];
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[274];
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[275];
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[276];
	LoadPage[opPage3], GoTo[UnImpP6], Opcode[277];
UnImpP6:
	T _ sUnimplemented, GoToP[kfcr];
:ELSEIF[CedarMode]; ************************************
	RTemp _ 271C, GoTo[UnImpP6], Opcode[271];
	RTemp _ 272C, GoTo[UnImpP6], Opcode[272];
	RTemp _ 273C, GoTo[UnImpP6], Opcode[273];
	RTemp _ 274C, GoTo[UnImpP6], Opcode[274];
	RTemp _ 275C, GoTo[UnImpP6], Opcode[275];
	RTemp _ 276C, GoTo[UnImpP6], Opcode[276];
	RTemp _ 277C, GoTo[UnImpP6], Opcode[277];
UnImpP6:
	LoadPage[opPage0];
	TrapParm _ 0C, GoToP[UndefTrap];
:ELSE; ************************************************
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[271];
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[272];
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[273];
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[274];
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[275];
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[276];
	LoadPage[opPage0], GoTo[UnImpP6], Opcode[277];
UnImpP6:
	RTemp _ sUnimplemented, GoToP[BackPCandTrap];
:ENDIF; ***********************************************

:END[MesaJ];