*----------------------------------------------------------- Title[DMesaFloat.mc...September 5, 1983 3:59 PM...Taft]; * Mesa floating point operations *----------------------------------------------------------- % CONTENTS, by order of alpha byte value FAdd (alpha=20B) Floating add FSub (alpha=21B) Floating subtract FMul (alpha=22B) Floating multiply FDiv (alpha=23B) Floating divide FComp (alpha=24B) Floating compare Fix (alpha=25B) Fix to LONG INTEGER Float (alpha=26B) Float a LONG INTEGER FixI (alpha=27B) Fix to INTEGER FixC (alpha=30B) Fix to CARDINAL FSticky (alpha=31B) Read and set Inexact Result sticky flag and enable FRem (alpha=32b) (Floating remainder unimplemented) Round (alpha=33B) Round to LONG INTEGER RoundI (alpha=34B) Round to INTEGER RoundC (alpha=35B) Round to CARDINAL FSqRt (alpha=36B) (Square root unimplemented) FSc (alpha=37B) Floating scale *----------------------------------------------------------- * Conventions *----------------------------------------------------------- IEEE standard floating point number representation (single-precision): bit 0 sign (1 = negative) bits 1-8 excess-127 exponent (0 = true zero if fraction = 0, else denormalized; 377 = infinity or Not-a-Number) bits 9-31 fraction, implicitly preceded by "1." if nonzero and normalized Internal (unpacked) representation, for argument 1 (argument 2 is similar): ExpSign1 [0] = 0, [1:8] = excess-128 exponent, [9:12] = 0, [13:14] undefined, [15] sign (1 = negative) Frac1H high 16 bits of fraction; if number nonzero, includes explicit "1" in bit 0 and binary point between bits 0 and 1 Frac1L low 8 bits of fraction left-justified, followed by non-significant bits and sticky bit Timings are for cases involving normalized nonzero numeric operands. Where expressed in microseconds, a 64-ns cycle time is assumed (clock = 32ns). Note: at all MiscTable entries, T contains the top-of-stack and StkP has been incremented. % * Local R-register usage: SetRMRegion[BBRegs]; * Overlay BitBlt registers RVN[ExpSign1]; * Exponent and sign of argument 1 RVN[Frac1H]; * Fraction of argument 1 (high part) RVN[Frac1L]; * (low part) RVN[ExpSign2]; * Argument 2 ... RVN[Frac2H]; RVN[Frac2L]; RVN[FTemp0]; * Temporaries RVN[FTemp1]; RVN[FTemp2]; * Alto mode: OTPReg preserves the MISC alpha byte (in bits 6:13), in case we * need to trap. OTPReg = 4*alpha if entry point 0 or 1 was taken, * 4*alpha+1 if entry point 2. * Sticky is permanently dedicated to holding the Inexact Result sticky flag * and enable bit: * Sticky[0]=1 => trap enabled * Sticky[15]=1 => Inexact Result has occurred *----------------------------------------------------------- FloatingPointPresent: * Returns T=2 if floating point microcode is present, 0 otherwise. *----------------------------------------------------------- Subroutine; T← 2C, Return; TopLevel; KnowRBase[RTemp0]; *----------------------------------------------------------- * Unimplemented operations *----------------------------------------------------------- MiscTable[32], Branch[FloatTrap]; * FRem MiscTable[36], Branch[FloatTrap]; * Undefined :If[AltoMode]; ********** Alto version ********** *----------------------------------------------------------- FSticky: MiscTable[31], * Sticky flags * new: StickyFlags ← Pop[]; Push[stickyFlags]; stickyFlags← new; *----------------------------------------------------------- Sticky← T, Q← Sticky; Stack-1← Q, IFUNext0; :Else; ******** PrincOps version ******** *----------------------------------------------------------- FSticky: MiscTable[31], * Sticky flags * new: StickyFlags ← Pop[]; Push[stickyFlags]; stickyFlags← new; *----------------------------------------------------------- RBase← RBase[Sticky]; Sticky← T, Q← Sticky; Stack-1← Q, IFUNext0; :EndIf; ********************************** *----------------------------------------------------------- FSub: MiscTable[21], * Floating subtract * a, b: REAL; * b ← PopLong[]; a ← PopLong[]; PushLong[a-b]; * Timing: same as FAdd. *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, SCall[FUnPack2]; ExpSign2← (ExpSign2)+1, Branch[FAddZeroR]; * Flip sign of arg 2 and add ExpSign2← (ExpSign2)+1, Branch[FAddNonZero]; *----------------------------------------------------------- FAdd: MiscTable[20], * Floating add * a, b: REAL; * b ← PopLong[]; a ← PopLong[]; PushLong[a+b]; * Timing: 34 cycles, +[7..9] if operand exponents differ, +5 if round, * + one of the following: * signs equal: +2, +3 more if result exponent > max(arg1, arg2) exponent. * signs unequal: +8, +3 more if result sign # arg1 sign, * + 2*((max(arg1 exponent, arg2 exponent) - result exponent) mod 16). * Typical case involving unnormalization but no renormalization or rounding: * 49 cycles or ~ 3.1 microseconds *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, SCall[FUnPack2]; FAddZeroR: PD← (Stack&+2)+(Stack&+2), Branch[FAddZero]; * +1: at least one arg is zero * Difference between exponents is the amount of unnormalization required. * The low 7 bits of ExpSign1 contain either 4 or 5, whereas the low 7 bits * of ExpSign2 contain 0, 1, or 2. Thus subtracting ExpSign2 from * ExpSign1 cannot cause a carry out of the low 7 bits. * Furthermore, the low bit gets the xor of the two signs, useful later * when determining whether to add or subtract the fractions. FAddNonZero: T← ExpSign1; ExpSign2← T← T-(Q← ExpSign2); FTemp1← A0, Q RSH 1; T← T+T, DblBranch[UnNorm1, UnNorm2, Carry']; * Un-normalize operand 1. T[0:7] has negative of right-shift count. UnNorm1: ExpSign1← (B← Q) LSH 1, Branch[.+2, R even]; * Result exponent is Exp2 ExpSign1← (ExpSign1)+1; * But preserve Sign1 T← T+(LShift[20, 10]C); T← T+(LShift[20, 10]C), Branch[UnNorm1le20, Carry]; T← T+(LShift[20, 10]C), Branch[UnNorm1le40, Carry]; * Exponents differ by more than 40. * Just zero operand 1, but be sure to set the sticky bit. Frac1L← 1C; ZeroFrac1H: Frac1H← A0, Branch[UnNormDone]; * Exponent difference IN [1..20]. Let n = the difference. * T[0:7] now has 40 - n, i.e., IN [20..37], so SHA=R, SHB=T. * This is correct shift control for LCY[R, T, 20-n] = RCY[T, R, n] UnNorm1le20: T← Frac1L, ShC← T; PD← ShiftNoMask[FTemp1]; * PD← RCY[Frac1L, 0, [1..20]] T← Frac1H, FreezeBC; Frac1L← ShiftNoMask[Frac1L], * Frac1L← RCY[Frac1H, Frac1L, [1..20]] Branch[.+2, ALU=0]; * Test bits shifted out of Frac1L Frac1L← (Frac1L) OR (1C); * Nonzero bits lost, set sticky bit T← A0, Q← Frac2L; Frac1H← ShiftNoMask[Frac1H], * Frac1H← RCY[0, Frac1H, [1..20]] Branch[UnNormDone1]; * FAdd/FSub (cont'd) * Exponent difference IN [21..40]. Let n = the difference. * T[0:7] now has 60 - n, i.e., IN [20..37], so SHA=R, SHB=T. * This is correct shift control for LCY[R, T, 40-n] = RCY[T, R, n-20] UnNorm1le40: T← Frac1H, ShC← T; T← ShiftNoMask[FTemp1]; * T← RCY[Frac1H, 0, [1..20]] PD← T OR (Frac1L); * Bits lost from Frac1H and Frac1L T← A0, FreezeBC; Frac1L← ShiftNoMask[Frac1H], * Frac1L← RCY[0, Frac1H, [1..20]] Branch[ZeroFrac1H, ALU=0]; Frac1L← (Frac1L) OR (1C), * Nonzero bits lost, set sticky bit Branch[ZeroFrac1H]; * Un-normalize operand 2. T[0:7] has right-shift count, and T[8:15] is IN [0..12]. UnNorm2: T← (12S)-T; * Negate count; ensure no borrow by ALU[8:15] T← T+(LShift[20, 10]C), Branch[UnNormDone, Carry]; * Branch if exponents equal T← T+(LShift[20, 10]C), Branch[UnNorm2le20, Carry]; T← T+(LShift[20, 10]C), Branch[UnNorm2le40, Carry]; * Exponents differ by more than 40. * Just zero operand 2, but be sure to set the sticky bit. Frac2L← 1C; ZeroFrac2H: T← Frac2H← A0, Branch[UnNormDone]; * Exponent difference IN [1..20]. Let n = the difference. * T[0:7] now has 40 - n, i.e., IN [20..37], so SHA=R, SHB=T. * This is correct shift control for LCY[R, T, 20-n] = RCY[T, R, n] UnNorm2le20: T← Frac2L, ShC← T; PD← ShiftNoMask[FTemp1]; * PD← RCY[Frac2L, 0, [1..20]] T← Frac2H, FreezeBC; Frac2L← ShiftNoMask[Frac2L], * Frac2L← RCY[Frac2H, Frac2L, [1..20]] Branch[.+2, ALU=0]; * Test bits shifted out of Frac2L Frac2L← (Frac2L) OR (1C); * Nonzero bits lost, set sticky bit T← A0, Q← Frac2L; T← Frac2H← ShiftNoMask[Frac2H], * Frac2H← RCY[0, Frac2H, [1..20]] Branch[UnNormDone2]; * Exponent difference IN [21..40]. Let n = the difference. * T[0:7] now has 60 - n, i.e., IN [20..37], so SHA=R, SHB=T. * This is correct shift control for LCY[R, T, 40-n] = RCY[T, R, n-20] UnNorm2le40: T← Frac2H, ShC← T; T← ShiftNoMask[FTemp1]; * T← RCY[Frac2H, 0, [1..20]] PD← T OR (Frac2L); * Bits lost from Frac2H and Frac2L T← A0, FreezeBC; Frac2L← ShiftNoMask[Frac2H], * Frac1L← RCY[0, Frac1H, [1..20]] Branch[ZeroFrac2H, ALU=0]; Frac2L← (Frac2L) OR (1C), * Nonzero bits lost, set sticky bit Branch[ZeroFrac2H]; * Now decide whether fractions are to be added or subtracted. * ExpSign2[15] remembers the xor of the signs. UnNormDone: Q← Frac2L; UnNormDone1: T← Frac2H; UnNormDone2: ExpSign2, DblBranch[SubFractions, AddFractions, R odd]; * Subtract if signs differ * FAdd/FSub (cont'd) * Signs equal, add fractions. T = Frac2H, Q = Frac2L. AddFractions: Frac1L← (Frac1L)+Q; Frac1H← T← (Frac1H)+T, XorSavedCarry; PD← (Frac1L) AND (377C), Branch[FRePackNZ1, Carry']; * If carry out of high result, must normalize right 1 position. * Need not restore leading "1", since rounding cannot cause a carry into this * position, and the leading bit is otherwise ignored during repacking. Frac1H← (Frac1H) RSH 1; Frac1L← RCY[T, Frac1L, 1], Branch[.+2, R even]; Frac1L← (Frac1L) OR (1C); * Preserve sticky bit ExpSign1← (ExpSign1)+(LShift[1, 7]C), Branch[FRePackNonzero]; * Signs differ, subtract fractions. T = Frac2H, Q = Frac2L. SubFractions: Frac1L← (Frac1L)-Q; T← (Frac1H)-T-1, XorSavedCarry; Frac1H← A0, FreezeBC, Branch[Normalize, Carry]; * If carry, Frac1 was >= Frac2, so result sign is Sign1. * If no carry, sign of the result changed. Must negate fraction and * complement sign to restore sign-magnitude representation. ExpSign1← (ExpSign1)+1; Frac1L← (0S)-(Frac1L); T← (Frac1H)-T-1, XorSavedCarry, Branch[Normalize]; * Add/Subtract with zeroes: * One or both of the operands is zero. ALU = (high word of arg1) LSH 1. This is * zero iff arg1 is zero (note that we don't need to worry about denormalized numbers, * since they have been filtered out already). * StkP has been advanced to point to high word of arg2. FAddZero: T← (Stack&-1)+(Q← Stack&-1), * T← (high word of arg2) LSH 1 Branch[FAddArg2Zero, ALU#0]; * arg1#0 => arg2=0 Cnt← Stack&-1, * Cnt← low word of arg2 Branch[FAddArg1Zero, ALU#0]; * arg2#0 => arg1=0 * Both args are zero: result is -0 if both args negative, else +0. StackT← (StackT) AND Q, IFUNext2; * Arg 1 is zero and arg 2 nonzero: result is arg 2. * Note: must re-pack sign explicitly, since FSub might have flipped it. FAddArg1Zero: T← RCY[ExpSign2, T, 1]; * Insert Sign2 into high result Stack&-1← T; Stack&+1← Cnt, IFUNext0; * Arg 2 is zero and arg 1 nonzero: result is arg 1. FAddArg2Zero: StkP-1, IFUNext0; * Result already on stack *----------------------------------------------------------- FMul: MiscTable[22], * Floating multiply * a, b: REAL; * b ← PopLong[]; a ← PopLong[]; PushLong[a*b]; * Timing: 92 cycles, +6 if need to round. ~ 5.9 microseconds. *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, SCall[FUnPack2]; T← ExpSign2, Branch[MulArgZero]; * +1: at least one arg is zero * XOR signs and add exponents. Subtract 200 from exponent to correct for doubling bias, * and add 1 to correct for 1-bit right shift of binary point during multiply * (binary point of product is between bits 1 and 2 rather than between 0 and 1). T← (ExpSign2)-(LShift[200, 7]C); * Subtract 200 from ExpSign2[0:8] T← T+(LShift[1, 7]C); * And add 1 (can't combine the constants, alas) ExpSign1← (ExpSign1)+T; * Now do the multiplications. Initial registers: * Frac1H = F1H (high 16 bits of arg 1) * Frac1L = F1L,,0 (low 8 bits of arg 1) * Frac2H = F2H (high 16 bits of arg 2) * Frac2L = F2L,,0 (low 8 bits of arg 2) * Intermediate register usage: * Frac1L and FTemp0 accumulate sticky bits * FTemp2 is the initial product register for the Multiply subroutines. * Do 8-step multiply of F1L*F2L, with initial product of zero. Frac1L← T← RSH[Frac1L, 10]; * Frac1L← 0,,F1L Frac2L← RSH[Frac2L, 10], * Frac2L← 0,,F2L Call[MultTx2L8I]; * Force 8 iterations, initial product 0 * Low product is FTemp2[8:15],,Q[0:7]. FTemp2[0:7] = Q[8:15] = 0. * Do 8-step multiply of F2H*F1L, using high 8 bits of previous result as initial product. FTemp0← Q; * Save low 8 bits for later use as sticky bits T← Frac2H, Cnt← 6S, Call[MultTx1L]; * Cross product is FTemp2[0:15]..Q[0:7]. Q[8:15] = 0. * Do 8-step multiply of F1H*F2L, with initial product of zero. Frac1L← Q; * Frac1L← low 8 bits of cross product T← Frac1H, ShC← T, * ShC← high 16 bits of cross product Call[MultTx2L8I]; * Force 8 iterations, initial product 0 * Cross product is FTemp2[0:15]..Q[0:7]. Q[8:15] = 0. * Add cross products, propagate carries, and merge sticky bits. T← (Frac1L)+Q; * Add low 8 bits of cross products Frac1L← (FTemp0) OR T; * Merge with sticky bits from low product T← ShC, Branch[.+2, ALU=0]; * Collapse to single sticky bit in Frac1L[15] Frac1L← 1C; FTemp2← (FTemp2)+T, XorSavedCarry; * Add high 16 bits of cross products * Do 16-step multiply of F1H*F2H, using high 16 bits of previous result as initial product. * Frac1H← (-1)+(carry out of sum of low and cross products). T← B← Frac1H, Cnt← 16S; Frac1H← T-T-1, XorSavedCarry, Call[MultTx2H]; * Final result is T,,Q. Merge in the sticky bit from low-order products and exit. Frac1L← (Frac1L) OR Q, DispTable[1, 2, 2]; T← (Frac1H)+T+1, Branch[Normalize], DispTable[1, 2, 2]; * One or both of the arguments is zero. Return zero with appropriate sign. T = ExpSign2 MulArgZero: ExpSign1← (ExpSign1) XOR T, Branch[FRePackZero]; *----------------------------------------------------------- * Unsigned multiply subroutines * Entry conditions (except as noted): * Cnt = n-2, where n is the number of iterations * T = multiplicand * FracXX = multiplier (register depends on entry point); * leftmost (16D-n) bits must be zero. * FTemp2 = initial product (to be added to low 16 bits of final product) * Exit conditions: * Product right-justified in T[0:15],,Q[0:n-1] * Q[n:15] = 0 * FTemp2 = copy of T * Carry = 1 iff T[0] = 1 * If n = 16D, caller must squash Multiply dispatches in the 2 instructions * following the Call. * Timing: n+2 *----------------------------------------------------------- Subroutine; * Entry point for multiplier = Frac1L MultTx1L: Q← Frac1L, DblBranch[MultiplierO, MultiplierE, R odd]; * Entry point for multiplier = Frac2L. * This entry forces 8 iterations with an initial product of zero. MultTx2L8I: FTemp2← A0, Cnt← 6S; Q← Frac2L, DblBranch[MultiplierO, MultiplierE, R odd]; * Entry point for multiplier = Frac2H MultTx2H: Q← Frac2H, DblBranch[MultiplierO, MultiplierE, R odd]; * Execute first Multiply purely for its side-effects (dispatch and shift Q) MultiplierE: PD← A0, Multiply, Branch[FM0]; MultiplierO: PD← A0, Multiply, Branch[FM1]; DispTable[4]; * here after Q[14] was 0 (no add) and continue FM0: FTemp2← A← FTemp2, Multiply, DblBranch[FM0E, FM0, Cnt=0&-1]; * here after Q[14] was 0 (no add) and exit FM0E: FTemp2← T← A← FTemp2, Multiply, Return; * here after Q[14] was 1 (add) and continue FM1: FTemp2← (FTemp2)+T, Multiply, DblBranch[FM0E, FM0, Cnt=0&-1]; * here after Q[14] was 1 (add) and exit FTemp2← T← (FTemp2)+T, Multiply, Return; TopLevel; *----------------------------------------------------------- FDiv: MiscTable[23], * Floating divide * a, b: REAL; * b ← PopLong[]; a ← PopLong[]; PushLong[a/b]; * Timing: 147 cycles, +2 if normalize, +5 if round. ~ 9.4 microseconds. *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, SCall[FUnPack2]; T← ExpSign2, Branch[DivArgZero]; * +1: at least one arg is zero * XOR signs and subtract exponents. * Add 200 to resulting exponent to correct for cancellation of bias. T← (ExpSign2)-(LShift[200, 7]C); * Subtract 200 from ExpSign2[0:8] ExpSign1← (ExpSign1)-T; * First, transfer dividend to Frac2H ,, Frac2L and divisor to T ,, Q, * and unnormalize both of them by one bit so that significant dividend bits * aren't lost during the division. Frac1H← (Frac1H) RSH 1, Branch[.+2, R odd]; T← (Frac1L) RSH 1, Branch[.+2]; T← ((Frac1L)+1) RCY 1; * Know Frac1L is even Frac2L← T, Q← Frac2L; T← A← Frac2H, Multiply; * T,,Q ← (Frac2H ,, Frac2L) RSH 1 * Know Q[14]=0, so can't dispatch * Now do the division. * Must do total of 26 iterations: 24 for quotient bits, +1 more significant * bit in case we need to normalize, +1 bit for rounding. Frac2H← Frac1H, Call[DivFrac]; * Do 16 iterations Subroutine; Frac1H← Frac1L, CoReturn; * Preserve high quotient; do 10 more iterations * We may have subtracted too much (or not added enough) in the last iteration. * If so, adjust the remainder by adding back the divisor. Since the remainder * got shifted left one bit, we must double the divisor first. T← A← T, Divide, Frac1L, Branch[NoRemAdjust, R odd]; * T,,Q ← (T,,Q) LSH 1 Frac2L← (Frac2L)+Q; Frac2H← T← (Frac2H)+T, XorSavedCarry, Branch[.+2]; * Left-justify low quotient bits and zero sticky bit. * Then, if the remainder is nonzero, set the sticky bit. * Then normalize if necessary. We should have to left-shift at most once, * since the original operands were normalized. NoRemAdjust: T← Frac2H; PD← (Frac2L) OR T; Frac1L← LSH[Frac1L, 6], Branch[.+2, ALU=0]; Frac1L← (Frac1L)+1; * Set sticky bit T← Frac1H, Branch[Normalize]; * One or both of the arguments is zero. * Trap if divisor is zero; return zero with appropriate sign otherwise. DivArgZero: Frac2H, Branch[MulArgZero, R<0]; Branch[FloatTrap]; * Division by zero *----------------------------------------------------------- DivFrac: * Divide fractions * Enter: Frac2H ,, Frac2L = dividend (high-order bit must be zero) * T ,, Q = divisor (high-order bit must be zero) * Exit: Frac2H ,, Frac2L = remainder, left-justified * Frac1L = quotient bits (see below) * T and Q unchanged * When first called, executes 16 iterations and returns 16 high-order quotient bits. * When resumed with CoReturn, executes 10 more iterations and returns 10 low-order * quotient bits right-justified (other bits garbage). * Timing: first call: 66; resumption: 41 *----------------------------------------------------------- Subroutine; Cnt← 17S; * Previous quotient bit was a 1, i.e., dividend was >= divisor. * Subtract divisor from dividend and left shift dividend. DivSubStep: Frac2L← ((Frac2L)-Q) LSH 1; DivSubStep1: Frac2H← (Frac2H)-T-1, XorSavedCarry, DblBranch[DivSh1, DivSh0, ALU<0]; * Previous quotient bit was a 0, i.e., dividend was < divisor (subtracted * too much). Add divisor to dividend and left shift dividend. DivAddStep: Frac2L← ((Frac2L)+Q) LSH 1; DivAddStep1: Frac2H← (Frac2H)+T, XorSavedCarry, DblBranch[DivSh1, DivSh0, ALU<0]; * Shifted a zero out of low dividend (ALU<0 tested unshifted result). DivSh0: Frac2H← (Frac2H)+(Frac2H), DblBranch[DivQuot1, DivQuot0, Carry]; * Shifted a one out of low dividend (ALU<0 tested unshifted result). DivSh1: Frac2H← (Frac2H)+(Frac2H)+1, DblBranch[DivQuot1, DivQuot0, Carry]; * If the operation generated no carry then we have subtracted too much. * Shift a zero into the quotient and add the divisor next iteration. DivQuot0: Frac1L← (Frac1L)+(Frac1L), * Shift zero into quotient Branch[DivAddStep, Cnt#0&-1]; Cnt← 11S, CoReturn; Frac2L← ((Frac2L)+Q) LSH 1, Branch[DivAddStep1]; * If the operation generated a carry then we have not subtracted too much. * Shift a one into the quotient and subtract the divisor next iteration. DivQuot1: Frac1L← (Frac1L)+(Frac1L)+1, * Shift one into quotient Branch[DivSubStep, Cnt#0&-1]; Cnt← 11S, CoReturn; Frac2L← ((Frac2L)-Q) LSH 1, Branch[DivSubStep1]; TopLevel; *----------------------------------------------------------- FComp: MiscTable[24], * Floating compare * a, b: REAL; b ← PopLong[]; a ← PopLong[]; * Push[SELECT TRUE FROM a<b => -1, a=b => 0, a>b => 1, ENDCASE]; * Timing: 23 cycles usually, 24 if high parts equal. * ~ 1.5 microseconds *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, SCall[FUnPack2]; T← Stack&+2, Branch[.+2]; * +1: at least one arg is zero * First compare the signs T← Stack&+2; * T← high arg1 PD← T XOR (Q← Stack&-1); * Q← high arg2 T← Stack&-1, FreezeBC, Branch[FCompSignsDiff, ALU<0]; * T← low arg2 * Signs equal, compare magnitudes. Q = high arg2, T = low arg2, StkP -> high arg1 PD← (Stack&-1)-Q, Branch[.+2, ALU#0]; * Compute high (arg1 - arg2) PD← (Stack)-T; * High parts equal, compute low (arg1 - arg2) * Carry = 1 if arg1 >= arg2 when treated as unsigned numbers. * The sense of this is inverted if the arguments are in fact negative, since * the representation is sign-magnitude rather than twos-complement. FCompTest: ExpSign1← (ExpSign1)+1, XorSavedCarry, Branch[FCompE, ALU=0]; ExpSign1, DblBranch[FCompL, FCompG, R odd]; * Signs unequal. Unless both arguments are zero, return "less" if arg1 is negative, * else "greater". Q = high arg2. FCompSignsDiff: PD← T-T, StkP-1, Q LSH 1; * Carry← 1 PD← (Frac1H) OR Q, Branch[FCompTest]; * ALU=0 iff both args are zero FCompL: StackT← -1C, IFUNext2; * arg1 < arg2 FCompE: StackT← A0, IFUNext2; * arg1 = arg2 FCompG: StackT← 1C, IFUNext2; * arg1 > arg2 *----------------------------------------------------------- Float: MiscTable[26], * Float a LONG INTEGER * a: LONG INTEGER ← PopLong[]; PushLong[REAL[a]]; * Timing: 20 cycles, +3 if arg < 2↑16, + 2*(15 - result exponent mod 16). * Average (8 normalizations required): 35 cycles or 2.3 microseconds *----------------------------------------------------------- PD← T, RBase← RBase[FTemp0], StkP-2; ExpSign1← Add[LShift[237, 7], 177]C, * Long integer = fraction * 2↑31 Branch[FloatNegative, ALU<0]; * Init sign to 1, branch if number negative Frac1L← Stack&+1; * Positive, fraction ← integer ExpSign1← (ExpSign1)-1, Branch[Normalize]; * Fix sign * T = high fraction, ALU>=0 for Normalize FloatNegative: T← A0, Q← Stack&+1; * Negative, fraction ← negative of integer Frac1L← T-Q; T← T-(Stack)-1, XorSavedCarry, Branch[Normalize]; *----------------------------------------------------------- Round: MiscTable[33], * Round to LONG INTEGER * a: REAL ← PopLong[]; PushLong[RoundToLongInteger[a]]; * Timing: 23 cycles if result is zero, * 29 if 0 < |result| < 2↑15 * 32 if 2↑15 <= |result| < 2↑31 *----------------------------------------------------------- T← Add[LShift[25, 2], 1]C, Branch[FixReDispatch]; *----------------------------------------------------------- Fix: MiscTable[25], * Fix to LONG INTEGER * a: REAL ← PopLong[]; PushLong[FixToLongInteger[a]]; * Timing: 17 cycles if result is zero, * 20 if 0 < |result| < 2↑15 * 23 if 2↑15 <= |result| < 2↑31 *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, Call[FixLong]; * (Q&FTemp1),,T ← result T← T-1, ExpSign1, Branch[.+3, R odd]; * Push positive result Stack&+1← T+1; StackT← Q, IFUNext2; * Push negative result Stack&+1← NOT T, Branch[.+2, Carry']; StackT← (0S)-Q-1, IFUNext2; StackT← (0S)-Q, IFUNext2; *----------------------------------------------------------- RoundI: MiscTable[34], * Round to INTEGER * a: REAL ← PopLong[]; Push[INTEGER[RoundToLongInteger[a]]]; * Timing: 20 cycles if result is zero, * 26 if 0 < |result| < 2↑15 *----------------------------------------------------------- T← Add[LShift[27, 2], 1]C, Branch[FixReDispatch]; *----------------------------------------------------------- FixI: MiscTable[27], * Fix to INTEGER * a: REAL ← PopLong[]; Push[INTEGER[FixToLongInteger[a]]]; * Timing: 17 cycles if result is zero, * 20 if 0 < |result| < 2↑15 *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, Call[FixLong]; * (Q&FTemp1),,T ← result PD← RCY[FTemp1, T, 17]; * See if result < 2↑15 FTemp1← (FTemp1) OR T, Branch[FixIOverflow, ALU#0]; * Result < 2↑15, push result with correct sign. ExpSign1, Branch[.+2, R odd]; * Test sign StackT← T, IFUNext2; * Push positive result StackT← (0S)-T, IFUNext2; * Push negative result * Result >= 2↑15. If it is exactly 2↑15 and negative then push -2↑15; * otherwise trap. FTemp1=100000 iff magnitude=2↑15. FixIOverflow: T← LSH[ExpSign1, 17]; * 100000 if negative, 0 if positive PD← (FTemp1)#T, Branch[FixTrapALU#0]; * Know FTemp1#0; ALU=0 iff both are 100000 *----------------------------------------------------------- RoundC: MiscTable[35], * Round to CARDINAL * a: REAL ← PopLong[]; Push[CARDINAL[RoundToLongInteger[a]]]; * Timing: 20 cycles if result is zero, * 25 if 0 < result < 2↑15 * 29 if 2↑15 <= result < 2↑16 *----------------------------------------------------------- T← Add[LShift[30, 2], 1]C, Branch[FixReDispatch]; *----------------------------------------------------------- FixC: MiscTable[30], * Fix to CARDINAL * a: REAL ← PopLong[]; Push[CARDINAL[FixToLongInteger[a]]]; * Timing: 17 cycles if result is zero, * 19 if 0 < result < 2↑15 * 22 if 2↑15 <= result < 2↑16 *----------------------------------------------------------- RBase← RBase[FTemp0], StkP-2, Call[FixLong]; * (Q&FTemp1),,T ← result PD← Q, ExpSign1, Branch[.+2, R even]; * Result in [0..2↑16)? Branch[FloatTrap]; * Negative * Ok if high part 0, else trap * Push result from T if ALU=0; trap if ALU#0. FixTrapALU#0: Branch[.+2, ALU=0]; Branch[FloatTrap]; StackT← T, IFUNext2; *----------------------------------------------------------- FixReDispatch: * Re-dispatch Round opcode to the equivalent Fix opcode. * Enter: T = MT0Loc offset for desired Fix opcode (4*alpha +1). * Exit: Branches to specified place, with BDispatch← 2 pending. * This causes the Call[FixLong] to be converted into Call[RoundLong]. *----------------------------------------------------------- T← T+(BigBDispatch← T), StkP-1; * T← xxx2 T← Stack&+1, BDispatch← T, BranchExternal[MT0Loc]; *----------------------------------------------------------- FScale: MiscTable[37], * Floating Scale * scale: INTEGER ← Pop[]; a: REAL ← PopLong; PushLong[a * (2.0**scale)]; * Timing: 14 cycles if floating point argument is nonzero, 10 if zero. *----------------------------------------------------------- * Call FUnPack1 solely to filter out denormalized numbers and NaNs. StkP-2; * Leave FF free for placement T← Stack&-1, RBase← RBase[FTemp0], SCall[FUnpack1]; IFUNext0; * Scaling zero leaves number unchanged * Packed exponent is excess-127, and legal values are IN [1..376B]. T← LDF[Stack&+1, 10, 7]; * Extract exponent from packed number T← T+(Stack); * Exponent + scale PD← T-(377C), Branch[FScUnderflow, ALU=0]; * Result must be IN [1..376B] T← LSH[Stack&-1, 7], Branch[FScOverflow, Carry]; StackT← (StackT)+T, IFUNext2; * Ok, just add scale to packed result FScUnderflow: Branch[FloatTrap]; FScOverflow: Branch[FloatTrap]; *----------------------------------------------------------- FixLong: * Subroutine to fix to a LONG INTEGER. * Enter: one floating point number on stack * Exit: magnitude of LONG INTEGER (always positive) in Q,,T * FTemp1 = a copy of Q (high result) * unpacked floating point number in ExpSign1, Frac1H, Frac1L * StkP addresses low word for result (i.e., =1 if minimal stack) * Does not return but rather traps for infinity, denormalized, NaN, * or number too large for LONG INTEGER. * Caller may resume FixLong to perform rounding, by saving T in FTemp2 * and executing a Return. * Clobbers FTemp1, ShC * Timing: 13 cycles if result is zero * 15 if 0 < |result| < 2↑15 * 18 if 2↑15 <= |result| < 2↑31 *----------------------------------------------------------- * Note that FUnpack1 never skips, because .+1 is forced to be odd. FTemp1← Link, Call[FUnPack1], DispTable[3, 17, 0], Global; * At[xxx0] FTemp1← A0, Link← FTemp1, Branch[FixTestExp]; * At[xxx1] *----------------------------------------------------------- RoundLong: * Subroutine to round to a LONG INTEGER. * Never called directly, but only as an "xor 2" pair with FixLong. * Uses FixLong internally; all comments for FixLong apply here, except: * Clobbers FTemp2. * Timing: 16 cycles if result is zero * 21 if 0 < |result| < 2↑15 * 25 if 2↑15 <= |result| < 2↑31 * + [1..3] cycles if have to round up *----------------------------------------------------------- FTemp2← Link, Call[FixLong]; * At[xxx2] Fix and prepare to round Subroutine; * Following instruction restores RoundLong's return link and jumps back * into FixLong at the appropriate place. Note Link← overrides Return's * normal action of loading Link with .+1. FTemp2← T, Link← FTemp2, Return; * Save low word and complete rounding * Continuation of FixLong: FixTestExp: T← (ExpSign1)+(ExpSign1); * Move exponent to T[0:7] T← T-(LShift[177, 10]C); * T[0:7]← unbiased exponent +1 T← T-(LShift[20, 10]C), StkP-1, Branch[FixZero, ALU<0]; T← T-(LShift[20, 10]C), Branch[FixExpLs17, ALU<0]; T← T-(LShift[20, 10]C), Branch[FixExpLs37, ALU<0]; Branch[FloatTrap]; * Exponent > 36, overflow * FixLong/RoundLong (cont'd) * Unbiased exponent less than -1 (i.e., arg1 < 0.5). Just return zero. FixZero: T← Q← FTemp1, CoReturn; FixRoundRet: Q← FTemp1, Return; * If asked to round, do nothing; Q← 0 * Unbiased exponent is in [-1..16B]. T[0:7] now has exponent-37B, i.e., * in [177740..177757], which when properly positioned gives SHA=T, SHB=R. * Shift control is LCY[T, R, exponent+1] = RCY[R, T, 17B-exponent]. FixExpLs17: T← Frac1H, ShC← T; T← ShiftNoMask[FTemp1], Q← FTemp1, * T← LCY[Frac1H, 0, [0..17]], Q← 0 CoReturn; * Resume here to round. Rounding bits are in both Frac1H and Frac1L. T← Frac1L; T← ShiftNoMask[Frac1H], Branch[FixRoundTest, ALU=0]; T← T OR (1C), Branch[FixRoundTest]; * Sticky bit for discarded result * Unbiased exponent is in [17B..36B]. T[0:7] now has exponent-57B, i.e., * in [177740..177757], which when properly positioned gives SHA=T, SHB=R. * Shift control is LCY[T, R, exponent-17B] = RCY[R, T, 37B-exponent]. FixExpLs37: T← Frac1H, ShC← T; T← FTemp1← ShiftNoMask[FTemp1]; * T← LCY[Frac1H, 0, [0..17]] T← Frac1L, Q← T; T← ShiftNoMask[Frac1H], CoReturn; * T← LCY[Frac1L, Frac1H, [0..17]] * Resume here to round. Rounding bits are only in Frac1L. T← A0; T← ShiftNoMask[Frac1L]; * Now have rounding bit in T[0] and sticky bits in T[1:15]. ALU = T. FixRoundTest: PD← T-1, Branch[.+2, ALU<0]; * Rounding possible? T← FTemp2, Return; * Not possible, restore T and return * If T>100000 then round up always (ALU<0). * If T=100000 then round up iff lowest significant bit is a one (R odd). T← (FTemp2)+1, Branch[.+2, ALU<0, R odd]; T← FTemp2, Return; * Don't round, just restore T FTemp1← A← FTemp1, XorSavedCarry, Branch[FixRoundRet]; * Round up *----------------------------------------------------------- FUnPack2: * Pop and unpack two floating-point arguments. * Enter: T = top-of-stack, StkP points to TOS-1 * Exit: ExpSign2, Frac2H, Frac2L set up with argument 2 (right); ExpSign2[13:14]=00 * ExpSign1, Frac1H, Frac1L set up with argument 1 (left); ExpSign1[13:14]=10 * StkP addresses high word of arg 1 (i.e., =2 if minimal stack) * Call by: SCall[FUnPack2] * Returns +1: at least one argument is zero * +2: both arguments are nonzero * Returns only for normal numbers or true zero. * Traps if denormalized, infinity, or Not-a-Number. * Clobbers Q * Timing: 15 cycles normally, 16 if arg 2 is zero. *----------------------------------------------------------- Subroutine; ExpSign2← T AND (177600C), Global; T← RCY[T, Stack, 10]; * Garbage bit and top 15 fraction bits ExpSign2← (ExpSign2) AND (77777C), * Exponent in bits 1:8, all else zero Branch[.+2, R<0]; * Branch if negative ExpSign2← (ExpSign2)+(200C), * Positive, add 1 to exponent, B15 ← 0 DblBranch[Exp2Zero, .+2, ALU=0]; * Branch if exponent zero ExpSign2← (ExpSign2)+(201C), * Negative, add 1 to exponent, B15 ← 1 Branch[Exp2Zero, ALU=0]; * Branch if exponent zero Frac2H← T OR (100000C), * Prefix explicit "1." to fraction Branch[Exp2NaN, ALU<0]; * Branch if exponent was 377 T← LSH[Stack&-1, 10]; * Left-justify low 8 fraction bits Frac2L← T; T← Stack&-1; * Now do arg 1 ExpSign1← T AND (177600C), Branch[FUnPack1a]; Exp2Zero: Frac2H← (Stack&-1) OR T; * See if entire fraction is zero Frac2L← A0, Branch[Arg2DeNorm, ALU#0]; * Branch if not true zero TopLevel; T← Stack&-1, Q← Link, SCall[FUnPack1]; * Zero, unpack other arg Link← Q, Branch[.+2]; * Return +1 regardless of what FUnpack1 did Link← Q; Subroutine; ExpSign2← (ExpSign2) AND (1C), Return; TopLevel; Arg2DeNorm: Branch[FloatTrap]; * Denormalized number Exp2NaN: Branch[FloatTrap]; * Not-a-Number *----------------------------------------------------------- FUnPack1: * Pop and unpack one floating-point argument. * Enter: T = top-of-stack, StkP points to TOS-1 * Exit: ExpSign1, Frac1H, Frac1L set up with argument * StkP addresses high word of arg 1 (i.e., =2 if minimal stack) * Call by: SCall[FUnPack1] * Returns +1: argument is zero * +2: argument is nonzero * Returns only for normal numbers or true zero. * Traps if denormalized, infinity, or Not-a-Number. * Timing: 7 cycles normally. *----------------------------------------------------------- Subroutine; ExpSign1← T AND (177600C), Global; FUnPack1a: T← RCY[T, Stack, 10]; * Garbage bit and top 15 fraction bits ExpSign1← (ExpSign1) AND (77777C), * Exponent in bits 1:8, all else zero Branch[.+2, R<0]; * Branch if negative ExpSign1← (ExpSign1)+(204C), * Positive, add 1 to exponent, [13:14]←10, [15]←0 DblBranch[Exp1Zero, .+2, ALU=0]; * Branch if exponent zero ExpSign1← (ExpSign1)+(205C), * Negative, add 1 to exponent, [13:14]←10, [15]←1 Branch[Exp1Zero, ALU=0]; * Branch if exponent zero Frac1H← T OR (100000C), * Prefix explicit "1" to fraction Branch[Exp1NaN, ALU<0]; * Branch if exponent was 377 T← LSH[Stack&+1, 10]; * Left-justify low 8 fraction bits Frac1L← T, Return[Carry']; * Always skip (carry is always zero here) Exp1Zero: Frac1H← (Stack&+1) OR T; * See if entire fraction is zero Frac1L← A0, Branch[Arg1DeNorm, ALU#0]; * Branch if not true zero ExpSign1← (ExpSign1) AND (1C), Return; * Zero, return +1 TopLevel; Arg1DeNorm: Branch[FloatTrap]; * Denormalized number Exp1NaN: Branch[FloatTrap]; * Not-a-Number *----------------------------------------------------------- Normalize: * Normalize and re-pack floating-point result. * Enter: ExpSign1, T, Frac1L contain unpacked result * T = ALU = high fraction. * StkP addresses high word of result (i.e., =2 if minimal stack) * Timing: for nonzero result: 11 cycles minimum, +3 if need to normalize at all, * +2*(n MOD 16) if n>1, where n is the number of normalization steps, * +3 if n>15, +5 if need to round, +2 if rounding causes a carry * out of Frac1L, +1 or 2 in extremely rare cases. * For zero result: 6 cycles *----------------------------------------------------------- * See if result is already normalized or entirely zero. * Note that we want the cases of no normalization, one-step normalization, * and result entirely zero to be the fastest, since they are by far the most common. * Therefore, do the first left shift in-line while branching on the other conditions. PD← T OR (Q← Frac1L), * ALU← 0 iff entire fraction is 0 Branch[NormAlready, ALU<0]; * Branch if already normalized Frac1H← A← T, Divide, * (Frac1H,,Q) ← (T,,Q) LSH 1 Branch[NormalizeZero, ALU=0]; * Branch if fraction is zero ExpSign1← (ExpSign1)-(LShift[1, 7]C), * Subtract one from exponent Branch[NormBegin, ALU#0]; * Branch if high fraction was nonzero * If the high word of the fraction was zero, we discover that after having left-shifted * the fraction once. Effectively, left-shift the fraction 16 bits and subtract * 16D from the exponent. Actually, undo the first left shift and subtract only 15D * from the exponent, in case the first left shift moved a one into the high fraction. ExpSign1← (ExpSign1)+(LShift[1, 7]C); * Can't encode the constant I really want! ExpSign1← (ExpSign1)-(LShift[20, 7]C); Frac1H← Frac1L; * Left-shift original fraction 16 bits Q← T, Branch[NormLoop]; * Q← 0 * In this loop, the exponent is in ExpSign1[0:8] and the fraction in Frac1H ,, Q. * Left shift the fraction and decrement the exponent until the high-order bit * of the fraction is a one. NormBegin: T← A0, Frac1H, Branch[NormDone1, R<0]; * Branch if one shift was enough NormLoop: Frac1H← A← Frac1H, Divide, Branch[NormDone, R<0]; * (Frac1H,,Q) ← (Frac1H,,Q) LSH 1 ExpSign1← (ExpSign1)-(LShift[1, 7]C), Branch[NormLoop]; * When we bail out of the loop, the exponent is correct, but we have left-shifted * the fraction one too many times. Right-shift the fraction and exit. NormDone: Frac1H← (Frac1H)-T, Multiply; * (Frac1H,,Q) ← 1,,((Frac1H,,Q) RSH 1) NormDone1: Frac1L← Q, Branch[FRePackNonzero]; * Multiply dispatch pending!! NormAlready: Frac1H← T, Branch[FRePackNonzero]; * Placement (sigh) * Result was exactly zero: push +0 as answer. NormalizeZero: ExpSign1← A0, Branch[FRePackZero]; *----------------------------------------------------------- FRePackNonzero: * Re-pack nonzero floating-point result. * Enter: ExpSign1, Frac1H, Frac1L contain unpacked result, which must be normalized but * need not have its leading "1" so long as rounding can't carry into this bit. * StkP addresses high word of result (i.e., =2 if minimal stack) * Timing: 9 cycles minimum, +5 if need to round, +2 if rounding causes a carry * out of Frac1L, +1 or 2 in extremely rare cases. *----------------------------------------------------------- * Prepare to round according to Round-to-Nearest convention. Frac1L[8:15] are fraction * bits that will be rounded off; result is exact only if these bits are zero. PD← (Frac1L) AND (377C), DispTable[1, 2, 2]; FRePackNZ1: ExpSign1← (ExpSign1) OR (176C), * Put ones in all unused bits of ExpSign1 Branch[NoRounding, ALU=0]; * Inexact result. Round up if result is greater than halfway between representable * numbers, down if less than halfway. If exactly halfway, round in direction that makes * least significant bit of result zero. Adding 1 at the Frac1L[8] position causes * a carry into bit 7 iff the result is >= halfway between representable numbers. PD← (Frac1L) AND (177C); Frac1L← (Frac1L)+(200C), Branch[.+2, ALU#0]; * Exactly halfway. But we have already rounded up. * If the least significant bit was 1, it is now 0 (correct). * If it was 0, it is now 1 (incorrect). But in the latter case, no carries * have propagated beyond the least significant bit, so... Frac1L← (Frac1L) AND NOT (400C); * Just zero the bit to fix it * Now set the sticky flag and trap if appropriate. * Note we have not propagated the carry out of the low word yet, so we must * perform only logical ALU operations that don't clobber the carry flag. T← B← Frac1H, RBase← RBase[Sticky]; Sticky← (Sticky) OR (1C), Branch[InexactTrap, R<0]; T← T+1, RBase← RBase[FTemp0], * Prepare to do carry if appropriate Branch[DoneRounding, Carry']; * There was a carry out of Frac1L. Propagate it to Frac1H. * If this causes a carry out of Frac1H, the rounded fraction is exactly 2.0, which * we must normalize to 1.0; i.e., set fraction to 1.0 and increment exponent. Frac1H← T, Branch[.+2, Carry']; ExpSign1← (ExpSign1)+(LShift[1, 7]C); ExpSign1← (ExpSign1)-(LShift[2, 7]C), DblBranch[ExpOverflow, .+3, R<0]; * Done rounding. Check for exponent over/underflow, and repack and push result. DoneRounding: ExpSign1← (ExpSign1)-(LShift[2, 7]C), DblBranch[ExpOverflow, .+2, R<0]; NoRounding: ExpSign1← (ExpSign1)-(LShift[2, 7]C), * Subtract 2 from exponent Branch[ExpOverflow, R<0]; * Branch if exponent > 377B originally T← LDF[Frac1H, 7, 10], * Extract high 7 fraction bits, exclude leading 1 Branch[ExpUnderflow, ALU<0]; * Branch if exponent < 2 originally * At this point, ExpSign1[1:8] = desired exponent -1, and [9:15] = 176 if the * sign is positive, 177 if negative. Thus adding 2 (if positive) or 1 (negative) * will correctly adjust the exponent and clear out [9:15]. T← (ExpSign1)+T+1, * Merge exponent with fraction and add 1 Branch[.+2, R odd]; * Branch if negative Stack&-1← T+1, Branch[.+2]; * Positive, add 1 more to finish fixing exponent Stack&-1← T OR (100000C); * Negative, set sign bit of result T← Frac1H; * Construct low fraction T← RCY[T, Frac1L, 10]; Stack&+1← T, IFUNext0; InexactTrap: Branch[FloatTrap]; ExpUnderflow: Branch[FloatTrap]; ExpOverflow: Branch[FloatTrap]; *----------------------------------------------------------- FRePackZero: * Push a result of zero with the correct sign * Enter: ExpSign1 has correct sign * StkP addresses high word of result (i.e., =2 if minimal stack) * Timing: 3 cycles *----------------------------------------------------------- T← LSH[ExpSign1, 17]; * Slide sign to bit 0 Stack&-1← T; * Push true zero with correct sign FRePackZ2: Stack&+1← A0, IFUNext0; *----------------------------------------------------------- FloatTrap: * Trap to the software floating-point handler. * Alto mode: * Enter: OTPReg = 4*alpha or 4*alpha+1 (left over from MISC dispatch); * odd iff entry point 2 was taken * Restores StkP, puts the alpha byte in OTPReg, and traps to SD[137]. * PrincOps mode: * Invokes the Cedar OpcodeTrap mechanism *----------------------------------------------------------- TopLevel; DontKnowRBase; :If[AltoMode]; ********** Alto version ********** RBase← RBase[RTemp0]; RestoreStkP; :IfMEP; OTPReg← LDF[OTPReg, 10, 2], Branch[.+2, R even]; StkP+1; * Entry 2, adjust saved StkP :Else; OTPReg← LDF[OTPReg, 10, 2]; :EndIf; T← 137C, Branch[SavePCAndTrap]; :Else; ******** PrincOps version ******** T← A0, Branch[OpcodeTrap]; :EndIf; **********************************