:TITLE[MesaFP];	*DO MESA FLOATING POINT by Jim Boyce

%Edit by Ed Fiala 4 June 1981: Improve speed about 60 percent; bum 7
registers; expended 23b mi net. Produce 0 on underflow for Wilhelm
(non-spec) as option.
Edit by Ed Fiala 20 March 1981: put in Pilot conditionals; bum 2 registers;
save 11b mi.

Floating point numbers are represented in two different formats. Arguments
on the stack are IEEE single precision reals with the following format:
  bit 0   sign (1 means negative)
  1:8     exponent biased by 127d
          (exponent of 177b with a fraction of 0 is the integer 1;
          exponent of 0 with a fraction of 0 is true 0;
          exponent of 0 with non-0 fraction is denormalized [i.e., the
          fraction does not have an implicit "1." in front of it];
          exponent of 377b with a fraction of 0 is infinity;
          exponent of 377b with non-0 fraction is Not-A-Number)
  9:31    fraction (implicitly preceded by '1.' if non-0 and normalized.)
These numbers appear on the stack with the word holding bits 0-15 nearer to
the top of stack.

Most operations unpack the operands into a format that is easier to work
with. This format uses three words interpreted as follows:
  ExpSign  bits 1:8 are the exponent (biased by 127d), bit 15d the sign.
           NOTE: FAdd, FSub, FMul, FDiv, etc. must and do compensate for
           the addition of 1 to the exponent which occurs in the Renorm
           subroutine.
  Frac     The fraction. When unpacked, the leading one appears in Frac[1];
           when about to be packed again, the leading bit is in Frac[0],
           Frac[31d] is a StickyBit (the inclusive or of all remaining
           bits); Fix and Round handle rounding slightly differently.
  FracH    16 high order bits of Frac.
  FracL    16 low order bits of Frac.

Three registers for each of the two arguments hold a floating point number
in unpacked form. These registers have names with 1's or 2's in them.
Numbers are often referred to by index (SB or DB). Arg1 has index 2 and
Arg2 has index 3.
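
As an illustration of the two formats just described, here is a minimal
Python sketch (not part of the microcode). The names unpack_real and
float_bits are hypothetical. It assumes bit 0 is the most significant bit
of a word, and it raises an exception in the cases where the comments in
this file say the microcode traps to software.

```python
import struct


def float_bits(value):
    """Return the 32-bit IEEE single precision pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def unpack_real(bits):
    """Split a 32-bit IEEE single into (exp_sign, frac_h, frac_l).

    exp_sign holds the biased exponent in bits 1:8 and the sign in bit 15d
    (the least significant bit); the fraction's leading one lands in
    Frac[1], i.e. bit 14 counting from the least significant end of frac_h.
    """
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        raise ValueError("infinity or not-a-number: trap to software")
    if exponent == 0:
        if fraction != 0:
            raise ValueError("denormalized argument: trap to software")
        return sign, 0, 0                  # true zero keeps only its sign
    exp_sign = (exponent << 7) | sign
    frac = (1 << 30) | (fraction << 7)     # explicit leading one in Frac[1]
    return exp_sign, frac >> 16, frac & 0xFFFF
```

For example, 1.0 unpacks to an ExpSign of 37600b (exponent 177b, sign 0)
with FracH equal to 40000b and FracL equal to 0.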

The IEEE specification calls for four different rounding modes, two methods
of handling underflow/denormalized numbers (we have added a third), two
kinds of infinity. We are in the process of converting to a Real package
that records all the mode settings in the Sticky register accessible to
microcode. When this conversion is complete, microcode will optionally
implement as much of the package as it wants to, but trap when an opcode
cannot be completed in microcode. We fall short in that the Sticky register
is not saved as part of the Mesa process state. In all other situations,
microcode either completes the operation or traps to software, as indicated
below.

POSSIBLE IMPROVEMENTS:
1) Get the 1st level Misc dispatches all on one page to save 4 cycles here.
2) Put call to RoundFrac in FDiv exit saving 6 cycles.
3) Faster Renorm called from Float and subtraction.
4) Eliminate 3 cycles in zero-arg testing for all two-arg opcodes.
5) Use 31b max shift rather than 40b for Unnorm entry from FAdd/FSub;
   possibly do the Unnorm entry differently for negative FPTemp1.
6) Implement variant rounding modes, infinities, gradual denormalization,
   infinity on overflow, not-a-number on divide by zero, mod on fix/round
   overflow.
7) Get the Sticky register and ? software bits in the process state.

STICKY REGISTER
  bits    value  meaning
  0       0      OR 1 into Sticky[15d] on every inexact result
          1      Trap on any inexact result
  1-2     0      Trap on any denormalized result (user may be interested in
                 the loss of precision)
          1      Substitute 0 on underflow (non-IEEE; requested by Wilhelm)
          2      Gradually denormalize on underflow (not in microcode)
          3      --
  3       0      Projective infinity (only one unsigned infinity; compare
                 of anything with infinity traps; not sure what other
                 operations are supposed to do).
          1      Affine infinity (+ infinity and - infinity both defined;
                 arithmetic and comparisons work as expected).
  4-5     0      Round to nearest (unbiased; round to even if halfway)
          1      Round toward 0 (truncate)--not in microcode
          2      Round toward plus infinity--not in microcode
          3      Round toward minus infinity--not in microcode
  6       0      Trap if denormalized args are supplied
          1      Normalize the arguments and then use them (not in
                 microcode)
  7       0      Trap on invalid operations (compare of projective
                 infinity, Not-a-number as an argument)
          1      Result is the infinity or not-a-number (not in microcode)
  8       0      Trap on overflow of Fix or Round operation
          1      Return low-order 16d bits of the result (not in microcode)
  9       0      Trap on divide-by-zero
          1      Stuff in not-a-number on divide-by-zero and continue (not
                 in microcode)
  10d     0      Trap on arithmetic overflow
          1      Stuff in infinity on arithmetic overflow and continue (not
                 in microcode)
  11-14d  --     undefined
  15d     0      All results have been exact
          1      One or more inexact results have occurred (i.e., rounding
                 has taken place).

Software is expected to execute the FSticky opcode to exchange a new value
on the stack with the value currently in Sticky; then it executes floating
point opcodes in the new mode settings. The Sticky register should be saved
with the Mesa process state. Also, the software state of the floating point
processor includes occurrence bits for all the events which have traps:
Inexact result occurred (temporarily kept in FSticky); Denormalized
arguments were supplied; Invalid operations occurred; Fix/Round overflow
occurred; Divide by zero occurred; Arithmetic overflow occurred; Underflow
occurred. The microcode need not keep these occurrence bits because the
software can originally cause the events to trap, remember the event in
storage, and turn off the microcode trap enable.
**What about keeping the events in the process state?

In the current microcode implementation, any argument with 377b or 0b
exponent and any result which overflows will trap to software.

ROUNDING MODES

Rounding occurs at SB[FracL] bit 7 for arithmetic or at SB[FracL] bit 15d
for Fix and Round operations.
The bit following the one rounded is called the rounding bit, and all lower
bits are sticky bits. If sticky bits are all 0, the result is exact, and no
rounding occurs. Otherwise, if the InexactResult trap is enabled, trap.
Otherwise, round in the direction specified by the rounding mode. "Round to
nearest" goes in the obvious direction when the sticky bits are .ne. 0; if
rounding bit is 1 and sticky bits .eq. 0, the number is exactly halfway
between two possible results, so by convention round to the nearest even
number.
%

:IF[AltoMode]; ********************************************
*52, 44, 46, 47, 53, 64, 70, and xfXTPReg are available.
RV[ExpSign,70];	RV[FracH,54];	RV[FracL,64];
RV[ExpSign1,72];	RV[Frac1H,56];	RV[Frac1L,66];
RV[ExpSign2,73];	RV[Frac2H,57];	RV[Frac2L,67];
RV[Sticky,75];		*See page 1 comment--NOT A TEMPORARY
RV[FlCode,45];		*Fix, Round
RV[CCTemp1,51];		*Unnorm
RV[FPTemp1,71];		*Unnorm, FAdd, FSub
RV[LoopCount,51];	*FMul
RV[Frac2HLSh1,50];	*FMul
RV[Frac2HLSh1p1,45];	*FMul
RV[FMOne,71];		*FMul
RM[Frac1LL,IP[Frac2L]];	*FMul
RV[TFrac,50];		*FDiv
RV[Frac2H',51];		*FDiv
RV[Frac2H'p1,71];	*FDiv
RV[Frac2Hp1,45];	*FDiv

*456b mi including two alpha-byte dispatch locations overwritten in MesaX.
Set[FlPage0,16];	*341b mi (not pages 4 to 7)
Set[FlPage1,16];	* 34b mi
Set[FlPage2,17];	* 40b mi
Set[FlPage3,17];	* 5b mi
Set[FlPage4,17];	* 12b mi
:ELSE; ****************************************************
*17, 46, 47, 51, 53, 64, and 70 are available temporaries.
RV[ExpSign,70];	RV[FracH,54];	RV[FracL,64];
RV[ExpSign1,72];	RV[Frac1H,56];	RV[Frac1L,66];
RV[ExpSign2,73];	RV[Frac2H,57];	RV[Frac2L,67];
RV[Sticky,65];		*See page 1 comment--NOT A TEMPORARY
RV[FlCode,44];		*Fix, Round
RV[CCTemp1,71];		*Unnorm
RV[FPTemp1,45];		*Unnorm, FAdd, FSub
RV[LoopCount,71];	*FMul
RV[Frac2HLSh1,50];	*FMul
RV[Frac2HLSh1p1,45];	*FMul
RV[FMOne,44];		*FMul
RM[Frac1LL,IP[Frac2L]];	*FMul
RV[TFrac,50];		*FDiv
RV[Frac2H',71];		*FDiv
RV[Frac2H'p1,45];	*FDiv
RV[Frac2Hp1,44];	*FDiv

*456b mi placed primarily on page 1 with the rest on page 13.
Set[FlPage0,1];		*341b mi (not pages 4 to 7)
Set[FlPage1,1];		* 34b mi
Set[FlPage2,13];	* 40b mi
Set[FlPage3,13];	* 5b mi
Set[FlPage4,13];	* 12b mi
:ENDIF; ***************************************************

Loca[UnnormDisp,FlPage0,40];
Loca[UnpackRet,FlPage0,60];
Loca[RoundDisp,FlPage0,100];
Loca[FixDisp,FlPage0,120];

%Coding conventions.

Misc bytecodes with alpha in [20b to 37b] are floating point operations.
The 1st mi below is the Misc dispatch table entry for alpha in this range.
For Alto Mesa, alpha is saved in xfOTPReg, where it is the trap parameter
if the FP operation traps; for Cedar RTemp holds alpha and 400b+alpha
determines the trap dispatch. T is loaded with 30b, stored in SALUF as the
A + A + Cy0 ALU operation.

RTemp holds the Misc alpha byte at entry here; two arg opcodes dispatch on
the low four bits of RTemp not only initially but also out of Unpack2. The
various Fix/Round opcodes use FlCode to control dispatches as follows:
  10:11  FixDone dispatch (Long 0, Integer 1, Cardinal 2)
  12:15  Unpack dispatch (FAdd 0, FSub 1, FMul 2, FDiv 3, FComp 4, Fix 6,
         Round 10); two-arg opcodes dispatch on RTemp[12:15d], one-arg
         opcodes on FlCode[12:15d].

Timing from the beginning of the opcode through the dispatch table below is
16 cycles + buffer refill, where buffer refill is 4.5 cycles for Pilot
mode, 6.5 cycles for Alto mode.
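
The FlCode layout above can be checked against the constants loaded by the
Fix and Round dispatch entries (6, 26, 46, 10, 30, and 50 octal). The
following Python sketch is illustrative only; decode_flcode and the two
lookup tables are hypothetical names, and the bit positions are taken to be
decimal with bit 0 as the most significant bit of a 16-bit word.

```python
# Dispatch codes as listed in the coding conventions; keys are written in
# octal to match the listing (Round is 10 octal).
FIX_DONE = {0: "Long", 1: "Integer", 2: "Cardinal"}
UNPACK = {0o0: "FAdd", 0o1: "FSub", 0o2: "FMul", 0o3: "FDiv",
          0o4: "FComp", 0o6: "Fix", 0o10: "Round"}


def decode_flcode(flcode):
    """Return (result kind, operation) for a 16-bit FlCode value."""
    unpack_index = flcode & 0xF            # bits 12:15d, the low four bits
    fix_done_index = (flcode >> 4) & 0x3   # bits 10:11d, the next two bits
    return FIX_DONE[fix_done_index], UNPACK[unpack_index]
```

Under these assumptions 26b decodes as an Integer result from Fix (FixI)
and 50b as a Cardinal result from Round (RoundC).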
%
:IF[AltoMode]; ********************************************
*Overwrite the two MISC dispatch mi to setup the branch to this code.
Set[FlPntLoc,Add[MiscDisp0,17]];
	xfOTPReg ← T, LoadPage[FlPage0], At[MiscDisp0,1];
	Dispatch[RTemp,14,4], At[FlPntLoc];
:ELSE; ****************************************************
	LoadPage[FlPage0], At[MiscDisp0,1];
	Dispatch[RTemp,14,4];
:ENDIF; ***************************************************

OnPage[FlPage0];

*Approx. time in cycles is given in "()"
	Frac2H ← T ← 30C, Disp[.+1];
	Frac2H ← (Frac2H) + (SALUF ← T), GoTo[Unpack2], DispTable[20];	*FAdd (~157)
	Frac2H ← (Frac2H) + (SALUF ← T), GoTo[Unpack2];	*FSub (~159)
	Frac2H ← (Frac2H) + (SALUF ← T), GoTo[Unpack2];	*FMul (~389)
	Frac2H ← (Frac2H) + (SALUF ← T), GoTo[Unpack2];	*FDiv (~422)
	Frac2H ← (Frac2H) + (SALUF ← T), GoTo[Unpack2];	*FComp (~ 94)
	FlCode ← 6C, GoTo[Unpack1];			*Fix (~ 72)
	T ← Stack&-1, SALUF ← T, GoTo[Float];		*Float (~ 47+Renorm)
	FlCode ← 26C, GoTo[Unpack1];			*FixI (~ 76)
	FlCode ← 46C, GoTo[Unpack1];			*FixC (~ 70)
	T ← Stack&-1, GoTo[FSticky];			*FSticky (~ 28)
:IF[AltoMode]; ********************************************
	T ← SStkP, GoTo[FloatTrp1];			*FRem not implemented
:ELSE; ****************************************************
	TrapParm ← 0C, GoTo[FloatTrap];			*FRem not implemented
:ENDIF; ***************************************************
	FlCode ← 10C, GoTo[Unpack1];			*Round (~ 85)
	FlCode ← 30C, GoTo[Unpack1];			*RoundI (~ 89)
	FlCode ← 50C, GoTo[Unpack1];			*RoundC (~ 83)
:IF[AltoMode]; ********************************************
	T ← SStkP, GoTo[FloatTrp1];			*Undefined
	T ← SStkP, GoTo[FloatTrp1];			*Undefined
:ELSE; ****************************************************
	TrapParm ← 0C, GoTo[FloatTrap];			*Undefined
	TrapParm ← 0C, GoTo[FloatTrap];			*Undefined
:ENDIF; ***************************************************

%Unpack2 unpacks the first Real on the stack into ExpSign2, Frac2H, and
Frac2L and the 2nd into ExpSign1, Frac1H, and Frac1L and returns SB
pointing at arg1 and Frac1H in T
for FAdd, FSub, FMul, and FDiv.

Unpack1 unpacks one real on the stack into ExpSign2, Frac2H, and Frac2L and
returns SB and DB pointing at arg2 for Fix and Round.

Unpack also filters out numbers which the microcode can't handle, trapping
on unnormalized numbers, Infinities, and NotANumbers.

Timing = 29 cycles for Unpack1, 46 for 2 args (-3 per arg that is 0 and +1
per arg that is negative).
%

OnPage[FlPage0];

*Frac2H contains 60b here, where the two leading ones (i.e., bits 10:11d)
*are loaded into the word-addressing part of SB and select word 3 of an
*addressed quadword.
Unpack2:	Frac2H ← (SB ← Frac2H) - 1, Task;
	T ← LdF[Stack,11,7];
*BBFBX causes SBX←SB; the SB← here takes effect when BBFBX in the last mi of
*Unpack is executed.
	SB ← Frac2H, Frac2H ← T, BBFBX, NoRegILockOK, Call[Unpack2A];
	Frac2H ← (LSh[Frac2H,7]) or T, Call[Unpack];
	Dispatch[RTemp,14,4];
	T ← Frac1H ← (LSh[Frac1H,7]) or T, Disp[@FAdd];

Unpack1:	Frac2H ← (Frac2H) + (SALUF ← T), Task;
	SB ← Frac2H;
	DB ← Frac2H, BBFBX, Call[Unpack];	*BBFBX will cause SBX←SB
	Dispatch[FlCode,14,4];
	T ← Frac2H ← (LSh[Frac2H,7]) or T, Disp[@Fix];

*Position exponent in bits 1:8 and sign in bit 15d because:
*(a) XOR'ing signs and adding or subtracting exponents in FMul and FDiv can
*be accomplished in one operation;
*(b) Convenient overflow/underflow test in sign bit of word; and
*(c) Convenient for packing result.
Unpack:	T ← LdF[Stack,11,7];
	SB[FracH] ← T;
Unpack2A:	T ← (LdF[Stack&-1,1,17]) and not T, GoTo[UnpNeg,R<0];
	SB[ExpSign] ← T, GoTo[UnpNZ,ALU#0];
*Zero exponent may represent true 0 or denormalized number.
UnpZ:	T ← Stack&-1;
	LU ← SB[FracH] or T;
	SB[FracL] ← T, BBFBX, GoTo[DenormTrap,ALU#0];
CInexactOK:	Return;

UnpNeg:	SB[ExpSign] ← (Zero) + T + 1, GoTo[UnpZ,ALU=0];
UnpNZ:	LU ← SB[ExpSign] + (200C);
	SB[FracH] ← SB[FracH] or (200C), GoTo[NaNTrap,ALU<0];
	T ← LSh[Stack,7];
	SB[FracL] ← T, BBFBX;
	T ← RSh[Stack&-1,11], Return;

%Entries are PackA and Renorm; NormGo is entered by FAdd/FSub.
Renorm left-shifts the fraction SB[FracH/L] until SB[FracH][0] is 1 and
subtracts the number of positions shifted from the exponent. Then it rounds
SB[FracH],,SB[FracL] at SB[FracL][7] using SB[FracL][8] for rounding and
SB[FracL][9:15d] as sticky bits. Finally, it performs range checks, pushes
the number onto the stack, and exits.

NOTE: If an overflow occurs, the possible range of exponents is 377 to
376+376-200 or 377b to 574b (For FMul with 0 NormSteps). For an underflow,
the possible range is 0 or 777b down to 0-376b+177b-1; i.e., 0 or 777b down
to 600b (For FDiv).

Timing from Renorm to exit: 8 x nsteps + 29 + (1 if sticky bits .ne. 0
after normalizing, else 2 if odd or 4 if even) + (3 if FracL carries
rounding up (+5 if FracH also carries)). The number of normalizing steps
will be:
  0 for FDiv;
  0 or 1 for FMul;
  0 or 1 for FAdd/FSub if signs are the same;
  1 to 24d for FAdd/FSub if signs differ (starting at NormGo);
  0 to 15d steps for Float;
%
NormStep:	SB[ExpSign] ← SB[ExpSign] - (200C);
NormGo:	SB[FracL] ← SB[FracL] SALUFOP T, Task;
	SB[FracH] ← SB[FracH] SALUFOP T, UseCOutAsCIn;
Renorm:	T ← LSh[SB[FracH],10], GoTo[NormStep,R>=0];
Pack:	Dispatch[Sticky,4,2];	*Dispatch on rounding mode
	LU ← SB[FracL] and (177C), Disp[.+1];
*Round to nearest: Test sticky bits; begin assuming sticky bits .ne. 0
	SB[FracL] ← SB[FracL] + (200C), DblGoTo[RndEx,RndNr,ALU=0], DispTable[4];
	GoTo[FloatTrap];	*Truncate (i.e., round toward 0)
	GoTo[FloatTrap];	*Round toward plus infinity
	GoTo[FloatTrap];	*Round toward minus infinity

*Assumption correct--continue rounding.
RndNr:	SB[FracH] ← SB[FracH] + 1, UseCOutAsCIn, DblGoTo[RndUp,Inexact,Carry];

*Assumption wrong--all sticky bits are 0--fixup result.
RndEx:	Dispatch[SB[FracL],7,3], Skip[Carry'];
*If carry occurred, then the number was exactly halfway between an odd
*number and the next higher even number, and it is now correctly rounded.
	SB[FracH] ← SB[FracH] + 1, GoTo[RndUp];
	Disp[.+1];
*Was halfway and odd--now fixed up.
Inexact:	Sticky ← (Sticky) or (1C), DblGoTo[InexactTrap,InexactNoTrap,R<0], At[RoundDisp,0];
*Was exact and even--adding 200b didn't affect it.
	LU ← SB[ExpSign] + (400C), DblGoTo[OverUnderflow,Pack2,R<0], At[RoundDisp,2];
*Was halfway between even & next higher odd--undo +200b.
	SB[FracL] ← SB[FracL] - (200C), GoTo[Inexact], At[RoundDisp,4];

RndUp:	T ← LSh[SB[FracH],10], GoTo[.+3,Carry'];
*Adding 400b to SB[ExpSign] here while leaving SB[FracH] .eq. 0 works except
*when exponent is 376b--false overflow gets generated.
	SB[ExpSign] ← SB[ExpSign] + (200C);
	SB[FracH] ← 100000C;
	Sticky ← (Sticky) or (1C), GoTo[InexactTrap,R<0];
*Was exact and odd.
InexactNoTrap:
	LU ← SB[ExpSign] + (400C), DblGoTo[OverUnderflow,Pack2,R<0], At[RoundDisp,6];

Pack2:	T ← (RSh[SB[FracL],10]) or T, GoTo[Overflow,ALU<0];
	Stack&+1 ← T;	*Push low-order result word
	T ← SB[ExpSign] and not (177C), Task;
*Simultaneously offset the exponent by 1 while OR'ing in the fraction.
	T ← (LdF[SB[FracH],0,10]) + T;
*Push high-order result & exit
PackA:	T ← (LSh[SB[ExpSign],17]) or T, GoTo[FlPush];

%Improvement for Float and FAdd/FSub with different signs.
Timing: 12/tristep + (5, 12, or 12 for the exit 0, 1, or 2 step cases).
This is 3 slower for 0 steps, 1 slower for 1 step, and 7 faster for 2
steps; much faster for longer normalizations. Renorm goes 1 faster if all
places which might use more than 1 NormStep jump to LongNorm.
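
The idea behind this proposed improvement, as far as it can be read from
the draft microcode that follows, is to normalize up to three bit positions
per pass by looking at Frac[1:2] instead of shifting one position at a
time. The Python sketch below only illustrates that idea. The function
names are hypothetical, the exponent is a plain integer rather than the
ExpSign word, the Renorm exponent offset is ignored, and the guard against
a zero fraction is an added assumption.

```python
MASK32 = 0xFFFFFFFF
TOP_BIT = 0x80000000                      # Frac[0] in this file's numbering


def normalize_bitwise(frac, exponent):
    """Reference: shift left one position per step until Frac[0] is 1."""
    while frac and not frac & TOP_BIT:
        frac = (frac << 1) & MASK32
        exponent -= 1
    return frac, exponent


def normalize_tristep(frac, exponent):
    """Shift by 3, 2, or 1 positions per pass, chosen from Frac[1:2]."""
    while frac and not frac & TOP_BIT:
        next_two = (frac >> 29) & 0x3     # Frac[1:2]
        if next_two == 0:
            shift = 3                     # both 0: leading one is further down
        elif next_two == 1:
            shift = 2                     # leading one is in Frac[2]
        else:
            shift = 1                     # leading one is in Frac[1]
        frac = (frac << shift) & MASK32
        exponent -= shift
    return frac, exponent
```

Both functions return the same fraction and exponent; the tri-step version
simply needs fewer passes when the leading one is far from Frac[0].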

LongNorm:	Dispatch[SB[FracH],1,2], GoTo[PackL,R<0];
	SB[ExpSign] ← SB[ExpSign] - (400C), Disp[.+1];
Norm3:	T ← RSh[SB[FracL],15], At[x,0];
	SB[FracL] ← LSh[SB[FracL],3], Task;
	SB[FracH] ← (LSh[SB[FracH],3]) or T;
	SB[ExpSign] ← SB[ExpSign] - (200C), GoTo[LongNorm];
Norm2:	T ← RSh[SB[FracL],16], At[x,1];
	SB[FracL] ← LSh[SB[FracL],2];
	SB[FracH] ← (LSh[SB[FracH],2]) or T, GoTo[PackL];
Norm1:	SB[ExpSign] ← SB[ExpSign] + (200C), Skip, At[x,2];
	SB[ExpSign] ← SB[ExpSign] + (200C), At[x,3];
NormGo:	SB[FracL] ← SB[FracL] SALUFOP T;
	SB[FracH] ← SB[FracH] SALUFOP T;
PackL:	T ← LSh[SB[FracH],10], GoTo[Pack];
NormStep:	SB[ExpSign] ← SB[ExpSign] - (200C), GoTo[NormGo];
%

%NegFrac performs a 2's complement negation on SB[FracH],,SB[FracL] and
returns the new SB[FracL] in T.
Timing: 6 cycles if SB[FracL] is 0, else 7 cycles.
%
NegFrac:	T ← SB[FracL];
NegFrac1:	SB[FracH] ← SB[FracH] xnor (0C), Skip[ALU=0];
	T ← SB[FracL] ← (Zero) - T, Return;
	SB[FracH] ← SB[FracH] + 1, Return;

%RoundCard rounds SB[FracH],,T using MNBR[0] as rounding bit and
MNBR[1:17b] as sticky bits; T is unchanged if no rounding occurs, or
contains the new SB[FracL] if rounding occurs.
Timing: 4 if exact, 12 if sticky bits .ne. 0, 13 if sticky bits .eq. 0.
%
RoundCard:	LU ← (MNBR) SALUFOP T, Skip[ALU#0];
	Return;	*Exact
*Round up if halfway and sticky bits non-0.
	T ← (RZero) + T, UseCOutAsCIn, Skip[ALU=0];
	SB[FracH] ← SB[FracH] + 1, UseCOutAsCIn, GoTo[CInexact];
*Sticky bits all 0, rounding bit 1: round up if odd, down if even, but just
*added 1, so result is correct if SB[FracL] now even, else 1 too big.
	SB[FracH] ← SB[FracH] + 1, UseCOutAsCIn;
	T ← (Form-2[AllOnes]) and T;
CInexact:	Sticky ← (Sticky) or (1C), DblGoTo[InexactTrapA,CInexactOK,R<0];

%Unnorm right-shifts the fraction DB[FracH/L] by FPTemp1[1:8] positions
with underflow going into MNBR. MNBR[1:15] are sticky bits. Unnorm is
called on FAdd, FSub, Fix, FixI, FixC, Round, RoundI, and RoundC.
On FAdd and FSub, it right-shifts the fraction of the number with the
smaller exponent by an amount equal to the difference in the exponents, so
that the fractions can be added. On Fix or Round, it shifts the fraction so
that its exponent can be made 235b, which positions the decimal point to
the right of FracL bit 15d.

Use of MNBR to capture underflow and the 37b maximum shift is needed for
Fix/Round, which use MNBR[0] for rounding and MNBR[1:15d] as sticky bits.
FAdd/FSub worst case is when signs differ and subtraction of fractions
necessitates two normalization steps afterwards; in this case DB[FracL] bit
10d will become the rounding bit, so a maximum shift of 31b suffices, MNBR
is not needed, and underflowing bits can result in OR'ing into any of
DB[FracL] bits 11d to 15d.

Timing: 10 cycles if shift count .ge. 40b (i.e., the leading fraction bit
in DB[FracH] bit 1 is shifted further than bit 0 of MNBR); 16 if count is
0; 25 if count is 1 to 17b; 18 if count is 20b; or 27 if count is 21b to
37b.
%
Unnorm:	T ← (LdF[FPTemp1,5,4]) - 1;
	LU ← (FPTemp1) - (10000C);
	CCTemp1 ← T, BBFBX, GoTo[.+3,ALU<0];
*More than two words--any value .ls. 40000b ok here
	MNBR ← R400;
	T ← DB[FracH] ← 0C, Return;
	CycleControl ← CCTemp1, GoTo[Unnorm0,ALU<0];
*CycleControl ← RSCount - 1, so MWX ← RSCount - 1 and DBX ← 0. For WFA, this
*will left-cycle by 17b - DBX - MWX (= 20b - RSCount) and mask with 1's in
*bits DBX to MWX (= 0..(RSCount - 1)).
	CCTemp1 ← 16C;
	Dispatch[FPTemp1,4,1];
	CCTemp1 ← (CCTemp1) - T, LoadPage[FlPage3], Disp[.+1];
	MNBR ← WFA[DB[FracL]], GoToP[.+1], At[UnnormDisp,0];	*1 to 17b
OnPage[FlPage3];
	T ← WFA[DB[FracH]];
*CycleControl ← 16b - (RSCount mod 20b - 1), knowing RSCount is non-zero, or
*DBX ← 0 and MWX ← 17b - (RSCount mod 20b). For RF, this left-cycles by
*MWX+DBX+1 (= 20b - RSCount) and masks with 1's in bits 17b-MWX to 17b
*(= RSCount mod 20b to 17b).
	CycleControl ← CCTemp1;
	T ← (RF[DB[FracL]]) or T;
	DB[FracH] ← RF[DB[FracH]], Return;
*21b to 37b
	LU ← DB[FracL] - 1, LoadPage[FlPage0], GoToP[.+1], At[UnnormDisp,1];
OnPage[FlPage3];
* +1 always .eq. or (1C) here
	T ← (WFA[DB[FracH]]) + 1, UseCOutAsCIn, GoToP[.+1];
OnPage[FlPage0];
	CycleControl ← CCTemp1, CCTemp1 ← T, NoRegILockOK;
	MNBR ← CCTemp1;
	T ← RF[DB[FracH]];
Unnorm1:	DB[FracH] ← 0C, Return;

Unnorm0:	Dispatch[FPTemp1,4,1];
	T ← MNBR ← DB[FracL], Disp[.+1];
	MNBR ← RZero, Return, DispTable[2];	*No shift
	T ← DB[FracH], GoTo[Unnorm1];		*Shift exactly 1 word

%For Alto Mesa, FloatTrap et al. store alpha in OTPReg, set T to
sFloatingPoint, and go to Kfcr which traps to MesaCode to try again on this
operation. For Cedar Mesa, the parameter is 400 + alpha.

Arrive at OverUnderflow for exponents 401b to 777b with the exponent field
offset by -1. Exponents of 0 or of 777b down to 600b represent underflow;
377b to 574b represent overflow. Overflow always traps.

Gradual denormalization is not handled in microcode and traps. A microcode
implementation should sensibly handle not only denormalized results but
also denormalized arguments. For an underflowing result, microcode would
first right-shift the fraction one position because the implicit leading 1
isn't there with a zero exponent. Then it would right-shift the fraction
while counting the exponent up to zero. Rounding would take place after
shifting, and the special case in which rounding normalizes the number
would be dealt with. Unpacking would normalize the fraction while counting
the exponent to a negative value.
%
OverUnderflow:
	LU ← SB[ExpSign] - (137400C);	*Distinguish overflow and underflow
	LU ← (Sticky) and (20000C), Skip[ALU<0];
	DblGoTo[Pack0,FloatTrap,ALU#0];	*Underflow--check mode
	T ← SStkP, GoTo[FloatTrp1];	*Overflow
Overflow:	*For exponent .eq. 377b or 400b
	T ← SStkP, GoTo[FloatTrp1];
InexactTrap:	T ← SStkP, GoTo[FloatTrp1];
FixTrapA:	T ← SStkP, GoTo[FloatTrp1];
FixTrapB:	T ← SStkP, GoTo[FloatTrp1];
FixTrap:	T ← SStkP, GoTo[FloatTrp1];
NaNTrap:	T ← SStkP, GoTo[FloatTrp1];
InexactTrapA:
DenormTrap:	T ← SStkP, GoTo[FloatTrp1];
FloatTrap:	T ← SStkP;
FloatTrp1:	FPTemp1 ← T, Task;
	StkP ← FPTemp1;
:IF[AltoMode]; ********************************************
	LoadPageExternal[opPage3];
	T ← sFloatingPoint, GoToExternal[kfcrLoc];
:ELSE; ****************************************************
	LoadPage[opPage0];
	RTemp ← (RTemp) + (400C), GoToP[UndefTrap];	*400 + alpha
:ENDIF; ***************************************************

%Addition and subtraction are almost the same.
1) Unpack the arguments, negate Arg2 if FSub, check for zeroes.
2) Point DB at the argument with the smaller exponent, SB at the other,
   and call Unnorm to right-shift the DB argument by the difference in the
   exponents.
3) Add or subtract magnitudes, depending on the signs.
4) Renormalize the result, round, and push it on the stack.

Get here after T ← Frac1H; SB selects arg1.

Timing @FAdd to Renorm: 24 + (2 if FSub) + (5 if Exp1 .ls. Exp2) + (7 if
subtraction produces negative fraction) + (1 if Unnorm produces ones in
MNBR). Add Unnorm time to this. For those cases which enter Renorm at
NormGo, 4 cycles have been subtracted. Subtraction can produce a negative
result only when Exp1 .eq. Exp2; Unnorm produces a sticky bit only for
exponent differences larger than 7.

0 or 1 NormSteps will be performed after addition; 1 or 2 NormSteps will be
performed after subtraction usually; but up to 24d NormSteps might be
performed if 0.5 < arg1/arg2 < 2.0.
%
@FSub:	ExpSign2 ← (ExpSign2) xor (1C), At[UnpackRet,1];	*Complement sign
@FAdd:	LU ← (Frac2H) and T, At[UnpackRet,0];
*Simultaneously compute the exponent difference in bits 0:8 while XOR'ing
*the signs in bit 15d and setting up a 3 in bits 10:11d for SB← and DB←.
*The "1" in "61C" here prevents sign subtraction from propagating higher.
	T ← (ExpSign2) - (61C), GoTo[Add0,ALU=0];
	T ← (ExpSign1) - T, Call[FAddA];
*Return here after Unnorm. Or the Sticky bit into DB[FracL] bit 15 and
*cycle the (Sign1 xor Sign2) bit into the ALU sign for testing.
	LU ← MNBR;
	LU ← RCy[FPTemp1,1], Skip[ALU=0];
	T ← (RSh[AllOnes,17]) or T, DblGoTo[AddFrac,SubFrac,ALU<0];
	DblGoTo[AddFrac,SubFrac,ALU<0];

*FPTemp1 ← [(Exp1-Exp2) lsh 7] + 61b + S1 - S2
FAddA:	FPTemp1 ← T, Skip[ALU<0];
*Exp1 .ge. Exp2: Point SB at arg1 (it already does), DB at arg2.
	DB ← FPTemp1, GoTo[Unnorm];
*Exp1 .ls. Exp2: Point SB at arg2, DB at arg1; negate the count.
	T ← 157C;
*FPTemp1 ← [(Exp1-Exp2)' lsh 7] + 41b + S1 - S2; SB ← 3.
	FPTemp1 ← (SB ← FPTemp1) xnor T;
*FPTemp1 ← [(Exp2-Exp1) lsh 7] + 21b + S1 - S2; DB ← 2.
	FPTemp1 ← (DB ← FPTemp1) + T + 1, GoTo[Unnorm];

*The signs are equal; add the fractions. DB[FracL] is in T.
AddFrac:	SB[FracL] ← SB[FracL] + T;
	T ← DB[FracH] + 1, UseCOutAsCIn;
	SB[FracH] ← SB[FracH] + T, GoTo[Renorm];

*The signs are different; subtract the fractions. DB[FracL] is in T.
SubFrac:	SB[FracL] ← SB[FracL] - T;
	T ← DB[FracH], FreezeResult;
	SB[FracH] ← T ← (SB[FracH]) - T, UseCOutAsCIn;
	LU ← SB[FracL] or T, GoTo[SubNeg,ALU<0];
SubNonNeg:	SB[ExpSign] ← SB[ExpSign] - (200C), GoTo[NormGo,ALU#0];
	T ← Stack&+1 ← 0C, GoTo[FlPush];

*Get here only when the exponents are equal and the DB fraction is larger
*than the SB fraction. 2 to 24d NormSteps will be performed.
SubNeg:	T ← SB[FracL], Call[NegFrac1];
*Simultaneously complement the sign and subtract 1 from the exponent.
	SB[ExpSign] ← SB[ExpSign] - (177C), GoTo[NormGo];

%Multiply.
1) Unpack arguments leaving normalized fractions in
   FracH[1:15],,FracL[0:8].
2) Check for zero arguments. GoTo Mul0 if either arg denormalized; since
   Unpack has trapped for any non-0 denormalized arg, the "and" at @FMul is
   always non-0 for both args non-0.
3) Set sign and exponent of Frac1 to Exp1+Exp2-177b+1-1, where -177b
   compensates for the excess in Exp1+Exp2, +1 for the NormStep normally
   required, and -1 for the exponent offset in Renorm; altogether this is
   Exp1+Exp2-177b. Adding ExpSign1 to ExpSign2 also XOR's the sign bit for
   the result.
4) Shift Frac2 so that the non-0 bits are in Frac2H[8:15],,Frac2L[0:15];
   Frac2L is then copied into MNBR for use by the inner loops and Frac2L
   becomes the low-order product word Frac1LL.
5) Frac1H/L is left-shifted 2 positions and Frac2H is added to Frac1L; the
   leading multiplier bit is known to be 1, so the inner loop is entered
   with the multiplier lsh 1 in Frac1H/L/LL[0:25b] and the multiplicand in
   Frac1H/L/LL[30:57b].
6) Perform the shift and add loop 23d times, since the 1st of 24d steps was
   done before starting the loop. Initially, two 0 bits lie between the
   low-order multiplier and high-order product bits. Each step shifts the
   multiplier and product in Frac1H/L/LL and conditionally adds the
   multiplicand in Frac2H/MNBR. Carries may make the 2nd of the two 0's
   between the multiplier and product into a 1, but the 1st 0 will remain
   0.
7) Do 0 or 1 NormSteps at Renorm.

Timing @FMul to Renorm for non-0 args: (36 or 39) + 8*(10/zero or 19/one) +
15*(6/zero, [11 or 14]/one). Total averages 293.75 cycles.
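
The arithmetic behind steps 3, 6, and 7 can be sketched in Python as
follows. This is an illustration only: multiply_unpacked is a hypothetical
name, fractions are plain 24-bit integers with the leading one in bit 23,
exponents are plain biased integers, signs are left out, and the packing of
multiplier and product into shared registers is not modelled.

```python
BIAS = 0o177


def multiply_unpacked(exp1, frac1, exp2, frac2):
    """Return (exponent, product, norm_steps) for two normalized fractions.

    The product is a 48-bit integer left-justified so that its leading one
    is in bit 47 after at most one normalizing step.
    """
    if not (1 << 23 <= frac1 < 1 << 24 and 1 << 23 <= frac2 < 1 << 24):
        raise ValueError("fractions must be normalized 24-bit values")
    product = 0
    for bit in range(23, -1, -1):          # most significant bit first
        product <<= 1                      # shift the partial product
        if (frac1 >> bit) & 1:
            product += frac2               # conditionally add multiplicand
    exponent = exp1 + exp2 - BIAS + 1      # assumes no NormStep is needed
    norm_steps = 0
    if product < 1 << 47:                  # leading one is in bit 46
        product <<= 1
        exponent -= 1
        norm_steps = 1
    return exponent, product, norm_steps
```

With 1.5 times 1.5 (both fractions 0xC00000, both exponents 177b) the
product needs no normalizing step and the exponent becomes 200b, which
corresponds to 1.125 times 2; with 1.0 times 1.0 one step is taken and the
exponent stays at 177b.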
%
@FMul:	LU ← (Frac2H) and T, At[UnpackRet,2];
	T ← (ExpSign2) - (37400C), GoTo[Mul0,ALU=0];
	ExpSign1 ← (ExpSign1) + T, LoadPage[FlPage4];
	T ← RSh[Frac1L,16];
OnPage[FlPage4];
	Frac1H ← (LSh[Frac1H,2]) or T, Task;
	T ← LSh[Frac2H,11];
	Frac1LL ← (RSh[Frac2L,7]) or T;
*For that 1st multiplier 1
	T ← Frac2H ← RSh[Frac2H,7], Task;
	Frac1L ← (LSh[Frac1L,2]) or T;
	FMOne ← 1C;
	LU ← (MNBR ← Frac2L) SALUFOP T, Task;
	T ← (Frac2H) SALUFOP T, UseCOutAsCIn;
	Frac2HLSh1 ← T, LoadPage[FlPage2];
	Frac2HLSh1p1 ← (Zero) + T + 1;
OnPage[FlPage2];
	LoopCount ← 6C, Call[FMNoA1];	*Total of 5 + 3*LoopCount iterations

%With this unrolled loop, the exterior step runs in 10 + 9/add cycles and
interior steps in 6 + (5 or 8)/add cycles; the interior step can be
replicated 0 to 4 times, trading space for speed. Each replication costs
10d mi (+5 mi for initializing the three registers needed). LoopCount and
entry point have to be adjusted. The 1st fast loop saves an average of
2.625 cycles/bit; the 2nd saves an additional .875 cycles/bit; the 3rd
saves .4375 cycles/bit, etc. I haven't found any way of making the slow
exterior loop look more like the fast one and still get in the required
tasking.
%
	LoopCount ← (LoopCount) - 1, GoTo[MulDone,R<0];
	Frac1LL ← (Frac1LL) SALUFOP T;
	Frac1L ← (Frac1L) SALUFOP T, UseCOutAsCIn;
	Frac1H ← (Frac1H) SALUFOP T, UseCOutAsCIn, GoTo[FMAd1,R<0];
FMNoA1:	Frac1LL ← (Frac1LL) SALUFOP T;
	Frac1L ← (Frac1L) SALUFOP T, UseCOutAsCIn;
FMAdS2:	Frac1H ← (Frac1H) SALUFOP T, UseCOutAsCIn, GoTo[FMAd2,R<0];
FMNoA2:	Frac1LL ← (Frac1LL) SALUFOP T;
	Frac1L ← (Frac1L) SALUFOP T, UseCOutAsCIn;
FMAdS3:	*Additional loop replications go here
	Frac1H ← (Frac1H) SALUFOP T, UseCOutAsCIn, DblGoTo[FMAd3,FMNoA3,R<0];

FMAd1:	T ← (MNBR) SALUFOP T;
	Frac1LL ← (LSh[Frac1LL,1]) + T, Skip[R<0];
	T ← (Frac2HLSh1) + 1, UseCOutAsCIn, Skip;
	T ← (Frac2HLSh1p1) + 1, UseCOutAsCIn;
	Frac1L ← (LSh[Frac1L,1]) + T, GoTo[FMAdS2,R>=0];
**Carry is impossible for the 1st 7 iterations of the loop.
	T ← (FMOne) + 1, UseCOutAsCIn;
	Frac1H ← (LSh[Frac1H,1]) + T, GoTo[FMNoA2,R>=0];
FMAd2:	T ← (MNBR) SALUFOP T;
	Frac1LL ← (LSh[Frac1LL,1]) + T, Skip[R<0];
	T ← (Frac2HLSh1) + 1, UseCOutAsCIn, Skip;
	T ← (Frac2HLSh1p1) + 1, UseCOutAsCIn;
	Frac1L ← (LSh[Frac1L,1]) + T, GoTo[FMAdS3,R>=0];
	T ← (FMOne) + 1, UseCOutAsCIn;
	Frac1H ← (LSh[Frac1H,1]) + T, GoTo[FMNoA3,R>=0];
FMAd3:	*Additional loop replications go here
	T ← MNBR;
	Frac1LL ← (Frac1LL) + T;
	T ← Frac2H, FreezeResult;
	Frac1L ← (Frac1L) + T, UseCOutAsCIn;
FMNoA3:	Frac1H ← (Frac1H) + 1, UseCOutAsCIn, Return;

*If Frac2L ne 0, set Frac1L[15d] to control rounding.
MulDone:	LU ← (Frac1LL) - 1, LoadPage[FlPage0];
	ExpSign1 ← (ExpSign1) - (200C), GoToP[Renorm,Carry'];
OnPage[FlPage0];
	Frac1L ← (Frac1L) or (1C), GoTo[Renorm];

%Division. Arg1 is dividend, arg2 divisor; quotient in MNBR/TFrac.
1) Unpack arguments.
2) Trap if divisor is 0; else return 0 quotient if dividend is 0.
3) Copy Frac1L into TFrac so that Frac1L can accumulate the quotient.
4) Set sign and exponent of the result: Exp1-Exp2+177b-1 is the exponent if
   the initial subtract succeeds, where 177b supplies the offset and -1
   compensates for the +1 in Renorm; if the initial subtract fails, an
   additional -1 compensates for the extra quotient bit generated. The
   subtraction of the exponents also xor's the signs, but it is necessary
   to offset ExpSign1 by 2 to ensure that if arg1 is positive and arg2
   negative, the carry from subtracting the signs won't affect the exponent
   calculation.
5) Compute the variations on Frac2H needed for the inner loop.
6) Initialize Frac1L to 1. On the 1st iteration, a 1 will be produced if
   dividend fraction is larger than divisor fraction; if so, 25d iterations
   produce a 24d bit result and an extra bit for rounding. If the 1st
   iteration produces a 0, a 26th iteration is required, and Frac1L is
   reinitialized to 1.
   The 1st inner loop exit occurs when the leading 1 has migrated out of
   the sign bit; the 16d bits in Frac1L are copied into MNBR, and Frac1L is
   reinitialized to 200b; the inner loop is then repeated 9d more times.
   After the 2nd inner loop exit, 25d quotient bits are in MNBR/Frac1L; the
   remainder serves as a sticky bit. The 1st inner loop exit is
   distinguished from the 2nd by the sign of Frac1L.
7) The shift and add-or-subtract loop is primed with an initial
   subtraction; then each step combines shifting the dividend with adding
   or subtracting the divisor according to the quotient bit produced by the
   previous step.
8) Undo the last subtraction if the right-most result bit is a 1, so that
   the remainder will correctly be 0 or non-0 for rounding.
9) Copy MNBR into Frac1H and OR into Frac1L for rounding if remainder
   non-0.
10) No normalization is required; round and push result.

Timing @FDiv to Renorm: (52 or 53) + 24*(10 to 12) + (12 or 13 if dividend
< divisor) + (3 if 25th bit is 0) + (3 if final remainder .ne. 0).
%
@FDiv:	LU ← (Frac2H) and T, At[UnpackRet,3];
*Go if either arg denormalized, since denormalized implies 0 because Unpack
*will have trapped for any non-zero denormalized argument.
	T ← (Frac2H) xnor (0C), Skip[ALU#0];
	T ← ExpSign2, GoTo[Div0];
	Frac2H' ← T, Task;
	Frac2Hp1 ← (Zero) - T;
	Frac2H'p1 ← (Zero) + T + 1;
*Have to open-code the first subtraction.
	T ← Frac2L;
	T ← (Frac1L) - T, LoadPage[FlPage1];
	TFrac ← T, FreezeResult;
OnPage[FlPage1];
	T ← (Frac2H') + 1, UseCOutAsCIn;
	Frac1H ← (Frac1H) + T, Call[DivStart];
*Loop here 10 to 12 cycles
	Frac1H ← (LSh[Frac1H,1]) + T;
*Shift in quotient bit
	Frac1L ← (Frac1L) SALUFOP T, UseCOutAsCIn, GoTo[DivSub,Carry];
*Shift the remainder left 1 bit and add the divisor.
DivAdd:	T ← Frac2L, FreezeResult, GoTo[DivAddLast,Carry];
DivAdR:	TFrac ← (LSh[TFrac,1]) + T, Skip[R<0];
	T ← (Frac2H) + 1, UseCOutAsCIn, Return;
	T ← (Frac2Hp1) + 1, UseCOutAsCIn, Return;
*Shift the remainder left 1 bit and subtract the divisor.
DivSub:	T ← Frac2L, FreezeResult, GoTo[DivSubLast,Carry];
DivSbR:	TFrac ← (LSh[TFrac,1]) - T, Skip[R<0];
	T ← (Frac2H') + 1, UseCOutAsCIn, Return;
	T ← (Frac2H'p1) + 1, UseCOutAsCIn, Return;

*Initialize Frac1L so the sign will become 1 on the final iteration;
*a 1 will then be shifted into the sign for a normalized result.
DivStart:	T ← ExpSign2 ← (ExpSign2) - (37400C), GoTo[DivExtraStep,Carry'];
	ExpSign1 ← (ExpSign1) - T;
	Frac1L ← 3C, GoTo[DivSub];	*Initial subtract succeeded
*Initial subtract failed--extra iteration and exponent is one smaller.
DivExtraStep:	T ← (ExpSign2) + (200C);
	ExpSign1 ← (ExpSign1) - T;
	Frac1L ← 1C, GoTo[DivAdd];
*Get to DivAddLast or DivSubLast once after the result bits for the
*high quotient word are accumulated and again after the final iteration.
DivAddLast:	ExpSign1 ← (ExpSign1) + 1, DblGoTo[FDivX,.+2,ALU>=0];
DivSubLast:	ExpSign1 ← (ExpSign1) + 1, GoTo[FDivX,ALU>=0];
	MNBR ← Frac1L, Skip[R Odd];
	Frac1L ← 200C, GoTo[DivAdR];
	Frac1L ← 200C, GoTo[DivSbR];
*Really done; position the fraction for normalization and rounding.
FDivX:	Frac1L ← LSh[Frac1L,7], Skip[R Even];
	T ← Frac1H, LoadPage[FlPage0], GoTo[DivMore1];
	TFrac ← (TFrac) + T;
	T ← (Frac2H) + 1, UseCOutAsCIn;
	T ← (Frac1H) + T, LoadPage[FlPage0];
DivMore1:	LU ← (TFrac) or T;
OnPage[FlPage0];
	T ← MNBR, Skip[ALU=0];
	Frac1L ← (Frac1L) or (1C);
	Frac1H ← T, GoTo[Renorm];

*ZERO ARGUMENTS.

%FAdd/FSub: Get to Add0 knowing that (Frac1H) and (Frac2H) .eq. 0; since
both args were either normalized or zero if passed by Unpack, one/both
fractions must be zero. If just one is 0, return the other. If both are 0,
return 0 (negative 0 if both args are negative 0). SB points at arg1.
%
Add0:	LU ← Frac1H;
	LU ← Frac2H, Skip[ALU=0];
	Stack&+2, GoTo[NormReturn];	*Frac1H .ne. 0, so Frac2H .eq. 0
	T ← ExpSign2, Skip[ALU#0];
*Frac1H and Frac2H both 0.
	ExpSign1 ← (ExpSign1) and T, GoTo[Pack0];
*The following would work except that on FSub the sign must be complemented:
*	Stack&+3;
*	Stack&-2 ← Stack&-2;
*	Stack&+3;
*	Stack&-2 ← Stack&-2, GoTo[NormReturn];
	FPTemp1 ← 60C;	*Frac2H .ne. 0
	SB ← FPTemp1;
	BBFBX, GoTo[Renorm];

%Divide. Trap if divisor (Frac2) is 0; else dividend (Frac1) is 0, so
return 0 with sign the xor of the dividend and divisor signs. Since Unpack
has already trapped for all denormalized args except zeroes, the checks for
0 look at FracH bit 1. T contains ExpSign2.

Multiply. Return 0 with sign equal to the xor of the multiplier and
multiplicand signs. T contains ExpSign2 - 40000b.
%
Div0:	LU ← (Frac2H) and (40000C);
Mul0:	ExpSign1 ← (ExpSign1) xor T, GoTo[FloatTrap,ALU=0];
Pack0:	T ← Stack&+1 ← 0C, GoTo[PackA];

%Floating Comparison returns INTEGER -1 if arg1 .ls. arg2, 0 if
arg1 .eq. arg2, or 1 if arg1 .gr. arg2.

Timing @FComp to completion: 15 or 16 cycles if .eq., 27 to 30 cycles
if .ne.

**Unpacking apparently serves only to eliminate not-a-numbers and denormalized
**numbers from testing.
%
@FComp:	T ← LSh[ExpSign1,17], Task, At[UnpackRet,4];
	Stack&+3;
	LU ← (LSh[ExpSign2,17]) xor T;
	T ← Stack&-2, GoTo[CompDiffSign,ALU<0];
	LU ← (Stack&+3) - T;	*LU ← arg1Lo - arg2Lo
	T ← Stack&-2, FreezeResult;
*T ← arg1Hi - arg2Hi
	T ← (Stack&-1) - T, UseCOutAsCIn, GoTo[CompNon0Test,ALU#0];
Comp0Test:	Stack ← T, FreezeResult, GoTo[NormReturn,ALU=0];
	ExpSign2 ← (ExpSign2) + 1, DblGoTo[.+3,.+2,Carry'];
CompNon0Test:	ExpSign2 ← (ExpSign2) + 1, Skip[Carry'];
	ExpSign1, DblGoTo[CompL,CompG,R Odd];
	ExpSign2, DblGoTo[CompL,CompG,R Odd];
CompG:	Stack ← 1C, GoTo[NormReturn];
CompL:	Stack ← (Zero) - 1, GoTo[NormReturn];

NormReturn:	LU ← NextInst[IBuf];
	NIRet;

*If the signs differ, it is nevertheless possible that both fractions are 0
*(positive and negative 0). When the signs differ, we don't care which way
*the Carry branch condition goes at Comp0Test+1 (???).
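%Worked examples of the FComp result (illustrative only; the bit patterns
are standard IEEE single precision written as high word/low word in octal
and are not taken from this source):
  arg1 = +1.0 (037600b/0), arg2 = +2.0 (040000b/0): the signs agree and
	both are positive; arg1 has the smaller bit pattern, so the result
	is -1.
  arg1 = -1.0 (137600b/0), arg2 = -2.0 (140000b/0): the signs agree and
	both are negative; arg1 has the smaller bit pattern, but for
	negative numbers that makes it the larger value, so the result is 1.
  arg1 = +0 (0/0), arg2 = -0 (100000b/0): the signs differ but both
	fractions are 0, so the result is 0.
  arg1 = -1.0, arg2 = +2.0: the signs differ and the fractions are not
	both 0, so the negative argument is the smaller; the result is -1.
%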
CompDiffSign:	T ← Frac1H;
	T ← (Frac2H) or T, GoTo[Comp0Test];

%Float LONG INTEGER to REAL
1) Copy the argument into Frac1
2) Check sign, and negate Frac1 if negative.
3) Avoid 16d of NormSteps if 1st word is 0; special case entirely 0.
4) Set exponent and Renormalize.

Timing Float to Renorm: 21 cycles if positive, 28 if negative; +4 if 1st
word insignificant.
%
Float:	Frac1H ← 40C;
	SB ← Frac1H, Frac1H ← T, NoRegILockOK;
	ExpSign1 ← T, BBFBX;	*Start building sign
	T ← Stack&-1, Call[FloatA];
*Have Frac1L in T from either FloatA or NegFrac.
FloatB:	ExpSign1 ← (ExpSign1) - (200C);
	LU ← Frac1H;
	ExpSign1 ← (ExpSign1) + (43400C), Skip[ALU=0];
	ExpSign1 ← (ExpSign1) + (4000C), GoTo[Renorm];
	Frac1H ← T;
	Frac1L ← 0C, Skip[ALU=0];
	T ← LSh[Frac1H,10], DblGoTo[Pack,NormStep,R<0];
	T ← Stack&+1 ← 0C, GoTo[FlPush];

FloatA:	Frac1L ← T;
	ExpSign1 ← RSh[ExpSign1,17], DblGoTo[NegFrac,FloatB,R<0];

%Fix or Round REAL to LONG INTEGER, INTEGER, or CARDINAL.
1) Unpack the argument.
2) Unnormalize Frac1 so that the decimal point is positioned to the right
   of Frac1L bit 15d.
3) Round, if appropriate (controlled by FlCode)
4) Negate the fraction if sign is negative.
5) Several things depending on the result type (controlled by FlCode)
   LONG INTEGER: Push double-precision result.
   INTEGER: Trap if positive or negative number is larger than 15d bits,
	else push single-precision result.
   CARDINAL: Trap if fraction word 0 is non-zero, else push
	single-precision result.

Timing from @Fix to exit (not including Unnorm):
   Fix  21 if 0, 23 if .gr. 0, 29 if .ls. 0;
   FixC 19 if 0, 21 if .gr. 0, 27 if .ls. 0;
   FixI 24 if 0, 26 if .gr. 0, 33 if .ls. 0;
Add 2 + RoundCard time for the analogous Round operations.
%
*Computation is 236b - Exp1 (the low order +1 prevents sign subtraction
*from propagating into the exponent field and inverts the branch condition).
*GoTo FixTrap if exponent .gr. 235b; in this case the number is .ge. 2↑31d
*(31d bits hold all numbers .ls. 2↑31d).
FixRA:	SB[ExpSign] ← T ← SB[ExpSign] - (200C);
	LU ← SB[FracH], GoTo[FixTrap,ALU>=0];
	FPTemp1 ← (Zero) - T, GoTo[Unnorm,ALU#0];
	T ← SB[FracL], GoTo[FrPos];	*Fraction is 0

@Fix:	SB[ExpSign] ← SB[ExpSign] - (47000C) - 1, Call[FixRA], At[UnpackRet,6];
*SB[FracL] is in T here.
	SB[ExpSign], DblGoTo[FRPos,FRNeg,R Odd];

@Round:	SB[ExpSign] ← SB[ExpSign] - (47000C) - 1, Call[FixRA], At[UnpackRet,10];
*SB[FracL] is in T here.
	LU ← MNBR, Call[RoundCard];
*SB[FracL] is still in T.
	SB[ExpSign], Skip[R Odd];
FRNeg:	Call[NegFrac1];
FRPos:	Dispatch[FlCode,12,2];
	LU ← SB[FracH], Disp[.+1];

	GoTo[FixTrapA,ALU#0], At[FixDisp,2];	*Cardinal
FlPush:	LU ← NextInst[IBuf];
	Stack&+1 ← T, NIRet;

	SB[ExpSign], FreezeResult, GoTo[.+3,R Even], At[FixDisp,1];	*Integer
	LU ← T, GoTo[FixTrapB,ALU#0];
	DblGoTo[FixTrapA,FlPush,ALU<0];
	LU ← SB[FracH] xnor (0C);
	LU ← (LSh[AllOnes,17]) xor T, DblGoTo[FixTrapB,.-2,ALU#0];

	Stack&+1 ← T, At[FixDisp,0];	*Long integer
	T ← SB[FracH], GoTo[FlPush];

%FSticky is used to read and set the InexactResultTrapEnable and the
InexactResult Bit. It swaps Stack and Sticky.
%
FSticky:	MNBR ← Sticky, Sticky ← T, NoRegILockOK;
	T ← MNBR, GoTo[FlPush];

*For buffer refill traps.
	PFetch4[PCB,IBuf,4], GoToExternal[MesaRefillLoc], At[LShift[FlPage0,10],377];

:END[MesaFP];