DragonFP.tioga
<<Save Using - Copy /INDIGO/Dragon/Documentation/ _ DragonFP.tioga>>
<<Last Edited by Curry - December 13, 1984 1:01:24 pm PST>>

Floating Point Instructions

Several conditions have contributed toward a change in the floating point control scheme:
A desire to reduce the size of the Instruction Decode PLA
New multicycle floating point instructions (single and double precision divide)
Newly available chained floating point operations
Possible future instruction timing problems
Opportunity to use the EU stack as a source of mode changes
The proposed changes cost two more 3 byte opcodes, four floating point shadow registers in the EU, and some floating point flags 
added to the IFUStatus bits.  In addition, when floating point instructions are active, a process switch may have to save and 
restore up to all four of the floating point shadow registers (just as carry and the field register must be saved and 
restored).

Although each floating point device (alu and multiplier) has on-board double word registers for both the A and B operands, the 
proposed scheme only allows the A operand registers to be used to hold values between instructions.  Interprocess integrity is 
preserved by maintaining shadow registers in the EU (FpAluM, FpAluL, FpMultM, FpMultL) and by keeping flags describing the way in 
which the registers for both devices were loaded last (single, double, integer, clear).  Clearing these flags indicates the 
registers are no longer important and need not be maintained during process switches.
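
The bookkeeping described above can be sketched in a few lines of Python.  This is only an illustration: it assumes that the number of shadow words worth saving for a device equals the number of words last loaded into its A operand (one for single or integer, two for double, none for clear).  The names WORDS_LOADED and shadow_words_to_save are hypothetical.

```python
# Assumption: words to save per device = words last loaded into its A operand.
WORDS_LOADED = {"clear": 0, "single": 1, "integer": 1, "double": 2}

def shadow_words_to_save(fp_alu_a_reg, fp_mult_a_reg):
    """Count the shadow words a process switch would preserve, given both flags."""
    return WORDS_LOADED[fp_alu_a_reg] + WORDS_LOADED[fp_mult_a_reg]

if __name__ == "__main__":
    print(shadow_words_to_save("double", "clear"))
```

With both devices holding doubles the count reaches four, which matches the "up to all four" shadow registers mentioned earlier.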


SFP alpha - Store FP - is used to load the A operand registers for either the floating point ALU or multiplier (as well as their EU 
shadows) from the EU Stack and to set the appropriate IFUStatus flags.  SFP is also used to set the mode in the floating point 
devices.
alpha[0]
0    =>    ALU
1    =>    Multiplier
alpha[1]
0    =>    Store A Operand
1    =>    Execute Floating Point modeControlOperator
alpha[6..7]
3    =>    A operand is integer    - stores one word  from stack
2    =>    A operand is double        - stores two words from stack
1    =>    A operand is single        - stores one word  from stack
0    =>    A operand is clear        - clears flags - indicates end of FP operations for the device
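
As a reading aid, the alpha fields above can be decoded with the following Python sketch.  It assumes that bit 0 is the most significant bit of the byte, which is inferred from the SFP 11000000 example in this document rather than stated outright.  The helper names bits and decode_sfp_alpha are hypothetical.

```python
# Assumption: bit 0 is the most significant bit of the byte.
def bits(value, first, last, width=8):
    """Return field value[first..last], numbering bits from the MSB as bit 0."""
    shift = width - 1 - last
    mask = (1 << (last - first + 1)) - 1
    return (value >> shift) & mask

A_OPERAND_KIND = {0: "clear", 1: "single", 2: "double", 3: "integer"}

def decode_sfp_alpha(alpha):
    """Split the SFP alpha byte into (device, action, A operand kind)."""
    device = "Multiplier" if bits(alpha, 0, 0) else "ALU"
    action = "execute modeControlOperator" if bits(alpha, 1, 1) else "store A operand"
    kind = A_OPERAND_KIND[bits(alpha, 6, 7)]
    return device, action, kind

if __name__ == "__main__":
    print(decode_sfp_alpha(0b00000010))
```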

Floating Point Mode
Both the ALU and multiplier devices have 16 bits of mode state.   This mode state must be changed 4 bits at a time using the SFP 
instruction where the top of the EU stack contains a modeControlOperator described below.  The IFU register FPMode maintains a 
shadow copy of the mode state of both devices (16 multiplier mode bits at the high end).  Currently, only one of the four nybls for 
each device need be manipulated by the 'user'.  The following table describes the types of mode information:
User mode:
Floating point rounding    {nearest, zero, plusInfinity, minusInfinity}
Fixed point rounding         {fpRounding, zero}
Fast mode                    {Fast, IEEE}
Mode fixed by implementation:
Flowthrough Timing        => 1
Accumulation Timing        ALU => 0,        Multiplier => 1

modeControlOperator:
If alpha[1]=1 then modeControlOperator is taken from stack
modeControlOperator[ 0..23] = 8 (funny control bit)
modeControlOperator[26..27] = modeNyblIndex
modeControlOperator[28..31] = modeNybl
modeNyblIndex=1
modeNybl[0..1] = Mult/Div accumulation rate timer control
00    => 1 cycle  for ALU @ 100ns cycle
01    => 2 cycles for Mult @ 100ns cycle
modeNyblIndex=2
modeNybl[0..2] = Flowthrough timer control
001    => 2 cycles for both devices @ 100ns cycle
modeNyblIndex=3=User Nybl
modeNybl[0..1] = Floating point rounding mode
00    Round toward nearest
01    Round toward zero
10    Round toward Positive infinity
11    Round toward negative infinity
modeNybl[2] = Fixed point rounding mode
0    Round according to Floating point rounding mode
1    Round toward zero
modeNybl[3] = Fast mode
0    IEEE mode
1    Flush denormalized operands and results to zero

For example, to set the user nybl in the multiplier for
Floating Point    -> Round toward Positive infinity,
Fixed                -> Round according to Floating point rounding mode,
Fast mode        -> IEEE
One would:
LIDB    00111000 00001000    (remember [S+1] _ 0 0 beta alpha, S _ S+1)
SFP    11000000
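
The example above can be checked mechanically.  The Python sketch below packs a modeControlOperator from the field layout given earlier and recovers the two LIDB bytes.  It assumes bit 0 is the most significant bit of the 32 bit word and leaves the unspecified bits [24..25] at zero.  The function and constant names are hypothetical.

```python
ROUND_NEAREST, ROUND_ZERO, ROUND_PLUS_INF, ROUND_MINUS_INF = 0, 1, 2, 3

def user_nybl(fp_rounding, fixed_round_zero, fast):
    """Pack the user nybl: [0..1] FP rounding, [2] fixed point rounding, [3] fast mode."""
    return (fp_rounding << 2) | (int(fixed_round_zero) << 1) | int(fast)

def mode_control_operator(nybl_index, nybl):
    """Build the 32 bit word: [0..23] = 8, [26..27] = index, [28..31] = nybl."""
    assert 0 <= nybl_index <= 3 and 0 <= nybl <= 15
    return (8 << 8) | (nybl_index << 4) | nybl

# User nybl (index 3): round toward positive infinity, fixed follows FP, IEEE mode.
word = mode_control_operator(3, user_nybl(ROUND_PLUS_INF, False, False))

# LIDB pushes the word 0 0 beta alpha, so alpha is the low byte and beta the next.
alpha, beta = word & 0xFF, (word >> 8) & 0xFF

if __name__ == "__main__":
    print("LIDB", format(alpha, "08b"), format(beta, "08b"))
```

The two bytes come out as 00111000 and 00001000, the LIDB operands shown in the example.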

The shadow registers may be read or written without side effects using LIP and SIP.


FLIP/FLOP alpha beta - Floating point operation
FLIP    => put result In device.
FLOP    => put result On stack.

The FLIP and FLOP instructions specify the FP operation.  They also specify the type of B operand (always located on the EU stack) 
and the destination of the result.
The B operand can be:
none
single
double
The result of the operation can go in one of three places:
top of the EU Stack                        (FLOP)
the A operand of the FP alu            (FLIP)
the A operand of the FP multiplier    (FLIP)
FLIP and FLOP instructions also contain a 6 bit field which allows the number of wait cycles to be specified.
  
FLIP/FLOP alpha beta
alpha[0]
0    =>    ALU operation
1    =>    Multiplier operation
alpha[1]
0    =>    FLIP => put result In ALU,  FLOP => put result On stack
1    =>    FLIP => put result In Mult, FLOP => put result On stack
alpha[2..7]
6 bit function code
beta[0..1]
0    =>    no B operand
1    =>    B operand is single
2    =>    B operand is double
beta[2..7] = 58 - (number of wait cycles)
This field helps to reduce the size of the decode PLA by precomputing an internal microcycle branch.  So far there are exactly 4 
values depending on the function code.
double divide    => 2
single divide        => 31
double multiply    => 54
otherwise            => 56
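
A minimal Python sketch of packing the alpha and beta bytes from the fields above follows.  It assumes bit 0 is the most significant bit of each byte.  The operation names used to look up the beta[2..7] value are hypothetical labels for the four cases listed; how a function code maps to one of those cases is not modelled here.

```python
B_OPERAND = {"none": 0, "single": 1, "double": 2}
WAIT_FIELD = {"double divide": 2, "single divide": 31, "double multiply": 54}
OTHERWISE = 56

def wait_field(operation):
    """Return the tabulated beta[2..7] value for a (hypothetical) operation label."""
    return WAIT_FIELD.get(operation, OTHERWISE)

def encode_flip_flop(multiplier_op, result_in_mult, function_code, b_operand, operation):
    """Pack (alpha, beta) for FLIP/FLOP; result_in_mult only matters for FLIP."""
    assert 0 <= function_code < 64
    alpha = (int(multiplier_op) << 7) | (int(result_in_mult) << 6) | function_code
    beta = (B_OPERAND[b_operand] << 6) | wait_field(operation)
    return alpha, beta

if __name__ == "__main__":
    a, b = encode_flip_flop(True, True, 0b000001, "double", "double multiply")
    print(format(a, "08b"), format(b, "08b"))
```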


FPMaskStatus register
A 4 bit status code is returned at the end of each FLIP or FLOP instruction.  The code selects one of the 16 bits in the status field 
of the FPMaskStatus register, and that bit is ORed in.  Because the first 3 bits of the status field represent the result of compare 
operations, they are always cleared prior to this update.  A floating point fault is generated if the bit corresponding to the 
status code in the 16 bit mask of the FPMaskStatus register is not set.  The FPMaskStatus register may be read or written using LIP 
and SIP.  The Weitek 1164/1165 documentation contains extensive descriptions of how various status codes are generated.  The code 
value, priority and description are as follows (10 is the highest priority status):

Code    Priority    Description

0000        1        Result was +/- zero            exact  (compare equal)
0001        1        Result was +/- infinity        exact  (compare less than)
0010        1        Result was finite and # zero    exact  (compare gr than)
0011        1        Result was finite and # zero    inexact

0101        2        Overflow and inexact
0110        3        UnderFlow
0111        4        UnderFlow and inexact

1000        5        Operand  A        Denormalized
1001        5        Operand      B    Denormalized
1010        6        Operands A+B    Denormalized
1011        7        Divide by zero

1100        9        Operand  A        NaN
1101        9        Operand      B    NaN
1110        10        Operands A+B    NaN
1111        8        Invalid operation
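
The status update rule can be sketched as follows in Python.  The layout of the FPMaskStatus register is not given here, so the sketch models the status field and the mask as two separate 16 bit integers and assumes that status code 0 selects the most significant bit.  Both choices are assumptions, and the function names are hypothetical.

```python
def status_bit(code):
    """Bit selected by a 4 bit status code; code 0 is taken as the MSB (assumption)."""
    assert 0 <= code <= 15
    return 1 << (15 - code)

# The first 3 bits (codes 0000, 0001, 0010) double as compare results.
COMPARE_BITS = status_bit(0) | status_bit(1) | status_bit(2)

def update_status(status, mask, code):
    """Return (new status field, fault) after a FLIP or FLOP returns `code`."""
    bit = status_bit(code)
    new_status = (status & ~COMPARE_BITS & 0xFFFF) | bit
    fault = (mask & bit) == 0  # fault when the mask bit for this code is not set
    return new_status, fault

if __name__ == "__main__":
    print(update_status(0x8010, 0x0000, 0b0001))
```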


IFUStatus register
The IFUStatus register contains the following state
Traps            {enabled,  disabled}
Reschedule    {clear,     waiting}
Mode            {kernel,   user}
FPAluAreg    {clear, single, double, integer}
FPMultAreg    {clear, single, double, integer}


Floating Point Functions
Double precision operands are stacked with the MSW pushed first.
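
To illustrate the stacking order, the Python sketch below splits a double into two 32 bit words and pushes the MSW first, so the LSW ends up on top of the stack.  It assumes the operand uses the IEEE 754 double format and models the EU stack as a plain list; the name push_double is hypothetical.

```python
import struct

def push_double(stack, value):
    """Push an IEEE double as two 32 bit words, MSW first (LSW ends on top)."""
    msw, lsw = struct.unpack(">II", struct.pack(">d", value))
    stack.append(msw)
    stack.append(lsw)
    return stack

if __name__ == "__main__":
    print([format(w, "08X") for w in push_double([], 1.0)])
```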

Interpretation of symbols in following function tables:
I32 = integer
F** = F32 or F64 - floating point number
W** = W32 or W64 -           wrapped FP num
U** = U32 or U64 - unrounded wrapped FP num
*32 = F32 or U32 - result may be too close to zero
*64 = F64 or U64 - result may be too close to zero

Multiplication

xxx 000   F32/U32 _     F32 * F32          000 xxx      A   *   B
xxx 001   F64/U64 _ F32/F64 * F32/F64      001 xxx     |A|  *   B
xxx 010   F32/U32 _     W32 * F32          010 xxx      A   *  |B|
xxx 011   F64/U64 _ W32/W64 * F32/F64      011 xxx     |A|  *  |B|
xxx 100   F32/U32 _     F32 * W32          100 xxx    - A   *   B
xxx 101   F64/U64 _ F32/F64 * W32/W64      101 xxx    -|A|  *   B
xxx 110   F32/U32 _     W32 * W32          110 xxx    - A   *  |B|
xxx 111   F64/U64 _ W32/W64 * W32/W64      111 xxx    -|A|  *  |B|
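
One way to read the multiplication table is as two independent 3 bit halves of the 6 bit function code.  The Python sketch below assumes the "000 xxx" column gives the high three bits (negate, |B|, |A|) and the "xxx 000" column gives the low three bits (B wrapped, A wrapped, 64 bit result).  That split is an interpretation of the table layout, and the function name is hypothetical.

```python
def decode_multiply_function(code):
    """Decode a 6 bit multiplier function code into its six control flags."""
    assert 0 <= code < 64
    sign_bits, type_bits = code >> 3, code & 0b111
    return {
        "negate": bool(sign_bits & 0b100),
        "abs_b": bool(sign_bits & 0b010),
        "abs_a": bool(sign_bits & 0b001),
        "b_wrapped": bool(type_bits & 0b100),
        "a_wrapped": bool(type_bits & 0b010),
        "double_result": bool(type_bits & 0b001),
    }

if __name__ == "__main__":
    # 101 011: -|A| * B with a wrapped A operand and a 64 bit result.
    print(decode_multiply_function(0b101011))
```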

ALU

ALU-Subtract             ALU-Compare (status only)

00 000 0    *32 _ F32 - F32        10 000 0     F32  -  F32
00 000 1    *64 _ F64 - F64        10 000 1     F**  -  F**
00 001 0    *32 _|F32 - F32|       10 001 0
00 001 1    *64 _|F64 - F64|       10 001 1

00 010 0                           10 010 0    |F32| - |F32|
00 010 1                           10 010 1    |F**| - |F**|
00 011 0    *32 _ F32  / F32       10 011 0
00 011 1    *64 _ F**  / F**       10 011 1

00 100 0    *32 _-F32 + 0          10 100 0     F32   -  0
00 100 1    *64 _-F64 + 0          10 100 1     F**   -  0
00 101 0                           10 101 0
00 101 1                           10 101 1

00 110 0    *32 _-F32 + 0          10 110 0
00 110 1    *64 _-F64 + 0          10 110 1
00 111 0    *32 _ W32  / F32       10 111 0
00 111 1    *64 _ W**  / F**       10 111 1

ALU-Add                   ALU-Convert

01 000 0    *32 _ F32 + F32        11 000 0     U32 to F32 (exact)
01 000 1    *64 _ F64 + F64        11 000 1     U64 to F64 (exact)
01 001 0    *32 _|F32 + F32|       11 001 0     D32 to W32
01 001 1    *64 _|F64 + F64|       11 001 1     D64 to W64

01 010 0    *32 _|F32|+ |F32|      11 010 0     U32 to F32 (inexact)
01 010 1    *64 _|F64|+ |F64|      11 010 1     U64 to F64 (inexact)
01 011 0    *32 _ F32  / W32       11 011 0
01 011 1    *64 _ F**  / W**       11 011 1

01 100 0    *32 _ F32 + 0          11 100 0     F** to  I32
01 100 1    *64 _ F64 + 0          11 100 1     F** to  I32
01 101 0                           11 101 0     I32 to  F32
01 101 1                           11 101 1     I32 to  F64

01 110 0    *32 _|F32| + 0         11 110 0     F64 to  *32
01 110 1    *64 _|F64| + 0         11 110 1     F32 to  F64
01 111 0    *32 _ W32  / W32       11 111 0
01 111 1    *64 _ W**  / W**       11 111 1
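
The ALU tables can be summarised by a small classifier.  The Python sketch below is derived only from the rows shown: the top two bits select the group, and within the subtract and add groups the rows whose middle field ends in 11 are the divide entries.  The last bit is returned as is, since it selects 32 or 64 bit results in the subtract and add groups but has other meanings in the convert group.  The function name is hypothetical.

```python
GROUPS = {0b00: "subtract", 0b01: "add", 0b10: "compare", 0b11: "convert"}

def classify_alu_function(code):
    """Return (kind, last_bit) for a 6 bit ALU function code laid out as gg mmm b."""
    assert 0 <= code < 64
    group = GROUPS[code >> 4]
    middle = (code >> 1) & 0b111
    last_bit = code & 1
    # Divide rows sit at middle = x11 inside the subtract and add groups.
    if group in ("subtract", "add") and (middle & 0b011) == 0b011:
        return "divide", last_bit
    return group, last_bit

if __name__ == "__main__":
    print(classify_alu_function(0b000110))
```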