EU state
The EU performs various operations on local registers, and communicates with the outside world through a cache attached to a bus (the M-bus). The EU has a 32-bit ALU, a Field Unit (FU) for shifting and other field operations, and a multiplier unit.
The EU has a set of registers that contain the most recent elements of the data stack for a process. There are also registers that contain constants, as well as special purpose quantities. This architecture permits most elementary operations involving local variables to be performed in 1 EU cycle. However, this architecture requires special attention to migrating the contents of a process stack between registers and memory.
The EU has the following registers:
Stack denotes the stack registers (ARRAY [0..StackSize) OF Word). StackSize = 128 in the current version.
Locals denotes the local registers (Locals[x] == Stack[(x+L) MOD StackSize]). Note that Locals is aliased with Stack.
AuxRegs denotes the auxiliary registers (ARRAY [0..15] OF Word). Most of these registers will be used for runtime support for higher-level languages (e.g., Mesa).
Constants is an array of registers in the EU (ARRAY [0..11] OF Word). Although these registers are not really constant as far as the hardware is concerned, they are used to hold constants in the Mesa runtime. They are more general than the AuxRegs in that they can be used in more addressing modes.
Field denotes a special register that can be used to control the field unit operations, and also participates in multiplication and division.
MAR (Memory Address Register) denotes the register which holds the address given in the last memory operation. It is used to report EU page faults.
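The aliasing between Locals and Stack can be illustrated with a small model. This is only a sketch: the class name, the method names, and the 32-bit mask on Word are assumptions made for illustration, not part of the design.

```python
STACK_SIZE = 128  # StackSize in the current version, per the text


class EURegisters:
    """Toy model of the EU stack registers and the Locals window onto them."""

    def __init__(self):
        self.stack = [0] * STACK_SIZE  # Stack: ARRAY [0..StackSize) OF Word
        self.L = 0                     # base of the current local frame

    def local_index(self, x):
        # Locals[x] == Stack[(x+L) MOD StackSize]
        return (x + self.L) % STACK_SIZE

    def read_local(self, x):
        return self.stack[self.local_index(x)]

    def write_local(self, x, value):
        # Word is assumed to be 32 bits wide
        self.stack[self.local_index(x)] = value & 0xFFFFFFFF


regs = EURegisters()
regs.L = 126             # frame base near the end of the register array
regs.write_local(3, 42)  # (3 + 126) MOD 128 = 1, so this lands in Stack[1]
```

Because the index wraps modulo StackSize, a frame that starts near the end of the register array continues at Stack[0] without any special handling.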
Instruction formats
This section describes the various instruction formats. Note that the description [0..255] specifies an unsigned number occupying 8 bits, while [-128..127] specifies a signed number occupying 8 bits. The description of the instruction format gives tentative bit assignments, followed by a list of the instructions using the format (in {} brackets), followed by a rough description of how the format is interpreted.
Arithmetic involving S or L will always be performed modulo the size of the EU stack, without detection of underflow or overflow. We will not further indicate this limitation of precision.
For convenience, we use the following abbreviations to refer to numbers obtained from bytes that follow the opcode byte:
Alpha is the first byte after the opcode. AlphaZ is Alpha extended to 32 bits with high-order 0s. AlphaS is Alpha extended to 32 bits with the sign (high-order) bit of Alpha.
Beta is the second byte after the opcode. BetaZ is Beta extended with 0s. When taken as a pair of four bit numbers, BetaL is the leftmost half of Beta, and BetaR is the rightmost half.
Gamma is the third byte after the opcode. Delta is the fourth byte after the opcode.
AlphaBeta is the unsigned unaligned halfword following the opcode byte, interpreted as Alpha + Beta*256. AlphaBetaZ is AlphaBeta extended to 32 bits with 0s. AlphaBetaS is AlphaBeta extended to 32 bits with the sign (high-order) bit of AlphaBeta.
AlphaBetaGammaDelta is the 32-bit unsigned unaligned quantity following the opcode byte. The number is interpreted as AlphaBeta + 256*256*(Gamma + Delta*256).
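The abbreviations above can be sketched as a decoder over the bytes that follow the opcode. The function names are hypothetical, the sketch assumes at least four bytes follow the opcode, and it takes "leftmost half" to mean the high-order four bits.

```python
def zero_extend(value, bits):
    """Extend an unsigned `bits`-wide value to 32 bits with high-order 0s."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Extend a `bits`-wide value to 32 bits using its high-order bit."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits  # negative in two's complement
    return value & 0xFFFFFFFF


def decode_operands(code, pc):
    """Return the named operand fields for the instruction whose opcode is code[pc]."""
    alpha, beta, gamma, delta = code[pc + 1 : pc + 5]
    alpha_beta = alpha + beta * 256
    return {
        "AlphaZ": zero_extend(alpha, 8),
        "AlphaS": sign_extend(alpha, 8),
        "BetaZ": zero_extend(beta, 8),
        "BetaL": beta >> 4,    # leftmost half, assumed to be the high-order bits
        "BetaR": beta & 0x0F,  # rightmost half
        "AlphaBetaZ": zero_extend(alpha_beta, 16),
        "AlphaBetaS": sign_extend(alpha_beta, 16),
        "AlphaBetaGammaDelta": alpha_beta + 256 * 256 * (gamma + delta * 256),
    }


# Opcode byte 0x00 followed by Alpha=0xFE, Beta=0x12, Gamma=0x34, Delta=0x80
fields = decode_operands(bytes([0x00, 0xFE, 0x12, 0x34, 0x80]), 0)
```

Here AlphaZ is 254 while AlphaS is 0xFFFFFFFE (that is, -2), and AlphaBetaGammaDelta is 0x803412FE, which shows the low-order-byte-first interpretation.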
OI - Operation Implicit
[op: [0..255]] -- 1 byte
{ADD, BNDCK, DIS, DUP, EXCH, EXDIS, J1, LCn, MDIV, MUL, RDIV, RETN, RETT, SFC, SFCI, SJ, SUB, UDIV, UMUL}
For OI instructions the operand (if any) is implicit in the opcode.
OB - Operation Byte
[op: [0..255], lit: [0..255]] -- 2 bytes
{ADDB, ARL, AS, ASL, CST, EP, IN, J2, JB, LEUR, LIB, LIFUR, OUT, PSB, RB, RET, RSB, SEUR, SIFUR, SUBB, WB, WSB}
For OB instructions the operand is given by AlphaZ or AlphaS.
ODB - Operation Double Byte
[op: [0..255], lit: [0..65535]] -- 3 bytes
{ADDDB, FSDB, J3, JDB, LFC, LGF, LIDB, SHL, SHR}
For ODB instructions the operand is given by AlphaBetaZ or AlphaBetaS.
OQB - Operation Quad Byte
[op: [0..255], addr: Word] -- 5 bytes
{DFC, J5, LIQB}
For OQB instructions the operand is AlphaBetaGammaDelta.
JBB - Jump Byte Byte
[op: [0..255], dist: [-128..127], lit: [0..255]] -- 3 bytes
{JEBB, JEBBJ, JNEBB, JNEBBJ}
For JBB instructions the new PC is the byte address given by PC+AlphaS. BetaZ is used for comparison with the top of stack.
LR - Local Register
[op: [0..15], reg: [0..15]] -- 1 byte
{LRn, SRn}
For LR instructions the operand is Locals[reg], which is either pushed to or popped from the stack.
LRB - Local Register Byte
[op: [0..15], reg: [0..15], disp: [0..255]] -- 2 bytes
{LRIn, SRIn}
For these instructions, the operand is (Locals[reg]+AlphaZ)^, which is either pushed to or popped from the stack.
LRRB - Local Register Register Byte
[op: [0..255], disp: [0..255], reg1,reg2: [0..15]] -- 3 bytes
{RAI, RRI, WAI, WRI}
For all instructions, reg1 (BetaL) indicates a local register. For RRI and WRI, reg2 (BetaR) indicates a local register as well. For RAI and WAI, reg2 indicates an auxiliary register.
RR - Register to Register
[op: [0..255], c,a: [0..15], aOpt,cOpt,bOpt,aux: BOOL, b: [0..15]] -- 3 bytes
{RADD, RAND, RBC, RFU, ROR, RRX, RSUB, RUADD, RUSUB, RVADD, RVSUB, RXOR}
For these instructions the effect is roughly Rc←Ra op Rb. SE is used to indicate the effect of the instruction on S. SE←0 at the start of the instruction, and S←S+SE at the end of the instruction.
Ra:
  IF NOT aOpt
    THEN IF aux THEN AuxRegs[a] ELSE Locals[a]
    ELSE SELECT a FROM
      < 12 => Constants[a]
      12 => Stack[S];
      13 => Stack[S-1];
      14 => Stack[S]; SE←SE-1
      15 => Stack[S-1]; SE←SE-1
Rb:
  IF NOT bOpt
    THEN IF aux THEN AuxRegs[b] ELSE Locals[b]
    ELSE SELECT b FROM
      < 12 => Constants[b]
      12 => Stack[S];
      13 => Stack[S-1];
      14 => Stack[S]; SE←SE-1
      15 => Stack[S-1]; SE←SE-1
Rc:
  IF NOT cOpt
    THEN IF aux THEN AuxRegs[c] ELSE Locals[c]
    ELSE SELECT c FROM
      < 12 => Constants[c]
      12 => Stack[S];
      13 => Stack[S-1];
      14 => Stack[S+1]; SE←SE+1
      15 => Stack[S+1]; SE←SE+1
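The operand selection above can be sketched as follows, using RVADD as the operation. This is a literal reading under stated assumptions: every reference to S uses the value of S at the start of the instruction, SE is accumulated across all three decodings and applied once at the end, and the class and function names are invented for the example.

```python
STACK_SIZE = 128  # per the EU state section


class EU:
    def __init__(self):
        self.stack = [0] * STACK_SIZE
        self.aux = [0] * 16
        self.constants = [0] * 12
        self.L = 0
        self.S = 0

    def local(self, n):
        return (self.L + n) % STACK_SIZE


def source(eu, opt, aux, n):
    """Decode Ra or Rb. Returns (register file, index, contribution to SE)."""
    if not opt:
        return ("aux", n, 0) if aux else ("stack", eu.local(n), 0)
    if n < 12:
        return ("const", n, 0)
    index = (eu.S if n in (12, 14) else eu.S - 1) % STACK_SIZE
    return ("stack", index, -1 if n >= 14 else 0)


def dest(eu, opt, aux, n):
    """Decode Rc. Codes 14 and 15 both name Stack[S+1] and push, as in the text."""
    if not opt:
        return ("aux", n, 0) if aux else ("stack", eu.local(n), 0)
    if n < 12:
        return ("const", n, 0)
    if n >= 14:
        return ("stack", (eu.S + 1) % STACK_SIZE, +1)
    return ("stack", (eu.S if n == 12 else eu.S - 1) % STACK_SIZE, 0)


def files(eu):
    return {"aux": eu.aux, "const": eu.constants, "stack": eu.stack}


def rvadd(eu, a, b, c, a_opt, b_opt, c_opt, aux):
    """RVADD: Rc <- Ra + Rb (no overflow trap), then S <- S + SE."""
    ra = source(eu, a_opt, aux, a)
    rb = source(eu, b_opt, aux, b)
    rc = dest(eu, c_opt, aux, c)
    total = files(eu)[ra[0]][ra[1]] + files(eu)[rb[0]][rb[1]]
    files(eu)[rc[0]][rc[1]] = total & 0xFFFFFFFF
    eu.S = (eu.S + ra[2] + rb[2] + rc[2]) % STACK_SIZE


eu = EU()
eu.S = 5
eu.stack[5], eu.stack[4] = 7, 35
# a=12 reads Stack[S], b=13 reads Stack[S-1], c=14 writes Stack[S+1] and pushes
rvadd(eu, a=12, b=13, c=14, a_opt=True, b_opt=True, c_opt=True, aux=False)
```

With these settings the instruction behaves like a non-destructive add: both source words stay on the stack and the sum is pushed above them.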
RJB - Register Jump Byte
[op: [0..255], dist: [-128..127], sdd,sd,opt,aux: BOOL, reg: [0..15]] -- 3 bytes
{RJEB, RJEBJ, RJGB, RJGBJ, RJGEB, RJGEBJ, RJLB, RJLBJ, RJLEB, RJLEBJ, RJNEB, RJNEBJ}
For these instructions the effect is to jump to the byte PC given by PC+AlphaS if the indicated comparison of Ra with Rb is true. Beta is decoded, as far as possible, in the same way as for the RR format instructions. The comparison always involves either [S] or [S-1] and a register determined by the decoding of Beta. SE is used as in the RR format description.
IF sdd THEN SE←SE-1
Ra: IF sd THEN Stack[S-1] ELSE Stack[S]
Rb:
  IF NOT opt
    THEN IF aux THEN AuxRegs[reg] ELSE Locals[reg]
    ELSE SELECT reg FROM
      < 12 => Constants[reg]
      12 => Stack[S];
      13 => Stack[S-1];
      14 => Stack[S]; SE←SE-1
      15 => Stack[S-1]; SE←SE-1
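A minimal sketch of one RJB instruction (RJLB) follows. Several details are assumptions rather than statements from this note: the flag bits of Beta are taken to be packed high bit first in the order listed, the comparison is treated as signed, PC in PC+AlphaS is taken to be the address of the opcode byte, a jump that is not taken falls through to PC+3, and SE is applied whether or not the jump is taken.

```python
STACK_SIZE = 128


def decode_beta(beta):
    """Split Beta into (sdd, sd, opt, aux, reg); the bit order is an assumption."""
    return (bool(beta & 0x80), bool(beta & 0x40), bool(beta & 0x20),
            bool(beta & 0x10), beta & 0x0F)


def to_signed(word):
    return word - (1 << 32) if word & 0x80000000 else word


def rjlb(pc, alpha, beta, stack, S, L, aux_regs, constants):
    """RJLB sketch: jump when Ra < Rb. Returns (new PC, new S)."""
    sdd, sd, opt, aux, reg = decode_beta(beta)
    se = -1 if sdd else 0
    ra = stack[(S - 1) % STACK_SIZE] if sd else stack[S % STACK_SIZE]
    if not opt:
        rb = aux_regs[reg] if aux else stack[(L + reg) % STACK_SIZE]
    elif reg < 12:
        rb = constants[reg]
    else:
        rb = stack[(S if reg in (12, 14) else S - 1) % STACK_SIZE]
        if reg >= 14:
            se -= 1
    alpha_s = alpha - 256 if alpha & 0x80 else alpha  # AlphaS as a Python int
    taken = to_signed(ra) < to_signed(rb)
    next_pc = pc + alpha_s if taken else pc + 3       # 3-byte instruction
    return next_pc, (S + se) % STACK_SIZE


stack = [0] * STACK_SIZE
constants = [0] * 12
aux_regs = [0] * 16
stack[5] = 3
constants[2] = 10
# Beta = 0x22: opt set, reg = 2, so Rb is Constants[2]; Ra is Stack[S]
result = rjlb(pc=100, alpha=0xFC, beta=0x22, stack=stack, S=5, L=0,
              aux_regs=aux_regs, constants=constants)
# Same comparison with sdd set (Beta = 0xA2): the top of stack is also discarded
result_sdd = rjlb(pc=100, alpha=0xFC, beta=0xA2, stack=stack, S=5, L=0,
                  aux_regs=aux_regs, constants=constants)
```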
Calls and Returns
This section gives tentative information about the strategy used for procedure calls and returns on Dragon.
Simple call
The simple case of calling a procedure first pushes any arguments expected by the procedure, then calls the procedure via DFC or LFC. The first instruction of the called procedure is normally an EP instruction, which sets L to the base of the arguments. The arguments therefore become the initial local variables without being moved.
The return PC and the return L are pushed onto the IFU stack by the call. If there is insufficient room to do this, an IFU stack overflow trap is taken after the call has transferred control.
Global frames
If the procedure needs access to a global frame, then it is the responsibility of the procedure to set up a register with the pointer to the global frame. Current plans are to use the LGF instruction to load the global frame pointer from the global frame table into a local register. The LIQB instruction could be used to set up the global frame pointer in the case where there are few procedures for the frame, but space considerations will normally make it more desirable to use the 3-byte LGF instead of the 5-byte LIQB, since that will save 2 bytes per procedure.
Simple return
The simple case of returning from a procedure uses the RET opcode, which specifies how much to adjust the stack before returning. If the IFU stack is not empty, then the return PC and L are taken from the IFU stack, and the most recent entry in the IFU stack is discarded. If the IFU stack is empty when a RET is performed, the results are undefined. S is adjusted according to the alpha byte. For cases where the stack should not be adjusted on return the RETN opcode is used.
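The simple call and return sequence can be sketched as a toy model. The class is hypothetical; it assumes the return PC is the address of the instruction after the call, and it raises an error where the text says the result of RET on an empty IFU stack is undefined. EP and RET follow the instruction summary (L←S+Alpha and S←L+Alpha), with arithmetic modulo the EU stack size.

```python
STACK_SIZE = 128


class Machine:
    def __init__(self):
        self.pc = 0
        self.S = 0
        self.L = 0
        self.ifu_stack = []  # entries are (return PC, return L)

    def call(self, target, instr_len):
        """DFC or LFC: push the return PC and L onto the IFU stack, then transfer."""
        self.ifu_stack.append((self.pc + instr_len, self.L))
        self.pc = target

    def ep(self, alpha):
        """EP: L <- S + Alpha, modulo the stack size."""
        self.L = (self.S + alpha) % STACK_SIZE

    def ret(self, alpha):
        """RET: S <- L + Alpha, then restore PC and L from the IFU stack."""
        if not self.ifu_stack:
            raise RuntimeError("RET with an empty IFU stack")  # undefined in the text
        self.S = (self.L + alpha) % STACK_SIZE
        self.pc, self.L = self.ifu_stack.pop()


m = Machine()
m.pc, m.S, m.L = 0x1000, 10, 4
m.S += 2                       # caller pushes two arguments (Stack[11], Stack[12])
m.call(0x2000, instr_len=5)    # DFC is a 5-byte instruction
m.ep(0xFF)                     # 0xFF acts as -1 modulo 128: L = S - 1 = 11
frame_base = m.L               # Locals[0] and Locals[1] are now the two arguments
m.ret(0)                       # leave one result word at the old Locals[0]
```

After the return, S is 11, so the two argument words have been replaced by a single result word, and the caller's PC and L are restored.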
Procedure variables
For various reasons covered below, procedure variables are called with one more level of indirection than simple procedures. Procedure variables are implemented as pointers to words that contain the starting address of the procedure. To call through a procedure variable, the procedure variable is pushed, then a call is made to ([S])^ using SFCI. This convention leaves an extra word on the stack, so procedure variable calls must go to a different entry point than simple procedure calls.
Nested procedures
Nested procedures are implemented by placing the starting address in the local frame extension (the part of the local frame required to be in memory). The procedure variable for this nested procedure will be a pointer to this word. Therefore, on entry to the nested procedure, the address of the frame (plus an offset) will be on the stack, which makes computing the static link easy (a SUBB instruction).
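The extra level of indirection, and the way the static link falls out of it, can be sketched with a toy word-addressed memory. All addresses and the slot offset below are hypothetical values chosen for the example.

```python
memory = {}  # toy memory: address -> word

FRAME_EXTENSION = 0x5000  # hypothetical address of the enclosing frame extension
PROC_SLOT_OFFSET = 3      # hypothetical offset of the word holding the entry address
NESTED_ENTRY = 0x2040     # hypothetical starting address of the nested procedure

# The starting address of the nested procedure is stored in the frame extension.
memory[FRAME_EXTENSION + PROC_SLOT_OFFSET] = NESTED_ENTRY

# The procedure variable is a pointer to that word.
proc_var = FRAME_EXTENSION + PROC_SLOT_OFFSET

stack = [0] * 128
S = 7

# Caller: push the procedure variable, then SFCI calls the procedure at ([S])^.
S += 1
stack[S] = proc_var
target = memory[stack[S]]  # SFCI leaves the procedure variable on the stack

# Callee: [S] still holds the procedure variable, so subtracting the known
# offset (a SUBB instruction) recovers the address of the enclosing frame.
static_link = stack[S] - PROC_SLOT_OFFSET
```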
Interface function call
Interface function calls are both more flexible and more involved than simple procedure calls. Interface records are referred to via positions in the global frame. Procedures exported through those interfaces have procedure variables in various slots of the interface record. To make an interface function call one simply sets up the arguments, fetches the procedure variable from the interface record, and performs a procedure variable call.
Multiple global frames
Although procedures must be specially compiled to use multiple global frames, there is no additional mechanism beyond forcing all calls to routines with multiple global frames to use the indirect procedure call. The address on top of the stack can then be used to find the global frame (probably via adding a constant).
Coroutines
Coroutine calls are handled by traps. The data structures to be used are not yet defined. However, coroutine calls will be roughly as expensive as process switches.
Traps
This section gives tentative information about traps generated by the IFU. The general approach to traps is to have the instruction that generates the trap have no effect, and the return PC for the trap routine be the PC of the trapping instruction. Maskable traps (Reschedule, EU Stack overflow, and IFU stack overflow) disable further maskable traps until they are reenabled (usually via RETT).
Reschedule
The reschedule trap occurs when the RESCHEDULE line is raised and interrupts are enabled. An attempt is made to have idle processors notice the RESCHEDULE line before non-idle processors (this is controlled by software). If the RESCHEDULE line is raised while interrupts are disabled, the reschedule trap will be deferred until interrupts are enabled again. The response to the reschedule trap is quite complex, and will not be covered in this note.
EU Stack overflow
The stack overflow trap occurs when an attempt to increase the EU stack pointer by 1 (S←S+1) would result in S crossing the stack overflow limit register (and traps are enabled). The limit, which is set by software, must allow sufficient room for the trap handler. The handler for this trap should migrate the eldest frame in the EU and IFU stack registers to memory, then return via RETT.
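The trap convention described in this section (the trapping instruction has no effect, the return PC is the PC of the trapping instruction, and maskable traps stay disabled until RETT) can be sketched for the EU stack overflow case. This is a rough model under assumptions: "crossing" the limit is taken to mean that S+1 would equal the limit register, and the class and attribute names are invented.

```python
STACK_SIZE = 128


class TrapState:
    def __init__(self, limit):
        self.S = 0
        self.pc = 0
        self.limit = limit           # stack overflow limit register, set by software
        self.traps_enabled = True
        self.trap_return_pc = None

    def push_step(self, instr_len):
        """Run one instruction that performs S <- S+1. Returns True if it trapped."""
        new_s = (self.S + 1) % STACK_SIZE
        if self.traps_enabled and new_s == self.limit:
            # The trapping instruction has no effect: S and PC are unchanged.
            self.trap_return_pc = self.pc
            self.traps_enabled = False  # maskable traps are off until RETT
            return True
        self.S = new_s
        self.pc += instr_len
        return False

    def rett(self):
        """RETT: return to the trapping instruction and re-enable traps."""
        self.pc = self.trap_return_pc
        self.traps_enabled = True


state = TrapState(limit=100)
state.S, state.pc = 98, 0x10
first = state.push_step(instr_len=1)   # S becomes 99, no trap
second = state.push_step(instr_len=1)  # S+1 would reach the limit: trap
```

A real handler would migrate the oldest frame to memory before executing RETT, so that the retried instruction no longer reaches the limit; that part is not modelled here.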
IFU stack overflow
The IFU stack overflow trap occurs when an attempt is made to call a procedure while the IFU stack is full (and stack overflow is enabled). Some number of IFU frames are still available after this trap occurs, so calls can be made by the trap handler. The handler for this trap is the same as for EU stack overflow.
Stack underflow
The stack underflow trap occurs when a return instruction (RET, RETN, or RETT) tries to return to an empty IFU stack. The trap handler must arrange to migrate a frame from memory to the IFU and EU registers. This is not a true trap, since the IFU stack is left empty on entry to the handler. When the transfer takes place, maskable traps are disabled.
ALU fault
The ALU fault trap occurs when the ALU detects integer overflow, a bounds check (due to the BNDCK instruction) or a NIL check (due to the NILCK instruction), or Lisp NaN. The handler for ALU fault should turn this trap into the appropriate error, depending on the instruction that raised the fault.
EU page fault
The EU page fault trap occurs when a reference to unmapped memory is made by the EU. The faulting address is available through a special EU register (MAR). The handler should determine if the page is valid, then either cause the page to be made present, or cause a page fault error. Interrupts are disabled during the trap handler.
IFU page fault
The IFU page fault trap occurs when a reference to unmapped memory is made by the IFU. The faulting address is available as the return PC. The handler should determine if the page is valid, then either cause the page to be made present, or cause a page fault error.
Instruction Set Summary
Name form # Description
ADD OI 1 [S-1]←[S]+[S-1]; S←S-1; trap on overflow
ADDB OB 1 [S]←[S]+AlphaZ; trap on overflow
ADDDB ODB 1 [S]←[S]+AlphaBetaZ; trap on overflow
AL OB 1 L←L+Alpha
AS OB 1 S←S+Alpha
ASL OB 1 S←L+Alpha
BNDCK OI 1 trap if [S] < 0 or [S-1]-[S] >= 0; S←S-1
CST OB 1 [S+1]←([S-2]+AlphaZ)^; [S+1]=[S] => ([S-2]+AlphaZ)^←[S-1]; S←S+1; special: atomic
DFC OQB 1 call proc at AlphaBetaGammaDelta
DIS OI 1 S←S-1
DJ OQB 1 PC ← AlphaBetaGammaDelta
DUP OI 1 [S+1]←[S]; S←S+1
EP OB 1 L←S+Alpha
EXCH OI 1 [S+1]←[S]; [S]←[S-1]; [S-1]←[S+1]
EXDIS OI 1 [S-1]←[S]; S←S-1
FSDB ODB 1 Field←AlphaBeta
IN OB 1 [S]←([S]+AlphaZ)^; special: uses IO lines
JB OB 1 PC←PC+Alpha
JDB ODB 1 PC←PC+AlphaBetaS
JEBBj JBB 2 BetaZ = [S] => PC←PC+AlphaS; S←S-1
Jn O* 4 Noop of length 1, 2, 3, or 5 bytes (used as jump)
JNEBBj JBB 2 BetaZ # [S] => PC←PC+AlphaS; S←S-1
LCn OI 8 [S+1]←Constants[n]; S←S+1
LEUR OB 1 [S+1]←PReg[Alpha]; S←S+1
LFC ODB 1 call proc at PC+AlphaBetaS
LGF ODB 1 [S+1]←([GB]+AlphaBetaZ)^; S←S+1
LIB OB 1 [S+1]←AlphaZ; S←S+1
LIDB ODB 1 [S+1]←AlphaBetaZ; S←S+1
LIFUR OB 1 [S+1]←PReg[Alpha]; S←S+1
LIQB OQB 1 [S+1]←AlphaBetaGammaDelta; S←S+1
LRIn LRB 16 [S+1]←([L+n]+AlphaZ)^; S←S+1
LRn LR 16 [S+1]←[L+n]; S←S+1
MDIV OI 1 [S-2],[S-1] ← [S-2],[S-1] / [S]; signed, Sign[rem] = Sign[divisor]
MUL OI 1 [S-1],[S] ← [S-1]*[S]; signed
OUT OB 1 ([S]+AlphaZ)^←[S-1]; S←S-2; special: uses IO lines
PSB OB 1 ([S-1]+AlphaZ)^←[S]; S←S-1
RADD RR 1 Rc←Ra+Rb+carry; carry←0; trap on overflow
RAI LRRB 1 [L+BetaL]←(AuxRegs[BetaR]+AlphaZ)^
RAND RR 1 Rc←Ra AND Rb
RB OB 1 [S]←([S]+AlphaZ)^
RBC RR 1 trap if Ra < 0 OR Ra-Rb >= 0; Rc←Ra
RDIV OI 1 [S-2],[S-1] ← [S-2],[S-1] / [S]; signed, Sign[rem] = Sign[dividend]
RET OB 1 S←L+Alpha; return from proc
RETN OI 1 return from proc
RETT OI 1 return from proc; enable traps
RFU RR 1 [Rc]←FieldUnit[[Ra],[Rb],Field]
RJEBj RJB 2 Ra=Rb => PC←PC+AlphaS
RJGBj RJB 2 Ra>Rb => PC←PC+AlphaS
RJGEBj RJB 2 Ra>=Rb => PC←PC+AlphaS
RJLBj RJB 2 Ra<Rb => PC←PC+AlphaS
RJLEBj RJB 2 Ra<=Rb => PC←PC+AlphaS
RJNEBj RJB 2 Ra#Rb => PC←PC+AlphaS
RLADD RR 1 Rc←Ra+Rb; carry←0; trap on overflow or Lisp NaN
RLSUB RR 1 Rc←Ra-Rb; carry←0; trap on overflow or Lisp NaN
ROR RR 1 Rc←Ra OR Rb
RRI LRRB 1 [L+BetaL]←([L+BetaR]+AlphaZ)^
RRX RR 1 [Rc]←([Ra]+[Rb])^
RSB OB 1 [S+1]←([S]+AlphaZ)^; S←S+1
RSUB RR 1 Rc←Ra-Rb-carry; carry←0; trap on overflow
RUADD RR 1 Rc←Ra+Rb+carry; set carry
RUSUB RR 1 Rc←Ra-Rb-carry; set carry
RVADD RR 1 Rc←Ra+Rb
RVSUB RR 1 Rc←Ra-Rb
RXOR RR 1 Rc←Ra XOR Rb
SEUR OB 1 PReg[Alpha]←[S]; S←S-1
SFC OI 1 call proc at [S]; S←S-1
SFCI OI 1 call proc at ([S])^
SHL ODB 1 [S]←FieldUnit[[S],0,AlphaBeta]
SHR ODB 1 [S]←FieldUnit[[S],[S],AlphaBeta]
SIFUR OB 1 PReg[Alpha]←[S]; S←S-1
SJ OI 1 PC←PC+[S]
SRIn LRB 16 ([L+n]+AlphaZ)^←[S]; S←S-1
SRn LR 16 [L+n]←[S]; S←S-1
SUB OI 1 [S-1]←[S-1]-[S]; S←S-1; trap on overflow
SUBB OB 1 [S]←[S]-AlphaZ; trap on overflow
WAI LRRB 1 (AuxRegs[BetaR]+AlphaZ)^←[L+BetaL]
WB OB 1 ([S]+AlphaZ)^←[S-1]; S←S-2
WRI LRRB 1 ([L+BetaR]+AlphaZ)^←[L+BetaL]
WSB OB 1 ([S-1]+AlphaZ)^←[S]; S←S-2
UDIV OI 1 [S-2],[S-1] ← [S-2],[S-1] / [S]; unsigned
UMUL OI 1 [S-1],[S] ← [S-1]*[S]; unsigned
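To show how the notation in the summary reads in practice, here is a small interpreter for a handful of the stack instructions. It is a sketch only: words are held as Python signed integers, overflow is checked against the signed 32-bit range, and a trap is modelled as an exception raised before any state changes, following the rule that a trapping instruction has no effect. The class and exception names are invented.

```python
STACK_SIZE = 128
INT_MIN, INT_MAX = -(1 << 31), (1 << 31) - 1


class OverflowTrap(Exception):
    """Stands in for the ALU fault trap on integer overflow."""


class StackMachine:
    def __init__(self):
        self.stack = [0] * STACK_SIZE
        self.S = 0

    def _at(self, offset):
        return (self.S + offset) % STACK_SIZE  # arithmetic on S is modular

    def _checked(self, value):
        if not INT_MIN <= value <= INT_MAX:
            raise OverflowTrap(value)
        return value

    def LIB(self, alpha):  # [S+1]<-AlphaZ; S<-S+1
        self.stack[self._at(1)] = alpha & 0xFF
        self.S = self._at(1)

    def DUP(self):         # [S+1]<-[S]; S<-S+1
        self.stack[self._at(1)] = self.stack[self._at(0)]
        self.S = self._at(1)

    def EXCH(self):        # [S+1]<-[S]; [S]<-[S-1]; [S-1]<-[S+1]
        self.stack[self._at(1)] = self.stack[self._at(0)]
        self.stack[self._at(0)] = self.stack[self._at(-1)]
        self.stack[self._at(-1)] = self.stack[self._at(1)]

    def ADD(self):         # [S-1]<-[S]+[S-1]; S<-S-1; trap on overflow
        total = self._checked(self.stack[self._at(0)] + self.stack[self._at(-1)])
        self.stack[self._at(-1)] = total
        self.S = self._at(-1)

    def SUB(self):         # [S-1]<-[S-1]-[S]; S<-S-1; trap on overflow
        diff = self._checked(self.stack[self._at(-1)] - self.stack[self._at(0)])
        self.stack[self._at(-1)] = diff
        self.S = self._at(-1)

    def DIS(self):         # S<-S-1
        self.S = self._at(-1)


m = StackMachine()
m.LIB(40)
m.LIB(2)
m.ADD()    # top of stack is now 42
m.LIB(50)
m.EXCH()   # stack is now ..., 50, 42
m.SUB()    # 50 - 42
top = m.stack[m.S]
```

Note that EXCH, as written in the summary, uses [S+1] as scratch space, so the word above the top of stack is overwritten.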