Execution Unit Features
• 160 32-bit registers
  - 128 registers used to cache local variables
  - 16 registers used for runtime
  - 12 registers used for constants
  - 4 special registers
  - read two, write one every cycle
• 1 operation per cycle
  - 32-bit ALU operation
  - 32 ← 32 x 32 shift/extract/insert
  - 2-bit multiply step
  - 1-bit divide step
  - bypassing avoids waits for register update
• 1 cache read/write per cycle
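
The multiply-step line above implies that a full 32 x 32 multiply is composed of 16 single-cycle steps. A minimal sketch of that idea, assuming a plain unsigned shift-and-add scheme that consumes two multiplier bits per step; the slides do not give the actual step encoding, and `multiply_step` and `multiply` are hypothetical names:

```python
MASK32 = 0xFFFFFFFF

def multiply_step(acc, multiplier, multiplicand):
    """One assumed step: consume the low 2 multiplier bits.

    Returns the updated accumulator and the shifted multiplier.
    """
    digit = multiplier & 0b11        # next two multiplier bits (0..3)
    acc += digit * multiplicand      # add 0, 1, 2, or 3 times the multiplicand
    return acc, multiplier >> 2

def multiply(a, b):
    """Unsigned 32 x 32 multiply built from 16 two-bit steps (64-bit result)."""
    acc = 0
    multiplicand = a & MASK32
    multiplier = b & MASK32
    for _ in range(16):              # 32 multiplier bits / 2 bits per step
        acc, multiplier = multiply_step(acc, multiplier, multiplicand)
        multiplicand <<= 2           # align with the next pair of bits
    return acc
```

Under the same assumption, a 1-bit divide step would need 32 steps for a 32-bit quotient.
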
Instruction Fetch Unit
• IFU translates DragOps to EU control signals
  - DragOps = interface between IFU and compiler
  - IFU selects registers & operations for EU
  - IFU passes literals to EU
  - IFU gets some data from EU
• IFU overlaps instruction fetch with EU operations
• IFU performs control transfers
  - jumps (conditional & unconditional)
  - traps/calls/returns
  - faults
• IFU controls local frame model
  - S is index to top of stack
  - L is index to base of local frame
  - PC, L saved at call, restored at return
  - S, L adjusted in IFU only
Instruction Set Issues
• IFU issues
  - IFU has very little space
  - no microcode (PLA, not ROM)
  - simple instructions necessary
  - DragOps should closely match EU
• compiler issues
  - density worthwhile, but efficiency worth more
  - should have minimal "funny restrictions"
  - must be sufficient for efficient Mesa
Kinds of instructions
• instructions can be 1, 2, 3, or 5 bytes long
• RR format - 3 operand arithmetic / logical
  - each operand designates one of
    · 16 local registers
    · 16 auxiliary registers
    · 12 "constant" registers
    · 4 stack options
      source: [S], [S]-, [S-1], [S-1]-
      destination: [S], [S-1], [S+1]+
• RJB format - conditional jumps
  - each compares two operands
    · one operand must be on stack
    · other operand is as in RR format
  - each has two opcodes
    · sense of jump is predicted
  - range is [pc-128..pc+127] bytes
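
The RJB range of [pc-128..pc+127] is what a single signed displacement byte can express. A small sketch of encoding and decoding such a displacement, assuming two's-complement encoding relative to the pc of the jump itself (the slides state only the range; `encode_displacement` and `decode_target` are hypothetical helpers):

```python
def encode_displacement(pc, target):
    """Return the jump displacement as one unsigned byte (two's complement).

    Raises ValueError if the target is outside [pc-128 .. pc+127].
    """
    delta = target - pc
    if not -128 <= delta <= 127:
        raise ValueError(f"target {target} out of range for jump at pc {pc}")
    return delta & 0xFF

def decode_target(pc, byte):
    """Recover the jump target from the pc and the displacement byte."""
    delta = byte - 256 if byte >= 128 else byte   # sign-extend 8 bits
    return pc + delta
```

A target outside the range would need a different (longer) jump form, which this sketch does not model.
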
More kinds of instructions
• calls / returns
  - must be fast & simple
  - return PC and L cached in IFU
  - arguments become locals of called frame
    · return values passed back on stack
    · must adjust L at entry, S at exit
  - can directly call 32-bit byte PC
  - can call through [S]
  - global frame not part of call/return
    · G is loaded by separate op
• reads / writes
  - stack & register versions
  - most have 1-byte offsets
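
The call/return bullets above can be illustrated with a toy model of the S and L indices. This is a sketch under stated assumptions, not the Dragon mechanism itself: the class `FrameModel`, its method names, and the exact points at which L and S are adjusted are hypothetical, and wraparound or overflow of the 128-register cache is ignored.

```python
class FrameModel:
    """Toy model of the local frame: S = top of stack, L = base of frame."""

    def __init__(self):
        self.regs = [0] * 128    # registers that cache local variables
        self.S = -1              # index of top of stack (empty stack)
        self.L = 0               # index of base of the current local frame
        self.pc = 0
        self.saved = []          # (return PC, L) pairs held by the IFU

    def push(self, value):
        self.S += 1
        self.regs[self.S] = value

    def local(self, i):
        """Read local register i of the current frame."""
        return self.regs[self.L + i]

    def call(self, target_pc, return_pc, nargs):
        """Save PC and L, then make the top nargs stack entries the new locals."""
        self.saved.append((return_pc, self.L))
        self.L = self.S - nargs + 1      # adjust L at entry
        self.pc = target_pc

    def ret(self, nresults):
        """Leave the top nresults entries on the caller's stack, restore PC and L."""
        results = self.regs[self.S - nresults + 1 : self.S + 1]
        self.S = self.L - 1              # adjust S at exit: drop the callee frame
        for value in results:
            self.push(value)
        self.pc, self.L = self.saved.pop()
```

For example, a caller that pushes two arguments and calls with `nargs=2` sees them as `local(0)` and `local(1)` in the callee; after `ret(1)` the single result sits on top of the caller's stack where the arguments used to be.
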
Even more kinds of instructions
• compact versions
  - stack oriented versions of add, sub
    · can add/sub literals as well
  - stack oriented versions of shifter ops
  - stack <-> register operations
  - 1-, 2-, or 4-byte literals
• others
  - S & L adjusting operations
  - read/write special IFU/EU registers
  - bounds check
  - jumps, no-ops, traps
  - map operations
  - conditional store
Conditional Store
• sample ← CStore[ptr, old, new]
  - sample ← ptr^ {hold the bus}
  - IF sample = old THEN ptr^ ← new
• sources of atomicity
  - CStore
  - cause interrupt
  - enable/disable interrupts (1 processor)
• capabilities
  - spin locks
  - monitor locks
  - condition variables
  - atomic x ← f[x]
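
The CStore definition above is a compare-and-swap, and two of the listed capabilities follow from it directly. A minimal simulation, assuming a Python `threading.Lock` as a stand-in for "hold the bus" and a dictionary as memory; `Memory`, `atomic_update`, `acquire`, and `release` are hypothetical names introduced for illustration:

```python
import threading

class Memory:
    """Simulated memory whose cstore holds the 'bus' for the whole operation."""

    def __init__(self):
        self._bus = threading.Lock()     # stands in for "hold the bus"
        self.cells = {}                  # ptr -> value, unset cells read as 0

    def cstore(self, ptr, old, new):
        """sample <- ptr^; IF sample = old THEN ptr^ <- new; return sample."""
        with self._bus:
            sample = self.cells.get(ptr, 0)
            if sample == old:
                self.cells[ptr] = new
            return sample

def atomic_update(mem, ptr, f):
    """atomic x <- f[x]: recompute and retry until no other writer intervened."""
    while True:
        old = mem.cells.get(ptr, 0)
        new = f(old)
        if mem.cstore(ptr, old, new) == old:
            return new

def acquire(mem, lock_ptr):
    """Spin lock: loop until the lock word is changed from 0 (free) to 1 (held)."""
    while mem.cstore(lock_ptr, 0, 1) != 0:
        pass

def release(mem, lock_ptr):
    """Free the lock with an ordinary store."""
    mem.cells[lock_ptr] = 0
```

The retry loop in `atomic_update` relies on `f` being a pure function of the sampled value, since it may run more than once under contention.
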