Dorado Hardware Manual    Overview    14 September 1981
Overview
Experience suggests that programmers will gradually develop a mental model something like Figure 1; until this mental model is well established, it is probably desirable to read the following with Figure 1 in view.
Dorado has Processor, Control, Memory, IFU, and IO controller sections.
IO controllers are independent of each other and of the other sections; you will have to understand a particular io controller if and only if you are going to write microcode that controls it.
The memory and IFU are "slaves" to the processor/control section. In most situations, their external interface is simple relative to internal details of operation, and effective programming is usually possible without detailed understanding.
However, programmers will have to understand the processor thoroughly because the different parts of the processor are controlled directly by instruction fields, and most of the processor will be used, even in a small program.
Programmers must also understand most of the control section, although fairly simple assembly language constructs are transformed into the complicated branch encodings needed by Dorado, so detailed understanding of Dorado branching is not required.
Control
Dorado supports up to 16 independent tasks at the microcode level. Each task has its own program counter (TPC), and other commonly-used registers are also replicated on a per-task basis. Tasks are scheduled automatically by the hardware in response to wakeup requests, where task 15 is highest priority, task 0, lowest.
Emulator microcode runs entirely in task 0 (lowest priority); fault conditions normally wake up task 15, the "fault task" (highest priority). Other tasks are normally paired with io devices that issue wakeup requests when they need service. Task switching, discussed in "Control Section", is in most cases invisible to the programmer, because commonly-used registers are duplicated for each task.
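The priority rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: wakeup requests are modeled as a 16-bit mask in which bit n set means task n is requesting service, and the emulator (task 0) is assumed to run whenever nothing else is requesting. The name next_task is hypothetical, and the actual hardware scheduling (see "Control Section") has details that are not modeled here.

```python
NUM_TASKS = 16  # Dorado supports up to 16 tasks, numbered 0..15


def next_task(wakeups: int) -> int:
    """Return the highest-numbered task with a pending wakeup.

    `wakeups` is a hypothetical 16-bit mask: bit n set means task n
    has a wakeup request. Task 15 is highest priority, task 0 lowest.
    Assumption: task 0 (the emulator) runs when no request is pending.
    """
    for task in range(NUM_TASKS - 1, 0, -1):
        if wakeups & (1 << task):
            return task
    return 0


if __name__ == "__main__":
    print(next_task(0b1000_0000_0000_0100))  # fault task 15 and task 2 pending
    print(next_task(0b0000_0000_0000_0100))  # only task 2 pending
    print(next_task(0))                      # nothing pending
```

With both the fault task and an io task requesting, the sketch selects task 15; with no requests at all it falls back to the emulator.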
In this manual, "instruction" refers to a microinstruction in the control store, as opposed to an opcode in the higher level language interpreted by a microprogram. The JCN field in an instruction encodes a variety of jumps, calls, conditional jumps and calls, instruction dispatches and returns for the current task.
Registers, Memories, and Data Paths
Tables 1, 2, and 3 describe memories, registers, and data paths in Dorado; these are diagrammed in Figure 1. The first two tables below focus on a particular register or memory and tell how it is used and where it connects; the third table focuses on particular data paths and shows how they connect various parts of the machine.

Table 1: Memories
Memory  Comments
IM      IM is a 4096-word x 34-bit (+2 parity) RAM used to store instructions. When written, the address is taken from Link and data from B 16 bits at-a-time (1 extra bit and parity from RSTK field). When read, the address is taken from Link, and data is delivered to Link 9 bits at-a-time. The read or write is controlled by the JCN field and two or three low bits of RSTK.
ALUFM   ALUFM is a 16-word x 6-bit ALU control RAM addressed by the 4-bit ALUF field. Five ALUFM bits specify 16 boolean or 5 arithmetic operations on A and B. One bit is the input carry for arithmetic operations (modifiable by several functions). ALUFM[ALUF] is read onto Pd by the ALUFMEM function or both read onto Pd and loaded from B by the ALUFMRW← function.
RM      RM is a 256-word x 16-bit (+2 parity) RAM used for general storage by all tasks. The normal address is RBase[0:3],,RSTK[0:3]. Data can be read onto A or B and loaded from Pd or Md without using FF. Together with T, RM forms the input to the Shifter.
STK     STK is a 256-word x 16-bit (+2 parity) stack accessible only to the emulator, used instead of RM when the BLOCK bit in the instruction is 1. Its address comes from StkP, modified by -4 to +3 under control of RSTK.
IFUM    IFUM is a 1024-word x 24-bit (+3 parity) decoding memory containing 256 words for each of four instruction sets. The instruction set can be set by the InsSetOrEvent← function. The low 8 address bits are normally an opcode fetched from the cache, but can be loaded from B by the BrkIns← function to read or write IFUM itself. The IFUMLH← and IFUMRH← functions load, and the B←IFUMLH’ and B←IFUMRH’ functions read, different bits of IFUM. During normal operation IFUM controls decoding of the stream of opcodes and operands fetched from memory relative to BR 31, the code base.
MAIN    Main storage consists of a 64-row x 4-column x 16-word virtual cache coupled with one to four 256k x 16-bit memory modules (using 16k-bit storage chips). The IFU and processor independently access the cache, with IFU references deferring to the processor. The processor has two dissimilar methods of reference, one primarily to the cache (with "misses" initiating main memory action) and one directly to main memory (invalidating cache hits on writes, using dirty cache hits on reads). Fetch←, Store←, IFetch←, LongFetch←, and PreFetch← are cache references. Md can be loaded into T or RM (LC field), routed onto B (BSEL field), onto A (FF field), or used in a shift-and-mask operation (ASEL and ALUF fields). IOFetch← and IOStore← (ASEL field) initiate a 16-word transfer between an io device and memory without further processor interaction (using Fin or Fout bus). Virtual addresses are transformed to absolute using the Map memory. All references leave information in the Pipe memory.
BR      A 32-word x 28-bit base register memory addressed by the MemBase register. The virtual address for any memory reference is BR[MemBase]+Mar. BR is loaded from Mar by the BrLo←A and BrHi←A functions and can be read indirectly onto B via the virtual address left in the Pipe after a memory reference (Pipe0 and Pipe1 functions).
Pipe    The 16-entry x 6-word pipe contains trace information left by memory references. This information includes the virtual address, map stuff, single-error and double-error information, cache control stuff, task and subtask. It is automatically loaded during any memory reference and can be read onto B by the Pipe0, Pipe1, ..., Pipe5’ functions.
Map     The Map is a 16k or 64k-word x 19-bit (+parity) memory used to transform virtual addresses to absolute. Addressed by VA[10:23], map entries contain 16 bits of real page, write protect, dirty, and referenced bits. They can be written from B with Map← (ASEL) and read from the Pipe after main storage references.
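The two address computations mentioned in the RM and BR entries of Table 1 can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not a description of the hardware: the function names are hypothetical, Mar is treated as an unsigned 16-bit displacement, and the sum is assumed to wrap at the 28-bit width of a BR entry.

```python
from typing import Sequence

VA_MASK = (1 << 28) - 1  # assumption: the sum wraps at the 28-bit BR width


def rm_address(rbase: int, rstk: int) -> int:
    """Concatenate RBase[0:3],,RSTK[0:3] into an 8-bit RM address."""
    return ((rbase & 0xF) << 4) | (rstk & 0xF)


def virtual_address(br: Sequence[int], mem_base: int, mar: int) -> int:
    """Compute BR[MemBase] + Mar for a memory reference.

    `br` models the 32-word base register memory; `mem_base` is the
    5-bit MemBase value; `mar` is the 16-bit displacement
    (assumed unsigned in this sketch).
    """
    return (br[mem_base & 0x1F] + (mar & 0xFFFF)) & VA_MASK


if __name__ == "__main__":
    br = [0] * 32
    br[31] = 0x100000  # BR 31 is the code base; the value is made up
    print(hex(rm_address(0x3, 0xA)))
    print(hex(virtual_address(br, 31, 0x20)))
```

RBase supplies the high four bits of the RM address and RSTK the low four, so a task can address any 16-register block of RM by changing RBase alone.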

Table 2: Registers
Register  Comments (* = one of these for each task; i.e., "task specific")
T*        16-bit (+2 parity) T sources A (ASEL field or FA field with memory ops), B (BSEL field), or the Shifter (ASEL), and loads from either Pd or Md (LC field).
RBase*    4-bit RBase,,RSTK field forms addresses for RM. RBase can be loaded from FF[4:7] or from B[12:15] by the RBase←SC, RBase←B, or Pointers←B functions; it is read onto Pd[12:15] by the Pd←Pointers function. RBase is loaded with 0 or 1 when the IFU dispatches to the first instruction for an opcode.
StkP      The emulator uses STK instead of RM when the BLOCK bit is 1. 8-bit StkP holds the address for STK. The RSTK field is interpreted as an adjustment to StkP, which can be modified -4 to +3 in conjunction with testing for overflow and underflow. This mechanism implements the Mesa evaluation stack. StkP can be loaded by the StkP←B function and read onto Pd[8:15] by the Pd←TIOA&StkP function. Stack overflow and underflow indicators are read into Pd[8:9] by the Pd←Pointers function.
Q         16-bit Q is used as a shift register by multiply and divide. Q can be read onto A (FF field or FA with Fetch← or Store←) or B (BSEL field) and loaded from any B source except a constant (BSEL and FF fields). Functions implement Q lsh 1 and Q rsh 1.
Cnt       Cnt is a 16-bit counter that can be both decremented and tested for zero by a branch condition. Cnt can be loaded from FF[4:7] with 1 to 16 or from B (FF field) and can be read onto Pd (FF).
TIOA*     TIOA is an 8-bit io address register (see "Slow IO") loaded by the TIOA←B function and read onto Pd[0:7] with the Pd←TIOA&StkP function. TIOA[5:7] may also be loaded from FF[5:7].
ShC       16-bit ShC controls the shifter-masker (see "Shifter"). RF←A, WF←A, and ShC←B functions load ShC in various ways. ShC can be read onto Pd by the Pd←ShC function.
MemBase*  MemBase is a 5-bit register addressing BR for memory references. The MemBase←n functions load it from FF[3:7]; the MemBaseX←n functions load it from 0,,MemBX[0:1],,FF[6:7]. The IFU loads MemBase with a value between 0 and 3 relative to MemBX or with 34 to 37 (octal), as specified in IFUM, prior to executing the first instruction of an opcode. MemBase is read onto Pd[3:7] by the Pd←Pointers function and loaded from B[3:7] by the Pointers←B and MemBase←B functions.
MemBX     MemBX is a 2-bit register used like a stack pointer in conjunction with MemBase. The ideas behind this are discussed in "Memory Section".
Link*     16-bit Link holds subroutine return addresses, address-modification for dispatches, IM address for IM reads/writes, and data for TPC reads/writes. It can be read onto or loaded from B[0:15] by the B←Link or Link←B, BigBDispatch←B, or BDispatch←B functions, or from CIA+1 by CALLs and RETURNs.
PC        16-bit PC contains the byte displacement of the next opcode relative to BR 31, the code base. The IFU maintains this register, so only conditional jumps that don’t jump and opcodes of type "pause" have to load it with the PCF←B function. The B←PCX’ function reads PC.
TPC*      TPC contains the address of the next instruction for each task. It is addressed from B[12:15] and read/write control is in JCN. Data is read from/written into Link under control of the JCN field of the instruction.
Mcr       Memory control register; disables parts of memory system for initialization and checkout.
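As an illustration of the two ways the MemBase entry says the register is loaded from the FF field, here is a small Python sketch. It assumes the manual's convention that bit 0 is the left-most bit of a field; the helper names are hypothetical and the sketch models only the bit selection, not the timing or the IFU-initiated loads.

```python
def ff_bits(ff: int, first: int, last: int) -> int:
    """Extract FF[first:last] from an 8-bit FF value (bit 0 is left-most)."""
    width = last - first + 1
    return (ff >> (7 - last)) & ((1 << width) - 1)


def membase_from_ff(ff: int) -> int:
    """MemBase<-n: the 5-bit value comes from FF[3:7]."""
    return ff_bits(ff, 3, 7)


def membase_x_from_ff(mem_bx: int, ff: int) -> int:
    """MemBaseX<-n: the 5-bit value is 0,,MemBX[0:1],,FF[6:7]."""
    return ((mem_bx & 0b11) << 2) | ff_bits(ff, 6, 7)


if __name__ == "__main__":
    print(membase_from_ff(0b101_11111))           # low five bits of FF
    print(membase_x_from_ff(0b10, 0b111111_01))   # MemBX=2, FF[6:7]=1
```

Because the leading bit is forced to 0, the MemBaseX form can only reach base registers 0 through 15, in groups of four selected by MemBX.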

Table 3: Data Paths
Path  Comments
A     The 16-bit high-true A bus (called "alua" in hardware drawings) may be driven from T, RM, STK, Q, Id, Md, a small constant between 0 and 17 (octal), or the shifter. It is also possible to 'or' the low-true shifter output with one of the other A sources. The A bus is totally inside the processor section, not connected to any other sections of Dorado, and it is one of the two Alu inputs. The RF←A and WF←A functions, which load ShC for subsequent shift operations, receive data from A.
Mar   The 16-bit Mar bus transmits the displacement for a memory reference from the processor or IFU section to the memory section. The CFlags register, some bits of the Mcr register, and the BR memory in the memory section are also loaded from Mar. The processor drives Mar only when it is starting a reference or executing one of the functions between 120 and 127 (octal) (i.e., CFlags←A’ and LoadMcr[A,B] are in this group of functions); during other instructions, the IFU may use Mar to initiate instruction-fetches. Mar is driven low-true; when driven by the processor, it receives the same data as are driven onto A (but the shifter cannot drive Mar).
B     The 16-bit B bus consists of one data path inside the processor section (called "alub" in hardware drawings) and another on the backplane (called "Bmux" in hardware drawings); the IOB bus is driven from Alub on Output operations, when it also is an extension of B. Alub and Bmux may be directly driven high-true from registers inside the processor; alternatively, Bmux may be driven low-true from other sections, in which case the processor receives the data onto alub through inverters (so the data appears high-true on alub). The BSEL field in an instruction can specify that either T, RM/STK, Q, or Md sources B; other sources and destinations loaded from B are specified in the FF field; BSEL and FF are used in combination to specify that a literal 8-bit constant (in either the left or right byte of the word with 0’s or 1’s in the other byte) sources B. Alub is one of the two Alu inputs. The processor computes odd byte parity on alub; Bmux and IOB destinations may store or check the parity computed by the processor.
Pd    The Pd path ("Processor data") receives data from an 8-input multiplexor whose inputs are the Alu output, possibly shifted left or right one bit on Alu shift functions or masked on a shifter operation, io device input data, and the infrequently read registers in the processor section. Pd may be written into the T register or the RM or STK memories.
Id    The Id path ("IFU data") is used to send arguments from the IFU to the processor for interpretation. It can be routed onto A using ASEL (A←Id, Fetch←Id, Store←Id, or IFetch←RM/STK); alternatively, the TIsId or RIsId functions can be used to replace data from T or from RM/STK by IFU data; these functions provide a roundabout method of getting Id onto B.
Md    The Md path ("Memory data") moves data from the cache in the memory section into the processor. The processor latches Md and can route it onto A or B, load it into T and RM/STK, or use it in a shift-and-mask operation.
IOA   The IOA bus ("Input-output address") is driven from the TIOA register; it specifies the io device affected by a Pd←Input or Output←B function.
IOB   The IOB bus ("Input-output bus") is driven from alub on an Output←B function or received on Pd by a Pd←Input function; it transmits data to or from an io device.
Fout  ("Fast output bus") transmits data from the error corrector to a fast output device.
Fin   ("Fast input bus") transmits data from a fast input device (presently, there are no fast input devices) to the syndrome generator.
Sout  ("Storage output bus") transmits data from the syndrome generator to storage.
Sin   ("Storage input bus") transmits data from storage to the error corrector.
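Two details of the B path lend themselves to a short worked example in Python: forming a 16-bit constant from an 8-bit literal, and odd byte parity. This is a sketch only; the function names and the boolean parameters are hypothetical, and how BSEL and FF actually encode the byte position and fill value is not modeled.

```python
def b_constant(byte: int, in_left_byte: bool, fill_ones: bool) -> int:
    """Place an 8-bit literal in one byte of a 16-bit word.

    The other byte is filled with all 0's or all 1's, as described
    for constants sourced onto B.
    """
    byte &= 0xFF
    fill = 0xFF if fill_ones else 0x00
    if in_left_byte:
        return (byte << 8) | fill
    return (fill << 8) | byte


def odd_parity_bit(byte: int) -> int:
    """Return the parity bit that makes the total number of 1 bits odd."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 == 1 else 1


if __name__ == "__main__":
    print(hex(b_constant(0x12, in_left_byte=True, fill_ones=False)))
    print(hex(b_constant(0x12, in_left_byte=False, fill_ones=True)))
    print(odd_parity_bit(0x00), odd_parity_bit(0x01))
```

Filling the other byte with 1's makes small negative constants available without a second instruction, which is presumably why both fill values are provided.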

Timing
The terminology used in discussing timing is as follows:
clock              The 30 ns (nominal) atomic time period of the machine. Clock period can be controlled by the baseboard microcomputer or through the manifold system as discussed in the "Dorado Debugging Interface" document (see footnote 1).
cycle              The duration of instructions: two clocks or 60 ns, except for instructions that read/write IM or TPC.
t0                 The instant at which MIR (MicroInstruction Register) is loaded; the beginning of a cycle.
t1                 The next instant after t0, always one clock later.
t2                 The instant following t1: one clock after t1, except for instructions that read/write IM or TPC. Additional clocks intervening for these special cases, which only affect the control section, are denoted by t1a, t1b, etc.
t3, t4             Subsequent instants for an instruction. t3 of the previous instruction coincides with t1 of the current instruction; t4 with t2.
First half cycle   The interval from t0 to t1 (or t2 to t3).
Second half cycle  The interval from t1 to t2 (or t3 to t4).
As implied by this terminology, Dorado initiates a new instruction every cycle. Instructions are pipelined, requiring a total of three cycles for execution. Timing for a typical instruction is shown in Figure 7. At t-2, the next instruction address is determined and instruction fetch from IM begins; at t0, the instruction is loaded into MIR from IM. During the first half cycle, the selected register is read from RM or STK, and at t1 is loaded into a register. During the next two clocks (t1-t3), addition is performed in the ALU; at t3 the result is loaded into a register for writing into RM/STK or T. During the final clock, RM is written.
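The overlap between successive instructions can be checked with a little arithmetic. The Python sketch below assumes ordinary two-clock instructions at the nominal 30 ns clock (instructions that read/write IM or TPC are not modeled) and takes t0 of instruction 0 as time zero; the function name is hypothetical.

```python
CLOCK_NS = 30          # nominal clock period
CLOCKS_PER_CYCLE = 2   # ordinary instructions only


def instant_ns(instruction: int, k: int, clock_ns: int = CLOCK_NS) -> int:
    """Time in ns of instant t<k> of the given instruction.

    Instruction 0's t0 is time zero and instruction n starts n cycles
    later. k may be negative (e.g. t-2, when the fetch from IM begins).
    """
    return (instruction * CLOCKS_PER_CYCLE + k) * clock_ns


if __name__ == "__main__":
    # t3 of the previous instruction coincides with t1 of the current one.
    print(instant_ns(0, 3), instant_ns(1, 1))
    # From t-2 (fetch begins) to t4 (RM written) is six clocks: three cycles.
    print(instant_ns(0, 4) - instant_ns(0, -2))
```

The second figure, 180 ns, is the three-cycle execution time quoted above for a pipelined instruction.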
Since a new instruction begins before the previous one finishes, paths exist to bypass the register being written if the following instruction specifies it as a source. These paths, inaccessible to the programmer, are not shown in Figure 1.
Most registers load from B at t3 (i.e., at the mid-clock of the cycle following the load instruction). These may source B in the instruction after they are loaded. The load information and data are pipelined into the next cycle, as described above. Registers loaded at t2 may be used during the first half-cycle of the following instruction. Usually, this type of register is used for some type of control information, since control registers are normally clocked at t0 (= t2 of previous instruction), data-oriented registers at t1 (t3 of previous instruction).
Table 4 summarizes the time at which loading takes place and some other information.
Footnote 1: We actually operate with a clock period of 32 ns, slower than the 30 ns nominal period, and production machines typically become unreliable at about a 29 ns clock period.

Table 4: Load Timing
Register/     Task      Load     Data               Load
Memory        Specific  Time     Source             Control    Comment

MIR*          no        t0       IM                 JCN        Holds current instruction
CIA           no        t0       TNIA,BNPC          JCN        Holds current instruction address
CIAinc*       no        t1       CIA
TPCI*         no        t2       TNIA, CIA
TPC           yes       FHC      TPCI               HOLD
                        t2       LINK(??)           JCN,B      Reading/writing takes 3 cycles
Link          yes       t2       B                  FF         Also loaded by CALL, RETURN, and dispatches; readout valid t1 to t3
IM            no                 B                  JCN        Reading/writing require 3 cycles
CTASK         no        t0       Next               Switch     Current task
CTD           no        t1       CTASK                         Current task delayed
Ready         no        t0       PEnc               Switch     Task-ready flipflops
StkP          no        t2       B                  FF         New value read if it changes in the same instruction
RBase         yes       t2       F2                 F1         RAM written at t3, bypassed
Cnt           no        t2       F2                 F1         Br cond to sub 1 and test
                                 B                  FF
ALUFM         no        t2       B                  FF         Addressed by ALUF. The output is valid t1 to t3
TIOA          yes       t2       B                  FF         Readout valid till t3
MemBX         no        t2       F2                 F1         Readout valid till t3
MemBase       yes       t2       F2                 F1         Readout valid till t3
                                 MemBase xor 1      FF
ShC           no        t3       A,B                FF         RF←A, WF←A, ShC←B
                        t1       FF                 ASEL,BSEL
Q             no        t3       B                  FF,BSEL
                        t3       ALU[15],,Q[0:14]   FF         Multiply
                        t3       Q[1:15],,ALUcry    FF         Divide
                        t3       Q                  FF         Q rsh 1, Q lsh 1
RM            no        SHC      Pd,Md              LC,RSTK    Bypassed
STK           no        SHC      Pd,Md              LC,RSTK    Bypassed
T             yes       SHC      Pd,Md              LC,FF      Bypassed
IFUM          no        SHC      B                  FF         IFUMLH←/IFUMRH←
BrkIns        no        t2       B                  FF
PC*           no        t3       B                  FF         Level F PC loaded, level X read
Br            no        t2       A                  FF         BrLo←/BrHi←, ←Pipe0, ←Pipe1
MapBuf*       no        FHC      B                  ASEL, FA   Written on Map←, TestSyndrome←, ProcSRN←, LoadMcr
DBuf          no        FHC      B                  ASEL, FA   Written on Store←
Md            yes       t5       cache                         Bypassed
CFlags        no        t2       Mar                FF         For debugging, initialization
Mcr           no        t3       Mar, MapBuf        FF         For debugging, initialization
Asrn          no        t2       Asrn                          Addresses the pipe for ring refs
ProcSRN       no        t3       MapBuf             FF         Addresses the pipe for Pipe0 to Pipe5
TestSyndrome  no        t3       MapBuf             FF         For debugging error correction
Pipe0         no        t3       Br, etc.           ASEL, FA   Written on ref., B←Pipe0
Pipe1         no        t3       Br, etc.           ASEL, FA   Written on ref., B←Pipe1
Pipe2         no        t3                          ASEL, FA   Written on ref., B←Pipe2
Pipe3         no        t14      map                ASEL, FA   Valid after any storage access or Map←
Pipe4         no        t14,t48  map, EC            ASEL, FA   Valid after any storage access
Pipe5         no        t3,t4    cache              ASEL, FA   Written on ref., B←Pipe5

*Cannot be read as data by the processor
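The Multiply and Divide rows for Q in Table 4 give the new value of Q as a concatenation of bit fields. A minimal Python sketch of those two concatenations follows, assuming that bit 0 is the left-most (most significant) bit of a 16-bit word, as it is for instruction fields. The function names are hypothetical, and the sketch shows only how Q is updated, not the accompanying ALU operation.

```python
WORD_MASK = 0xFFFF


def q_multiply_step(alu: int, q: int) -> int:
    """New Q = ALU[15],,Q[0:14].

    Q shifts right one place and the low-order ALU bit enters at the left.
    """
    return ((alu & 1) << 15) | ((q & WORD_MASK) >> 1)


def q_divide_step(q: int, alu_carry: int) -> int:
    """New Q = Q[1:15],,ALUcry.

    Q shifts left one place and the ALU carry enters at the right.
    """
    return ((q << 1) & WORD_MASK) | (alu_carry & 1)


if __name__ == "__main__":
    print(hex(q_multiply_step(0x0001, 0x8002)))
    print(hex(q_divide_step(0x8001, 1)))
```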
Instruction Fields
The 34-bit instruction is divided into the following fields:
Table 5: Instruction Fields
Field   Size     Purpose (may have other effects, described below)
RSTK    4 bits   Selects RM register to be read and/or written
ALUF    4 bits   Selects ALU function or shifter operation
BSEL    3 bits   Selects source for B
LC      3 bits   Controls source and loading of RM and T
ASEL    3 bits   Source/destination control for A
BLOCK   1 bit    Blocks io task unless wakeup is waiting
                 Selects stack operations for emulator task
FF      8 bits   Function (FA=FF[0:1], FB=FF[2:4], FC=FF[5:7])
JCN     8 bits   Jump control
P016    1 bit    Odd parity on first word of instruction
P1733   1 bit    Odd parity on second word of instruction
Total   34 bits + 2 parity
The above instruction layout emphasizes compactness at the expense of programming flexibility. The following comments explain some of these tradeoffs:
1. The RSTK field specifies only four of the eight address bits needed for addressing RM. The other four are taken from the RBase register (loaded by a function). In the emulator task, BLOCK causes STK to be used instead of RM, and RSTK is decoded to cause modifications of StkP.
2. ALUF addresses the 16-word ALUFM memory in which 16 of 26-odd useful ALU operations are stored. For the shift operation decode of ASEL, the first three bits of ALUF select the kind of shift, while the ALUFM address is forced to 16 or 17 (octal).
3. BSEL decodes the most common data sources for B. Less common B sources are selected by FF, and then BSEL encodes one of several destinations for the source.
4. ASEL specifies the source and destination for A. The default source is the RM address selected by RSTK. Four ASEL decodes specify the most common memory operations, where the virtual address is BR[MemBase]+A. These decodes consume the two leading bits of FF to specify alternate sources (T or Id) or less frequent memory operations. The remaining four ASEL decodes select alternate sources T, Id, or the shifter, where the shifter decodes work in combination with ALUF, as discussed later.
5. LC specifies loading of RM/STK and T from Pd and Md.
6. FF is the catch-all field in which operations or data not otherwise specifiable can be encoded. Operations encoded in FF are called "functions". There are five ways FF is used:
   a. To extend the branch address encoded in JCN (long goto, long call).
   b. To form a constant on B as selected by BSEL.
   c. To specify one of 64 common functions and branch conditions while the two leading bits modify the memory reference operation specified in ASEL.
   d. To specify one of 256 functions and branch conditions, some of which use low bits of FF as literal values.
   e. As a shift control value when ASEL decodes to "shift" and BSEL to a constant.
   When FF is used as a function, it sometimes modifies the interpretation of other fields in the instruction. For example:
   a. 16 FF decodes modify RM write address bits which would otherwise have come from RSTK or StkP.
   b. 16 FF decodes modify RM write-address bits which would otherwise come from RBase.
   c. 16 FF decodes select less common B sources, causing BSEL to encode a destination rather than a source for B.
7. JCN (in conjunction with current address) encodes the next instruction address as follows:
   a. One of 64 global Calls.
   b. One of 60 local Gotos.
   c. One of 4 local Calls.
   d. One of 14 local conditional branches with 7 branch conditions.
   e. One of 16 long Gotos/Calls (use FF field for rest of address).
   f. One of 4 IFU jumps for next opcode (high 10 address bits from IFU).
   g. Return.
   h. TPC read/write.
   i. IM read/write (Use low bits of RSTK also).
8. P0 and P1 are odd parity on the left and right halves of IM. When wrong, these give rise to error signals (see "Theory of Operations") which stop the machine after (unfortunately) the instruction with bad parity has been executed. The artifice of deliberately loading both parity bits incorrectly is used to implement breakpoints.
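The parity scheme and the breakpoint artifice in comment 8 can be sketched in Python as follows. Several things here are assumptions rather than facts from this manual: that the two halves are instruction bits 0 to 16 and 17 to 33 (suggested by the field names in Table 5), that each parity bit is counted together with its own half, and all of the function names.

```python
from typing import Tuple

HALF_MASK = (1 << 17) - 1  # 17 bits per half of the 34-bit instruction


def odd_parity(value: int) -> int:
    """Parity bit that makes the number of 1 bits (value plus parity) odd."""
    return 0 if bin(value).count("1") % 2 == 1 else 1


def instruction_parity(instr: int) -> Tuple[int, int]:
    """Return (P0, P1) for a 34-bit instruction, bit 0 left-most.

    Assumption: P0 covers bits 0..16 and P1 covers bits 17..33.
    """
    left = (instr >> 17) & HALF_MASK
    right = instr & HALF_MASK
    return odd_parity(left), odd_parity(right)


def breakpoint_parity(instr: int) -> Tuple[int, int]:
    """Both parity bits deliberately wrong, marking a breakpoint."""
    p0, p1 = instruction_parity(instr)
    return p0 ^ 1, p1 ^ 1


if __name__ == "__main__":
    word = 0b1 << 33  # only the left-most instruction bit set
    print(instruction_parity(word))
    print(breakpoint_parity(word))
```

Inverting both bits, rather than one, distinguishes a deliberate breakpoint from a single-half parity failure in this sketch.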
Notation
The notation used in referring to fields in the instruction is that the left-most bit of the field is denoted as 0. Hence, the fields in the instruction are as follows: RSTK[0:3], ALUF[0:3], BSEL[0:2], LC[0:2], ASEL[0:2], BLOCK[0], FF[0:7], JCN[0:7].
The BLOCK bit is also called StackSelect, for its use in choosing STK instead of RM for the emulator task.
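To make the bit-numbering convention concrete, the following Python sketch packs and unpacks the eight fields of a 34-bit instruction and splits FF into FA, FB, and FC. The field widths and the FF subfields come from Table 5; the assumption that the fields are laid out left to right in the order listed, and all names in the code, are illustrative only.

```python
# (name, width) pairs in the order listed in the Notation section.
# Assumption: this is also the left-to-right layout in the instruction.
FIELDS = [("RSTK", 4), ("ALUF", 4), ("BSEL", 3), ("LC", 3),
          ("ASEL", 3), ("BLOCK", 1), ("FF", 8), ("JCN", 8)]
TOTAL_BITS = sum(width for _, width in FIELDS)  # 34, excluding parity


def decode(instr: int) -> dict:
    """Split a 34-bit instruction into its named fields."""
    fields = {}
    shift = TOTAL_BITS
    for name, width in FIELDS:
        shift -= width
        fields[name] = (instr >> shift) & ((1 << width) - 1)
    return fields


def encode(fields: dict) -> int:
    """Pack named fields back into a 34-bit instruction."""
    instr = 0
    for name, width in FIELDS:
        instr = (instr << width) | (fields[name] & ((1 << width) - 1))
    return instr


def split_ff(ff: int) -> dict:
    """FA = FF[0:1], FB = FF[2:4], FC = FF[5:7] (bit 0 is left-most)."""
    return {"FA": (ff >> 6) & 0b11, "FB": (ff >> 3) & 0b111, "FC": ff & 0b111}


if __name__ == "__main__":
    sample = {"RSTK": 0xA, "ALUF": 0x3, "BSEL": 5, "LC": 2,
              "ASEL": 7, "BLOCK": 1, "FF": 0b10110011, "JCN": 0x5C}
    word = encode(sample)
    print(decode(word) == sample)
    print(split_ff(sample["FF"]))
```

Because bit 0 is the left-most bit, RSTK[0] is the most significant bit of the instruction in this layout and JCN[7] the least significant.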