Page Numbers: Yes X: 530 Y: 10.5" First Page: 48
Margins: Top: 1.3" Bottom: 1"
Heading:
Dorado MicroassemblerEdward R. Fiala21 July 1980
36. Instruction Memory Read-Write
The hardware provides an efficient method for loading the instruction memory (which might be common if microcode overlays are used) and a painful method of reading the instruction memory (unlikely to be dynamically frequent). Each instruction that reads or writes IM takes three cycles.
IM read/write is encoded in the JCN and RSTK fields of the instruction, so you may not program any control clause in the same instruction. The instruction after the one doing the read or write must be at .+1 within the page, and the assembler automatically imposes this constraint, so you do not have to use "At[N]". Tasking must be off.
For loading IM, the address to be written is first loaded into Link←, then the left or right half is written from a B source. RSTK[1:3] control left/right half, good/bad parity, and the 17th data bit (RSTK.0 or Block), so there is little flexibility in selecting an RM address for use with the write--you probably should source the data from T, Q, or Cnt. Link is smashed with .+1 after the write, so it has to be reloaded before writing the other half of IM. The following sequence is an example:
%Have 16 bits of left-half data at STK[StkP], RStk.0 and JCN.7 value in the sign bit and low bit of STK[StkP-1], respectively, and 16 bits of right-half data in STK[StkP-2]. Write this data into the IM address in Q with good parity.
%
T←Stack&-1, Link←Q;
TaskingOff, Stack&-1, Branch[R0TRUE,R<0];
IMLHR0’POK←T;
T←Stack&+1, Link←Q;
WRH:Stack&-2, Branch[BTRUE,R ODD];
IMRHB’POK←T;
IMWFIN:TaskingOn;
...
R0TRUE:IMLHR0POK←T;
T←Stack&+1, Link←Q, Branch[WRH];
BTRUE:IMRHBPOK←T;
T←Stack&-2, Branch[IMWFIN];
The IM write instructions take three cycles each but are otherwise indistinguishable from ordinary instructions. This means that there are no strange restrictions on other actions carried out in the same instruction.
IM data are read nine bits at-a-time, with the address again coming from Link and the byte number from RStk[2:3]. The data arrangement is shown in a figure of the hardware manual and is read back by the B←Link function in the cycle immediately after the read.
%Have IM address in RM location RTemp1. Read the left-half of IM to RTemp3 right-half to RTemp2 using RTemp0 as temp storage. Assume RBase points at correct region of RM at call. The extra bits of IM (P.16, P.17, RStk.0, and Block) are flushed. RTemp0 to RTemp3 are RM locations whose low bits are 0, 1, 2, and 3, respectively. RRetn is another RM location in the same region as RTemp0 to RTemp3
%
Subroutine;
RDIMD:RRetn←Link;*Save return to caller of subroutine
Top Level;
Link←RTemp1;
TaskingOff;
ReadIM[1];
RTemp0←Link;*RTemp0[7:17]←byte 1
Link←RTemp1;
ReadIM[0], T←Lsh[RTemp0,10];*T[0:7]←byte 1 flushing parity bit
RTemp3←Link;*RTemp3[7:17]←byte 0
Link←RTemp1;
*RTemp3←byte0,,byte 1 (flushing RSTK.0 bit)
ReadIM[3], RTemp3←Lcy[RTemp3,T,10];
RTemp2←Link;*RTemp2[7:17]←byte 3
Link←RTemp1;
ReadIM[2], T←Lsh[RTemp2,10];*T[0:7]←byte 3, flushing parity bit
RTemp2←Link;*RTemp2[7:17]←byte 2
TaskingOn;
Link←RRetn;*Restore return address to LINK
Subroutine;
*RTemp2←byte2,,byte3 flushing JCN.7
Return, RTemp2←Lcy[T,RTemp2,10];
37. Reading and Loading Task PC’s
The method by which task PC’s (TPC) are loaded for specific tasks, like the IM read/write, is a funny format RETURN in the JCN field of the instruction. Consequently, control clauses are verboten in the instructions that do this, and, again, the successor to the instruction that does the TPC read/load must be at .+1 within the page (automatically constrained by the assembler). Tasking must be off.
Data for TPC goes to/from LINK, task number from B[14:17].
Two macros are defined for reading and writing TPC from Link (These are B destinations.):
LdTPC←Loads TPC for task from LINK, task number from B[14:17]
RdTPC←Reads TPC for task into LINK, task number from B[14:17]
These macros fill in JCN appropriately. Normally, the task number on B will be a constant produced by mentioning the name of a task you have previously defined with TASKN, e.g.:
TASKN[DSP,14];*Define DSP as the display task
Link←DSPST;*Load Link with starting address
TaskingOff;*Require tasking off during LdTPC←
LdTPC←DSP;*is then equivalent to LdTPC←14C;
TaskingOn;
38. Divide and Multiply
The Dorado hardware defines special standalone functions Multiply, Divide, and CDivide which allow multiplication to be carried out in a one-cycle loop and division in a two cycle loop.
The hardware actions caused by these functions are as follows:
Multiply:
PD←ALUCarry..ALU/2
Q←ALU[17]..Q/2
Next branch address← whatever it is ’or’ 2 if Q[16] is 1
Divide:
PD←2*ALU..Q[00]
Q←2*Q..ALUCarry’
The following examples show how these are used:
%At entry:
RTemp/ multiplier (20-bit unsigned)
T/ multiplicand (20-bit unsigned)
At exit:
RTemp..Q/ 40-bit result
The first step is outside the inner loop. It moves the multiplier into Q and tests Q[17]. The second step, also outside the inner loop, tests Q[16] with the Multiply function and initializes the result (computed in RTemp) to 0. It enters the inner loop at the "add" or "no-add" position based upon step 1. The Multiply function also causes a dispatch, so the inner loop is entered with the "add" or "no add" decision already made for the two low bits of multiplier, 0 in Q[0], and untested multiplier bits in Q[1:16]. The inner loop does 16 useful Multiply steps, 1 useless step testing the 0 that started out in Q[0], and then the exit instruction does a final Multiply testing the low bit of the result, leaving the result in RTemp[0:17]..Q[0:17]. Instruction placement is critical. The two exit instructions have to be located so that ’or’ing 2 into their locations doesn’t change the location. The inner loop instructions have to be located so that the first is a fast-goto location and a multiple of 4. This can only be satisfied if the low four bits of address are 4, 10, or 14.
%
MULT:Q←RTemp;
Goto[.+2,R Even], B←RTemp, Cnt←16S;
Goto[M1], RTemp←T-T, Multiply;
Goto[M0], RTemp←T-T, Multiply;
DTABLE[MulX,0,7770]; *Dispatch table origin 0 mod 10 (0 and 1 unused).
MXIT0:Dat[MulX,2];
Return, Dat[MulX,3];
*Here after Q[16] was 0 (no add)
M0:DblGoto[M0,M0E,Cnt#0&-1], RTemp←RTemp, Multiply, Dat[MulX,4];
M0E:Goto[MXIT0], RTemp←RTemp, Multiply, Dat[MulX,5];
*Here after Q[16] was 1 (add)
M1:DblGoto[M0,M0E,Cnt#0&-1], RTemp←(RTemp)+T, Multiply, Dat[MulX,6];
Goto[MXIT0], RTemp←(RTemp)+T, Multiply, Dat[MulX,7];
%At entry:
RTemp/ most significant 20 bits of 40-bit unsigned dividend
Q/ least significant part of dividend
T/ divisor (20-bit unsigned)
At exit:
Q/ quotient (20-bit unsigned)
RTemp/ remainder (20-bit unsigned)
Each divide step shifts Q[0] from the low part of the dividend into the high part of the dividend while doing the Divide function and testing for exit. The second instruction chooses between add or subtract, based upon whether or not the last add/subtract "succeeded".
The duplicated instructions are required because they are part of branch condition pairs.
%
DIV:(RTemp)-T, Cnt←17S;
Goto[DivOK,Carry’];*Test whether the divide is possible
Return, RTemp←T-T;*Return 0 indicating impossible
DivOK:PD←T;
Goto[BigDiv,Alu<0];*Branch for the hard case
DblGoto[DvExit,DvTest,Cnt=0&-1], RTemp←(RTemp)-T, Divide;
*Easy case--divisor bit 0 is 0
DvTest:DblGoto[Dv0,Dv1,Carry’];
Dv0:DblGoto[DvExit,DvTest,Cnt=0&-1], RTemp←(RTemp)+T, Divide;
Dv1:DblGoto[DvExit,DvTest,Cnt=0&-1], RTemp←(RTemp)-T, Divide;
DvExit:Goto[DvXit0,Carry’];
DvXit1:(RTemp)-T, Divide;
DblGoto[DvXitFix,DvXitOK,Alu<0], RTemp←(RTemp)-T;
DvXit0:(RTemp)+T, Divide;
DblGoto[DvXitFix,DvXitOK,Alu<0], RTemp←(RTemp)+T;
*Fix for having subtracted too much in last step
DvXitFix:
Return, RTemp←(RTemp)+T;*Adjust remainder
DvXitOK:
Return;
*Hard case--bit 0 of divisor is 1
BigDiv:DblGoto[BigDvd,BDvLp1,Alu<0], RTemp←(RTemp)-T, Divide;
BDL2:DblGoto[BigDvd,BDvLp1,Alu<0], RTemp←(RTemp)-T, Divide;
BDvLp:DblGoto[BigDvd,BDvLp1,Alu<0], RTemp←(RTemp)-T, Divide;
BDvLp1:RTemp←(RTemp)+T;
Goto[BDvLp,Cnt#0&-1], RTemp←(RTemp)+T;
BDvXit:(RTemp)-T, Divide;
BDvXit0:
DblGoto[DvXitFix,DvXitOK,Carry’], RTemp←(RTemp)-T;
BRDX:Goto[BDvXit0],(RTemp)-T,Divide;
*Big partial dividend, check for carry
BigDvd:Goto[BigDvH,Carry’];
DblGoto[BDvXit,BDvLp,Cnt=0&-1], PD←RTemp;
*Most complicated case--big R and no carry
BigDvH:RTemp←(RTemp)+T, Goto[.+2,Cnt#0&-1];*R+2T-T
BRDX1:PD←(A←RTemp), CDivide, Return;
*Force carry to 0--1 bit in Q
BigRLP:Goto[BrDvXit,Cnt=0&-1], RTemp←(A←RTemp), CDivide;
Goto[BDL2,Alu>=0], PD←RTemp;
Goto[BigRLP], RTemp←(RTemp)-T;
*Exit for the hard cases
BrDvXit:
Goto[BRDX,Alu>=0];
RTemp←(RTemp)-T, Goto[BRDX1];
39. Programming Tips and Examples
Experience suggests that it is necessary to worry about availability of FF for use in long branches. For this reason you should try to leave the FF field free for a long branch when this doesn’t add extra instructions.
Another issue to be concerned with is usage of Alu operations. Preliminary versions of the Mesa and Alto emulators have suggested that the 15 operations *’ed in the "Assembling for ALUFM" section will be required. At the moment, A0 is also defined. However, try to avoid using A0 and other doubtful operations unless you really need them. In those places where A0 would be the simplest, try to use A-B with the same source for both A and B instead. Similarly, try to use A-B-1 rather than A1 and XOR rather than EQV. If you need an extra operation to save time or space, go ahead and use it, but don’t do this needlessly in case we decide to change the selection of operations later.
Also, BitBlt uses two ALUFM locations as variable operations but should restore these to standard values before exiting to the next opcode. If these two operations are restored, the emulator will have 17 Alu operations available, though other tasks will have only 15 available. The comments in the D1Alu.Mc file show how to define the two "emulator only" operations so that the assembler will flag an error when one of these is used from an io task.
It is also important to take full advantage of the various numbers which can be delivered by ←ID when programming emulators. These are the operand, argument bytes alpha and beta, and then instruction length endlessly. For example, on Mesa DIVIDE, it was possible to use length=1 to negate the quotient and remainder with (ID)-T-1 (etc.). Also, the same instruction can be used for NOT and NEG opcodes and the same exit instruction for ADD and SUB. Try to exploit the various options afforded by this.
The examples below will be augmented as more code is available.
*Mesa Read-Field opcode
RDFLD:IFetch←Stack, TIsID;*Calc. pointer as MDS+a+Stack
Stack←MD, RF←ID;*IFU supplies b
IFUJump[0], Stack←ShiftLMask;*Shift and mask, Stack←result
*Opcode 23, type = regular, length = 3 bytes, MemBase←MDS, RBase←0, no operand
IFUReg[23,3,MDS,0,RDFLD,17,0,0];
*Mesa Write-Field opcode
WRTFLD:T←(IFetch←Stack&-1)+T, TIsID;*Calc pointer and save in T
WF←ID, RTemp←T;*T←field descriptor
T←ShMDBothMasks[Stack&-1];*Deposit Stack in MD and pop
IFUJump[0], Store←RTemp, DBuf←T;*Store result, exit
*Opcode 24, type regular, length = 3 bytes, MemBase←MDS, RBase←0, no operand
IFUREG[24,3,MDS,0,WRTFLD,17,0,0];
*Random number generator using 8 words of RM as storage for the "state" of
*the generator.
RMRegion[Other];
RV[RGState,0]; RV[Rand,0];
RMRegion[Random];
RV[R0,134134]; RV[R1,054206];
RV[R2,036711]; RV[R3,103625];
RV[R4,117253]; RV[R5,154737];
RV[R6,041344]; RV[R7,006712];
SET[X,20];*A "call" location
RGen:Goto[RGen1], T←R0, RBase←RBase[Rand], At[X];
Goto[RGen1], T←R1, RBase←RBase[Rand], At[X,1];
Goto[RGen1], T←R2, RBase←RBase[Rand], At[X,2];
Goto[RGen1], T←R3, RBase←RBase[Rand], At[X,3];
Goto[RGen1], T←R4, RBase←RBase[Rand], At[X,4];
Goto[RGen1], T←R5, RBase←RBase[Rand], At[X,5];
Goto[RGen1], T←R6, RBase←RBase[Rand], At[X,6];
Goto[RGen1], T←R7, RBase←RBase[Rand], At[X,7];
RGEN1:Return, T←Rand←(Rand)+T;
*The calls are as follows:
RGState←(RGState)+1, BDispatch←RGState;
Call[RGEN], RBase←RBase[Random];*Return random number in T
*Test-and-set in one instruction for use by different tasks that control
*each other. Sign bit of RM register RFlag is the lock.
RFlag←(RFlag) or (100000C), Branch[AlreadyLocked,R<0];
*Alternative lock procedure: store -1 in RFlag when unlocked; then:
RFlag←(RFlag)+1, Branch[AlreadyLocked,R>=0];
Appendix 1. MicroD
MicroD transforms .Dib files produced by Micro into .Mb files. Since instruction placement is fairly tedious, the display shows a progress message, so you can monitor progress of the load. The sequence of progress messages is as follows:
Loading File1...
Loading File2...
...
Loading FileN...
N instructions, M words for symbols
Linking...
Building allocation lists...
Assigning locations...
Reloading binaries...
Checking assignment...
Writing .MB...
N words free
Error messages may appear at any time. Some of these immediately abort the load, but most errors do not abort until the end of the current progress step. In other words, errors during "Linking...", will usually abort at the end of this loading phase; errors during "Building allocation lists...", usually abort at the end of this phase, etc.
After "Building allocation lists..." has completed, all bugs will have been detected except conflicting absolute addresses (two AT’s at same location) and various overflows (too many globals, too many IFU entries, too many instructions on a page, etc.).
The data printout for IFUM and RM is in two columns. For RM the address symbol(s) associated with a location are printed to the right of the data. For IFUM, the IM target symbol is printed to the right of the data. For IM, the printout is like the following:
345457 23456 23457FOO
346601233333144444
meaning that the 345th instruction assembled by Micro with label "FOO" was placed at absolute location 457 and the two 16-bit numbers are the octal contents of the instruction.
The error messages produced by MicroD contain the symbolic address of the instruction at which the error was detected, when relevant.
Micro Output for the Imaginary Machine
Micro outputs stuff for IM, RM, IFUM, ALUFM, STK, and fake memories called BR, BRX, DEVICE, TASKN, VERSION, RVREL, IMLOCK, and IMMASK.
MicroD transforms only IM and IFUM data. Addresses in all memories and data in all memories except IM and IFUM pass through MicroD to the .Mb output file unchanged--this excludes data and addresses for VERSION, RVREL, IMLOCK, and IMMASK, which are fake memories whose contents and address symbols are consumed and flushed by MicroD.
Data are output for IM, IFUM, ALUFM, RM, and STK in the form expected by MicroD and Midas, as given below. BR, BRX, DEVICE, and TASKN have address symbols useful when debugging with Midas but no data are output for these memories. In summary, we have:
IMTransformed by MicroD--see below
IFUMTransformed by MicroD--see below
RM20-bits per word
STK20-bits per word (Most programs don’t assemble anything for this memory, but provision is made for this.)
ALUFM10-bits per word with 0 and 3:7 containing the 6 bits loaded into the ALUFM ram
BRbase register address symbols for debugging
BRXMemBX-relative base register address symbols for debugging
DEVICEio device address symbols for debugging
TASKNtask address symbols for debugging
VERSION1-word memory defining the machine as Dorado for MicroD.
IMLOCK10000-word x 1-bit memory; a 1 in an IMLOCK word prevents MicroD from placing any instruction in the corresponding location of the microstore.
IMMASK10000-word x 24-bit memory defining dispatch table length and allowable placement of first word.
IM and IFUM parity bits expected by the hardware are computed by neither Micro nor MicroD; Midas computes these at the time it does the load.
Micro outputs a modified form of Dorado IFUM words, as follows:
PA 1 bitPacked-a bit
NEnt 2 bitsNumber of instructions in target sequence
1 bitUnused
IFAD14 bitsImaginary address of target instruction
Sign 1 bit
3 bitsUnused (parity bits filled in by Midas)
Length’ 2 bitsopcode length (1, 2, or 3 bytes)
RBaseB’ 1 bitRBase initialization
MemB 3 bitsMemBase initialization
Pause’ 1 bit
Jump’ 1 bit
N 4 bits
All of the bits are located in positions compatible with IFUMRH←/IFUMLH← except for IFAD, which has two extra bits. These extra bits are positioned to avoid conflict with real IFUMRH←/IFUMLH← data bits.
MicroD will transform IFAD into a real address and output the proper 12 bits in 0 to 11 of the first word, as well as zeroing the extraneous bits.
Micro outputs for each instruction assembled the 42-bit (+2 parity bits) instruction and four extra words of stuff needed by MicroD as follows:
Dorado instr.42 bitsComplete except for branch address stuff
P016 1 bitLoad bad parity into IM[0:20]
P2141 1 bitLoad bad parity into IM[21:41]
14 bitsunused
1 bitunused
W0@ 1 bitPlace at location W0
Glb@ 1 bitPlace at a global location
OnPg@ 1 bitPlace on the page specified in W0
W014 bitsLocation for placement if W0@ or OnPg@ = 1
Returns 1 bitThis instruction does a Return, CoReturn, or IFUJump
(or IM or TPC read/write)
Calls 1 bitThis instruction does a Call or CoReturn
JBC 1 bitThis instruction has a branch condition in JCN
UsesFF 1 bitFF field unavailable for long goto or long call
W114 bitsImaginary address of unconditional or false branch
(7777 defaults this to .+1)
Branches 1 bitThis instruction does a Branch
Goes 1 bitThis instruction does a Goto
Emul 1 bitPrint as emulator instruction
IsCond 1 bitThis instr has a branch condition (i.e., W2 at W1 OR 1)
W214 bitsImaginary true address of conditional branch
(7777 defaults this to .+1)
W1 and W2 may receive automatic Micro fixups if they are forward references.
Micro finishes assembly for all bits of the instruction except those referring to instruction locations. In other words, the only job of MicroD is assigning absolute locations for the instructions and storing appropriate stuff in the JCN fields (and for long calls, in the FF fields) of the instructions and in the address fields of IFU words.
For conditional branches, the branch condition(s) are already in FF or in JCN, so MicroD does not fix up those parts of the instruction. For Return, CoReturn, IFUJump, IM read/write, and TPC read/write, JCN is also complete.
A more precise meaning for some of these bits is as follows:
IsCondThe instruction at imaginary address W2 must be placed at the absolute location assigned to W1 xor 1.
ReturnsJCN has been completely assembled by Micro; W1 and W2 are irrelevant.
CallsThe next instruction in sequence must be at .+1 within the same page, and, unless Returns is also 1, the instruction W1 must be placed at a call location in the microstore.
Instruction Placement
The discussion here describes the original design of MicroD by E. Fiala. The actual MicroD, designed and implemented by L. Deutsch, differs from this description in a number of ways. There is presently no description of the existing program.
The "Load" pass of MicroD loads the .Dib file output by Micro into simulated memories and executes fixups. After loading, all addresses and all data not needed during placement computations are flushed; after placement computation is finished, the .Dib binaries are reread, modified with the placement information and output on the .MB output file.
After loading, several passes are made over IM data as described below. During the "Link" pass simulated memory for an instruction is viewed as follows:
AlcPtr20 bitsPoints at alist header (now 0)
Link20 bitsPointer to next alist item (now 0)
4 bitstail of Dorado instruction
14 bitsUnused
1 bitUnused
Place 3 bits0 = W0 is the absolute address of this instruction
1 = Place at a global location
2 = Place at a global and place W0 at . xor 1
3 = IFU entry
4 = Place at even location and place W0 at . xor 1
5 = IFU entry and place W0 at . xor 1
6 = Place at odd location and place W0 at . xor 1
7 = None of the above
W014 bitsAbsolute addr of this instr if Place = 0
Imaginary addr of instr at . xor 1 if Place indicates it
Returns 1 bitJCN field fully assembled; ignore W1 and W2
Calls 1 bitPlace next instr. at (.+1 & 77)+(. & 7700); require W1 to
lie at a call location unless Returns is 1
JBC 1 bitPlace W1 and W2 at a reachable JCN branch condition target
UsesFF 1 bitFF field unavailable for long goto or long call
W114 bitsImaginary address of branch from this instruction
Branches 1 bitDoes a Branch
Goes 1 bitDoes a Goto
1 bitUnused
IsCond 1 bitHas a branch condition
W214 bitsSecond imaginary address of DBLxxx or .+1
JBCT 1 bitThe target of a JCN-encoded conditional branch
GoedTo 1 bitThe target of an unconditional or false conditional Goto
Called 1 bitTarget of unconditional or false conditional Call
jbcLink15 bits7777 if no JCN conditional branch else 10000+imag addr
"Link" then scans IM, doing the following for each word:
a.AlcPtr and Link are initialized to 0.
b.If W2 is relevant (= IsCond & not Returns), then W2 must be at W1 xor 1, so the W0 and Place fields are set appropriately for both words, making error checks for inconsistent constraints.
c.The JBCT, GoedTo, and Called bits are set in W1 as appropriate (ignored if Returns eq 1).
d.The word containing W2, now disposed of, is converted into a brLink. If Returns or (not IsCond & not UsesFF), then W1 can be anywhere and no restriction is propagated. Otherwise, W1 must be in the same page as this instruction. Either brLink or jbcLink is set to 10000+W1 and the other is set to 7777 (= empty). jbcLink is used when the branch target must be a reachable JCN-encoded conditional branch location.
While propagating xor1 relationships, error checks ensure that no situations where different instructions must be xor1 to the same instruction occur. If such errors are detected, error messages are output, and at the end of "Link" assembly terminates.
"Link" then scans simulated IFUM and, for IFU entries which have been loaded, sets the IFUE state in Place for addresses branched to from the IFU; if NEnt is greater than 1, then Calls is set 1 in the first NEnt-1 instructions of the entry vector.
At the end of "Link" simulated memory is as follows:
AlcPtr20 bitsPointer to the alist header
Link20 bitsPointer to next alist item
1 bitUnused
Place 3 bitsPlacement constraint
W014 bitsAbsolute address of this instruction if Place eq 0
Returns 1 bitJCN is correct and W1 is irrelevant
Calls 1 bitthe next imaginary instr must be placed at .+1
JBC 1 bitA branch condition is in JCN
UsesFF 1 bitFF field not available for long call or long goto
W114 bitsImaginary address of branch from this instruction
State 3 bitsState of allocation list (now 0)
brLink15 bitsImaginary addr of next instr in page or 7777B if empty
JBCT 1 bitTarget of branch with condition in JCN
GoedTo 1 bitPlace at a goto location
Called 1 bitPlace at a call location
jbcLink15 bitsImaginary addr of next instr in subpage or 7777B if empty
At the end of "Link" each instruction contains a collection of flags and W0 describing restrictions on its placement, and the lists beginning at jbcLink and brLink thread through instructions on the same subpage or same page. W1, UsesFF, and Returns indicate how the JCN (and sometimes the FF) field must be filled in for its own branch. Calls connects it to the instruction at .+1, and Calls in the preceding imaginary word may connect it to .-1.
The "AList" pass of MicroD transforms data structures left by "Link" into a form more amenable to allocation. The word containing W0 and the JBCT, GoedTo, and Called bits are processed so that the placement constraints are contained in a one-word "Mask" and in the three-bit "State" field. The jbcLink and brLink lists are transformed into circular lists as follows:
a.Initially, xxLink contains 7777B (empty) or 10000+imaginary addr, interpreted as an "unmarked" pointer.
b.Imaginary addr in xxLink is interpreted as a "marked" pointer (which implies that imaginary address 7777 is unusable--sorry about that, but the allocator is unlikely to be good enough to assemble 100% of the microstore anyway).
c.During the scan of IM, if xxLink is already marked, skip it. Otherwise,
d.Follow and mark the xxLinks until either 7777 (empty) or a marked link is encountered. If empty, change that to a marked pointer to the starting xxLink. If a marked pointer, splice the list just scanned in at that place, except that if the marked pointer is at the original xxLink then done (List was already circular).
Next, "alists" of instructions connected by Calls or Xor1 are built. Alists have the property that the placement of every instruction in the alist is determined unambiguously by the placement of any other element. Alists begin at a header and thread through the Link words of IM entries in the alist. The interpretation of State is as follows:
0Absolute--list contains absolutely located instructions -or-
Page-relative--alist contains instructions whose low 6 bits are located
1Other--placement constraint encoded in Mask (currently unused)
2Xor1--two-instruction alist with instructions at an xor1 pair, legal placements encoded in Mask
3Plus1--multi-instruction alist with instructions bearing a .+1 relationship to predecessors
4AnyCall--one-instruction at any call location
5AnyGo--one-instruction at any goto location
6AnyIFUE--one-instruction at any IFU entry
7Any--one-instruction arbitrarily located
Legal alists containing arbitary combinations of Calls and Xor1 constraints are transformable into a "Plus1" list. Header locations for the alists are determined as follows:
a.Absolutely-located alists have their header in the PageTab entry (see below) for the appropriate absolute page. All absolutely-located instructions in that page are on that single alist.
b.Page-relative alists (i.e., ones containing a Global) have header in GlobTab.
c.AnyCall, AnyGo, AnyIFUE, and Any instructions which have both jbcLink and brLink equal to 7777 (empty), are combined onto single lists. These are not considered to be part of any instruction cluster and are allocated at the last possible moment. Instructions which are only reached by long Goto/Call or IFU dispatch and which themselves do long Goto/Call, Return, or IFUJump wind up on these lists.
d.All other alists have their headers in AlcTab.
The AlcPtr word in each IM word’s structure points at the alist header. This is needed for clustering instructions into pages.
The "Cluster" pass of MicroD groups and sorts the alists into clusters of instructions that must appear on the same 64-word page of the microstore. This is done in the following steps:
a.Absolute clusters for pages 0-77 are collected and sorted by size.
b.Global clusters are collected and sorted by size.
c.Global clusters are merged into page 0-77 clusters.
d.Remaining clusters are collected and sorted by size.
e.Remaining clusters are merged into page 0-77 clusters.
f.The page-independent AnyCall, AnyGo, AnyIFUE, and Any alists are allocated.
The "seed" alist for the cluster gathering procedures is obtained as follows:
a.The PageTab entry for a page contains its absolutely-located instructions.
b.GlobTab entries not absorbed during (a) are seeds for global clusters.
c.Take AlcTab entries not absorbed collecting other clusters in an arbitrary order.
Note: The circular jbcLink and brLink lists form a fully-connected structure, so the cluster gathering process can begin with an arbitrary seed alist. The purpose of collecting the clusters in the careful order described above is to avoid unnecessary sorting of the clusters and avoid undesirable thrashing by the cluster-merging heuristic.
As a cluster is collected, the alists composing it are aggregated into adjacent AlcTab locations, and the single-instruction alists (probably 80% of all instructions are on single-instruction alists) are rplaca’d onto special lists for the cluster. The PageTab or ClusTab structure describing a cluster is as follows:
a.Pointer to first AlcTab alist.
b.Count of alists in AlcTab.
c.Header for AnyCall instructions in cluster.
d.Header for AnyGoto instructions in cluster.
e.Header for Anywhere instructions in cluster.
f.Header for absolutely-located/page-relative instructions in page/cluster.
g.Count of total instructions in cluster.
PageTab only:
h.Count of total goto locations occupied by current allocation of page.
i.Count of total call locations.
j.Count of total JCN locations.
k.Count of total JCN conditional branch goto locations.
l.Count of total JCN conditional branch call locations.
m.4-word bit table for allocation.
This information is needed by the allocate-and-merge heuristic. A rough sketch of the heuristic is as follows:
a.Initially, each PageTab entry contains the assorted lists described above and an empty bit table for the page.
b.The alists in AlcTab are sorted into a desired allocation order (undecided how this works at present).
c.The AlcTab alists are allocated, the bit table bits filled in, and the tentatively assigned location stored in brLink (which is no longer needed).
d.The assorted counts are filled in by counting the ones in the bit table appropriately. To these counts are added the lengths of the Anyxx lists.
e.Merges are considered in the order of decreasing size. Namely, the can-I-merge question is asked for the largest entry in ClusTab with the largest entry in PageTab and then successively smaller PageTab entries until the answer is "yes".
f.If either the PageTab or the ClusTab entry contains only alists beginning in the Anyxx headers (i.e., there are no AlcTab alists for the cluster), then the merge question can be answered by considering only the assorted counts. Otherwise, the counts will provide a certain negative answer for most situations when the merge is impossible.
g.If the PageTab entry is empty (i.e., the page hasn’t been used yet), then the merge is ok, so the page-relative alists in the cluster are converted to absolute, the AlcTab alists are sorted into position and the bit table and counts are filled in as above. This may result in an error if the cluster is too big for one page.
h.If the counts indicate that a merge is probably ok, then the bit table in PageTab is copied and an attempt is made to allocate the alists in the cluster without changing any location assignments already made for the PageTab alists. If this succeeds the clusters are merged with the cluster’s AlcTab alists being appended after the ones already in PageTab.
i.If (h) fails the answer is presently assumed to be "no". (This can be improved later by resorting the alists in PageTab and in the cluster, but maybe the heuristic will work well enough without resorting to this time-consuming reallocation.)
j.If the answer is "no" then loop to the next smaller PageTab entry.
The "Allocate" pass of MicroD is carried out as follows: Each entry in PageTab now represents instructions that will wind up on a single absolute page. AlcTab alists have already been assigned absolute locations (assignment in brLink). Absolute locations are now assigned to the remaining instructions on the AnyCall, AnyGoto, and Any lists for each page. Then the instructions on the page-independent AnyCall, AnyGoto, and Any lists are allocated wherever there is space.
The "Relocation" pass of MicroD rereads the .Dib binaries, checks assignments of IM words, and outputs a .Mb file in the form expected by Midas. Memory definitions, addresses, and data for all memories except IM and IFUM are output unchanged in the order read, except that the fake memories intended only for MicroD (RVREL, IMLOCK, VERSION, and IMLOCK) are flushed. IM addresses are also output unchanged--they are not relocated because Midas works with the unrelocated addresses.
However, modified definitions for IM and IFUM are output, and MicroD builds an in-core data structure for IM and IFUM words so that these memories can be listed on the .Dls file. To do this, it compresses Dorado instructions into the form shown below; IM address symbols are appended to the appropriate symbol chain.
MicroD fills in JCN (and sometimes FF) fields of instructions and IFUM words with absolute information. In filling in JCN the rules are as follows:
1.If the instruction has a branch condition in JCN, only JCN[1:4], the 4 bits selecting from 16 possible target addresses, are filled in by MicroD (other bits were filled by Micro.).
2.If the instruction has Returns=1, no fixup is made.
3.Otherwise, all bits JCN[0:7] are set by MicroD to the correct values, and for long gotos/calls FF[0:7] are also set.
MicroD must also zero the extraneous bits in each IFUM word.
After this, representation of the IM words is as follows:
Dorado instr.44 bits
14 bitsunused
2 bitsunused
Undef 1 bitThis bit must be 0
Emul 1 bitPrint out as an emulator instruction
AbsAddr14 bits
SymLink20 bitsPointer to chain of symbols
Then:
a.Memory definition blocks compatible with Midas are output for all memories on the .Mb file; the sizes expected by Midas are as follows:
IM10000 words x 100-bits (IM representation given above with SymLink removed)
IFUM 2000 words x 40-bits
otherpassed through MicroD unchanged
b.Data blocks are output on the .Mb file for all memories. IM words are represented by the 14-bit absolute address as well as the data, so that both the imaginary and absolute addresses are available to Midas during debugging.
c.IM words are output as data blocks beginning at 0 and extending to the last imaginary location used by the program.
d.Finally, the Micro endblock is output.
Appendix 2. Recent Hardware and Assembler Changes
1.The Micro "While" builtin has been added (affecting D1Lang internally but probably uninteresting to programmers).
2.The DispTable placement macro has been added, supported by the IMMask memory in MicroD.
3.The StackNOUFL, StackNOUFL&+1, StackNOUFL&+2, and StackNOUFL&+3 macros have been added to read the top stack entry without checking for a StkP=0 underflow condition.
4.The macros for restoring an ALUFM entry that has been smashed have been added; the nH literals have been removed.
5.The "Cnt-1" macro has been added to use the Cnt=0&-1 branch condition for its side effect.
6."SetRMRegion" has been added so that the definition of an RM region can be in a different file from definitions for registers in that region.
7.The BRX (fake) memory has been added for use in contexts which specify the MemBX-relative loading of MemBase (e.g., in defining IFUM entries, MemBaseX←SC)
8.The "IMReserve" and "IMUnreserve" macros have been added to prevent/allow MicroD use of absolute microstore locations.
9.The "OnPage" and "AutoPage" macros have been added to force MicroD placement on a particular page and to allow general placement (primarily for microcode overlays).