D0 GOTCHAs

8 February 1982

Filed on: [Indigo] D0Gotchas.Press, .Bravo

There are a number of errors that Micro and MicroD do not check for. The following is a list of known illegal instruction combinations as well as instruction combinations that do not work like you might think they work. If you know of any others, please send a message to Ed. This list is incomplete.

Continuing From Faults

After a fault, the fault handler sometimes resumes the aborted instruction, but cannot do so in certain cases. Faulting in such situations must be avoided.

Two known problems are an abort following a LoadPage, or an abort of an instruction using the bypass kludge. Gotchas related to these are discussed below. In preparation for these, the situations when faults happen and when the fault handler continues from them are discussed here.

Fortunately, the fault handler CAN continue correctly from the instruction following a Dispatch, UseCTask, or APCTask&APC_.

Stack Overflow and Underflow

No existing microcode system resumes from stack overflow or underflow. Pilot restarts the emulator at a trap address; other systems crash. Consequently, no lore for continuing from these has been developed.

MC2 Faults

An MC2 fault happens on uncorrectable storage failures (multiple errors) for any of the references that can cause such a failure; these are PFetch1/2/4, PStore1/2, and IOFetch4/16. This is considered to be a crash condition, so we don't have to worry about resuming following such a fault.

Correctable failures (single errors) only happen when the LogSE bit is true in the map entry for the page and only when the reference experiencing the single error is a PFetch1/2/4.
The fault handler would ALWAYS resume the program after single-error logging, if we allowed this feature to be used, but at the moment LogSE is only used in a limited context by the Initial microcode during storage testing.

Unfortunately, Pilot has about 10 or 15 LoadPages which might have LogSE faults abort the next mi, and several of these are not readily fixable. Continuation after the LoadPage at the buffer refill location (location 0) is particularly difficult to arrange. As a result, LogSE is ILLEGAL in Pilot and AMesa, so programmers need not worry about it at present. The fault handler will crash on any MC2 fault, assuming an uncorrectable error.

MC2 faults happen on about (not sure) the 18th cycle after a PFetch1/2/4, if the data is not touched sooner. If read through the ALU earlier than the 18th (?) cycle, then the fault is supposed to abort the instruction which touches the data. However, we have discovered that this next instruction completes before the fault on PFetch4 MC2 faults. For example in:

    PFetch4[Base,RReg0];
    T _ RReg0;          *This instruction SHOULD be aborted
    Call[Foo];

When the PFetch4 experiences an MC2 fault, T _ RReg0 completes before the fault. In addition, TPC is written by the Call[Foo], even though that instruction is aborted by the fault, and this can be confusing during debugging. The extra instruction is executed because the 8 cycles of transport for the PFetch4 are longer than the 5 cycle abort of the memory controller, so the next instruction completes before the fault begins. We know of no reason why this bug is harmful, because the corrected data is read.

MC1 Faults

MC1 faults represent page or write protect faults. In isolation (i.e., in the absence of suspended cycles due to transport for previous references), MC1 faults seem to abort the fourth instruction (seventh cycle?)
after a PFetch1/2/4 or PStore1/2 reference and the third after a PStore4. However, transport for a preceding reference might suspend intervening cycles and advance the fault. A preceding PFetch2 might advance the fault to the third instruction or a PFetch4 to the second, ordinarily. If transport for a preceding PFetch4 intervened and experienced error correction, the fault would abort the instruction immediately after the reference!

It is true for Pilot, and probably for other microcode systems, that page and write protect faults are only supposed to happen on references INITIATED by the emulator, although, if the emulator tasks, an io task might be INTERRUPTED by the fault.

The fault handler will ALWAYS resume from such a fault if an io task was interrupted (possibly changing the emulator's TPC or handling a nasty special case for a Return in the instruction being resumed). For Pilot, it will RARELY continue from such a fault if the emulator was interrupted--it only does this during instruction buffer refill for a NextInst and on jump opcodes. Continuation in these cases is not accomplished through the hardware's Restore function, so other restrictions commented in the code must be dealt with.

1. LoadPage in Emulator Task

Consequently, the only LoadPage restriction for the emulator is with respect to page and write protect faults, and it only applies in the rare situations when the emulator will be resumed. In this case, if the preceding reference is unknown, then LoadPage is illegal in any of the three instructions following a reference which might fault. If something is known about the preceding reference (or lack thereof), then it might be possible to relax this restriction and allow LoadPage to be used in the first or second instruction after a reference.

If you screw up, you will continue execution on the wrong page with no warning from the fault handler.
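The conservative form of the LoadPage rule in (1) can be checked mechanically on straight-line code. The following Python sketch is illustrative only and is not part of Micro or MicroD. It assumes each microinstruction is available as one line of source text, it treats PFetch1/2/4 and PStore1/2/4 as the references that might page or write protect fault, and it ignores branches and task switches. The names find_loadpage_hazards and WINDOW are hypothetical.

```python
import re

# Assumption: these are the references that can take a page or write
# protect (MC1) fault; the pattern matches the mnemonic followed by "[".
FAULTING_REFERENCE = re.compile(r"\b(PFetch[124]|PStore[124])\[")
LOAD_PAGE = re.compile(r"\bLoadPage\[")

# Conservative rule from (1): no LoadPage in any of the three
# instructions following a reference which might fault.
WINDOW = 3


def find_loadpage_hazards(instructions):
    """Return (reference_index, loadpage_index) pairs that break the rule.

    `instructions` is a list of strings, one microinstruction per entry,
    in straight-line execution order (branches are not followed).
    """
    hazards = []
    for i, text in enumerate(instructions):
        if not FAULTING_REFERENCE.search(text):
            continue
        last = min(i + WINDOW, len(instructions) - 1)
        for j in range(i + 1, last + 1):
            if LOAD_PAGE.search(instructions[j]):
                hazards.append((i, j))
    return hazards


bad = ["PFetch1[Base,R0];", "T _ R1;", "LoadPage[0];", "Nop;"]
good = ["PFetch1[Base,R0];", "Nop;", "Nop;", "Nop;", "LoadPage[0];"]
print(find_loadpage_hazards(bad))
print(find_loadpage_hazards(good))
```

A check like this only covers the case where the preceding reference is unknown. The relaxed cases mentioned in (1) depend on knowledge of earlier references that this sketch does not model.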
2. Task Switch Before LoadPage in Non-Emulator Task

A corollary to (1) above is that in an io task, a task switch must not have occurred during the three instructions preceding a LoadPage. This is because the Emulator can do a memory reference and a return on the same instruction. The non-Emulator task that runs must then wait at least three instructions for the page or write protect fault to occur.

The following combination is illegal:

    Call[TaskSwitch];
    LoadPage[0];

The fault handler will give a distinctive MP code if you violate this Gotcha.

3. Bypass Kludge When a Faulting Reference is Preceded by a PFetch4

Ordinarily, a PFetch4 would suspend only four cycles, so the page fault for a PFetch1/2/4 or PStore1/2 would be advanced no earlier than the second instruction after the reference, but if error correction occurs, a total of 8 cycles would be suspended for transport, and the fault would be advanced to the instruction immediately after the reference.

This means that the bypass kludge cannot be used following a reference which faults and might have PFetch4 transport occur between itself and the next instruction, if the fault handler will continue from the fault. Since continuation from MC2 faults and stack errors never happens, this is the only case to worry about.

4. Beware of H4 parity errors.

H4PE's are possible following Input, IOStore4, or IOStore16 references. They would ordinarily be fatal, crashing the system, but a 3 mb Ethernet controller routinely generates them.
Also, the MC1 memory controller microcode has a bug that causes a wild branch after an H4PE, if another memory reference is starting at that time.

To cope with these hardware problems, the fault handler ALWAYS continues from H4PE's, under the assumption that they were generated by a 3 mb Ethernet input task, and the Ethernet microcode has an appropriate number of Nops following each Input or IOStore4 to ensure that MC1 doesn't take a wild branch and that an H4PE abort will not fall after a LoadPage.

The effect of this treatment is that any real H4PE goes undetected and may additionally trigger either a LoadPage gotcha or MC1 wild branch as a result of an H4PE. Problems are minimized by jumpering out H4PE's on machines with 3 mb Ethernet controllers.

Consequently, except in the 3 mb Ethernet microcode, H4PE's are assumed never to happen, so microcoders need not worry about any coding restrictions.

For the Ethernet task, an H4PE aborts the fourth instruction following an Input or the sixth following an IOStore4. If an Input is interlocked, the instruction after the interlock is aborted by the H4PE fault. The instruction preceding the abort cannot contain a LoadPage. A violation causes the 'LoadPage error' number in the MP. (See Gotcha #2.)

Another reference must not start until any reference that might generate an H4PE has finished. (All instructions that can cause an H4PE use only MC1.) This one is really nasty. It can cause almost anything. A common result is an RM parity error.

The problem is a bit complicated to explain. When it is finished processing its part of an operation, the MC1 microcode normally jumps to 0 (an idle loop that waits for the next memory operation). If the H4PE bit is on, it jumps to 2 to report the error to the fault handler. It does this by ORing the 2 bit into the next address.
The problem is that the dispatch to the first instruction of the next opcode may be happening at the same time. If so, it goes to the desired location OR 2, and you have to get out the listings to find out where that is.

An even more complicated symptom will happen if there is not much IO activity. If the wild branch happens, the H4PE flop is left set until another Input or IOStore happens to test it. This prevents data from getting written into memory, so the emulator can get really confused.

To avoid the MC1 wild branch requires 3 non-memory instructions after Input and 4 after IOStore4.

5. IOStrobe with RETURN

IOStrobe on the same instruction as RETURN does not work. The purpose of IOStrobe is to turn off the wakeup latch and go back to sleep until the next wakeup occurs (e.g., in the SA4000 code). If IOStrobe is on the same instruction as the RETURN, the latch will still be on when the RETURN is executed, and control may remain in the same task.

6. Minimum Instructions Between Timer Operations

When a timer instruction is executed (Load Timer or Add To Timer), it is necessary to execute at least 14 machine cycles (usually seven instructions) before another timer instruction can be executed.

7. Loading the Source of PStore4

There must be at least one non-memory instruction between the loading of the source of the PStore4 and the PStore4; otherwise the processor may not be able to deposit the data in R before the memory controller needs it. The following combination is illegal:

    R0 _ ... ;
    PStore4[Base,R0];

8. Loading the Source of PStore1 or PStore2

There must be at least one non-memory instruction between the loading of the source of a PStore1 or PStore2 and the PStore itself, if the PStore might be followed immediately by another memory reference.
The following combination is illegal:

    Temp _ Temp-1;
    PStore1[Base,Temp];
    Input[...];

In this example, the write of Temp will not be done until the Input is finished. Since the Input won't start until the PStore is finished, the PStore will store the OLD value of Temp.

9. Task Switch after PStore1 or PStore2

As a corollary of (8) above, a PStore1 or PStore2 may not be followed immediately by a task switch if there was not a non-memory instruction between the loading of the source and the PStore. The following combination is illegal:

    R0 _ ... ;
    PStore1[Base,R0], RETURN;

10. Loading the Odd Base Register

At least one non-memory instruction must be executed between loading the odd base register and a memory reference that uses the base register. The following combination is illegal:

    BasePairOdd _ ... ;
    PStore1[BasePairEven,R0];

11. Loading the Even Base Register

At least one non-memory instruction must be executed between loading the even base register and a memory reference that uses the base register. Exception: if it can be guaranteed that the memory reference will not be aborted, or the memory reference uses DF2 addressing with a displacement of zero. The following combination is illegal:

    BasePairEven _ ... ;
    PStore1[BasePairEven,R0];

12. Loading the Displacement Before a Memory Reference

Loading the displacement immediately before a memory reference works because of the bypass hardware:

    T _ Displacement;
    PStore1[Base,Source];

In the above example, T does not actually get written until after the memory instruction. Because there is a write of T pending when the PStore is executed, bypassing is invoked, and the output of the ALU is correctly used in the calculation of the memory address.

Loading the displacement before a memory instruction will not work if there are only other memory instructions before the displacement is needed.
The following combination does not work:

    T _ Displacement;
    Input[...];
    PStore1[Base,Source];

In the above example, T does not actually get written until after all memory instructions have been executed. Because there is a write of T pending when the PStore is executed, bypassing is invoked, and the displacement used in the calculation of the PStore address is the output of the ALU left over from the Input (which is junk). The store is done, but not to the place you expected!

13. Reading a Register Following a Memory Reference

If an RM register or T is loaded in the instruction preceding a memory reference, it cannot be read in the instruction immediately following the reference. The following combination does not work:

    DiskAddress _ T;
    PFetch1[...];
    T _ LSH[DiskAddress,4];

In this example, the bypassed value that is shifted and written to T is not the value written to DiskAddress, but the result of the base register addition done by the memory reference.

In some cases this can be thought of as a hardware feature rather than a bug. In the Memory section of the D0 Hardware Manual, uses of this feature (known as the "bypass kludge") are described in detail.

14. Testing IOAtten After a Task Switch

A microprogram must not test IOAtten in the instruction immediately following a task switch. This is because the time required for CTask[0..3] to reach a controller and be returned to the processor as IOAtten is longer than one cycle. The following combination is illegal:

    Call[TaskSwitch];
    GoTo[Attention,IOAtten];

Also note that the UTVFC (display controller) has special restrictions regarding the memory reference which must be issued before testing IOAtten. Study the comments in the display modules as well as the hardware manual to find out how these work.

15. Issuing IOStrobe After a Task Switch?

IOStrobe should not be issued until the third (second?)
instruction following a task switch (?Not sure about this one?).

16. Storing into the IO Register of Another Task

It is not possible to store into the IO register of a lower priority task. In particular, the form:

    T _ xx;
    Output[reg];

works fine from task 0. From any other task, the current CTask gets ORed into the high four bits even if you don't want them there.

Output Followed by PStore4

An Output instruction followed by a PStore4 instruction will cause all types of memory errors, register parity errors, and generally unpredictable behavior. The problem is that this instruction sequence will cause both MC1 and MC2 to access RM at the same time. Have a look at the memory timing chart on page 63 of the D0 Hardware Manual. Suppose the Output instruction is started. MC1 runs for two cycles. Two cycles later, MC2 starts, and the next instruction also begins execution. Suppose the next instruction is PStore4. MC1 will start right away because the Output has finished with MC1. MC1 will look and see if MC2 will reference RM in the next cycle. MC2 will reference RM five cycles later, but MC1 does not look that far ahead, so it starts. If you line up RM references for MC1 and MC2, you will see that they occur simultaneously if the PStore4 is executed right after the Output or one instruction later.

When an Output instruction is executed, a certain number of instructions must be executed before a PStore4 or task switch can occur. The number of instructions following the Output depends on what the memory is doing. There are three cases to consider, and these are described separately below.
17. Output Preceded by any Memory Instruction Other Than IOFetch or Output

If in a segment of code between task Returns, the memory instruction immediately preceding Output is any memory operation other than IOFetch4, IOFetch16, or Output, then at least two non-memory instructions must lie between the Output and a PStore4. Similarly, a task Return can be performed on or after the second instruction following the Output.

18. Output Preceded by IOFetch or Output

If in a segment of code between task Returns, the memory instruction immediately preceding Output is IOFetch4, IOFetch16, or Output, more than two instructions may be required following the Output before a PStore4 or task Return can be executed. This is because the promotion of the Output instruction from MC1 to MC2 in the memory system is deferred for a number of clocks due to the preceding memory operation. This causes the RM access in the Output to occur later in time than if the memory pipe had not been busy. The actual number of instructions required after the Output is determined by the number of non-memory instructions lying between the Output and the preceding memory instruction.

    Preceding             Intervening           Non-memory
    Memory                Non-memory            Instructions required
    Instruction           Instructions          following Output

    IOFetch4 or Output    2 or less             5
                          3                     4
                          4                     3
                          5 or more             2

    IOFetch16             8 or less             5
                          9                     4
                          10                    3
                          11 or more            2

Regardless of the proximity of an interfering prior memory operation, it is always legal to interlock the Output to avoid unacceptable instruction count requirements prior to the Output (it won't take any less total machine time; it just causes a processor suspension until the Output is complete). Specifically, the sequences:

    Output[foo];
    foo _ foo;
    PStore4[...];

    Output[foo];
    foo _ foo, RETURN;

are legal.
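The table in (18) can be written as a small lookup. The Python sketch below is illustrative only; the function name required_after_output and the row encoding are hypothetical, and the numbers are simply transcribed from the table above. It covers only the case in (18), where the preceding memory instruction is IOFetch4, IOFetch16, or Output.

```python
# Rows transcribed from the table in (18). Each entry is
# (largest intervening count for this row, instructions required after
# the Output). Counts beyond the last row need the minimum of 2.
_ROWS = {
    "IOFetch4": [(2, 5), (3, 4), (4, 3)],
    "Output": [(2, 5), (3, 4), (4, 3)],
    "IOFetch16": [(8, 5), (9, 4), (10, 3)],
}
_MINIMUM_REQUIRED = 2


def required_after_output(preceding, intervening):
    """Non-memory instructions required after an Output.

    `preceding` is the memory instruction immediately preceding the
    Output ("IOFetch4", "IOFetch16", or "Output"); `intervening` is the
    number of non-memory instructions between it and the Output.
    """
    if preceding not in _ROWS:
        raise ValueError("table in (18) does not cover " + repr(preceding))
    if intervening < 0:
        raise ValueError("intervening count cannot be negative")
    for limit, required in _ROWS[preceding]:
        if intervening <= limit:
            return required
    return _MINIMUM_REQUIRED


for count in (0, 3, 4, 7):
    print("IOFetch4", count, required_after_output("IOFetch4", count))
for count in (8, 9, 10, 11):
    print("IOFetch16", count, required_after_output("IOFetch16", count))
```

Interlocking the Output, as in the legal sequences above, avoids the need for this count altogether, at the cost of a processor suspension.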
19. Output Alone

If in a segment of code between task Returns, Output is the only reference, then it must assume that the last instruction in the previous task could have been IOFetch16. Therefore, it must obey the instruction count rule given in number 18 relative to IOFetch16, or do the interlock following the Output.

20. At Least One Non-Memory Instruction in a Task

A task must execute at least one non-memory instruction. This follows from (12) above. Consider the following sequence of instructions in tasks A and B:

    TASK A:
        T _ xx, RETURN;
    TASK B:
        Memory Instruction;
        Memory Instruction;
        Memory Instruction, RETURN;
    TASK A Continues:
        R _ T;

When Task A continues, T will not have been assigned because of the bypass problem.

21. Minimum Time Between Maintenance Panel Operations

There must be at least 800 nanoseconds between a ClearMPanel and an IncMPanel operation, and at least 400 nanoseconds between an IncMPanel and another IncMPanel operation. Since the clock period may be as little as 70 nanoseconds, 12 cycles (6 instructions) must be allowed between ClearMPanel and IncMPanel and 6 cycles (3 instructions) between two IncMPanel's.

22. Return Following Dispatch

Beware of a Return statement following a Dispatch. This is equivalent to a notify (i.e., load APCTask&APC and then Return).
The following example illustrates the point:

    MsgStatus _ (MsgStatus) + 1;
    Dispatch[MsgStatus,7,4], Skip[ALU#0];
    MsgStatus _ (Zero) - 1, RETURN;
    Disp[MsgState];

In the above example, the Return statement causes a jump to Control Store location zero. The following example will produce the desired results:

    MsgStatus _ (MsgStatus) + 1;
    Dispatch[MsgStatus,7,4], Skip[ALU#0];
    MsgStatus _ (Zero) - 1, GoTo[RETURN];    *Delay one instruction.
    Disp[MsgState];

23. Timer Instruction With Conditional Branch

It is illegal to have a conditional branch in the same instruction as a LoadTimer or AddToTimer instruction. This may cause the timer instruction to fail. For example:

    Skip[ALU#0], AddToTimer[RTimer];

24. Loading PCF Before NextInst/NextData

PCF cannot be loaded in the instruction immediately preceding a NextInst or NextData instruction. PCF gets loaded, but the decision to cause a trap is made at the time MIR is loaded with the NextInst or NextOp instruction. This means that if the 10-bit was on before PCF was loaded, a trap will occur.

25. T _ Stack

In the Emulator, the first reference to the Stack in a byte code must not be:

    T _ Stack;

If a PFetch2 is pending to the Stack, the processor may read the R location before the second word has been transferred to the stack.
To prevent this from occurring, the first reference to the Stack must also decrement the stack pointer:

    T _ Stack&-1;

Decrementing the stack pointer causes an instruction to be aborted if a PFetch2 is pending to the stack.

26. Putting a Negative Number on the Stack

The instructions:

    Stack&+1 _ (Zero) - T;
    Stack&+1 _ (Zero) - 1;

do not work when the stack is empty because the Zero causes the hardware to read the stack, possibly causing a stack underflow trap, even though the value read from the stack is unused.

To put -1 on the stack, the instruction:

    Stack&+1 _ (Stack&+1) OR NOT (0C);

does not work if the stack is empty either.

27. Beware of Fetches to the Stack

The instruction:

    PFetch4[Base,Stack0];

will fetch four words to the stack registers. If this is followed by a stack instruction:

    Stack&+1 _ ...

there will be no interlock between the PFetch4 and the write to the stack. This means the write to the stack will be overwritten by the fetch. The way to avoid this is by means of an explicit interlock:

    Stack0 _ Stack0;
    Stack&+1 _ 0C;

28. Writing a Register From a Constant or T Does Not Interlock a PFetch

Assigning a constant or T to a register does not interlock a PFetch1/2/4 to that register. If you assign a constant to a register and a fetch is pending on that register, the constant will be overwritten by the fetch. For example:

    PFetch4[Base,R0];
    R2 _ 0C;

In the above example, R2 will be overwritten by the fetch.
Reading a register that is being fetched will cause it to interlock:

    PFetch4[Base,R0];
    R2 _ R2;
    R2 _ 0C;

29. Testing the R<0 or R Odd Branch Condition Does Not Interlock a PFetch

You will defeat both the hardware interlock and the assembler's no-interlock warning, if you write an R<0 or R Odd branch condition as in the following example:

    PFetch4[Base,R0];
    R2, T _ 0C, GoTo[Next,R<0];

or in the following:

    R2, GoTo[Next,R<0];

In this case, you will encounter Gotcha 28. However, if you make a practice of writing your R<0 and R Odd branch tests as follows:

    A _ R2, T _ 0C, GoTo[Next,R<0];

then the assembler will warn you with a 'no register interlock' message. And if you write your RM branch conditions as:

    LU _ R2, GoTo[Next,R<0];

the hardware interlock will safely occur.

30. Base Register Now Interlocks.

The D0 Hardware Manual is wrong in section 5.3 R Interlocking - where it says:

"Note that the interlock comparators are not activated during memory reference instructions. This means that if a base register is fetched into RM and then used in a memory reference instruction without being explicitly read first, the old version of the base register will be used if the base register has not been filled by the memory. This situation must be avoided by the programmer."

A hardware modification in 1981 was made so that the base register used by a reference interlocks in exactly the same way as reading that register in a non-reference instruction. This improvement is very valuable for the Alto emulator and in some other situations.

As a result of this hardware change, however, you can now get unwanted interlocks for Input, Output, and ReadPipe references where the base register isn't used.

In Output[RReg,F2], for example, the assembler supplies register 0 in the appropriate group of 20b registers as the default base register.
If this Output were preceded by a PFetch into that register, then the Output would unintentionally be held until the PFetch finished. To avoid this, supply a third argument to Input, Output, or ReadPipe, which will be used as the base register for the instruction. For example:

    Output[RReg,0,RConst];

uses RConst instead of register 0 as the base register.

D0Gotchas.Bravo    Fiala    February 8, 1982 4:20 PM