Dorado Hardware Manual    Instruction Fetch Unit    14 September 1981

Instruction Fetch Unit

The instruction fetch unit, or IFU, decodes a stream of bytes from memory into a sequence of 8-bit opcodes and operands using a writeable decoding memory, and presents the results to the processor for efficient interpretation. The next section contains an overview of IFU function, supplemented by details in later sections.

Read this chapter with Figure 12 in front of you.

Overview of Operation

The IFU handles four independent instruction sets. Opcodes are 8-bit bytes, which may be followed in memory by 0, 1, or 2 operand bytes. Hence, the total length of an operation is 1, 2, or 3 bytes. The first operand byte is called α, the second β.

One method of dealing with operations longer than 3 bytes is to encode them in IFUM as 1-byte jumps to the next operation. This gives up the possibility of referencing N, α, or β with _Id but avoids having to restart the IFU. The processor then must compute the proper place in the instruction stream and reference α, β, γ, etc. without help from the IFU.

The term PC refers to the displacement of an opcode byte from the codebase, which is BR 31. PC's are 16-bit items, where 0:14 are an unsigned word displacement relative to the codebase, and bit 15 selects the byte. In other words, codebase points at a 32k-word segment of virtual memory; a PC selects a byte in this segment. The PC's are named PCF, ..., PCM, and PCX, where the final letter in the name denotes the level in the IFU pipeline.

Since the IFU's PC is only 16 bits, overflowing either end of the code segment causes wraparound. This programming error is not detected by the hardware.

For Alto compatibility reasons, we currently have the following kludge. Instruction sets 0 and 1 treat byte 0 in the selected word as bits 0:7, 1 as bits 8:15; instruction sets 2 and 3 treat byte 0 as bits 8:15, 1 as bits 0:7.
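The PC layout and the byte-order kludge described above can be sketched in a few lines of Python. This is only an illustration, not part of the Dorado microcode: the function names are hypothetical, and the sketch assumes the Dorado convention that bit 0 is the most significant bit of a 16-bit word.

```python
def decode_pc(pc: int, ins_set: int) -> tuple[int, int, tuple[int, int]]:
    """Split a 16-bit IFU PC into (word displacement, byte select, bit range).

    Assumption: bit 0 is the most significant bit of a 16-bit word, so
    PC[0:14] is pc >> 1 and PC[15] is pc & 1.
    """
    if not 0 <= pc <= 0xFFFF:
        raise ValueError("PC must fit in 16 bits")
    if ins_set not in (0, 1, 2, 3):
        raise ValueError("instruction set must be 0..3")
    word = pc >> 1   # PC[0:14]: word displacement from the codebase
    byte = pc & 1    # PC[15]: byte select within the word
    if ins_set in (0, 1):
        # Sets 0 and 1: byte 0 is bits 0:7, byte 1 is bits 8:15.
        bits = (0, 7) if byte == 0 else (8, 15)
    else:
        # Sets 2 and 3: byte 0 is bits 8:15, byte 1 is bits 0:7.
        bits = (8, 15) if byte == 0 else (0, 7)
    return word, byte, bits


def select_byte(word_value: int, pc: int, ins_set: int) -> int:
    """Extract the byte that pc selects from a 16-bit memory word."""
    _, _, (first_bit, _) = decode_pc(pc, ins_set)
    # Bits 0:7 are the high-order byte under the assumed numbering.
    return (word_value >> 8) & 0xFF if first_bit == 0 else word_value & 0xFF


def advance_pc(pc: int, nbytes: int) -> int:
    """Advance a PC by nbytes, wrapping at 16 bits as the hardware does."""
    return (pc + nbytes) & 0xFFFF
```

For example, `select_byte(0xABCD, 4, 0)` picks the high-order byte of word 2, while the same PC under instruction set 3 picks the low-order byte.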
Eventually, this may be changed so that all instruction sets use 0 for the byte in 0:7 and 1 for 8:15.

The IFU is started by first selecting an instruction set (InsSetOrEvent_B function) and then loading the F-level PC (PCF_B function). The IFU then starts fetching the byte stream starting at the word BR[31] + PCF[0:14], byte PCF[15], from the cache and prepares opcodes for interpretation by the processor.

Bytes from the cache then march through the IFU pipeline beginning with the F and G full-word buffer registers on the MemD board; single bytes from F/G then move into J or H on the IFU board. InsSet[0:1] and the opcode byte in J address the decoding memory, IFUM, a 1024-word x 24-bit (+3 parity) RAM containing the information in the table below. Although IFUM is writeable, it will normally be loaded with the microprogram and not subsequently changed (diagnostics are, of course, an exception).

Table 18: IFUM Fields

Name      Size  Contents
Length'    2    Opcode length: 1, 2 or 3 bytes (0 length is illegal).
TPause'    1    The opcode is of type pause.
TJump'     1    The opcode is of type jump.
IFaddr'   10    TNIA[4:13] of the first instruction to be executed in interpreting this opcode
                (TNIA[14:15] from the IFUJump in the exit of the previous opcode).
RBaseB'    1    RBase initialization, discussed below.
MemB       3    MemBase initialization, discussed below.
Sign       1    Operand sign extension, discussed below.
Packedα    1    Packed α, discussed below.
N          4    Operand encoded in the opcode, discussed below.

Length', TPause', TJump', Sign, Packedα, and N are used by the IFU to prepare
operands and to sequence correctly to the next opcode; IFaddr' is passed to the control section; and the processor uses MemB and RBaseB' to initialize MemBase and RBase when the microcode for the opcode commences.

Length' determines the number of operand bytes; α for a two or three-byte instruction will be in H, while β for a three-byte instruction will be in F/G, when the assembled instruction is ready to proceed. The assembled instruction and α then drop into the M level.

IFUJump[n] (see "Control Section") transfers control to the starting instruction for the opcode assembled in M, where TNIA[4:13]_IFaddr, TNIA[14:15]_n (n is 0 to 3) is the location of the entry instruction. A 4-long entry vector, rather than a single starting address, can be utilized for faster execution, as discussed later. IFaddr may be overruled by a trap address when appropriate.

At t0 of the starting instruction, the processor initializes RBase to RBaseB (i.e., to 0 or to 1) and MemBase to 0..MemBX[0:1]..MemB[1:2] if MemB[0] = 0, or to 34₈+MemB[1:2] if MemB[0] = 1. MemBX is interpreted as a stack pointer to a 4-entry stack with 4 base registers in each entry, and MemB[1:2] in IFUM select a particular base register from the current entry. The MemBX kludge may reduce computation on procedure call/return, as discussed later. Other information about the opcode and α are copied into the X level.

Instructions that implement the opcode then reference operands in sequence using the A_Id, RisId, or TisId operations discussed in "Processor Section" or the IFetch_ operation discussed in "Memory Section," which read operands from the X level.
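The entry-address and MemBase rules above amount to simple bit packing. The Python sketch below illustrates them under stated assumptions: the function names are hypothetical, fields are passed as plain integers with bit 0 as the most significant bit of each field, and only TNIA[4:15] is modelled because the text does not say how TNIA[0:3] is formed.

```python
def entry_address_low_bits(ifaddr: int, n: int) -> int:
    """Return TNIA[4:15]: IFaddr in TNIA[4:13] and n in TNIA[14:15]."""
    if not 0 <= ifaddr < (1 << 10):
        raise ValueError("IFaddr is a 10-bit field")
    if not 0 <= n <= 3:
        raise ValueError("n must be 0..3")
    return (ifaddr << 2) | n


def initial_membase(memb: int, membx: int) -> int:
    """Return the 5-bit MemBase value loaded at t0 of the entry instruction.

    memb is the 3-bit IFUM field (MemB[0] is its high-order bit);
    membx is the 2-bit MemBX register.
    """
    if not 0 <= memb <= 0b111:
        raise ValueError("MemB is a 3-bit field")
    if not 0 <= membx <= 0b11:
        raise ValueError("MemBX is a 2-bit field")
    memb_low = memb & 0b011              # MemB[1:2]
    if memb & 0b100 == 0:                # MemB[0] = 0
        return (membx << 2) | memb_low   # 0..MemBX[0:1]..MemB[1:2]
    return 0o34 + memb_low               # MemB[0] = 1


# The four entries of one opcode's entry vector are adjacent locations.
vector = [entry_address_low_bits(0x155, n) for n in range(4)]
```

With MemB[0] = 0 the opcode addresses one of the four base registers in the block selected by MemBX; with MemB[0] = 1 it addresses one of base registers 34₈ through 37₈ regardless of MemBX.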
The operand sequence delivered by the IFU in response to _Id is as follows:

Table 19: Operand Sequence for _Id

Type     Length  Packedα  Sequence
         0                Illegal
Jump     1                Length, Length, Length, ...
                          Packedα, sign, and N determine jump displacement.
Jump     2                Length, Length, Length, ...
                          Packedα and N are unused; sign extends the sign of α for the jump displacement.
Jump     3                Illegal
Regular  1                N if N ne 17₈, Length, Length, Length, ...
                          Packedα and sign are unused.
Regular  2       0        N if N ne 17₈, α, Length, Length, ...
                          α is sign-extended if sign = 1.
Regular  2       1        N if N ne 17₈, α[0:3], α[4:7], Length, Length, ...
                          Sign is unused.
Regular  3       0        N if N ne 17₈, α, β, Length, Length, ...
                          α is sign-extended if sign = 1.
Regular  3       1        N if N ne 17₈, α[0:3], α[4:7], β, Length, Length, ...
                          Sign is unused.
Pause    x       x        Same as regular

Regular and pause opcodes have an optional 4-bit operand N that is delivered first (N isn't supplied when N = 17₈). This is followed by α and β, if they exist; α is sign-extended when sign = 1 or split into two 4-bit nibbles if Packedα = 1. Subsequently, _Id delivers Length.
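Table 19 can be read as a small generator of operand names. The Python sketch below is one way to express it; it is an illustration only, yields symbolic labels rather than operand values, and uses hypothetical names (`id_sequence`, `first_ids`) that do not come from the manual.

```python
from itertools import islice
from typing import Iterator

NO_N = 0o17  # N = 17 (octal) means "no N operand is delivered"


def id_sequence(op_type: str, length: int, packed_alpha: bool,
                sign: bool, n: int) -> Iterator[str]:
    """Yield the names of the operands that successive _Id's deliver."""
    if length not in (1, 2, 3):
        raise ValueError("length 0 is illegal")
    if op_type == "jump":
        if length == 3:
            raise ValueError("a jump with length 3 is illegal")
        while True:          # operands are consumed by the jump itself
            yield "Length"
    if op_type not in ("regular", "pause"):
        raise ValueError("unknown opcode type")
    if n != NO_N:
        yield "N"
    if length >= 2:
        if packed_alpha:     # alpha split into two nibbles; sign unused
            yield "alpha[0:3]"
            yield "alpha[4:7]"
        elif sign:
            yield "alpha (sign-extended)"
        else:
            yield "alpha"
    if length == 3:
        yield "beta"
    while True:              # after the operands, _Id delivers Length
        yield "Length"


def first_ids(count: int, *fields) -> list[str]:
    """Return the first `count` items of the sequence for the given fields."""
    return list(islice(id_sequence(*fields), count))
```

For a three-byte regular opcode with Packedα = 1 and N = 5, `first_ids(6, "regular", 3, True, False, 5)` lists N, the two nibbles of α, β, and then Length twice.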
For jumps, all of these operands are consumed in computing the jump displacement, and _Id delivers Length.

The normal opcode references all of its N, α, and β operands; however, except on three-byte opcodes, the IFU hardware does not require that these operands be referenced: the processor could exit to the next opcode without reading all the operands, if that were desirable for some reason. However, for opcodes of length 3, the processor must consume the α byte with _Id (both α[0:3] and α[4:7] if Packedα=1) before going to the next opcode with an IFUJump; it does not suffice to consume the last α byte with _Id concurrent with IFUJump. An opcode must never do more than 7 _Id's for reasons that will be discussed later.

The types of opcodes are distinguished as follows: A pause has no successor, and the IFU must be restarted with PCF_B before the next IFUJump. A regular's successor is the byte following its last operand; a jump's successor is determined by adding a displacement to the current PC as follows:

    If Length=1, then Sign.Packedα.N forms a six-bit signed displacement. In other words, the jump is to any byte in the range PC-40₈ to PC+37₈.

    If Length=2, then Packedα and N are unused; the jump displacement is α, if sign is 0, or sign-extended α, if sign is 1.

    A jump with Length=3 is illegal.

The IFU pipeline follows the instruction stream and fills up when it is five or six bytes ahead of the current opcode. When a pause opcode is recognized, further memory references are not made.
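The one-byte jump case described earlier can be checked with a short Python sketch. It models only Length=1, where the six displacement bits come entirely from IFUM; the function names are hypothetical, and the 16-bit wraparound follows the PC description at the start of the chapter.

```python
def short_jump_displacement(sign: int, packed_alpha: int, n: int) -> int:
    """Return the signed displacement formed by Sign.Packedα.N (Length=1)."""
    if sign not in (0, 1) or packed_alpha not in (0, 1):
        raise ValueError("Sign and Packedα are 1-bit fields")
    if not 0 <= n <= 0o17:
        raise ValueError("N is a 4-bit field")
    raw = (sign << 5) | (packed_alpha << 4) | n   # six-bit value
    # Sign is the high-order bit of a six-bit two's-complement number.
    return raw - 0o100 if sign else raw


def short_jump_target(pc: int, sign: int, packed_alpha: int, n: int) -> int:
    """Add the displacement to the current PC, wrapping at 16 bits."""
    return (pc + short_jump_displacement(sign, packed_alpha, n)) & 0xFFFF
```

The extreme encodings give displacements of -40₈ and +37₈, which matches the range quoted for Length=1.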
When a jump opcode is recognized in J, the IFU discards any bytes in F, G, and H and refills these pipe levels with bytes along the jump path.

The B_PCX' function reads PC (inverted) for the current opcode. Note that PCF_B does not affect the value of PCX; B_PCX' continues to read the displacement of the current opcode, which does not change until an IFUJump is done.

An opcode that conditionally jumps can be encoded in IFUM with type either jump or regular. If encoded as type jump, when the condition is false, the program must issue PCF_B to restart the IFU at the fall-through address. Similarly, if regular, PCF_B must be issued to restart at the jump address.

The Length argument delivered by _Id after other operands have been referenced is useful in conditional jump calculations.
Note that the fall-through address for a conditional jump is Length+PCX, so:

    T_(Id)-(PCX')-1;        *Id = Length for type jump
    PCF_T;
    Noop;
    IFUJump[0];

restarts the IFU at the fall-through address for type jump.

Following PCF_B, the IFU flushes its pipeline; it is illegal for either the instruction containing PCF_B or the one immediately after it to do an IFUJump, but any subsequent instruction can issue an IFUJump; however, the processor will spin uselessly at the IFU "NotReady" trap until the fifth cycle after PCF_B (earliest) or later (longer opcodes, cache misses, Mar traffic).

Table 20: IFU FF Decodes

Name
    Action

IFUReset
    Halt and clear the IFU pipeline and clear errors, testing features, and BrkPending (i.e., BrkIns); Reschedule condition and instruction set are not cleared.

B_IFUMLH'
    Read the high-order IFUM word, InsSet, and IdCnt onto B (low-true) as follows:

        Field     B bits
        IdCnt     0:2     Count of _Id's since start of opcode
        InsSet    3:4     Instruction set number
        Packedα   5       Packed α
        IFaddr'   6:15    Starting address

IFUMLH_B
    Load the high-order IFUM word from B (t1 to t3), where the Packedα and IFaddr fields are in the same form as B_IFUMLH'.
    Must have at least one intervening instruction after a preceding BrkIns_ or InsSetOrEvent_.

IFUMRH_B
    Load the low-order IFUM word from B (t1 to t3) in the format given below; must have at least one intervening instruction after a preceding BrkIns_ or InsSetOrEvent_:

        Field     B bits
        Sign      0
        IPar.0    1       Even parity over N, MemB[1:2], and IFAD[0:1]
        IPar.1    2       Even parity over IFAD[2:9]
        IPar.2    3       Even parity on Packedα, Sign, Length', MemB.0, RBaseB', TPause, and TJump
        Length'   4:5     Instruction length (low true)
        RBaseB'   6       1-bit RBase initialization
        MemB      7:9     3-bit MemBase initialization
        TPause'   10      Type pause (low true)
        TJump'    11      Type jump (low true)
        N         12:15   4-bit operand

B_IFUMRH'
    Read IFUM fields in the same format as IFUMRH_B (inverted).

PCF_B
    Load PCF at t3, clear and restart the pipeline.

B_PCX'
    Read PC for the currently executing opcode (inverted).

BrkIns_B
    Load BrkIns from B[0:7] at t3, and set BrkPending (ill-defined unless the IFU has been reset). BrkIns replaces the next opcode loaded into J; then BrkPending is cleared. BrkIns also addresses IFUM on IFUMLH/RH_ and B_IFUMLH'/RH'.

InsSetOrEvent_B
    If B[0]=1, then B[6:7] are loaded into the InsSet register at t3; if B[0]=0, then B[4:15] control event counters as discussed in the "Other IO and Event Counters" chapter. A following PCF_B starts the IFU interpreting using the new instruction set.
    Illegal except when the IFU is paused or reset or when PCF_ will be done before the next IFUJump.

Reschedule
    Cause a reschedule trap on the second or third "successful" IFUJump. "Successful" means that an IFUJump is not trapped for some other reason such as not-ready. The second IFUJump will be trapped if it does not occur in the instruction immediately after the first successful IFUJump; otherwise, the third successful IFUJump will be trapped. The trap instruction is executed as though it were the first instruction of the rescheduled opcode, and _Id and IFUJump will work as though that opcode were in progress.
    Also set the Reschedule branch condition (emulator only) to true.

RescheduleNow
    RescheduleNow is guaranteed to trap the next successful IFUJump, so long as the next IFUJump appears in the second cycle after RescheduleNow, or later.
    The Reschedule branch condition is not affected.

NoReschedule
    Turn off the Reschedule trap and branch condition.

IFUTest_B
    Load the test-control register from B (load with 0 or do IFUReset when not testing) as follows:

        Field          B bits
        TestFG         0:7     Substituted for cache data
        TestFGParity   8       Substituted for cache parity bit
        TestFault      9       Substituted for memory fault signal
        TestMemAck     10      Substituted for memory MemAck signal
        TestMakeF_D    11      Substituted for memory MakeF_D signal
        TestFH'        12      Enable FHCP and t1 when IFUTick executed
        TestSH'        13      Enable SHCP and t2 when IFUTick executed
        TestEn         14      Test enable

IFUTick
    Tick the IFU's clock once according to TestFH and TestSH in the IFUTest register.

The IFUJump Entry Vector

An IFUJump[n], encoded in the JCN field of the instruction, sends control to an address partly determined by the IFU and partly by the IFUJump clause. The four possible targets of an IFUJump are called an "entry vector".

An opcode leaves its results in one of several convenient forms agreed to by convention, then chooses an entry instruction in its successor with IFUJump[n], where n = 0 to 3. Every opcode in the instruction set must have an entry vector of the same length. Careful choice of forms may reduce execution time by one cycle for some opcodes without increasing execution time for successor opcodes.

A true branch condition (FF-encoded) with IFUJump prevents starting the next opcode. For example, IFUJump[2,condition] sends control to the next opcode's entry 2, if condition is false, or entry 3, if condition is true. However, no other IFU activities associated with starting the new opcode take place when condition is true, so entry 3 is executed in the context of the opcode that did the IFUJump[2,condition]; however, the processor initializes RBase and MemBase as though the next opcode were starting, so this part of the state is lost.
Thus, at a cost of one entry instruction in every opcode of an instruction set, it may be possible to shorten the execution time of some opcodes using a conditional exit.

An opcode with common and uncommon exit cases, for example, can exit with IFUJump[2,condition], where entry 2, the common case, starts the next opcode, while entry 3 is reached for the uncommon case. Since IFUJump loads Link with .+1, entry 3 can either Return, to execute more code associated with the uncommon case, or it can do something more explicit, if an appropriate convention is followed by all opcodes.

The following example shows how an instruction set with four opcodes (Push, Add, Store, and JNZ) is implemented using a four-long entry vector.
The opcodes in this example deal with the stack like Mesa opcodes do, and the first three entry conventions are, in fact, ones which might be used by the current Mesa emulator.

    %Entry
    0:  Stk[StkP] holds top-of-stack (if any; garbage if stack empty), T holds garbage
    1:  T and Stk[StkP-1] hold previous top of stack (garbage if stack empty),
        Stk[StkP] garbage, Md holds top-of-stack.
    2:  T and Stk[StkP+1] hold top-of-stack,
        Stk[StkP] holds previous top of stack (garbage if stack empty).
    3:  Results in same form as entry 2, but restart IFU at NewPC = (Id)-(PCX')-1
    Note that Stack&+1 references must not check for underflow when the stack may legitimately be
    empty.
    %

    *Push the memory location pointed to by N.
    Push:   Fetch_Id, T_StackNoUFL&+1, IFUJump[1];
            Fetch_Id, T_StackNoUFL&+1_Md, IFUJump[1];
            Fetch_Id, StkP+2, IFUJump[1];
            T_(Id)-(PCX')-1, StkP+1, Return;

    *Replace the top two stack entries by their sum.
    Add:    T_Stack&-1, Branch[.+2];
            Stack_Md;
            T_Stack&-1_T+(Stack&-1), IFUJump[2];
            T_(Id)-(PCX')-1, StkP+1, Return;

    *Store the top-of-stack into the memory location pointed to by N and pop the stack.
    Store:  Store_Id, DBuf_Stack&-1, IFUJump[0];
            Stack_Md, Branch[Storex];
            Store_Id, DBuf_T, IFUJump[0];
            T_(Id)-(PCX')-1, StkP+1, Return;
    Storex: Store_Id, DBuf_Stack&-2, IFUJump[2];

    *Pop the stack and branch if the top-of-stack was zero, else fall through
    *This opcode is of type jump.
    JNZ:    Pd_Stack&-1, Branch[ZTest];
            Pd_Md, StkP-1, Branch[ZTest];
            Pd_T, Branch[ZTest];
            T_(Id)-(PCX')-1, StkP+1, Return;
    ZTest:  T_Stack&-1, IFUJump[2,ALU#0];
    *Return here when the jump doesn't take.
            T_Stack&-1, PCF_T;
            IFUJump[2];

Push thus requires 1 execution cycle; Store and Add take either 1 or 2 cycles depending upon the entry point; JNZ takes 2 cycles when the jump takes or 9 cycles when the opcode falls through (because the IFU isn't ready until the fifth cycle after PCF_B).

Although every opcode in an instruction set must have an entry vector following the same conventions, it is not necessary that the vector be four-long.
In the above example, a single-entry scheme would probably use the entry 2 convention followed above. In that event, Push, Add, Store, and JNZ would require 2, 1, 2, and 3 cycles (common case), respectively, compared to 1, 1 or 2, 1 or 2, and 2 or 3 cycles for the four-entry scheme above.

Since Mesa requires about 120 IFU entries for its 256 opcodes, the cost of the second entry in the vector is between 0 and 120 locations, and 120 locations each for the third and fourth entries. Since Mesa is implemented by about 1044 instructions using entry vectors of length 1, a vector of length 2 scheme would require ~1100, length 3 ~1220, and length 4 ~1340 instructions. The implementor of an instruction set should decide when the additional locations expended for larger entry vectors are no longer worth the additional speed.

Although we originally hoped for as much as 8% faster inner loops and 4% overall speed improvement, Gene McDaniel measured only 2% faster execution for Mesa (excluding disk wait) using a length 3 entry vector; microstore increased about 120 locations.
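The microstore figures quoted above follow from simple bounds, which the Python sketch below reproduces. It is a back-of-the-envelope check rather than a measurement: the function name is hypothetical, and the defaults are the approximate Mesa figures from the text (1044 instructions with a length 1 vector, about 120 IFU entries).

```python
def microstore_bounds(vector_length: int, base: int = 1044,
                      ifu_entries: int = 120) -> tuple[int, int]:
    """Return (low, high) bounds on instructions for a given vector length.

    The second entry costs between 0 and ifu_entries locations; every
    entry beyond the second costs ifu_entries locations.
    """
    if vector_length < 1:
        raise ValueError("an entry vector has at least one entry")
    later_entries = max(0, vector_length - 2) * ifu_entries
    second_high = ifu_entries if vector_length >= 2 else 0
    return base + later_entries, base + later_entries + second_high


for length, quoted in ((2, 1100), (3, 1220), (4, 1340)):
    low, high = microstore_bounds(length)
    print(f"length {length}: {low}..{high} instructions (text quotes ~{quoted})")
```

The quoted estimates of ~1100, ~1220, and ~1340 each fall inside the corresponding bounds, which is consistent with the second entry costing roughly half of its worst case.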
Investigation revealed that increased traffic on Mar (by overlapped Fetch_ and _Md) was causing IFU not ready to occur more often, offsetting the fact that fewer processor cycles were needed. Forwarding saved about .2 cycles/opcode.

Note: IFU trap locations discussed below must also be entry vectors that follow the same convention.

Timing Summary

From the detailed timing discussion at the end of this chapter, the following generalizations about IFU timing can be drawn:

    Assuming no misses and no delays because the processor uses Mar, IFUJump will successfully dispatch to the entry instruction of the next opcode on the fifth cycle after PCF_B if the new opcode either is one byte long or is two bytes long and starts at an even byte; otherwise it will succeed on the sixth cycle.

    A jump opcode causes a 3 cycle gap in the IFU pipe. The effect of the gap would be a 3 cycle delay if each opcode were executed in exactly one cycle. However, the gap can overlap with extra cycles taken on the jump opcode itself or either of the two preceding opcodes. As usual in timing considerations, a 3-byte opcode counts as two normal opcodes.

    If a long stream of regular one-byte opcodes is being executed by the processor at the fastest possible rate (one instruction/opcode), and if the IFU neither misses nor faults nor waits for the processor's use of Mar or the cache, then it will always have the next opcode ready for IFUJump. If the IFU waits one cycle for the processor to use Mar, it will shortly fill its pipe again, so scattered Mar references by the processor will not result in IFU NotReady.

    If a long stream of regular two-byte opcodes, each of which has an α but no N (this is the worst case), is being executed by the processor at the fastest possible rate (one instruction/opcode), and if the opcodes in the stream start at the even bytes in words, and if the IFU neither misses nor faults, and if the processor never uses Mar, then the IFU will give 25% NotReady.
    Each cycle in which the processor uses Mar adds one cycle of delay. If the opcodes in the stream start at the odd bytes in words, then the processor will get NotReady 40% of the time.

    Three-byte opcodes are not as bad as two-byte opcodes because, in the worst case, the processor cannot reference both α and β in less than 2 instructions. Hence, a stream of three-byte opcodes has timing approximately the same as a stream in which each three-byte opcode is replaced by a one-byte opcode followed by a two-byte opcode.

Mar traffic may be an important timing factor if many opcodes finish in one or two cycles. Whenever the processor is making a reference, the IFU cannot use Mar, and the IFU must make one reference for every two bytes in the instruction stream. Note that if a processor reference is held, the IFU will also be prevented from making references (but the IFU is not prevented from making references when _Md is held).

Use of MemBX and the Duplicate Stk Regions

The present Mesa implementation requires 34 cycles for a local XFER and 54 cycles for an external XFER, excluding memory wait, and measurements made on the Mesa compiler showed that 38% of all cycles were spent in XFER.
For this reason, speed improvements in XFER are an important objective.

Since about 70% of all calls return before calling any other procedure, if a caller's base registers and stack were left untouched, then this information would neither have to be saved during the call nor restored during the return in most cases.

The hardware that supports this idea consists of the MemBX register, pointing at one of four blocks of 4 base registers each, and StkP, pointing at one of four stacks of 64 registers each. During a procedure call, StkP and MemBX may be advanced by 1 region, leaving the caller's state intact; if the callee makes nested calls, then eventually the MemBX and Stk regions would be exhausted and some would have to be saved and (eventually) restored. However, if the callee returns without too many nested calls, then its caller's state would still be intact.

We have not constructed examples that use this idea, but a savings of 50% in average XFER timing has been projected for Mesa.

Traps

The IFU may trap for not ready, reschedule request, map faults, cache data errors, and IFUM parity errors. When a trap condition occurs, the IFU substitutes a trap address for IFaddr on the next IFUJump. Hence, the next IFUJump sends control to one of the entries in the trap vector.
Locations assigned to these trap vectors are given in "Control Section"; note that each instruction set has independent trap locations.

Each trap vector is dispatched into by IFUJump exactly as though it were an opcode. B_PCX' reads the PC of the opcode that would have been executed if the trap had not occurred and RBase, MemBase, and _Id stuff are set according to that opcode (in every case except NotReady; all are undefined at a NotReady trap).

The relative priority of traps is as follows: IFUM parity error is highest, then NotReady, reschedule, cache data parity error, and map fault.

The NotReady trap occurs whenever the IFU does not have both an opcode and its associated operands (α, β) ready for the processor. Since PCX, MemBase, and RBase are invalid, the trap microcode must wait for the IFU to become ready.
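The trap priority order stated above can be expressed as a simple ordered lookup. The Python sketch below is only an illustration of that ordering; the names `TRAP_PRIORITY` and `select_trap` are hypothetical, and the hardware's actual selection logic is not described in this chapter.

```python
from typing import Iterable, Optional

# Highest priority first, in the order given in the text.
TRAP_PRIORITY = (
    "IFUM parity error",
    "NotReady",
    "reschedule",
    "cache data parity error",
    "map fault",
)


def select_trap(pending: Iterable[str]) -> Optional[str]:
    """Return the highest-priority pending trap, or None if none is pending."""
    pending_set = set(pending)
    unknown = pending_set - set(TRAP_PRIORITY)
    if unknown:
        raise ValueError(f"unknown trap conditions: {sorted(unknown)}")
    for trap in TRAP_PRIORITY:
        if trap in pending_set:
            return trap
    return None
```

For instance, when a reschedule request and a map fault are both pending, the next IFUJump would dispatch to the reschedule trap vector.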
The following code sequence will work for all instruction sets that do not use a conditional exit:

    NotReady:   FreezeBC, IFUJump[0];
                FreezeBC, IFUJump[1];
                FreezeBC, IFUJump[2];
                FreezeBC, IFUJump[3];

For the sample instruction set given earlier, which uses entry 3 as a conditional exit, the following sequence would be appropriate:

    NotReady:   IFUJump[0];                             *Can't convert to IFUJump[2] because stack may be empty
                T_Stack&-1_Md, IFUJump[2];              *Convert case 1 to case 2
                IFUJump[2];
                T_(Id)-(PCX')-1, StkP_StkP+1, Return;   *Resume the opcode which didn't really exit

If the IFU detects bad parity on any read of IFUM, the IFUJump to the opcode affected by this parity error will trap to the IFUM parity error trap location.

The IFU will trap at the cache data parity error location if it detected invalid parity on any byte sent by the memory system. PCX will always correctly point at the opcode that would have been executed next had the trap not occurred; however, the opcode and operands pointed at by PCX are not necessarily the ones that suffered the parity error. This occurs because the pipe has continued ahead of PCX. The most confusing case occurs when the opcode following PCX was a jump; in this case the opcode fetched by the jump may have caused the parity error, in which case the location of the error, PCX +/- the jump displacement, is limited to the range PCX-400₈ to PCX+377₈.

The IFU will hold an IFUJump in the cycle prior to a cache data parity error or IFUM parity error trap.

Note that IFUReset must be given after an IFUM or cache data parity error and before restarting the IFU.

The Reschedule function is used by io tasks to request service by the emulator. The IFU will honor this trap request on the second IFUJump after it is executed, as discussed in a later section.
The RescheduleNow function is like the Reschedule function, but the IFU honors it on the first IFUJump after it is executed, rather than the second (RescheduleNow was intended for use when continuing an opcode which previously experienced a fault).

An IFU fetch may experience a map fault. The memory system does not report IFU map faults to the fault task. Instead, it signals the IFU that a map fault has occurred, and the IFU passes this indication through its pipeline. Eventually, the IFUJump that would have sent control to the opcode affected by the map fault will instead transfer to the map fault trap vector.

Although IFU map faults are not reported to the fault task, the fault task must be careful to pass over any pipe entries that were created by IFU map faults when it is woken for some other reason.

Erroneous bytes fetched after a pause or jump opcode might cause map faults, but the IFU discards these before they reach the end of the pipeline, so the processor is never informed. Consequently, erroneous references interfere with processor memory activity and delay the IFU's efforts to refill its pipe on a jump, but don't have any disastrous effect.

An IFU fetch may experience single or double storage failures. Unlike map faults, these are reported to the fault task just as on processor fetches. The memory system pipeline will finish loading the cache munch just as though the data were ok, and the cache entries will have valid byte parity.
The IFU will continue running just as though no error had occurred. However, the fault task will be woken soon enough that it will run before the IFU's F register is loaded with a byte from the bad munch. Hence, the fault task will run before the emulator can possibly execute an IFUJump to the byte that suffered the error.

For a recoverable error, the fault task can simply carry out some logging action and block; no harm will occur because the IFU will actually have gotten valid data, and the cache will contain valid data. For an irrecoverable error, the fault task must clear the bad cache munch and use the RescheduleNow function to trap the next IFUJump to code for dealing with the irrecoverable error.

Erroneous bytes fetched after a pause or jump opcode might suffer irrecoverable errors. The fault task has no reasonable way to distinguish these from bytes really in the instruction stream, so it will cause a Reschedule trap anyway.

Remark

Although independent trap vectors for each instruction set are probably inessential, performance should be better when the NotReady trap, which occurs frequently, is distinct for each instruction set. This allows the various IFUJump exits to be transformed into the form most likely to be convenient for the next opcode.

The other traps could have been implemented to use a common trap location for all instruction sets. This would be more economical for IFUM and FG parity error traps, if these simply result in an uncontinuable crash when running system microcode.
However, different trap vectors for each instruction set are probably more convenient for Reschedule and Map fault traps, which have to save the state of the emulator currently running.

In any case, reserving locations for these traps costs at most 5 traps * 4 instruction sets * 4 entries/trap = 100₈ locations, and realistically is much less than this because many instruction sets will not need 4 entries and there will probably be fewer than 4 instruction sets concurrently active.

IFU Reset

The processor can reset the IFU by executing the IFUReset function. This clears all IFU error conditions, prevents further IFU memory references, clears the BrkIns_ feature discussed earlier and the test features discussed later, and generally puts the IFU in a clean and operable state. The Reschedule feature is not affected by IFUReset.

IFUReset should be executed after power-on to get the IFU shut off. A single IFUReset will make the IFU passive with respect to operating the rest of Dorado. However, the IFU itself might not be operable until a second IFUReset is executed because of a pathological condition (if BrkIns is loaded and Testing is true, then the first IFUReset will clear Testing but not BrkIns; a second IFUReset is required to clear BrkIns in this case).

If the IFU has any outstanding memory references pending at the time the first IFUReset is executed, those references will complete and disturb the top part of the IFU pipeline.
A second IFUReset must be issued after these references have all finished prior to reading or writing IFUM. If the second IFUReset is executed 36 or more cycles after the first, then it is guaranteed to reset the IFU completely.

The worst case is when a miss has just started the storage pipeline with an IFU reference in the cache address section. In this case the IFU reference does not enter the storage pipeline until the 8th cycle and then takes 28 cycles to complete.

IFUReset should be executed prior to using BrkIns_. It should also be executed after reading or writing IFUM (to reset the BrkPending condition that is still lurking).

Rescheduling

Io tasks request service from the emulator by first indicating a request in some way (presently an RM location is used as a 16-bit table in which 1's indicate requests), then executing the Reschedule function, and finally blocking. The IFU and the processor store the reschedule condition in flipflops which remain set until the NoReschedule function turns them off.

The next IFUJump after Reschedule transfers to the entry vector for the opcode as usual; the reschedule trap address will drop into the IFAddr register at t2 of this instruction, and the first IFUJump after that will dispatch into the reschedule trap vector. This means that the second IFUJump will trap unless the second IFUJump occurs on the instruction immediately after the first IFUJump, in which case the trap will not occur until the third IFUJump. IFUJump's that experience a NotReady trap are not counted.

The entry vector at the reschedule trap location is entered as though it were the next opcode. When Reschedule is used by io tasks to request the wakeup of another process, this fact is unimportant. However, the other use of Reschedule is in continuation from map (and other) faults. In this application, the reschedule trap will wind up restoring the IFU state by executing an appropriate number of _Id's and eventually branching back to the instruction that experienced the fault.
The continuation method is discussed later.

Opcodes which might execute for a long time, such as block transfer and BitBlt, must check for rescheduling explicitly, and the (emulator only) Reschedule branch condition makes this check easier. If such opcodes did not check for rescheduling, then service to the io device might be postponed for too long.

The reschedule flipflops are not cleared by IFUReset, so the NoReschedule function must be executed as part of system reset.

When the reschedule trap vector is entered, the IFU is in an undefined state except for PCX', and PCF_ is needed to restart the IFU at the continuation address.

Breakpoints

BrkIns_B implements debugging breakpoints straightforwardly. The idea is that a one-byte opcode, BrkP, is used to transfer control to a debugger while saving emulator state needed to continue later, and another opcode, Continue, is used to continue from breakpoints (for Mesa, BrkP and Continue are special cases of Xfer).

BrkP may be substituted for any opcode in a program. The debugger gets control when BrkP is executed, saves state, and eventually can execute Continue to restore state from values saved by BrkP.

Continue first restores registers, then loads BrkIns with the opcode for which BrkP was substituted; then it uses PCF_B to restart the IFU at the breakpoint.
The IFU will then start running; the first opcode fetched will again be the BrkP opcode, but the contents of BrkIns will be substituted for the one fetched from memory, and the program will continue correctly.

Without BrkIns_B the debugger would have to simulate the broken opcode before continuing at the following opcode, which would be harder. The example below shows a code sequence for the final part of Continue.

    Continue:   . . .
                IFUReset;                     *Stop future IFU fetches and clear pipe
                T_41C;
                Cnt_T;
                IFUReset, Goto[.,Cnt#0&-1];   *Reset after previous IFU fetches complete
                BrkIns_Opcode;                *Load opcode which BrkP replaced
                PCF_BreakAddress;             *Restart IFU at address of BrkP
                Noop;                         *No-op required after PCF_ before IFUJump
                IFUJump[0];                   *Resume program

Note: IFUReset is required before BrkIns_, even when an opcode of type Pause is in progress.

Reading and Writing IFUM

In addition to its function related to breakpoints, BrkIns_B is used to address IFUM when reading or writing that memory.

When IFUM is loaded, it is addressed by the instruction set InsSet[0:1] and BrkIns. The data must remain on B for two cycles, so tasking must be disabled and the instruction following the one with IFUMLH/RH_ must put the same data on B. If this data comes from RM or T, the register must not have been loaded in the cycle preceding the IFUMLH/RH_ (because the bypass logic will change the B select from Pd or Md to RM or T, possibly glitching data on B).
The following subroutines illustrate loading and reading back IFUM.

    WriteIFUM:
                IFUReset;                     *Stop future IFU fetches and clear the pipe
                T_41C;
                Cnt_T;
                IFUReset, GoTo[.,Cnt#0&-1];   *Reset after previously issued fetches complete
                InsSetOrEvent_RMaddr0;        *Load 2 instruction set bits forming IFUM address
                BrkIns_RMaddr1;               *Load 8 opcode bits forming IFUM address
                TaskingOff;                   *Ensure no B glitch below and let BrkIns_ settle for 1 cycle
                IFUMLH_RMdataHi;              *Write high part of IFUM
                B_RMdataHi;                   *Keep data good a little longer (mustn't glitch)
                IFUMRH_RMdataLo;              *Write low part of IFUM
                B_RMdataLo, TaskingOn;        *Keep data good a little longer
                IFUReset, Return;             *Clear BrkIns

    ReadIFUM:
                IFUReset;                     *Stop future IFU fetches and clear the pipe
                T_41C;
                Cnt_T;
                IFUReset, GoTo[.,Cnt#0&-1];   *Reset after previously issued fetches complete
                BrkIns_RMaddr1;               *Load 8 opcode bits forming IFUM address
                InsSetOrEvent_RMaddr0;        *Load 2 instruction set bits forming IFUM address
                Noop;                         *Two instructions must elapse after loading BrkIns
                                              *one after loading InsSet (?Two noops after loading InsSet
                                              *might be better since this is a tight path?)
                RMdataHi_IFUMLH;              *Read IFUM into RM.
                RMdataLo_IFUMRH;
                IFUReset, Return;             *Clear BrkIns

Continuing from Processor Faults

Saving and restoring the state of an interrupted program requires some cleverness not only for the IFU, but also for the Control, Processor, and Memory sections. The emulator might fault for a data error, map fault, or stack overflow/underflow; for io tasks, stack overflow/underflow is impossible and map faults will probably be illegal, so only data error faults are legitimate. The discussion here will concentrate on map faults, though the same approach could be used for other fault conditions as well.

The fault task must use as few instructions as possible so that io tasks won't be preempted for too long.
The minimum is to copy all pipe entries that contain memory faults into RM or Stk buffers, preserve DBuf, and save the emulator's TPC; the fault task must itself deal with data error faults by io tasks; it then restarts the emulator at a trap address. The emulator microprogram then saves the rest of the emulator state and deduces the nature of the fault(s) using methods discussed in "Memory Section".

The emulator fault microcode first saves ALU branch conditions and task-specific registers, then other information of interest. The saved information is stored where the Mesa (or whatever) program can get at it; then the trap microcode restarts Mesa at a trap procedure that will service the map fault (probably swap in a page from the disk); eventually, state will be restored and the opcode that faulted will be resumed at the instruction that faulted.

The IFU state may be saved via B_IFUMLH' and B_PCX'. B_IFUMLH' reads the current instruction set and IdCnt from B[0:4]; B[5:15] are IFUM bits which are not of interest when saving the state of the program, so the tricky code sequence given earlier for reading IFUM is not required. B_PCX' reads the current PC.

The 3-bit counter, IdCnt, keeps track of how many _Id's have been done; to avoid overflowing this counter, no more than 7 _Id's should be done when executing any opcode. This is one (harmless) restriction on coding emulators.
The other is that emulators never map fault on the instruction after a dispatch (BDispatch_B, BigBDispatch_B, or Multiply); this can be assured by doing _Md prior to or concurrent with any dispatch.

Sample microcode for saving emulator state is as follows:

    %Must first save the volatile branch conditions; Overflow and Carry won't change unless an arithmetic
    ALU operation is executed, so saving them can be deferred. T, the first item saved, is written into the
    RM region reserved for Save using the change-RBase-for-write FF decode.
    %
    Save:       FreezeBC, DblGoTo[ALUls,ALUge,ALU<0];
    ALUls:      SavedT_T;
                T_0C, GoTo[SaveBC];
    ALUge:      SavedT_T, DblGoTo[ALUgr,ALUeq,ALU#0];
    ALUgr:      T_1C, GoTo[SaveBC];
    ALUeq:      T_2C;
    *Have a code, 0, 1, or 2, in T indicating the state of the ALU<0 and ALU=0 branch conditions.
    SaveBC:     SavedALULEZ_T;                *Save the branch condition code
                T_Pointers;                   *T_MemBase, MemBX, and RBase
                T_T Or (100000C);             *Make negative
                RBase_RBase[SaveRMRegion];
    *Now choose two numbers such that their sum produces the correct ALUcry and Overflow branch
    *conditions.
                SavedPointers_T, MemBase_SaveBaseReg, DblGoto[Cry,NoCry,Carry];
    Cry:        DblGoTo[CryOvf,CryNoOvf,Overflow];
    NoCry:      DblGoTo[NoCryOvf,NoCryNoOvf,Overflow];
    CryOvf:     SaveA1_100000C;
                SaveA2_100000C, GoTo[SaveRest];   *Numbers such that SaveA1+SaveA2 produces
                                                  *Overflow and Carry result
    NoCryNoOvf: SaveA2_0C, GoTo[.+2];
    CryNoOvf:   SaveA2_1C;
                SaveA1_177777C, GoTo[SaveRest];
    NoCryOvf:   SaveA1_77777C;
                SaveA2_77777C, GoTo[SaveRest];
    SaveRest:   SavedPCX_Not(PCX');
                T_Not(IFUMLH');               *Read IdCnt and InsSet in IFUMLH[0:4]
                SavedIdCnt_LdF[T,0,2];
                T_T and (14000C);
                T_RSh[T,2];
                SavedInsSet_T+(100000C);      *Set up word for InsSetOrEvent_ below
                . . .                         *Code to save rest of state (all easy)

Sample microcode for continuing is given below:

    Resume:     . . .
                                              *Restore all processor registers except T, Cnt, RBase,
                                              *and MemBase.
                InsSetOrEvent_SavedInsSet;    *Restore the IFU instruction set number.
                PCF_SavedPCX;                 *Restart IFU at address of the opcode that faulted
                WakeUp[ContTask];             *Wakeup the special task used for continuation.
                Noop;                         *No-op required so that the instruction after the IFUJump
                                              *below will be executed by the continuation task.
                Cnt_SavedIdCnt, IFUJump[0];   *Continue execution in the continuation task at Cont0
    Resume1:    Skip[Cnt=0&-1], At[Resume1Loc];   *Reissue the appropriate number of _Id's to put
                A_Id, GoTo[.-1];              *the IFU in the state it was in at the fault.
                Cnt_SavedCnt;                 *Restore Cnt
                . . .                         *Restore Md by fetching from a convenient storage
                                              *location.
                                              *Then repeat the Fetch_ or Store_ that
                                              *faulted using a convenient base register and restore
                                              *the base register (complicated code here needs careful
                                              *thought).
                T_SaveA1;
                Pd_T+SaveA2;                  *Restore Carry and Overflow branch conditions.
                T_SavedT, TaskingOff;         *Restore T register
    *Below, the TaskingOff, WakeUp, TaskingOn sequence ensures that precisely one emulator instruction will
    *be executed after the TaskingOn before the continuation task runs.
                BDispatch_SavedALULEZ;        *Dispatch to 0, 1, or 2 in table based on
                                              *ALU>0, ALU<0, or ALU=0.
                WakeUp[ContTask];             *Wakeup the special task reserved for continuation.
                Link_SavedLink, At[ConTab,0]; *Restore Link and ALU>0
                TaskingOn;
                Pd_Not(Pointers_SavedPointers), GoTo[COK];
                Link_SavedLink, At[ConTab,1]; *Restore Link and ALU<0
                TaskingOn;
                Pd_Pointers_SavedPointers, GoTo[COK];
                Link_SavedLink, At[ConTab,2]; *Restore Link and ALU=0
                TaskingOn;
                Pd_(SavedPointers) xor (Pointers_SavedPointers), GoTo[COK];
    COK:        FreezeBC, GoTo[.];

    *The special restart task needed for continuation
    ContinueInit:
                RBase_RBase[SavedTPC];        *Initialization code for the task
    *First of two wakeups comes here: change emulator's TPC to Resume1 and block.
    Cont0:      Block;
                T_Resume1Loc;
                Link_T, TaskingOff;
                LdTPC_0C;                     *Restart emulator at Resume1
                TaskingOn;
                Block;
    *Second of two wakeups comes here. Reload emulator TPC with continuation address.
    Cont1:      Link_SavedTPC;
                LdTPC_0C;                     *Restart emulator at saved continue address
                Branch[Cont0];

IFU Testing

The IFU test control register is loaded by the IFUTest_B function; when not testing, this register should contain 1, and it is loaded with 1 by the IFUReset function.
IFUTest.15 disables the periodic wakeup request to the Junk task discussed in the "Slow IO" chapter; when IFUTest.15 is 0, the junk wakeups occur 60 times/sec and are dismissed by any IFUTest_ function.

IFUTest.14 (TestEn) enables IFU test mode; it is illegal for this bit to change from 0 to 1 when the IFU is active because, if this occurred in the same cycle that an IFU memory reference was issued, then the IFU would pollute the Mar bus indefinitely, making the memory system unusable by the processor.

The test features aim at two situations. First, they allow the IFU clocks to be controlled by a program, so a diagnostic can slowly step the IFU pipeline through its stages. Secondly, they allow data supplied by a diagnostic to be substituted for signals that would otherwise come from the memory system. This allows the IFU to be tested in the absence of the memory system, which allows scope probes to be inserted easily and decouples IFU problems from memory system problems.

The TestFH' and TestSH' bits in the IFUTest register enable the first-half-cycle and second-half-cycle clocks, respectively, which will occur between t2 and t4 of the cycle after the one issuing the IFUTick function.
Thus, the IFU can be stepped through a PCF_B function as follows:

                TaskingOff;
                IFUTest_TestEn;
                IFUTick;
                PCF_value;

where PCF_value is just an example; any other IFU function or an IFUJump could be used instead.

The IFU's memory interface is simulated by the TestFG, TestParity, TestFault, TestMemAck, and TestMakeF_D bits in IFUTest. Memory references are not issued by the IFU when TestEn is true. TestFG and TestParity are substituted for the FG byte and parity bit from the memory system; the other signals are control signals sent by the memory system in response to IFU references. They are supposed to work as follows:

MemAck occurs at t2 of a cycle in which the IFU makes a reference at t1, iff the memory system accepted the reference; if the memory system was busy and did not accept the reference, then MemAck does not occur, and the IFU should repeat its reference. The absence of MemAck serves approximately the same purpose for the IFU that Hold serves for the processor.

MakeF_D occurs at t1 of a cycle in which the memory system loads F at t3; in the event of a map fault, MakeF_D occurs at t1 of the cycle in which the memory system would have loaded F at t3 if the map fault had not occurred. The IFU can try to start a reference at t1, even though it has an unfinished reference in progress. The memory system will accept the reference iff MakeF_D occurs; otherwise, it will refuse the reference. In other words, the IFU's second reference starts at t1 iff the first reference will deliver data at t3.

Fault is concurrent with (?)
MakeF_D and indicates that the IFU reference experienced a map fault.

In other words, a memory reference can be simulated with the IFU test feature by (1) ticking the IFU through a cycle in which it makes a reference; (2) ticking the TestMemAck response of the memory system with IFUTest_B and IFUTick; (3) ticking TestMakeF_D; (4) ticking with TestFG and TestParity holding simulated memory data.

Details of Pipe Operation

The IFU is a six-stage pipeline, starting with words fetched from memory, and ending with opcode starting addresses delivered to the control section and operands delivered to the processor. The levels are named: F, G, H, J, M and X. Each level has a data-valid bit indicating whether or not it contains something useful.

PCF, PCJ, PCM, and PCX are PC's for the corresponding pipe levels (except that PCF is a word PC rather than a byte PC). PCF, PCM, and PCX are independent of each other since jumps and PCF_ may result in these all being different; PCJ is related to PCF by the number of valid bytes in the F/G/H levels; the hardware also uses PCFG, which contains PCF plus the number of valid bytes in the F/G levels. Operationally, F/G are a FIFO in which PCF is the write pointer, incremented as words are fetched from the cache, and PCFG is the read pointer, incremented as bytes are moved from F/G into J/H.
Note that there is no PCH because PCH would equal PCJ+1.

Pipe control is straightforward in principle. The F and G levels are 16-bit registers filled from the cache. Following PCF_B, if there is space in the pipeline for another word, the IFU will start a reference at t1 of any cycle in which the processor is not using Mar (so as many as 2 IFU references can be outstanding). Cache words are stored in F at t1, then dropped into G at t2; bytes drop into H at t3 or J at t4; there are bypass paths to get bytes directly from F/G into J when H is invalid. As the processor executes opcodes, F and G become invalid, and the IFU refills them from memory automatically. This continues until the IFU is reset by the processor, or encounters a pause opcode.

The F and G registers are physically located on the MemD board. The four bytes in F/G are inputs to a multiplexor controlled by the IFU, and the multiplexor output is sent across the backplane to the IFU. BrkIns[0:7] or IFUTest[0:7] replace F/G data when using breakpoints, reading/writing IFUM, or using IFU test features.

While following the opcode stream, a jump will invalidate data in F. However, if a reference is in progress and F has not yet been filled by the memory system, then the IFU will invalidate the data when it arrives and restart the next reference immediately. In other words, the IFU cannot abandon the useless fetch; it must wait for it to finish and discard the result.

The J and H levels are one byte wide.
For one-byte opcodes it is possible to consider H and J as independent levels of the pipe; however for two or three-byte opcodes, it is appropriate to consider J/H as a single level in which J holds the opcode and H holds a.

If J is invalid, then it will be loaded from the next opcode (which may be in G, F, or H according to various conditions) at an even clock (t0) and H will be loaded from the byte after the opcode (which is always in G) at the following odd clock (t1); if the byte after the opcode isn't ready, it will drop into H at the next odd clock after it is ready. The InsSet and J registers address IFUM and IFUM outputs reveal whether the byte in H is a (Length = 2 or 3) or the next opcode (Length = 1).

The conditions under which the M level can be loaded from J are that M is invalid (or about to become invalid) and:

    Length = 1 -or-
    Length = 2 and H is valid -or-
    Length = 3 and H is valid and either F or G is valid.

If these conditions are met, then the M level is loaded (t2) with information from IFUM and with a, if Length = 2 or 3. If Length = 3, then b will drop from G into H (t3).

If Length < 3, then the H/J level is now free to work on the next opcode.
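
The conditions for loading the M level from J can be restated as a small predicate. The Python sketch below only illustrates that rule and is not a description of the actual hardware logic; the function name can_load_m and its boolean parameters are hypothetical, and m_free stands for "M is invalid or about to become invalid".

```python
def can_load_m(length, h_valid, f_valid, g_valid, m_free=True):
    """Return True when the M level may be loaded from J.

    length  -- opcode length from IFUM (1, 2 or 3 bytes)
    h_valid -- H holds a valid byte (operand a for Length = 2 or 3)
    f_valid -- F holds valid data
    g_valid -- G holds valid data
    m_free  -- hypothetical flag: M is invalid or about to become invalid
    """
    if length not in (1, 2, 3):
        raise ValueError("Length must be 1, 2 or 3")
    if not m_free:
        return False
    if length == 1:
        # A one-byte opcode needs nothing beyond the opcode in J.
        return True
    if length == 2:
        # Operand a must already be waiting in H.
        return h_valid
    # Length = 3: a must be in H, and b must be available in F or G.
    return h_valid and (f_valid or g_valid)
```

For example, a three-byte opcode with a in H but with both F and G empty has to wait, which is why the predicate returns False for can_load_m(3, True, False, False).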
If Length = 1 and the next opcode happens to be in H, then H will drop into J at the same time (t2); otherwise, J will be loaded from the next opcode in F/G when it is ready.

When the processor does an IFUJump[n], level M presents information needed by the next opcode as follows:

    IFaddr is TNIA[4:13] for the IFUJump;
    MemBase is set to 0.MemBX.MemB[1:2] or 34₈+MemB[1:2];
    RBase is set to 0 or 1;
    N, Sign, Length, Packeda, and a are loaded into the X level;
    b is loaded into the M level if Length = 3.

Referencing IFU operands with A_Id, TisId, or RisId affects the IFU in two ways: it causes the IFU to advance to the next item of Id, and for a 3 byte instruction when a is taken (a[4:7] when Packeda = 1) it causes b to drop from M to X, freeing M for the next instruction.

IFetch_ also uses Id, as discussed in the "Memory Section" chapter, but does not advance the IFU to the next item of Id.

For a one or two-byte opcode, it is permissible for the processor to do an IFUJump before referencing any operands with _Id; this will advance normally to the next opcode. However, for a three-byte opcode the processor must reference all of a, so that b drops into X, before doing an IFUJump.

When a pause or jump is recognized, the IFU may already have filled the F and G levels erroneously (i.e., 4 bytes ahead).
These levels are flushed and refilled along the jump path.

Timing Details

This section discusses timing details of the IFU pipeline assuming that all IFU references hit in the cache and are never deferred for processor references.

First case: Restart IFU at even byte

    t0:   An instruction with PCF_FOO is started, where FOO is even.
    t2:   F, G, H, J, and M levels are made invalid.
    t3:   Reference the word containing FOO.
    t5:   Reference word containing FOO+2.
    t7:   Load F with data from the FOO reference; reference the word containing FOO+4.
    t8:   Load the first byte from F into J; load G from F; F becomes invalid; start reading the IFUM entry for J.
    t9:   Load the putative operand byte from G into H; G becomes invalid; load F from the FOO+2 reference.
    t10:  Distinguish 5 cases below.

FOO is a one-byte regular opcode

    t10:  Load M from IFUM; IFUJump will now succeed; load J from H (FOO+1); load G from F (FOO+2 and FOO+3); F and H become invalid; start reading the IFUM entry for J.
    t11:  Load H from G (FOO+2); load F from FOO+4 reference.
    t12:  (The FOO+1 opcode would pop into M if IFUJump were done at t10.) IFU is quiescent; F has two useful bytes, G one byte, J/H has two bytes; M level is ready and waiting for IFUJump.

FOO is a two-byte regular opcode

    t10:  Load M from IFUM and M[a] from H; IFUJump will now succeed; load J from F (FOO+2); load G from F (garbage and FOO+3); F and H become invalid; start reading the IFUM entry for J.
    t11:  Load H from G (FOO+3); G becomes invalid; load F from FOO+4 reference; reference the word containing FOO+6.
    t12:  Load G from F; F becomes
invalid.
    t15:  Load F from the FOO+6 reference; now quiescent.

FOO is a three-byte regular opcode

    t10:  Load M from IFUM and M[a] from H; IFUJump will now succeed; load G from F (FOO+2 and FOO+3); H and F become invalid; J goes to special state (b in H).
    t11:  Load H from G (FOO+2 = b); load F from the FOO+4 reference; now quiescent.
    t12:  (The FOO+2 byte would pop from H into M[b] if IFUJump were done at t10.)

FOO is a one-byte jump opcode

    t10:  Load M from IFUM; IFUJump will now succeed; J, H, G, and F become invalid.
    t11:  Discard the FOO+4 reference; reference the first word along the jump path.
    t13:  Reference the second word along the jump path.
    t15:  Load F from the first word along the jump path.
    t16:  Load J from F, etc.

FOO is a two-byte jump opcode

    t10:  Load M from IFUM and M[a] from H; IFUJump will now succeed; G and F become invalid; J and H are in a special jump state, computing the jump address.
    t11:  Discard the FOO+4 reference; reference the first word along the jump path.
    t12:  J and H become invalid.
    t13:  Reference the second word along the jump path.
    t15:  Load F from the first word along the jump path, etc.

Second case: Restart IFU at odd byte

    t0:   An instruction with PCF_FOO is started, where FOO is odd.
    t2:   F, G, H, J, and M levels are invalid; IFUJump will trap at NotReady.
    t3:   Reference the word containing FOO.
    t5:   Reference word containing FOO+1.
    t7:   Load F with data from the FOO reference; reference the word containing FOO+3.
    t8:   Load the second byte from F into J; F becomes invalid; start reading the IFUM entry for J.
    t9:   Load F from the FOO+1 reference.
    t10:  Distinguish 3 cases below (and the one and two-byte jump cases which are not repeated below).

FOO is a one-byte opcode

    t10:  Load M from IFUM; IFUJump will now succeed; load J from F (FOO+1); load G from F (garbage and FOO+2); F becomes invalid; start reading the IFUM entry for J.
    t11:  Load H from G (FOO+2); G becomes invalid; load F with the FOO+3 reference; reference the word containing FOO+5.
    t12:  Load G from F; F becomes invalid.
    t15:  Load F from the FOO+5 reference; now quiescent.

FOO is a two-byte opcode

    t10:  Load G from F (FOO+1 and FOO+2); F becomes invalid.
    t11:  Load H from G (FOO+1); load F with the FOO+3 reference.
    t12:  Load M from IFUM and M[a] from H; IFUJump will now succeed; load J from G (FOO+2); load G from F; F and H become invalid; start reading the IFUM entry for J.
    t13:  Reference the word containing FOO+5; load H from G (FOO+3).
    t17:  Load F with data from the FOO+5 reference; now quiescent.

FOO is a three-byte opcode

    t10:  Load G from F (FOO+1 and FOO+2); F becomes invalid.
    t11:  Load H from G (FOO+1); load F from the FOO+3 reference.
    t12:  Load M from IFUM and M[a] from H; IFUJump will now succeed; H becomes invalid; J is in a special state (b in H).
    t13:  Load H from G (FOO+2); load G from F (FOO+3 and FOO+4); F becomes invalid; reference the word containing FOO+5.
    t17:  Load F from the FOO+5 reference; now quiescent.

Slow IO

The slow io facility allows data transfers between the processor and any of up to 256 independently addressed io registers.
It is intended that the slow io facility will be used to load and read control information associated with high speed io devices (> 20 x 10^6 bits/sec), which will then use the fast io system for their data transfers. Low speed devices (< 20 x 10^6 bits/sec) will use the slow io bus for all phases of their operation. Very slow or polled devices may be driven directly from an emulator.

Device controllers for Dorado interact with the processor by exchanging data over a 16-bit bidirectional bus IOB ("Input/Output Bus"). There may be a total of up to 256 io registers in all controllers connected to a single system. The unique 8-bit device numbers assigned to particular devices or uses that appear in every system are discussed in subsequent chapters and summarized in the table below.

Table 21: IO Register Addresses

    Number  Name         Comment
    10      DiskControl  Disk control register
    11      DiskMuff     Disk muffler control
    12      DiskData     Disk FIFO data
    13      DiskRam      Disk format RAM
    14      DiskTag      Disk tag register
    15      EData        Ethernet input or output data
    16      EControl     Ethernet control and status
    360     PixelClock   DDC pixel clock
    361     Mixer        DDC mixer
    362     CMap         DDC CMap
    363     DWTFlag*     (DispM analog of DWTFlag)
    364     DHTFlag*     (DispM analog of DHTFlag)
    365     BMap         DDC BMap
    366     NLCB*        (DispM analog of NLCB)
    367     Statics*     (DispM analog of Statics)
    370     Status       DDC muffler and OIS data
    372     MiniMixer    DDC MiniMixer
    373     DWTFlag      DDC word task control
    374     DHTFlag      DDC horizontal task control
    375     HRam         DDC horizontal waveform control
    376     NLCB         DDC next line control block
    377     Statics      DDC debugging control

Input/Output Functions

In most cases, a task
will need to do many sequential io operations to the same io register. The 8-bit task-specific register TIOA holds the device address being referenced by each task.

TIOA is loaded at t2 from B[0:7] by the TIOA_B function, or TIOA[5:7] can be loaded from FF[5:7] while preserving TIOA[0:4] by the TIOA_small constant function. Pd_Input, Pd_InputNoPE, or Output_B functions can be issued in the instruction immediately following the one that loads TIOA.

Most input registers include odd byte parity with IOB data. The Pd_Input function reads IOB data and checks parity. The Pd_InputNoPE function reads IOB data without a parity check; this is useful when determining whether a device exists (IOB has bad parity if a nonexistent register is selected). The enabling and timing of parity error halts is discussed in the "Errors" chapter.

The Output_B function sends 16 bits of data with parity to the io register selected by TIOA. Many controllers check the parity and report parity errors as part of their status.

The tasks reserved for standard peripherals are given in the table below.

Table 22: Task Assignments

Number  Name  Comment
0       EMU   The emulator
1       CON   Special task for restarting emulator after faults
2       JNK   Junk task (awakened every 32 ms)
3       DHT   Display horizontal task
4       AHT   DispM terminal interface horizontal task
6       EOT   Ethernet output task
7       EIT   Ethernet input task
11₈     AWT   DispM terminal interface word task
12₈     SIM   Task simulator
13₈     DWT   Display word task
14₈     DSK   Disk io
17₈     FLT   The fault task

IO Opcodes

The Mesa instruction set has two opcodes for dealing with the slow io system:

INPUT:   TIOA_a;
         Stkp_Stkp+1;
         Stack_Input, IFUJump[0];
OUTPUT:  TIOA_a;
         Output_Stack&1, IFUJump[0];

These opcodes allow a Mesa program to have full access to the io system. The intent is that these instructions will be used to set up registers in firmware-driven devices, and do all the service required by polled slow devices. In many cases, the use of an INPUT or OUTPUT instruction is not sensible (doing io to a device normally driven by firmware, for example), but the capability should prove useful for testing and diagnostics.

Wakeup, Block, and Next

The "Control Section" chapter discussed task switching, and the material which follows is an elaboration of that discussion.

Note that a task for which a wakeup request is issued at t0 cannot commence its next instruction until t4; i.e., at least two cycles elapse after a wakeup before the next instruction is executed. The task then runs until it does a Block; in order to avoid an erroneous extra wakeup, the task must lower its wakeup request at least one cycle before issuing Block. Consequently, an io device may turn off its wakeup request according to one of three strategies:

The first is to turn off the request when Next becomes equal to its task number; in this case the wakeup request is lowered at t0 of the first instruction executed for the task, and it must not block until the second instruction to prevent an erroneous second wakeup. The special situation in which Next is invalid ("Next Lies") must be dealt with by device controllers that do this. This situation occurs as follows:

Suppose that a task blocks with the following instruction:

    Branch[Loop], Fetch_Address, Block;    *Fetch next word

This generates Switch and the task in Bnt is broadcast over the Next bus. If the Fetch_ causes hold and Bnt < Ctask, then no task switch will occur. However, the Next bus is incorrectly broadcasting Bnt.
Since hold occurs after t1, there is insufficient time to change the Next bus back to Ctask in this case.

Consequently, controllers using Next detect "Next Lies" and disable any actions that would otherwise be performed when it occurs.

A pathological lockout problem should be noted: Since task T's wakeup request was lowered at t2 when Next=T was noted at t0, the Next Lies condition will (correctly) result in repeating the held instruction at t2; however, some task of lower priority than T may erroneously execute at t4. This might be a problem if some high demand task of higher priority is coded so that it always creates Next Lies (say, by doing Block and immediate _Md in the instruction after a Fetch_).

Another consequence of "Next Lies" is that IOAtten may be incorrect when "Next Lies" is occurring. Consequently, branch on IOAtten is illegal during an instruction that blocks and might cause hold.

The second strategy monitors TIOA becoming equal to a particular device value. In this case the wakeup request is lowered at t0 of the second instruction following a wakeup, and the task must not block until the third instruction. The disk controller has used this strategy, which has the drawback that if TIOA inadvertently assumes the particular device value for any other task, the hardware will malfunction.
A consequence of any device using this strategy is that all tasks must be careful to initialize TIOA properly when first awakened.

The third strategy waits for some Output_B or Pd_Input operation to reset the wakeup condition. This would reset the condition at t3 or t5 of the Output_B instruction, and the wakeup would be lowered at t4 or t6; in this case the task must not block until the third or fourth instruction after the Output_B or Pd_Input to avoid an erroneous wakeup. The exact requirement depends upon the io controller: the disk controller, for example, lowers its wakeup request at t4 and can block in the third instruction after Output_B, while the display controller horizontal task lowers its wakeup request at t5 and can block in the fourth instruction.

If loops naturally run for at least three instructions, use of TIOA is more economical than use of Next because TIOA decoding is mandatory in any case, while Next is needed only for short loop devices, devices that use the fast io system, and devices that drive the SubTask lines.

SubTasks

When an io device sees Next becoming equal to its task, it can (optionally) present a two-bit SubTask number as well.

The processor, control, and memory sections clock SubTask into flip-flops at t0. The processor OR's SubTask[0:1] into RBase[2:3] and into MemBase[2:3]. This allows the same firmware to control several identical io devices concurrently; each device, represented by a SubTask, gets its own RM region with 16 RM locations and its own pair of MemBase registers; if only SubTask[0] is driven, then two RM regions and four MemBase registers are available to each subtask. Note that the 16 change-RBase-for-write functions do not OR SubTask into the changed address, so they cannot be used; also, if RBase is read by the processor the value read out has SubTask OR'ed in.
However, the 16 change-RSTK-for-write functions do work.

Note also that when the debugging processor (Baseboard microcomputer or Alto running Midas) asserts the Freeze signal, the effect of the subtask on RBase[2:3] is disabled, but subtask continues to affect MemBase[2:3].

In the memory section, the task and SubTask that issued an IOFetch_ are bussed to fast output devices with data from storage. The device receiving the data identifies itself by means of this information. IOStore_'s are handled similarly.

A task presenting SubTask signals generally must Block at the same location each iteration since there is only a single TPC value for all of the SubTasks. Hence, the full generality of tasking is unavailable; the microcode for these tasks must be coded as though the wakeup mechanism were a priority interrupt.

Illegal Things IO Tasks Must Not Do

(1) It is illegal to Block in an instruction that does B_ExternalSource, where ExternalSource is anything except one of the sources on the IFU board. This restriction is needed so that the emulator will be able to do arithmetic on B_PCX'.

(2) The IOAtten branch condition is illegal in an instruction that Blocks and might be held, because NextLies might occur, as discussed above.

(3) A task may not Block on an instruction that might be held, if its wakeup request might be dropped at t0 of the instruction.
If this occurred, the instruction might inadvertently be repeated before the Block took effect.

(4) It is illegal to Block with TaskingOff in force.

(5) A task must not Block until one cycle after its wakeup request is turned off.

(6) It is illegal to issue Wakeup[n] if task n might run in the next cycle. Wakeup[n] must be executed with TaskingOff in such circumstances.

Fast IO

The fast input/output system provides high-bandwidth data transfers between storage and io devices. Transfers occur in units of one munch (= 16 words); the addresses of the 16 words must be i, i+1, ..., i+15, where i mod 16 = 0. One word is transferred every clock, for a peak bandwidth of 533 x 10^6 bits/second. A fast device is also interfaced to the slow io system, from which it receives its control information, since there is no way for the device to communicate directly with the processor using the fast io system.

A single transaction of the fast io system transfers exactly one munch. Successive transactions are completely independent of each other, whether they involve the same or different devices, as far as the io system is concerned. The only relationship between transactions is that storage references of two transactions occur in the order that they were issued.

Each fast io transaction is initiated by an IOFetch_ or IOStore_ reference coded in ASEL. Once this instruction has been executed, the transaction proceeds without further interaction with the processor (except for fault reporting). The transaction itself involves a storage reference, and transport of the data between main storage and the device. In the case of a fetch, transport happens at the end of the reference, after the munch has been error-corrected.
For a store, transport happens at the beginning of the reference, in parallel with mapping the VA and starting the storage chips. As a result of this difference, the transport for a fetch may overlap or even follow the transport for a following store.

Transport

The device is only concerned with the transport of the data, and has no way of knowing exactly how or when the storage reference takes place. The transport happens in 16 clocks, each transporting a single word using the Fin bus (IOFetch_'es) or Fout bus (IOStore_'s). The two busses are independent, and transport can be happening on both of them simultaneously.

The two busses have much in common. Both have Task and Subtask lines, on which the memory presents the task and subtask involved in the transport about to begin, and a Next signal used for synchronization. The Fout bus has a Fault line which is high at the time the last word of the transaction is delivered if there was a memory fault during the fetch (other than a corrected single error).

Both data busses are 18 bits wide: 16 data bits, numbered 0..15, and two byte parity bits, numbered 16 (bits 0..7) and 17 (bits 8..15). The parity bits have the same timing as the data bits. A device is invited to check the parity of data on Fin, and is required to generate parity for data on Fout.

Wakeups and Microcode

The normal interface between a device and its task involves one wakeup for each munch transferred.
The device must keep track of the number of wakeups it has issued, since data may not arrive from storage for several microseconds, but there is no way to stop the data from arriving once the task has started the memory reference.

Typical microcode for a fast output device is given in the "Display Controller" chapter.

Latency

Suppose that the highest priority fast io task issues its wakeup request at t0; then it will execute its first instruction at t4. Some other task can cache fault with clean victim in the cycle starting at t0, and another task can cache fault with dirty victim in the cycle starting at t2. The first reference gives rise to one storage reference and the second to two storage references; each of these three storage references takes 8 cycles to handle, so the fast io reference will not begin for about 24 cycles. From the time it begins until the last data word is delivered to the device is 23.5 cycles, for a total of 47.5 cycles, to which 2 cycles must be added for the time between the wakeup and the first executed instruction. In this situation, the transport is not finished until 49.5 cycles after the wakeup. Lower priority tasks are delayed by an additional 8 cycles for each reference which might be made by a higher priority task.

The above is one possible worst case. Another is the execution time of higher priority tasks; a wakeup might be delayed by the sum of the longest normal execution of the fault task and of other higher priority tasks.
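The fetch figure quoted above can be checked by hand. The following Python sketch only repeats that arithmetic; the function and parameter names are illustrative and do not appear in the manual, and the default values are simply the cycle counts given in the text.

```python
def worst_case_fetch_latency(storage_refs_ahead=3,
                             cycles_per_storage_ref=8,
                             fetch_transport_cycles=23.5,
                             wakeup_to_first_instruction=2):
    """Cycles from wakeup until the last word of a fetched munch arrives.

    Hypothetical helper: the defaults are the figures quoted in the text.
    """
    # One clean-victim cache fault (1 storage reference) plus one
    # dirty-victim cache fault (2 storage references) may be queued
    # ahead of the fast io reference.
    wait_for_storage = storage_refs_ahead * cycles_per_storage_ref
    return (wait_for_storage
            + fetch_transport_cycles
            + wakeup_to_first_instruction)


print(worst_case_fetch_latency())  # 49.5
```

Changing `storage_refs_ahead` shows how each extra reference by a higher priority task adds 8 cycles to the result.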
The fault task execution time is presently unknown.

A store reference is slightly better, since its transport is finished 8 cycles after the reference starts, for a total latency of 40 cycles.

All these numbers assume that a reference can be started every 8 cycles. If successive references are to 4k modules, however, they can happen only every 13 cycles, and the calculations must be adjusted accordingly. Also, data is returned from a 4k module 3.5 cycles later.

Disk Controller

This chapter describes the Dorado disk controller, which uses the Slow IO system to control up to four Century Data Trident disk drives. Either the 80 x 10^6-byte T-80 or the 300 x 10^6-byte T-300 drives can be used. An extension of the controller onto a second logic board (not designed) would allow control of up to 31 disk drives; alternatively, duplicating the present controller (with different TIOA, task, and muffler assignments) would allow independent control of four additional drives.

Keep Figure 13 in view while reading this chapter.

The disk controller uses task 14₈ and the first five values of the TIOA addresses in block 10₈ - 17₈ (the Ethernet controller, on the same logic board, uses two of the other three). Either the task or TIOA block can be modified by changing a SIP component on the logic board.
TIOA assignments are as follows:

    10₈  DiskControl  Output_B to control register
    11₈  DiskMuff     Output_B muffler control and Pd_Input to read muffler
    12₈  DiskData     Pd_Input to read FIFO or Output_B to write FIFO data
    13₈  DiskRam      Output_B to format RAM
    14₈  DiskTag      Output_B to tag register

Note: other tasks must not select these TIOA addresses at any time; doing so may cause the disk controller to malfunction.

The controller is interfaced to the disk drives by a daisy chain cable bussed to all drives and by an independent radial cable to each drive. The radial cables contain the following signals:

    data line (bidirectional, differentially driven)
    data clock (from drive, differentially driven)
    subsector/index line (from drive)
    selected line (from drive)
    select line (from controller)
    sequence line (from controller, controlled by the baseboard for drive 0 and grounded for other drives)
    two VCC lines and scope trigger (from controller)

The daisy-chain cable contains the following signals:

    16 control "tags" driven by the controller and received by the selected drive
    9 error and status signals from the drive as follows:
        CylOffset'
        ReadOnly'
        NoTerminator
        HeadOvfl'
        SeekInc'
        DevCheck'
        NotOnLine
        NotReady
        Index'

The controller or's the NoTerminator error (which means that the daisy-chain cable isn't terminated) into the NotOnLine error; the other error indications are discussed later.

Disk Addressing

The disk system is accessed through a many level addressing scheme. First a particular disk drive is selected.
Then a data surface or head and a cylinder are selected (5 surfaces, 815 cylinders on a T-80). Each cylinder is further divided into sectors which consist of blocks.

Firmware may control the following parameters:

    Sector size (1378 words max., limited by 4-bit subsector counter)
    Number of blocks within one sector (1 to 4)
    Block sizes (2 to 2684 words)

Note: Various limits on the sizes of blocks and sectors will be discussed. The processor interface allows a six-bit subsector counter of which only four bits are presently implemented, and this is the most significant length limit at present (1378 words). If the subsector counter were enlarged to six bits, then the block size limit imposed by the error correction algorithm (2684 data words) would apply. We are, however, unlikely to find any of these length limits significant unless we enlarge the memory page size to 4096 words. Jumpers in the disk unit could also be set to vary the spacing between subsector pulses.

Because sector formats are flexible, firmware can adjust the controller to system needs. The sector formats specifically envisioned in the design of the controller include 28 256-word sectors for Alto Diablo emulation and Pilot, 16 512-word sectors for Juniper, and 9 1024-word sectors for Alto Trident emulation.

Sector Layout Considerations

Each block within a sector can be either read, written, or checked. However, once any block is written, later blocks in that sector cannot be read during that disk revolution. (Later blocks should be readable on subsequent disk revolutions, though this is not guaranteed and no existing software depends on this.) Reading or writing must start with the first block in the sector and continue; since check bits are stored at the end of each block, the entire block must be read to verify its data or correct errors; however, one does not have to read or write subsequent blocks in the sector.
After a check-block operation is started, the controller inhibits writing later blocks within a sector without a specific "OK" from the firmware.

Our general plan is to use the first block in a sector as a header identifying the disk address; all headers will be written when a disk pack is initialized; subsequently, the disk task compares the header with the disk address it thinks it is accessing. The header not only provides a useful safeguard against positioning errors but also allows faster sector determination when switching to a new drive, as discussed later.

The second block might identify information stored in the sector (e.g., the Label block in Alto format). The third block might be the data block. The fourth block could hold reference, backup, or archiving information. All of these choices are a matter of programming convention.

Table 23: T-80 Specifications and Characteristics

Capacity                    82.1 million 8-bit bytes unformatted
Transfer rate               9.67 x 10^6 bits/sec (= one 16-bit word/1.65 µs)
Cylinder positioning time   6 ms cylinder to cylinder maximum (3 ms typical)
                            30 ms average
                            55 ms maximum
Rotational speed            3600 rpm (16.66 ms/revolution)
Sector length selection     12-bit increments through jumpers on sector board
Densities                   370 cylinders/inch
                            6060 bits/inch max. recording density
Disk pack characteristics   IBM 3336-type components
                            5 recording surfaces plus 1 servo surface
                            815 cylinders/surface
Operating methods           Modified frequency modulation recording
                            Linear positioning motor with cylinder following servo
Mechanical specifications   Size - 17.8" wide x 10.5" high x 32" deep
                            Weight - 230 lbs.
Error rate                  Recoverable: 1 error/10^10 bits
                            Irrecoverable: 1 error/10^13 bits
                            Positioning: 1 error/10^6 seeks
Pack start/stop time        20 sec start time
                            20 sec stop time (with dynamic braking)
Controls and indicators     Ready Indicator
                                Off = disk not spinning
                                Flashing = spinning up/down
                                On = Ready
                            Fault Indicator
                            Start/Stop switch
                            Read-only switch
                            Degate switch (inside the drive; takes disk off-line for testing)

General Firmware Organization

This section gives a general overview of how the disk controller firmware is organized; more detailed descriptions follow later.

The disk drive generates subsector and index pulses on one line in the radial cable; the controller distinguishes these according to pulse width. In the normal Idle loop, the controller looks only at these pulses from the connected drives. A four-bit counter for each drive counts down subsector pulses and generates sector pulses. Upon either a sector or an index pulse from the selected drive, the controller generates a disk task wakeup.
The disk task then either increments (sector wakeup) or zeroes (index wakeup) its firmware sector counter, clears the wakeup condition, checks for a new command, and blocks. Because there are no hardware sector counters, the disk task must maintain a sector counter itself; this implies that the rotational position is generally unknown on all deselected drives.

When first selecting a drive, there are two strategies for determining the sector position: (1) Wait for an index wakeup, at which time the sector position becomes known; (2) Wait for a sector wakeup and then read the sector number stored in the header block (this can only be done if the disk is not moving to a new cylinder). The most efficient strategy appears to be a combination: Select the drive and start a seek to the correct cylinder; if an index wakeup arrives before the seek is finished, then the sector position is synchronized with no loss of time. If the seek finishes first, then read the next header to determine the sector number.
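The bookkeeping described above can be pictured with a small model. This is a minimal Python sketch, not the disk task microcode; the class and method names are hypothetical, and the header is assumed to yield the number of the sector it belongs to.

```python
class SectorTracker:
    """Firmware-style sector counter for the currently selected drive."""

    def __init__(self):
        self.sector = None  # None means rotational position unknown

    def select_drive(self):
        # There is no hardware sector counter, so switching drives
        # loses synchronization.
        self.sector = None

    def index_wakeup(self):
        # Index pulse: the position becomes known and the counter is zeroed.
        self.sector = 0

    def sector_wakeup(self):
        # Sector pulse: advance the counter only if the position is known.
        if self.sector is not None:
            self.sector += 1

    def header_read(self, sector_number):
        # Strategy (2): take the sector number stored in a header block.
        self.sector = sector_number


tracker = SectorTracker()
tracker.select_drive()
tracker.sector_wakeup()      # position still unknown
tracker.header_read(5)       # seek finished first; header supplies position
tracker.sector_wakeup()
print(tracker.sector)        # 6
tracker.index_wakeup()
print(tracker.sector)        # 0
```

The combined strategy corresponds to calling whichever of `index_wakeup` or `header_read` becomes possible first after `select_drive`.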
When a new disk operation is noted, firmware will perform the following steps:

    Execute a drive-select command, if the drive differs.
    Load the sector size only if different, and block until index.
    Load the format RAM only if word count or commands differ.
    Execute a Control Tag (seek) command only if the cylinder differs, and wait (continuing to count sectors) until the drive becomes ready again.
    Execute a Head Tag command.
    Block until, at a sector wakeup, the next sector is the one wanted.
    Load the appropriate transfer command into the control register.
    Block until the next sector wakeup.

At the start of the next sector, the controller will become active and sequence through commands under control of the format RAM and two sequence PROMs (one for reading, one for writing).

The sequence PROMs define what operations the controller must go through, and the format RAM contains all parameters that might change from one implementation to another. Actual commands for the Trident disk are stored in the format RAM along with count values such as words/block, words of ECC, and words of delay before some operation; the commands are loaded into the tag register and executed by the controller during the transfer.

Once a transfer has started, the disk task will be woken according to the number of words in the FIFO, and it will send or receive the appropriate number of words. Read and compare operations are performed by firmware, as well as detecting checksum errors at the end of reading. During writing, firmware must provide one word of sync bits (201₈ standard, 001₈ for Alto Trident emulation) followed by the specified number of words for that block (the controller will append 2 words of checksum).
During read, the controller will look for, and discard, the first word of sync bits; then firmware must accept the specified number of words for that block, followed by two words of checksum to be discarded, followed by the ECC remainder to be used for error detection/correction.

Task Wakeups

The controller may wake up the disk task for many conditions; the disk task must determine the cause and take appropriate action, which must in some way cause the wakeup to go away.

In general, there are two ways to determine the wakeup condition: read the wakeup condition, or assume the condition knowing the state of the disk task (which implies the state of the controller). When expecting a sector or index wakeup, the disk task must test carefully to count sectors reliably, but in the middle of word transfer operations, it will assume the wakeup reason to minimize overhead.
The various conditions are as follows: IndexTW, SectorTW, TagTW, RdFifoTW, and WrFifoTW; these wakeup conditions are detailed in the "Muffler Input" section.

Control Register

The DiskControl register is a collection of flip-flops defining the state of the controller; on Output to DiskControl, IOB is interpreted as follows:

    B[5]      Clear EnableRun
    B[6]      Set DebugMode
    B[7]      Set BlockTilIndex
    B[8:9]    Operation for first block of sector, where the operations are:
                  0 = Done (finished with all blocks in this sector)
                  1 = Write
                  2 = Read and check
                  3 = Read
    B[10:11]  Operation for second block of sector, as above.
    B[12:13]  Operation for third block of sector, as above.
    B[14:15]  Operation for fourth block of sector, as above.

EnableRun determines whether the controller is active at all. It is initially cleared by IOReset, and can only be set by completing the loading of the format RAM (see below).

DebugMode allows the controller to be exercised by diagnostics when no disk is present; in this case, diagnostic firmware provides fake disk bit-clocks and data. The flip-flop is cleared by DisableRun.

BlockTilIndex can be set to disable sector and index task wakeups until (a) the selected drive is ready, and (b) an index pulse is received from the drive. It is cleared by an index wakeup. This is useful after switching drives or executing a ReZero operation, either of which causes the controller to lose sector synchronization with the drive. BlockTilIndex prevents the wakeup conditions from being set until these conditions are met, but does not clear any such wakeups that have already occurred. To prevent races, it is necessary to clear SectorTW and IndexTW, then set BlockTilIndex, then clear SectorTW again.

A request for a sector transfer is initiated by loading bits 8 and 9 of the control register with a non-zero value. Then the controller will wait until the next sector pulse to set the "Active" flip-flop and execute the transfer.
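The bit assignments listed for DiskControl can be packed into a 16-bit word as in the following Python sketch. It is an illustration only: the function and constant names are hypothetical, and it assumes that bit 0 is the most significant bit of the word, which matches the way the manual numbers the PC (bit 15 selects the byte).

```python
# Block operation codes as listed for B[8:9].
DONE, WRITE, READ_AND_CHECK, READ = 0, 1, 2, 3


def disk_control_word(block_ops, clear_enable_run=False,
                      set_debug_mode=False, set_block_til_index=False):
    """Pack a DiskControl word; assumes bit 0 is the MSB of 16 bits."""
    if len(block_ops) != 4 or any(op not in (0, 1, 2, 3) for op in block_ops):
        raise ValueError("need four 2-bit block operations")

    def bit(n):
        # Weight of bit n when bits are numbered 0 (MSB) to 15 (LSB).
        return 1 << (15 - n)

    word = 0
    if clear_enable_run:
        word |= bit(5)
    if set_debug_mode:
        word |= bit(6)
    if set_block_til_index:
        word |= bit(7)
    # Fields B[8:9], B[10:11], B[12:13], B[14:15]; the low-order bit of
    # each field is bit 9, 11, 13, 15 respectively.
    for low_bit, op in zip((9, 11, 13, 15), block_ops):
        word |= op * bit(low_bit)
    return word


# Check the header, read the label, write the data, no fourth block.
print(oct(disk_control_word([READ_AND_CHECK, READ, WRITE, DONE])))  # 0o264
```

Any word whose first-block field is non-zero requests a transfer at the next sector pulse, as described above.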
Once a transfer has been started, it may be aborted by loading a new value into the control register twice. The first will clear the Active flip-flop, and the second will load the control register. (When Active, the control register is enabled for shifting commands rather than loading of io data.)

Format RAM and Sequence PROMs

The format RAM is a 16-word by 12-bit register that holds commands and delay counts used by the controller during a transfer. Words within the RAM are used according to the following table; the example values are appropriate for Alto Diablo disk emulation (2-word header, 8-word label, and 256-word data record).

Addr  Description                                                          Example Value
00    Word count of the first block                                        0001
01    Word count of the second block                                       0007
02    Word count of the third block                                        0377
03    Word count of the fourth block                                       0000
04    Control tag command for a read operation                             0104
05    Control tag command for a write operation                            0204
06    Control tag command to set Head Select                               0004
07    Control tag command to zero the tag bus                              0000
08    Word count to write zeroes before writing the 1st block of a sector  0033
09    Word count to write zeroes before writing the successive blocks      0006
10    Word count to wait before reading the 1st block of a sector          0011
11    Word count to wait before reading the successive blocks              0002
12    Word count of ECC words plus one                                     0002
13    Word count of 2                                                      0001
14    Word count of 1 (minimum count)                                      0000
15    Not used                                                             0000

Notice that the format RAM contains both word counts and tag commands. Word counts are 1 less than the desired count.
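The count-minus-one convention can be checked against the first three example values. The Python sketch below assumes the example column is written in octal (0377 octal is 255, one less than the 256-word data record); the helper name is hypothetical.

```python
def format_ram_count(words):
    """Format RAM entry for a field of `words` words (stored as count - 1)."""
    if not 1 <= words <= 4096:  # a 12-bit entry holds 0..4095
        raise ValueError("count does not fit a 12-bit format RAM word")
    return words - 1


# Alto Diablo emulation: 2-word header, 8-word label, 256-word data record.
for words in (2, 8, 256):
    print(format(format_ram_count(words), "04o"))
# prints 0001, 0007 and 0377 on successive lines
```

The 4096 bound here reflects only the 12-bit width of a RAM word; the tighter sector and block size limits discussed earlier still apply.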
Tag commands will be loaded into the tag register (see below) and then used as a "control tag function" by the Trident disk. The values in the right column are those used for the Alto Diablo emulation format. Notice that all but the first 4 values are determined by characteristics of the drive being used as opposed to the specific sector format. The meaning of the tag command values can be found in the "Tag Register" section.

The format RAM is addressed in two ways. During a transfer, sequence PROMs move data from the RAM into either a tag register or a count register. At other times, the Dorado may address the RAM with the RAM Address register, which is zeroed when the control register is written; executing an Output to the DiskRam register writes IOB into the RAM at the current address and then increments the address. Loading the last word in the format RAM turns on the EnableRun flip-flop allowing normal disk control activity. The format RAM may be read via the muffler scheme discussed later.

There are two sequence PROMs, one for reading (or checking) and one for writing. The PROMs are addressed by a program counter that is initialized to zero at the beginning of a sector and is incremented upon completion of each PROM program action. Either the read PROM or the write PROM is selected according to the operation being performed on the current block.

The sequence PROMs are clocked by WordClock, which is derived from the disk bit clock, which in turn is derived from timing information pre-recorded on the disk pack. The subsector pulses generated by the drive are also derived from this timing information.
Thisenables very precise placement of the data on the disk, in a manner that is independent ofthe disk's rotational velocity or the Dorado's clock rate.НЇОf╙pТXН" qН0ЛpНGОfЯН9═ОbAtНґОaН═ Н9═ТFН]О^┴sН═Н:_Н]О]KН═Н:_Н]О\Н═Н:_Н]ОZпН═Н:_Н]ОY⌠Н═П(Н:_Н]ОXUН═П)Н:_Н]ОWН═П&Н:_Н]ОUзН═П'Н:_Н]ОT²Н═ПCН:_Н]ОS_Н═П>Н:_Н]ОR"Н═П;Н:_Н]ОPДН═П6Н:_Н]ОOїН═Н:_Н]ОNiН═Н:_Н]ОM,Н═Н:_Н]ОKНН═Н:_НЇОH÷pТ· Т÷П=q НЇОFуТ∙П!pП9НЇОE Т╔ТіПDНЇОC@ТєП/Т╔П+НЇОAuТ⌡Т°П:НЇО?╙Т▄П7Т█НЇО=ЮТНЇО:nТ├П0Т┤П%НЇО8ёТ└ПMТ┘НЇО6ыТ▀qpТ▄НЇО5ТЎ Т©ПKНЇО3CТюПUНЇО1yТаПLТб НЇО/ўТП7НЇО,<ТёПUНЇО*rТ┴Т┼П:НЇО(їТ┐П6Т└П!НЇО&эТїП"Т╗П2НЇО%Т НЇО!═Т░ПVНЇОуТшП=ТэНЇОТ┬ПVТ┴НЇО@Т▀П!Т▄П9НЇОuТП:Ъ xЇd<ХOїNDorado Hardware ManualDisk Controller 14 September 198199The read and write sequence PROMs are described in the following tables.Write Sequence PROMDurationAddrDescription(WordClocks)00Issue tag command in RAM[6] (head select)101Delay (wait for head select to settle)RAM[13]+102Issue tag command in RAM[5] (write command)103Write long preamble for first blockRAM[8]+104Write sync word105Write data for first blockRAM[0]+106Write first ECC wordRAM[14]+107Write second ECC word and 2 postamble wordsRAM[12]+108Advance control register to the operation for the next blockRAM[14]+109Issue tag command in RAM[5] (write command)110Write short preamble for second blockRAM[9]+111Write sync word112Write data for second blockRAM[1]+113Write first ECC wordRAM[14]+114Write second ECC word and 2 postamble wordsRAM[12]+115Advance control register to the operation for the next blockRAM[14]+116-22Same as 09-15, except step 19 uses RAM[2]+123-29Same as 09-15, except step 26 uses RAM[3]+130Zero the tag bus131Not usedRead Sequence PROMDurationAddrDescription(WordClocks)00Issue tag command in RAM[6] (head select)101Delay (wait for head select to settle)RAM[13]+102Delay (skip over early part of preamble)RAM[10]+103Issue tag command in RAM[4] (read command)1Note: WordClocks cease until controller has read sync word from disk04Read data for first blockRAM[0]+105Read ECC 
wordsRAM[13]+106Compute first word of ECC remainder, issue tag command in RAM[6]107Compute second word of ECC remainderRAM[14]+108Advance control register to the operation for the next blockRAM[14]+109Delay (skip over early part of preamble)RAM[11]+110Issue tag command in RAM[4] (read command)1Note: WordClocks cease until controller has read sync word from disk11Read data for second blockRAM[1]+112Read ECC wordsRAM[13]+113Compute first word of ECC remainder, issue tag command in RAM[6]114Compute second word of ECC remainderRAM[14]+115Advance control register to the operation for the next blockRAM[14]+116-22Same as 09-15, except step 18 uses RAM[2]+123-29Same as 09-15, except step 25 uses RAM[3]+130Zero the tag bus131Not usedTag RegisterThe 16-bit tag register drives the tag bus on the daisy-chain cable; all disk drive commandsare initiated through the tag register. The tag register is sometimes loaded from IOB via anOutput command to DiskTag, sometimes from the format RAM. Loading a Head Tag,Cylinder Tag, or Control Tag into the tag register (from either source) activates a timingcircuit that handles all timing requirements of the Trident drive as follows: Only the tag busbits are enabled for the first 200 ns; then the Tag[0:3] bits are also enabled for 1.2 ms;finally, the Tag[0:3] bits are disabled again and the TagTW flip-flop is set to wakeup thedisk task (indicating completion of the Tag instruction). 
The Drive Select Tag (Tag[0]) does not activate the timing circuit, since the timer counts disk clock cycles, but disk clocks are invalid during drive select changes.

Bits 4 through 15 of the tag register are interpreted according to the following table:

Tag[0]  Drive select and subsector count
        Tag[4:15] are interpreted by the controller to effect drive select or subsector counter changes. The tag timing and wakeup circuit is not activated; firmware must take care of the timing by first loading Tag[4:15] as desired but with Tag[0:3] equal 0, then or-ing in the Tag[0] bit and outputting again.

        4:9    Subsector count
               Divide the 117 subsector pulses from disk by subsector count+1 to form Sector pulses (Tag[4:5] are presently unimplemented).
               Tag[4:9] = 3 yields 29 sectors large enough for 256-word data blocks
               Tag[4:9] = 6 yields 16 sectors large enough for 512-word data blocks
               Tag[4:9] = 14 (octal) yields 9 sectors large enough for 1024-word data blocks
        10     Load subsector from Tag[4:9] for the drive selected prior to the execution of this tag instruction.
        11:15  Drive select
               The basic controller handles up to 4 disk drives; additional units may be accommodated by adding drive dependent logic on an additional board and connecting it in place of drive 3. To allow this, the 5 bit drive select field is interpreted as follows.
               0 - 3            select drive 0 to 3, respectively
               4 - 36 (octal)   select drive 3
               37 (octal)       don't select any drive

Tag[1]  Head Tag
        Loads a register in the drive that selects the head to be used during subsequent read/write commands. A Tag wakeup occurs at completion (1.6 µs).

        4:7    Unused
        8      Off Cylinder
               may be activated during a read to attempt recovery of unreadable data. It causes cylinder positioning to be offset 80 micro-inches.
        9      Determines direction of offset if bit 8 is set.
        10:15  Head number
               values from 0 to 4 are valid for a T-80, 0 to 19 for a T-300. The drive will turn on "EndOfCylinder" (alias HeadOverflow) error if an invalid head address is issued.

Tag[2]  Cylinder Tag
        Causes the drive to seek to the specified cylinder. A Tag wakeup occurs after the tag timing sequence has completed (1.6 µs), and the NotReady status bit is raised until the seek has completed (3 to 55 ms depending on the seek distance).

        4:15   Cylinder number (0 to 814 for Trident disks presently in use). An illegal cylinder number will cause DeviceCheck to be raised.

Tag[3]  Control Tag
        A Tag wakeup occurs at command completion (1.6 µs) and upon completion of the last read/write operation in a sector. Generally, Control Tag commands are issued only by the controller itself (using tag commands from the format RAM) rather than by the microcode; Device Check Reset and ReZero are an exception.

        4      AltoLeader
               special flag to the controller that allows disks written by an Alto Tricon Controller to be read. This bit should only be used for the Alto Trident simulation.
        5      Unused
        6      Strobe Late
               causes data recovery circuits within the drive to sample data late within the data bit time (for recovery when the drive is experiencing excessive read errors).
        7      Strobe Early
               like Strobe Late except in the obvious way.
        8      Write
               turns on the write circuits.
        9      Read
               turns on the read circuits.
        10     Unused
        11     Reset Head register
               zeroes the head address register in the drive.
        12     Device Check Reset
               resets all latched error conditions in the drive.
        13     Head Select
               turns on the head selection circuits, in conjunction with a Read or Write.
        14     ReZero
               repositions the heads to cylinder 0 (if the heads are loaded) and resets the head address register; resets SeekIncomplete and DeviceCheck error conditions.
        15     Head Advance
               increments the head address register in the drive.

FIFO Register

Data to/from the disk is buffered through a 16-word FIFO (25 µs of buffer), which is read/written with Pd_Input/Output_B when TIOA selects DiskData.
Each FIFO word holds 16 data bits, 2 parity bits, and a 2-bit field indicating that the next word to be read is either write, read, or read-and-check type data. During output to the disk, the controller checks parity both when receiving data on the io bus and again when reading the FIFO. During a disk read, parity is computed before writing into the FIFO, is passed through the FIFO, and is then written on the io bus for the processor to test.

Muffler Input

Dorado uses a multiplexor scheme called the muffler system for reading miscellaneous logic signals during debugging from the Alto or baseboard. The disk controller also allows a muffler address to be specified on an Output to the DiskMuff register; in this way, any DskEth board signal available through the multiplexors (mufflers) is also available for firmware sampling. Other bits of the DiskMuff register output specify other operations as follows:

B[0]     Simulate read data of 1 for 1 cycle (for use by diagnostic programs)
B[1]     Simulate read clock of 1 for 1 cycle (for use by diagnostic programs)
B[2]     Clear CompareErr: done by disk task if a read&compare is found to be OK
B[3]     Set ReadDataErr: done by disk task to inhibit future writes
B[4]     Clear the index wakeup flip-flop
B[5]     Clear the sector wakeup flip-flop
B[6]     Clear the tag wakeup flip-flop
B[7]     Clear all error flip-flops within the controller (not the disk drive)
B[8:15]  Muffler address: signals are enumerated below

Following an output to the DiskMuff register, the firmware must wait one cycle before inputting the selected muffler signal with Pd_Input. The state of the signal selected will be driven on IOB[15], and the remaining bits will be zero. For the purpose of examination from Midas, the signals are grouped into 16-bit words, as shown in the following table. The bits within each word and an appropriate explanation follow:

KSTATE: various bits indicating the state of the controller

000  TempSense
     see "Dorado Debugging Interface" document
001  IndexTW
     disk task wakeup is due to an index pulse; index pulses occur once/disk revolution (16.7 ms) and are used to synchronize the hardware subsector counter and the firmware sector counter. An index pulse also causes a SectorTW.
002  SectorTW
     disk task wakeup is due to a sector pulse. To maintain a reliable sector count in a race-free manner, the microcode must (a) check for SectorTW, and upon finding it set, increment the sector number and clear SectorTW; (b) check for IndexTW, and upon finding it set, zero the sector number and clear both IndexTW and SectorTW.
003  TagTW
     disk task wakeup is due to completion of a Head Tag, Cylinder Tag, or Control Tag command. This occurs 1.6 µs after issuing an Output to the DiskTag register, and also upon completion of the last read/write transfer in a sector.
004  RdFifoTW
     disk task wakeup is due to presence of at least 3 words in the FIFO during a normal read or 1 word during a read-and-check. During a normal read, an Input that reduces the FIFO below 3 words will drop RdFifoTW in time for a Block to take effect on the 5th cycle following the Input; this permits a 2-cycle loop (Input, Block). During a read-and-check, an Input that empties the FIFO will drop RdFifoTW in time for a Block to take effect on the 3rd cycle following the Input; this permits a 4-cycle loop (Input, no-op, no-op, Block).
005  WrFifoTW
     disk task wakeup is due to space for at least 4 words in the FIFO. An Output that reduces the free space below 4 words will drop WrFifoTW in time for a Block to take effect on the 5th cycle following the Output; this permits a 2-cycle loop (Output, Block). WrFifoTW is enabled to occur by selecting TIOA[DiskData] when a write command is in progress; it is disabled by TIOA[DiskControl], which the microcode executes after outputting the last data word of a block. One more WrFifoTW will occur after all data has actually been sent to the disk.
006  ReadData
     Data bit from the disk (available for diagnostics)
007  WriteData
     Data bit to the disk (available for diagnostics)
010  EnableRun
     Format RAM has been written, and wakeups are enabled
011  DebugMode
     Controller has been placed in debug mode
012  RdOnlyBlock'
     The controller is processing a block in normal read mode
013  WriteBlock'
     The controller is processing a block in write mode
014  CheckBlock'
     The controller is processing a block in read and check mode
015  Active
     The controller is processing a command for the current sector
016:017  Select.0..1
     The address of the currently selected drive unit

KSTAT: various bits indicating the status of the drive/controller.
The controller will turn on WriteInhibit for the remainder of the sector after any of the following errors are detected, but will still go through all the motions of word transfers.

020  SeekInc
     The disk drive has not correctly positioned the heads within the last 700 ms. A ReZero command must be issued to clear this error.
021  HeadOvfl
     The head address given to the disk drive is invalid (i.e., greater than 4 for a T-80 drive).
022  DevCheck
     One of the following errors occurred:
       Head select, Cylinder select, or Write command and disk not ready.
       Illegal cylinder address.
       Offset active and cylinder select command.
       Read-Only and Write.
       Certain errors during writing, such as more than one head selected, no transitions of encoded data, or heads more than 80 micro-inches off cylinder.
     A ReZero command may be necessary to clear this error.
023  NotSelected
     The selected drive is in "off-line" test mode or the selected drive is not powered up
024  NotOnLine
     The drive is in test mode or the heads are not loaded
025  NotReady
     There is a cylinder seek in progress or the heads are not loaded
026  SectorOvfl
     The controller detected that a command was active when the next sector pulse occurred. This error implies either a hardware malfunction or a discrepancy between the sector format of the drive and the word count the program thinks is appropriate.
027  FifoUnderflow
     Either the FIFO became empty while writing (task got behind) or the FIFO had too many words taken out of it while reading (microcode word count or wakeup error).
030  FifoOverflow
     Either the FIFO became full while reading (task got behind) or the FIFO had too many words put into it during writing (microcode word count or wakeup error).
031  ReadDataErr
     A flip-flop in the controller for latching one of three errors:
       CompareErr     a read-and-check operation was executed on a block, and the microcode did not issue ClearCompareErr before the beginning of the next block.
       ECCError       the microcode can set the ReadDataErr flag if it determines that the ECC words after reading one block are non-zero, in order to inhibit future writes.
       ECCComputeErr  the ECC hardware within the disk controller failed to generate a single "1" bit (i.e., a hardware malfunction).
032  ReadOnly
     The "Read-Only" switch on the drive is on.
033  CylinderOffset
     The cylinder position is currently offset. This is a mode used for recovery of bad data.
034  IOBParityErr
     The controller detected bad parity on the IOB bus.
035  FifoParityErr
     The controller detected bad parity on the data out of the FIFO.
036  WriteErr
     OR of errors on muffler addresses 020-035
037  ReadErr
     OR of errors on muffler addresses 020-031 and 034-035

KRAM: contents of the format RAM

040:043  Address of format RAM word
044:057  contents of format RAM word

KTAG: contents of the tag register

060:077  20 bit value last loaded into the tag register

KFIFO: state of the io control logic

100  ShiftIn
     The controller is currently shifting data into the FIFO
101  ShiftOut
     The controller is currently shifting data out of the FIFO
102  ComputeECC
     The controller is currently shifting data and computing the ECC checksum
103  NextBlock
     Occurs between blocks within a sector
104  LoadTag
     Indicates that the next word read from the format RAM should be loaded into the tag register as opposed to the count register
105  CntDone'
     Indicates that the count register is again zero, and a new value from the format RAM will be loaded next
106  OutRegFull
     The holding register on the input to the FIFO has been loaded, but not transferred into the FIFO.
107  InRegFull
     The holding register out of the FIFO has been loaded, but not read via Pd_Input or loaded into the output shift register.
110:113  FifoWaddr
     The 4-bit address indicating where the next word will be written into the FIFO
114:117  FifoRaddr
     The 4-bit address indicating where the next word will be read from the FIFO. If FifoWaddr equals FifoRaddr then the FIFO is defined as empty.

Error Detection and Correction

To allow high data density and a few surface imperfections during manufacture, Trident disk packs are not required to be perfect. A disk pack is defined as suitable when no more than three bad areas occur on any data surface; a bad area is defined as one which could potentially cause read errors of no more than 11 bits in length. To correct errors arising from these imperfections as well as other (infrequent) read errors, the controller implements an error detection and correction scheme which will detect (with very high probability) errors of any length, and will allow correction of any burst error of 11 bits or less.

Warning: If an error burst longer than 11 bits occurs, there is a significant possibility that the error correction algorithm detailed below will fail and double the number of bad bits! Consequently, disk handling programs should try other methods of error recovery before invoking the error-correction algorithm.

To avoid problems, it is good practice to run diagnostic programs on new disk packs; note bad sectors and don't use these during normal operation.

When an error does occur, the first step is to try rereading the offending sector several times. One of these reads may succeed. If not, try rereading with the cylinder position offset or with the data strobe early or late as discussed in the "Tag Register" section. If these attempts all fail, then try error correction.

Error correction is accomplished through a mixture of disk controller hardware (for ECC generation and checking) and system software/firmware (for error recovery). This is a compromise between capability, speed, and cost.
The basic capabilities and restrictions of the 32-check-bit scheme are summarized below.

1) A single error burst of length less than 12 data bits (i.e., a scattering of error bits within the bit stream, all of which fit within an 11-bit span) can be corrected in blocks shorter than 2685 data words. (Example: for the data "0001100101", the data "0000101101" contains a single burst error of length 4.) The code implemented will detect errors in arbitrarily long blocks, but not enough information exists to correct longer blocks.

2) Simple error detection: two words are returned by the hardware which are both zero if the read is successful.

3) Software/firmware error correction can be completed in less than one disk revolution. The correction procedure is well suited to a mixture of software and firmware. If done entirely in firmware, error correction would take less than 1 ms.

4) Not all uncorrectable errors will be detected as such. An uncorrectable error requires two bad spots on the disk surface within one sector (the pack is bad; throw it out!), an electronic error in a sector with a bad spot, or two electronic errors within one sector. If such an error has occurred, it can, with a probability of say 20 percent, result in an error pattern and displacement that seems valid. This will result in leaving the error bits uncorrected and changing some bits which were in fact correct. This means that for high data security, a check code should be generated and embedded as part of the data file before writing on the disk.

The error-correcting code (ECC) generated is referred to as a Fire Code (see Error-Correcting Codes by Peterson). The following is a detailed description of this code and recovery procedure.

The code calls for dividing the outgoing data stream by a polynomial of the form:

    P(X) = P1(X)(X^m + 1)

where P1(X) is an irreducible polynomial of degree n (n = burst length) and m >= 2*n - 1. For this particular application the polynomials chosen are:

    P(X) = (X^11 + X^2 + 1)(X^21 + 1)

During a write, the two polynomials are multiplied together and implemented by hardware in the form:

    P(X) = X^32 + X^23 + X^21 + X^11 + X^2 + 1

The data stream is premultiplied by X^32 to make room for the 2-word ECC and then reduced modulo P(X). This is accomplished by the normal feedback shift register technique with the difference that to perform premultiplication, the output of the register is exclusive-or'd with the incoming data and then fed back. After all data bits have been shifted out, the contents of the ECC shift registers are appended to the disk block.

During a read, the feedback shift register is reconfigured such that the two original polynomials are implemented separately.
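
The write-side arithmetic can be checked with a few lines of Python. This is a sketch only, not a model of the shift-register hardware: polynomials over GF(2) are held as integers (bit i is the coefficient of X^i), the helper names gf2_mul, gf2_mod, and ecc are hypothetical, and the mapping of disk bit order onto integer bit positions is an assumption.

```python
def gf2_mul(a, b):
    """Multiply two GF(2) polynomials held as integers (carry-less multiply)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_mod(a, m):
    """Remainder of polynomial a divided by polynomial m over GF(2)."""
    deg_m = m.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


P_IRRED = (1 << 11) | (1 << 2) | 1   # X^11 + X^2 + 1
P_CYCLE = (1 << 21) | 1              # X^21 + 1
P_WRITE = gf2_mul(P_IRRED, P_CYCLE)  # product polynomial used during a write


def ecc(data):
    """32 check bits: data premultiplied by X^32, reduced modulo P(X)."""
    return gf2_mod(data << 32, P_WRITE)


if __name__ == "__main__":
    data = 0b0001100101
    codeword = (data << 32) ^ ecc(data)
    # An error-free codeword leaves a zero remainder under both factors.
    print(gf2_mod(codeword, P_CYCLE), gf2_mod(codeword, P_IRRED))
```

The product computed here has exactly the six terms of the expanded P(X), and a codeword formed by appending the check bits divides evenly by each factor, which is why separate reduction by the two factors on a read yields zero for undamaged data.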
The incoming data stream, including the 2 appended words of ECC, is independently reduced modulo P0(X) and P1(X), where

    P0(X) = X^21 + 1
    P1(X) = X^11 + X^2 + 1

After reading in all words off the disk, the contents of the two polynomial shift registers are read out of the FIFO. If the data is recovered without error, then reducing it modulo P0(X) and P1(X) results in the registers containing all zeroes.

If the data contains an error, then the two registers will be non-zero. If one but not both registers is non-zero, then the error is irrecoverable.

To recover from an error, a procedure is undertaken which determines the pattern of bits which are in error, and the displacement of this pattern from the end of the record. I am simply going to present the magic equation to be solved, and some magic constants to be used for solving this equation. Much of the polynomial implementation and the equations, which use the "Chinese Remainder Theorem", are discussed in technical reports from CALCOMP (Calcomp Technical Report TR-1035-04, by Wesley Gee and David George) and XEROX (Xerox XDS preliminary report "Error Correction Code for the R.M. Subsystem," by Greg Tsilikas, 28 March 1972).

The basic equation is:

    D = Q*LCM - (A0*M0*S0 + A1*M1*S1)

where:

    Ei  = modulus of the polynomial
    LCM = least common multiple of E0 and E1
    Mi  = LCM/Ei
    Ai  = a constant such that Ai*Mi modulo Ei = 1
    Q   = smallest integer to make D positive
    Si  = number of shift operations to the appropriate polynomial remainders as described below
    D   = displacement of right-most incorrect bit from the end of the record

The values of E0 and E1 were found by programming the procedure outlined in the CALCOMP report, and yielded the following result:

    E0 = 21    E1 = 2047

The least common multiple (LCM) of E0 and E1 is simply the product of E0 and E1 since the two numbers have no factors in common. Thus the LCM, which is also the record length which can be corrected, is 42,987 bits, or about 2686 words.

Knowing LCM and E0 and E1, the values of M0 and M1 are easily found to be

    M0 = 2047    M1 = 21

The values of A0 and A1 are next determined using a trial and error approach that I put in a small program. The results can easily be confirmed, and are given below:

    A0 = 19    A1 = 195

All of the above values derived so far are constants determined for the particular polynomials chosen.
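
The constants above are easy to confirm. The following Python sketch recomputes them from E0 and E1 by the same kind of trial-and-error search the text mentions; the function name crt_constant is hypothetical and the search is simply one straightforward way to find each Ai, not necessarily the one originally used.

```python
from math import gcd

E0, E1 = 21, 2047                  # moduli of the two polynomials
LCM = E0 * E1 // gcd(E0, E1)       # equals E0*E1 because gcd(E0, E1) is 1
M0, M1 = LCM // E0, LCM // E1      # Mi = LCM/Ei


def crt_constant(m, e):
    """Find by trial the constant a with (a * m) % e == 1."""
    for a in range(1, e):
        if (a * m) % e == 1:
            return a
    raise ValueError("m has no inverse modulo e")


A0 = crt_constant(M0, E0)
A1 = crt_constant(M1, E1)

if __name__ == "__main__":
    print(LCM, M0, M1, A0, A1)
```

Note that A1*M1 = 195*21 = 4095 = 2*2047 + 1, which is the fact that makes a single-precision rearrangement of the displacement equation possible.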
The values of S0 and S1 are determined in the software from the error patterns returned at the end of a disk transfer.

S0 is first determined by a software procedure using the following steps:

1) The remainder from dividing the input data by X^21 + 1 is found in ECC[11:31]; if this remainder is zero, then the error is uncorrectable.

2) Test the low order 10 bits for all zeroes, and if not then perform a left circular shift on the 21 bits. When the low order 10 bits are all zeroes, the error pattern is in the upper 11 bits of the word, and S0 is the number of times the circular shift was performed.

3) If the low order 10 bits don't become all zeroes within 20 shifts (1 full cycle), the error is uncorrectable.

S1 is then determined in microcode as follows:

1) The remainder from dividing the input data by X^11 + X^2 + 1 is found in ECC[0:10]; if this remainder is zero, then the error is uncorrectable.

2) Test this number to see if it is equal to the error pattern determined in step 2 of the S0 procedure, and if not reduce this number modulo X^11 + X^2 + 1 (left shift and XOR feedback). When the contents of this word equals the error pattern (it is guaranteed to happen before 2047 reductions), S1 is determined as the number of reductions performed. (In the hardware implementation of switching from the write polynomial to the read polynomials, it was easier to implement a polynomial that premultiplied by X^11. This means that the remainder returned by the hardware already has had 11 shifts performed. To compensate, when S1 has been determined by the above procedure, you must add 11 to the value, and subtract 2047 if the result is greater than or equal to 2047.)

The basic equation for the displacement now looks like

    D = Q*42,987 - 19*2047*S0 - 195*21*S1

where:

    0 <= S0 <= 20
    0 <= S1 <= 2046

Notice that the straightforward solution to this equation cannot be done with single-precision arithmetic on the Dorado; to avoid double precision, the following manipulation of the equations is useful:

    D  = Q*2047*21 - 19*2047*S0 - 4095*S1
    D  = Q*2047*21 - 19*2047*S0 - 2*2047*S1 - S1
    D' = Q*21 - 19*S0 - 2*S1

where:

    0 <= D' <= 20
    D = 2047*D' - S1    (add 42,987 if D' = 0)

For some reason that we don't understand, the actual required calculation must be D = 2047*(D'+1) - S1 in the last step. Also D' is conveniently calculated as (215*21 - 19*S0 - 2*S1) rem 21.

Display Controller

The Dorado Display Controller (DDC) uses the fast io system to obtain representations of video images from storage; it then transforms these representations into control signals for monitors.
Its three design objectives are:

(1) To handle a variety of color, grey-level, and binary (black-and-white) monitors;

(2) To utilize the full power of the fast io system in producing high-bandwidth computer graphics;

(3) To allow various compromises in color and spatio-temporal resolution for experimental purposes. Clock rates, video signals, and other monitor waveforms should be controllable by firmware.

There are two independent video channels capable of running in a variety of modes. Two channels allow text to be displayed on one channel, graphics on another, or the main picture on one, cursor on the other.

The DDC must readily handle Alto-style and LF (large format) monitors, which we expect to be standard for most systems. Bit maps, display control blocks, and monitor control blocks, similar to those used on the Alto, provide the software interface to the DDC. The "seven-wire" video interface makes provision for one or more low bandwidth input devices (keyboard, pointing device, etc.); our current provisions for keyboard and mouse input are also discussed in this chapter.

Keep Figure 14 in view while reading this chapter.

Operational Overview

Video scan lines are encoded in bitmaps, which are contiguous blocks of virtual memory; the two channels, A and B, have independent bitmaps and data paths in the DDC. The high-priority DWT (Display Word Task) runs on behalf of either A or B using the subtask mechanism; it transmits each bitmap to a FIFO consisting of 15 munches/channel. The bitmap stream emerging from the FIFO is then sorted into items (1, 2, 4, or 8 bits wide) for each channel, which are combined, mapped, and transformed into pixels (picture cells) on the screen.

In addition to the two channels, the DDC supports a programmable cursor that is 16 pixels x 1 bit/pixel wide.

A lower priority DHT (Display Horizontal Task) handles horizontal and vertical retrace and sets up starting addresses and munch counts, cursor data, and formatting information in the NLCB (Next Line Control Block) for the DDC. The NLCB is then copied into the CLCB (Current Line Control Block) during horizontal retrace prior to the next scan line.

The rate-of-flow of items is governed by the resolution and pixel clock period. Resolution may be independently programmed for each channel so that items flow at 1/4, 1/2, or 1 times the pixel clock period. If the DispM board is present, then the pixel clock period is also programmable; otherwise, it is determined by a crystal oscillator on the DispY board, which must have a frequency appropriate for the monitor being driven.

Items can be treated in one of three ways: First, an Alto monitor can be driven. Second, items can be mapped through the 256-word x 4-bit MiniMixer into video data for a black-and-white or grey-level monitor.

Three separate interfaces are provided on the DispY board. An Alto monitor interface ORs one-bit items from the A and B channels with the cursor, and then XORs by polarity to produce one-bit pixels for an Alto display.
A seven-wire interface outputs 1 bit/pixel for a binary monitor. And an 8-bit digital-to-analog converter (DAC) produces grey-level video.

Third, items may be mapped by the Mixer (or A color map), a 1024-word x 24-bit RAM, into signals for a color or grey-level monitor. A variety of modes determine which bits from the A and B items address the mixer. Mixer output consisting of 8 bits for each of the red, green, and blue guns is then digital-to-analog converted for color monitors. Additionally, there is a 24-bit/pixel mode in which the Dorado supplies 8 bits for each of the three colors; the colors are independently mapped through the Mixer and two additional 256-word x 8-bit RAMs called the BMap and the CMap.

The DDC is implemented on two Dorado main logic boards, called DispY and DispM. DispY contains all the logic necessary for vertical and horizontal sweep control, channel data paths, and video data for binary and grey-level monitors running at a fixed pixel clock rate. DispM contains the color maps, the programmable pixel clock, and the three DACs for driving a color monitor. Additionally, DispM contains an independent terminal controller that is structurally similar to a one-channel, one bit/pixel DispY but is specialized to driving a 7-wire terminal.

Thus there are two principal DDC configurations. On a Dorado with only a 7-wire terminal and no color monitor, only the DispY board is present; it is programmed for Alto terminal emulation, and only a small subset of its capabilities is used.
However, on a Dorado with both a 7-wire terminal and a color monitor, the DispM board is also present; all of DispY and the color hardware on DispM are used to drive the color monitor, and the independent controller on DispM is used to drive the 7-wire terminal.

Video Data Path

Fast IO Interface and FIFO

The fast io system delivers data to the DDC at a rate of 16 bits/clock; words are received alternately in the REven (t1) and ROdd (t2) registers shown in Figure 14, then written into the FIFO, a 256-word x 32-bit RAM, during the first half of the next Dorado cycle (t2 to t3), leaving the second half of the cycle free for read access by the video channels. In other words, the REven and ROdd registers widen the data path from 16 to 32 bits to allow sufficient time to both write and read the FIFO in one cycle.

The 256 double-words in the FIFO are divided evenly among the two channels, so each has buffer storage for 16 munches. Each channel has write and read pointers that address the FIFO when appropriate.

Write pointers are initialized once during vertical retrace and then sequence through addresses for the entire display field; a write pointer is incremented after each double-word write for its channel, so that the next word to be written is addressed at all times.
Since the fast io system delivers only one munch at a time, there is never any problem in deciding which of the two write pointers should address the FIFO.

Read pointers, however, are initialized during each horizontal retrace, so that the correct first double-word is read at the start of every scan line. This is required because the fast io system always delivers complete munches, but unused double words may appear at the end of the last munch for the previous scan line, or at the beginning of the first munch for the current scan line; the read pointer has to be reinitialized to skip over these. FIFO reads alternate between channels A and B, so the data rate for one channel is limited to 32 bits/2 cycles (=16 bits/cycle).

Note that bitmaps are required to start at even addresses because the FIFO is 32 bits wide.

Item Formation

At the output end of the FIFO there is a multiplexor shared by both channels and, for each channel, two intermediate buffers (FIB and SIB), and a shift register SR.
The multiplexor permutes the 32-bit quantity emerging from the FIFO so that when the double-word has marched through FIB and SIB and is finally loaded into SR, successive shifts will produce successive items of the selected size (8, 4, 2, or 1 bits).

The SR is tapped as follows:

SR.0                          Item[0] for item sizes 1, 2, 4, or 8;
SR.16                         Item[1] for sizes 2, 4, or 8, gated to 0 for size 1;
SR.8, SR.24                   Item[2:3] for sizes 4 or 8, gated to 0 for sizes 1 or 2;
SR.4, SR.12, SR.20, SR.28     Item[4:7] for size 8, gated to 0 for sizes 1, 2, or 4.

All eight Item bits are gated to 0 if the channel is off. It is useful to think at this point that, regardless of a channel's item size, an 8-bit wide item is produced, whose bits contain non-zero data only in those positions dictated by the item size; i.e., for size 1 only the most significant bit may be non-zero; size 2 allows data in the topmost two bits, etc.

The SR loads on the item clock after its last item has been used; the item clock rate is the pixel clock rate divided by the resolution (1, 2, or 4 for full, half, or quarter, respectively). Hence, for 8, 4, 2, or 1-bit items, SR will be shifted 3, 7, 15, or 31 times, respectively, and be reloaded from SIB on the following item clock.

Synchronization of SR, which uses the item clock, with FIB and SIB, which use the Dorado system clock, is a little tricky. SIB_FIB will occur no later than (4.6 ns)+C+(1.1 ns)+C+C = 3*C+5.7 ns after SR_SIB, where C is the period of the Dorado system clock and 4.6 ns and 1.1 ns are the worst case propagation delay and setup time of the components in the synchronizer; FIB_FIFO will occur at this time or on one of the next three Dorado clocks, depending upon which of these four clocks corresponds to t2 of the cycle in which this channel can read the FIFO. Allowing for propagation delay through SIB (5.0 ns) and setup time for SR (1.7 ns), the worst case minimum spacing between loads of SR is 3*C+(5.7 ns)+(6.7 ns) = 3*C+12.4 ns.
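The shift counts and the worst-case reload spacing quoted above can be checked numerically. The following is only a sketch: the constants are the figures given in the text, while the function and constant names are hypothetical and chosen for illustration.

```python
# Sketch of the SR reload arithmetic described in the text.
# All names are illustrative; only the numeric figures come from the manual.

SYNC_PROP_NS = 4.6   # worst-case synchronizer propagation delay
SYNC_SETUP_NS = 1.1  # worst-case synchronizer setup time
SIB_PROP_NS = 5.0    # propagation delay through SIB
SR_SETUP_NS = 1.7    # setup time for SR


def shifts_per_load(item_size):
    """Shifts between SR loads: a 32-bit load holds 32/item_size items,
    and the first item is available without shifting."""
    if item_size not in (1, 2, 4, 8):
        raise ValueError("item size must be 1, 2, 4, or 8 bits")
    return 32 // item_size - 1


def worst_case_reload_spacing_ns(c_ns):
    """Worst-case minimum spacing between SR loads: 3*C + 5.7 ns + 6.7 ns,
    where c_ns is the Dorado system clock period in nanoseconds."""
    return 3 * c_ns + (SYNC_PROP_NS + SYNC_SETUP_NS) + (SIB_PROP_NS + SR_SETUP_NS)


print([shifts_per_load(size) for size in (8, 4, 2, 1)])
print(round(worst_case_reload_spacing_ns(30.0), 1))
```

For a system clock period of C = 30 ns the second line evaluates 3*C+12.4 ns, the worst-case spacing between loads of SR.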
This must be less than the time for emptying SR which is I*(32/ItemSize), where I is the period of the item clock. Hence, I > (3*C+12.4)/4 for ItemSize=8, or I > 25.6 ns for a Dorado clock period of C = 30 ns.

The 8-bit items from the two channels are then presented to either the Mixer section on the DispM board or the MiniMixer or Alto video interface on the DispY board.

Mixer

The Mixer is controlled by the A8B2, BBypass, and 24Bit mode controls. It is a 1024-word x 24-bit RAM for which the 10 bits of address required may be obtained from two possible source distributions, depending upon the A8B2 mode. When A8B2 is true, the address consists of AItem[0:7] and BItem[0:1]; when false (called A6B4), the address is AItem[0:5] and BItem[0:3].

Another mode, the BBypass mode, can be enabled independently for the B channel. If B is bypassed, none of its bits contribute to the Mixer address. Instead, they bypass the mixer and address a 256 x 8 RAM, the BMap, whose outputs are ORed with the mixer outputs for the blue DAC. For example, with ASize=8, BSize=4, BBypass true, and A8B2 true, and with appropriate values in the Mixer RAM, the controller may be thought of as three 4-bit/pixel channels driving three color guns. One channel is bypassed data from B, while the other two are mapped through the Mixer.

24Bit mode, used in conjunction with BBypass mode, is used to run a three-channel color display directly from memory.
In this mode, items from the A channel alternately address the Mixer (called the AMap in this mode) and another 256 x 8 RAM called the CMap. Meanwhile, the B channel runs at half the A channel rate and addresses the BMap as described above. (That is, the B channel must be set to one-half the resolution of the A channel.) With suitable values in the color maps, the AMap, BMap, and CMap independently generate outputs for the red, blue, and green DACs respectively.

Note: when the A channel is turned on, the first AItem addresses the AMap and the second AItem addresses the CMap. For the A and B pixels to align properly on the display in 24Bit mode, the left margin counts must be set to start the B channel one pixel clock earlier than the A channel. The blue and green portions of the AMap must be entirely zeroed, since the blue and green outputs are ORed with the BMap and CMap.

After routing as dictated by the mixer modes, chosen items are loaded into the map address registers, causing the color maps to produce a new video value every pixel clock (every two pixel clocks in 24Bit mode), and these values are latched in the three 8-bit mixer output registers. Three very fast DAC modules then produce a Red-Green-Blue triple of analog signals for a color monitor, or up to three grey-level video signals. In conjunction with the sync, blank, and composite waveforms produced by the monitor control circuitry, these signals can drive a wide variety of monitors attached to the Dorado.

Alto Video Interface

A small circuit on the DispY board produces video for an Alto monitor. This circuit ORs CursorData, AItem[0], and BItem[0], then XORs by the polarity, and finally ORs with the vertical and horizontal blanking signals. This interface is obsolete and is no longer in active use.

MiniMixer

A small video mixer on the DispY board, not to be confused with the large Mixer on the DispM board, can drive either a DAC or the seven-wire interface discussed later.
The MiniMixer is a 256 word x 4-bit RAM addressed by a combination of AItem, BItem, and state bits, as shown in Figure 14. On every pixel clock, dDAC[0:3] are loaded from MiniMixer output, while dDAC[4:7] are loaded directly from AItem[4:7]. The MiniMixer aims at experiments with mixing channels and driving grey level monitors.

Horizontal and Vertical Control

Every monitor requires horizontal synchronizing and blanking waveforms. Interlaced monitors must be able to distinguish fractions of a scan line to implement interlacing. In general, the duration and phasing of sync/blank waveforms is unique to a given monitor. The DDC uses the 1024-word x 3-bit HRam (Horizontal RAM) to control horizontal sync/blank.

The DDC has a set of registers called the CLCB (Current Line Control Block) which controls video generation for the current scan line. The DHT sets up parameters for the next scan line in NLCB (Next Line Control Block), a 16-word x 12-bit RAM. The first 32 pixel clocks of horizontal blanking are called the HWindow; during HWindow parameters for the next line are copied from NLCB into CLCB. Vertical control is also handled through the NLCB.

The interpretation of fields in NLCB and HRam is shown in Figure 15 and loading will be discussed in the "Slow IO Interface" section; the use of the different information is discussed here.
The top part of Figure 14 shows how horizontal timing is controlled.

Line Control Blocks

The fields in NLCB/CLCB are interpreted as follows, where the prefix "a" denotes that the item is channel-specific (i.e., copies exist for both A and B channels):

aPolarity. A single bit, used only for binary monitors, that inverts black and white (APolarity and BPolarity are or'ed by the hardware).

aResolution. A 2-bit field that controls item clock generation; values of 0, 2, and 3 cause quarter, half, and full resolution, respectively.

aItemSize. A 4-bit field unary encoded as aSize1, aSize2, aSize4, or aSize8, denoting bits/pixel for the channel; setting multiple bits is illegal.

aLeftMargin. A 12-bit field in units of pixel clocks specifying 31 less than the number of pixel clocks to wait after HWindow completes before turning the channel on. This value is not a straightforward constant, but depends upon monitor-specific horizontal blanking time. If the horizontal blanking time is B pixel clocks and the desired beginning of data is L pixel clocks after the end of horizontal blanking, then aLeftMargin should be loaded with B+L-32-31 = B+L-63, independent of resolution.
Since L may be 0, this implies that the horizontal blanking time for the monitor must be greater than 63 pixel clocks. Since high-speed monitors typically have greater than 4 µs horizontal blanking times, and are this fast only with high speed pixel clocks, this restriction is not expected to be significant.

Note: For a monitor connected via the 7-wire interface, aLeftMargin must be B+L-68, rather than B+L-63, because video signals are delayed from horizontal control waveforms by 5 pixel clocks.

Note: The value loaded into aLeftMargin must actually be the negative of the left margin count computed above.

aWidth. A 12-bit counter that counts at the pixel clock rate as soon as the channel turns on; when the counter runs out (or when horizontal retrace starts, whichever is earlier), the channel is turned off. Precisely, if the channel is to run for W pixel clocks, the width counter must be loaded with (W+255).

aFifoAddr. An 8-bit quantity pointing to the munch and word within the munch for the first FIFO read for the next scan line; this must be an even number because doublewords are fetched from the FIFO. Firmware must keep track of the number of used munches for any given line and advance aFifoAddr by exactly the right amount, adjusting for munch boundaries, interlacing, and data breakage. The CLCB register for aFifoAddr is the channel read pointer itself.

MixerModes.
A set of bits that control the mixer; these are not channel-specific. These will normally be changed infrequently, maybe at the field rate or during display initialization. However, they are in the NLCB to allow modes to change on the fly.

Vertical Control Word (VCW). A word controlling the vertical retrace operation of the monitor; it contains the vertical blank bit, vertical sync bit, and interlace field bit discussed in the "Vertical Waveform Generator" section below.

Cursor and CursorX. The 12-bit CursorX value is loaded into a counter which starts counting at the end of HWindow. When the counter runs out, the 16-bit Cursor value is shifted out onto the CursorVideo line. This is used by the Alto video interface and in the MiniMixer address. Precisely, if horizontal blanking is B pixels in duration, and the leftmost bit of the cursor is to appear X pixels beyond the end of horizontal blanking, then the CursorX register must be loaded with (B+X+226), or (B+X+221) when using the 7-wire interface.

Horizontal Waveform Generator

The 1024-word x 3-bit HRam contains control information for these waveforms. Under normal operation, HRam is addressed by a 12-bit counter (HRamAddr[0:11]) which is reset at the leading edge of horizontal sync and then increments every pixel clock until the next leading edge of horizontal sync; HRamAddr[1:10] address the RAM, and the output is loaded into the HRamOut register every other pixel clock. The three bits in HRamOut control horizontal sync, horizontal blank, and half-line; these three bits are combined and level shifted by a logic network appropriate for the monitor being driven.

The 1024-word HRam imposes the uninteresting restriction that there be fewer than 2048 pixels/scanline.

As shown in the diagram at the top of Figure 14, horizontal blanking (HBlank) is true from the end of one scan line to the beginning of the next.
During horizontal blanking, HSync is turned on to initiate the horizontal retrace and turned off again when horizontal retrace is finished. HBlank then continues for a monitor-specific interval. Note that if a channel's visible left margin is non-zero, then the horizontal scan will begin before that channel is producing any data; in this case, the video channel outputs zero items to the mixing stages until the channel is turned on.

Due to an implementation error, when the 7-wire interface is being driven from DispY, the value of HBlank[i] may differ from HBlank[i-1] only when i is even, where i is HRamAddr[1:10].

Vertical Waveform Generator

Only 2:1 interlaced monitors are supported in this design, but more complicated vertical control could be provided, if desired. To support 2:1 interlace, HRam contains a waveform called HalfLine, which is a pulse at the horizontal line frequency, 180 degrees out of phase with HSync.

Vertical control is handled by DHT through the NVCW word in the NLCB, which specifies whether or not vertical blank or retrace should begin or end during the next scan line. The DHT microcode must keep track of scan lines to enable vertical signals at the appropriate times.

The three VCW bits are called VBlank, VSync, and OddField.
VSync enables vertical sync to begin on the next line, and the OddField bit chooses either HSync or HalfLine on which to do vertical syncing (OddField=1 implies HalfLine phasing for vertical sync). This phase will alternate from the start of the line to the middle of the line and back for successive fields. The blanking signal for the monitor is VBlank ORed with HBlank.

Pixel Clock System

The programmable pixel clock on the DispM board, if present, determines the fundamental video data rate for a given monitor. The pixel clock is controlled by loading the PixelClk register via the slow io system. The pixel clock frequency is (312.5*(241-M))/(16-D) KHz, where M is PixelClk[4:11] and D is PixelClk[12:15]. Note that the pixel clock will not stabilize until about 1/2 second after the PixelClk register is loaded.

The parts of the DDC synchronized to the rest of Dorado do, of course, use the Dorado system clock. As discussed earlier, the synchronization logic for refilling SIB after SR_SIB puts a lower bound on the pixel clock period of (3*C+12.4)/4 ns (= 25.6 ns for a Dorado clock period of C = 30 ns), for an item size of 8 on either channel. We anticipate that pixel clock rates in the range 10 to 50 MHz (100 to 20 ns/pixel) will be required, so the lower bound is approximately consistent with this.

Seven-Wire Video Interface

So that a number of different controller and terminal types may be freely interconnected in Dolphin and Dorado-based systems, a common interface between terminals and controllers has been defined.
This interface assumes that a terminal contains a raster-scanned bitmap display and one or more low bandwidth input devices (keyboard, pointing device, etc.). The DDC transmits digital video and sync to the terminal over six pairs of a seven-pair cable. The input data is encoded by a microcomputer in the terminal and sent back serially over the seventh pair (the "back channel"). Video and control (sync) are time-multiplexed, and four bits are transmitted in parallel to reduce the cable bandwidth required.

While the description in the following sections assumes a display having one bit/pixel, the basic signalling mechanism may be extended to support gray-level or color displays.

Video Output

The four output lines are interpreted as either a 4-bit nibble of video or four control signals according to the phases of the two clock signals; the DDC places data on the data lines at the falling edge of ClkA, and the terminal samples this data on the rising edge of ClkA. If ClkB is 1 at this time, the nibble is interpreted as four bits of video, else as sync and control information. ClkA and ClkB are transmitted in quadrature so that the terminal can reconstitute a clock at the video bit rate.

When a nibble is interpreted as control information, bit 2 is reserved for horizontal sync and bit 3 for vertical sync, while 0:1 are undefined; different types of terminals may use 0:1 for any purpose.

A circuit on the DispY board drives the seven-wire interface from the MiniMixer. MiniMixer[0] is serial-to-parallel converted into four-bit nibbles, which are held in a register for transmission. Sync, blank, and clock phases are generated in accordance with the seven-wire interface specification.

Back Channel

Data from low bandwidth input devices at the terminal are transmitted serially over the back channel.
Data are clocked by the terminal on the rising edge of the horizontal blank pulse and are sampled by DHT during the subsequent scan line after HWindow.

By convention the terminal microcomputer encodes 32-bit messages (delivered in 32 scan lines); each message begins with a 1, and after the 32nd bit of the message the DHT ignores the back channel until the start of another message is indicated by another 1. The message consists of a start bit, 3 unused bits, a 4-bit message type, a 16-bit message body, and finally an 8-bit trailer which must be 200 octal.

The terminal microcomputer perpetually cycles through all possible keys on the keyboard (as well as mouse buttons and keyset paddles), detecting changes in state of the keys; the state of the keyboard then exists in seven 16-bit words, and a back channel message is defined for each. Whenever one of these words changes value, it is sent to the Dorado in a message. Additionally, changes in mouse x,y coordinates are reported once per field (i.e., twice/frame or typically 60 times/sec).
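The 32-bit message layout described above (start bit, 3 unused bits, 4-bit type, 16-bit body, 8-bit trailer) can be sketched as a pair of pack/unpack helpers. This is an illustration only: it assumes the first bit received (the start bit) is held in the most significant position of a 32-bit integer and takes the required trailer to be octal 200; the function names are hypothetical and are not part of the DHT firmware.

```python
# Sketch: pack and unpack one back-channel message.
# Assumption: bits are accumulated most-significant first, so the start bit
# lands in bit 31 of the assembled 32-bit value.

TRAILER = 0o200  # required 8-bit trailer (200 octal)


def encode_message(msg_type, body):
    """Assemble start bit, 3 unused (zero) bits, type, body, and trailer."""
    if not 0 <= msg_type <= 0o17:
        raise ValueError("message type must fit in 4 bits")
    if not 0 <= body <= 0xFFFF:
        raise ValueError("message body must fit in 16 bits")
    return (1 << 31) | (msg_type << 24) | (body << 8) | TRAILER


def decode_message(word):
    """Return (msg_type, body) or raise ValueError for a malformed message."""
    if not 0 <= word < (1 << 32):
        raise ValueError("message must be a 32-bit value")
    start = (word >> 31) & 0x1
    msg_type = (word >> 24) & 0xF
    body = (word >> 8) & 0xFFFF
    trailer = word & 0xFF
    if start != 1:
        raise ValueError("missing start bit")
    if trailer != TRAILER:
        raise ValueError("bad trailer")
    return msg_type, body


# Round-trip a hypothetical message of type 05B with an arbitrary body.
print(decode_message(encode_message(0o05, 0x1234)))
```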
If the mouse has not changed position during a field, then one keyboard word is reported instead of the mouse position change; thus, the correct state of the keyboard is eventually reported even if transitions are missed.

Table 24: Terminal Microcomputer Messages

Message Type   Comments
00B            Illegal; ignored
01B            Keyboard word 0 (corresponds to Alto memory location 177034B)
02B            Keyboard word 1 (Alto 177035B)
03B            Keyboard word 2 (Alto 177036B)
04B            Keyboard word 3 (Alto 177037B)
05B            Mouse buttons and keyset (Alto 177033B)
06B            8-bit changes in X-coordinate (0:7 of the message body) and Y-coordinate (8:15 of the message body), represented in excess-200B notation
07B            Illegal; ignored
10B            Keyboard word 4 (Star keyboards only; no Alto analogue)
11B            Keyboard word 5 (Star)
12B-16B        Illegal; ignored
17B            Boot message. Actually, depressing the boot button jams the data to one continuously, rather than generating a valid terminal message. Furthermore, when the boot button is let up, there may be as many as 8 bits of garbage following the last consecutive one bit; these must be ignored by the firmware. The firmware should also ignore boot button pushes less than 10 ms in duration, as these may be caused by noise or contact bounce.

Processor Task Management

This section outlines the implementation requirements of DHT and DWT and discusses the hardware associated with task wakeups and DWT subtask arbitration between the two channels.

Since DHT must do a lot of processing, it runs at low priority and is awakened once/scan line at the end of HWindow.
When it runs, it must calculate all parameters for the next scan line (i.e., the one after the scan line that is just starting), load the NLCB appropriately for each channel, and set up the munch address and count for each channel in the RM registers aNextAddr and aNextCount referred to in the DWT sample code below; then it sets the aNextWCBFlag flags discussed below. The DHT wakeup will remain active until any NLCB output command is executed, so the DHT must execute at least one NLCB output command every time it wakes up, and this must occur at least three instructions prior to blocking.

DWT is a very high priority task which may run on behalf of either channel: channel A is subtask 0; channel B, subtask 2. Since it uses the subtask mechanism, DWT must always block at the same instruction each iteration. DWT does not explicitly know the channel for which it is executing at any given time; its two parameters, a start address and munch count, are received from DHT in RM registers specific to the subtask. In the normal case, DWT initiates an IOFetch and blocks.
The following is the main-line DWT microcode presently in use:

%RM registers for channel A, indicated by names beginning with "A" below, are used in the program, but the corresponding set of registers for channel B, in a different RM region, will be referenced when SubTask is 2. Note that TIOA selects the DWTFlag register and T contains 20 at the beginning of the loop, so the second instruction is used both to increment the munch address and to signal the hardware that an IOFetch is commencing.%
DWTStart:
    ACount_(ACount)-T, Branch[DWTCheck, R<0];
    AAddress_(IOFetch_AAddress)+(Output_T), Block, Branch[DWTStart];
%AAddress will be even if we just exhausted a scan line. AAddress will be odd if we have just been awakened to start a new scan line. In either case, isolate flag in AAddress[15] for use in adjusting the WCB flags.%
DWTCheck:
    AAddress_ (AAddress) AND (1C), Branch[DWTAdjustWCBFlags, R even];
%Note that the change-RSTK-for write function used below is ok, but the change-RBase-for-write functions are illegal because of subtasking.%
DWTRefill:
    ACount_ANextCount;      *from DHT, # munches to fetch -1 in 0:11
    BrLo_ANextAddrLo;       *first munch address
    BrHi_ANextAddrHi;
%Now adjust WCB flags, as follows: If we just exhausted a scan line, AAddress=0 now; execute Output_0 to clear the CurWCB flag, and set AAddress to 1 for the next wakeup.
If we are starting a new scan line, AAddress=1 now; execute Output_1 to set the CurWCB flag and clear the NextWCB flag, and set AAddress to 0 for the first IOFetch.%
DWTAdjustWCBFlags:
    AAddress_ (AAddress)1, Output_AAddress, Block, Branch[DWTStart];

DWT lowers its wakeup request at the onset of the DWTStart instruction, and the DDC remembers that DWT is in progress. No further DWT wakeups will be generated while the task is running or is preempted by a higher priority task. Whenever DWT blocks, a counter is initialized to a constant value N and counts once per Dorado cycle; when the counter runs out, DWT wakeups are allowed again. This counter has two purposes. First, within a munch loop it spaces out IOFetch references to the memory system by 8 or more cycles (depending upon N, which is adjustable through a hardware SIP component), so as not to clog the memory pipeline. Second, the decision to generate subsequent DWT wakeups is based upon the state of flags that may be altered by output commands; these commands take time to get from the processor to the DDC and alter the state. Other tasks may have the processor while these state changes take effect.

After N cycles have elapsed, DWT will be woken whenever aWantsDWT is true for one of the channels. Two channel-specific flags are involved in DWT wakeup control: aCurrentWCBFlag is true when a is actively moving words into the FIFO; aNextWCBFlag is set true by DHT after it has loaded the munch address and munch count into DWTnextaddr and DWTnextcount for a. After fetching the last munch for a scan line, DWT clears aCurrentWCBFlag and blocks unless aNextWCBFlag is true. In other words, aWantsDWT is true when

    (aNextWCBFlag & not aCurrentWCBFlag) or (aCurrentWCBFlag & aFifoAvailable).

If only AWantsDWT or only BWantsDWT, no conflict arises and the requesting channel gets DWT.
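The per-channel wakeup condition can be written as a small predicate. This is a sketch under an assumption: the two parenthesized terms in the text are taken to be combined with a logical OR (a channel wants DWT either to start a newly prepared scan line or to continue the current one while the FIFO has room). The function and argument names are hypothetical.

```python
# Sketch of the aWantsDWT condition for one channel.
# Assumption: the two terms are ORed together.

def wants_dwt(next_wcb_flag, current_wcb_flag, fifo_available):
    """Return True when the channel should request a DWT wakeup."""
    # DHT has prepared the next scan line and the channel is not mid-line.
    start_new_line = next_wcb_flag and not current_wcb_flag
    # The channel is mid-line and its half of the FIFO can accept a munch.
    continue_line = current_wcb_flag and fifo_available
    return start_new_line or continue_line


# A channel that is mid-line with FIFO room requests DWT ...
print(wants_dwt(next_wcb_flag=False, current_wcb_flag=True, fifo_available=True))
# ... while an idle channel with nothing queued by DHT does not.
print(wants_dwt(next_wcb_flag=False, current_wcb_flag=False, fifo_available=True))
```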
However, if both channels want DWT, the channel that ran least recently will run next.

Two observations must be made about the DWT microcode. First, because the final instruction is normally an IOFetch_, the next instruction executed (by another task) will be held one cycle if it initiates any memory reference. Secondly, the two instruction loop above requires that the hardware cope with the NextLies condition discussed in the "Slow IO" chapter; a pathological lockout problem could occur if a high demand task of higher priority is coded so that it always creates NextLies (say, by doing Block and immediate _Md in the instruction after a fetch). This would result in the DWT wakeup being frequently delayed by 2 cycles.

Note: Neither DWT nor DHT drives the IOAtten branch condition.

Slow IO Interface

DDC manages all control functions via the slow io system. At this point you should study Figure 15, which shows the format of the various output and input commands; there are six output devices and one input device on the DispY board, and eight output devices and one input device on the DispM board (if present).
Output commands are handled uniformly: TIOA is clocked into a register at t1; the register output is decoded and identified as one of the DDC commands; if the processor is doing an Output_B, then at t3 IOB data from the processor is clocked into a register and one of the "TIOA command" pulses occurs from t3 to t5, at which point the desired action is complete.

The IOB data received at t3 of an Output_B will remain in the DDC buffer register (RIOB) until the next output command. This is useful for debugging and for muffler readout of the NLCB (because an NLCB address can be loaded into RIOB for multiple cycles).

The HRam, MiniMixer, Mixer, BMap, and CMap are RAMs that will generally be loaded during system initialization and not often changed while pictures are being displayed. The programmable pixel clock will also be loaded during initialization, if it is being used instead of the fixed crystal oscillator.

The HRam, Mixer (AMap), BMap, and CMap addresses each have two independent sources: the Dorado slow io system and the video system. Video system addressing is disabled during loading from the Dorado. The output commands to each of these RAMs are interpreted as follows: The Keep' bit is saved in a flipflop loaded by every RAM output command; as long as Keep' is true (i.e., low), video system addressing is off. If LoadAddr is true, then IOB[4:15] are loaded into the RAM address register. If Write' is true (i.e., low), the currently-addressed word of the RAM is written from the data field; additionally, the RAM address register increments after writing, so the RAM can be loaded sequentially at high speed.
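The Keep', LoadAddr, and Write' behaviour can be illustrated with a toy software model. This is a sketch, not a description of the hardware: the class and method names are hypothetical, booleans stand in for the active-low bits, a command is assumed either to load the address or to write (not both), and the address counter is assumed to wrap at the end of the RAM. The model also applies the restriction, noted in this section, that LoadAddr and Write' act only when the Keep' flipflop was already set by an earlier command.

```python
# Toy model of loading one DDC RAM (HRam, Mixer, BMap, or CMap) via slow io.
# Assumptions are marked in comments; names are illustrative only.

class LoadableRam:
    def __init__(self, size):
        self.words = [0] * size
        self.addr = 0        # RAM address register
        self.keep = False    # True models Keep' asserted: video addressing off

    def output(self, keep, load_addr=False, write=False, value=0):
        """Model one RAM output command."""
        was_kept = self.keep
        self.keep = keep     # the Keep' flipflop is loaded by every command
        if not was_kept:
            return           # other bits act only if Keep' was already set
        if load_addr:
            self.addr = value % len(self.words)
        elif write:
            self.words[self.addr] = value
            # Auto-increment allows sequential loading; wrap-around is assumed.
            self.addr = (self.addr + 1) % len(self.words)


hram = LoadableRam(1024)
hram.output(keep=True)                              # take the RAM from the video system
hram.output(keep=True, load_addr=True, value=0o100) # set the starting address
for word in (1, 2, 3):
    hram.output(keep=True, write=True, value=word)  # sequential writes
hram.output(keep=False)                             # hand the RAM back
print(hram.words[0o100:0o103], hram.addr, hram.keep)
```

Sequential loading needs only one address command followed by a run of write commands, which is the point of the auto-increment.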
A RAM output command with Keep' false (i.e., high) releases the RAM from Dorado control and returns it to the video system.

Note: the LoadAddress and Write' bits of a RAM output command take effect only if the Keep' flipflop is already true (i.e., set to zero by a previous RAM output command).

Note: in the case of the Mixer, the RAM address is loaded from IOB[4:14] and a Hi/Lo Select bit is loaded from IOB[15]. The latter bit determines which 12 bits of the 24-bit wide mixer word will be loaded by the next Write'. The Hi/Lo Select bit behaves as a low-order extension of the Mixer address counter, so successive Write' commands will alternate between the halves of one mixer word before advancing to the next.

The MiniMixer is loaded by a single output instruction that specifies both the address and data to be loaded. During the command pulse from t3 to t5 of the Output_B instruction, the video channel address to the MiniMixer is replaced by the address being loaded, so if the video channel is active, garbage may appear at the output during this cycle.

The 16-word x 12-bit NLCB is also loaded by single output instructions that specify both the address and data. For the NLCB, output instructions are only effective when HWindow is not occurring; during HWindow the RAM address is supplied by a counter that successively copies the NLCB words into CLCB. The format of each of the words in NLCB is shown in Figure 15.
Note that any NLCB output operation will dismiss the wakeup request for DHT, and DHT must not block any sooner than the fourth instruction after the first NLCB output operation is issued.

The Statics output command is used for debugging and initialization. Two bits in the Statics register called DHTShutUp and DWTShutUp are discussed in the "DDC Initialization Requirements" section below. Three other fields called FakePClk, UseFakePClk, and MufAddr are used for debugging. When UseFakePClk is true, the regular pixel clock is degated; if FakePClk is true, then a pixel clock will occur at t5 of the Statics output command; otherwise no clock occurs. Every Statics command also loads the hardware signal addressed by MufAddr into a flipflop (at t5) which can be read by the Status input command discussed below. In combination, the fake pixel clock and muffler readout features allow diagnostic firmware to check out most of the internal data paths in the DDC: by simulating a very slow pixel clock and "stepping" the DDC through various states, the diagnostic can check nearly all of the data paths between fake pixel clocks. The hardware signals selected by MufAddr[5:11] are given in the table below.

Table 25: DDC Muffler Signals

MufAddr   Signal              MufAddr   Signal
0         ACurrentWCBFlag     70        AFifoFull
01:07     AReaderPtr[1:7]     71        BFifoFull
10        ANextWCBFlag        72        ASize8
11:17     AWriterPtr[1:7]     73        ASize8-4
20        BCurrentWCBFlag     74        ASize8-4-2
21:27     BReaderPtr[1:7]     75        BSize8
30        BNextWCBFlag        76        BSize8-4
31:37     BWriterPtr[1:7]     77        BSize8-4-2
40:47     AItem[0:7]          100       AOn
50:57     BItem[0:7]          101       BOn
60:63     AServicePtr[1:4]    102:103   ARes[0:1]
64:67     BServicePtr[1:4]    104:105   BRes[0:1]
                              106       MonitorType

Muffler 106 (MonitorType) is the only one of interest during normal operation.
It identifies the type of monitor connected via the 7-wire interface: zero denotes an Alto-style monitor; one denotes an LF (large format) monitor.

A single input device called Status is implemented. It is used to return the currently selected muffler bit and the seven-wire interface received data bit.

The MapInLo and MapInHi input devices read the current values output from the color maps (Mixer, BMap, and CMap, whichever are active). When the color maps are controlled by the video system, these outputs change too rapidly for reading them to be useful (unless the DDC is being single-stepped by means of UseFakePixelClk). However, when the color maps are controlled by the Dorado, this input device can be used to read out the color map entries addressed by their respective RAM address registers.

MapInHi[0] is the 7-wire terminal input bit for the independent terminal interface on DispM; its position corresponds to Status[0] on DispY (see below). MapInHi[1] is a constant 1 if a DispM board is installed; if DispM is not installed, an Input from the nonexistent register yields a zero value. This enables firmware to detect the presence or absence of a DispM board.
MapInLo[0:3] are a 4-bit color monitor type jumpered on the Dorado backpanel.

Note: the MapInLo and MapInHi input devices do not generate IOB parity, so they must be read by the Pd_InputNoPE function to disable parity checking.

DispM Terminal Interface

The independent terminal interface on the DispM board functions much the same as a single-channel DispY board, but is specialized to driving a binary monitor via a 7-wire interface. The data path is one bit/pixel; the resolution is full; there is no MiniMixer; and the horizontal waveforms are fixed by a PRom (which must be changed when a different type of 7-wire terminal is installed).

Aside from these limitations, the DispM terminal interface operates almost identically to the A channel of DispY. In particular, the io addresses are grouped parallel to the ones on DispY, and the data formats are identical; so a microprogram can initialize TIOA to the correct group and subsequently use the function that changes only TIOA[5:7] to select registers within that group. This enables practically all the microcode for driving a 7-wire terminal to be shared between DispY and DispM.

In Figure 15, the DispY io operations that are also defined for DispM are marked with an asterisk. Note that outputs to unused NLCB addresses are ignored.

Due to hardware differences between DispY and DispM, the ALeftMargin and CursorX values must be computed slightly differently. For DispM driving the 7-wire interface, ALeftMargin = (B+L130) and CursorX = (B+X+190).

Note: DispM does not have a muffler system. In particular, the MonitorType muffler value is always read from DispY. By convention, this refers to the type of 7-wire terminal attached to the Dorado, whether that terminal is connected to DispY or to DispM.
Also by convention, the 7-wire terminal is always connected to DispM if DispM is installed.

DDC Initialization Requirements

The two low-order bits in the Statics register are called DWTShutUp and DHTShutUp. They are forced true by IOReset and prevent the respective task wakeups from happening. They are individually set or cleared by the Statics output command. In addition, IOReset sets the signal DoradoHasHRam; this will prevent horizontal sync from being sent to monitors until the HRam has been loaded and released by firmware. Blanking is sent to monitors as long as DHTShutUp remains true. It is anticipated that DHTShutUp will be left true until all DDC initialization has been completed by the emulator (or by the DHT running in response to a Notify).

Some other initialization requirements are as follows: aLeftMargin should be loaded with a large negative value in case one of the channels remains unused forever; the Cursor in NLCB should be zeroed in case the cursor is completely off-screen forever; HRam must be loaded with monitor-specific waveforms; the pixel clock rate must be set; mixer modes must be set; the MiniMixer must be loaded.
In addition, the DHT must explicitly set the aAddress registers to zero on behalf of the DWT, which cannot initialize itself completely for each subtask.

Speed and Resolution Limits

High performance color monitors are typified by the following performance limits:

  22 µs    horizontal scan time
  5 µs     horizontal blanking time
  800 µs   vertical blanking time

Parameters for a particular monitor can be modified slightly through hardware adjustments, but cannot be controlled by the DDC, which must provide control signals with timing appropriate for the monitor. Consequently, a monitor must be chosen that conforms to the speed limitations of the DDC.

One important speed limitation is how fast bits can be moved from storage through the DDC. This limit is derived using the following parameters:

F   Frame update rate. High speed phosphors require a minimum update rate of 30 frames/sec with interlaced operation for reasonable visual effects; this is marginal and faster update is desirable.
S   Scan lines/frame.
VR  Vertical retrace time; with interlaced operation, there will be two vertical retraces/frame.
HB  Horizontal blanking time.
HS  Horizontal scan time.
    The FIFO must not go empty during the horizontal scan or garbage will be displayed.
T   Time/munch or the rate at which storage can deliver data for IOFetches; this is 1 munch/8 cycles = 1 munch/0.4 µs.
M   Munches/scan line that the fast io system can deliver.

The time required to fill the FIFO for both channels is a little longer than 30*8+20 cycles (= 276 cycles) or about 13.8 µs at a Dorado clock period of 25 ns; this follows from the fact that there are 15 munches/channel or a total of 30 munches of FIFO storage, and the fast io system can deliver one munch per 8 cycles with the first munch arriving 20 cycles after the first IOFetch_. 13.8 µs is much smaller than the vertical blanking time and longer than the horizontal blanking time, so the FIFO will start out full at the beginning of a field and will be actively refilling itself during HS+HB of each scan line. If the memory system keeps up with the demands of the video channels, then the FIFO will tend to refill itself after momentary transients in which it empties out a little.

Consequently, we know that HS+HB = (1/F - 2*VR)/S, and that M = (HS+HB)/T less corrections for refresh references, storage references by other tasks, hold, and delays for tasks of higher priority than DWT.
At F = 30 frames/sec, VR = 800 µs, and S = 1000 scan lines, we get HS+HB = 31.7 µs and M = 31.7/0.4 = 79 munches less corrections. There will be an average of two refresh references/scan line, so we get an upper bound of 77 munches = 19,712 bits/scan line from storage.

However, the DWT will not get all storage bandwidth. The DWT wakeup spacing is controlled by a SIP; the smallest reasonable spacing would result in one IOFetch every 8 cycles; closer spacing would result in hold while a preceding IOFetch completed, so more processor cycles would be consumed without improving data rate. At this tightest spacing, DWT runs for 2 cycles out of every 8. Conceivably, worst case memory activity discussed in the "Fast IO" chapter could occur during these 6 cycles (a clean miss 3 cycles before the IOFetch, followed by a dirty miss 2 cycles before the IOFetch, each by a different task). However, the large amount of storage in the FIFO allows us to rely upon statistics to average out memory competition, so it is probably reasonable to allow DWT at least 80% of storage bandwidth or about 16,000 bits/scan line in the above example, which would accommodate 1000 lines x 1000 pixels/line x 16 bits/pixel. For HB = 5 µs this is equivalent to a pixel clock period of 26.7 ns.

This is only one speed limitation. Since the 32-bit wide FIFO is accessed once/cycle alternately by the A and B channels (i.e., 16 bits/cycle/channel), and since exactly three doublewords are fetched before the horizontal scan begins for each channel, the maximum bits/scan line for each channel is about (3*32 bits)+[(26.7 ns/pixel)*(16 bits/50 ns)*(1000 pixels/line)] = 8640 bits/scan line. This means that unless both channels are running at the same data rate, the data rate will be significantly below the upper bound determined above.
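The arithmetic behind these limits can be reproduced with a short script. It is only a worked check of the figures quoted above: the function name `storage_limits` is hypothetical, the line time is computed as (1/F - 2*VR)/S, which is the form that yields the quoted 31.7 µs, and the munch size of 256 bits is inferred from "77 munches = 19,712 bits" rather than stated in this section.

```python
def storage_limits(F=30, S=1000, VR=800e-6, HB=5e-6, T=0.4e-6,
                   pixels_per_line=1000, refresh_refs=2):
    """Recompute the DWT storage-bandwidth estimates from the text's parameters."""
    bits_per_munch = 19712 // 77                   # 256 bits, inferred from the text
    line_time = (1.0 / F - 2 * VR) / S             # HS+HB, in seconds
    munches = int(line_time / T)                   # whole munches per scan line
    usable = munches - refresh_refs                # less refresh references per line
    bits_per_line = usable * bits_per_munch        # upper bound from storage
    dwt_bits = 0.8 * bits_per_line                 # DWT allowed about 80% of bandwidth
    # Pixel clock period in ns, rounded to one decimal as in the text.
    pixel_ns = round((line_time - HB) / pixels_per_line * 1e9, 1)
    # Per-channel limit: three doublewords prefetched, then 16 bits every 50 ns.
    per_channel = 3 * 32 + pixel_ns * (16 / 50) * pixels_per_line
    return {
        "line_time_us": line_time * 1e6,
        "munches": munches,
        "bits_per_line": bits_per_line,
        "dwt_bits": dwt_bits,
        "pixel_ns": pixel_ns,
        "per_channel_bits": per_channel,
    }


limits = storage_limits()
```

With the default arguments the script gives a line time of about 31.7 µs, 79 munches before corrections, 19,712 bits/scan line after the refresh correction, a pixel clock period of 26.7 ns, and a per-channel limit of about 8640 bits/scan line. The 80% figure comes out a little under the "about 16,000" quoted in the text, which is a rounded value.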
For example, in 24Bit mode, if the A channel runs at full resolution and gets 8640 bits/scan line, the B channel will run at half resolution and get only 4320 bits/scan line, so the maximum data rate would be about 1000 lines x 538 pixels/line x 24 bits/pixel.

Ethernet Controller

An Ethernet is the principal means of communication between a Dorado and the outside world. An Ethernet is a broadcast multi-access packet switched network which can connect up to 256 stations separated by as much as 1 kilometer with a 3 MHz channel. The 'Ether' is a passive coaxial cable to which each station is connected through a transceiver that is high-impedance when receiving, low impedance when driving.

Readers unfamiliar with the general concepts behind the Ethernet should refer to "Ethernet: Distributed Packet Switching for Local Computer Networks," by R. M. Metcalfe and D. R. Boggs, CACM, 19(7):395-404, July 1976; or to Design and Performance of Local Computer Networks, by John Shoch, published by University Microfilms, August 1979.

Read this chapter with Figure 16 in view.

Ethernet Packets

Ethernet data are encoded in packets. Packets are preceded by a low signal (i.e., silence) on the Ether; they begin with a one-bit prefixed by the transmitter, called the start bit. Bits in the packet are phase encoded, where the bit cell time is nominally 340 ns; phase encoded signals have one data transition per bit cell and its direction (low-to-high = 1) is the value of the bit.
Midway between these there may be a setup transition, so that the next data transition can be in the correct direction.

Packets end when no transitions are detected for more than 1.5 bit times and the Ether is low. Collisions are transmissions that overlap in time and cause malformed and undecodable bits. Transmitters jam the Ether with a continuous high for several bit times after participating in a collision. Collisions are of four types: too many transitions, in which two transitions occur within .25 bit times; too few transitions, in which a transition occurs between 1.25 and 1.5 bit times after the last one; end-of-packet (EOP), in which no transitions occur for more than 1.5 bit times and the Ether is low; and jam, which is the same as EOP except that the Ether is high.

In a well-formed packet that does not experience a collision, the start bit is immediately followed by an 8-bit destination host number, then an 8-bit source host number. This is followed by an indefinite number of 16-bit data words, a 16-bit checksum, and finally silence.

Even when transmitted without a source-detected collision, a packet may fail to reach its destination; packets are delivered only with high probability. Stations requiring a lower residual error rate must follow mutually agreed upon communication protocols.

When the sender of a packet detects a collision, some method is needed to arbitrate (without communication) its use of the Ether with other stations contending for it. The algorithm used on the Ethernet, called the 'binary exponential backoff collision algorithm,' is discussed in the above references. It involves waiting a random interval and then reattempting transmission.
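This manual defers the details of the backoff algorithm to the references. As a rough illustration only, one common formulation doubles the range of the random wait after each successive collision. The sketch below assumes that formulation; the function name `backoff_ticks`, the cap on doublings, and the retry limit are hypothetical and are not taken from this manual. The 16 µs tick is the Dorado countdown grain given in the Transmitter section of this chapter.

```python
import random

TICK_US = 16          # Dorado countdown grain (Pendulum period), from this chapter
MAX_DOUBLINGS = 8     # hypothetical cap on the size of the random range
MAX_ATTEMPTS = 16     # hypothetical retry limit before the packet is abandoned


def backoff_ticks(collisions, rng):
    """Return a random wait, in ticks, after `collisions` successive collisions."""
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    if collisions > MAX_ATTEMPTS:
        raise RuntimeError("too many collisions; abandon the packet")
    exponent = min(collisions, MAX_DOUBLINGS)
    # Uniform choice in 0 .. 2**exponent - 1, so the mean wait doubles each time
    # until the cap is reached.
    return rng.randrange(2 ** exponent)


def backoff_microseconds(collisions, rng):
    """Convert the tick count to microseconds using the 16 us grain."""
    return backoff_ticks(collisions, rng) * TICK_US
```

A caller would pass its own generator, for example `backoff_microseconds(3, random.Random())`, which in this sketch returns a multiple of 16 µs between 0 and 112 µs.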
The (ideal) distribution of the random intervals depends upon many factors.

Remarks

From the method of collision detection, it follows that in a noise free Ether with ideal transmitters and receivers, a bit cell time between 0.75*T and 1.25*T, where T is the nominal bit cell time (340 ns), can be decoded correctly.

Phase encoding has the undesirable property that only 50% of the transmission medium's theoretical bandwidth is utilized. A number of reasonably simple encodings are known that more nearly approach the theoretical limit, though phase encoding is simple to implement. If at some time we were willing to abandon compatibility with the existing Ethernet, we should reconsider the use of phase encoding.

A promising alternative to phase encoding is bit-stuffing, which averages 67%, 86%, or 93% of theoretical bandwidth for 0th, 1st, and 2nd order codes. This encoding outputs data bits in a cell time equal to 1/2 of the phase-encoded cell time; when 1 (0th order), 2 (1st order), or 3 (2nd order) data bits have been output without a transition, then a non-data transition is inserted into the bit stream. The 1st order encoding (86%) could be implemented with a few changes to the current controller.

Controller Overview

The Ethernet controller is a slow IO device packaged with the disk controller on the DskEth logic board.
These two devices require more edge pins than are available in an MSA-IO slot, so the board must be mounted in a Fast IO slot (see Figure 2).

It would be possible to package two Ethernet controllers on one logic board using different task and TIOA assignments for each. This might be appropriate if Dorados are ever used as Ethernet gateways.

A cable connects the controller to a transceiver outside the Dorado enclosure; this transceiver is almost identical to the ones used for Altos and other computers, the difference being that it uses +12 volts rather than +15. Dorado transceivers are painted bright red and have large block lettering saying "Dorado only". Plugging in the wrong type of transceiver will not damage anything; it just won't work. The cable between the controller and the transceiver contains twisted-pair signals for receiver data, transmitter data, collision, +5 v, and +12 v.

The controller has independent transmitter and receiver sections. Because these two sections are completely independent, the Dorado can receive its own transmissions. This is an important aid in hardware and software debugging and simplifies the device driver, which need not check for sending to itself. Furthermore, the receiver can receive consecutive packets separated by the minimum inter-packet spacing (510 ns). This means that the Dorado can receive, without loss, streams of packets directed to it by multiple hosts and packets that immediately follow broadcasts. This capability is important for servers and other high-performance applications.

The controller uses two tasks, one for the transmitter (EOT for Ethernet Output Task) and one for the receiver (EIT for Ethernet Input Task). The receiver task is higher priority. To permit two instruction/wakeup loops, a wakeup request is removed whenever the Next bus says the task is about to run. This simple strategy can be fooled into removing a request when NextLies occurs, but this is harmless since the required service rate is low.
To avoid a spurious wakeup, a wakeup is not requested again until after the task has blocked. A debugging control bit can be set which prevents wakeups even when all other conditions are satisfied.

The transmitter and receiver each have 16-word x 20-bit Fifos. The bits are 16 data + 2 parity + 2 spare (the receiver uses one of the spare bits). Each Fifo has read and write pointers, multiplexed into the address inputs of the storage chips, to select the next location to be read or written; these pointers are zeroed by IOReset. A Fifo is empty when the pointers are equal and full when (WritePtr+1) mod 16 equals ReadPtr. There are bus registers between the Fifos and IOB. Service requests from the Ether side of a Fifo are given priority. The Fifos are synchronous to t1.

The basic clock for transmitting and receiving data from the Ether, called EtherClk, originates from a 23.5 MHz crystal oscillator (i.e., the period is 42.5 ns or 1/8 of the 340 ns bit cell time). The memory system's Pendulum clock (period 16 µs) is also used to time retransmissions after a collision, as discussed later.

The receiver runs continually; its phase decoder (PD) samples the Ether every EtherClk; a finite state machine (FSM) driven by the samples detects the presence or absence of packets on the Ether, zero/one transitions, and collisions.
Another FSM accumulates the status of the packet and controls a shift register that assembles 16-bit words from the incoming data. Words in the shift register are written into the receiver's Fifo together with odd parity on each byte; the status is written into the Fifo after the last word of each packet and marked to distinguish it from data words. This allows the receiver to handle back-to-back packets; firmware decides what to do with each packet as it is read from the Fifo. EtherClk is used for receiver stages through the shift register; data in the shift register is synchronized to the Dorado system clock as it is written into the Fifo.

When the transmitter is turned on, it attempts to send one packet and then must be restarted by firmware. The EOT fills the Fifo; the transmitter FSM loads the shift register from the Fifo and supplies a serial bit stream to the phase encoder (PE). Transmitter status is read directly from the controller status registers (unlike receiver status, which travels through the data path). Data is synchronized to EtherClk between the output of the shift register and the input of the PE. A collision may be detected by either the transceiver or the PD. The occurrence of a collision is captured, synchronized, and used to abort the outgoing packet after jamming the Ether briefly.

The controller has a number of features to help debugging. All of the interesting internal state is available via the IOB and the muffler system. The transceiver can be disconnected and PE output internally connected to PD input under firmware control. Task wakeups can be disabled, permitting the controller to be driven entirely from emulator-level software. The internal clock can be single-stepped.
These features permit the construction of a simulation program which compares its predictions with what the controller is actually doing.

Receiver

Most of the receiver runs continuously, tracking traffic on the Ether. The PD reports what it sees to the receiver FSM, which assembles packets in the shift register and buffers them in the Fifo. As words emerge from the Fifo into the bus register, they are either discarded or generate a wakeup request under control of the wakeup logic. Following the last data word of each packet as it travels through the Fifo are the CRC word and a status word. IOAtten branches when a status word is present in the receiver bus register. Data and status are synchronized to the Dorado clock between the output of the shift register and the input of the Fifo.

The peculiar placement of status bits in Figure 16 eases emulation of the Alto Ethernet controller.

The PD is a FSM which takes in raw phase-encoded serial data and produces phase decoder events and carrier. Phase decoder events are 'saw a zero bit', 'saw a one bit', and 'saw a malformed bit'. Carrier indicates that the PD is seeing transitions on the Ether (i.e. the Ether is in use). Since the PD is completely digital, it can be single-stepped for debugging. Receiver collision detection, a by-product of this decoding technique, works as well as transceiver collision detection.

The receiver control is another FSM that takes in PD output and produces control and status signals.
RxSRCtrl controls the shift register and the bit counter. The bit counter decrements when a data bit is shifted into the shift register and resets to -1 when the status is parallel loaded into the shift register. RxSRFull' is low when the next shift will make the register full. RxEOP travels in parallel with each Fifo word and is true if the word is an ending status word. This bit is called EthData.18 when it is in the bus register where it can be tested with IOAtten.

Writing data or status from the shift register into the Fifo has priority over loading the bus register from the Fifo. Byte parity is computed at the shift register output and travels with the data through the Fifo and the bus register, down IOB and into the processor where it is checked.

The optimum point at which to synchronize received data with the Dorado clock system would be at the input to the PD, where there is only one signal to synchronize, except that this would make proper operation of the PD depend upon the Dorado clock period. The next best sync point is the PD output where the number of signals has only grown to three. The problem here is that the PD can produce events faster than they can be synchronized to the Dorado clock without buffering. Consequently, synchronization takes place after the shift register where the number of signals exceeds 20. This is not as unfortunate as it seems because status and data use the same paths and can share a single synchronizer, RxSRDump, which produces RxFifoWE' each time RxFSM pulses RxSync'. This leaves only RxCollision and PDCarrier which must be synchronized for the transmitter. RxCollision shares a synchronizer with XcCollision, and PDCarrier's is a simple level synchronizer.

A receiver data-late occurs when the receiver FSM requests a Fifo write and the Fifo is full. In this case the write does not happen and the data is lost. RxDataLate is cleared after an end-of-packet status word is successfully written into the Fifo.
This status has the data late error bit set so that the EIT is notified that the preceding packet was bad.

EIT wakeup requests occur when the bus register contains an interesting word (provided that the EIT is currently blocked, as discussed earlier). Words are interesting if they emerge from the Fifo into the bus register while RxOn and RxBOP are true and NoWakeups is false. RxBOP is set after the status word for a packet is discarded, so that the next word out of the Fifo (presumably the first word of the next packet) can generate a wakeup. It is reset by the EIT to discard the remaining words of a rejected packet (usually because the address didn't match). The receiver may be reset at any time by clearing RxOn. No more wakeups are generated and every word is discarded as it emerges from the Fifo. When RxOn is next set, the receiver will continue to discard words until it has discarded a status word. It will then set RxBOP, and the next word (first word of the first packet after turning on the receiver) will cause a wakeup.

Transmitter

When the transmitter is turned on, it attempts to send one packet and then must be restarted by firmware. At the request of the wakeup logic, the EOT fills the Fifo using Output_B to the bus register. The transmitter FSM loads the shift register from the Fifo and supplies a serial bit stream to the PE.
Transmitter status is read directly from the controller status registers (unlike receiver status, which travels through the data path). Data is synchronized to the Ether clock between shift register output and PE input.

EOT wakeups occur when the bus register is empty, TxOn is true, and TxEOP, TxCntDwn, and NoWakeups are false (provided that EOT is blocked, as discussed earlier). After delivering the last word of a packet, EOT wakeups are disabled by setting TxEOP. While counting down a collision retransmission interval, firmware can disable wakeups until the next tick of Pendulum by setting TxCntDwn. The transmitter may be reset at any time by clearing TxOn, which stops wakeup requests and shuts down the PE within 2 bit times.

The binary exponential backoff collision algorithm must be implemented in firmware. The controller merely provides a way to generate a wakeup on the next rising edge of Pendulum, making the grain size of countdown intervals 16 µs for the Dorado (compared to 38 µs for Altos and Novas). Note that setting TxCntDwn prevents a wakeup; for one to actually occur when Pendulum clears it, the bus register must be empty and TxEOP must be false. Pendulum is considered to be a foreign signal so it is synchronized before being applied to the reset input of TxCntDwn.

Loading the shift register from the Fifo has priority over writing into the Fifo from the bus register. Byte parity is computed in the processor and travels with the data down IOB into the bus register, and through the Fifo to the shift register where it is checked.

The transmitter control is a FSM which takes in start, end, and abort signals and produces control signals. TxSRCtrl controls the shift register and bit counter. The bit counter decrements when a data bit is shifted into the shift register and resets to -1 when the next word is parallel loaded into the shift register. TxSREmpty' is low when the next shift will make the register empty.
TxData wire-or's the start bit at the beginning of each packet. TxGone clears TxEOP to cause a wakeup at the end of each packet. The transmitter starts when the Fifo is full or, if the packet is less than 15 words long, when TxEOP is true. The transmitter ends normally when the Fifo is empty and TxEOP is true. The transmitter aborts when a collision, Fifo parity error or data late occurs. TxAbort can be tested with IOAtten.

A transmitter data late occurs when the TxFSM requests a Fifo read and the Fifo is empty but TxEOP is false. The PE sends one random bit and then stops. The resulting packet has an illegal length and probably a bad CRC.

The PE inverts and latches TxData at the start of each bit cell and inverts the latched value 1/2 bit time later. TxGo, synchronized to the beginning of a bit cell, enables the PE. The PE assumes that a data bit is available long before it is needed and acknowledges each bit after latching it by generating TxGotBit.

A collision may be detected by either the transceiver or PD. The occurrence of a collision is captured, synchronized, and used to abort the outgoing packet. The output of the first stage of the TxCollision synchronizer is wire-or'ed with PD output to jam the Ether after a collision.
The jam lasts for one or two bit times, being the delay through the TxCollision synchronizer, TxFSM, and TxGo synchronizer.

Clocks

The controller needs a clock with a nominal frequency of eight times the Ether bit rate. The SingleStep control bit selects either the 23.53 MHz crystal oscillator or single Dorado clocks injected under program control. The clocks for the Ether-synchronous parts of the controller are constructed from this basic clock.

The slowest Dorado clock period at which the transmitter works is 42.5 ns. Disabling the Dorado system clocks while TxOn is true causes a transmitter data late. If TxGo is true, the packet is chopped off, causing an incomplete transmission and probably a runt bit. When the clock is reenabled, the PE sends a few fragmentary bits and then the data late aborts the packet.

The slowest Dorado clock period at which the receiver works is 85 ns. Disabling the Dorado system clocks causes a receiver data late. The next packet that arrives after the clock is reenabled reports the data late.

Task Wakeups

The controller is designed for two completely independent tasks, with the receiver higher priority. Two IOAs select data and status/control registers. IOAtten may be tested to decide whether a wakeup request is just for another word or something special (ending status for the receiver, or PE aborted for the transmitter).

Task wakeups must, on the average, be serviced within 5.44 µs.
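The service-time figure follows from the clock description above: at eight oscillator clocks per Ether bit, a 23.53 MHz oscillator gives a bit rate of about 2.94 Mbit/s, so one 16-bit word occupies roughly 5.44 microseconds on the wire. The following Python sketch only works through that arithmetic. The function and constant names are hypothetical (they are not Dorado signals), and the 17-word buffer depth is the figure this section gives for each direction (bus register + 15 Fifo words + shift register).

```python
# Illustrative arithmetic only; names are assumptions, not hardware signals.
OSC_HZ = 23.53e6        # crystal oscillator frequency quoted in the text
CLOCKS_PER_BIT = 8      # controller clock is eight times the Ether bit rate
BITS_PER_WORD = 16
BUFFER_WORDS = 17       # bus register + 15 Fifo words + shift register


def bit_rate_hz(osc_hz: float = OSC_HZ) -> float:
    """Ether bit rate implied by the oscillator frequency."""
    return osc_hz / CLOCKS_PER_BIT


def word_time_us(osc_hz: float = OSC_HZ) -> float:
    """Time for one 16-bit word to pass on the Ether, in microseconds."""
    return BITS_PER_WORD / bit_rate_hz(osc_hz) * 1e6


def latency_budget_us(words: int = BUFFER_WORDS, osc_hz: float = OSC_HZ) -> float:
    """Accumulated service delay that `words` of buffering can absorb."""
    return words * word_time_us(osc_hz)


if __name__ == "__main__":
    print(f"word time      : {word_time_us():.2f} us")
    print(f"17-word budget : {latency_budget_us():.1f} us")
```

Under these assumptions the buffering absorbs a little over 92 microseconds of accumulated delay, which is consistent with the "about 90" tolerance quoted for this controller.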
The transmitter and receiver each have 17 words of buffering (bus register + 15 Fifo + shift register) so the variance can be quite large: accumulated delay of up to about 90 µs is tolerable, while longer delay will cause a data late error.

Muffler Input

All muffled signals on the DskEth board are accessible to Dorado firmware. The method by which a particular signal is selected and read out is discussed in the "Muffler Input" section of the "Disk Controller" chapter. Signal addresses 120₈ to 177₈ for the Ethernet controller are enumerated below. Unless it is obvious, signals which are specific to the receiver or transmitter have Rx or Tx respectively somewhere in their names.

Table 26: Ethernet Muffler Signals

Word   Bit      Name              Meaning
ERX0   120      PDNew             1/8 bit time sample of PD input signal
       121      PDOld             PDNew delayed one sample time
       122:125  PDCnt[0:3]        Number of samples since last data transition
       126      PDCntCtrl         Increments or clears PDCnt
       127      ReportCollisions  Control register bit that enables PD collision reporting
       130      RxBOP             "Beginning Of Packet" enables receiver data wakeups
       131      EthData.18        Marks status word terminating a packet
       132
       133      RxCRCError        Output of receiver CRC checker
       134      RxDataLate        Receiver Fifo overflowed
       135      RxBusRegFull      Word in BusReg can be read with Pd_Input
       136      RxFifoFull        Receiver Fifo is full
       137      RxFifoEmpty       Receiver Fifo is empty
ETX    140:142  TxState[0:2]      State of transmitter FSM
       143      TxEOP             Transmitter data wakeups are disabled
       144      TxBusRegFull'     Word is waiting to be written into the transmitter Fifo
       145      TxGone            Transmitter FSM is shut down
       146      TxSREmpty'        Transmitter shift register is empty
       147      TxCntDwn'         Transmitter wakeups disabled until next pendulum clock
       150      TxCRCEnbl         Shift/compute control for transmitter CRC
       151      TxGo              Enable PE
       152      TxData            Serial data input to PE
       153:154  TxSRCtrl[0:1]     Transmitter shift register control
       155      PEOutput          Phase Encoder (PE) output
       156      TxFifoFull        Transmitter Fifo is full
       157      TxFifoEmpty       Transmitter Fifo is empty
ERX1   160:162  RxState[0:2]      State of receiver FSM
       163      RxCollision       Receiver-detected collision
       164      PDCarrier         The Ether is in use
       165:166  PDEvent[0:1]      PD output (no event, collision, 0, and 1)
       167      RxSRFull'         Receiver shift register is full
       170      RxEOP             Marks status word terminating a packet
       171      RxSync'           True for one cycle triggering write of SR into Fifo
       172      RxIncTrans        Receiver incomplete transmission
       173      RxCRCReset        Resets receiver CRC chip
       174      RxCRCClk          Clocks receiver CRC chip
       175      RxData            Serial data output from RxFSM
       176:177  RxSRCtrl[0:1]     Receiver shift register control

IOB Registers

A TIOA value of 15₈ selects the IOB registers (called EthD). The transmitter bus register is loaded by Output_B and the receiver bus register is read with Pd_Input.
At end-of-packet, after the last data word, the receiver delivers first the CRC word and then a status word containing the following bits:

RxCollision       Receiver-detected collision occurred (can happen only if ReportCollisions has been set in the control word).
RxDataLate        Receiver data-late occurred; one or more words of the last packet were lost.
RxCRCError        CRC was incorrect in last packet.
RxIncTrans        Last packet did not end on a word boundary.

Control Register

A TIOA value of 16₈ selects either the (write-only) control register (EthC), discussed here, or the (read-only) status register (also called EthC), discussed in the next section. The control register has three fields: transmitter, receiver, and test. Bits in a field are decoded only if the command-enable bit for the field is true. Control bits with a single quote as their last character are true when zero.

TxCmdEnbl'        enables decoding of transmitter commands.
TxOn              enables the transmitter. The transmitter may be reset at any time by clearing this bit. Cleared by IOReset.
TxEOP             disables transmitter wakeups. EOT sets this bit after outputting the last word of a packet. It is cleared by the controller when the PE shuts down after an abort or normal end. Cleared by TxOn=0.
TxCntDwn          disables transmitter wakeups. Set by EOT to time a retransmission interval after a collision; cleared by the controller when the next rising edge of Pendulum occurs (period = 16 µs). N.B. the binary exponential backoff is done by firmware. Cleared by TxOn=0.
RxCmdEnbl'        enables decoding of receiver commands.
RxOn              enables the receiver, which may be turned off at any time by clearing this bit. Cleared by IOReset.
RxBOP'            disables receiver wakeups. Cleared by EIT to discard the currently arriving packet; set by the controller when the first word of the next packet is available. Cleared by RxOn=0.
TestCmdEnbl'      enables decoding of test commands.
LoopBack          disconnects the transceiver, loops PE output to PD input, and enables TestColl'. Cleared by IOReset.
SingleStep        disables the 23.53 MHz oscillator. Changing this bit can produce a runt clock. Reset the transmitter first and expect an occasional bad receiver status. Cleared by IOReset.
NoWakeups         disables all controller wakeups. Cleared by IOReset.
TestClock         injects a single Dorado clock pulse (t3 of the Output instruction) into the EtherClk logic. SingleStep must already be set.
TestColl'         injects a single Dorado clock pulse (t3 of the Output instruction) into the collision synchronizer. LoopBack must already be set.
TestData          wire ORs with PD input. LoopBack must already be set and TxOn must already be false. Do not issue TestClock in an instruction that changes TestData. Cleared by IOReset.
ReportCollisions  allows the PD to report malformed bits as collisions. Cleared by IOReset.

Status Register

A TIOA value of 16₈ also selects the (read-only) status register. The bits in this register are the most interesting to the microcode.
Less interesting state is available from the mufflers.

Host Addr         the host address set by pullups on the backplane.
RxOn              the receiver is enabled.
TxOn              the transmitter is enabled.
LoopBack          the interface is looped back.
TxColl            the current output packet was aborted by a collision.
NoWakeups         all wakeups are disabled.
TxDataLate        the current output packet was aborted by a data late.
SingleStep        the 23.53 MHz oscillator is disabled.
TxFifoPE          the current output packet was aborted by a parity error.

Other IO and Event Counters

In addition to the disk, ethernet, and display controllers discussed in earlier chapters, Dorado contains a general input/output interface and a junk task wakeup located on the IFU board; the two registers used in this interface may alternatively be used as event counters in performance monitoring, and that use is also discussed here.

Since the IFU board is not interfaced to the IOB, it cannot use the slow io system to control these features, so functions are used instead.

Junk Task Wakeup

The IFU board contains a circuit which wakes up the junk task (task 1) every 32 µs. The wakeup is dismissed by the AckJunkTW_B function; this function interprets B[15] as follows: a 1 enables wakeups; a 0 disables them; B[0:14] are ignored.
The junk task can dismiss the wakeup by doing IFUTest_B with any value on B (but B[15] must be 0 to reenable the wakeup at the next 32 µs tick).

Junk task microcode will, among other things, maintain a Real Time clock.

General IO

A 16-bit register called GenIn (synonym EventCntA) is used for general input; it can be read with the B_GenIn (synonym B_EventCntA) function but cannot be written by firmware. When used for general input, GenIn is written with information that is TTL-to-ECL converted from the backpanel.

A 16-bit register called GenOut (synonym EventCntB) is used for general output; it can be either read with the B_GenOut (synonym B_EventCntB) function or written with the GenOut_B (synonym EventCntB_B) function. GenOut is connected to the backpanel through ECL-to-TTL converters.

The plan is that devices such as Diablo printers can be connected to the GenIn and/or GenOut signals via backpanel connectors.

The choice of using one of these registers for general io or for event counting is determined by the InsSetOrEvent_B function discussed below.

Event Counters

The GenIn and GenOut registers can alternatively be used as event counters. They cannot, of course, be used simultaneously for general io.
The registers are set up for either io or event counting by the InsSetOrEvent_B function, where B[0:15] are interpreted as follows:

If B[0] is 1, then InsSet[0:1] are loaded as discussed in the "Instruction Fetch Unit" chapter.
If B[0] is 0, then the general io/event counters are set up as follows:

   B[4] enables counting of EventCntA
   B[5] enables counting of EventCntB
   B[8:10] select the event type to be counted by EventCntA as follows:
      0     True (i.e., every cycle)
      1     Hold
      2     Processor memory reference (not held)
      3     Good IFUJump (i.e., not held and not an exception)
      4     Miss
      5-7   Backpanel events A, C, and E, respectively
   B[12:14] select the event type to be counted by EventCntB as follows:
      0     True
      1     Hold
      2     Successful IFU memory reference
      3     IFUJump that wasn't ready
      4     Miss
      5-7   Backpanel events B, C, and D, respectively
   B[15] causes the event to be counted for all cycles if 1 or only for emulator or fault task cycles if 0.

To use the event counters, you first stop them counting and read their current values; then you tell them what to count and start them counting and your system running. Note that they never get reset, but just keep counting from wherever they are; it's up to the user to worry about counter turnover.

The expected mode of operation is that the junk task will detect counter overflow and update double or triple-precision vectors in RM that count events; even if the counter is counting once per 60 ns cycle, counter wraparound only occurs every 3.93 ms, so a double-precision vector could count for at least 255 seconds and triple-precision for 228 days.
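As a rough illustration of the overflow tracking just described, the Python sketch below models a free-running 16-bit counter that software extends to a wider count. It is a model only, not Dorado microcode: the class and method names are hypothetical, and it assumes the polling routine runs at least once in each half of the counter's 65536-count cycle (the same requirement a half-cycle flag scheme imposes). The helper `wrap_period_s` reproduces the wraparound arithmetic for a 60 ns cycle.

```python
# Illustrative model; names are assumptions, not Dorado registers or functions.
MASK = 0xFFFF
SIGN = 0x8000
CYCLE_NS = 60  # processor cycle time quoted in the text


def wrap_period_s(bits: int, cycle_ns: int = CYCLE_NS) -> float:
    """Seconds until a `bits`-wide counter wraps when counting every cycle."""
    return (1 << bits) * cycle_ns * 1e-9


class ExtendedCounter:
    """Extends a free-running 16-bit hardware counter in software."""

    def __init__(self, hw_value: int) -> None:
        self.count_hi = 0
        self.count_lo = (-hw_value) & MASK  # minus the initial counter value
        self.flag = 0                       # -1 while the low sum is in its upper half

    def _low(self, hw_value: int) -> int:
        return (self.count_lo + hw_value) & MASK

    def poll(self, hw_value: int) -> None:
        """Periodic update; must run at least once per half counter cycle."""
        if self._low(hw_value) & SIGN:
            self.flag = -1                  # second half of the cycle
        else:
            if self.flag < 0:               # we were in the second half last time,
                self.count_hi += 1          # so the low part has wrapped
            self.flag = 0

    def read(self, hw_value: int) -> int:
        """Return the full count without changing any state."""
        low = self._low(hw_value)
        hi = self.count_hi
        if self.flag < 0 and not (low & SIGN):
            hi += 1                         # wrapped since the last poll
        return (hi << 16) | low


if __name__ == "__main__":
    print(f"16-bit wrap: {wrap_period_s(16) * 1e3:.2f} ms")
    print(f"32-bit wrap: {wrap_period_s(32):.1f} s")
```

The read path applies the pending carry itself, so a reader that runs between a hardware wrap and the next poll still sees a consistent value.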
Sample microcode for maintaining a double-precision counter is given in the example below:

    *The double-precision vector consisting of two RM locations, CountHi and CountLo
    *is initialized such that CountHi eq 0 and CountLo contains minus the value in
    *the event counter, and another RM location called CountFlag is initialized to 0.
    *The microcode below increments CountHi whenever the event counter cycles.
    *At any instant, the high part of the total count is in CountHi and the low part
    *is CountLo+event counter; CountHi has to be incremented by 1 if the counter
    *just overflowed.
    (CountLo) - (EventCntB') - 1;           *CountLo + event counter
    Pd_CountFlag, Branch[.+2,alu>=0];
    CountFlag_T-T-1, Branch[.+3];           *Set CountFlag to -1 in 2nd half of the counter cycle.
    CountFlag_T-T, Branch[.+2,alu>=0];      *Set CountFlag to 0 in 1st half of the counter cycle,
    CountHi_(CountHi)+1;                    *and increment CountHi, if we were in the 2nd half. . .
                                            *of the counter cycle last time.

The microcode for reading the counter when it is updated like this is as follows:

    *Return to caller high part of event count in T, low part in Q.
    TaskingOff, Pd_CountFlag;
    T_(CountLo) - (EventCntB') - 1, Branch[.+3,alu>=0];   *CountLo + event counter = low part of result
    TaskingOn, Branch[.+3,alu<0];                         *Low part ovf iff CountFlag<0 and low sum >=0
    T_(CountHi)+1, Q_T, Return;                           *High part of result = CountHi+1
    TaskingOn;                                            *High part of result = CountHi
    T_CountHi, Q_T, Return;
    . . .

Error Handling

In addition to single-error correction and double-error detection on data from storage, Dorado also generates, stores, and checks parity for a number of internal memories and data paths. The general concepts on handling various kinds of detected failures are as follows:

(1) Failures of the processor or control sections should generally halt Dorado because these sections must be operational before any kind of error analysis or recovery firmware can be effective.

(2) Failures arising from memory and io sections should generally result in a fault task wakeup and be handled by firmware. In some situations, such as map parity errors, it is especially important to report errors this way rather than immediately halting because firmware/software may be able to bypass the hardware affected by the failure and continue normal operation until a convenient time for repair occurs. In other situations, the firmware may be able to diagnose the failure and leave more information for the hardware maintainers before halting.

(3) IFU section failures and memory section failures detected by the IFU should generally be buffered through to the affected IFUJump, then reported via a trap; in this way, if it is possible to recover from the failure, then it will be possible to restart the IFU at the next opcode and continue.

(4) Memories and data paths involving many parts should generally be parity checked.
It is not obvious that this is always a good idea because extra parts in the parity logic will be an additional source of failures, but instantly detecting and localizing a failure seems preferable to continuing computation to an erroneous and undetected result.

(5) When Dorado halts due to a failure, information available on mufflers and in the 16 bits of passively available error status (ESTAT) should localize the cause of the error as precisely as possible.

Since the MECL-10K logic family has a fast 9-input parity ladder component, the hardware uses parity on 8-bit bytes in most places; there is usually insufficient time to compute parity over larger units. IM and MIR, two exceptions, compute parity over the 17 bits of data in each half of an instruction; and the cache address section computes parity over the 15 address bits and WP bit.

Odd parity is used throughout the machine, except that the cache address section and IFUM use even parity. Odd parity means that the number of ones in the data unit, including the parity bit, should be odd if the data is ok.

The control processor (Midas or the baseboard microcomputer) independently enables various kinds of error-halt conditions by executing a manifold operation discussed in the "Dorado Debugging Interface" document.
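The odd-parity convention defined above can be made concrete with a small Python sketch. This is an illustration under stated assumptions, not a description of the parity ladder hardware: the function names are hypothetical, and the 16-bit split into a high byte (bits 0:7) and a low byte (bits 8:15) simply follows the manual's big-endian bit numbering.

```python
# Illustrative sketch; function names are assumptions, not Dorado signals.

def ones(value: int) -> int:
    """Number of one bits in a non-negative integer."""
    return bin(value).count("1")


def odd_parity_bit(data: int, width: int = 8) -> int:
    """Parity bit that makes the total count of ones (data + parity) odd."""
    data &= (1 << width) - 1
    return 1 - (ones(data) % 2)


def odd_parity_ok(data: int, parity: int, width: int = 8) -> bool:
    """True if data plus its stored parity bit contain an odd number of ones."""
    data &= (1 << width) - 1
    return (ones(data) + parity) % 2 == 1


def word_parity(word: int) -> tuple:
    """Per-byte odd parity for a 16-bit word: (high byte, low byte)."""
    return odd_parity_bit(word >> 8), odd_parity_bit(word & 0xFF)


if __name__ == "__main__":
    p = odd_parity_bit(0x5A)
    print(odd_parity_ok(0x5A, p))          # stored byte checks correctly
    print(odd_parity_ok(0x5A ^ 0x04, p))   # a single flipped bit is detected
    print(odd_parity_ok(0x5A ^ 0x06, p))   # two flipped bits go unnoticed
    print(odd_parity_ok(0x00, 0))          # all-zero byte with zero parity is invalid
```

The last case shows why a memory has to be written with valid parity before it is read: under odd parity, a location whose data and parity bits are all zero reports an error.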
The control processor also has to initialize RM, T, the cache address and data sections, the Map, and IFUM to have valid parity before trying to run programs. Reasons for this will be apparent from the discussion below.

When Dorado halts, error indicators in ESTAT indicate the primary reason for the halt, and muffler signals available to the control processor further define the halt condition; ESTAT also shows the halt-enables. Midas will automatically prettyprint a message describing the reasons for an error halt. The exact conditions that cause error halts are detailed in the sections below; the table here shows the ESTAT and muffler information which is relevant.

Table 27: Error-Related Signals

ESTAT       ESTAT        Task
Error Bit   Enable Bit   Experiencing Halt   Related Muffler Signals and Meaning

RAMPE       RAMPEen      Task2Bk             STK, RM, or T parity failure. RmPerr and TmPerr mufflers
                                             on each processor board indicate which byte of RM/STK or
                                             T had a parity failure. StkSelSaved indicates that RmPerr
                                             applies to STK rather than RM.
MdPE        MdPEen                           processor-detected Md parity failure
                         Task2Bk             if immediate _Md (_MDSaved false)
                         Task3Bk             if deferred _Md (_MDSaved true)
                                             MdPerr muffler on each processor board shows which byte
                                             of Md failed.
IMrhPE      IMrhPEen     CTD                 parity failure of IM[17:33]
IMlhPE      IMlhPEen     CTD                 parity failure of IM[0:16]
IOBPE       IOBPEen      Task2Bk             Pd_Input parity failure if IOBoutSaved false
                         Task2Bk             Output_B parity failure if IOBoutSaved true
                                             IOPerr mufflers on each processor board show which byte
                                             failed.
MemoryPE    MemoryPEen                       cache address section parity failure, cache data parity
                                             failure on write of dirty victim or dirty Flush_ hit, or
                                             fast input bus parity failure.

Processor Errors

The processor has parity ladders on each byte of the following:

input to RM/STK   generate parity for write of RM/STK
input to T        generate parity for write of T
B                 generate parity for DBuf_B, MapBuf_B, Output_B, IM_B
IOB               check parity for Pd_Input and Output_B
Md                check parity for _Md
R                 check parity for _RM/STK (unless bypassed from Pd or Md or replaced by _Id)
T                 check parity for _T (unless bypassed from Pd or Md or replaced by _Id)

Input ladders to RM/STK and T generate parity stored with data in the RAM; these ladders are not used for detecting errors.

The processor computes parity on its internal B bus (alub). The generated parity may be transmitted onto IOB when an Output_B function is executed; Store_ references write B data and parity in the cache; parity for IM writes and map writes is computed from B parity. None of the other B destinations either check or store B parity.
External B sources do not generate parity.

Parity on the R/T ladders is checked only when the R/T data path is sourced from the RAM, not when bypassing from Md or Pd is occurring, and not when R/T is sourced from Id. A detected failure causes the RAMPE error halt, which indicates that some byte of RM, STK, or T had bad parity. The muffler signals that further describe this error are in the PERR word: StkSelSaved is true if the source for R was STK, false if the source for R was RM; each processor board has RmPerr and TmPerr signals; RmPerr is true if the RM/STK byte on that board had bad parity, TmPerr if the T byte had bad parity. Note that if an instruction beginning at t0 suffered an error, Dorado halts immediately after t4; the muffler signals apply to the instruction starting at t0. The Task2Bk muffler signals show the task that executed the instruction at t0.

Md parity is checked whenever _Md is done; a failure causes the MdPE error-halt when enabled. The _MDSaved muffler signal in PERR is true when a deferred _Md caused the error (T_Md, RM/STK_Md), false when an immediate _Md (A_Md, B_Md, or ShMdxx) caused the error. On a deferred _Md error, Dorado halts after t6 and Task3Bk shows the task that executed the instruction starting at t0; on an immediate _Md, Dorado halts after t4, and Task2Bk shows the task.
The MDPerr muffler signals on each processor board show which byte of Md was in error.

Io devices (optionally) compute and send odd parity with each byte of data; the processor checks parity when the Pd_Input function is executed, but not when the Pd_InputNoPE function is executed. When enabled, an IOBPE error halts the processor at t4 of the instruction that suffered the error; Task2Bk shows the task that executed the instruction. The processor also checks IOB parity on Output_B, and an error halts at t4 as for Pd_Input. The IOBoutSaved muffler signal distinguishes Pd_Input from Output_B errors; an IOPerr muffler signal on each processor board shows which byte of IOB was in error; all of these are in the PERR muffler word.

The processor generally does not pass parity at one stage through multiplexing to the next stage, so any failure in the multiplexing between one stage and the next will go undetected (exception: B parity passed through to IOB).

For example, the processor could write Md parity sent by the cache into the T RAM, when T is being written from Md. Instead, however, it checks Md parity independently, but then recomputes the parity written into T with the input ladder. Hence, a parity failure detected on a byte of T can only indicate a failure in one of the following: (1) the input parity ladder; (2) the output parity flipflop; (3) the output parity ladder; (4) one of three 16x4 T RAM's; (5) one of two 4-bit latches clocked at t1 (Figure 3) through which the output of the T RAM passes; (6) one of two 4-bit latches clocked by preSHC'. Parity is handled similarly for writes of RM/STK.

Parity is similarly recomputed on B.

The processor does not generate or check parity on the A, Mar, or Pd data paths.
Any failures of the A, Mar, B, Pd, or shifter multiplexing or of the ALU go undetected; failures of Q, Cnt, RBase, MemBase, ALUFM, or branch conditions go undetected.

Remark

Since 256x4 and 16x4 RAM's are used for RM, STK, and T, and since the processor is implemented with the high byte (0:7) on ProcH and the low byte (8:15) on ProcL, byte parity requires an additional 4-bit storage element on each board, of which only 1 bit is used. We could conceivably have used all 4 bits to implement a full error-correcting code for each byte of R and T data. However, there is insufficient time to correct the data. (Also, we use 256x1 RAM's instead of 256x4 RAM's for the RM and STK parity bits.)

Alternatively, parity could be computed over each 4-bit nibble rather than each 8-bit byte; the MC170 component allows nibble parity to be computed just as economically as byte parity. If this were done, then a parity failure would be isolated to a particular nibble. With byte parity, a detected failure could be any of 9+ components; with nibble parity, it would be isolated to one of 6+ components.
Implementing nibble parity for RM/STK and T would require about 4 more ic's per board than byte parity.

It is hard to say whether the additional precision of nibble parity would be worth the additional parts.

Control Section Errors

The control section stores parity with each 17-bit half of data in IM. When IM is written, the two byte-parity bits on B are xor'ed with the 17th data bit to compute the odd parity bit written into IM. It is possible to specify that bad (even) parity be written into IM, and this artifice is used to create breakpoints; bad parity from both halves of IM is assumed by Midas to be a deliberately set breakpoint.

IM RAM output is loaded into MIR and parity ladders on each 17-bit half give rise to error indicators that, when enabled, will halt the processor after t2 of the instruction suffering an error. For testing purposes, halt-on-error can be independently enabled for each half of MIR. Both the unbuffered output of the MIR parity ladders and values buffered at t2 appear in ESTAT. The buffered values show the cause of an error halt, and the unbuffered signals allow Midas to detect parity errors in MIR before executing instructions or when displaying the contents of IM.

The special MIRDebug feature discussed in the "Dorado Debugging Interface" document prevents MIR from being loaded at t2 when MIR parity is bad. In other words, when the MIRDebug feature is being used, all of the t2 clocks in the machine will occur except the ones to MIR. This feature prevents the instruction that suffered an error from being overwritten, at the expense of being unable to continue execution after the error. MIRDebug can be enabled/disabled by the control processor.

IFU Errors

The IFU never halts the processor; any errors it detects are buffered until an IFUJump transfers control to a trap location.
The errors it detects, discussed in "IFU Section", are parity failures on bytes from the cache, IFUM parity failures, and map parity failures on IFU fetches.

Memory System Errors

There is no parity checking on Mar or on data in BR, so any failure in the address computation for a reference goes undetected. However, valid parity is stored with VA in the cache, and any failure detected will cause the MemoryPE error to occur, halting the system (if MemoryPE is enabled).

Parity is also stored in the Map (computed from B parity) and an error causes a fault task wakeup in most situations (exceptions: IFU references and Map_ references do not wake up the fault task when a map parity error occurs).

The cache data section stores valid parity with each byte of data. When a munch is loaded from storage, the error corrector carries out single-error correction and double-error detection using the syndrome and recomputes parity on each 8-bit byte of data stored in the cache. When a word from B is Store_'d in the cache, byte parity on B is stored with the data.

A MemoryPE error occurs if, when storing a dirty victim back into storage, the memory system detects bad parity on data from the cache.

The IFU and processor also check parity of data from the cache, as discussed previously.

Sources of Failures

In a full 4-module storage configuration, Dorado will have 1173 MOS storage, about 700 Schottky-TTL, 3000 MECL-10K, and 60 MECL-3 DIPs, and about 1500 SIPs (7-resistor packages).
This logic is connected with over 100,000 stitch-welded or multiwireconnections to sockets into which the parts plug; logic boards connect to sidepanelsthrough about 2500 edge pins. Sockets are used for all the RAM DIPs in the machine;other parts are soldered in. Given all these potential sources of failure, reliable operationhas been a surprising achievement.Initial debugging of new machines has been slow and difficult, requiring expertise not easilyavailable in a production environment. In addition to mechanical assembly, board stuffing,and testing for shorts and opens both before and after stuffing, each machine hasaveraged about one man month of expert technician time to repair other malfunctionsbefore it could be released to users.Once released, the Dorados have been pretty reliable. During a 100-day period (6 October1980 to 14 January 1981) the CSL technicians kept records of service calls made forapproximately 15 Dorados in service at that time. The following summarizes the 43 servicecalls that were made.37 daysmean time between service calls per machine.45 days mean time between failures (some service calls were for microcode orsoftware problems).2.5 hours per machine per month average service time.13% of failures and 5% of time reseating logic boards in the chasis (connectors notmaking contact).11% of failures and 17% of time on open nets.13% of failures and 12% of time repairing 16k MOS RAM failures (standardconfiguration was 2 modules).37% of failures and 28% of time replacing other DIPs and SIPs.5% of failures and 10% of time on T80 problems.НЇОf╙pТ─Н#█q Н2ЩpТНFMОfЯНЇОbТ│П<Т┌НЇО`SТРПVНЇО^┴ТіП2ТїП%НЇО\ЎТ≥П1Т П'НЇОZСТНЇОW│ТЎПJТ© НЇОUЇТП1НЇОREТ ПXНЇОM,sТНЇОI╨pТЁП:ТЄНЇОGОТБПPНЇОF$Т5ПPНЇОDZТЮП%ТАП/НЇОB▐Т╣П/ТІП%НЇО@еТ°П7Т²П'НЇО>ЗТП"НЇО;┬Т─П!Т│П<НЇО9ҐТ⌠П'Т■П4НЇО7СТЩПQНЇО6(ТтТуП=НЇО4^ТП%НЇО0ЛТ▀ПQТ▄НЇО/!ТтТуПLНЇО-VТ▀ПOТ▄ НЇО+▄ТНyО(Т─НёТП(НyО%ЕТдТеП;НyО$ТНyО!ФП5НyОtТ─Т│П5НyО╙ТНyОuП-НyОТЫТЗПBНyО9ТНyОП>НyО▓П/Ъ tЇK<ПXюDorado Hardware ManualError Handling14 September 
    13% of failures and 11% of time on power supply failures.
    2% of failures and 2% of time on Terminal and display problems.
    4% of failures and 20% of time on repairing boards damaged during manufacturing or overheating.

The power supply failures were due to problems that have since been corrected, and most of the service calls for microcode or software problems would not happen in the more mature environment we have today. However, the other failures are believed to be representative. Note that none of the MOS RAM failures was the reason for a service call. These were found when testing a machine with diagnostics after a service call had been made for some other reason.

Error Correction

Reliability has been improved by error-correction on storage. The Dorado error-correction unit of 64 data and 8 check bits (quadword) guards 1152 MOS RAMs from single failures, but almost no other parts on storage boards or in the error corrector are guarded.

Our Alto experience suggests that some machines repeatedly fail under normal use due to undiagnosable failures. For this reason, error correction should be viewed as guarding not only against new failures but also against imperfect testing of parts that are either already bad or subject to noise (e.g., cosmic rays) or other kinds of intermittent failure. The latter may be more important in our environment.

The failure summary above indicates, for a small sample, that 16k MOS RAMs, accounting for 6% of all DIPs and SIPs (because the 15 Dorados had 2-module configurations, half the maximum), average about 4 times the failure rate of other parts and account for about 1.5 failures/year/Dorado; this would become 3 failures/year with a 4-module configuration. If we continue to do this well, a Dorado with error correction should run for years without uncorrectable MOS RAM failures.
The manufacturer's literature indicates that the dominant failure mode appears to be single-bit failures, with row and column addressing failures affecting many bits somewhat less frequent, but we don't know the distribution of these.

If MOS failures do become significant, different strategies may be needed for single- and multi-address failure modes. With a multi-address failure, another failure in the same quadword causes a double error; but many single-address failures can occur in the same quadword without double errors.

The failure model used below shows that with no periodic testing and replacement of bad MOS RAMs, fatal failure statistics of the 1152 RAMs would approximate those of a 108 RAM uncorrected store. By thoroughly testing storage and replacing bad parts 4 times more often than the mean time to total failure of a part (defined below), the likelihood of an uncorrectable RAM failure crashing the system can be made insignificant compared with other sources of failure.

Although system software could bypass all pages affected by a multi-address RAM failure, the entire module, 25% of storage, would be eliminated, so this is impractical except on an emergency basis. Continuing execution despite a multi-address RAM failure will result in a double error when any other coincident storage failure occurs in the same quadword; 1/16 of future failures will do this.

Some interesting questions are: How does MTBF vary with the EC arrangement?
MTBF is pertinent if we let Dorados run until they fail. Alternatively, how likely is a failure in the next day, week, or month, if we test the memory that often and replace bad RAMs? These questions can be asked assuming perfect testing (no failures at t=0) or imperfect testing (some likelihood of failures at t=0 because diagnostics didn't find them).

To answer them, MOS RAM failures are modelled as one of two types: those affecting a single address in the RAM (called SF's), and those affecting all addresses (called TF's). We assume that TF's occur about 1/4 as often as SF's in 4Kx1 RAM's. RAM failures are assumed exponentially distributed, correct if the failure rate doesn't change with time; over the time range of interest, this is reasonable. Finally, perfect testing is assumed, so there are 0 failures at t=0. These assumptions give rise to the following:

    let p = prob that an ic has a TF = 1 - e^{-at}
    let q = prob that an ic has a SF = 1 - e^{-bt}
    let n = number of MOS RAMs in the memory

Without error correction, MTBF is the integral from 0 to infinity of [(1-p)(1-q)]^n = 1/(n(a+b)). With b = 4a, in our 4-module system with n = 1024, this is 1/(5120a) = .00018/a.

With error correction, failure occurs when, in a single EC unit, a TF coincides with either another TF or an SF. This ignores two coinciding SF's, which is about 4000 (16k RAMs) or 16000 (64k RAMs) times less likely.

    let n = number of RAMs in an error correction unit
    then Prob[no failure] = Prob[no TF] + Prob[1 TF and 0 SF]
    Prob[no TF] = (1-p)^n
    Since failure modes are independent,
    Prob[1 TF and 0 SF] = np[(1-p)(1-q)]^{n-1}
    Prob[no failure] = Pok = (1-p)^n + np[(1-p)(1-q)]^{n-1}
    Pok = e^{-nat} + n(1 - e^{-at}) e^{-(a+b)(n-1)t}

This is the probability for a single EC unit, so the probability of no failure in all MOS storage is Pok raised to a power equal to the number of EC units, and mean time to failure is the integral of that quantity.
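The single-unit probability Pok can also be evaluated numerically. The sketch below is only an illustration under stated assumptions: the failure rate a is normalized to 1, b = 4a as in the text, and the function names, the integration limit t_max, and the step count are hypothetical choices, not part of the manual. It integrates Pok^16 for 72-RAM units with a plain trapezoidal rule and compares the result with the uncorrected 1/(n(a+b)) figure.

```python
import math

def p_ok(t, a, b, n):
    """Probability that one EC unit of n RAMs has had no uncorrectable failure by time t."""
    no_tf = math.exp(-n * a * t)  # (1-p)^n
    one_tf_no_sf = n * (1.0 - math.exp(-a * t)) * math.exp(-(a + b) * (n - 1) * t)
    return no_tf + one_tf_no_sf

def mtbf(a, b, n, units, t_max, steps=20000):
    """Trapezoidal estimate of the integral of p_ok(t)**units from 0 to t_max."""
    h = t_max / steps
    total = 0.5 * (p_ok(0.0, a, b, n) ** units + p_ok(t_max, a, b, n) ** units)
    for i in range(1, steps):
        total += p_ok(i * h, a, b, n) ** units
    return total * h

a = 1.0          # assumed normalization: time is measured in units of 1/a
b = 4.0 * a      # 4 SF's per TF, as assumed in the text
corrected = mtbf(a, b, n=72, units=16, t_max=0.02)   # 16 quadword units of 64+8 RAMs
uncorrected = 1.0 / (1024 * (a + b))                 # closed form for 1024 uncorrected RAMs
print(corrected, uncorrected, corrected / uncorrected)
```

The upper limit t_max = 0.02/a is an assumption chosen so that the integrand has decayed to a negligible value; a different normalization of a would need a correspondingly scaled limit.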
In other words, the argument of the integral for a 4-module x 4 quadwords/module system is Pok^16 with n = 64+8; it is Pok^4 with n = 256+10 for a one munch EC unit.

Then, expected time to failure for our 16 x n=64+8 memory system is about:

    (1/n) * (1/(16a) + 16a/(16a+b)^2 + 240a^2/(16a+2b)^3 + 3360a^3/(16a+3b)^4)
    = (1/(an)) * (1/16 + 1/25 + 5/288 + 105/17208)
    = (1/(16an)) * (1 + .64 + .28 + .006) = 1.93/(16an)
    = 1.93/(16*72*a) = .00168/a

In other words, mean time to failure is about 1.93 times longer than the time to the first TF, or 9.5 times better than with no error correction; the corrected memory fails as often as 1024/9.5 = 108 uncorrected storage ic's.

The results don't change much when imperfect testing is assumed. The effect of this is to replace densities for p and q by expressions of the form 1 - Ae^{-at}, where A would be .999 if there was a 1/1000 chance of a MOS ic being bad at t=0.

Remarks

On each storage board, data from MemD is transported to a shift register consisting of 8 flipflops which are then written into the MOS RAM's after transport has been completed.
This arrangement is unfortunate; any failure in one of these components will cause a multiple error, and there are about 250 of these parts in a full storage configuration.

One way to eliminate this problem while simultaneously reducing the part count on each storage board would be to make modules consist of four storage boards, rather than two, so that only four flipflops receive data on each bit path during transport; since each of these is in a different quadword, single failures would not cause multiple errors.

The Dorado EC operates on quadwords, requiring 8 check-bits/64 data bits, or a 12.5% storage penalty. Alternative schemes are: 10 check bits/256 data bits (3.9%); 9 check bits/128 data bits (7.4%); 7 check bits/32 data bits (22%); and no error correction at all (0%).

The implementation of the EC pipeline is such that wider correction units significantly increase the time for a miss. The current quadword error corrector requires 7 clocks (3 clocks for setup and correction, 1 clock per word of the quadword); this would become 11 clocks with a 128-bit EC scheme or 19 clocks with a 256-bit EC scheme. Although cache hit rate seems to be above 99%, some implementation avoiding this delay would still be needed to make larger correction units attractive.

If our quadword correction unit were replaced by a 4 x n=256+10 scheme, expected time to failure would be about:

    1/(4na) + 4a/(n(4a+b)^2) + 3a^2/(2n(2a+b)^3), where for b = 4a this is
    (1/(4na)) * (1 + 1/4 + 1/36) = 1.28/(4na) = .0012/a

In other words, MTBF is about 1.28 times longer than the time to the first TF. So error correction has increased MTBF by a factor of 6.7 over no error correction; alternatively, a 1064-RAM corrected memory fails as frequently as a 1064/6.7 = 159 RAM uncorrected memory.

Surprisingly, the 64+8 EC scheme has only 42% longer MTBF than a 256+10 EC scheme.
This improvement may not be worth the 96 additional MOS RAM and 80 other DIPs required for address buffering; the 80 additional DIPs might cause more failures than they save, being a net loss.

The other method of maintaining our systems is to regularly test storage and replace bad RAMs. Then the likelihood of no double error before replacement is simply the value of the probability distribution (Pok^4 and Pok^16 above) at the selected instant. Taking b = 4a, this reduces to an approximation of the form Pok = [e^{-x} + xe^{-5x}]^m, where x = nat, m is 4 or 16, and n = 266 for m=4 or 72 for m=16. If this is evaluated at t = 1/(mna), 1/(2mna), 1/(4mna), etc. the following results are obtained:

Table 28: Double Error Incidence vs. Repair Rate

    m     1/mna    1/2mna    1/4mna    1/8mna
    4     .52      .81       .94       .98
    16    .79      .84       .98       .99

The interpretation of this table is as follows: Measure mean time to total failure (TF) of a MOS RAM and call this time 1/a; then assume 4 SF's per TF. Then the mean time between TF's in storage will be 1/(mna). So the above table shows the probability that the Dorado hasn't suffered a double error when tested and fixed as often, twice as often, 4 times as often, or 8 times as often as TF's occur on average.

Performance Issues

This chapter discusses two issues:

(1) How rapidly will Dorado be able to execute Mesa, Lisp, SmallTalk, etc. macroprograms;

(2) What relationship do some of the design parameters bear to performance.

Cycle Time

The first issue is cycle time.
Dorado was designed for a 50 ns cycle time; the first three prototypes used stitchweld technology for interconnections and operated correctly at 55 ns cycle time; however, subsequent machines are being built using multiwire technology and will not operate faster than about 60 ns cycle time. The baseboard at present initializes the clock period to 64 ns for all machines during a boot, although there is some indication that design changes made recently and repair of a few lingering slow path problems would permit 5 to 10 ns faster operation.

With respect to achievable cycle time, the two important differences between stitchweld and multiwire technology are that stitchweld uses point-to-point wiring and has wire impedance of about 100 ohms (which is ideal), but multiwire uses Manhattan (square-corner) wiring with wire impedance of about 50 ohms on the inner layer and 70 ohms on the outer layer of wiring (most signals are in the outer layer); longer wires and imperfect impedance matching result in slower speed.

Emulator Performance

Gene McDaniel's measurements of the Alto Mesa compiler have been adjusted to make them compatible with Pilot Mesa and are summarized below. It must be pointed out that the compiler makes heavier use of short pointers than do Pilot Mesa programs; programs being developed now are heavily biased toward long pointers and would be slower than the execution rate below indicates. Average execution rate was about 5.6 cycles/opcode excluding disk wait. About 38% of all cycles are consumed by XFER opcodes (i.e., subroutine call or return), which account for about 6% of opcodes executed.
If these are excluded, the remaining 94% average about 3.1 cycles/opcode; if jumps and conditional jumps are also excluded (about 14% of executions), the others average 2.5 cycles/opcode. These times include all memory and IFU delays.

These excellent results indicate that there are no unusual delays due to problems with the memory or IFU and that the processor is completing most opcodes quickly. Since XFER opcodes take 34 (local) to 54 (external) cycles/opcode excluding memory delays, speeding, respecifying, or reducing executions of XFER seem to be the most promising ways of improving performance.

In the above results, instruction forwarding has saved an average of about .25 cycles/opcode or about 4% overall, in agreement with our expectations.

For SmallTalk and Lisp instruction sets, performance is much worse than Mesa (averaging over 30 cycles/opcode on Smalltalk 76). Careful studies should be made to understand the reasons for this fully, but one reason is that the 16-bit word size is a serious limitation. Long storage pointers are used extensively, so execution would be substantially faster on a machine with, say, 32-bit data paths.

IFU Not-Ready Wait

For the Mesa compiler, 19.5% of all cycles were in IFU not-ready wait; 16% due to incorrectly predicted jumps, 2.5% to cache miss wait, and 1% to other causes.
The 16% due to incorrectly predicted jumps might be improved.

The Mesa microcode presently predicts that all conditional jumps will not jump; it is desirable to predict not-jump unless more than 75% of executions jump due to the overhead of restarting the IFU an extra time. 40% of the time the prediction is wrong and a jump occurs, so it seems that the microcode is doing the best it can.

However, some loops ("while J ne 0 do," for example) are compiled as a normally-false conditional jump at the beginning of the loop and an unconditional jump from the end of the loop back to the beginning; a faster sequence is a normally-true conditional jump at the end of the loop, eliminating the unconditional jump altogether. The general objectives in changing the compiler would be as follows: (1) Eliminate unnecessary jumps and conditional jumps; (2) Make the jump/not-jump execution of conditional jumps be as predictable as possible; and (3) Make the not-jump path be the most likely, unless this conflicts with objective (1).

Microstore Requirements

Speed is not the only issue; some reduction in microstore requirements might be possible through design changes.
Space requirements for a 1981 release of the Alto/Mesa emulator system were as follows:

Table 29: Utilization of the Microstore

    Function                       Locations (octal)
    Mesa basic opcode set          2024
    Cedar allocator & collector    576
    Floating point                 457
    Alto opcode set                1163
    Alto BCPL Runtime              226
    BitBlt subroutine              416
    Fault handling                 65
    Ethernet driver                255
    Disk driver                    430
    Display driver                 500
    Junk io driver                 76
    LoadRam                        100
    Initialization                 150
    Total                          7673, leaving 105 free locations

Since we do not require that more than two emulators be loaded in the microstore at one time, there is presently a little space left for extensions. MicroD is able to utilize well over 99% of the available microstore.

The third performance issue is cache efficiency and miss wait; the fourth is available io bandwidth and io task cycle consumption. These are discussed in sections below.

Cache Efficiency and Miss Wait

The value of shortening the wait for a storage read is roughly proportional to miss likelihood. Suppose that the prototypical opcode was a one-byte opcode implemented by the following microcode:

    Fetch_Id, StkP+1;
    Stack_Md, IFUJump[0];

For this example, execution time on a hit is 2 cycles; on a miss, 28 cycles. Delay for IFU misses must be added to this. Since the IFU is 6 bytes ahead of the current opcode, its misses delay 28 cycles less execution time for the preceding 6 bytes; if any of the 6 bytes itself causes a miss, IFU delay will be 0 because it will catch up; the IFU never gets two misses (in this example) because it crosses at most one munch boundary.
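The miss reasoning above can be sketched as a small model. This is an illustration only: the function name and the returned tuple are hypothetical, H is the cache hit rate, each opcode is assumed to cost 2 cycles on a hit and 26 extra cycles on a miss, and an IFU miss is assumed to cost 28 cycles less the 12 cycles spent executing the 6 preceding one-byte opcodes, counted only when all 6 of those hit.

```python
def cycles_per_opcode(hit_rate):
    """Return (execution cycles, IFU miss cycles, fraction of cycles spent waiting)."""
    miss_rate = 1.0 - hit_rate
    exec_cycles = 2.0 + 26.0 * miss_rate                   # 2 on a hit, 28 on a miss
    ifu_cycles = (28.0 - 12.0) * hit_rate ** 6 * miss_rate # IFU miss hidden behind 6 hits
    wait = (exec_cycles - 2.0 + ifu_cycles) / (exec_cycles + ifu_cycles)
    return exec_cycles, ifu_cycles, wait

for h in (1.00, 0.99, 0.98, 0.96, 0.94, 0.92):
    e, i, w = cycles_per_opcode(h)
    print(f"{h:.2f}  {e:.2f}  {i:.2f}  {100 * w:.0f}%")
```

The wait fraction is taken here as all cycles beyond the 2-cycle hit case divided by total cycles, which is one plausible reading of how a percent-wait figure would be derived from the two cycle counts.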
Hence, execution time will be 2 + 26*(1-H) + (28-12)*H^6*(1-H), with the following results:

Table 30: Execution Time vs. Cache Efficiency

    Hit %    Execution Cycles    IFU Miss Cycles    % Wait
    100      2.00                .00                0
    99       2.26                .15                17
    98       2.52                .28                29
    96       3.04                .50                44
    94       3.56                .67                53
    92       4.08                .79                59

This crude analysis shows the importance of cache efficiency in determining system performance. Fortunately, measurements made by Doug Clark and Gene McDaniel indicated the following surprisingly high cache hit statistics:

    Overall cache hit rate on three Mesa programs was 99.2% to 99.8%. 4.9% to 8.1% of all cycles were held. 10% to 19% of references were Store_'s, the rest fetches. 16% to 66% of misses had dirty victims, which cause additional cycles to be held while the cache address section is busy.

    Another measurement showed a 99.7% hit rate for IFU references.

    The processor obtains a word from the cache in 16% of all cycles and the IFU in 32% of all cycles; the processor actually shuts out the IFU by making its own reference about 20% of the time.

Provision has been made to expand the Dorado cache to 16k words, when 4k x 1 MECL RAM's are economically available, but the existing cache is so efficient that this may never be necessary.

Performance Degradation Due to IO Tasks

To first approximation, only the display controller word task (DWT) uses enough storage bandwidth to interfere significantly with emulators. Since it uses the fast io system, DWT requires service once/munch and will require two instructions/wakeup in the ordinary case.
In addition, if the next instruction (by another task) issues a memory reference, it will always be held one cycle while the DWT's IOFetch_ advances ASRN.

A quick calculation shows that at an io bandwidth of 256 x 10^6 bits/sec (10^6 munches/sec) the display controller will use 48% of storage bandwidth and 12% of processor cycles at 60 ns/cycle.

The earlier example showed that with no io interference and a 99% hit rate, the emulator spent 17% of cycles in miss wait, 83% in useful execution. With a 256 x 10^6 bit/sec display active, emulator misses are slowed about 2 cycles each, so the overall effect of the display would be that about 78% of all cycles are emulator executions, 12% display task executions, and 16% hold; the one cycle holds for IOFetch_ would make performance somewhat worse than this.

An IOFetch_ by the display task to the same cache row as an emulator miss will remain in the address section, increasing display task latency and requiring more buffering. However, this won't degrade emulator performance.

The Alto monitor only uses 14.7 x 10^6 bits/sec (1/17 of the above) and would not interfere appreciably with emulators.

The disk controller is the fastest "slow" io device among standard peripherals. When running, its word interrupt task reads a double word from the cache every 3.2 microseconds in a 3 instruction/interrupt inner loop, consuming about 5.6% of all cycles at 60 ns/cycle. Its memory references consume the cache at a rate of .04 munches/microsecond, low enough that storage interference with the emulator isn't significant.
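The cycle and bandwidth percentages quoted for the display and disk tasks follow from simple arithmetic, sketched below. The names are hypothetical, and two values are assumptions labeled in the comments: a storage bandwidth of 533 x 10^6 bits/sec (the figure the manual gives for 16-word munches) and a disk interrupt interval of 3.2 microseconds, which is what the quoted 5.6% figure implies at 60 ns/cycle.

```python
CYCLE_NS = 60.0             # processor cycle time used in the text
MUNCH_BITS = 256            # one munch = 16 words of 16 bits
STORAGE_BITS_PER_SEC = 533e6  # assumed peak storage bandwidth with 16-word munches

def processor_fraction(instructions_per_wakeup, wakeup_period_ns):
    """Fraction of processor cycles an io task uses at one instruction per cycle."""
    return instructions_per_wakeup * CYCLE_NS / wakeup_period_ns

# Display word task: one wakeup per munch at 256e6 bits/sec, 2 instructions per wakeup.
display_bits_per_sec = 256e6
display_period_ns = MUNCH_BITS / display_bits_per_sec * 1e9
display_cycles = processor_fraction(2, display_period_ns)
display_storage = display_bits_per_sec / STORAGE_BITS_PER_SEC

# Disk word task: assumed 3.2 microseconds between interrupts, 3 instructions each.
disk_cycles = processor_fraction(3, 3200.0)

print(f"display: {display_cycles:.0%} of cycles, {display_storage:.0%} of storage")
print(f"disk:    {disk_cycles:.1%} of cycles")
```

The sketch counts only the instructions each task executes; it does not model the extra hold cycles caused by IOFetch_ or by slowed emulator misses.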
However, a 256-word disk transfer displaces about 1/16 of the cache entries, so the emulator may experience a lower hit rate.

Cache and Storage Geometry

The current geometry was chosen without measurements or simulation of programs, but measurements made since then have indicated a surprisingly good cache performance, so not much could be gained through changes.

The following parameters are relevant:

    1 word as the unit of storage inside the memory pipeline;
    16-word munch;
    256 munches in the cache (expandable to 1024);
    4 columns in the cache.

Munch Size

A 16-word munch size was chosen primarily because 8 cycles for transport balances 10 cycles for storage access, avoiding loss of bandwidth. The use of 256x4 RAM's to implement the cache address section allows the original 4k-word cache (implemented with 1kx1 RAM's) to be expanded to 8k words or 16k words, when 4kx1 RAM's are economically available; this is possible because only 64 of the 256 words in the address section are being used with the 4k-word cache. Miss wait is about 28 cycles and storage bandwidth about 533 x 10^6 bits/sec with 16-word munches.

8-word munches would lower the storage bandwidth to about 262 x 10^6 bits/sec, probably unacceptable. Also 8-word munches would limit cache expansion to 8k words. However, miss wait would be reduced to about 24 cycles because transport would require only 4 cycles. 32-word munches would not allow greater storage bandwidth to fast io devices because bandwidth is already limited by transport with 16-word munches.
Nor would it allow expansion to a larger cache data section because we have no way to build a data section larger than 16k words. Also, miss wait would be slowed to 36 cycles, so it does not seem that this munch size is attractive.

For a given size of the cache data section, with smaller munches the cache will tend to stabilize with a larger amount of useful information; however, when a program is changing contexts, larger munches might bring the new context into the cache more quickly. Also, fast io tasks will interfere less with the emulator on larger munches because fewer wakeups and IOFetch_'es will be required. However, the extra buffering and longer miss wait offset this advantage somewhat.

Considered together, these factors suggest that the 16-word munch we are using is substantially better than either 8 or 32-word munches.

Data Path Width

Having only 16 bit wide data paths slows misses. Doubling the paths to 32 bits would reduce EC time by 1 cycle and transport time into the cache by 4 cycles (i.e., delay on misses would be 23 cycles instead of 28).
There were not enough edge pins to do this. However, if a method of doubling the path width were found, the storage system would probably be arranged as two modules of four storage boards each rather than four modules of two boards each, and 32-word munches might be better than 16-word munches.

Cache Columns

The reason for multiple columns is to approximate LRU reloading; the columns are moderately expensive because separate hit logic has to be provided for each one; the V-NV stuff also costs a few ic's with more than two columns. Altogether the current 64x4 cache is about 40 ic's larger than a 128x2 cache. (Because of its 50-50 LRU behavior on the fourth column, our cache is somewhere between the 64x4 and 128x2 or 128x3 caches below.) The table below shows likelihood that the Nth LRU munch is no longer in the cache for various geometries:

Table 31: Cache Geometry vs. LRU Behavior

    N      32x4    64x2    128x2   32x3    64x3    128x3   64x4    128x4
    4      .000    .001    .000    .000    .000    .000    .000    .000
    8      .000    .006    .002    .002    .000    .000    .000    .000
    16     .001    .025    .007    .013    .002    .000    .000    .000
    32     .017    .089    .026    .077    .014    .002    .002    .000
    64     .140    .264    .090    .323    .079    .014    .018    .002
    128    .570    .596    .264    .767    .323    .080    .141    .019
    256    .960    .910    .595    .987    .764    .323    .568    .142
    512                                            .763    .959    .567

These numbers are computed from a binomial distribution using the following formulae:

    let R = rows in cache
    let C = columns in cache
    then p = (R-1)/R = probability that a given munch does not fall in a particular row
    then q = 1/R = probability that a given munch does fall in that row
    then probability of a miss for the nth element is:

    C    P(miss)
    1    1 - p^n
    2    1 - p^n - nqp^{n-1}
    3    1 - p^n - nqp^{n-1} - n(n-1)q^2p^{n-2}/2!
    4    1 - p^n - nqp^{n-1} - n(n-1)q^2p^{n-2}/2!
           - n(n-1)(n-2)q^3p^{n-3}/3!
    etc.

Without extensive measurements on programs, it is impossible to know how much better, say, a 32x4 cache is than a 64x2 cache, or to know whether a 128x2 cache is better or worse than a 32x4 cache, for example. If a particular program is confining itself to a very small set of munches, then more closely approximating LRU reloading is most important. However, if the likelihood of reference flattens out after a small N, then it won't matter much that LRU reloading isn't very well approximated; the total size of the cache will be a more important determinant of performance.

Glossary

a - the first 8-bit operand of a two-byte or longer opcode.

b - the second 8-bit operand of a three-byte or longer opcode.

bypassing - a number of memories and task-specific registers in Dorado (RM, STK, and T, for example) are written with data that might be needed before the write occurs. These are implemented so that data about-to-be-written is substituted for data read from the register or memory when appropriate. This substitution is called bypassing and enables Dorado to run considerably faster than would otherwise be possible.

cache entry - a munch together with VA of the munch and 4 flag bits.
For a 64 row x 4 column cache, VA[28:31] are the word in the munch, VA[22:27] address the row, and VA[7:21] are stored in the cache entry.

column - one of 4 groups of 64 (expandable to 256) cache entries. The cache column in which a word with VA resides is determined by comparing VA[7:21] with the corresponding bits stored in the four columns at row VA[22:27]. Thus a memory word may occupy one of 4 locations in the cache.

control processor - the microcomputer on Dorado's baseboard, or the Midas program operating Dorado from an Alto.

dirty - a cache entry is dirty if the information in it differs from information in storage, because a store has been done into the cache, and storage has not yet been updated. A page is dirty if a store has been done into the page since its map dirty bit was cleared.

emulator - the lowest priority task, number 0, always awake. The emulator is distinguished by the fact that it cannot block, can use Stk, and has a private pipe entry. Primarily the emulator task will implement instruction sets.

entry vector - the exit microinstruction of an opcode sends control to the first microinstruction of the next opcode by means of IFUJump[n] (n = 0 to 3), where n chooses one of 4 entry microinstructions for the next opcode; these four microinstructions are the next opcode's entry vector.

fault task - the highest priority task, number 15, woken whenever a memory fault or stack error occurs.

hit - a reference which finds the desired word in the cache.

Midas - the Alto program used for loading and debugging Dorado remotely.

miss - a reference which does not find the desired word in the cache.

module - the unit in which storage is packaged, either 64K, 256K, or 1M words.
A machine may have 1 to 4 modules.

MTBF - mean time between failures.

munch - 256 bits, or 16 machine words; the unit of data for main storage.

parity - the parity of a data unit is the exclusive-or of all bits in the data unit; parity has the property that changing any single bit in the data unit will also change the parity, so it can be used to detect single failures. A data unit has odd parity when the number of 1's in the unit is odd, even parity when the number of 1's is even. Dorado uses odd parity everywhere, which means that the number of 1's in the data unit including its associated parity bit should be odd when data is correct.

PC - "program counter". In this manual PC refers to the 16-bit byte displacements relative to BR 31 (the codebase) which are maintained by the IFU for the current instruction set. This term should be distinguished from TPC, which refers to the address of the next microinstruction for a task.

pipe - a 16-entry memory which records the state of the last few storage references.

quadrant - one of the four 4k-word regions in a 16k-word control store.

RAM - "random access memory"; selected words in the memory can be both read and written.

reference - a reference to the memory, initiated by the processor or by the IFU. A processor reference transfers a single word between the cache and the processor; an io reference transfers a munch between storage and an io device.

ROM - "read-only memory"; the contents of the memory are specified when the hardware is constructed and cannot be modified during program execution.
ROM elements used on Dorado can be reprogrammed with a special device constructed for the purpose.

row - one of the 64 or 256 groups of 4 cache entries. The cache row in which a word resides is determined by bits 20..27 of its virtual address.

storage - the main memory of the machine, organized in munches of 256 bits, or 16 machine words.

storage reference - a reference to the storage, initiated as a result of a memory reference. A processor reference causes a storage reference if there is a cache miss or if the FDMiss control is true in the memory control register; an io reference always causes a storage reference.

storage reference number (SRN) - an address of a pipe entry which identifies a particular storage reference.

subtask - a two-bit number presented by an io device to the processor and memory system while its task is running. The processor OR's subtask with RBase[3]..RSTK[1] in determining the RM address and with MemBase[2:3] in determining the base register selection. The memory system buffers the subtask for fast io devices, and then sends it over the Fin or Fout bus as part of device identification.

tag - the extra bit in Md readout which complements for successive Fetch_'es and Store_'s by the same task. Agreement of the bit in Md with the current value means the reference has finished.

task - one of the 16 priority scheduled tasks. Special tasks are the emulator (task 0, lowest priority) and the fault task (task 15, highest priority).
Other tasks are paired with io controllers.

VA - virtual address.

Vacant - a cache entry or map entry which does not contain valid data.

Victim (Vic) memory - stores 4 bits for each cache row. Two of the bits specify the victim which will be chosen if a reference to that row results in a miss, and the other two are the next victim.

victim - on a processor reference that causes a cache miss, the cache entry chosen to be replaced by the referenced data.

WP - write protected. Map entries and cache entries have bits with this name.

DoradoManual-B.press, Fiala, 17-Sep-81 9:34:26 PDT