This document is for internal Xerox use only.

Dorado Hardware Manual

by E.R. Fiala

contributions to the manual by
R. Bates, D. Boggs, B. Lampson, K. Pier, E. Taft, and C. Thacker

other help by
D. Clark, W. Crowther, W. Haugeland, G. McDaniel, and S. Ornstein

14 September 1981

The document describes the architecture and hardware design of the Dorado computer at a level appropriate for programming. At the date of this printing, approximately 22 systems have been released to users.

This release incorporates a major revision of the Display Controller chapter, medium revisions to the Disk Controller and Instruction Fetch Unit chapters, and minor revisions elsewhere.

Revision history:

14 February 1979     First complete manual exclusive of io controller chapters.
8 October 1979       Chapters on io controllers added; major revisions.
14 September 1981    Major revision to the Display Controller chapter, medium revision to Instruction Fetch Unit and Disk chapters, minor revisions elsewhere.

XEROX
Palo Alto Research Center
Computer Sciences Laboratory
3333 Coyote Hill Rd.
Palo Alto, California 94304

Table of Contents

1. Introduction

2. Overview
    2.1 Control
    2.2 Registers, Memories, and Data Paths
    2.3 Timing
    2.4 Instruction Fields
    2.6 Notation

3. Processor Section
    3.1 RM and STK Memories, RBase and StkP Registers
    3.2 Cnt Register
    3.3 Q Register
    3.4 T Register
    3.5 BSEL: B Multiplexor Select
    3.6 ASEL: A Source/Destination Control
    3.7 ALUF, ALU Operations
    3.8 LC: Load Control for RM and T
    3.9 FF: Special Function
    3.10 Multiply and Divide
    3.11 Shifter
    3.12 Hold and Task Simulator

4. Control Section
    4.1 Tasks
    4.2 Task Switching
    4.3 Next Address Generation
    4.4 Conditional Branches
    4.5 Subroutines and the Link Register
    4.6 Dispatches
    4.7 IFU Addressing
    4.8 IM and TPC Access
    4.9 Hold
    4.10 Program Control of the DMux

5. Memory Section
    5.1 Memory Addressing
    5.2 Processor Memory References
    5.3 IFU References
    5.4 Memory Timing and Hold
    5.5 The Map
    5.6 An Automatic Storage Management Algorithm
    5.7 Mesa Map Primitives
    5.8 The Pipe
    5.9 Faults and Errors
    5.10 Storage
    5.11 The Cache
    5.12 Initialization
    5.13 Testing

6. Instruction Fetch Unit
    6.1 Overview of Operation
    6.2 The IFUJump Entry Vector
    6.3 Timing Summary
    6.4 Use of MemBX and Duplicate Stk Regions
    6.5 Traps
    6.6 IFU Reset
    6.7 Rescheduling
    6.8 Breakpoints
    6.9 Reading and Writing IFUM
    6.10 Continuing from Processor Faults
    6.11 IFU Testing
    6.12 Details of Pipe Operation
    6.13 Timing Details

7. Slow IO
    7.1 Input/Output Functions
    7.2 IO Opcodes
    7.3 Wakeup, Block, and Next
    7.4 SubTasks
    7.5 Illegal Things IO Tasks Must Not Do

8. Fast IO
    8.1 Transport
    8.2 Wakeups and Microcode
    8.3 Latency

9. Disk Controller
    9.1 Disk Addressing
    9.2 Sector Layout Considerations
    9.3 General Firmware Organization
    9.4 Task Wakeups
    9.5 Control Register
    9.6 Format RAM and Sequence PROMs
    9.7 Tag Register
    9.8 FIFO Register
    9.9 Muffler Input
    9.10 Error Detection and Correction

10. Display Controller
    10.1 Operational Overview
    10.2 Video Data Path
    10.3 Horizontal and Vertical Control
    10.4 Pixel Clock System
    10.5 OIS Seven-Wire Video Interface
    10.6 Processor Task Management
    10.7 Slow IO Interface
    10.8 DispM Terminal Interface
    10.9 DDC Initialization Requirements
    10.10 Speed and Resolution Limits

11. Ethernet Controller
    11.1 Ethernet Packets
    11.2 Controller Overview
    11.3 Receiver
    11.4 Transmitter
    11.5 Clocks
    11.6 Task Wakeups
    11.7 Muffler Input
    11.8 IOB Registers
    11.9 Control Register
    11.10 Status Register

12. Other IO and Event Counters
    12.1 Junk Task Wakeup
    12.2 General IO
    12.3 Event Counters

13. Error Handling
    13.1 Processor Errors
    13.2 Control Section Errors
    13.3 IFU Errors
    13.4 Memory System Errors
    13.5 Sources of Failure
    13.6 Error Correction

14. Performance Issues
    14.1 Cycle Time
    14.2 Emulator Performance
    14.3 IFU Not-Ready Wait
    14.4 Microstore Requirements
    14.5 Cache Efficiency and Miss Wait
    14.6 Performance Degradation Due to IO Tasks
    14.7 Cache and Storage Geometry

15. Glossary

List of Tables

1. Memories
2. Registers
3. Data Paths
4. Load Timing
5. Instruction Fields
6. RSTK Decodes for Stack Operations
7. BSEL Decodes
8. ASEL Decodes
9. ALUFM Control Values
10. LC Decodes
11. FF Decodes
12. ALUF Shift Decodes
13. Branch Conditions
14. Reserved Locations in the Microstore
15. Timing of a Dirty Miss
16. Map Configurations
17. Fault Indications
18. IFUM Fields
19. Operand Sequence for _Id
20. IFU FF Decodes
21. IO Register Addresses
22. Task Assignments
23. T-80 Specifications and Characteristics
24. OIS Terminal Microcomputer Messages
25. DDC Muffler Signals
26. Ethernet Muffler Signals
27. Error-Related Signals
28. Double Error Incidence vs. Repair Rate
29. Utilization of the Microstore
30. Execution Time vs. Cache Efficiency
31. Cache Geometry vs. LRU Behavior

List of Figures

1. Dorado: Programmer's View
2. Card Cage
3. Processor Hardware View
4. Shifter
5. Control Section
6. Next Address Formation
7. Instruction Timing
8. Overall Structure of the Memory System
9. Cache, Map, and Storage Addressing
10. The Pipe and Other Memory Registers
11. Error Correction
12. Instruction Fetch Unit Organization
13. Disk Controller
14. Display Controller
15. Display Controller IO Registers
16. Ethernet Controller
17. Programmers' Crib Sheet

Introduction

Dorado is a high performance, medium cost microprogrammed computer designed primarily to implement a virtual machine for the Mesa language, as described in "The Mesa Processor Principles of Operation," and to provide high storage bandwidth for picture-processing applications.
Dorado aims more at word processing than at numerical applications.

The microprocessor has a nominal cycle time of 60 ns, and most Mesa opcodes will execute in one or two cycles; the overall average opcode execution time will be subject to a number of considerations discussed later. Dorado will also achieve respectable performance when implementing virtual machines for the Alto, Interlisp, and Smalltalk programming systems, although simple instructions for these run three to five times slower than Mesa.

Dorado is implemented primarily of MECL-10K integrated circuits; storage boards use MOS and Schottky-TTL components primarily. Backplanes and storage boards are printed circuits; other logic boards are stitchweld in prototypes and multiwire or PC in production machines. The mainframe is divided into sections called Control, Processor, Instruction Fetch Unit (IFU), and Memory, and peripheral control is accomplished by the Disk, Ethernet, and Display Controller sections, as discussed in chapters of this manual. The main data paths, shown in Figure 1, are 16 bits wide (the word size). The control section is shown in Figure 5. The Baseboard section, used to control the mainframe, is discussed in the "Dorado Debugging Interface" document.

The processor is organized around an Arithmetic and Logic Unit (ALU) whose two inputs are the A and B data paths (Figure 1), and whose output is normally routed to the Pd data path. Inputs to A, B, and Pd include all registers accessible to the programmer. In addition, 16-bit literal constants can be generated on B.
B appears on the backplane for communication with the IFU, Control, and Memory sections.

The processor also includes a 32-bit in/16-bit out shifter-masker optimized for field insertion and extraction and with specialized paths for the bit-boundary block transfer (BitBlt) instruction.

An instruction fetch unit (the IFU) operating in parallel with the processor can handle up to four instruction sets with 256 opcodes each; opcodes may independently be specified as one, two, or three bytes long.

Emulator and IFU references to main memory are made through a 4k-word high-speed cache. Main storage can be configured in various sizes up to a maximum of 2^22 16-bit words when 64k x 1 RAMs are used.

The processor initiates data transfers between main memory and fast input/output devices. Sixteen 16-bit words are then transmitted without disturbing the processor data paths in about 1.68 microseconds (28 cycles). New references can be initiated every 8 cycles, so the total bandwidth of the memory, about 533 Mbit/s (256 bits every 480 ns), is available for devices with enough buffering.

Overview

Experience suggests that programmers will gradually develop a mental model something like Figure 1; until this mental model is well established, it is probably desirable to read the following with Figure 1 in view.

Dorado has Processor, Control, Memory, IFU, and IO controller sections.

Io controllers are independent of each other and of the other sections; you will have to understand a particular io controller iff you are going to write microcode that controls it.

The memory and IFU are "slaves" to the processor/control section. In most situations, their external interface is simple relative to internal details of operation, and effective programming is usually possible without detailed understanding.

However, programmers will have to understand the processor thoroughly because the different parts of the processor are controlled directly by instruction fields, and most of the processor will be used, even in a small program.

Programmers must also understand most of the control section, although fairly simple assembly language constructs are transformed into the complicated branch encodings needed by Dorado, so detailed understanding of Dorado branching is not required.

Control

Dorado supports up to 16 independent tasks at the microcode level. Each task has its own program counter (TPC), and other commonly-used registers are also replicated on a per-task basis. Tasks are scheduled automatically by the hardware in response to wakeup requests; task 15 has the highest priority and task 0 the lowest.

Emulator microcode runs entirely in task 0 (lowest priority); fault conditions normally wake up task 15, the "fault task" (highest priority). Other tasks are normally paired with io devices that issue wakeup requests when they need service. Task switching, discussed in "Control Section", is in most cases invisible to the programmer, because commonly-used registers are duplicated for each task.

In this manual, "instruction" refers to a microinstruction in the control store, as opposed to an opcode in the higher level language interpreted by a microprogram. The JCN field in an instruction encodes a variety of jumps, calls, conditional jumps and calls, instruction dispatches, and returns for the current task.

Registers, Memories, and Data Paths

Tables 1, 2, and 3 describe memories, registers, and data paths in Dorado; these are diagrammed in Figure 1.
The first two tables below focus on a particular register or memory and tell how it is used and where it connects; the third table focuses on particular data paths and shows how they connect various parts of the machine.

Table 1: Memories

Memory    Comments

IM        IM is a 4096-word x 34-bit (+2 parity) RAM used to store instructions. When written, the address is taken from Link and data from B 16 bits at-a-time (1 extra bit and parity from RSTK field). When read, the address is taken from Link, and data is delivered to Link 9 bits at-a-time. The read or write is controlled by the JCN field and two or three low bits of RSTK.

ALUFM     ALUFM is a 16-word x 6-bit ALU control RAM addressed by the 4-bit ALUF field. Five ALUFM bits specify 16 boolean or 5 arithmetic operations on A and B. One bit is the input carry for arithmetic operations (modifiable by several functions). ALUFM[ALUF] is read onto Pd by the ALUFMEM function or both read onto Pd and loaded from B by the ALUFMRW_ function.

RM        RM is a 256-word x 16-bit (+2 parity) RAM used for general storage by all tasks. The normal address is RBase[0:3],,RSTK[0:3]. Data can be read onto A or B and loaded from Pd or Md without using FF. Together with T, RM forms the input to the Shifter.

STK       STK is a 256-word x 16-bit (+2 parity) stack accessible only to the emulator, used instead of RM when the BLOCK bit in the instruction is 1. Its address comes from StkP, modified by -4 to +3 under control of RSTK.

IFUM      IFUM is a 1024-word x 24-bit (+3 parity) decoding memory containing 256 words for each of four instruction sets.
The instruction set can be set by the InsSetOrEvent_ function. The low 8 address bits are normally an opcode fetched from the cache, but can be loaded from B by the BrkIns_ function to read or write IFUM itself. The IFUMLH_ and IFUMRH_ functions load, and the B_IFUMLH' and B_IFUMRH' functions read different bits of IFUM. During normal operation IFUM controls decoding of the stream of opcodes and operands fetched from memory relative to BR 31, the code base.

MAIN      Main storage consists of a 64-row x 4-column x 16-word virtual cache coupled with one to four 256k x 16-bit memory modules (using 16k-bit storage chips). The IFU and processor independently access the cache, with IFU references deferring to the processor. The processor has two dissimilar methods of reference, one primarily to the cache (with "misses" initiating main memory action) and one directly to main memory (invalidating cache hits on writes, using dirty cache hits on reads). Fetch_, Store_, IFetch_, LongFetch_, and PreFetch_ are cache references. Md can be loaded into T or RM (LC field), routed onto B (BSEL field), onto A (FF field), or used in a shift-and-mask operation (ASEL and ALUF fields). IOFetch_ and IOStore_ (ASEL field) initiate a 16-word transfer between an io device and memory without further processor interaction (using Fin or Fout bus). Virtual addresses are transformed to absolute using the Map memory. All references leave information in the Pipe memory.

BR        A 32-word x 28-bit base register memory addressed by the MemBase register. The virtual address for any memory reference is BR[MemBase]+Mar. BR is loaded from Mar by the BrLo_A and BrHi_A functions and can be read indirectly onto B via the virtual address left in the Pipe after a memory reference (Pipe0 and Pipe1 functions).

Pipe      The 16-entry x 6-word pipe contains trace information left by memory references. This information includes the virtual address, map stuff, single-error and double-error information, cache control stuff, task and subtask.
It is automatically loaded during any memory reference and can be read onto B by the Pipe0, Pipe1, ..., Pipe5' functions.

Map       The Map is a 16k or 64k-word x 19-bit (+parity) memory used to transform virtual addresses to absolute. Addressed by VA[10:23], map entries contain 16 bits of real page, write protect, dirty, and referenced bits. They can be written from B with Map_ (ASEL) and read from the Pipe after main storage references.

Table 2: Registers

Register  Comments (* = one of these for each task; i.e., "task specific")

T*        16-bit (+2 parity) T sources either A (ASEL field or FA field with memory ops) or B (BSEL field), or the Shifter (ASEL) and loads from either Pd or Md (LC field).

RBase*    4-bit RBase,,RSTK field forms addresses for RM. RBase can be loaded from FF[4:7] or from B[12:15] by the RBase_SC, RBase_B, or Pointers_B functions; it is read onto Pd[12:15] by the Pd_Pointers function. RBase is loaded with 0 or 1 when the IFU dispatches to the first instruction for an opcode.

StkP      The emulator uses STK instead of RM when the BLOCK bit is 1. 8-bit StkP holds the address for STK. The RSTK field is interpreted as an adjustment to StkP, which can be modified -4 to +3 in conjunction with testing for overflow and underflow. This mechanism implements the Mesa evaluation stack.
StkP can be loaded by the StkP_B function and read onto Pd[8:15] by the Pd_TIOA&StkP function. (Stack overflow and underflow indicators are read into Pd[8:9] by the Pd_Pointers function.)

Q         16-bit Q is used as a shift register by multiply and divide. Q can be read onto A (FF field or FA with Fetch_ or Store_) or B (BSEL field) and loaded from any B source except a constant (BSEL and FF fields). Functions implement Q lsh 1 and Q rsh 1.

Cnt       Cnt is a 16-bit counter that can be both decremented and tested for zero by a branch condition. Cnt can be loaded from FF[4:7] with 1 to 16 or from B (FF field) and can be read onto Pd (FF).

TIOA*     TIOA is an 8-bit io address register (see "Slow IO") loaded by the TIOA_B function and read onto Pd[0:7] with the Pd_TIOA&StkP function. TIOA[5:7] may also be loaded from FF[5:7].

ShC       16-bit ShC controls the shifter-masker (see "Shifter"). RF_A, WF_A, and ShC_B functions load ShC in various ways. ShC can be read onto Pd by the Pd_ShC function.

MemBase*  MemBase is a 5-bit register addressing BR for memory references. The MemBase_n functions load it from FF[3:7]; the MemBaseX_n functions load it from 0,,MemBX[0:1],,FF[6:7]. The IFU loads MemBase with a value between 0 and 3 relative to MemBX or with 34 to 37 (octal), as specified in IFUM, prior to executing the first instruction of an opcode. MemBase is read onto Pd[3:7] by the Pd_Pointers function and loaded from B[3:7] by the Pointers_B and MemBase_B functions.

MemBX     MemBX is a 2-bit register used like a stack pointer in conjunction with MemBase. The ideas behind this are discussed in "Memory Section".

Link*     16-bit Link holds subroutine return addresses, address-modification for dispatches, IM address for IM reads/writes, and data for TPC reads/writes. It can be read onto or loaded from B[0:15] by the B_Link or Link_B, BigBDispatch_B, or BDispatch_B functions, or from CIA+1 by CALLs and RETURNs.

PC        16-bit PC contains the byte displacement of the next opcode relative to BR 31, the code base.
The IFU maintains this register, so only conditional jumps that don't jump and opcodes of type "pause" have to load it with the PCF_B function. The B_PCX' function reads PC.

TPC*      TPC contains the address of the next instruction for each task. It is addressed from B[12:15] and read/write control is in JCN. Data is read from/written into Link under control of the JCN field of the instruction.

Mcr       Memory control register; disables parts of the memory system for initialization and checkout.

Table 3: Data Paths

Path      Comments

A         The 16-bit high-true A bus (called "alua" in hardware drawings) may be driven from T, RM, STK, Q, Id, Md, a small constant between 0 and 17 (octal), or the shifter. It is also possible to 'or' the low-true shifter output with one of the other A sources. The A bus is totally inside the processor section, not connected to any other sections of Dorado, and it is one of the two Alu inputs. The RF_A and WF_A functions, which load ShC for subsequent shift operations, receive data from A.

Mar       The 16-bit Mar bus transmits the displacement for a memory reference from the processor or IFU section to the memory section. The CFlags register, some bits of the Mcr register, and the BR memory in the memory section are also loaded from Mar.
The processor drives Mar only when it is starting a reference or executing one of the functions between 120 and 127 octal (CFlags_A' and LoadMcr[A,B] are in this group of functions); during other instructions, the IFU may use Mar to initiate instruction fetches. Mar is driven low-true; when driven by the processor, it receives the same data as are driven onto A (but the shifter cannot drive Mar).

B         The 16-bit B bus consists of one data path inside the processor section (called "alub" in hardware drawings) and another on the backplane (called "Bmux" in hardware drawings); the IOB bus is driven from alub on Output operations, when it also is an extension of B. Alub and Bmux may be directly driven high-true from registers inside the processor; alternatively, Bmux may be driven low-true from other sections, in which case the processor receives the data onto alub through inverters (so the data appears high-true on alub). The BSEL field in an instruction can specify that either T, RM/STK, Q, or Md sources B; other sources and destinations loaded from B are specified in the FF field; BSEL and FF are used in combination to specify that a literal 8-bit constant (in either the left or right byte of the word with 0's or 1's in the other byte) sources B. Alub is one of the two Alu inputs. The processor computes odd byte parity on alub; Bmux and IOB destinations may store or check the parity computed by the processor.

Pd        The Pd path ("Processor data") receives data from an 8-input multiplexor whose inputs are the Alu output, possibly shifted left or right one bit on Alu shift functions or masked on a shifter operation, io device input data, and the infrequently read registers in the processor section.
Pd may be written into the T register or the RM or STK memories.

Id        The Id path ("IFU data") is used to send arguments from the IFU to the processor for interpretation. It can be routed onto A using ASEL (A_Id, Fetch_Id, Store_Id, or IFetch_RM/STK); alternatively, the TIsId or RIsId functions can be used to replace data from T or from RM/STK by IFU data; these functions provide a roundabout method of getting Id onto B.

Md        The Md path ("Memory data") moves data from the cache in the memory section into the processor. The processor latches Md and can route it onto A or B, load it into T and RM/STK, or use it in a shift-and-mask operation.

IOA       The IOA bus ("Input-output address") is driven from the TIOA register; it specifies the io device affected by a Pd_Input or Output_B function.

IOB       The IOB bus ("Input-output bus") is driven from alub on an Output_B function or received on Pd by a Pd_Input function; it transmits data to or from an io device.

Fout      ("Fast output bus") transmits data from the error corrector to a fast output device.

Fin       ("Fast input bus") transmits data from a fast input device (presently, there are no fast input devices) to the syndrome generator.

Sout      ("Storage output bus") transmits data from the syndrome generator to storage.

Sin       ("Storage input bus") transmits data from storage to the error corrector.
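The ",," notation used in Tables 1 to 3 denotes concatenation of bit fields, and a few of the table entries can be checked with a short model. The Python sketch below is an illustration only, under two assumptions not stated in this part of the manual: that bit 0 is the most significant bit of a field, and that the leftmost field in a ",," expression is the most significant. The helper names (`concat`, `rm_address`, `membase_x`, `b_constant`) are hypothetical, and the encodings of the BSEL and FF fields that select these operations are not modeled.

```python
def concat(*fields):
    """Concatenate (value, width) bit fields, leftmost field most significant.

    Models the manual's ",," notation, e.g. RBase[0:3],,RSTK[0:3].
    """
    result = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        result = (result << width) | value
    return result


def rm_address(rbase, rstk):
    """8-bit RM address: RBase[0:3],,RSTK[0:3] (Table 1, RM entry)."""
    return concat((rbase, 4), (rstk, 4))


def membase_x(membx, ff_low2):
    """5-bit value loaded by MemBaseX_n: 0,,MemBX[0:1],,FF[6:7] (Table 2)."""
    return concat((0, 1), (membx, 2), (ff_low2, 2))


def b_constant(byte, in_left_byte, fill_ones):
    """16-bit literal on B: an 8-bit constant in one byte of the word,
    with all 0's or all 1's in the other byte (Table 3, B entry)."""
    fill = 0xFF if fill_ones else 0x00
    if in_left_byte:
        return concat((byte, 8), (fill, 8))
    return concat((fill, 8), (byte, 8))


if __name__ == "__main__":
    print(oct(rm_address(0o3, 0o12)))          # RBase = 3, RSTK = 12 octal
    print(membase_x(0b10, 0b01))               # MemBX = 2, FF[6:7] = 1
    print(hex(b_constant(0x12, False, True)))  # constant in right byte, 1's fill
```

Under these assumptions a 4-bit RBase and a 4-bit RSTK field together select one of the 256 RM words, and the leading 0 in the MemBaseX_n expression confines the result to the lower half of the 32 base registers.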
Timing

The terminology used in discussing timing is as follows:

clock     The 30 ns (nominal) atomic time period of the machine. Clock period can be controlled by the baseboard microcomputer or through the manifold system as discussed in the "Dorado Debugging Interface" document. [1]

cycle     The duration of an instruction: two clocks or 60 ns, except for instructions that read/write IM or TPC.

t0        The instant at which MIR (MicroInstruction Register) is loaded; the beginning of a cycle.

t1        The next instant after t0, always one clock later.

t2        The instant following t1, one clock after t1 except for instructions that read/write IM or TPC. Additional clocks intervening for these special cases, which only affect the control section, are denoted by t1a, t1b, etc.

t3, t4    Subsequent instants for an instruction. t3 of the previous instruction coincides with t1 of the current instruction; t4 with t2.

First half cycle     The interval from t0 to t1 (or t2 to t3).

Second half cycle    The interval from t1 to t2 (or t3 to t4).

As implied by this terminology, Dorado initiates a new instruction every cycle. Instructions are pipelined, requiring a total of three cycles for execution. Timing for a typical instruction is shown in Figure 7. At t-2, the next instruction address is determined and instruction fetch from IM begins; at t0, the instruction is loaded into MIR from IM. During the first half cycle, the selected register is read from RM or STK, and at t1 is loaded into a register. During the next two clocks (t1 to t3), addition is performed in the ALU; at t3 the result is loaded into a register for writing into RM/STK or T.
During the final clock, RM is written.

Since a new instruction begins before the previous one finishes, paths exist to bypass the register being written if the following instruction specifies it as a source. (These paths, inaccessible to the programmer, are not shown in Figure 1.)

Most registers load from B at t3 (i.e., at the mid-clock of the cycle following the load instruction). These may source B in the instruction after they are loaded. The load information and data are pipelined into the next cycle, as described above. Registers loaded at t2 may be used during the first half-cycle of the following instruction. Usually, this type of register is used for some type of control information, since control registers are normally clocked at t0 (= t2 of previous instruction), data-oriented registers at t1 (t3 of previous instruction).

Table 4 summarizes the time at which loading takes place and some other information.

Footnote 1: We actually operate with a clock period of 32 ns, slower than the 30 ns nominal period, and production machines typically become unreliable at about a 29 ns clock period.

Table 4: Load Timing

Register/      Task      Load      Data                Load
Memory         Specific  Time      Source              Control     Comment
MIR*           no        t0        IM                  JCN         Holds current instruction
CIA            no        t0        TNIA,BNPC           JCN         Holds current instruction address
CIAinc*        no        t1        CIA
TPCI*          no        t2        TNIA, CIA
TPC            yes       FHC       TPCI                HOLD
                         t2        LINK(??)            JCN,B       Reading/writing takes 3 cycles
Link           yes       t2        B                   FF          Also loaded by CALL, RETURN, and dispatches; readout valid t1 to t3
IM             no                  B                   JCN         Reading/writing require 3 cycles
CTASK          no        t0        Next                Switch      Current task
CTD            no        t1        CTASK                           Current task delayed
Ready          no        t0        PEnc                Switch      Task-ready flipflops
StkP           no        t2        B                   FF          New value read if it changes in the same instruction
RBase          yes       t2        F2                  F1          RAM written at t3, bypassed
Cnt            no        t2        F2                  F1          Br cond to sub 1 and test
                                   B                   FF
ALUFM          no        t2        B                   FF          Addressed by ALUF. The output is valid t1 to t3
TIOA           yes       t2        B                   FF          Readout valid till t3
MemBX          no        t2        F2                  F1          Readout valid till t3
MemBase        yes       t2        F2                  F1          Readout valid till t3
                                   MemBase xor 1       FF
ShC            no        t3        A,B                 FF          RF_A, WF_A, ShC_B
                         t1        FF                  ASEL,BSEL
Q              no        t3        B                   FF,BSEL
                         t3        ALU[15],,Q[0:14]    FF          Multiply
                         t3        Q[1:15],,ALUcry     FF          Divide
                         t3        Q                   FF          Q rsh 1, Q lsh 1
RM             no        SHC       Pd,Md               LC,RSTK     Bypassed
STK            no        SHC       Pd,Md               LC,RSTK     Bypassed
T              yes       SHC       Pd,Md               LC,FF       Bypassed
IFUM           no        SHC       B                   FF          IFUMLH_/IFUMRH_
BrkIns         no        t2        B                   FF
PC*            no        t3        B                   FF          Level F PC loaded, level X read
Br             no        t2        A                   FF          BrLo_/BrHi_, _Pipe0, _Pipe1
MapBuf*        no        FHC       B                   ASEL, FA    Written on Map_, TestSyndrome_, ProcSRN_, LoadMcr
DBuf           no        FHC       B                   ASEL, FA    Written on Store_
Md             yes       t5        cache                           Bypassed
CFlags         no        t2        Mar                 FF          For debugging, initialization
Mcr            no        t3        Mar, MapBuf         FF          For debugging, initialization
Asrn           no        t2        Asrn                            Addresses the pipe for ring refs
ProcSRN        no        t3        MapBuf              FF          Addresses the pipe for Pipe0 to Pipe5
TestSyndrome   no        t3        MapBuf              FF          For debugging error correction
Pipe0          no        t3        Br, etc.            ASEL, FA    Written on ref., B_Pipe0
Pipe1          no        t3        Br, etc.            ASEL, FA    Written on ref., B_Pipe1
Pipe2          no        t3                            ASEL, FA    Written on ref., B_Pipe2
Pipe3          no        t14       map                 ASEL, FA    Valid after any storage access or Map_
Pipe4          no        t14,t48   map, EC             ASEL, FA    Valid after any storage access
Pipe5          no        t3,t4     cache               ASEL, FA    Written on ref., B_Pipe5

*Cannot be read as data by the processor
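The timing terminology of this section (clock, cycle, and the instants t0 to t4) can be illustrated with a small sketch. It is only an illustration under stated assumptions: it uses the nominal 30 ns clock, covers ordinary two-clock instructions only (not those that read/write IM or TPC), and the names `instant_ns`, `CLOCK_NS`, and `CLOCKS_PER_CYCLE` are hypothetical, not part of any Dorado software.

```python
# Illustrative sketch only: names are hypothetical, values come from the text.
CLOCK_NS = 30          # nominal clock period
CLOCKS_PER_CYCLE = 2   # ordinary instructions; IM/TPC read/write take longer


def instant_ns(instr_index, k, clock_ns=CLOCK_NS):
    """Absolute time in ns of instant t<k> of instruction number instr_index.

    Instruction 0 has its t0 at time 0, and a new instruction starts every
    cycle, so instruction n has its t0 at n * 2 clocks.  k may be negative
    (for example t-2, when the instruction fetch from IM begins).
    """
    return (instr_index * CLOCKS_PER_CYCLE + k) * clock_ns


if __name__ == "__main__":
    for k in (-2, 0, 1, 2, 3, 4):
        print(f"instruction 1: t{k} at {instant_ns(1, k)} ns")
```

With these assumptions, t3 of one instruction falls at the same time as t1 of the next, and t4 at the same time as t2 of the next, which is the overlap the definitions describe.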
Instruction Fields

The 34-bit instruction is divided into the following fields:

Table 5: Instruction Fields

Field    Size     Purpose (may have other effects, described below)
RSTK     4 bits   Selects RM register to be read and/or written
ALUF     4 bits   Selects ALU function or shifter operation
BSEL     3 bits   Selects source for B
LC       3 bits   Controls source and loading of RM and T
ASEL     3 bits   Source/destination control for A
BLOCK    1 bit    Blocks io task unless wakeup is waiting
                  Selects stack operations for emulator task
FF       8 bits   Function (FA=FF[0:1], FB=FF[2:4], FC=FF[5:7])
JCN      8 bits   Jump control
P016     1 bit    Odd parity on first word of instruction
P1733    1 bit    Odd parity on second word of instruction
Total    34 bits + 2 parity

The above instruction layout emphasizes compactness at the expense of programming flexibility. The following comments explain some of these tradeoffs:

1. The RSTK field specifies only four of the eight address bits needed for addressing RM. The other four are taken from the RBase register (loaded by a function). In the emulator task, BLOCK causes STK to be used instead of RM, and RSTK is decoded to cause modifications of StkP.

2. ALUF addresses the 16-word ALUFM memory in which 16 of 26-odd useful ALU operations are stored. For the shift operation decode of ASEL, the first three bits of ALUF select the kind of shift, while the ALUFM address is forced to 16₈ or 17₈.

3. BSEL decodes the most common data sources for B. Less common B sources are selected by FF, and then BSEL encodes one of several destinations for the source.

4. ASEL specifies the source and destination for A. The default source is the RM address selected by RSTK. Four ASEL decodes specify the most common memory operations, where the virtual address is BR[MemBase]+A. These decodes consume the two leading bits of FF to specify alternate sources (T or Id) or less frequent memory operations. The remaining four ASEL decodes select alternate sources T, Id, or the shifter, where the shifter decodes work in combination with ALUF, as discussed later.

5. LC specifies loading of RM/STK and T from Pd and Md.

6. FF is the catch-all field in which operations or data not otherwise specifiable can be encoded. Operations encoded in FF are called "functions". There are five ways FF is used:

a. To extend the branch address encoded in JCN (long goto, long call).
b. To form a constant on B as selected by BSEL.
c. To specify one of 64 common functions and branch conditions while the two leading bits modify the memory reference operation specified in ASEL.
d. To specify one of 256 functions and branch conditions, some of which use low bits of FF as literal values.
e. As a shift control value when ASEL decodes to "shift" and BSEL to a constant.

When FF is used as a function, it sometimes modifies the interpretation of other fields in the instruction.
For example:

a. 16 FF decodes modify RM write address bits which would otherwise have come from RSTK or StkP.
b. 16 FF decodes modify RM write-address bits which would otherwise come from RBase.
c. 16 FF decodes select less common B sources, causing BSEL to encode a destination rather than a source for B.

7. JCN (in conjunction with current address) encodes the next instruction address as follows:

a. One of 64 global Calls.
b. One of 60 local Gotos.
c. One of 4 local Calls.
d. One of 14 local conditional branches with 7 branch conditions.
e. One of 16 long Gotos/Calls (use FF field for rest of address).
f. One of 4 IFU jumps for next opcode (high 10 address bits from IFU).
g. Return.
h. TPC read/write.
i. IM read/write (Use low bits of RSTK also).

8. P0 and P1 are odd parity on the left and right halves of IM. When wrong, these give rise to error signals (see "Theory of Operations") which stop the machine after (unfortunately) the instruction with bad parity has been executed. The artifice of deliberately loading both parity bits incorrectly is used to implement breakpoints.

Notation

The notation used in referring to fields in the instruction is that the left-most bit of the field is denoted as 0. Hence, the fields in the instruction are as follows: RSTK[0:3], ALUF[0:3], BSEL[0:2], LC[0:2], ASEL[0:2], BLOCK[0], FF[0:7], JCN[0:7].

The BLOCK bit is also called StackSelect, for its use in choosing STK instead of RM for the emulator task.

Processor Section

The processor section implements most registers accessible to the programmer and decodes all instruction fields except JCN.
The FF field of the instruction is also decoded by the control, memory, and IFU sections.

Read this chapter with Figure 1 in front of you.

The processor section contains the Q, ShC, Cnt, StkP, and MemBX registers, the T, RBase, MemBase, and TIOA task-specific registers, and the ALUFM, RM, and STK memories. It contains the arithmetic and logic unit (ALU) and the shifter.

The processor communicates with the control, memory, and IFU sections via B; with io devices via the IOB bus. It exports MemBase and Mar to the memory system for addressing, IOA to devices for io addressing, and branch conditions to the control section. It imports Md from the memory system and Id from the IFU.

RM and STK Memories, RBase and StkP Registers

RM ("Register Memory," sometimes called "R") is the memory most easily available to microprograms; it stores 256 words x 16 data bits with odd parity on each byte of data. RM is read at t0 and latched at t1. Data may be routed to A, B, or the shifter, and branch conditions (see "Control Section") test the sign bit (R<0) and low bit (R Odd). RM may be written between t3 and t4 with data from Md or Pd.

The RM read address is RBase[0:3],,RSTK[0:3]. For io tasks SubTask[0:1] (discussed in "Slow IO") are or'ed with RBase[2:3]. Each task can thus select from 16 RM registers in the block pointed to by RBase.

Normally, this read address is also used for the write part of the instruction (if any). However, two groups of FF decodes discussed below modify the write address.

The RBase_SC function loads RBase with FF[4:7], selecting any block of 16 registers; RBase_B loads RBase from B[12:15]; Pointers_B loads RBase from B[12:15] while also loading MemBase from B[3:7]. (The previous RBase value is used for both the read and write portions of the instruction.)
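As a rough illustration of how the RM read address described above is formed, the following sketch concatenates RBase[0:3] with RSTK[0:3] and, for io tasks, or's SubTask[0:1] into RBase[2:3]. It assumes the manual's bit numbering (bit 0 is the left-most bit, so RBase[2:3] are the two low-order bits of RBase); the function name and parameters are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: not Dorado microcode or an existing tool.
def rm_read_address(rbase, rstk, subtask=0, io_task=False):
    """Return the 8-bit RM read address RBase[0:3],,RSTK[0:3].

    rbase, rstk: 4-bit field values.
    subtask:     2-bit SubTask value, used only when io_task is True, in
                 which case it is or'ed into RBase[2:3] (the low two bits).
    """
    rbase &= 0xF
    rstk &= 0xF
    if io_task:
        rbase |= subtask & 0x3
    return (rbase << 4) | rstk


if __name__ == "__main__":
    # Emulator-style reference: block 3, register 5 within the block.
    print(oct(rm_read_address(0o3, 0o5)))
    # io task with SubTask = 2 or'ed into the low bits of RBase.
    print(oct(rm_read_address(0b0100, 0b0001, subtask=0b10, io_task=True)))
```

The sketch covers only the read address; the FF decodes that alter the write address are described separately in the text.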
The IFU initializes the emulator task's RBase to 0 or 1 before dispatching to the first instruction of an opcode.

The STK memory (sometimes called "stack") is accessible only to the emulator (task 0). Since the emulator cannot block, the instruction bit interpreted as BLOCK for io tasks is instead interpreted as StackSelect; when StackSelect is 1, RM is disabled and STK used instead. Like RM, STK stores 256 words x 16 data bits with odd parity on each byte of data. STK is addressed by the 8-bit StkP register, and RSTK controls the adjustment of StkP; StkP may be decremented or incremented by any value between -4 and +3. Unadjusted StkP is always the read address and normally the write address, but the ModStkPBeforeW FF decode forces adjusted StkP to be used for the write. STK is divided into four separate regions, each 100₈ words long. Valid addresses are 1 to 77₈ within each region. That is, StkP[0:1] select the region, stack overflow occurs at the onset of an instruction that would increment StkP[2:7] beyond 77₈, and underflow occurs when location 0 is either read or written or when StkP[2:7] is decremented below 0.

StkP[2:7] are initialized to 0, denoting the empty stack. A push could do StkP_StkP+1 and write in one instruction.
A pop does StkP_StkP-1, and the item being popped off can be referenced in the same instruction if desired.

Table 6: RSTK Decodes for Stack Operations

RSTK[0]     0 = no underflow on StkP = 0 at start or end
            1 = underflow when StkP originally 0 or finally 0.

RSTK[1:3]   Meaning
 0          no StkP change
 1          StkP_StkP+1
 2          StkP_StkP+2
 3          StkP_StkP+3
 4          StkP_StkP-4
 5          StkP_StkP-3
 6          StkP_StkP-2
 7          StkP_StkP-1

In other words, RSTK[1:3] treated as a signed number are added to StkP[2:7] (StkP[0:1] don't change). In the emulator, an attempt to underflow or overflow the stack generates the signal StkError:

StkError = (BLOCK eq 1) & Emulator &
    [((StkP[2:7] + RSTK[1:3]) < 0) % ((StkP[2:7] + RSTK[1:3]) > 77₈) %
     ((RSTK[0] eq 1) & ((StkP[2:7] eq 0) % ((StkP[2:7] + RSTK[1:3]) eq 0)))]

StkError generates HOLD and wakes up the fault task (task 15) to deal with the situation, so the instruction causing StkError has not been executed when the fault task runs. StkUnd and StkOvf are remembered in flipflops read by the Pd_Pointers function. These get cleared (i.e., recomputed) when the next stack operation is executed by the emulator. The fault task can read them to decide whether stack underflow or overflow action is necessary.

Interpretation of underflow: StkP eq 0 denotes the empty stack. A stack adjustment may occur either by itself or with a read or write stack reference. StkP originally equal 0 underflows if the top of stack is read or written; decrementing StkP below 0 is always an underflow error; StkP equal 0 after modification underflows iff writing at the modified address.
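The StkError condition given above can be transcribed into a short sketch for experimentation. This is a reading of the formula as printed, under the assumptions that RSTK[1:3] is a 3-bit two's-complement adjustment (matching Table 6), that "%" denotes OR and "&" denotes AND, and that RSTK[0] is the left-most bit of the 4-bit field; the function names are hypothetical.

```python
# Illustrative sketch only: a software reading of the StkError formula.
def stack_adjustment(rstk):
    """Signed StkP adjustment encoded in RSTK[1:3] (range -4..+3)."""
    low3 = rstk & 0o7
    return low3 - 8 if low3 >= 4 else low3


def stk_error(block, emulator, stkp, rstk):
    """Evaluate StkError for one instruction.

    block:    the BLOCK (StackSelect) bit, 0 or 1.
    emulator: True when the emulator task is executing.
    stkp:     8-bit StkP; only StkP[2:7] (the low six bits) take part.
    rstk:     4-bit RSTK field; RSTK[0] is the left-most bit.
    """
    if not (block == 1 and emulator):
        return False
    pointer = stkp & 0o77
    adjusted = pointer + stack_adjustment(rstk)
    check_zero = (rstk >> 3) & 1  # RSTK[0]
    return (
        adjusted < 0
        or adjusted > 0o77
        or (check_zero == 1 and (pointer == 0 or adjusted == 0))
    )


if __name__ == "__main__":
    print(stk_error(1, True, 0, 0b0001))     # push onto an empty stack
    print(stk_error(1, True, 0, 0b0111))     # decrement below 0
    print(stk_error(1, True, 0o77, 0b0001))  # increment past the region end
```

Under these assumptions, a plain push onto the empty stack does not raise StkError, while decrementing below 0 or incrementing past 77₈ does; the RSTK[0] bit adds the "pointer is or becomes 0" checks.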
Consequently, the assembler sets RSTK[0] equal 1 for a stack reference only when either reading STK and incrementing the pointer or writing at the modified address and decrementing the pointer.

In other words, the microassembler must tell the hardware when to make the StkP equal 0 underflow checks, and it must do this correctly when the ModStkPBeforeW FF decode is used.

StkP can be loaded from B[8:15] using the StkP_B function; however, this is illegal in conjunction with a STK read or write in the same instruction (e.g., T_Stack, StkP_T leaves StkP unchanged).

StkP is saved at t2 of an instruction dispatched to by the IFU. The saved value may be reloaded into StkP at t2 by the RestoreStkP function; RestoreStkP is illegal in conjunction with a STK read or write in the same instruction.

RestoreStkP is useful only if opcodes are restarted after servicing map faults. However, we are also arranging for the IFU state, branch conditions, etc. of an interrupted opcode to be readable and reproducible, so that it will be possible to simply continue from the instruction that faulted. RestoreStkP will be useless if the continue-method of restarting is adopted.

The opcode-restart method effectively prevents use of the IFU entry vector scheme discussed in "IFU Section," degrading performance perhaps 2%, so it is desirable to continue from rather than restart from faults.
Also, complicated opcodes may require special-case code in the fault handler before opcode restart is possible, so continuing from the instruction that faulted is likely to be simpler overall.

Two groups of FF decodes change the RM address for the write portion of an instruction.

The first group of 16 FF decodes forces the write address to come from RBase[0:3],,FF[4:7]. This allows different registers in the same group of 16 to be used for the read and write portions of the instruction, or allows STK[StkP] to be used for the read portion and any of the 16 registers pointed to by RBase in the write portion.

The second group of 16 FF decodes forces the top four write address bits to come from FF[4:7]. The complete RM write address becomes FF[4:7],,RSTK[0:3]. This allows an arbitrary RM address to be written without having to load RBase in a previous instruction. Alternatively, if the i'th register in a group of 16 is read from RM, it permits the i'th register in a different group of 16 to be written in the same instruction. In conjunction with a read of STK, RSTK[0:3] will encode the StkP modification, and whatever RM word this happens to point to will be written. (Programmers will have to struggle to use this with a STK read.) Note: SubTask does not affect the write address for these functions.

Note that there is no way to read RM and write STK in one instruction.

The RisId FF decode causes Id to be substituted for RM/STK in the A, B, or shifter multiplexing.

There are branch conditions to test R[0] (R<0) and R[15] (R odd). These branch conditions are unaffected by the RisId FF decode; actual data from RM/STK is tested.

Cnt Register

The 16-bit Cnt register is provided for use as a loop counter.
Since it is not task-specific, io tasks must save and restore it.

Cnt can be decremented and tested for 0 by the Cnt=0&-1 branch condition; loaded from B[0:15] or from small constants 1 to 16 (FF decodes); and read onto the Pd path (into T or RM/STK) by an FF decode.

Q Register

The 16-bit Q register is provided primarily for use as a shift register with multiply and divide, but will probably be used more widely by the emulator. Since it is not task-specific, io tasks must save and restore it.

Q can be read onto B (BSEL) or onto A (FF); it can be loaded from B (FF), and when FF specifies an external B source in the memory, IFU, or control sections, it can also be loaded from B (BSEL). Q can be left-shifted or right-shifted one (bringing 0 into the vacant bit) by two FF decodes.

T Register

The 16-bit T register is the primary register for data manipulation in the processor. Since it is task-specific, io tasks do not have to save and restore it. T can be read onto B (BSEL) or A (ASEL); it can be loaded from Pd or Md (LC).

BSEL: B Multiplexor Select

BSEL normally selects one of the "internal" processor sources for B, as shown in the "Primary" column in the table below. (Note that although Md originates in the memory section, it is latched by the processor and appears as an internal B source.) However, the FF field can be used to substitute some other source external to the processor; there are many "external" sources in the control, IFU, and memory sections, and the codes for these are given in Table 11.
When an external source is specified, then BSEL instead encodes the destination for B, as shown in the "External" column of the table below.

The sources selected by BSEL are:

Table 7: BSEL Decodes

BSEL   Primary     With External Source
 0     Md
 1     RM/STK
 2     T
 3     Q           Q_B *
 4     0,,FF       Inapplicable because FF is not available to encode an external source
 5     377₈,,FF    Inapplicable
 6     FF,,0       Inapplicable
 7     FF,,377₈    Inapplicable

*Note: BSEL decode for Q_B is needed in initializing Dorado from the baseboard or Alto. Because ALUFM contents may be unknown, and data from the Alto is transmitted via the B_Link FF decode, some other field is needed to encode a destination that can then be routed into ALUFM.

The values selected by BSEL=4-7 are 16-bit constants obtained by concatenating the 8-bit FF field with zeroes or ones. When this is done, normal effects of functions are disabled, so external B sources are impossible. In conjunction with a shift operation on A, BSEL = 4 to 7 will cause the shifter controls to come directly from FF rather than from ShC as discussed in "Shifter"; the Q-register sources B when an FF-controlled shift is carried out.

The TisId and RisId FF decodes may be used with the B_T or B_RM/STK BSEL decodes, respectively, to accomplish B_Id.

The "External" decode of BSEL applies with Link, DBuf, Pipe0-Pipe5, FaultInfo, PCX, DecLo, DecHi, and other functions that source B on the backpanel, as selected by the FF decode.
For these external sources, BSEL is interpreted as the destination for B rather than the source.

Note: When the memory or control section sources the external B bus, it is illegal to execute arithmetic ALU operations; these sources are not electrically stable soon enough to permit the extra 10 ns required for carry propagation. But: if you are sure carries will not propagate into the high 8 bits of ALU result, then the hardware is fast enough.

However: Arithmetic is permitted when the IFU sources the external B bus, provided the previous instruction was not one of the slow B sources from the memory or control sections. This permits (Id)-(PCX')-1, common in emulator microcode.

This implies that an io task must never block on an instruction that reads B from a slow external source.

Hardware Implementation

The processor's internal version of B, called Alub, is driven by a 4-input multiplexor when sourced from within the processor; in this case an identical multiplexor drives the external bus, called Bmux (high-true). When the B source is external, both of these multiplexors are disabled, and the backpanel Bmux (low-true) is inverted through a gate onto Alub. The multiplexor arrangement is shown in Figure 3.

The IFU section is on/off of Bmux by t1+6 ns and the processor section is off by t1+7 ns, but the memory and control sections are not on/off until t1+16 ns; hence, a slow Bmux source in the previous instruction prevents Bmux from stabilizing until t1+16 ns of the current instruction, allowing insufficient time to propagate Bmux onto Alub and finish carry propagation. However, because Bmux is gated onto Alub, and the gate shuts off quickly, arithmetic on internal Alub sources is always permissible.

Bmux sources in this manual are given high or low-true names that agree with the way signals appear on Alub. For external sources this is inverted with respect to the sense of these signals on Bmux.
However, because external sources cannot feed external destinations (no way to encode this in an instruction), the signal inversion is invisible to programmers.

ASEL: A Source/Destination Control

The AMux drives the A input to the ALU, and is the data source for the read-field (RF_) and write-field (WF_) methods of loading ShC. The shifter also drives A, in which case the AMux is usually disabled.

A copy of the AMux drives the backplane Mar bus on processor memory references. The IFU may also drive Mar, when the processor isn't using it.

The three-bit ASEL field controls the source and destination for A as follows:

Table 8a: ASEL Decodes When FF is ok*

ASEL   FF[0:1]   Meaning
 0      0        PreFetch_RM/STK
        1        Map_RM/STK (emulator or fault task) -or- IOFetch_RM (io task)
        2        LongFetch_RM/STK
        3        Store_RM/STK
 1      0        DummyRef_RM/STK
        1        Flush_RM/STK (emulator or fault task) -or- IOStore_RM (io task)
        2        IFetch_RM/STK
        3        Fetch_RM/STK
 2      0        Store_Md
        1        Store_Id
        2        Store_Q
        3        Store_T
 3      0        Fetch_Md
        1        Fetch_Id
        2        Fetch_Q
        3        Fetch_T
 4               A_RM/STK
 5               A_Id (see "Instruction Fetch Unit")
 6               A_T
 7               Shift operation; see "Shifter" (uses ALUF)

Table 8b: ASEL Decodes When FF is not ok*

ASEL   Meaning
 0     Store_RM/STK
 1     Fetch_RM/STK
 2     Store_T
 3     Fetch_T
 4     A_RM/STK
 5     A_Id
 6     A_T
 7     Shift operation; see "Shifter" (uses ALUF)

*FF is ok when not used in a long goto, long call, as a BSEL constant, or in an FF-controlled shift.

When FF is ok and ASEL = 0 to 3, the decoding of FF as a function is forced to be in the range 0 to 63. In other words, FF[0:1], stolen to modify the memory operation on A, do not participate in the FF decode.
Hence, only functions 0 to 63 can be used in the same instruction with a memory reference.

In the above tables, each instance where the source for A is RM/STK can be overruled by one of the 4 FF decodes for A sources or the FF decodes that put FF[4:7] on A. These FF decodes are illegal with the ASEL or ASEL-FF[0:1] values that select Id or T, and the source for A is undefined when this restriction is violated.

The notation "Fetch_A", "Store_A", etc. in the above table is compatible with the microlanguage. These routing expressions mean, for example, that the displacement originating on A is routed onto the Mar bus on the backplane, added to BR[MemBase] in the memory section and loaded into the memory address register. Then the Fetch, Store, etc. is started as detailed in "Memory Section".

ASEL does a pretty thorough job of encoding possible actions on A: Store_ and Fetch_ references take the address from RM/STK, T, Md, Id, or Q; other references take the address from RM/STK; LongFetch_ takes the low 16 bits of address from RM/STK and high 8 bits from B.

The FF field can be used to select any of the following sources:

FF[4:7] (small constant)
RM/STK
Q
T
Md

These functions are illegal except on shifts (ASEL=7) or when the source otherwise selected would be RM/STK (ASEL=0, 1, or 4).
On shifts these functions cause the A source to be wire-or'ed with the shifter output (otherwise the A source would be disabled); with references, these functions overrule RM/STK as the source.

Hardware Implementation

A is driven by a 4-input multiplexor as shown in Figure 3. A similar arrangement drives Mar, which is disabled except on memory references or when one of the 8 FF decodes that use Mar is executed; the IFU may use Mar when the processor does not. The 4-input multiplexors are usually disabled on shifts, which OR onto A independently.

However, the A multiplexor is not disabled when the source for A is encoded in FF, so it is possible to OR any A input except Id with the (complemented) shifter data; this is useful for BitBlt and other complicated uses of the shifter. Since shifter data on A is low-true, and since the normal ALU operation is NOT A on shifts, the effect of enabling both the shifter and the normal A multiplexor is [Shiftdata and not A].

ALUF, ALU Operations

The 4-bit ALUF field controls the ALU operation. It addresses a RAM (ALUFM) containing control for the MC10181 ALU chips.

ALUFM is 8 bits wide, of which 6 bits are used. ALUFM[0] controls the carry-in for arithmetic ALU operations. It is a "don't care" for the 16 logical ALU operations. The XorSavedCarry function causes the saved carry-out of a previous operation to be xor'ed with this bit. The XorCarry function complements the value from ALUFM. ALUFM[3:7] select the ALU function performed as below.
The carry-out (task-specific) changes whenever an arithmetic operation is performed in the ALU unless explicitly disabled by the FreezeBC function (freeze branch conditions).

The Carry20 function forces the bit 12 carry-in to one. Assuming that this carry-in would otherwise have been zero, then this function adds 20₈ to the (arithmetic) ALU output. Adding 20₈ is expected to be useful because the cache, fast input bus, and fast output bus deal with 20₈-word munches.

The table below shows the logical and (useful) arithmetic ALU operations.

Table 9: ALUFM Control Values (Octal)

Logical                      Arithmetic (No Carry)    Arithmetic (With Carry)
 *1  NOT A                    *0  A                    *0  A+1
  3  (NOT A) OR (NOT B)      **6  2*A                   6  2*A+1
  5  (NOT A) OR (B)          *14  A+B                 *14  A+B+1
  7  All-ones output         *22  A-B-1               *22  A-B
 11  (NOT A) AND (NOT B)     *36  A-1                  36  A
*13  NOT B
 15  A XNOR B (Assembler makes "EQV" and "=" synonyms for XNOR)
 17  A OR (NOT B)
 21  (NOT A) AND B
*23  A XOR B (Assembler makes "#" synonym for XOR)
*25  B
*27  A OR B
 31  All-zeroes output
**33 A AND (NOT B)
*35  A AND B
 37  A

*System microcode can count on these operations being defined.
**Emulator task can count on these operations being defined.

On a barrel shift (selected by ASEL=7), the first three ALUFM address bits are forced to 1 (ALUF[0:2] selects the kind of shift in this case). The intent of this arrangement is that ALUFM[16₈] selects the "NOT A" ALU operation. Nearly all shifter operations use this ALU function to route shifter output through the ALU.
ALUFM[17₈] is loaded with assorted controls (i.e., used as a variable) by BitBlt or other opcodes that do more complicated things.

ALUFM can be read onto Pd by the ALUFMEM function or both loaded from B and read onto Pd by the ALUFMEMRW function.

External B sources from the IFU and internal sources are ready in time for arithmetic, but external sources from the memory and control sections are not (see the earlier section on "BSEL: B Multiplexor Select"). Internal A sources except shifter are ready in time for arithmetic. Unless explicitly disabled by the FreezeBC function, the branch conditions ALU<0, ALU=0, Carry' (ALU carry out'), and Overflow are available for testing on the control card at t3.

The Overflow branch condition, defined as carry-out from bit 0 unequal to carry-out from bit 1, is true iff a signed arithmetic operation yields an incorrect result.

Normally, the ALU is routed directly onto Pd, and Pd is then written into either T or RM/STK. However, several functions route ALU output shifted left or right 1 position onto Pd. Note that the ALU output of this instruction is used (not the previous one) and that ALUcarry is undefined on a logical ALU operation.
The right shifts are:

    ALU rsh 1     (0 onto Pd[0])
    ALU rcy 1     (ALU[15] onto Pd[0])
    ALU arsh 1    (ALU[0] onto Pd[0] preserving the sign)
    ALU brsh 1    (ALUcarry onto Pd[0])
    Multiply      (ALUcarry onto Pd[0]).

The left shifts are:

    ALU lsh 1     (0 onto Pd[15])
    ALU lcy 1     (ALU[0] onto Pd[15])
    Divide        (Q[0] onto Pd[15])
    CDivide       (Q[0] onto Pd[15]).

Multiply, Divide, and CDivide have other effects as well, discussed later.

Note: The barrel shifter discussed in the "Shifter" section also uses the Pd multiplexor for masking, so it is illegal to combine barrel shifts and ALU shifts in the same instruction.

Note: The ALU<0, ALU=0, Carry', and Overflow branch conditions test the ALU output of the previous instruction executed by the task; any shifting or masking that takes place in the Pd input multiplexor does not affect the result of these branch conditions.

Note: The values of Carry' and Overflow change only on arithmetic ALU operations. However, ALU_A may be either an arithmetic or a logical operation; in order to use XorCarry with ALU_A, we will probably use the arithmetic form of ALU_A, but the consequence of this is that Carry' will change on ALU_A. Programmers will have to be wary of this.

Note: Overflow is implemented correctly only for the A+B, A+B+1, A-B, and A-B-1 operations; other arithmetic ALU operations (A+1, A-1, 2A, 2A+1, etc.)
may modify the branch condition erroneously.

LC: Load Control for RM and T

This field controls the loading and source selection for the RM/STK memory and T register. The eight combinations are:

Table 10: LC Decodes

LC   Meaning
0    No Action
1    T_Pd
2    T_Md, RM/STK_Pd
3    T_Md
4    RM/STK_Md
5    T_Pd, RM/STK_Md
6    RM/STK_Pd
7    T_Pd, RM/STK_Pd

The only missing combination is T_Md, RM/STK_Md. T_Md, RM/STK_Md can be accomplished by combining an LC value of 5 with the TgetsMd FF decode. It is illegal to use TgetsMd with other LC decodes.

FF: Special Function

This field is the catch-all for functions not otherwise encoded in the instruction. For consistency with the hardware implementation, the 8-bit FF field is shown below as a two-bit field FA (= FF[0:1]) and two 3-bit fields, FB (= FF[2:4]) and FC (= FF[5:7]).
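The FA/FB/FC split just described can be sketched in Python as follows. This is only an illustration of the bit layout, with bit 0 taken as the most significant bit of the 8-bit field; the function names split_ff and join_ff are hypothetical and are not part of the microassembler.

```python
def split_ff(ff):
    """Split an 8-bit FF value into (FA, FB, FC).

    FA = FF[0:1], FB = FF[2:4], FC = FF[5:7], with bit 0 the
    most significant bit (hypothetical helper).
    """
    if not 0 <= ff <= 0o377:
        raise ValueError("FF is an 8-bit field")
    fa = (ff >> 6) & 0o3   # top two bits
    fb = (ff >> 3) & 0o7   # middle three bits
    fc = ff & 0o7          # bottom three bits
    return fa, fb, fc


def join_ff(fa, fb, fc):
    """Inverse of split_ff: pack (FA, FB, FC) into an 8-bit FF value."""
    return ((fa & 0o3) << 6) | ((fb & 0o7) << 3) | (fc & 0o7)
```

Because FB and FC are each three bits wide, an FF value written in octal reads directly as the digits FA, FB, FC; for instance, FA=1, FB=4, FC=6 (listed in Table 11c as Cnt _ B) packs to 146 octal.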
Field values are given in octal.

The FF field is interpreted as a function iff:

    (BSEL not selecting a constant) and
    JCN does not select a "long" goto or call

When ASEL selects one of the memory references, the FF decode is forced to be that of FA=0 because the FA field specifies the source for A or alternate memory reference in this case.

The decoding assignments have been made with the following considerations:

    Functions that source the external BMux are grouped for easy decode of the signal that turns off the processor's B-multiplexors.

    Operations that might be useful in conjunction with a memory reference are put in the first 64 decodes (FA=0) since FA is decoded as zero on memory references.

    Functions decoded by different hardware sections are arranged in groups to reduce decoding logic.

Table 11a: FF Decodes (FA = 0)

FB   FC   Function
* The AMux is not disabled when A_xx decodes below are used while ASEL selects a shift.
0-1       A[12:15] _ FF[4:7]
2    0    A _ RM/STK
2    1    A _ T
2    2    A _ Md
2    3    A _ Q
2    4    XorCarry (complements ALUFM carry bit); see the "ALUF, ALU Operations" section
2    5    XorSavedCarry; see the "ALUF, ALU Operations" section
2    6    Carry20 (carry-in to bit 11 of ALU = 1); see the "ALUF, ALU Operations" section
2    7    ModStkPBeforeW (Use modified StkP for write address of STK)
3    0
3    1    ReadMap.
          Modifies action of Map_ (see "Memory Section")
3    2    Pd _ Input (checks for IOB parity error)
3    3    Pd _ InputNoPE (no check for IOB parity error)
3    4    RisId (causes Id to replace RM/STK in A_RM/STK, B_RM/STK, and shifter)
3    5    TisId (causes Id to replace T in A_T, B_T, and shifter)
3    6    Output _ B
3    7    FlipMemBase (MemBase _ MemBase xor 1)
4-5       Replace RMaddr[0:3] by RBase[0:3] and RMaddr[4:7] by FF[4:7] for write of RM;
          Forces RM to be written even if STK was read.
6    0-7  Branch conditions (see "Control"). In conjunction with an IFU jump in JCN,
          if the condition is true, IFU advance is disabled (see "IFU")
7    0    BigBDispatch _ B (256-way dispatch on B[8:15]. See "Control")
7    1    BDispatch _ B (8-way dispatch on B[13:15]. See "Control")
7    2    Multiply (Pd[0:15] _ ALUcarry,,ALU[0:14]; Q[0:15] _ ALU[15],,Q[0:14];
          Q[14] OR'ed into TNIA[10] as slow branch; see "Multiply")
7    3    Q _ B
7    4
7    5    TgetsMd (In conjunction with LC=5, this causes T_Md, RM/STK_Md)
7    6    FreezeBC (freezes previous values of ALU and IOAtten' branch conditions for 1 cycle)
7    7    Reserved as a no-op

Table 11b: FF Decodes (FA = 1)

FB   FC   Action
0    0    PCF _ B. Loads PCF and starts fetching instructions
0    1    IFUTest _ B, dismisses junk wakeup, bits used as follows:
              0:7 TestFG       8 TestParity    9 TestFault    10 TestMemAck
              11 TestMakeF_D   12 TestFH'      13 TestSH'     14 enables testing
0    2    IFUTick
0    3    RescheduleNow (doesn't set Reschedule branch condition)
0    4    AckJunkTW_B.
          B[15]=1 shuts off junk task wakeups, =0 enables them; B[0:14] ignored
0    5    MemBase_B[3:7]
0    6    RBase_B[12:15]
0    7    Pointers_B (MemBase_B[3:7] and RBase_B[12:15])
1    0:7  Unused

Table 11c: FF Decodes (FA = 1)

FB   FC   Action
*The following 8 FF decodes drive Mar from A.
2    0-1  Unused
2    2    CFlags _ A' (see Figure 10) (Mar must be stable during prev. instr.)
2    3    BrLo _ A. BR[16:31] _ A[0:15]
2    4    BrHi _ A. BR[4:15] _ A[4:15]
2    5    LoadTestSyndrome from DBuf (see Figure 10)
2    6    LoadMcr[A,B] (see Figure 10)
2    7    ProcSRN _ B[12:15]
3    0    InsSetorEvent _ B. If B[0] = 0, then B[4:15] are controls for EventCntA and EventCntB;
          if B[0] = 1, then B[6:7] are loaded into the IFU's InsSet register.
3    1    EventCntB _ B or equivalently GenOut_B (General output to printer, etc.)
3    2    Reschedule
3    3    NoReschedule
B data must set up during the previous instruction and not glitch when writing IFUMLH/RH; see the IFU section.
3    4    IFUMRH _ B. Packedα_B.5, IFaddr'_B[6:15]
3    5    IFUMLH _ B. Sign_B.0, PE[0:2]_B[1:3], Length'_B[4:5], RBaseB'_B.6,
          MemB_B[7:9], TPause'_B.10, TJump_B.11, N_B[12:15]
3    6    IFUReset. Reset IFU
3    7    BrkIns _ B.
          Opcode_B[0:7] and set BrkPending
4    0    UseDMD (see "Control Section")
4    1    MidasStrobe _ B (see "Control Section")
4    2    TaskingOff
4    3    TaskingOn
4    4    StkP _ B[8:15]
4    5    RestoreStkP
4    6    Cnt _ B (overrules Cnt=0&-1 in the same instruction)
4    7    Link _ B (overrules loading of Link by Call or Return in same instruction)
5    0    Q lsh 1 (Q[0:14] _ Q[1:15], Q[15] _ 0)
5    1    Q rsh 1 (Q[1:15] _ Q[0:14], Q[0] _ 0)
5    2    TIOA[0:7] _ B[0:7] (Note: loaded from left-half of B)
5    3
5    4    Hold&TaskSim _ B (Hold reg _ B[0:7], Task reg _ B[9:15].
          See "HOLD and Task Simulator")
5    5    WF _ A (load ShC with write-field controls; see "Shifter")
5    6    RF _ A (load ShC with read-field controls; see "Shifter")
5    7    ShC _ B (see "Shifter")
6    0    B _ FaultInfo'. B[8:11]_SRN for 1st fault, B[12:15]_number of faults
6    1    B _ Pipe0 (B_VaHi; see Figure 10)
6    2    B _ Pipe1 (B_VaLo; see Figure 10)
6    3    B _ Pipe2' (see Figure 10)
6    4    B _ Pipe3' (B_Map'; see Figure 10)
6    5    B _ Pipe4' (B_Errors'; see Figure 10)
6    6    B _ Config' (see Figure 10)
6    7    B _ Pipe5' (see Figure 10)
7    0    B _ PCX'
7    1    B _ EventCntA' (see "Other IO and Event Counters")
7    2    B _ IFUMRH' (low part of IFUM)
7    3    B _ IFUMLH' (high part of IFUM)
7    4    B _ EventCntB' (see "Other IO and Event Counters")
7    5    B _ DBuf (normally non-task-specific data from last Store_; see "Memory")
7    6    B _ RWCPReg (= Link_B' and B_CPReg)
7    7    B _ Link
Table 11d: FF Decodes (FA = 2)

FB   FC   Action
0-1       RBase _ FF[4:7]
2-3       Replace RMaddr[0:3] by FF[4:7] for write of RM.
          Forces RM to be written even if STK was read.
4         TIOA[5:7] _ FF[5:7] (TIOA[0:4] unchanged)
5    0-3  MemBaseX _ FF[6:7]
          (MemBase[0] _ 0, MemBase[1:2] _ MemBX[0:1], MemBase[3:4] _ FF[6:7])
5    4-7  MemBX _ FF[6:7]
6    0-1
6    2    Pd _ ALUFMRW (Pd _ ALUFMEM as below, ALUFMEM _ B.8, B[11:15])
6    3    Pd _ ALUFMEM (Pd.0 _ DMux data, Pd.8 and Pd[11:15] _ ALUFMEM[ALUF])
6    4    Pd _ Cnt (If Cnt=0&-1 in same instruction, unmodified value is read)
6    5    Pd _ Pointers (Pd[1:2] _ MemBX, Pd[3:7] _ MemBase,
          Pd[8] _ StkOvf, Pd[9] _ StkUnd, Pd[12:15] _ RBase)
6    6    Pd _ TIOA&StkP (Pd[0:7]_TIOA, Pd[8:15]_StkP; if the instruction modifies StkP
          concurrently, the MODIFIED value is read)
6    7    Pd _ ShC
7    0    Pd _ ALU rsh 1 (Pd[0] _ 0)
7    1    Pd _ ALU rcy 1 (Pd[0] _ ALU[15])
7    2    Pd _ ALU brsh 1 (Pd[0] _ ALUcarry)
7    3    Pd _ ALU arsh 1 (Pd[0] _ ALU[0] preserving sign)
7    4    Pd _ ALU lsh 1
7    5    Pd _ ALU lcy 1
7    6    Divide (Pd[0:15]_ALU[1:15],,Q[0]; Q[0:15]_Q[1:15],,ALUcarry)
7    7    CDivide (Pd[0:15]_ALU[1:15],,Q[0]; Q[0:15]_Q[1:15],,ALUcarry')

Table 11e: FF Decodes (FA = 3)

0-3       MemBase _ FF[3:7]
4-5       Cnt _ small constant (Cnt[0:10] _ 0, Cnt[11] _ 0 if FF[4:7] # 0 else 1,
          Cnt[12:15] _ FF[4:7]; i.e., values of 1 to 16 are loadable)
6-7       Wakeup[n] Initiate wakeup request for task FF[4:7]

Multiply and Divide

The Multiply, Divide, and CDivide functions operate on unsigned 16-bit operands.
Unsigned rather than signed operands are used so that the algorithms will work properly on the extra words of multiple-precision numbers.

The actions caused by these functions are as follows:

    Multiply:
        Result _ ALUCarry..ALU/2
        Q _ ALU[15]..Q/2
        Next branch address _ whatever it is OR 2 if Q[14] is 1.

    Divide, CDivide:
        Result _ 2*ALU..Q[0]
        Q _ 2*Q..ALUCarry -or- 2*Q..ALUCarry'

Complete examples for Multiply and Divide subroutines are given in the microassembler document. The inner loop time is 1 cycle/bit for multiply and 2 cycles/bit for divide.

Shifter

See Figure 4.

Dorado contains a 32-bit barrel shifter and associated logic optimized for field extraction, field insertion, and the BitBlt instruction.

The shifter is controlled by a 16-bit register ShC. To perform a shift operation, ShC is loaded in one of three ways discussed below with 14 bits of control information, and one of eight shift-and-mask operations is then executed in a subsequent instruction. Alternatively, (a limited selection of) shift controls may be specified in FF and BSEL concurrent with a shift; in this case, ShC is not modified. ASEL=7 causes a shift and ALUF[0:2] select the kind of masking.

The execution of a shift instruction (after ShC has been loaded in a previous instruction) proceeds as follows:

    ShC[2] selects between T and RM/STK for the left-most 16 bits input to the shifter; ShC[3] selects between T and RM/STK for the right-most 16 bits. Using the RisId or TisId FF decode in the same instruction allows Id to replace either T or RM/STK in the shift. This 32-bit quantity is then left-cycled by the number of positions (0-15) given by ShC[4:7]. When ShC[2] and ShC[3] are both 1, then the shifter left-cycles T; when both 0, RM/STK.
    In these cases it operates as a 16-bit cycler. When ShC[2] and ShC[3] are loaded with complementary values, then it left-cycles the 32-bit quantity R..T or T..R.

    The low order 16 bits of shifted data are placed complemented on A by the shift, and normal A source is disabled (except when the source for A is encoded in FF; see the ASEL section).

    ALUF[0:2] select one of eight mask operations (see below) and the first three ALUFM address bits are forced to 1, so that the ALU operation in either ALUFM 16₈ or ALUFM 17₈ can be performed. This must be a logical ALU operation using the shifted data on A and data on B because there is insufficient time to propagate carries for an arithmetic operation. The intent is that ALUFM 16₈ contain the control for the "NOT A" ALU operation normally desired, while ALUFM 17₈ is used by BitBlt and other opcodes that need computed ALU operations.

    ALU output passes to the masking logic. The mask operation determines which of two independent masks in ShC are applied to the data. LMask contains 0 to 15 ones starting at bit 0, RMask 0 to 15 ones starting at bit 15.
    The masked area(s) of ALU output corresponding to 1's in the mask are replaced either with zeroes or with corresponding bits from Md according to the shift-and-mask function selected. Replace-with-Md generates HOLD if Md isn't ready yet, and the timing for this is the same as Md onto B (i.e., data is never ready sooner than the second instruction after the Fetch_).

    Masked data is routed onto Pd, then sent to the destination specified by LC.

    Note: The Pd input multiplexor is used to carry out masking, so it is illegal to combine a shifter operation with an ALU shift in the same instruction.

Three functions load ShC: RF_A and WF_A treat A[8:15] as a Mesa field descriptor and transform the bits appropriately before loading ShC; they also load ShC[2:3] from A[2:3]. ShC_B allows an arbitrary value to be placed in ShC (used by BitBlt).

Microcode for the Mesa RF (Read Field) and WF (Write Field) opcodes is shown as an example of the use of the shifter. In these examples, α and β are the two operand bytes for the opcode, as discussed in "Instruction Fetch Unit." RF and WF both take a pointer from the top of the stack and add α to it as a displacement. RF fetches the word and pushes the field specified by β onto the stack; WF fetches the word, inserts into it a field taken from the rightmost bits of the word in the second position of the stack, then restores the word to memory.

    RF:   IFetch_Stack, TisId;             *Calculate the pointer. α replaces BR[MemBase] (MDS);
                                           *this value is then added to Stack to compute the
                                           *address for the pointer.
          Stack_Md, RF_Id;                 *IFU supplies β, the field descriptor
          IFUJump[0], Stack_ShiftLMask;    *Right-justify & mask the field, IFU to next instruction

    WF:   T_(IFetch_Stack&-1)+T, TisId;    *Start fetch of word containing field
          WF_Id, RTemp_T;                  *IFU supplies β, the field descriptor
          T_ShMdBothMasks[Stack&-1];
          IFUJump[0], Store_RTemp, DBuf_T;

The shift controls come directly from FF if ASEL=7 (a shift) and if BSEL = 4, 5, 6, or 7, selecting a constant.
This specifies complete shift control in the instruction which does the shift, so ShC doesn't have to be loaded in a previous instruction, and ShC isn't clobbered, so io tasks don't have to save and restore it. When BSEL controls a shift in this way, the B source is forced to be Q.

The mask operations are as follows:

Table 12: ALUF Shift Decodes

ALUF[0:2]*
0    ShiftNoMask
1    ShiftLMask        masked bits on the left-hand side of the word replaced with 0's
2    ShiftRMask        masked bits on the right replaced with 0's
3    ShiftBothMasks    masked bits on both sides replaced with 0's
4    ShMdNoMask        unused (falls out of decoding)
5    ShMdLMask         masked bits replaced with Md
6    ShMdRMask         masked bits replaced with Md
7    ShMdBothMasks     masked bits replaced with Md

*ALUF[3] selects the ALU operation in either ALUFM 16₈ or 17₈

ShiftLMask implements right shift and load-field operations; ShiftRMask implements left shift; ShiftBothMasks deposits the selected field into a word of zeroes; ShMdBothMasks deposits the selected field into data coming from memory; and ShiftNoMask implements various cycle operations.

Note: On a shift the ALU branch conditions apply to the unmasked ALU output.

Hold and Task Simulator

The hold and task simulators are provided for hardware checkout (programmers skip this section).

Hold&TaskSim_B loads HOLDSIM[0:7] from B[8:14]..0 and TASKSIM[0:6] from B[1:7]. HOLDSIM is a recirculating shift register in which the presence of a 1 in bit 7 causes HOLD two instructions later.
For example, Hold&TaskSim_200₈ will complete three instructions after the Hold&TaskSim_, HOLD the next cycle, and HOLD every seventh instruction (i.e., every eighth cycle) thereafter. Since this register cannot be loaded with all 1's and since its clocks are not disabled by HOLD, HOLD of infinite duration is impossible.

To disable this debugging feature, the register must be loaded with 0.

TASKSIM is a seven-bit counter which determines the number of cycles before a task wakeup occurs. The task selected for wakeup must be jumpered on the backplane (else no-op). Whenever TASKSIM is loaded with a non-zero value, it counts up to 177₈, then generates a wakeup request when the counter overflows to 200₈. The wakeup request remains true until TASKSIM is reloaded.

Control Section

The control section interfaces the mainframe to the baseboard microcomputer or Alto which controls it, as detailed in the "Dorado Debugging Interface" document. In addition, the control section stores instructions in 4k x 34-bit (+2 parity) IM ("Instruction Memory") and contains logic for sequencing through instructions and switching among tasks.

The current instruction is clocked into the MIR register at t0 and exported to the processor, memory, and IFU sections for decoding.
The control section itself decodes the JCN field, the BLOCK bit, and its own FF decodes (Wakeup, B_Link, B_RWCPReg, Link_B, TaskingOn, TaskingOff, BDispatch_B, BigBDispatch_B, Multiply, MidasStrobe_B, UseDMD, and branch conditions).

The control section also exports the task number via the Next bus, which somewhat after t-2 contains the task number that will execute an instruction at t0.

Figure 5 shows the overall organization of the control section. Figure 6 shows how branch control is encoded in JCN. Figure 7 shows the timing for regular instructions and for the multi-cycle TPC and IM read/write instructions.

Tasks

Dorado provides sixteen independent priority-scheduled tasks at the microcode level. Task 15 is highest priority, task 0 lowest. Task 15 (the "fault task") is woken by StkError and by memory map and data error faults. Tasks 1-14 provide processing functions for io controllers implemented partially in hardware, partially in firmware; the present assignment of these tasks to device controllers is given in the "Slow IO" chapter. Task 0 (the "emulator") implements instruction sets (Mesa, Alto, etc.). In the absence of io activity, task 0 (always awake) controls the processor.

Essentially, io devices are paired to tasks when built, and a device controller can assert a wakeup request for the task with which it is paired. A program cannot modify the assignment of controllers to tasks (although the hardware change for this is easy). Additional flexibility in this area is not thought to be worth additional hardware cost.

Each task has its own program counter and subroutine return link, stored in the (task-specific) TPC and TLINK registers when the task is inactive. TPC may also be treated as a memory, so program counters for tasks other than the current task can be read and written by a program.
This is discussed later in this chapter.

Task Switching

When device hardware requires service from a task, it activates its wakeup request line at t0. Wakeup requests are priority-encoded, and the highest priority request (BNT or "Best Next Task") is clocked at t2 and competes with the current task (CTASK) for control of the machine. If BNT is higher priority than CTASK, or if the current (non-emulator) instruction has BLOCK = 1, a task switch will take place; in this case, CTASK will be loaded from BNT at t4. This implies that the shortest delay from a wakeup request to the first instruction of the associated task is two cycles.

The 16 Wakeup[task] FF decodes allow any task to be woken, just as though a hardware device had activated its wakeup line. A minimum of two cycles elapses after the instruction containing Wakeup before the task executes its first instruction. The task responding to a Wakeup must not block sooner than the second instruction, or it will get reawakened.

When a task has been woken by Wakeup[task] or has executed one or more instructions and then deferred to a higher priority task, the fact that it is runnable is remembered in a Ready flipflop. The Ready flipflop is cleared only when the associated task blocks. In other words, there is no way to deactivate a task, after its Ready flipflop has been set, except by forcing it to execute an instruction that blocks.
The Wakeup[task] function must be executed with tasking off, if it is possible that the specified task might be waking up for some other reason (e.g., due to a wakeup request from an external device, or due to a wakeup issued by yet another task). Otherwise, the control section may get horribly confused, and the machine will hang in the same task forever.

An acceptable sequence is:

    TaskingOff;
    Wakeup[task];
    TaskingOn;

The baseboard and Alto controllers may also clear the Ready flipflops by another mechanism, discussed in "Dorado Debugging Interface".

The emulator has no Ready flipflop and cannot block; the BLOCK bit in the instruction is interpreted as StackSelect for the emulator.

Task switching may occur after every instruction unless explicitly disabled by the TaskingOff function. The TaskingOn function reverses the effect of TaskingOff. TaskingOff is "atomic"; an instruction containing TaskingOff will be held if a task switch is pending; the next instruction will be executed in sequence without any intervening task switches. TaskingOn is not immediately effective; at least two more instructions will be executed by the same task before task switching can occur.

It would be a programming error for a task to block with tasking off, but if it did, the block would fail, and it would continue execution.

It is illegal for a task to block in an instruction that might be held, if the wakeup line for the task might be dropped at t0 of the instruction. If this occurred, the instruction might inadvertently be repeated before the block occurred.

Remark

Multiple tasks seem better than a more conventional priority interrupt system because interference by input/output tasks is substantially reduced. As to the exact implementation, variations are possible. The current scheme requires more hardware than one in which the program explicitly indicates when a task switch is legal (as on Alto and D0).
However, because Hold may last for about 30 cycles, a reliance upon explicit tasking would result in inadequate service for high priority tasks.

Next Address Generation

This section gives a low-level view of jump control. Because the microassembler and loader handle details of instruction placement automatically, programmers need not struggle with the encodings directly. For this reason, programmers may wish to skim this section while concentrating on high-level jump concepts described in "Dorado Microassembler".

Read this with Figure 6 in front of you.

For the most part, instruction memory (IM) addressing paths are 16 bits wide, although only 12 bits are presently used; the extra width allows for future expansion to 13 or 14 bits, when sufficiently fast 4kx1 ECL RAMs are economically available; there are no plans to utilize the remaining 2 bits, but since nearly all hardware components in the control data paths are packaged 4/can, the extra two bits are almost free. Also, the 16-bit wide Link register can be used to hold full word data items.

The various registers and data paths that contain IM addresses are numbered 0:15, where bits 4:15 are significant for the 4k-word microstore, while the quadrant bits 2:3 are ignored. This numbering conveniently word-aligns the bits while also allowing for future expansion. The discussion below assumes a 4k-word microstore.

Dorado does not have an incrementing instruction-address counter.
Instead, the address of the next instruction is determined by modifying the current instruction address (CIA) in various ways. The Tentative Next Instruction Address (TNIA) is determined from JCN[0:7] in the instruction according to rules in Figure 6. TNIA addresses IM for the fetch of the next instruction unless a task switch occurs. If a task switch occurs, the program counter for the highest priority competing task (BNPC or "Best Next PC") addresses IM.

A 16k-word microstore is viewed as consisting of four 4k-word quadrants; each IM quadrant is viewed as containing 64 pages of 64 instructions. Values in JCN are provided for the following kinds of branches:

    Local branches to any of the 64 locations in the current page;

    Global branches to location 0 on any of the 64 pages of the current quadrant;

    Long branches to any location in the quadrant using the 8-bit FF field to extend JCN (normal interpretation of FF is disabled);

    Conditional branches to any of 14 even locations in the current page, if the selected condition is false, or to the adjacent odd location, if the condition is true (7 branch conditions are available);

    IFU jumps to a starting address supplied by the IFU; JCN selects any one of up to 4 entries in the starting address vector (this is motivated by an entry-vector scheme discussed in "Instruction Fetch Unit");

    read/write IM and read/write TPC, after which execution continues at .+1;

    Return to the address in Link;

Branch conditions may also be specified in FF, as discussed below.
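The quadrant/page/word view of IM addresses used in the branch list above can be sketched in Python. This is a minimal model under stated assumptions: addresses use the 0:15 numbering with bit 0 most significant, quadrant bits are 2:3, and the 64 pages of 64 instructions are assumed to occupy bits 4:9 and 10:15 respectively. The function names are hypothetical, and the sketch deliberately does not model the JCN encodings of Figure 6.

```python
def split_im_address(addr):
    """Split a 16-bit IM address into (quadrant, page, word).

    Assumed layout: quadrant = bits 2:3, page = bits 4:9,
    word = bits 10:15, with bit 0 the most significant bit.
    """
    quadrant = (addr >> 12) & 0o3
    page = (addr >> 6) & 0o77
    word = addr & 0o77
    return quadrant, page, word


def local_target(cia, word):
    """Target of a local branch: location `word` in the page holding cia."""
    return (cia & 0xFFFF & ~0o77) | (word & 0o77)


def global_target(cia, page):
    """Target of a global branch: location 0 of `page` in cia's quadrant."""
    return (cia & 0o30000) | ((page & 0o77) << 6)
```

Under these assumptions, a local branch from address 1234 (octal) to word 7 lands at 1207 (octal), and a global branch to page 55 lands at 5500 (octal).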
Several dispatches may also be specified in FF. These 'OR' bits into the branch address computed by the following instruction.

If IM is expanded to 16k words, branching from one quadrant to another will only be possible by loading the Link register with a 14-bit address and then returning; jumps, calls, and IFUJumps will be confined to the current 4k-word IM quadrant.

Remarks on JCN Encoding

JCN cleverly encodes in 8 bits almost as much programming flexibility as would be possible with an arbitrarily large and general field. The main disadvantage is that MicroD is needed to postprocess assemblies and place instructions.

The earliest prototype of Dorado used a 7-bit JCN encoding that had fewer global and conditional branch targets, so programming was harder and additional instructions had to be inserted in a few places. This was slightly worse than the 8-bit encoding, but it would have been feasible to stay with the 7-bit encoding and employ the bit thus saved for some other use in the instruction.

Local, global, and long branches are analogous, respectively, to local, page-zero, and indirect branches used on many minicomputers. However, Dorado scatters its global locations over the microstore rather than concentrating them in page zero; this is better than the minicomputer scheme for the following reason. During instruction placement, when a cluster of instructions is too large to fit on one page, a global allows it to be divided between two pages; but if all globals were in page zero, then page zero itself would quickly fill up. In other words, dispersing the globals is theoretically more powerful than concentrating them in page zero; because MicroD does all the tedious work of placing instructions, this theoretical advantage is made practical; minicomputers have not employed any program like MicroD, so they have used the less powerful but simpler page-zero scheme.

Local branches on Dorado are within a 64-word page, where minicomputers usually branch relative to the current PC.
Relative branching is probably more powerful, but it cannot be used on Dorado because of insufficient time for addition.

Long branches on Dorado use 4 bits of JCN in conjunction with the 8-bit FF field to specify any location in the 4k-word quadrant. Since BSEL never selects a constant in this case, an improvement on our scheme would have used 3 bits of JCN in conjunction with BSEL.0 and the 8-bit FF field; this would have freed 8 values of JCN to encode some other kind of branch. In addition, 5 of the 256 values of JCN are unused and 1 is a duplicate (see Figure 6 for the 5 unused decodes; the replicated decode is the Global call on the Local page). We have variant JCN decodings that correct these problems, but they were not ready when the design was frozen.

Conditional Branches

IM is organized in two banks, with odd addresses in one bank, even in the other. The address is needed shortly after t0, but the bank-select signal not until 15 ns after the address. For this reason conditional branches select between an even-odd pair of instructions (i.e., between the two banks) according to branch conditions that need not be stable until a little after t1.

Alternatively, a conditional branch may be encoded in FF in conjunction with any addressing mode except a long branch in JCN.
When this is done, the result of the branch test is ORed with TNIA[15].

This implies that for both FF-encoded and JCN-encoded branch conditions, the false target address is even and the true target is odd.

Hence, it is possible to conditionally branch using only JCN, while using FF for an unrelated function, or to encode a branch condition in FF while using any addressing mode in JCN. If branch conditions are encoded in both FF and JCN, the branch test results are OR'ed, providing further flexibility.

The branch condition encodings are:

Table 13: Branch Conditions

JCN[5:7]  FF  Branch Condition
0         60  ALU=0
1         61  ALU<0
2         62  ALUcarry'
3         63  Cnt=0&-1 (decrements count after testing)
4         64  R<0 (RM or STK, whichever is selected, not overruled by RIsId)
5         65  R Odd (RM or STK, whichever is selected, not overruled by RIsId)
6         66  IOAtten' (non-emulator) or ReSchedule (emulator)
          67  Overflow

ALU=0 and ALU<0 are the results of the last ALU operation executed by the current task. ALUcarry' (the saved carry-out of the ALU) and Overflow are the result of the last arithmetic ALU operation executed by the current task (ALU_A may be stored in ALUFM as either an arithmetic or logical operation, so programmers should be wary of smashing these branch conditions when ALU_A is used). These are saved in a RAM and may be frozen by the FreezeBC function for one cycle.
In other words, the branch conditions are ordinarily loaded into the RAM at t3, but if FreezeBC is present, then the RAM is not loaded and values from the previous instruction for the same task will apply.

The IOAtten' branch condition tests the task-specific IOAttention signal optionally generated by the io device associated with the current (non-emulator) task.

Remark on Target Pairs

The bank-select toggling trick, which allows branch conditions to be developed very late, is valuable. Without this trick, it would be necessary to choose between slowing the instruction cycle or restricting branch conditions to signals stable at t0. Neither of these alternatives is palatable.

A more traditional implementation of conditional branches would go to the branch address, if a condition were true, or fall through to the instruction at .+1, if it were false. This traditional scheme is never faster but is sometimes more space-efficient than the target-pair scheme because the target-pair requires a duplicated instruction for every instance of a conditional branch to a single target, which is fairly common. The traditional scheme does not allow DblGoto and DblCall constructs discussed in "Dorado Microassembler," but these are infrequent.

Subroutines and the Link Register

Dorado provides single-level subroutines by means of the (task-specific) Link register. A Call occurs on any instruction whose destination address is 0 mod 16 before any modification of TNIA due to branch conditions or dispatches. On a Call, Return, or IFUJump, Link is loaded with CIA+1.

Because Return loads Link with CIA+1, CoReturn constructs are possible.
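The call rule and the even/odd target pairing can be sketched in Python (this is a model, not Dorado microcode). The function names are hypothetical, and the sketch assumes addresses are plain integers whose least significant bit corresponds to TNIA[15]; it models only the decision, not hardware timing.

```python
def is_call(unmodified_target: int) -> bool:
    """A Call occurs when the target address, taken BEFORE any branch
    condition is OR'ed in, is 0 mod 16."""
    return unmodified_target % 16 == 0


def conditional_target(unmodified_target: int, branch_true: bool) -> int:
    """OR the branch test result into the low address bit (TNIA[15]).

    With an even unmodified target, the false case stays on the even
    address and the true case lands on the odd address of the pair.
    """
    return unmodified_target | (1 if branch_true else 0)


# Octal 300 is 0 mod 16, so a branch to it is a Call whichever way the
# condition goes; octal 302 is an ordinary jump target.
call_false = conditional_target(0o300, False)
call_true = conditional_target(0o300, True)
```

Note that is_call looks only at the unmodified target, which is why the true (odd) member of a target pair can still be entered by a Call.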
Because IFUJump also loads Link with CIA+1, the conditional exit feature discussed in the "Instruction Fetch Unit" chapter is possible.

CIA+1 is used rather loosely in discussion here; the actual value loaded into Link by a call or return is [(CIA & 177700₈) + ((CIA+1) & 77₈)]. In other words, a call or return in location 77₈ of any page loads Link with location 0 of that page.

Link may be loaded and read by programs, so deeper subroutine nesting is possible, if Link is saved/restored across calls.

The functions Link_B and B_RWCPReg and the B dispatch functions discussed below, all of which load Link from B, overrule a call. In other words, if there are conflicting reasons for loading Link, Link_B wins over Link_CIA+1.

The B_RWCPReg function (= Link_B, B_CPReg') is provided primarily for initialization from the baseboard computer and for use by the Midas debugging program. Since the CPReg register clock is
Since the CPReg register clock isasynchronous to the Dorado clock system, a Dorado microprogram that reads CPReg (e.g., to receiveinformation from the baseboard) must use some synchronization method to ensure that CPReg isstable during the cycle in which it is read.Note: it is illegal to use an ALU branch condition in the instruction after Pd_RWCPReg, if CPRegmight have been loaded during the cycle in which it is readthis might result in an unstable IMaddress being presented to the control store.Remark on Call/JumpDeciding between call and jump based on target address saves one bit in the instruction and costs little for thefollowing reasons. Instructions can be divided into three groups: those always jumped to, those always called,those for which Link can be smashed (i.e., "don't care" about call or jump), and those both jumped to andcalled.A realistic guess is that over half of all instructions will be "don't care"; namely, these will be executed at thetop level, not inside a subroutine, and the Link register will not contain anything of importance. Assemblylanguage declarations make this information available to MicroD.The hardware makes 1/16 of the locations in each page "call locations". It is estimated that this is somewhatmore than real programs will need, on the average (although we vacillated about whether 1/8 or 1/16 of thetargets should be calls).In each page, MicroD first places instructions that must be called or must be jumped to. Because there are somany "don't care" instructions, it is unlikely that either call or jump slots in a page will be exceeded.Consequently, it will nearly always be possible to complete allocation of the call and jump targets withoutoverflowing due to the call/jump restriction. After this "don't care" instructions fill in the remaining slots.The remaining situation, with which Dorado cannot cope, is an instruction both called and jumped to. Thiswould arise in a subroutine whose entry instruction closed a loop (uncommon). 
On Dorado, this situation requires duplicating the entry instruction, so it costs one location but no extra time.

Dispatches

Several FF decodes are dispatches which OR various bits with TNIA[8:15] during the following instruction. The dispatch bits must be stable by t2.

Dispatches are:

BigBDispatch_B  B[8:15] (256-way dispatch)
BDispatch_B     B[13:15] (8-way dispatch)
Multiply        OR's Q[14] into TNIA[14] (The value of Q[14] is captured in a flipflop at t2 of the instruction containing the Multiply function and is OR'ed into TNIA[14] during the next instruction for the same task.)

Example:

    BDispatch_T;
    Branch[300];    *branches to 300 OR T[13:15]

The two B dispatches load Link register from B, then OR appropriate bits of Link into TNIA during the next instruction for the task. Since Link is task-specific, this works correctly across task switching. The Q-bit is only loaded during a multiply, and tasks other than the emulator are not allowed to use the multiply function.

The decision between call and jump in the instruction after a dispatch is unaffected by dispatch bits; it depends only upon JCN. In other words, the instruction following a dispatch is a Call if its unmodified target address is 0 mod 16, else a jump.

It is possible to neutralize any bits in a dispatch by placing target instructions at locations with 1's in the neutralized bits.
In other words, a dispatch on B[8:10] could be accomplished by locating the 8 target instructions at IM locations whose low five address bits were 1, e.g. at 37₈, 77₈, 137₈, 177₈, 237₈, 277₈, 337₈, and 377₈, and by branching to 37₈ in the instruction after the BigBDispatch_B.

Note: Methods discussed later for resuming a program interrupted by a page fault do not permit continuation when a fault occurs between a dispatch and the following instruction; for this reason, programmers should ensure that no fault can possibly occur by holding for memory faults with _Md prior to or concurrent with the dispatch; also, stack operations that might overflow/underflow may not be used in the instruction after a dispatch.

Note: When the PC for another task is loaded using the LdTPC_ operation discussed later, any pending dispatch conditions for that task are cleared. The debugging program Midas does not clear pending dispatches, however, so it should be OK to put a breakpoint on the instruction after a dispatch or to single-step through a dispatch.

IFU Addressing

The IFU supplies ten bits of opcode starting address to the processor. During the last instruction of every opcode, exit to the next opcode is accomplished by IFUJump[n] (n = 0 to 3) which selects among four entry locations for the next opcode. The starting address supplied by the IFU is used for TNIA[4:13] and TNIA[14:15] are set to n. If the IFU is
If the IFU isunprepared, it supplies a trap address instead of a starting address, and control goes to theНЇОf╙pТ─Н##qН5pТНGОfЯНЇОbt НЇО^╛pТЫТЗq pП1НЇО\АqpТП4О\TsО\АpНЇОY⌠Н:ОVўs Н░Н:ОU Н░Н:ОS_Н░Т√Т≈П.ОRрОS_Н░ОQ╦ТіП#ТїП/Н░ОPWТП$НЇОMpН:ОJН:ОH6Н░НЇОDдТ ПCТ⌡НЇОBЫТхТиПAНЇОA.Т²ПXТ·НЇО?dТП6НЇО;РТпП%ТяП2НЇО:'ТСП<ТТНЇО8]ТПMНЇО4КТ╘П_НЇО3 Т┌Т┐ПVНЇО1UТ⌠П;Т■П#НЇО/▀ТТО.ЧsО/▀pО.ЧsО/▀pО.ЧsО/▀pО.ЧsО/▀pО.ЧsО/▀pО.ЧsО/▀pО.ЧsО/▀pО.ЧsО/▀pТУО.ЧsО/▀pНЇО-юТП%НЇО*NqpТ╘ПSНЇО(└Т└П:Т┘П#НЇО&╧ТнПVНЇО$НТ░Т▒ПOНЇО#$ТПMНЇО╡qpТ▓П2Т⌠П"НЇОГТґПWНЇОТ÷Т═ПCНЇОRТПBНЇО9t НЇОгpТоПNТпНЇОЭТ√ПWТ≈НЇО2Т╛ТґП@НЇОgТоП>ТпНЇО °Т░ПYТ▒ ьЇ U=Г]ІTDorado Hardware ManualControl Section14 September 198133nth location in a trap vector.IFUJump's always load Link with CIA+1. This is necessary to implement the followingconditional exit feature for opcodes.If an FF-encoded branch condition is true in the same instruction as an IFUJump, IFUadvance to the next opcode is disabled. This kludge allows an opcode with common anduncommon exit conditions to finish, for example, with IFUJump[2,condition]. If the conditionis false (common case), then the IFU advances normally to the next opcode, starting atlocation 2 of the entry vector. 
Otherwise (uncommon case), control continues at location 3 of the entry vector, but the IFU does not advance, so emulation of the current opcode can continue.

Utilization of IFUJump and conditional IFUJump is discussed in "Instruction Fetch Unit."

IFU trap addresses and other reserved locations in the microstore are as follows:

Table 14: Reserved Locations in the Microstore

Reason                  Locations  Comment
Reschedule request*     14-17      Indicates that some previous instruction executed the ReSchedule function.
IFUM parity error*      74-77      Indicates a hardware failure in the IFUM storage.
IFU not ready*          34-37      The instructions in this vector should contain IFUJump[n], waiting for the IFU to become ready.
IFU data parity error*  4-7        Parity wrong on data from cache.
IFU map fault*          0-3        The IFU buffers the fact of a map fault and completes all opcodes in the pipe ahead of the one experiencing the fault. Upon dispatch to the first instruction for the opcode affected by the fault, this trap occurs.
Midas Call command      7776
Midas Crash detect      7777

*IFU traps OR the 1's complement of the instruction set into bits 8:9 of the trap address, so actual trap locations for Reschedule, for example, are 14-17, 114-117, 214-217, and 314-317. The trap vector is 1 to 4 instructions long according to the IFUJump programming convention, as discussed in the "Instruction Fetch Unit" chapter.

IM and TPC Access

See figures 6 and 7.

IM is read and written by programs using a special decode of JCN in conjunction with the RSTK field of the instruction; TPC is also read and written using a special JCN decode. TaskingOff must be in force, and anything that might cause hold is illegal in the same instruction; hold is also illegal in the instruction after an IM or TPC read, when the data is accessed using B_Link.

It has been reported that IM_Md doesn't work because _Md causes hold at unexpected times.

After the read or write instruction, control passes to the next sequential instruction, i.e., to CIA+1 (with wrap-around at 64-word page boundaries).
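The page wrap-around can be illustrated with a short Python sketch. It assumes that "CIA+1" here follows the same formula given in "Subroutines and the Link Register" for the Link value, [(CIA & 177700₈) + ((CIA+1) & 77₈)]; the function and constant names are hypothetical.

```python
PAGE_MASK = 0o177700  # selects the 64-word page (upper address bits)
WORD_MASK = 0o77      # selects the word within the page (low six bits)


def cia_plus_1(cia: int) -> int:
    """Next sequential address, wrapping within the current 64-word page."""
    return (cia & PAGE_MASK) + ((cia + 1) & WORD_MASK)


mid_page = cia_plus_1(0o1234)     # ordinary increment inside a page
end_of_page = cia_plus_1(0o1277)  # last word wraps to word 0 of the same page
```

Under this assumption, an IM or TPC access placed in the last word of a page continues at word 0 of that same page rather than at the first word of the next page.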
CIA+1 also winds up in Link.

Note: The hardware does not actually load Link with the IM or TPC data; instead B_Link in the next cycle routes inverted data onto B using an alternate path. The Link register itself is smashed with CIA+1 as discussed above, and this value would be read (assuming it wasn't overwritten) in later instructions.

This implies that continuation from a breakpoint or program-interrupt halt on the instruction following an IM or TPC read (i.e., on the B_Link instruction) won't work correctly.

Total time for an IM or TPC read or write operation is 6 clocks (i.e., thrice as long as a normal instruction).

A 34 (+2 parity)-bit IM word is read as four 9-bit quantities. The read address is taken from Link. Data must be read from Link[7:15] in the instruction immediately after the IM read; this data is inverted; Link[0:6] contain 1's, so that when the entire word is 1's complemented the desired data will have leading 0's. The byte select is RSTK[2:3].

IM writes also take the write address from Link, 16 bits of data from B and 2 bits from RSTK; the half-word affected is also specified in RSTK.

Any task can read or write TPC for an arbitrary task other than itself (an attempt to set TPC of the running task is unpredictable). The task number is B[12:15], and data is taken from or written into Link.
The assembly language notations for these are RdTPC_B and LdTPC_B.

After RdTPC_B, the 16 bits of data in Link are 1's complemented.

Note: The dispatch-pending conditions for a task whose TPC is loaded by LdTPC_ are cleared, so LdTPC_ works even when that task has just executed a BDispatch_B or BigBDispatch_B.

Hold

Many events in the memory system, StkError and the hold simulator in the processor, and several IFU error conditions generate hold (the IFU error conditions cause a one-cycle hold iff an IFUJump occurs on the first cycle of the error). The control section itself forces hold when a task switch occurs concurrent with TaskingOff. This signal, clocked at t1, occurs when the current instruction cannot be completed. Its effect on the hardware is to suspend the current instruction, while completing parts of the previous instruction that have been pipelined into the current cycle. Approximately, it converts the current instruction into a Goto[.] while preserving branch conditions and some other state.

Higher priority tasks are not prevented from running when the current task is experiencing Hold.

Remark

The fact that the address of the next instruction is needed at t0, while Hold is not generated until t1, means that concurrence of Hold and BLOCK with a switch to a lower priority task produces an anomalous situation called "Next Lies". The hardware disables clocks to CIA, TPC, and MIR when this occurs, so that the current instruction is repeated.
This results in some hardware complications discussed in the "Slow IO" chapter, but programmers need not worry about it.

Program Control of the DMux

Dorado contains a large number of multiplexors called mufflers which allow a selected signal from a set of up to 2048 signals to be observed on a one-wire bus called the DMux. This provides a passive method by which the Baseboard section or the external Midas debugger can examine internal control signals and registers not otherwise observable.

The particular DMux signal is selected by shifting in an 11-bit address one bit at a time. Each board with mufflers contains a 12-bit address register that responds to the shifted address bits; the highest bit is ignored for the purposes of selecting the signal to be read. "Dorado Debugging Interface" discusses a clever generator algorithm that allows all 2048 signals to be read into a table in 2048+11 shift-read cycles.

In addition, the DMux address can also be executed as a control function. In this case the full 12-bit address determines what function is executed. This "manifold" mechanism is used to control power supplies, set clock rate, enable/disable error halt conditions, and test IM without involving other hardware.

The DMux facility can also be controlled directly by Dorado programs by means of the MidasStrobe_B and UseDMD functions.
Essentially, the DMux address mechanism is controlled externally by the Baseboard or by Midas operating through the Baseboard when Dorado isn't running, and by Dorado when Dorado is running.

The MidasStrobe_B function causes B[4] to be shifted out as an address bit. This takes three cycles, so the program must execute three more instructions before doing another MidasStrobe_B function. The DMux signal selected by the last 11 address bits shifted out is read on B[0] when the Pd_ALUFMEM function is executed.

The UseDMD function causes the current DMux address to be executed as a manifold operation.

The following subroutine reads the DMux signal selected by the address in T:

    Subroutine;
    ReadDMux:   Cnt_13S;
    RdDMuxLp:   MidasStrobe_T;                          *Shift out address in T[4]
                Noop;
                Noop;
                T_(T) lsh 1, Goto[RdDMuxLp,Cnt#0&-1];
                T_ALUFMEM;                              *T[0] returns selected DMux signal
                Return;

Memory Section

Dorado supports a linear 22-bit to 28-bit virtual address space and contains a cache to increase memory performance. All memory addressing is done in terms of virtual addresses; later sections deal with the map and page faults. Figure 8 is a picture of the memory system; Figure 9 shows cache, map, and storage addressing. As Figure 8 suggests, the memory system is organized into three more-or-less independent parts: storage, cache data, and addressing.

Inputs to the memory system are NEXT (the task that will control the processor in the next cycle) from the control section, subtask from io devices, Mar (driven from A or by the IFU), MemBase, B, the fast input bus, and an assortment of control signals.
Outputs are B, Md to the processor, the F/G registers for the IFU, the fast output bus (data, task, and subtask), and Hold.

The processor references the memory by providing a base register number (MemBase) and 16-bit displacement (Mar) from which a 28-bit virtual address VA is computed; the kind of reference is encoded in the ASEL field of the instruction in conjunction with FF[0:1]. Subsequently, cache references transfer single 16-bit words between processor and cache; fast io references independently transfer 256-bit munches between io devices and storage. There is a weak coupling between the two data sections, since sometimes data must be loaded into the cache from storage, or returned to storage.

The storage pipeline allows new requests every 8 cycles, but requires 28 cycles to complete a read. The state of the pipeline is recorded in a ring buffer called the pipe, where new entries are assigned for each storage reference. The processor can read the pipe for fault reporting or for access to internal state of the memory system.

Memory Addressing

Processor memory references supply (explicitly) a 16-bit displacement D on Mar and (implicitly) a 5-bit task-specific base register number MemBase. Subtask[0:1] (see "Slow IO") are OR'ed with MemBase[2:3] to produce the 5-bit number sent to the memory. MemBase addresses 1 of 32 28-bit base registers. The full virtual address VA[4:31] is BR[MemBase]+D. D is an unsigned number.

The 28 bits in BR, VA, etc. are numbered 4:31 in the discussion here, consistent with the hardware drawings. This numbering conveniently relates to word boundaries.

Note that although the VA path is 28 bits wide, limitations imposed by cache and map geometry limit usable virtual memory to only 2^22 or 2^24 words in most configurations, as discussed in "The Map" section later.

MemBase can be loaded from the five low bits of FF, and the FlipMemBase function loads MemBase from its current value xor 1.
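As a rough illustration of the address computation just described, the following Python sketch models VA = BR[MemBase]+D, the OR of Subtask[0:1] into MemBase[2:3], and FlipMemBase. All names are hypothetical. Two things are assumptions rather than statements from this manual: bit 0 is taken as the most significant bit of each field (so MemBase[2:3] carry weights 4 and 2), and the sum is truncated to 28 bits.

```python
VA_BITS = 28
VA_MASK = (1 << VA_BITS) - 1


def effective_membase(membase: int, subtask: int = 0) -> int:
    """OR the 2-bit subtask into MemBase[2:3] (weights 4 and 2)."""
    assert 0 <= membase < 32 and 0 <= subtask < 4
    return membase | (subtask << 1)


def virtual_address(br: list, membase: int, d: int, subtask: int = 0) -> int:
    """VA = BR[MemBase] + D, with D an unsigned 16-bit displacement.

    Truncating the sum to 28 bits is an assumption of this sketch.
    """
    assert 0 <= d < (1 << 16)
    return (br[effective_membase(membase, subtask)] + d) & VA_MASK


def flip_membase(membase: int) -> int:
    """FlipMemBase: load MemBase from its current value xor 1."""
    return membase ^ 1


# 32 base registers, all zero except register 5.
base_registers = [0] * 32
base_registers[5] = 0x1000
va = virtual_address(base_registers, 5, 0x20)
```

FlipMemBase toggles only the low bit, so a pair of adjacent base registers (for example 4 and 5) can be alternated with a single function.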
In addition, MemBase can be loaded from 0.MemBX[0:1].FF[6:7], where the purpose of the 2-bit MemBX register is discussed in "IFU Section." The IFU loads the emulator task's MemBase at the start of each opcode with a MemBX-relative value between 0 and 3.

The intent is to point base registers at active structures in the virtual space, so that memory references may specify a small displacement (usually 8 or 16 bits) rather than full 28-bit VA's. In the Mesa emulator, for example, two base registers point at local (MDS+L) and global (MDS+G) frames.

In any cycle with no processor memory reference, the IFU may make one. IFU references always use base register 31, the code base for the current procedure; the D supplied by the IFU is a word displacement in the code segment.

Programmers may think of Mar as an extension of A since, when driven by the processor, Mar contains the same information as A.

The base register addressed by MemBase can be loaded using BrLo_A and BrHi_A functions. VA is written into the pipe memory on each reference, where it can be read as described later. The contents of the base register are VA-D on any reference.

Processor Memory References

Memory references are initiated only by the processor or IFU. This section discusses what happens only when references proceed unhindered.
Subsequent sections deal with map faults, data errors, and delays due to Hold.

Processor references (encoded in the ASEL and FF[0:1] instruction fields as discussed in the "Processor Section" chapter) have priority over IFU references, and are as follows:

Fetch_      Initiates one-word fetch at VA. Data can be retrieved in any subsequent instruction by loading Md into R or T, onto A or B data paths, or masking in a shift operation.
Store_      Stores data on B into VA.
LongFetch_  A fetch for which the complete 28-bit VA is (B[4:15],,Mar[0:15])+BR[MemBase].
IFetch_     A fetch for which BR[24:31] are replaced by Id from the IFU. When BR[24:31] are 0 (i.e., when BR points at a page boundary), this is equivalent to BR+Mar+Id, saving 1 instruction in many cases. Note: the IFU does not advance to the next item of _Id for IFetch_, so an accompanying TisId or RisId function is needed to advance.
PreFetch_   Moves the 16-word munch containing VA to the cache.
DummyRef_   Loads the pipe with VA for the reference without initiating cache, map, or storage activity.
Flush_      Removes a munch containing VA (if any) from the cache, storing it first if dirty (emulator or fault task only).
Map_        Loads the map entry for the page containing VA from B and clears Ref; action is modified by the ReadMap function discussed later (emulator or fault task only).
IOFetch_    Initiates transfer of munch from memory to io device via fast output bus (io task only).
IOStore_    Initiates transfer of munch from io device to memory via fast input bus (io
task only).

(Inside the memory system, there are three other reference types: IFU reads, dirty cache victim writes, and FlushStore fake-reads that result from Flush_ references which hit dirty cache entries.)

The notation for these memory references has confused people who are just starting to write microprograms. The following examples show how each type of reference would appear in a microprogram:

    Fetch_T;                 *Start a fetch with D coming from T via Mar
    T_Md;                    *Read memory data for the last fetch into T
    Store_Rtemp, DBuf_T;     *Start a store with D coming from an RM
                             *address via Mar and memory data from T via B.
    PreFetch_Rtemp;
    Flush_Rtemp;
    IOFetch_Rtemp;
    IOStore_Rtemp;
    Map_Rtemp, MapBuf_T;     *Start a map write with D coming from an RM
                             *address (Rtemp) via Mar, data from T via B
    RMap_Rtemp;              *Start a map read with D coming from an RM
                             *address (Rtemp) via Mar.
    LongFetch_Rtemp, B_T;    *Start a fetch reference with
                             *VA = BR[4:31]+(T[4:15],,Rtemp[0:15]).
    IFetch_Stack;            *Start a fetch reference with Id replacing BR[24:31]
                             *and with D coming from Stack.
    IFetch_Stack, TisId;     *Start a fetch as above and also advance the IFU to the
                             *next item of _Id.

The tricky cases above are Store_, Map_, and LongFetch_, which must be accompanied by another clause that puts required data onto B. DBuf_ and MapBuf_ are synonyms for B_, and do not represent functions encoded in FF; these synonyms are used to indicate that the implicitly loaded buffer registers (DBuf on MemD and MapBuf on MemX) will wind up holding the data.

The encoding of these references in the instruction was discussed in the "Processor" section under "ASEL: A Source/Destination Control". The ten possible memory reference types have the following properties:

Fetch_, IFetch_, and LongFetch_

These three are collectively called "fetches" and differ only in the way VA is computed. In any subsequent instruction, memory data Md may be read. If Md isn't ready, Hold occurs, as discussed below.
If the munch containing VA is in the cache and the cache isn't busy, Md will be ready at t3 of the instruction following the fetch, with the following implications:

If Md is loaded directly into RM or T (loaded between t3 and t4), it can be read in the instruction after the fetch without causing Hold. This is called a deferred reference.

If Md is read onto A or B (needed before t2) or into the ALU masker by a shift (needed before t3), it is not ready until the second instruction after the fetch (Hold occurs if Md is referenced in the first instruction). This is called an immediate reference.

The above timing is minimum, and delays may be longer if data is not in the cache or if the cache is still busy with an earlier reference.

Md remains valid until and during the next fetch by the task. If a Store_ intervenes between the Fetch_ and its associated _Md, then _Md will be held until the Store_ completes but will then deliver data for the fetch exactly as though no Store_ had intervened.

Store_

Store_ loads the memory section's DBuf register from B data in the same instruction. On a hit, DBuf is passed to the cache data section during the next cycle. On a miss DBuf remains busy during storage access and is written into the cache afterwards.

Because DBuf is neither task-specific nor reference-specific, any Store_, even by another task, holds during DBuf-busy. However, barring misses, Store_'s in consecutive instructions never hold.
A fetch or _Md by the same task will also hold for an unfinished Store_.

PreFetch_

PreFetch_ is useful for loading the cache with data needed in the near future. PreFetch_ does not clobber Md and never causes a map fault, so it can be used after a fetch before reading Md.

IOFetch_

An IOFetch_ is initiated by the processor on behalf of a fast output device. When ready to accept a munch, a device controller wakes up a task to start its memory reference and do other housekeeping.

An IOFetch_ transfers the entire munch of which the requested address is a part (in 16 clocks, each transferring 16 data+2 parity bits); the low 4 bits of VA are ignored by the hardware. If not in the cache, the munch comes direct from storage, and no cache entry is made. If in the cache and not dirty, the munch is still transferred from storage. Only when in the cache and dirty is the munch sent from the cache to the device (but with the same timing as if it had come from storage). In any case, no further interaction with the processor occurs once the reference has been started. As a result, requested data not in the cache (the normal case) is handled entirely by storage, so processor references proceed unhindered barring cache misses.

The destination device for an IOFetch_ identifies itself by means of the task and subtask supplied with the munch (= task and subtask that issued IOFetch_). The fast output bus, task, and subtask are bussed to all fast output devices. In addition, a Fault signal is supplied with the data (correctable single errors never cause this fault signal); the device may do whatever it likes with this information.
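The IOFetch_ data-source rule above reduces to a small decision function. The Python sketch below is only a restatement of that rule; the function names and the string labels are hypothetical, and no timing is modeled.

```python
MUNCH_WORDS = 16  # a munch is 16 words, so the low 4 bits of VA are ignored


def munch_base(va: int) -> int:
    """Address of the first word of the munch containing VA."""
    return va & ~(MUNCH_WORDS - 1)


def iofetch_source(in_cache: bool, dirty: bool) -> str:
    """Where an IOFetch_ munch comes from.

    Only a munch that is both in the cache and dirty is supplied by the
    cache; every other case is served from storage.
    """
    return "cache" if (in_cache and dirty) else "storage"


source_for_miss = iofetch_source(in_cache=False, dirty=False)
source_for_clean_hit = iofetch_source(in_cache=True, dirty=False)
source_for_dirty_hit = iofetch_source(in_cache=True, dirty=True)
```

The clean-hit case is the one most easily overlooked: the data is present in the cache, yet the munch is still transferred from storage.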
More information relevant to IOFetch_ is in the "Fast IO" chapter.

IFU References

The F and G data registers shown in the IFU picture (Figure 11) are physically part of the memory system. The memory system fetches words referenced by the IFU directly into these registers. The IFU may have up to two references in progress at a time, but the second of these is only issued when the memory system is about to deliver data for the first reference.

An IFU reference cannot be initiated when the processor is either using Mar or referencing the Pipe; for simplicity of decoding, the hardware disables IFU references when the processor is either making a reference or doing one of the functions 120₈ to 127₈ (CFlags_A', BrLo_A, BrHi_A, LoadTestSyndrome, or ProcSRN_B); or 160₈ to 167₈ (B_FaultInfo', B_Pipei, or B_Config').

The IFU is not prevented from making references while the processor is experiencing Hold, unless the instruction being held is making a reference or doing one of the functions mentioned above.

Memory Timing and Hold

Memory system control is divided into three more or less autonomous parts: address, cache data, and storage sections. The storage section, in turn, has several automata that may be operating simultaneously on different references. Every reference requires one cycle in the address section, but thereafter an io reference normally deals only with storage, a cache reference only with the cache data section. Address and cache data sections can handle one reference per cycle if all goes well. Thus, barring io activity and cache misses, the processor can make a fetch or store reference every cycle and never be held.

If the memory is unready to accept a reference or deliver Md, it inhibits execution with hold (which converts the instruction into a Goto[.]
while freezing branch conditions, dispatches, etc.). The processor attempts the instruction again in the next cycle, unless a task switch occurs. If the memory is still not ready, hold continues. If a task switch occurs, the instruction is reexecuted when control returns to the task; thus task switching is invisible to hold.

In the discussion below, cache references are ones that normally get passed from the address section to the cache data section, unless they miss (fetches, stores, and IFU fetches), while storage references unconditionally get passed to storage (IOFetch_, IOStore_, Map_, FlushStore arising from Flush_ with dirty hit, and dirty-victim writes). PreFetch_ and DummyRef_ don't fall into either category.

Situations When Hold Occurs

A fetch, store, or _Md is held after a preceding fetch or store by the same task has missed until all 16 words of the cache entry are loaded from storage (about 28 cycles).

Store_ is held if DBuf is busy with data not yet handed to the cache data or storage sections. LongFetch_ (unfortunately) is also held in this case. Since DBuf is not task-specific, this hold will occur even when the preceding Store_ was by another task.

An immediate _Md is held in the cycle after a fetch or store, and in the cycle after a deferred _Md.

Because the task-specific Md RAM is being read t2 to t3 for the deferred _Md in the preceding cycle, and t0 to t1 for the immediate _Md in the current cycle, which are coincident, hold is necessary when the tasks differ.
Unfortunately, hold occurs erroneously when the immediate and deferred _Md's are by the same task.

Any reference or _Md is held if the address section is busy in one of the ways discussed below.

_Md is erroneously held when the address section is busy, an unfortunate consequence of the hardware implementation, which combines logic for holding _Md on misses with logic for holding references when the address section is busy.

B_Pipei is held when coincident with any memory system use of the pipe. Each memory system access uses the pipe for one cycle but locks out the processor for two cycles. The memory system accesses the pipe t2 to t4 following any reference, so B_Pipei will be held in the instruction after any reference. Storage reads and writes access the pipe twice more; references that load the cache from storage access the pipe a third time.

Map_, LoadMcr, LoadTestSyndrome, and ProcSRN_ are not held for MapBuf busy; the program has to handle these situations itself by polling MapBufBusy or waiting long enough, as discussed in the Map section.

Flush_, Map_, and DummyRef_ are not held until a preceding fetch or store has finished or faulted. The emulator or fault task should force Hold with _Md before or coincident with issuing one of these references, if it might have a fetch or store in progress.

In the processor section, stack overflow and underflow and the hold simulator may cause holds; in the control section TaskingOff or an IFUJump in conjunction with the onset of one of the rare IFU error conditions may cause one-cycle holds; there is also a backpanel signal called ExtHoldReq to which nothing is presently connected; this is reserved for input/output devices that may need to generate hold in some situation.
All of these reasons for hold are discussed in the appropriate chapters.

Address Section Busy

The address section can normally be busy only if some previous reference has not yet been passed to the cache data section (for a cache reference that hits) or to storage (for a storage reference, or a cache reference or PreFetch_ that misses). A reference is passed on immediately unless either its destination is busy or the being-loaded condition discussed below occurs.

The address section is always busy in the two cycles after a miss, or in the cycle after a Flush_, Map_, IOFetch_, or IOStore_.

Hardware note: This allows Asrn to advance; for emulator and fault task fetch and store misses, which do not use Asrn, this hold is unnecessary. Unfortunately, the display controller's word task finishes each iteration with IOFetch_ and Block, so many emulator fetches and stores will be held for one cycle when a high-bandwidth display is being driven. Asrn is the internal register that contains the pipe address for storage references.

There are six other ways for the address section to be busy:

(1) A cache reference or PreFetch_ that misses, or a FlushStore, transfers storage data into the cache.
At the end of this reference, as the first data word arrives, storage takes another address section cycle.

(2) The preceding cache reference hit but cannot be passed to the cache data section because the data section is busy transferring munches to/from storage (or to an io device if an IOFetch_ finds dirty data in the cache). Total time to fetch a munch from storage is about 28 cycles, but the cache data section is busy only during the last 10 of these cycles (9 for PreFetch or IOFetch_ with dirty hit), while data is written into the cache. The cache data section is free during the interim.

(3) The preceding storage reference, or cache reference or PreFetch_ that missed, has not been passed on to storage because the storage section is busy. Storage is busy if it received a reference less than 8 cycles previously, and may be busy longer as follows:

    successive cache references must be 10 cycles apart;
    successive write references must be 11 cycles apart;
    with 4k storage ic's, successive references must be 13 cycles apart.

(4) A cache write (caused by a miss with a dirty victim or FlushStore) ties up the address section until the storage reference for the write is started; this happens 8 cycles after the storage reference for the miss or FlushStore is started.
Note that the new munch fetch starts before the dirty victim store and that hold terminates right after the store is started.

(5) A reference giving rise to a cache write that follows any other cache miss will tie up the address section until the previous miss is finished.

(6) The address section is busy in the cycle after any reference that hits a cache row in which any column is being loaded from storage.

Any reference except IOFetch_, DummyRef_, or Map_ that hits a cache row in which any column is being loaded from storage remains in the address section until the BeingLoaded flag is turned off; i.e., for the first 19 of the 28 cycles required to service a miss, the reference is suspended in the address section; during the last 9 cycles of the miss, when the munch is transferred into the cache data section, the reference proceeds (except that a fetch or store will still be held because the cache data section is busy during these 9 cycles). This is believed to be very infrequent.

A more perfect implementation would suspend a reference in the address section only when the hit column, rather than any column in the row, was being loaded.
However, the situation is only costly when the suspended reference is by another task; since there are 64 rows, ~1.5% of all references will be held whenever any task is experiencing a miss. There is more discussion of this in the "Performance Issues" chapter.

References to storage arise as follows:

    a cache miss (from a cache reference or PreFetch_) causes a storage read;
    a cache reference or PreFetch_ miss with dirty victim also causes a storage write immediately after the read;
    a Flush_ which gets a dirty hit causes a FlushStore read reference which in turn causes a storage write of the dirty victim;
    every io reference causes a storage read or write;
    a Map_ causes a reference to storage (actually only the map is referenced, but the timing is the same as for a full storage reference).

The following table shows the activity in various parts of the memory system during a fetch that misses in the cache and displaces a dirty victim; the memory system is assumed idle initially and nothing unusual happens.

Table 15: Timing of a Dirty Miss

Time                                 Time
(Cycles)  Activity of Fetch          (Cycles)  Activity of Dirty-Victim Write
0         Fetch_ starts
1         in address section         2-9       in address section (wait for map)
                                     3-18      in ST automaton (generate syndrome, transport to storage)
2-9       in map automaton *         10-17     in map automaton *
7-14      in memory automaton *      15-22     in memory automaton *
14-21     in Ec1 automaton           22-29     in Ec1 automaton **
21-28     in Ec2 automaton           29-36     in Ec2 automaton **
27        _Md succeeds

* The map automaton continues
busy for two cycles after a reference is passed to the memory automaton because it is necessary for the Map storage chips to complete their cycle.

** The work of the dirty-victim write is complete after it has finished with the memory automaton, but it marches through Ec1 and Ec2 anyway for fault reporting.

STOP! The sections which follow are about the Map, Pipe, Cache, Storage, Errors, and other internal details of the memory system. Only programmers of the fault task or memory system diagnostic software are expected to require this information. Since there are many complications, you are advised to skip to the next chapter.

The Map

VA is transformed into a real address by the map on the way to storage. The hardware is easily modifiable to create a page size of either 256, 1024, or 4096 words and to use either 16k, 64k, or 256k ic's for map storage. The table below shows the virtual memory (VM) sizes achievable with different map configurations. However, the cache configuration limits VM size independently, as discussed later, and this limit may be smaller than the Map limit.

Table 16: Map Configurations

Map IC   Page    VM      Map Addressed
Size     Size    Size    By
2^14     2^8     2^22    VA[10:23]
2^14     2^10    2^24    VA[8:21]
2^14     2^12    2^26    VA[6:19]     requires 16k-word cache
2^16     2^8     2^24    VA[8:23]
2^16     2^10    2^26    VA[6:21]     requires 16k-word cache
2^16     2^12    2^28    VA[4:19]     requires 16k-word cache sans parity
2^18     2^8     2^26    VA[6:23]     2^18-bit ic's don't exist yet
2^18     2^10    2^28    VA[4:23]     2^18-bit ic's don't exist yet

Larger page sizes increase the virtual memory size limit.
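
The VM sizes in Table 16 follow from multiplying the number of map entries (one per bit of a map ic) by the page size. The Python sketch below reproduces that column; the function name and the row list are illustrative assumptions, with all sizes expressed as powers of two.

```python
def vm_size_log2(map_entries_log2: int, page_words_log2: int) -> int:
    """log2(VM size in words) = log2(map entries) + log2(words per page)."""
    return map_entries_log2 + page_words_log2


# (log2 of map ic size, log2 of page size) for each row of Table 16
TABLE_16_ROWS = [
    (14, 8), (14, 10), (14, 12),
    (16, 8), (16, 10), (16, 12),
    (18, 8), (18, 10),
]

for map_log2, page_log2 in TABLE_16_ROWS:
    vm_log2 = vm_size_log2(map_log2, page_log2)
    print(f"2^{map_log2} entries x 2^{page_log2}-word pages = 2^{vm_log2} words")
```

The cache limit is a separate constraint and is not modeled here.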
Since the 4k-word cache imposes a 2^25-word size limit (2^26 if the parity bit in the address section is converted into another address bit), the largest VM sizes are only achievable in conjunction with a 16k-word cache. Larger page sizes might reduce map and storage management overhead; our experience in this area is inconclusive but suggests that 4k-word pages would only be desirable with very large storage configurations.

Note that the physical storage size limit is unaffected by cache parameters, map RAM size, or page size because RP is large enough to address the largest possible storage configuration (4 modules using 2^18-bit MOS RAM components), even when the smallest page size is used; this maximum size is 2^24 words.

The cache handles virtual addresses, so the map is never involved in cache references unless they miss.

A consequence of virtual addresses in the cache is that it is illegal to map several virtual pages into the same real page (unless all instances are write-protected). This restriction prevents cache and storage from becoming inconsistent.

A map entry contains a 16-bit real page number (RP) and three flags called Dirty, Ref, and WP, which have the following significance:

WP      write-protects the page; a fault occurs if a write is attempted.
Dirty   indicates that storage has been modified; set by any IOStore_ or by a dirty-victim write; Store_ does not set Dirty.
Ref     indicates that the page has been referenced; set by any storage reference except Map_; cleared by Map_.

The combination WP=true with Dirty=true makes no sense, and encodes the Vacant state of the map entry.
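
The flag encoding above can be written as a small decoder. This is a Python sketch under stated assumptions: MapEntry and its method names are invented for illustration and do not correspond to hardware or microcode names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapEntry:
    rp: int      # 16-bit real page number
    wp: bool     # write-protect flag
    dirty: bool  # storage has been modified
    ref: bool    # page has been referenced

    def is_vacant(self) -> bool:
        """WP and Dirty both true is the otherwise meaningless combination that encodes Vacant."""
        return self.wp and self.dirty

    def is_write_protected(self) -> bool:
        """A real (non-vacant) page whose WP flag is set."""
        return self.wp and not self.dirty
```

Checking is_vacant() before interpreting WP or Dirty individually avoids treating a vacant entry as a write-protected dirty page.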
A map entry is vacant if it has no corresponding page in real memory.

Faults

Every storage reference causes a mapping operation. If mapping reveals something other than Vacant, the reference proceeds normally. Otherwise, the storage reference is aborted, and MapTrouble is reported as discussed later. There are two kinds of faults:

Page fault   reference to a vacant map entry (WP = true, Dirty = true)
WP fault     Store_ that misses, IOStore_, or dirty-victim write with WP true. (Dirty-victim WP faults should not occur if the map and cache are handled as proposed below.)

Writing the Map

Map_, which can only be encoded in an emulator or fault task instruction, is used to write the map; like other storage references, it returns previous map contents in the pipe, where they can be read.
For reasons discussed in "The Pipe" section later, Map_ should not be issued if a preceding fetch or Store_ might be in progress; normally issue a _Md to hold until preceding references cannot fault.

Map_ first writes B[0:15] and TIOA[0:1] into MapBuf (a buffer register on the MemX board) and turns on MapBufBusy in the pipe; 9 cycles later (barring delays) MapBuf has been written into the Map entry addressed by the appropriate bits of VA and MapBufBusy is turned off.

B[0:15] are the real page number, TIOA.0 is WP, and TIOA.1 is Dirty. Map_ zeroes Ref, and there is no direct way to write a map entry with Ref=1; a fetch, Store_, or PreFetch_ to the appropriate page after loading the map entry will set Ref=1. Note that if the real address space is less than 2^24 words (2^16 pages), high order RP bits are ignored during references though they are kept in the map and appear in the pipe.

Map_ never wakes up the fault task.

If previous map contents indicated Vacant or had a parity error, MapTrouble will be true in the pipe but not reported to the fault task. Quadword and syndrome in the pipe, not written by Map_, contain previous values.

For all programming purposes, Map_ is complete on the cycle when MapBufBusy is turned off; at this time, previous map contents are valid in the pipe entry. Polling MapBufBusy until it is false is the only way to find out when the pipe entry is valid.

Since Map_ never faults and doesn't use any pipe information clobbered by an overlapping reference, another reference may be started without waiting for Map_ to finish, unless the following reference is another Map_. Also, Map_ does not interfere with Md or DBuf, so its only interference with other kinds of references is its use of the private pipe entry (0 or 1) pointed at by ProcSRN. However, it is illegal to issue another Map_, LoadMCR, LoadTestSyndrome, or ProcSRN_ without waiting for Map_ to finish.
These functions (discussed later) share the MapBuf register with Map_; there is no Hold arising from MapBufBusy, so the microprogram must ensure that MapBuf is free when one of these functions is executed.

B is latched in MapBuf during t2 to t3 and TIOA[0:1] are clocked into MapBuf at t2 for all of these; then MapBuf is written into MCR, TestSyndrome, or ProcSRN at t3 or into the Map at t14 (if no delays). In other words, MapBuf is a buffer register for all registers on the MemX board that are loaded from B.

Reading the Map

Every storage reference causes mapping and returns old contents of the relevant map entry in the pipe. I.e., Ref and Dirty may change as a result of a reference; the old values appear in the pipe.

A ReadMap function accompanying Map_ prevents the map entry from being modified, so that old contents can be read from the pipe without also smashing the map entry.

Flushing One Page From the Cache

Any cache reference or PreFetch_ that misses or any IOFetch_ or IOStore_ sets Ref in the map; IOStore_'s set Dirty as well. If the victim for the miss or hit for the Flush_ is dirty, Ref and Dirty for its map entry also get set.
However, Store_ does not set Dirty in the map entry until that munch is chosen as victim.

For this reason, any calculation based upon Dirty must first validate the map Dirty bit by flushing associated cache entries, as discussed below.

In addition, almost any change to a map entry requires a flush, again because of problems with dirty cache entries. The following examples illustrate this point:

    Before changing RP, a flush prevents dirty victims from being written into the previous real page (if the old page had WP false).

    Before turning on WP, a flush prevents dirty cache entries from being written into the now write-protected page.

    Before turning off WP, a flush prevents multiple cache entries for a munch, one write-protected, the other not (the cache will not work correctly if there are multiple entries for a munch).

    Before sampling Ref, flushing is required so that subsequent references to the page will set Ref=1 and so that dirty munches in the cache will not erroneously set Ref=1 when they are displaced.

    Before clearing Dirty, a flush prevents dirty munches subsequently displaced from the cache from erroneously setting Dirty again.

To flush a 256-word page from the cache, 16 Flush_ references may be made, one to each munch of the page (64 with 1024-word pages). Flush_ invalidates any existing cache entry for the munch (and stores the munch if dirty).

This succeeds iff there are no anomalous multiple cache entries for a particular munch.
Multiple cache entries for a munch should never occur except prior to system initialization or when some of the debugging features are turned on in Mcr.

Flushing the Entire Cache

Depending upon what kind of storage management algorithm is used, it may be desirable to clear out the entire cache; for example, this might be useful before looping through all the map entries to sample Ref. It would be extremely expensive to do this with Flush_ one page at a time (2^16 Flush_'es for 1M words of storage). The cleverest method which we have thought of for doing this is as follows:

Designate 4 consecutive 256-word pages (64-row cache) or 16 consecutive pages (256-row cache) that contain vacant map entries; the munch VA's in these pages will span every row in the cache. Make one pass through the cache for each column; before each pass, load Mcr with UseMcrV true and McrV equal to the column (even though it is usually illegal to modify Mcr while the memory system is active, it should be safe to change these particular fields). Then do PreFetch_'es for all 64 or 256 munches in the designated pages; these PreFetches will all miss and map fault, leaving the selected column filled with vacant cache entries. After clearing all four columns, restore Mcr to its previous value. While this flush is going on, io tasks may continue to reference memory, but they will experience more misses and longer miss wait than usual.
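
The column-by-column procedure above can be outlined as a pair of nested loops. The Python sketch below is only a model: load_mcr, prefetch, and restore_mcr are hypothetical callbacks standing in for the microcode operations, and the cost figures (4 columns, 9 cycles per PreFetch_, 60 ns per cycle) are the ones quoted in this section.

```python
COLUMNS = 4              # cache columns cleared one pass at a time
CYCLES_PER_PREFETCH = 9  # approximate cost of each missing PreFetch_
NS_PER_CYCLE = 60        # Dorado cycle time assumed in the text


def flush_entire_cache(rows, load_mcr, prefetch, restore_mcr):
    """Fill every column with vacant entries; return the number of PreFetch_ references."""
    count = 0
    for column in range(COLUMNS):
        # Force the victim column for this pass (UseMcrV true, McrV = column).
        load_mcr(use_mcr_v=True, mcr_v=column)
        for munch in range(rows):
            # One PreFetch_ per munch of the designated vacant pages; each misses.
            prefetch(munch)
            count += 1
    restore_mcr()  # put Mcr back to its previous value
    return count


def flush_time_us(rows):
    """Estimated time for the whole flush, in microseconds."""
    return rows * COLUMNS * CYCLES_PER_PREFETCH * NS_PER_CYCLE / 1000.0
```

With a 64-row cache this issues 256 PreFetch_ references; with 256 rows, 1024.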
The total time for this algorithm is about 9 cycles/PreFetch or about 138 µs with 64 rows or 553 µs with 256 rows in the cache at 60 ns/cycle.

Map Hardware Details

The map and its control are on the MemX board. Physically, map storage consists of 21 16k, 64k, or 256k x 1 MOS RAM's; in addition to the 19 bits discussed earlier, there are a duplicate of the Dirty bit and an odd parity bit.

Dirty is duplicated so map parity won't change when both Dirty bits are set. Ideally Ref should also be duplicated, but it is not, and Ref is not checked by map parity. The parity written on Map_ is the exclusive-or of the two B byte parity bits and TIOA.0 (i.e., WP). Parity failure on any map read will cause MapTrouble and MemError and wake up the fault task when appropriate.

On a cache reference that misses or on an IOFetch_ or IOStore_, the map read starts at t4 and the real address is passed to storage at t14.

The MOS RAM's in the map require refresh, carried out like the storage refresh discussed later.

An Automatic Storage Management Algorithm

We envision for Mesa, Lisp, etc. an automatic storage manager that will pick pages in storage which have not been referenced recently for replacement by new ones. This manager will use the Ref bits in map entries to determine which pages have not been referenced for a long time.

The storage manager discussed here controls N pages, where N is some subset of all pages in storage; in general N will vary. A procedure called DeliverPage() returns RP for one of the N pages to the caller and removes that page from N; a procedure called ManagePage(RP,Age) adds a page to N.
A page returned by DeliverPage has been removed from the virtual space; pages accepted by ManagePage may or may not be vacant.

Entries for the N pages are sorted into 8 bins, such that entries in bin 0 have age 0, bin 1 age 1, etc. Whenever DeliverPage has been called N/8 times or after a specified time has elapsed, all N pages are aged, which means:

    (a) Entries originally in bin 7 wind up in bin 0 if they have been referenced, bin 7 if not referenced;
    (b) Entries in bin i (i # 7) wind up in bin 0 if referenced or bin i+1 if not referenced.

This aging is performed by first clearing the entire cache using the clever algorithm discussed earlier, then sampling and zeroing Ref.

DeliverPage first returns the RP of any page on the vacant queue. If the vacant queue is empty, it next scans entries on the disk write-complete queue; if one is found that has not been referenced in the interim, its map entry is cleared and its RP is returned; if referenced, it is moved to bin 0. If the disk write-complete queue is empty, entries in bin 7 are scanned; if this bin is exhausted, bin 6 is scanned, etc., until finally bin 0 is scanned. When an entry has been referenced, it is moved to bin 0; when unreferenced but dirty, it is put on the disk write queue; when unreferenced and clean, it is returned.

The caller of DeliverPage will frequently be a disk read or new page creation procedure.
It should complete its work and then call ReturnPage(RP,0) to restore the page to the storage manager. ReturnPage will put the page on the vacant queue, if it is vacant, or into bin 0.

Mesa Map Primitives

Basic Mesa mapping primitives are:

    Associate[vp,rp,flags] adds virtual pages to the real memory, or removes them if flags=Vacant.

    SetFlags[vp,flags] RETURNS oldValue: flags reads and sets the flags. If flags=Vacant, the page is removed from the real memory.

    GetFlags[vp] RETURNS [rp,flags].

These are defined as indivisible operations and are implemented trivially on a machine with no cache (e.g., Dolphin). For example, if a SetFlags clears Dirty and sets WP, the returned value of Dirty tells correctly whether the page has been changed, because no store into the page may occur between reading Dirty and setting WP.

One intended use of the primitives is illustrated by the following Mesa sequence for removing a virtual page from real memory:

    oldFlags _ SetFlags[v,WP];
    IF oldFlags.Dirty THEN WritePage[...]
    SetFlags[v,Vacant]

This sequence prevents the page from being changed during the write. Another possibility would be just to clear Dirty, and then to check it again after the write. This must be done properly, however, to avoid a race condition:

    WHILE (oldFlags_SetFlags[v, Vacant]).Dirty DO
        SetFlags[v, [Dirty: false]]; WritePage[...]; ENDLOOP

To avoid inconsistent map and cache entries, SetFlags[v, ...] must remove entries for page v from the cache. Unfortunately, since we don't want to make the cache removal process atomic, parts of the page already passed over by the removal process could be brought back into the cache before the process is complete. The implementation of the primitive must allow for this.

On Dorado it is, unfortunately, difficult to implement these primitives as indivisible operations because almost any change to map flags must be preceded by clearing all cache entries in the page.
However, it is unacceptable to do this with TaskingOff because the time required might be as long as 16*10 cycles with 256-word pages or 64*10 cycles with 1024-word pages (if every munch in the page is in the cache and dirty), which is too long. Consequently, io tasks will be active during the removal process, and one of them might move a munch back into the cache after it has been passed over by the removal process. For this reason, the present Mesa code flushes once with tasking on and then again with tasking off.

The implementation of SetFlags(v, ...) proceeds as follows (Associate is similar):

    Flush all cache entries for the page in question. If any entry is dirty, removal will cause a write and set Dirty in the map, as discussed earlier.

    Disable tasking.

    Flush all cache entries for the page again.

    oldFlags _ map[v].Flags

    If turning on WP: map[v].flags _ [WP: true, Dirty: false, Ref: false].
    If setting Vacant: map[v].flags _ [WP: true, Dirty: true, Ref: false].
    If turning off WP: map[v].flags _ [WP: false, Dirty: false, Ref: false].

    These are done with Map_ after which old data is retrieved from the pipe (possibly followed by PreFetch to set map[v].Ref true).

Note: These primitives do not support the complete cache flush discussed earlier; another primitive will probably be needed to do this. Also, we really want a primitive that will allow the flags to be sampled and Ref zeroed without changing the value of WP or Dirty.
And efficiency may demand primitives particularly tailored to the needs of whatever storage management algorithm is employed.

The Pipe

Information about each reference is recorded in the 16-word pipe memory. Pipe layout is shown in Figure 10, which you should have in front of you while reading this section. The processor reads the pipe with the B_Pipe0, ..., B_Pipe5 functions. You should note that Pipe0, 1, and 5 are read high-true, while Pipe2 and 3 are read low-true; Pipe4 contains a mixture of high and low-true fields; 150361 (octal) xor Pipe4' produces high-true values for all fields in Pipe4. The discussion in this section assumes that all low-true fields have been inverted.

It is illegal to do ALU arithmetic on pipe data (not valid soon enough for carry propagation), and B_Pipei is illegal in the same instruction with a reference because Hold won't be computed properly.

The EmulatorFault, NFaults, and SRNFirstFault stuff in Pipe2, which duplicates what B_FaultInfo would read back, is not part of the pipe, although it is read by B_Pipe2'; B_Pipe2' is simply a convenient decode for reading it back. This will be discussed in the section on fault handling, not here. Similarly, Dirty, Vacant, WP, BeingLoaded, NextVictim, and Victim stuff in Pipe5 is not part of the pipe and is read back by B_Pipe5 purely for decoding convenience. This information, used primarily for debugging, is discussed later.

The Task, SubTask, VA, and cache control stuff in Pipe0, 1, 2, and 5 is used both internally by the memory system and externally by the processor.
Map and error stuff in Pipe3 and 4 is solely for memory management and diagnostic activities carried out by the processor.

Two main problems in dealing with the pipe are:

    Finding the pipe entry for a particular reference;
    Knowing when various bits become valid.

How the Pipe Is Addressed

System microcode is expected to use the pipe in only two situations: fault handling by task 15 (the "fault task") and reading map or base registers by task 0 (the "emulator"). Other tasks will not read the pipe. This rigid view of how the pipe will be used during system operation has motivated the implementation discussed below.

Pipe entries are addressed by 4-bit storage reference numbers, or SRNs, assigned to each storage reference. All task 0 and task 15 references except PreFetch_ with miss (and implicit FlushStore and Victim references) use the SRN contained in ProcSRN exclusively; all other references share SRN's 2 to 15, which form a ring buffer addressed by an invisible register called ASRN.

To read a pipe entry, first ProcSRN_B addresses the pipe entry, then the contents of that entry are read with B_Pipei. In system microcode, the emulator is expected to keep the value 0 in ProcSRN to avoid smashing the ring buffer on references; if the fault task needs to make a reference, it will normally load ProcSRN with 1 and use that SRN for the reference; the fault task will manipulate ProcSRN however it likes to examine the pipe but always restore it to 0 before blocking; other tasks will not use ProcSRN.
This implementation is welded to the assumption that only the fault task will probe the pipe when io tasks are running.

To io task references and emulator PreFetch_'es that miss, the cache address section's SRN, called ASRN, is assigned at t2. ASRN will be advanced to the next ring value iff the reference starts the map. In all other cases ASRN remains unchanged and is used by the next reference as well.

A reference starts the map unless it is a DummyRef_, a cache reference or PreFetch_ that hits, or a Flush_ that misses or gets a clean hit. A convenient way to guarantee that the map is started without worrying about the contents of the cache is to do a Map_ in the emulator or an IOFetch_ in any other task. The reasoning behind this treatment of ASRN is explained in the section on fault reporting.

Tasks 1 to 14 generally cannot find out the SRN for their last reference. Even if this were determined somehow by polling all the pipe entries, there would be no assurance that, meanwhile, a higher priority task didn't clobber the pipe entry.

Because of its single pipe entry, the emulator must wait for an earlier reference to finish or fault before starting another. Of all emulator references, only a fetch, Store_, or PreFetch_ might fault. However, PreFetch_ doesn't use the private pipe entry, so only a preceding fetch or Store_ might still be in progress when a new reference is issued.
If the new reference is another fetch or Store_, it will hold until the preceding one finishes (no problem). Hence, the only restriction imposed by the private pipe entry is that the emulator must cause hold with _Md before issuing Map_, Flush_, or DummyRef_, if a fetch or Store_ might still be in progress.

Timing constraints do not permit generating Hold in the above case. It has been observed that issuing Map_ without holding for a previous Store_ to finish will result in infinite DBufBusy (i.e., infinite Hold), so do not fail to issue _Md before or concurrent with Map_ or RMap_.

When the Pipe is Accessed

Conceptually, the pipe is three different memories. First, VA, task, subtask, and cache control bits in Pipe0, 1, 2, and 5 are written during the reference. Next, the 20 bits of map information in Pipe3 and Pipe4 are written following the map read-write (if any). Finally, the error correction-detection stuff in Pipe4 is written following the storage read (if any). The memory system needs one cycle for each of these accesses.

However, the hardware treats the pipe as only two separate memories internally, or as only a single memory for purposes of holding the processor. In other words, within the memory system Pipe0, 1, 2, and 5 may be accessed by one part of the pipeline, while another part independently accesses Pipe3 and 4. But processor accesses by B_Pipei are held if the memory system wants any part of the pipe. Worse, the memory system uses the pipe between even clocks (t0 to t2), the processor between odd clocks (t1 to t3), so the processor is locked out for two cycles during each of these intervals.

Programs can safely read Pipe0, Pipe1, Pipe2, or Pipe5 (i.e., task, subtask, VA, and cache control stuff) in the cycle after any reference, since these are updated at the end of the cache address section cycle.
B_Pipei in the cycle after a reference will hold for one cycle while the memory system uses the pipe.

Values in a pipe entry are not reset at the onset of a reference and Pipe3 and Pipe4 are not written at all unless storage is accessed. Consequently, Pipe3 and Pipe4 may refer to a previous reference *Caution*.

The control bits in Pipe2' and Pipe5, used by the memory system, also indicate (to the fault task) what kind of reference is described in the pipe, as follows:

    CacheRef      a fetch or Store_
    Store'        Store_'
    IFURef        IFU fetches
    RefType       distinguishes read, write, Map_, and other references
    FlushStore    dirty victim write triggered by Flush_
    ColVic        cache column of a hit, or of the victim on a miss

DummyRef_ finishes immediately and only VA in Pipe0 and Pipe1 and the stuff in Pipe2 are relevant. For Flush_, cache information in Pipe5 is also valid. Flush_ finishes immediately because the resulting FlushStore and dirty-victim write references (if any) are started in ring-buffer pipe entries.

Programs can read map stuff (Pipe3 and Ref, WP, Dirty, MapTrouble, and MemError in Pipe4) as soon as that part of the reference is complete. For Map_, completion of the map read is coincident with MapBufBusy going false, determined by polling. For a fetch or store, there is no way to distinguish completion of the map read from completion of the entire reference.
Consequently, Pipe3 and Pipe4 are normally read by doing _Md (which holds for completion), then reading the pipe.

For IOFetch_, IOStore_, and PreFetch_ there is no way to tell when the reference has finished, except by waiting longer than the memory can possibly take to complete the reference.

IOStore_'s and dirty victim writes zero the Syndrome and EcFault fields in Pipe4. Hence, the only reference that leaves junk in these bits is Map_; the fault task can distinguish pipe entries for Map_ by means of the RefType field.

All data in Pipe0, 1, 2, and 5 except FlushStore and ColVic are written at t3, and can be read immediately after a reference. However, FlushStore and ColVic are written at t4. Ordinarily, this would mean that their values could not be read safely; however, since B_Pipei is held in the cycle after a reference, the values will always be ok.

In the best case, map information in Pipe3 and Pipe4 will be loaded at t14, fault and error corrector information in Pipe4 at t48.

Faults and Errors

Remember that high-true values for all fields in the Pipe are used in the following discussion.

Errors

Several events cause memory errors from which the system does not recover. Errors halt the processor if the MemoryPE error is enabled (see "Error Handling"). If MemoryPE is disabled, the program will continue without any error indication.
MemoryPE conditions are:

    Byte parity errors from the cache data memory (checked on write of a dirty victim, not on _Md or IFU reads); the processor checks Md parity (see "Error Handling") and the IFU checks F/G parity;
    Byte parity errors from fast input bus;
    Cache address memory parity errors.

Faults

Other events cause faults. A fault condition is indicated in the MapTrouble, MemError, and EcFault fields of Pipe4 when it occurs; in addition, the fault task is woken to deal with the situation unless NoWake is true in Mcr. The encoding of the various errors is as follows:

Table 17: Fault Indications

    Kind of Error       Name      MapTrouble   MemError   EcFault
    Map parity error    MapPE     1            1
    Page fault          PageFlt   1            0
    Write-protect       WPFlt     1            0
    Single error        SE        0            0          1
    Double error        DE        0            1          1

In the above table, WPFlt and PageFlt have the same encoding; these must be distinguished by means of the Store' bit in Pipe5 and the WP bit in Pipe4; WPFlt can only occur for Store_, IOStore_, or dirty-victim stores that encounter WP true.

MapTrouble might be true and reported to the fault task on a fetch or store that misses or an IOFetch_, IOStore_, FlushStore, or dirty-victim write. Flush_ and DummyRef_ never cause MapTrouble. Map_, PreFetch_, or IFU fetches might record MapTrouble in the pipe but never wake the fault task.
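
The decoding that Table 17 asks of the fault task can be sketched as follows. This is a hedged illustration, not the fault task's actual microcode: the function and argument names are hypothetical, the inputs are assumed to be already high-true, the table rows are assumed to read MapPE = 1,1; PageFlt = 1,0; WPFlt = 1,0; SE = 0,0,1; DE = 0,1,1, and the rule used to separate WPFlt from PageFlt (a store that encountered WP true) is one reading of the paragraph above.

```python
def classify_fault(map_trouble, mem_error, ec_fault, is_store=False, wp=False):
    """Name the fault implied by the Pipe4 fault fields (high-true inputs)."""
    if map_trouble:
        if mem_error:
            return "MapPE"
        # WPFlt and PageFlt share an encoding; assume a store that met
        # WP true is the write-protect case, anything else a page fault.
        return "WPFlt" if (is_store and wp) else "PageFlt"
    if ec_fault:
        return "DE" if mem_error else "SE"
    if mem_error:
        # Combination not listed in Table 17.
        return "unlisted"
    return None
```

For example, classify_fault(1, 0, 0, is_store=True, wp=True) names a write-protect fault, while the same fields on a fetch name a page fault.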
Map faults on IFU fetches are reported instead to the IFU, which buffers the fault indication until an IFUJump occurs to an opcode with at least one instruction byte in the word affected by the map fault; then a trap occurs, as discussed in "Instruction Fetch Unit".

In system microcode, we expect a WPFlt and PageFlt due to IOFetch_, IOStore_, FlushStore, or a victim write to indicate a programming error; however MapPE might occur. Note that if any kind of MapTrouble occurs on a storage write (i.e., on an IOStore_, FlushStore, or victim write), storage is not modified and contains the old value; however, the map's Dirty bit will be true, even though the storage write has not completed.

SE and DE may occur on any cache reference or PreFetch_ that misses or on an IOFetch_. Map_, IOStore_, DummyRef_, and Flush_ never cause these errors. Also note that fault task wakeup on an SE requires not only NoWake false but also ReportSE true in Mcr; the fault indication transmitted with the munch for an IOFetch_ is set only for DE, never for SE.

Unlike map faults, data errors on IFU fetches and PreFetch_'es are reported to the fault task.
This must be done for DE's, which are fatal; for corrected SE's, the fault causes no disruption to the program because the fault task, after logging the failure, simply lets the task that faulted continue.

Remarks

The careful scheme in which ASRN is advanced only for storage references, and faults are reported in precise order, is essential. If faults were reported out of sequence, then the fault task might see Pipe0 to Pipe2 stuff inconsistent with Pipe3 and Pipe4 error indicators for a previous loop through the ring buffer.

The hardware must not and does not report MapTrouble until the end of the pipeline. If this were not true, then an SRN might report MapTrouble before its predecessor reported SE or DE; this could screw up the fault task.

In tasks other than the emulator, map faults will probably represent programming errors. In the emulator, page-not-in-memory and write-protect faults are expected, and the fault task will trap the emulator to a fault-handling Mesa program.
Information saved by the trap microcode must be sufficient to continue the faulted opcode at the instruction that faulted.

The B_DBuf FF decode permits the fault task to retrieve data being written when a Store_ faults.

Error Correction Faults

For error correction purposes, munches are divided into four quadwords, each containing 64 data and 8 check bits.

At the end of a storage read, the hardware indicates DE after a double error or SE after a single error, as discussed earlier. The SE or DE indication is unambiguous assuming at most two bits in error in any 64-bit quadword; for an odd number of errors greater than 2, the hardware erroneously reports an SE; for an even number of errors greater than 2, DE is reported. If several quadwords in a munch suffer errors, the hardware reports the first DE, if any, or the last SE, if no DE's.

Error correction can be enabled/disabled by the LoadTestSyndrome function discussed later; when enabled, the hardware will complement (= correct) any SE; for DE's the hardware does not modify any bits from storage.

The absolute address of the quadword containing the reported error is RP[0:15]..VA[24:27]..quadword[0:1] with 256-word pages, or RP[2:15]..VA[22:27]..quadword[0:1] with 1024-word pages. I.e., the word address of the first word in the quadword would be these 22 bits with two low-order zeroes appended.

SE and DE are derived from the 8-bit syndrome field in Pipe4. Syndrome = 0 means no error; neither DE nor SE should be true in this case. Syndrome non-0 with an odd number of 1's should have SE indicated.
Syndrome non-0 with an even number of 1's or an invalid word code (discussed below) should have DE indicated.

See Figure 11 for the correspondence between syndrome and bits within the quadword.

For SE's, syndrome specifies exactly which of the 64 data bits or 8 check bits was in error. If syndrome has a single one in it, then the corresponding checkbit was in error. When syndrome contains more than one 1, then syndrome[4:6] indicate which word in the quadword suffered the error as follows:

    word 0    011
    word 1    101
    word 2    110
    word 3    111

The other four values of syndrome[4:6] are impossible for an SE and are reported as a DE. Syndrome[0:3] indicates the bit position within that word; unfortunately these bits are reversed, so that the bit number is given when the bits are taken in the order 3, 2, 1, 0. Syndrome[7] is the parity of the syndrome, and a double error is indicated by a non-zero syndrome having even parity.

Storage writes leave garbage in the EcFault and Syndrome fields of the Pipe; the fault task must distinguish these cases by means of the RefType field in Pipe2.

As discussed below in the "Testing" and "Initialization" sections, TestSyndrome is xor'ed with check bits that would otherwise be written on storage writes. This means that Syndrome-of-read equals TestSyndrome-of-write is an exact indication of no-error.
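
The syndrome rules given in this section (zero, odd or even number of 1's, word code in syndrome[4:6], reversed bit number in syndrome[0:3]) can be collected into a small decoder. This is a sketch under assumptions, not a description of the hardware: the function names are hypothetical, Syndrome[0] is assumed to be the most significant bit of the 8-bit field, and "taken in the order 3, 2, 1, 0" is read as Syndrome[3] being the most significant bit of the bit number. Figure 11 remains the authority.

```python
WORD_CODES = {0b011: 0, 0b101: 1, 0b110: 2, 0b111: 3}  # syndrome[4:6] -> word


def syn_bit(syndrome, i):
    """Syndrome[i], assuming bit 0 is the leftmost of the 8 bits."""
    return (syndrome >> (7 - i)) & 1


def decode_syndrome(syndrome):
    """Classify an 8-bit syndrome following the rules in the text."""
    if syndrome == 0:
        return ("none",)
    ones = bin(syndrome).count("1")
    if ones % 2 == 0:
        return ("DE",)
    if ones == 1:
        # A single 1 marks the corresponding check bit as the one in error.
        index = next(i for i in range(8) if syn_bit(syndrome, i))
        return ("SE", "check", index)
    code = (syn_bit(syndrome, 4) << 2) | (syn_bit(syndrome, 5) << 1) | syn_bit(syndrome, 6)
    if code not in WORD_CODES:
        # The four other word codes are reported as a double error.
        return ("DE",)
    bit_number = ((syn_bit(syndrome, 3) << 3) | (syn_bit(syndrome, 2) << 2)
                  | (syn_bit(syndrome, 1) << 1) | syn_bit(syndrome, 0))
    return ("SE", "data", WORD_CODES[code], bit_number)
```

A fault-logging routine could call decode_syndrome on the high-true Syndrome field and record the word and bit number for each corrected SE.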
However, the hardware always reports non-zero syndrome as an error, as discussed above, regardless of what's in TestSyndrome.

Dirty is set in the cache after a Store_ that misses, despite any fault, so when that munch is chosen as victim, it will be written back into storage. Consequently, if the fault task attempts recovery from a double error on a Store_, it may have to clear the cache address section's Dirty bit for the munch using the tricky sequences discussed later.

Storage

Storage is organized into modules consisting of two identical boards per module. The modules appear in the chassis as shown in Figure 2. Depending on whether 16k-bit or 64k-bit IC's are used, a module stores 256k or 1m 64-bit (+8 check bit) quadwords. A Dorado can have up to 4 modules, for a maximum of 16m words. Every module must be the same size; it is illegal to mix module sizes.

The module in slot 0 supplies the first quarter of real memory; slot 1, second quarter; slot 2, third quarter; and slot 3, fourth quarter. In other words, real memory addresses are not interleaved among modules and the address range covered by a particular module cannot be controlled by the firmware.

The B_Config' function (Figure 10) returns M0, M1, M2, and M3 which are true only when a module is plugged into the corresponding storage board slot. ChipSize indicates what size ic's are used on the storage boards.
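
The storage capacities quoted above follow from simple arithmetic, sketched here as a check. The names are hypothetical; the only figures used are the ones in the text (256k or 1m quadwords per module, 64 data bits per quadword, 16-bit words, up to 4 modules).

```python
K = 1024
WORDS_PER_QUADWORD = 64 // 16  # 64 data bits per quadword, 16-bit words

# Quadwords per module for each IC size named in the text.
QUADWORDS_PER_MODULE = {"16k": 256 * K, "64k": 1024 * K}


def module_words(chip_size):
    """Words of data held by one module built from the given IC size."""
    return QUADWORDS_PER_MODULE[chip_size] * WORDS_PER_QUADWORD


def system_words(chip_size, modules=4):
    """Words of data in a machine with the given number of equal modules."""
    if not 1 <= modules <= 4:
        raise ValueError("a Dorado holds 1 to 4 storage modules")
    return modules * module_words(chip_size)
```

With 64k-bit IC's and four modules this gives 16m words, the maximum stated above; with 16k-bit IC's the same four slots give 4m words.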
The memory system automatically adjusts itself to operate according to the IC size in use on the storage boards.

When 256k x 1 MOS storage ic's become available, we plan to replace the 4k and 16k wires on the backplane by an extra address wire and a 256k wire; at this time we will lose the ability to handle 4k x 1 and 16k x 1 ic's and the hardware will allow either 64k or 256k storage ic's to be used.

MOS RAMs used on storage boards (and in the map) must be refreshed at regular intervals, else they drop data. This occurs during refresh references once every 16 µs. Every MOS RAM on every storage board participates in every refresh reference, and one row of data is refreshed each time. This means that 64 (4k RAMs), 128 (16k RAMs), or 256 (64k RAMs) refresh references are required to refresh all data, so the refresh period is 2 or 4 ms; the specification on both 16k and 64k RAMs is a 2 ms refresh period at the maximum operating temperature (85° C). The dominant leakage term is exponential in temperature, so the refresh period can be doubled for each 5° C drop in operating temperature. Because the specification is conservative and because we have no intention of operating anywhere near 85° C, a 4 ms refresh period should be adequate.

The time for each refresh reference is 8 cycles (13 cycles with 4k-bit RAMs), same as normal references. Refresh hardware competes for storage access with the cache data section and fast io references.
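
The refresh figures can be checked with a short calculation. This is a sketch under one loudly stated assumption: the interval between refresh references is taken to be 16 microseconds, the value that reproduces the 2 ms and 4 ms periods quoted in the text. The function names and the form of the temperature rule (period doubling per 5° C drop from the 2 ms, 85° C specification) are an illustrative reading of the paragraph above.

```python
ROWS_TO_REFRESH = {"4k": 64, "16k": 128, "64k": 256}  # from the text
REFRESH_INTERVAL_US = 16  # assumption: one refresh reference every 16 microseconds


def refresh_period_ms(chip_size):
    """Time to refresh every row once, in milliseconds."""
    return ROWS_TO_REFRESH[chip_size] * REFRESH_INTERVAL_US / 1000.0


def allowed_period_ms(temp_c, spec_ms=2.0, spec_temp_c=85.0, step_c=5.0):
    """Longest refresh period if it may double for each step_c drop below spec."""
    return spec_ms * 2 ** ((spec_temp_c - temp_c) / step_c)
```

Under these assumptions 16k RAMs are refreshed about every 2 ms and 64k RAMs about every 4 ms, and the 4 ms period is within the doubled allowance at 80° C or below.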
During the first 8 µs of a 16 µs period, refresh defers to normal references; during the last 8 µs, it preempts normal references.

The Cache

The physical cache structure consists of 256 entries in an array of 64 rows by 4 columns. Each entry holds 15 address bits, a parity bit for the address bits, four flag bits, and one munch of data (= 256 data bits + 32 check bits). Hence, the cache holds a total of 4k words of data.

The address section is implemented with 256-word RAM's, but only 64 words are presently used. The data section uses 1kx1 RAM's for storage. When sufficiently fast 4kx1 ECL RAM's become available, we plan to use them in the cache data section and utilize all 256 words in the address section. In this case, the cache geometry will be 256 rows by 4 columns (16k words in the data section).

The cache address section stores 4 flag bits discussed below, 15 VA bits, and 1 parity bit. The way the VA bits are assigned depends upon whether or not 4k x 1 ECL RAM's are used in the cache data section. VA[7:19] are stored in the address section for all configurations. Two other bits are either VA[5:6] or VA[20:21]; VA[5:6] are used with 4k ic's in the cache data section (VA[20:21] then appear in the row address of the cache, so they don't have to be stored). The hardware is also arranged so that the parity bit may be replaced by VA[4].

In other words, the cache initially implements a 2^25-word virtual memory with provision for expanding this to 2^27 words when 4k x 1 RAM's are available or to 2^28 words at the cost of eliminating the parity bit in the address section. However, the map organization also limits virtual memory, probably to a smaller size than the cache limit, as discussed earlier.

Normally, the cache is invisible to the programmer except for problems with map/cache consistency discussed in the map section.
However, features discussed below in "Testing" allow more direct access for checkout, initialization, and error recovery.

An address VA, if in the cache at all, must be in one of the four columns of the row addressed by VA[22:27] (or VA[20:27] if the cache is expanded). References compare the appropriate 15 or 16 bits of VA[4:21] with the values stored in each of the 4 columns to determine which cache entry, if any, contains VA.

The VNV memory contains two two-bit entries for each row of the cache. The Victim field specifies the cache column displaced if a miss occurs in this row. The NextV field is the next victim. When a miss or a hit in Victim occurs, Victim_NextV is done. When a miss, hit in Victim, or hit in NextV occurs, NextV_Victim.0',,NextV.1' is done (i.e., NextV is loaded with a value different from both the original NextV and Victim). This strategy is not quite LRU, since there is a 50-50 guess on the ordering of the third and fourth replacements.

This treatment of VNV is used for fetches, Store_, PreFetch_, and IFU fetches but not for IOFetch_, IOStore_, or Map_, which don't use the cache.

On a Flush_, Victim is written with 0 on a miss or with the column of the hit and NextV is written with Victim.0',,NextV.1'. If the Flush_ hit a dirty cache entry, then a FlushStore reference is fabricated which will wind up writing Victim (= column hit by the Flush_) back into storage.
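
The Victim/NextV update for ordinary cache references (fetches, Store_, PreFetch_, and IFU fetches) can be modeled in a few lines. This is a sketch only, with hypothetical function names. It assumes that bit 0 is the left (high-order) bit of each two-bit field, that the prime denotes complement, that ",," denotes concatenation, and that both fields are updated from their old values at the same time. The Flush_ case is not modeled.

```python
def next_nextv(victim, nextv):
    """NextV_Victim.0',,NextV.1' : complement of Victim bit 0, then of NextV bit 1."""
    victim_bit0 = (victim >> 1) & 1  # assumption: bit 0 is the high-order bit
    nextv_bit1 = nextv & 1
    return ((victim_bit0 ^ 1) << 1) | (nextv_bit1 ^ 1)


def vnv_after_reference(victim, nextv, hit_column=None):
    """Return (Victim, NextV) after a cache reference; hit_column=None is a miss."""
    miss = hit_column is None
    hit_victim = hit_column == victim
    hit_nextv = hit_column == nextv
    # Both updates use the old values, as in a simultaneous register load.
    new_victim = nextv if (miss or hit_victim) else victim
    new_nextv = next_nextv(victim, nextv) if (miss or hit_victim or hit_nextv) else nextv
    return new_victim, new_nextv
```

Because the new NextV differs from the old Victim in one bit position and from the old NextV in the other, it can never equal either of them, which is the property the text points out.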
The FlushStore reference will also do Victim_NextV and NextV_Victim.0',,NextV.1' again. This means that the VNV entry for the row touched by a Flush_ is effectively garbaged, which probably won't affect performance much.

A better strategy for Flush_ and IOStore_ would be as follows: On a miss, Victim and NextV remain unchanged; on a hit in a column different from Victim, Victim_hit column, NextV_Victim; on a hit in Victim, no change.

The UseMcrV feature discussed in "Testing" allows Victim and NextV to be replaced by McrV and McrNV.

Associated with each cache entry are four flag bits that keep track of its state, as follows:

    Dirty - set by Store_, cleared when loaded from storage. This bit does not imply anything about the map's Dirty bit. The cache Dirty bit causes a storage write when the entry is chosen as victim, and the map's Dirty bit is set at that time.

    Vacant - set by hit on Flush_, hit on IOStore_, or Store_ into a write-protected entry, cleared when the entry is loaded from storage. Vacant is not set after an SE or DE. Vacant prevents the entry from matching any VA presented to the cache.

    WriteProtect - a copy of the map's WP bit. It is copied from the map when the cache entry is loaded and not subsequently changed. If a Store_ is attempted into a write-protected entry, the entry is invalidated, there is a cache fault, and a write protect fault will be reported by the map.

    BeingLoaded - set while an entry is waiting for data from storage. Any reference that hits in the same row will remain in the cache address section until the bit goes off; any reference or _Md following the one which hit a row being loaded will be held.

Remark

At the end of a miss, data from the error-corrector is loaded into the cache 16 bits/clock.
Not until all 16 words of the munch have been loaded is Md loaded and the task (which has been held) allowed to continue. A scheme whereby the word being waited for is loaded into Md concurrent with writing it into the 1kx1 RAM's has been considered but rejected as too complicated. This would reduce average miss time from about 28 cycles to about 24.

Initialization

This section outlines the order in which parts of the memory system can be initialized.

Clocks

The instruction decoding flipflops of the memory section are enabled when the processor clocks are enabled. All other memory clocks are enabled by a signal called RunRefresh, as discussed in "Dorado Debugging Interface".

When RunRefresh is true, clocks internal to the memory system always happen, even if the processor is halted. When RunRefresh is false, memory clocks run with the processor. Except for low-level debugging of the memory system itself, RunRefresh should be true. Otherwise, storage will not retain data at breakpoints.

Mcr Register

The Memory Control Register (Mcr) contains fields that affect the memory system (see Figure 10). Mcr is intended to facilitate testing, and in some cases initialization. The register can be loaded with the Mcr_ function and read back over the DMux.
Bits in Mcr are as follows (some of these bits are loaded from A and others from B, as indicated in Figure 10):

    dVA_Vic     On each reference, write the cache address entry selected by the row of VA and column of Victim (Note: Victim determines the column, even on a hit) into VA of the pipe, so that VA[4:21] in the pipe contain the address from the cache. Also prevent both map and storage automata from starting (which prevents ring buffer pipe entries from being allocated to these as well). FDMiss should always be true when dVA_Vic is true.

    FDMiss      "Force dirty miss" forces each cache reference to miss and store the victim, even if not dirty. Misses caused by FDMiss do not cause Hold (*details*).

    UseMcrV     Use McrV as victim and McrNV as next victim for all cache misses instead of Victim and NextV from VNV.

    McrV        The two-bit victim, or cache column used on a miss, when UseMcrV is true.

    McrNV       The two-bit next-victim when UseMcrV is true.

    DisBR       "Disable base registers" prevents base registers from being added to D in computing VA and prevents BR from being written.

    DisCF       "Disable cache flags" forces cache flags to read out as zeroes and prevents them from being written.

    DisHold     "Disable Hold" unconditionally prevents hold and BLretry from occurring.

    NoRef       Disable storage references.

    WMiss       Wakeup fault task on every miss.

    ReportSE'   Don't wake up fault task after (correctable) single errors.

    NoWake      Never wakeup fault task.

During normal operation every bit in Mcr should be 0, except possibly ReportSE', if correctable errors are not being monitored.
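
Two of the rules stated for Mcr (FDMiss must accompany dVA_Vic, and normal operation leaves every bit 0 except possibly ReportSE') lend themselves to a simple consistency check in test software. The sketch below is hypothetical throughout: it represents a proposed Mcr setting as a set of the names of the bits that are true, and it says nothing about bit positions, which are given only in Figure 10.

```python
NORMAL_OPERATION_BITS = {"ReportSE'"}  # the only bit that may be true in normal use


def mcr_problems(true_bits):
    """List rule violations for a proposed set of true Mcr bits."""
    true_bits = set(true_bits)
    problems = []
    if "dVA_Vic" in true_bits and "FDMiss" not in true_bits:
        problems.append("FDMiss should be true whenever dVA_Vic is true")
    return problems


def is_normal_operation(true_bits):
    """True if the setting matches the normal-operation rule in the text."""
    return set(true_bits) <= NORMAL_OPERATION_BITS
```

A diagnostic could call mcr_problems before each Mcr_ and refuse settings that break the stated rule.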
It is illegal to load Mcr while references are in progress (changing DisHold is known to cause problems).

System Initialization

System initialization must get the map initialized as desired and the cache in agreement with the map. Initialization firmware should allow for cache rows containing several entries for the same address, which might occur after power up or after running diagnostics. There are many ways to carry out this initialization. One is as follows:

    1. Set NoWake and DisHold true in Mcr, so the fault task won't disturb initialization, and so that BeingLoaded conditions won't cause trouble.
    2. Clear TestSyndrome.
    3. Load the map as desired. Clear the cache as discussed in the Map section. After this the cache will be empty and Ref and Dirty in map entries will be smashed.
    4. Reload the map as desired.
    5. Read FaultInfo to kill any pending wakeup for the fault task.
    6. Setup Mcr for normal activity (0 or ReportSE').

Testing

This section outlines the order in which parts of the memory system can be tested, so that only a few new components are involved at each step.

VA, Adder, BR's

The first step is to set NoWake, FDMiss, and DisBR to true. Now processor references will deposit Mar in VA of pipe entry 0 (in the emulator), or of every other pipe entry (in other tasks), so that this part of the pipe can be tested (LongFetch_ allows all the VA bits to be tested). Next, setting DisBR false, loading BR's, and making more processor references will allow BR's and adder to be tested.

Cache Address Storage

Then set NoWake, FDMiss and UseMcrV to true and use McrV and McrNV to select one column of the cache at a time.
Each processor reference will store its VA into that column, and into the pipe, and will read out the old VA into the next ring buffer pipe entry (as the victim because FDMiss is true). This allows the VA bits in the address memory to be initialized and tested. The column number in Pipe2 should read back the value in McrV in this case.

Above, address memory values are read using FDMiss, then VA is checked in the pipe entry created for the victim. A simpler method of reading any address section VA is as follows: Turn on DisBR, UseMcrV, and dVA_Vic. On processor references, the cache entry addressed by Mar[6:11] (the row) and McrV (the column) will then have its VA[7:21] written into VA[7:21] of the pipe entry for the reference.

The flag bits in the address section can be directly tested using B_Pipe5 and CFlags_A. These functions operate on the cache entry addressed by the row of the last reference and column of the hit or victim on a miss. Since the IFU or another task could have issued the last reference, these functions are realistically limited to initialization and checkout, where the last reference is known. Normally these will be used with UseMcrV and FDMiss true in Mcr, so McrV will select the column.

B_Pipe5 also reads V and NV from the selected row. CFlags_A won't work if DisCF is true, and B_Pipe5 will read zeroes for all four flags in this case.

CFlags_A requires that Mar data continue without glitching during the preceding instruction as well. This means that data originating in RM or T must not have been loaded during either of the two previous instructions (else a glitch might occur when the multiplexor switched from the bypass to direct path) and that no higher priority tasks may intervene between the two instructions. Issuing CFlags_A in both instructions is the easiest way to drive Mar continuously for two cycles.

Cache Data Storage

Next, initialize the cache address section VA's and flags so that the cache data section can be tested.
To do this turn off FDMiss while leaving on NoWake, dVA_Vic, UseMcrV.

Initialize the address section to a convenient range of virtual addresses by Store_'s to each munch with appropriate McrV values. In the instruction after each reference, write the flags to WP = false, BeingLoaded = false, Vacant = false with CFlags_A.

At the end of this setup, the address section will be loaded and have write access to the desired virtual addresses. Hence, Fetch_'es and Store_'s to these VA's will not miss, and will access the 4k of cache data memory, which can thus be systematically tested.

Map

Next, turn off UseMcrV, leaving only NoWake turned on, and use Map_ to test the map. At the end of this test initialize the map, say, to map virtual addresses into corresponding real addresses.

Main Storage

Then finally the storage can be accessed and tested with fetches and Store_. FDMiss can be used to force storage references.

Fault Reporting

NoWake can be turned off and methods similar to the above can be used to test fault reporting.

IOFetch_, IOStore_, Fast IO Busses

Special hardware is needed to test these (the IOTest board).

Error Correction

In normal operation TestSyndrome contains 0 and Syndrome, written by the error corrector, should be 0 if no error was corrected or detected. For test purposes, TestSyndrome can be loaded with any non-zero value and one bit disables error correction altogether.
If there are no storage failures, TestSyndrome should wind up in Syndrome after a storage read. The error-corrector, MemError, EcFault, ReportSE', and fault reporting can be tested using TestSyndrome.

The LoadTestSyndrome function causes TestSyndrome to be loaded from DBuf. This should normally be done after a Store_, as follows:

    TaskingOff;
    Store_RMAddr, DBuf_T;      *DBuf_data for TestSyndrome
    LoadTestSyndrome;
    TaskingOn;

TaskingOff is required because an intervening higher priority task might change the contents of DBuf.
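
Why TestSyndrome reappears in Syndrome on an error-free read can be seen from a toy model. Everything here is an assumption made for illustration: check_bits is a hypothetical stand-in (a byte-wise xor) and not the Dorado code of Figure 11, and the syndrome is assumed to be the xor of the check bits recomputed from the data with the check bits read back from storage.

```python
def check_bits(data):
    """Hypothetical 8 check bits for a 64-bit quadword (not the real code)."""
    bits = 0
    for i in range(8):
        bits ^= (data >> (8 * i)) & 0xFF
    return bits


def storage_write(data, test_syndrome):
    """Store the data with check bits xor'ed with TestSyndrome, as in the text."""
    return data, check_bits(data) ^ test_syndrome


def storage_read_syndrome(data, stored_check):
    """Assumed syndrome: recomputed check bits xor stored check bits."""
    return check_bits(data) ^ stored_check
```

In this model, a quadword written with TestSyndrome = 0 reads back with syndrome 0, and one written with any other TestSyndrome reads back with exactly that value, whatever the check-bit function is.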