{Begin SubSec Xerox 1100 Booting and Maintenance Panel Codes} {Title Xerox 1100 Booting and Maintenance Panel Codes} {Text {Tag X1100MPcodes} The Xerox 1100 processor cabinet contains the electronic circuitry, memory, and magnetic disk of the 1100. The only controls on the front panel of the processor cabinet are two pushbuttons used to turn the power on and off, and a numeric display (the "Maintenance Panel" or "MP") used for maintenance purposes. This section intends to provide enough information to diagnose malfunctions to the extent that MP codes permit. These codes are mainly of use when booting the machine; usually they mean nothing during an Interlisp-D session. Ordinarily, error conditions in Interlisp-D cause at worst a recoverable (by typing {lisp CTRL-D}) call to Raid. If something is seriously wrong, however, Interlisp-D halts, the screen loses synch and the code in the MP may be significant; in these cases Interlisp-D is not continuable. {Begin SubSec 1100 Booting Sequence} {Title 1100 Booting Sequence} {Text When an 1100 is booted, the following steps occur before you see the Alto Executive: (1) Hardware loads the Boot microcode from an EPROM; (2) Boot tests the processor and loads Initial; (3) Initial tests the map and storage and loads an emulator; (4) Emulator microcode initializes itself; (5) Alto Executive or NetExec is booted. A bootstrap is generally initiated by pushing and releasing the start (power-on) button, by pushing the keyboard boot button, or by executing the Boot function in a microinstruction. In addition to these normal methods of booting, if your machine is malfunctioning, it might spontaneously boot itself when a fault happens while the fault task is running. If the problem is really bad, then it might boot itself over and over. 
Some deficiencies in error handling by the Initial microprogram cause certain classes of recoverable hardware failures (disk and ethernet problems) to reboot the machine rather than recovering from the error. If your machine won't boot or boots very slowly, it is important to go through the following check list: (1) If you have just powered up, make sure your display is turned on--sometimes the emulator won't run if the display is powered off, and you obviously won't be able to see anything. (2) Sometimes the keyboard microcomputer will power on in a bad state; if this happens, you can have all kinds of trash happen on the backchannel--erroneous keystrokes, mouse button clicks, and mouse movement. To fix this, push the keyboard boot button. Sometimes, it is necessary to turn the power to the display off and then on again. (3) Otherwise, watch the MP while the problem is happening. The detailed sequence of numbers may indicate what is going wrong. You will have to get your head down low to observe the numbers on the MP reliably; people have frequently reported numbers with 1's translated into 7's, and some other observation errors are occasionally made. }{End SubSec 1100 Booting Sequence} {Begin SubSec Hardware Boot} {Title Hardware Boot} {Text While you depress the start button, the hardware shows 8888 in the MP as a light test. When you release the power-on button, the machine will then run through the boot sequence discussed above. During the hardware boot, you might see either 8808 or 8880 in the MP if RM or IMX parity errors are detected by the hardware during loading. }{End SubSec Hardware Boot} {Begin SubSec EPROM Boot Microcode} {Title EPROM Boot Microcode} {Text Boot is a tiny diagnostic and bootstrap loader. 
Its function is to test the processor thoroughly but quickly, reporting any failures on the MP, and then boot the Initial microcode from any of the I/O devices which might reasonably contain it. At present these are the SA4000 disk and the 3 Megabit Experimental Ethernet; 10 Megabit Ethernet booting will be made available at a future date. Boot runs a few processor tests to find out whether or not the processor is healthy enough to continue loading. Many machines malfunction when first powered up, then work correctly. If the processor tests fail, Boot will show an error MP code (0000 to 0039) for a second or so and then reboot. Otherwise, registers are initialized. Boot then tries to read the first program on the SA4000 boot record (normally Initial) directly into the microstore. When Boot starts this, it will show 0040 in the MP. If you see this (or 0041 or 0046, which frequently follow immediately), your processor is at least somewhat alive. (Initial lives in a special reserved portion of the disk, so you won't see it among your disk files.) If the disk won't work (0041 to 0045, 0047 to 0048) or isn't ready yet (0046), Boot will try to obtain Initial.Eb from the (3MB) Ethernet. When this decision is made, a one second pause allows you to read the MP; if the reason for the boot is NOT a button push, the delay is extended to one minute to prevent a sick machine from hogging the boot servers. When Ether booting starts, 0060 (trying to Etherboot) will appear in the MP; other 006x MP codes indicate Etherboot problems. While waiting for a boot server to respond and while the Initial microcode is being transmitted from the boot server, 004x slowly alternates with 006x, so that you can see both the reason why the disk boot failed and the current Etherboot indication. Unfortunately, many machines experience a short period of unreliability after being powered on, but then work correctly.
These machines encounter the one minute wait intended to prevent a sick machine from hogging the boot servers, and this can be frustrating, if you are waiting for the machine to become ready. If your machine does this, keep pushing and releasing the start button until you see a healthy 0060. The standard trick for forcing an Etherboot is to turn power off and then back on. It takes the disk about two minutes to become ready again. If you push the start button before the disk is ready, you should get to the Alto NetExec. Note that Initial is loaded directly into the microstore without using either the map or storage. }{End SubSec EPROM Boot Microcode} {Begin SubSec Initial Microcode} {Title Initial Microcode} {Text Initial is primarily responsible for testing and initializing the map and storage, reporting any failures in the MP, then loading and starting an emulator. When Initial receives control, it puts 0700 ('starting map test') into the MP; if you see 0040 then 0700 (without an intervening 0060), your disk is at least somewhat healthy since Initial was loaded from the disk. Initial first tests the map; it will hang with the 'bad map' MP code (0702) displayed, if the map is imperfect. Then Initial tests storage and uses only 'good' pages. 0700 is visible barely long enough to see, and 0400 is seen in the MP during storage testing, which lasts less than 4 seconds with eight perfect 96k-word storage boards. If Initial detects any storage imperfection, it will do additional testing, and 0400 will be shown for 4 to 9 seconds (timing approximately proportional to the amount of storage). If the number of pages with correctable failures exceeds 1/8 of all pages, and if the amount of good storage is less than 128k words, then the entire test will be repeated allowing the pages with only soft failures to be used; otherwise, only perfect pages are used. 
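The retest rule just described can be written out as a small predicate. This is only a sketch of the rule as worded here, not the Initial microcode: the function name and arguments are hypothetical, "good storage" is assumed to mean pages with no failures at all, "128k words" is assumed to mean 128 x 1024 words, and pages are taken to be 256 words as stated in the code list later in this section.

```python
WORDS_PER_PAGE = 256  # page size used throughout this section

def allow_soft_pages(total_pages, soft_bad_pages, good_pages):
    """Return True if the storage test would be repeated with soft-failure pages allowed.

    Hypothetical helper: both conditions from the text must hold.
    """
    # More than 1/8 of all pages have only correctable failures.
    too_many_soft = soft_bad_pages * 8 > total_pages
    # Less than 128k words of good storage remain (assumes k = 1024).
    too_little_good = good_pages * WORDS_PER_PAGE < 128 * 1024
    return too_many_soft and too_little_good
```

For instance, with 768 pages of which 300 are soft-bad and 400 are good, both conditions hold, so under these assumptions the test would be repeated with soft pages allowed.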
After all testing is complete, three numbers will be shown in the MP for about 1.3 seconds each: (1) 0400+2^n, where board n+1 is imperfect, (2) the count of "hard" bad pages (uncorrectable failures), and (3) the count of "soft" bad pages (correctable failures only). These numbers are NOT shown when all storage boards are perfect. Even when some storage is bad, unless the amount of 'good' storage is reduced by failures to less than 64k words, initialization will continue normally following the bad-page MP codes. On a disk boot, Initial then puts 0720 into the MP and continues reading the SA4000 boot record. But this time the emulator from the boot record is placed into storage rather than directly into the microstore. On an ether boot, it instead shows 0758 in the MP and reads Alto.Eb from the Ethernet into storage. At the end of file, the microcode image in storage is loaded into the microstore and started. [Note: Older versions of Initial had different code sequences.] }{End SubSec Initial Microcode} {Begin SubSec Alto Emulator Microcode} {Title Alto Emulator Microcode} {Text Early in initialization the emulator shows the MP code 'Start device init' (0104) barely long enough to see. Seeing 0104 means that Initial tested and zeroed storage, loaded the microcode image into storage from the disk or Ethernet and then from storage into the microstore, transferred control to it, and executed at least the first few microinstructions successfully. The emulator has two main entry points. After 0104, on a disk boot, 0118 (GotBreathOfLife) is shown for 0.3 seconds after the first page from the disk boot record has been read successfully. On an ethernet boot, 0114 (start booting the NetExec) is shown, then 0118 when the "breath-of-life" program has been received from the Ethernet boot server. Finally, on both disk and ethernet bootstraps, the emulator shows the number of good pages found by Initial in the MP. 
For the 96k-word storage boards using 16k RAMs, the MP will show 384 times the number of storage boards (96k words at 256 words per page is 384 pages per board), i.e., 0768, 1152, 1536, 1920, 2304, or 2688. Some MPs will show this value plus 1 occasionally for unknown reasons. Then the emulator loads the final microcode overlay and starts the emulator at the breath-of-life program's disk boot or Ether boot address. When started at its normal entry point, the emulator will boot the OS from SA4000 partition 1 if you have no keys down. If you depressed the "0" or backspace (BS) key and then pushed the keyboard boot button while your emulator was correctly running the display task, then the OS will be booted from SA4000 partition 2 ("0" depressed) or the NetExec from the Ethernet (BS depressed). If neither of these keys was depressed, or if some other keys were also depressed, then a partition 1 disk boot occurs. Pushing the keyboard boot button does nothing if an emulator isn't running. }{End SubSec Alto Emulator Microcode} {Begin SubSec Normal Maintenance Panel Code Sequences} {Title Normal Maintenance Panel Code Sequences} {Text If your hardware is working properly and your disk is not yet up to speed, the viewable MP sequence is: 8888, [46, 60], 700, 400, 758, 104, 114, 118, {arg NPages}, where "46, 60" may repeat several times before continuing with the rest of the sequence if your boot server is busy. Other MP codes are not up long enough to see unless something goes wrong; 700 is up barely long enough to see. If you have just powered up your machine, it might show a 22 or 24. After a second or so, your 1100 will automatically reboot, but this time 46 will stay in the MP for a minute or so. If you get tired of waiting, poke the button again. The viewable MP sequence for a "normal" disk boot is: 8888, 40, 700, 400, 720, 104, 118, {arg NPages}.
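If you jot down the codes you see, the two normal sequences above can be checked mechanically. The following is a rough sketch only: the function names are hypothetical, the final NPages value is left out of the observed list and checked separately, and the Ether boot pattern is assumed to need at least one "46, 60" pair.

```python
PAGES_PER_BOARD = 384  # 96k words per board / 256 words per page

# Viewable codes for a normal disk boot, NPages excluded.
DISK_BOOT = [8888, 40, 700, 400, 720, 104, 118]
# Codes that follow the repeated "46, 60" pairs on an Ether boot.
ETHER_BOOT_TAIL = [700, 400, 758, 104, 114, 118]

def expected_npages(boards):
    """Good-page count shown when all `boards` 96k-word boards are perfect."""
    return PAGES_PER_BOARD * boards

def classify(observed):
    """Classify observed codes (NPages excluded) as 'disk', 'ether', or 'unknown'."""
    codes = list(observed)
    if codes == DISK_BOOT:
        return "disk"
    if codes[:1] != [8888]:
        return "unknown"
    rest = codes[1:]
    pairs = 0
    # "46, 60" may repeat several times while the boot server is busy.
    while rest[:2] == [46, 60]:
        rest = rest[2:]
        pairs += 1
    if pairs >= 1 and rest == ETHER_BOOT_TAIL:
        return "ether"
    return "unknown"
```

Anything classified as unknown is a cue to look the last code up in the lists that follow; remember that some MPs occasionally show NPages plus 1.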
}{End SubSec Normal Maintenance Panel Code Sequences} {Begin SubSec Maintenance Panel Codes} {Title Maintenance Panel Codes} {Text In the following pages, a # indicates a final MP code. The machine will hang with this number in the MP until you boot again. All other MP codes are either errors that will be retried automatically or simple indications of progress. {Begin SubSec MP Codes from the hardware} {Title MP Codes from the hardware} {Text {Begin LabeledList MP Codes from the hardware} {Name 8888} {Item Lamp test (shown while the start button is down).} {Name 8808#} {Item RM parity error. The MP freezes with this code, indicating (??) a failure during the hardware boot.} {Name 8880#} {Item IMX or control store parity error. The MP freezes with this code indicating (??) a failure of the hardware boot.} {End LabeledList MP Codes from the hardware} }{End SubSec MP Codes from the hardware} {Begin SubSec MP codes from Boot microcode} {Title MP codes from Boot microcode} {Text {Begin LabeledList MP codes from Boot microcode} {Name 0000} {Item One of the first instructions in the EPROM clears the MP; seeing this number means that the ALU all-zeroes, all-ones, or GoTo tests failed.} {Name 0001} {Item The ALU all-zeroes, all-ones, and GoTo tests passed. You should never see this since some other error should happen or the MP should change to 0040 when disk booting starts.} {Name 0002} {Item Midas boot.} {Name 0004 to 0015} {Item One of the preliminary ALU tests failed.} {Name 0016} {Item Mismatch after write then read of an RM register via the stack.} {Name 0017} {Item The contents of an RM register have changed.} {Name 0018} {Item Register read and compare error using the Stack.} {Name 0020} {Item A Breakpoint microinstruction was executed.} {Name 0021} {Item Memory error. 
Since the EPROM code doesn't touch the map or storage, this is probably an H4 Parity error.} {Name 0022} {Item RM register parity error.} {Name 0024} {Item Control Store parity error.} {Name 0028} {Item Stack error.} {UnLabeled Codes 0020 to 0035 indicate that a fault happened. The MP contains 20 plus the contents of the Parity register. A fault in the fault handler will reboot the machine, so you may not get to see these codes. The value shown is 20d + (1 if memory error) + (2 if RM parity error) + (4 if control store parity error) + (8 if stack overflow or underflow). } {UnLabeled Codes 0022 and 0024 are to be expected if you have just powered up your machine. (The bias on the RAM chips hasn't been pumped up yet.) This will invoke a one minute delay to avoid hogging the boot servers when something is broken. Poke the button again if you want faster service. } {Name 0040} {Item Starting to load microcode from disk.} {Name 0041} {Item Can't find disk. (Will now try to EtherBoot)} {Name 0042} {Item SA4000 disk read error.} {Name 0043} {Item SA4000 seek timed out.} {Name 0044} {Item SA4000 disk checksum failure.} {Name 0045} {Item SA4000 bad Control Store address--attempt to load into EPROM area.} {Name 0046} {Item SA4000 disk not ready. (Will now try to EtherBoot)} {Name 0047} {Item The label word which should contain a link to the next page of microcode to be loaded has an invalid disk address.} {Name 0048} {Item Didn't load microcode from disk within 1 second. (Will now try to EtherBoot)} {Unlabeled Most disk errors (0042, 0043, 0044, 0045, 0047) can be caused by simple transient read problems. The disk task simply retries all of them while the emulator task is counting down a timer. If the timer runs out, you will see 0048.
} {Name 0060} {Item Trying to load microcode via Ethernet.} {Name 0061#} {Item Can't find Ethernet board.} {Name 0062} {Item Bad Ethernet checksum while reading microcode.} {Name 0063} {Item Bad Control Store address--attempt to load into EPROM area.} {Name 0064} {Item Hardware error (bad status) at end of packet.} {Name 0065#} {Item Timeout after 15 tries at about 10 seconds each.} {Unlabeled If EtherBooting doesn't work, the MP will slowly alternate between 006x and 004x so that you can see both what was wrong with the disk and what is wrong with the Ethernet. If the Etherboot eventually times out, you will see 0065 alternating with the bad disk code. } {Name 0070 to 0085} {Item An unexpected wakeup happened. The MP contains 70 plus the number of the offending task (see list of tasks later in this section). If you get one of these, one of the IO controllers is probably broken. For example, its reset logic is not working, or the wakeup logic on the disk or Ethernet board is generating the wrong task number. } {End LabeledList MP codes from Boot microcode} }{End SubSec MP codes from Boot microcode} {Begin SubSec MP codes from Initial} {Title MP codes from Initial} {Text {Begin LabeledList MP codes from Initial} {Name 0700} {Item Start map test.} {Name 0702#} {Item Bad map.} {Name 0400} {Item Start storage test.} {Unlabeled If any bad storage boards are detected, the following three numbers are displayed for about 1.3 seconds each: } {Name 0401 to 0655} {Item Failures detected on one or more boards. MP shows 0400 + (1 if the memory board in slot 5 is imperfect) + (2 if slot 6) + (4 if slot 7) + (8 if slot 8) + (16 if slot 9) + (32 if slot 10) + (64 if slot 11) + (128 if slot 12). This interpretation is for the 96k-word storage boards with 16k RAMs. Each such board occupies 128k words of real address space but implements only the first 96k words of the space. Hence, the bits actually represent imperfections in the existing storage of each 128k-word bank of storage.
{note is this still true??} } {Name {arg NHardBad}} {Item Count of hard bad pages (ones with uncorrectable failures), where pages are 256 16-bit words; shown after 0401-0655. } {Name {arg NSoftBad}} {Item Count of soft bad pages (only correctable failures); shown after NHardBad. } {Name 0701#} {Item Not enough memory. (Initial requires 64K words.)} {Name 0720} {Item Starting to load emulator microcode from disk.} {Name 0721#} {Item No disk.} {Name 0722#} {Item Disk Read error.} {Name 0741#} {Item Can't find UTVFC (display).} {Unlabeled When Initial starts Etherbooting, it puts 740+2*{arg bfi} into the MP. ({arg bfi} is related to the boot file number). If EtherLoad can't load that file within a reasonable length of time it will give up and bump the MP by one. } {Name 0758} {Item Trying to load AltoD0.Eb from the Ethernet.} {Name 0759#} {Item Timeout trying to load AltoD0.Eb} {Name 0812} {Item Label check from SA4000. (Try to EtherBoot.)} {End LabeledList MP codes from Initial} }{End SubSec MP codes from Initial} {Begin SubSec MP codes from Emulator Microcode} {Title MP codes from Emulator Microcode} {Text These codes are shown when the (Alto) emulator microcode starts up. {Begin LabeledList MP codes from Emulator Microcode} {Name 0101#} {Item Not enough memory. 
(You need 512 pages or 128K words--more than the 64k words required by Initial.)} {Name 0104} {Item Start device initialization.} {Name 0105} {Item Started UTVFC initialization (invisible).} {Name 0106} {Item Finished loading keyboard overlay (invisible--only happens on systems).} {Name 0107} {Item Finished display initialization (invisible).} {Name 0110} {Item Started disk boot (invisible).} {Name 0111#} {Item Timeout waiting for disk status.} {Name 0112#} {Item Hardware error reading disk (after 10 retries).} {Name 0114} {Item Start booting the NetExec.} {Name 0118} {Item Breath-of-life read successfully from disk or Ethernet.} {Name {arg NPages}} {Item The number of 'good' 256-word pages determined by Initial is put in the MP just before the final microcode overlay overwrites initialization; it normally remains in the MP until you boot, crash, or start Interlisp-D. The emulator starts immediately after loading the final overlay. } {End LabeledList MP codes from Emulator Microcode} }{End SubSec MP codes from Emulator Microcode} {Begin SubSec Fault MP Codes} {Title Fault MP Codes} {Text [The old Initial has different fault MP codes.] {Begin LabeledList Fault MP Codes} {Name 0115} {Item Unexpected Ethernet output task wakeup.} {Name 0117} {Item Two MC2 errors; both MC2A and MC2B pipes indicate errors; since LogSE is illegal, this might mean two consecutive references experienced uncorrectable storage failures. Alternatively, this is caused by a microcode bug.} {Name 0119} {Item Stack overflow or underflow. This is a microcode bug.} {Name 0120 to 0135} {Item RM or CS parity error, possibly in combination with other errors. MP code is 120 + (1 if MC1 or MC2 error) + (2 if RM parity error) + (4 if CS parity error) + (8 if stack overflow or underflow). } {End LabeledList Fault MP Codes} Many codes show a multiple of 20d plus a task number, where the task assignments are given below.
The 'pipe task' is the one issuing the reference causing an error--this can be determined from the error pipe; the 'current task' is the one running when the fault aborted execution. Task assignments: {Begin Table Task assignments} {COLUMN 10percent} {COLUMN} {First 0} {Next emulator} {First 5} {Next color display} {First 6} {Next 3 Megabit Experimental Ethernet output} {First 7} {Next 10 Megabit Ethernet, or 3 Megabit Experimental Ethernet input} {First 8} {Next disk (SA4000 controller)} {First 10} {Next Display} {First 14} {Next timer task} {First 15} {Next fault task} {End Table Task assignments} The following codes imply no RM or CS parity error. {Begin LabeledList More Fault MP Codes} {Name 0136} {Item MC12 error occurred but none of the reasons for it is indicated; i.e., none of the H4PE, MOB, MC1A, MC1B, MC2A, or MC2B errors is true. Conceivably, this can be caused by an MC1 fault on the reference following a PFetch4, if the PFetch4 experiences error correction. Due to a hardware bug, the fault isn't started soon enough in this case, so an extra non-fault-task instruction is executed. If the extra instruction is a reference, it wipes out the MC1ErA and MC1ErB indicators.} {Name 140 to 155} {Item Map out of bounds (MOB) indicating virtual address greater than 22d bits. Code shown is 140 + current task. An MOB crash can't happen for Interlisp-D because MOB errors fault to software.} {Name 160 to 175} {Item H4 parity error indicating bad parity on the processor bus used by Input, IOStore4, and IOStore16 references. Code shown is 160 + current task. This can never happen at present since these errors are ignored for all tasks.} {Name 180 to 195} {Item Some fault when the preceding instruction contains a LoadPage and the fault handler decides to continue execution.
This indicates a microcode bug and should be reported.} {Name 200 to 215} {Item MC2 crash, indicating correctable storage failure of PFetch1, 2, or 4 with LogSE true in the map entry or an uncorrectable storage failure on any reference. The code shown is 200 plus the pipe task. Since LogSE is presently illegal, this code should indicate an uncorrectable storage failure; some microcode bugs may cause it.} {Name 220 to 235} {Item MC1 crash, indicating a page or write protect violation. The code shown is 220 plus the pipe task. This can't happen for Interlisp-D because MC1 faults are passed to software; MC1 faults always crash for Alto.} {Name 240 to 255} {Item SetFault or Breakpoint crash. The code shown is 240 plus the current task. This is used by the microcode in a few places when impossible conditions are detected; for unused tasks, it represents an unexpected wakeup.} {End LabeledList More Fault MP Codes} }{End SubSec Fault MP Codes} }{End SubSec Maintenance Panel Codes} }{End SubSec Xerox 1100 Booting and Maintenance Panel Codes}
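The fault code arithmetic in this section (a base value plus either a task number or a sum of error bits) can be sketched as a small decoder. This is an illustration only, built from the lists and the task table above: the function and table names are hypothetical, task numbers missing from the table are reported as "not listed", and for 180 to 195 the text does not say whether the offset is the pipe task or the current task, so that field is left as None.

```python
# Task numbers as given in the task assignments table.
TASKS = (
    (0, "emulator"),
    (5, "color display"),
    (6, "3 Megabit Experimental Ethernet output"),
    (7, "10 Megabit Ethernet, or 3 Megabit Experimental Ethernet input"),
    (8, "disk (SA4000 controller)"),
    (10, "display"),
    (14, "timer task"),
    (15, "fault task"),
)

# (base code, meaning, which task the offset names; None if the text is silent).
TASK_RANGES = (
    (140, "map out of bounds (MOB)", "current"),
    (160, "H4 parity error", "current"),
    (180, "fault after LoadPage", None),
    (200, "MC2 crash", "pipe"),
    (220, "MC1 crash", "pipe"),
    (240, "SetFault or Breakpoint crash", "current"),
)

# Bits added to 120 for codes 0120 to 0135.
FLAG_BITS = (
    (1, "MC1 or MC2 error"),
    (2, "RM parity error"),
    (4, "CS parity error"),
    (8, "stack overflow or underflow"),
)

def task_name(number):
    """Look up a task number; unassigned numbers come back as 'not listed'."""
    return dict(TASKS).get(number, "not listed")

def decode_fault(code):
    """Decode a fault MP code in 120..135 or 140..255; return None otherwise."""
    if 120 <= code <= 135:
        bits = code - 120
        return ("error flags", [name for bit, name in FLAG_BITS if bits & bit])
    for base, meaning, which in TASK_RANGES:
        if base <= code <= base + 15:
            task = code - base
            return (meaning, (which, task, task_name(task)))
    return None
```

For example, decode_fault(208) reports an MC2 crash whose pipe task is 8, the disk (SA4000 controller). Single-valued codes such as 0115, 0117, 0119, and 0136 are deliberately not handled and return None.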