:TITLE[MemInit];*Map and storage diagnostic and initialization

%Ed Fiala 10 January 1983: PNIP display of Blk.0 and Blk.1 plus the
relevant 8-bit syndrome as a single number when failures are
detected during the storage test; moved many mi from InitialPage1
to opPage2 to make room. Parameterize refresh timer period.
Optionally loop the storage test until an error is detected, and
then loop the error MP codes; on CSL keyboards, this is a
keyboard boot option.
Ed Fiala 27 October 1982: Fix bug at imNoPage+2 causing PageCount
to become 0 when 2↑22 words of real storage are used.
Ed Fiala 24 June 1982: Fix bug in imMarkSector causing wrong page
report on MP.
Ed Fiala 23 April 1982: Add sophistication to storage
enumeration algorithm to better distinguish hard error pages from
non-existent pages. Add wait after write sweeps to allow
slowed refresh to provoke storage failures. Set refresh timer
to standard emulator value after storage testing.
Created by Ed Fiala 22 January 1982: Rewrite program to improve
testing and speed. Earlier version and changes were by Johnsson,
Frandeen, and Henning.

Problems:
1) Cannot examine VM with Midas because FFault<0.
2) Need tests for PFetch1/2 and PStore1/2 with/without DF2
addressing, task=0 and task#0.
%

Loca[MC2ErrRet0,InitialPage1,17];
Loca[MC2ErrRet1,InitialPage1,16];
Loca[imBST,InitialPage1,20];
Loca[imFST,InitialPage1,40];

SetTask[0];

*All RM registers except BootType appear to be available here,
*except that RTemp and RTemp1 (RM 52-53) are smashed by PNIP.
RV[BootType,20];
*Must agree with identical def in Initial.mc
RV[StorageFaults,21];
*Number of quadwords with correctable
*failures detected by the diagnostic in pages
*which had no uncorrectable failures.
RV2[XMAdLo,XMAdHi,22];
*Base register
RV4[XMBuf0,XMBuf1,XMBuf2,XMBuf3,24];
*Buffer for XMap
RV4[RBuf0,RBuf1,RBuf2,RBuf3,24];
*Buffer for PFetch4
RV4[WBuf0,WBuf1,WBuf2,WBuf3,30];
*Buffer for PStore4
RV[RLink0,34];
*Subroutine return link
RV[MapAddr,35];
*Current map location
RV[RealPage,36];
*Current real storage page
RV[PageCount,37];
*Count of ’good’ pages; ***Location known to
*old versions of Alto emulator.
RV[Transient,40];
*Holds soft error count during storage test.
RV[ZPage,41];
*Virtual page being tested during sweep
RV[ZWord,42];
*Word within page being tested during sweep
RV[RLink1,43];
*MC2 fault return link
RV[RLink2,44];
*Another subroutine return link
RV[SoftQThreshold,45];
RV[SoftBadPages,46];
RV[HardBadPages,47];

%If there are more than EnoughGoodPages after the first pass of the
storage test, then a second pass is not made, even if there are a
large number of soft bad pages. In this case, no questionable pages
are used.
%
MC[EnoughGoodPages,1000];
*= 128k words

%During the first pass of the storage test, no pages with soft
failures are put in service. If there are ’too many’ such pages
and some other criteria are met, a second less conservative pass of
the storage test is made in which pages with up to BadQThreshold
correctable errors on any one test are put into service.
BadQThreshold .eq. 1 allows pages with a single correctable failure
to be used but discards pages with more failures; .ge. 100b allows
the page to be used regardless.
%
MC[BadQThreshold,100];

OnPage[InitialPage];

IMap:
LoadPage[InitialPage1];
MapAddr ← 140000C, GoToP[.+1];

%First, test the map; any failure results in the BadMap MP code and
termination. The test verifies that each bit can assume both ’1’
and ’0’ values and that there are no stuck address drivers nor
on-chip addressing problems. Timing ~ 0.12 seconds.

First, zero the map by writing consecutive addresses from 0 to
37777b. The XMap in the imWriteMap subroutine writes the map
from XMBuf0 and reads the old contents into XMBuf1-3.
%
OnPage[InitialPage1];
imMapTest:
XMBuf0 ← 0C, Call[imWriteMap];
*Loop here.
MapAddr ← (MapAddr) + 1;
T ← (MapAddr) and (37400C), GoTo[imWriteMap1,ALU<0];

%Now write -1 and verify old value .eq. 0 using forward sweep of
all map addresses. This will find not only data bits that can’t
assume the 0 state but also stuck address drivers or internal map
RAM addressing failures.
%
T ← MapAddr ← 140000C;
imOMapLoop:
XMBuf0 ← (XMBuf0) or not (0C), Call[imWriteMap];
LU ← T;
MapAddr ← (MapAddr) + 1, Skip[ALU=0];
LoadPage[InitialPage], GoTo[imMapFail];
GoTo[imOMapLoop,ALU<0];

*Now write 0 and verify old value .eq. -1 using forward sweep.
T ← MapAddr ← 140000C;
imZMapCheckLoop:
XMBuf0 ← 0C, Call[imWriteMap];
LU ← XMBuf1;
MapAddr ← (MapAddr) + 1, Skip[ALU=0];
LoadPage[InitialPage], GoTo[imMapFail];
DblGoTo[imZMapCheckLoop,imStorageTest,ALU<0];

imWriteMap1a:
XMAdHi ← T, LoadPage[opPage2], GoTo[imWriteMap2];

imWriteMapT:
MapAddr ← T;
*imWriteMap writes the data in XMBuf0 into map location MapAddr
*and returns the old value in T, complement of old value in XMBuf1.
imWriteMap:
T ← (MapAddr) and (37400C);
imWriteMap1:
XMAdHi ← T, LoadPage[opPage2];
imWriteMap2:
T ← LSh[MapAddr,10], GoToP[.+1];
OnPage[opPage2];
XMAdLo ← T;
XMap[XMAdLo,XMBuf0,0];
T ← LSh[XMBuf3,10];
XMBuf1 ← (RHMask[XMBuf1]) or T;
T ← (XMBuf1) xnor (0C), Return;
OnPage[InitialPage1];

imMapFail:
T ← BadMap, GoToP[InitFail]; *Some map entry was bad

%The storage test has the following goals:

1) To find and use all the "good" storage pages.

2) To execute as fast as possible. Suppose that a user boots his
machine 3 times/day, 240 days/year and that the test takes 10 sec.
Then he waits 7200 sec/year = 2 hr/year--already too long. With
4 perfect storage boards (384k words), this diagnostic completes in
less than 2 seconds, which is satisfactory.

Our experience with a large sample of 96k storage boards (120d
boards) for a period in excess of one year was that less than
0.05 16k RAMs per storage board per year were replaced, after the
board was checked out and found to be working. Also, we have not
observed either pattern sensitive or intermittent failures. Our
experience with 64k RAMs on another computer has also been good.
Also, the chance of an uncorrectable failure happening if this
diagnostic misses a failure seems to be remote, so the diagnostic
and failure handling objectives discussed below may be much less
important than fast execution of this test. Bearing this fact in
mind...

3) To report information to AMesa or Pilot or directly to the user
via the maintenance panel so that the user and/or system maintainer
will find out that the machine does or doesn’t need service, even
if the failure can be bypassed or ignored at present. Proposed
service strategy is to replace bad boards at the customer’s site and
then diagnose and repair the failure back at the shop; so it would
also be convenient to report the board number, so the service man
wouldn’t have to run diagnostics before replacing the board. If
enough information to identify the failing RAM on the board could
be reported, that would be better yet.

4) To detect as many storage board failures as possible. The
following are some that we believe are more likely than others:
a) Single-bit failure on a storage RAM; i.e., one bit of the 16k
in a RAM stuck at 0, stuck at 1, or flakey.
b) Row addressing failure that will affect all or most of the
SqRt(16k) consecutive bits (= 2 consecutive pages) in a RAM by
mapping writes or reads addressed to one row into another row.
c) Column addressing failure that will affect all or most of the
SqRt(16k) bits in a RAM at addresses spaced SqRt(16k) bits apart
(= 1 bit in every other one of the 128 pages spanned by the RAM);
this would map writes or reads addressed to one column into another.
d) Bad sockets, bent pins, broken wires, etc. that cause all the
bits in one or more RAMs to malfunction; the storage boards being
built now have the RAMs soldered into the board, so there are no
sockets. Also, bent pins are not a significant problem because
the holes in the board are bigger than the socket holes, so pin
bending is much less likely; and, even if a bent pin happens, the
solder is likely to make good contact.
e) Address driver on a storage board stuck at 0 or 1 or flakey;
this will affect many RAMs and cause uncorrectable failures.
f) Pattern sensitive failures within a RAM.
g) Failures due to excessive leakage (cell not holding charge).

Although (a) and (d) failures could be found by isolated testing
of individual quadwords, (b), (c), (e), (f), and (g) are best tested
by sweeping through storage writing at all addresses, then sweeping
through again reading at all addresses as follows: One sweep
initializes storage to a known value; a second sweep first checks
the value at each address, then writes the 1’s complement back
into that quadword. On-chip addressing failures or stuck address
drivers manifest as a comparison failure later in the sweep. Data
patterns should ensure that each bit in each RAM (including check
bits) is checked at both ’1’ and ’0’ values during sweeps in which
the complement value is being written in each quadword after
reading; the following pattern sequence does this:
0, 0, 0, 0;
chkbits 0
100000b, 100000b, 100000b, 1;
chkbits 177b
0, 0, 0, 0;
chkbits 0, data 0
177777b, 177777b, 177777b, 177777b;
data -1
0, 0, 0, 0;
data 0

In the event of an on-chip addressing failure or an address driver
failure, it is necessary to sweep in both forward and backward
directions to detect failures at both the high and low affected
addresses. In other words, two different addresses (call them X
and Y with Y .gr. X) will appear to be written when either one of
them is addressed; during a forward sweep, X will be read correctly,
but because the complement value is written into X after reading,
Y will be read as the complement value, and the failure will be
discovered. However, the failure at X is not discovered until a
sweep in the reverse direction occurs.

To detect failures due to excessive leakage, the timer task in
Initial.Mc carries out refresh 50 percent slower than the emulators.
Here, pause longer than the refresh period after a write sweep to
allow leaky cells to lose charge.

Pages in which uncorrectable errors are detected are, of course,
discarded, but if the only failures detected in a page are
correctable, then MemInit has to decide whether to use the storage
or discard it. The following are some considerations that affect
this decision:

1) If such a page is discarded, less storage will be available to
the program; a column failure, for example, would affect half the
pages in a RAM or 64 pages, so the loss of storage might have
significant performance implications. Also, if discarding results
in storage declining below MinGoodPages (.eq. 64k words), then
Initial will crash.

2) If a page is used despite correctable failures, references
that invoke error correction will be slowed significantly; one
benchmark running on a system in which one RAM was removed from
the top storage board without discarding the affected storage was
7 percent slower than with perfect storage. (However, except with
astronomical amounts of storage, Pilot would be slowed more than 7
percent by discarding the 128 pages affected by the RAM removal.)

3) If another failure coincides with one ignored by MemInit, then an
uncorrectable failure will cause a crash. If this diagnostic were
a perfect test (i.e., no bad RAMs go undiscovered) and no fluke
failures occur, then an uncorrectable error caused by coincidence of
a bad RAM ignored by MemInit with a RAM which fails under normal
operation is unlikely. To show this, suppose that only (a), (b),
and (c) solid failures occur and that 0.8 are single, 0.1 row, and
0.1 column; and suppose 0.05 failures/storage board/year on the
average. Then, if a single failure is ignored, only 1/16k of future
single failures and 1/128 of future row or column failures on the
same storage board will coincide, which happens once every 12000
years/storage board; if a row failure is ignored, only 1/128 of
future row failures will coincide, but every column failure will
coincide, so a coincident failure occurs about every 180
years/storage board.

This means that any major risk of uncorrectable errors must be the
result of imperfect testing, much higher than expected failure rate,
or fluke failures. With 256k RAMs, fluke failures in good RAMs due
to alpha particle collisions are theoretically likely, but these
don’t happen in 16k RAMs; not sure about 64k RAMs. Intermittent or
pattern sensitive failures might make perfect testing difficult, but
we don’t seem to have had significant problems in this area either.
This means that, except possibly with 256k RAMs, we are unlikely to
suffer uncorrectable failures as a result of missing a bad RAM
during testing or going ahead and using a page despite a correctable
error in the page.

Based on all these objectives the algorithm used here is as follows:
a) To run as fast as possible, the forward sweeps are done only
when one or more bad bits are detected by the reverse sweeps.
b) Initially, pages with any bad bits are eliminated, and a count
of these is kept.
c) After testing, if ’too many’ pages were eliminated, the entire
storage test is repeated eliminating only pages affected by
uncorrectable failures.
d) If any failures were detected, the MP shows the board number,
then the number of pages with uncorrectable failures, then the
number of pages with correctable failures.
%
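
%Illustrative sketch, not part of the microcode: a small Python model
of the check-then-complement sweep discussed above. It assumes a
hypothetical fault in which two addresses X and Y (Y .gr. X) are wired
to the same cell, and suggests why a forward sweep reports the failure
at Y while a reverse sweep reports it at X. The names AliasedStore,
sweep, and first_failures are invented for this example.

```python
class AliasedStore:
    """Toy storage in which address y is wired to the same cell as x."""

    def __init__(self, size, x, y):
        self.cells = [0] * size
        self.x, self.y = x, y

    def _cell(self, addr):
        # Any reference to y lands on x's cell (hypothetical fault).
        return self.x if addr == self.y else addr

    def read(self, addr):
        return self.cells[self._cell(addr)]

    def write(self, addr, value):
        self.cells[self._cell(addr)] = value


def sweep(store, size, expected, forward):
    """Check each address for `expected`, then write its complement.

    Returns the addresses at which the comparison failed.
    """
    order = range(size) if forward else range(size - 1, -1, -1)
    bad = []
    for addr in order:
        if store.read(addr) != expected:
            bad.append(addr)
        store.write(addr, expected ^ 0xFFFF)
    return bad


def first_failures(x=3, y=12, size=16):
    """Run one forward and one reverse sweep, each on fresh zeroed storage."""
    results = {}
    for forward in (True, False):
        store = AliasedStore(size, x, y)
        for addr in range(size):  # initializing sweep
            store.write(addr, 0)
        results[forward] = sweep(store, size, 0, forward)
    return results
```

In this model first_failures() reports only address 12 for the forward
sweep and only address 3 for the reverse sweep, so both directions are
needed to see both ends of the alias.
%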

%Initially, make any page with one or more bad bits a bad page;
if there are too many such pages, repeat less conservatively.
%
imStorageTest:
RLink0 ← HiA[TestingStorage];
SoftBadPages ← T ← 0C;
HardBadPages ← T, LoadPage[0];
T ← (RLink0) or (LoA[TestingStorage]), CallP[PNIP];
SoftQThreshold ← 0C;
imRepeatStorageTest:
StorageFaults ← 0C;

%Now write the 1st quadword in each 256-word real page with the
page number and some constants using map entry 0; sweep from page
7777b down to page 0. Storage boards using 16k RAMs span 128k
words but implement only 96k words of storage--does attempting to
write into the 32k-word ’hole’ clobber some lower word? If so, the
reverse scan is needed to write the lower address correctly after
it is clobbered by the write into the ’hole’.
%
WBuf0 ← 10000C;*max real page +1 = 4096d
MapAddr ← 140000C;*carries beyond max VM cause ALU Carry
*I think (?) that a missing storage board or a ’hole’ delivers all
*1’s data.
WBuf1 ← 326C;*Arbitrary constant
WBuf2 ← 134000C;
WBuf3 ← 0C, Call[.+2];
*imWriteMap returns here.
PStore4[XMAdLo,WBuf0,0];
WBuf0 ← T ← (WBuf0) - 1;
XMBuf0 ← T, Skip[ALU<0];
T ← (MapAddr) and (37400C), GoTo[imWriteMap1];

%During this phase, sweep upward through real storage and the map.
If the real page number and constants are in the 1st quadword of
a page, then the page exists--i.e., it is neither beyond the largest
storage address nor in the ’hole’ of a 96k storage board. If the
quadword does not have the expected value, then the page either
doesn’t exist or has uncorrectable data failures. Attempt to
distinguish non-existent from hard bad pages by counting 1 bits in
the XOR of the correct and actual values; if 4 or more bits are
wrong then the page is called non-existent, else it is
bad but left in for now--testing will presumably mark it bad later.

We go to this trouble because, if an existing page is skipped here
because of a hard error, then the HardBadPages count will be wrong.

Set bit 0 in task 17’s FFault register so that the fault handler
will exit to the location pointed at by Stack on MC2 errors rather
than crashing.
%
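
%Illustrative sketch, not part of the microcode: the existence test
described above, written in Python. It assumes the first quadword of
each page was written with the 12-bit page number followed by the
constants 326b, 134000b, and 0, as the enumeration pass does. The
function names and the returned strings are invented for this example.

```python
def count_ones(value):
    """Number of 1 bits in a 16-bit value."""
    return bin(value & 0xFFFF).count("1")


def classify_page(real_page, quadword):
    """Classify a page from the first quadword read back from it.

    `quadword` holds the four 16-bit words fetched; the expected
    contents are the page number and the constants listed above.
    """
    expected = (real_page & 0o7777, 0o326, 0o134000, 0)
    wrong = sum(count_ones(e ^ a) for e, a in zip(expected, quadword))
    if wrong == 0:
        return "exists"
    # 4 or more wrong bits: treat as a hole or beyond the last board.
    return "non-existent" if wrong >= 4 else "exists, probably bad"
```

A page that reads back all 1's is classified as non-existent here,
while a page with one to three wrong bits stays in the map so that the
later sweeps can count it as a hard bad page.
%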
RLink0 ← IP[FFault]C, Call[imSetStkP];
Stack ← (Stack) or (100000C);
RLink1 ← LoA[MC2ErrRet0], Call[imSetFaultRet];
*Zero quadword for PStore4.
T ← WBuf3 ← 0C, Call[imBlockSet1];
T ← RealPage ← 170000C;
imTloop:
XMBuf0 ← T;
*Point base reg at MapAddr and the map entry at RealPage.
*Set all map flags off except LogSE.
XMBuf0 ← (XMBuf0) and not (70000C), Call[imWriteMap];
PFetch4[XMAdLo,RBuf0,0];
*Return here on an MC2 error indicating that error correction
*occurred on the PFetch4 above. Ignore the MC2 error and compare
*the data irrespective of the fault.
RTemp1 ← 0C, At[MC2ErrRet0];
T ← LdF[RealPage,4,14];
T ← (RBuf0) xor T, Call[imCountErrs];
T ← (RBuf1) xor (326C), Call[imCountErrs];
T ← (RBuf2) xor (134000C), Call[imCountErrs];
T ← RBuf3, Call[imCountErrs];
*Page is non-existent if .ge. 4 bits wrong
LU ← (RTemp1) - (4C);
ZWord ← 374C, Skip[ALU<0];
GoTo[imNoPage];
*Page exists; zero it in preparation for first data test sweep.
MapAddr ← (MapAddr) + 1, Call[.+1];
T ← ZWord, GoTo[imNoPage,R<0];
PStore4[XMAdLo,WBuf0];
ZWord ← (ZWord) - (4C), Return;
imNoPage:
T ← RealPage ← (RealPage) + 1;
XMBuf0 ← 60000C, GoTo[imTloop,Carry’];
*Have completed mapping all the storage that exists; now mark the
*rest of the map entries vacant.
T ← LdF[MapAddr,3,15];
PageCount ← T, Call[imWriteMap];*Page vacant
*Loop here
MapAddr ← (MapAddr) + 1;
T ← (MapAddr) and (37400C), GoTo[imWriteMap1a,Carry’];

%Now test storage by writing and reading various patterns in both
forward and backward sweeps, as discussed earlier. Any unequal data
compare indicates an uncorrectable storage failure and the page is
marked bad.

The 128k-word bank of real memory in which an error is detected
is recorded in StorageFaults; bits 8 to 15d are a bit table
corresponding to banks 7, 6, ..., 0, where 1’s represent detected
errors in that bank. After testing, the MP will indicate what
storage boards need replacement.

Since LogSE is true in the map, a fault handler return occurs for
any detected failure. The register Transient is initialized to
100000b when testing begins for a page; any MC2 fault return causes
the sign bit to be cleared and the rest of the word incremented,
so a count of 1 to 63d indicates the number of quadwords causing MC2
fault returns on a particular sweep. If Transient is greater than
the value in SoftQThreshold, the page will be discarded; otherwise,
it is put into service.

NOTE: following the PFetch4/PStore4 sequence used in the tests
below, any MC2 fault on the PFetch4 happens AFTER completing the
first mi that touches RBuf--this is a memory controller bug, but
the data in RBuf will have been corrected, so it is not especially
harmful; also, the TPC clock is not disabled by abort, so any Call
in the mi which is aborted by the fault will clobber TPC before the
fault starts, even though that mi won’t be complete then.

If a page is discarded, the map entry at PageCount-1 is copied
into the map entry for the page being discarded; then the map entry
at PageCount-1 is made vacant; finally, PageCount is reduced by 1
and either the SoftBadPages or HardBadPages register is incremented.
%
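
%Illustrative sketch, not part of the microcode: the page-discard
bookkeeping described above, modeled in Python with a list standing in
for the map. VACANT and discard_page are names invented for this
example; the microcode does the same thing with XMap exchanges.

```python
VACANT = None  # stands in for the vacant map entry (60000b in the microcode)


def discard_page(page_map, page_count, bad_index):
    """Remove the page at bad_index from service.

    page_map[0:page_count] are the in-service entries. The entry at
    page_count-1 is moved into the slot being vacated, the old last
    slot becomes vacant, and the reduced page count is returned.
    """
    last = page_count - 1
    page_map[bad_index] = page_map[last]
    page_map[last] = VACANT
    return last
```

Because a different real page now occupies the vacated slot, a forward
sweep has to test that slot again, which is why imFSNewPage backs up
ZPage before calling imPageBad.
%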
RLink1 ← LoA[MC2ErrRet1], Call[imSetFaultRet], At[imBST,1];

%imBackSweep will return to caller+1 for every quadword during the
sweep. At the end of the sweep, it returns to caller+2.
%

*Check for all ’0’ data and check bits; write all ’1’ check bits.
imBS1:
T ← WBuf2 ← 100000C, Call[imBlockSet2], At[imBST,2];
WBuf3 ← 1C, Call[imBackSweep], At[imBST,3];
T ← RBuf0, GoTo[imAllZeroTest], At[imBST,4];

*Check for check bits all ’1’, write all ’0’ data and check bits.
imBS2:
T ← WBuf3 ← 0C, Call[imBlockSet1], At[imBST,5];
Call[imBackSweep], At[imBST,6];
LU ← (RBuf0) xor (100000C), GoTo[imChkB1Test], At[imBST,7];

*Check for check bits and data all ’0’, write all ’1’ data.
imBS3:
T ← WBuf3 ← (WBuf3) or not (0C), Call[imBlockSet1], At[imBST,10];
Call[imBackSweep], At[imBST,11];
T ← RBuf0, GoTo[imAllZeroTest], At[imBST,12];

*Check for data all ’1’; write 0 check bits and data.
imBS4:
T ← WBuf3 ← 0C, Call[imBlockSet1], At[imBST,13];
Call[imBackSweep], At[imBST,14];
LU ← (RBuf0) xnor (0C), GoTo[imAllOneTest], At[imBST,15];

%Here, we make the judgment that it is not worth doing the forward
sweeps unless we found some problem during the reverse sweeps.
It takes about 0.5 seconds/storage board to do all the forward
sweeps = about 24 minutes/year for machines with 384k words booted
3 times/day. Forward sweeps are intended to find the higher
addresses affected by on-chip addressing problems, but in this case
we expect that the lower addresses involved in such a failure would
have been detected during the reverse sweeps.
%
LU ← StorageFaults, At[imBST,16];
Skip[ALU#0];
GoTo[imClearLogSE];

*Check for check bits and data all ’0’; write all ’1’ check bits.
imFS1:
T ← WBuf2 ← 100000C, Call[imBlockSet2], At[imFST,1];
WBuf3 ← 1C, Call[imForwardSweep], At[imFST,2];
T ← RBuf0, GoTo[imAllZeroTest], At[imFST,3];

*Check for all ’1’ check bits; write all ’0’ data and check bits.
imFS2:
T ← WBuf3 ← 0C, Call[imBlockSet1], At[imFST,4];
Call[imForwardSweep], At[imFST,5];
LU ← (RBuf0) xor (100000C), GoTo[imChkB1Test], At[imFST,6];

*Check for all 0 data and check bits; write all ’1’ data.
imFS3:
T ← WBuf3 ← (WBuf3) or not (0C), Call[imBlockSet1], At[imFST,7];
Call[imForwardSweep], At[imFST,10];
T ← RBuf0, GoTo[imAllZeroTest], At[imFST,11];

*Check for all ’1’ data; zero storage.
*NOTE: Storage must be zeroed.
imFS4:
T ← WBuf3 ← 0C, Call[imBlockSet1], At[imFST,12];
Call[imForwardSweep], At[imFST,13];
LU ← (RBuf0) xnor (0C), GoTo[imAllOneTest], At[imFST,14];

imClearLogSE:
*Clear the ’Return on MC2 errors’ bit in FFault.
RLink0 ← IP[FFault]C, Call[imSetStkP], At[imFST,15];
Stack ← (Stack) and not (100000C);
T ← (PageCount) - 1;
imClearLogSELoop:
MapAddr ← T, GoTo[imDone,Carry’];
*Exchange garbage with the contents of the map entry at MapAddr.
Call[imWriteMap];
*Rewrite map entry with LogSE turned off.
XMBuf0 ← T;
XMBuf0 ← LdF[XMBuf0,1,17], Call[imWriteMap];
T ← (MapAddr) - 1, GoTo[imClearLogSELoop];

imChkB1Test:
LU ← (RBuf1) xor (100000C), Skip[ALU=0];
GoTo[imQWBad];
LU ← (RBuf2) xor (100000C), Skip[ALU=0];
GoTo[imQWBad];
LU ← (RBuf3) xor (1C), Skip[ALU=0];
imQWBad1:
GoTo[imQWBad];
imTestData:
Skip[ALU#0];
Return;
imQWBad:
Transient ← 40000C;
ZWord ← (ZWord) or not (0C), Return;

imAllOneTest:
LU ← (RBuf1) xnor (0C), Skip[ALU=0];
GoTo[imQWBad];
LU ← (RBuf2) xnor (0C), Skip[ALU=0];
GoTo[imQWBad];
LU ← (RBuf3) xnor (0C), DblGoTo[imTestData,imQWBad1,ALU=0];

imAllZeroTest:
T ← (RBuf1) or T;
T ← (RBuf2) or T;
LU ← (RBuf3) or T, GoTo[imTestData];

%Add the number of 1 bits in T to RTemp1 using RTemp as temporary
storage. Uses the fact that X and (-X) is equal to the right-most
1 in X. Called by the algorithm which enumerates storage.
%
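
%Illustrative sketch, not part of the microcode: the bit-counting loop
in Python, assuming 16-bit words. count_one_bits is a name invented
for this example.

```python
def count_one_bits(x):
    """Count the 1 bits in a 16-bit word by clearing the lowest 1 each pass."""
    x &= 0xFFFF
    count = 0
    while x:
        lowest = x & (-x & 0xFFFF)  # X and (-X) isolates the right-most 1
        x &= ~lowest & 0xFFFF       # remove that bit, like RTemp and not T
        count += 1
    return count
```

The loop runs once per 1 bit, so words with few wrong bits cost very
little time.
%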
imCountErrs:
RTemp ← T, LoadPage[opPage2];
T ← (Zero) - T, GoToP[.+1];
OnPage[opPage2];
T ← (RTemp) and T, GoTo[.+3,ALU=0];
RTemp1 ← (RTemp1) + 1, LoadPage[InitialPage1];
T ← RTemp ← (RTemp) and not T, LoadPage[opPage2], GoToP[.-3];
Return;
OnPage[InitialPage1];

%If any errors were detected, whether or not they were ignored, show
an MP code (400 + 1, 2, 4, 8, 16, 32, 64, and/or 128) to indicate
which banks have problems and dally a little while to let the user
view the MP. Then show the number of pages with uncorrectable
failures; then show the number of pages with correctable failures;
finally, show the Blk and Syndrome values of the last error
correction fault.

As a keyboard boot option, loop the map and storage tests until some
failure is detected; then loop the MP display of the four error
codes. Otherwise, if ’too many’ pages were discarded because of
correctable failures, repeat testing with SoftQThreshold .eq.
BadQThreshold. ’Too many’ is defined as (SoftQThreshold .eq. 0) &
(SoftBadPages .ne. 0) & ((PageCount .ls. MinGoodPages) OR
((PageCount .ls. EnoughGoodPages) & (SoftBadPages .gr. (PageCount rsh 3)))).
%
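
%Illustrative sketch, not part of the microcode: one reading of the
repeat decision made at imDone0, written in Python. It assumes
MinGoodPages is 256d pages (it is defined outside this file) and uses
EnoughGoodPages as the upper limit, as the code at imDone0 does.
repeat_threshold and its parameter names are invented for this
example.

```python
BAD_Q_THRESHOLD = 0o100      # MC[BadQThreshold,100]
ENOUGH_GOOD_PAGES = 0o1000   # MC[EnoughGoodPages,1000]
MIN_GOOD_PAGES = 256         # assumed: 256d pages, defined outside this file


def repeat_threshold(page_count, soft_bad_pages, soft_q_threshold,
                     loop_until_failure=False):
    """Return the SoftQThreshold for a second pass, or None for no repeat."""
    if loop_until_failure or soft_q_threshold != 0 or soft_bad_pages == 0:
        return None
    if page_count < MIN_GOOD_PAGES:
        return 0o100  # repeat maximally aggressively
    if page_count < ENOUGH_GOOD_PAGES and soft_bad_pages > (page_count >> 3):
        return BAD_Q_THRESHOLD
    return None
```

With these values a machine left with 400 good pages repeats the test
only if more than 50 pages were set aside for correctable failures.
%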

Set[RefConstant,Add[50017,LShift[RShift[RefreshPeriod,6],4]]];

imDone:
LU ← StorageFaults;
StorageFaults ← (StorageFaults) + (HiA[TestingStorage]), GoTo[imDone0,ALU#0];
*No errors were detected; determine whether to loop or finish.
LU ← (BootType) xor (400C);
RBuf0 ← HiA[RefConstant], GoTo[imDone2,ALU#0];
MapAddr ← 140000C, GoTo[imMapTest];

imDone0:
LoadPage[opPage2];*Failures were detected.
LU ← (BootType) xor (400C), GoToP[.+1];
OnPage[opPage2];
*Check for less conservative repeat.
LU ← SoftQThreshold, Skip[ALU#0];
GoTo[imDone1];*No repeats if looping until failure
LU ← SoftBadPages, Skip[ALU=0];
GoTo[imDone1];*No repeat if we already repeated
ZWord ← HiA[MinGoodPages], Skip[ALU#0];
GoTo[imDone1];*No repeat if no soft bad pages
T ← (ZWord) or (LoA[MinGoodPages]);
LU ← (PageCount) - T;
LU ← (PageCount) - (EnoughGoodPages), Skip[ALU>=0];
*Repeat maximally aggressively if PageCount < MinGoodPages
SoftQThreshold ← 100C, GoTo[imRepeat];
T ← RSh[PageCount,3], Skip[ALU<0];
GoTo[imDone1];*No repeat if enough good pages
*No repeat if bad pages .ls. 1/16 to 1/8 of good pages
LU ← (SoftBadPages) - T - 1;
SoftQThreshold ← BadQThreshold, GoTo[imRepeat,ALU>=0];

*Have failure(s) and not repeating.
imDone1:
*Show the 400 + sum(2↑board) MP value.
T ← (StorageFaults) + (LoA[TestingStorage]), Call[imMPDelay];
*Show the count of pages with uncorrectable errors.
T ← HardBadPages, Call[imMPDelay];
*Show the count of pages with correctable errors.
T ← SoftBadPages, Call[imMPDelay];
*Show Blk.0 and 1 (high true) and the relevant 8-bit syndrome.
RLink0 ← IP[xStorageFaults]C;
StkP ← RLink0;
T ← Stack, Call[imShowOctal];
LU ← (BootType) xor (400C);
Skip[ALU#0];
GoTo[imDone1];*Loop the MP failure display
LoadPage[InitialPage1];
RBuf0 ← HiA[RefConstant], GoToP[imDone2];

imRepeat:
LoadPage[InitialPage1];
GoToP[imRepeatStorageTest];

*Convert octal number in T into the decimal number which will
*have the correct octal characters when shown on the MP.
*Then show the number on the MP. Smashes RBuf0-3 and RLink0.
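%Illustrative sketch, not part of the microcode: the conversion done
by imShowOctal, in Python. It assumes only the low four octal digits
(12 bits) are converted, which matches the LdF fields used below.
mp_display_value is a name invented for this example; the microcode
multiplies by repeated addition in imMulAdd instead.

```python
def mp_display_value(octal_value):
    """Return the decimal number whose digits match the low four octal
    digits of octal_value."""
    result = octal_value & 0o7                    # units digit
    result += ((octal_value >> 9) & 0o7) * 1000   # thousands digit
    result += ((octal_value >> 6) & 0o7) * 100    # hundreds digit
    result += ((octal_value >> 3) & 0o7) * 10     # tens digit
    return result
```

For example, the value 1234b becomes the decimal number 1234, so the
MP shows the octal digits unchanged.
%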
imShowOctal:
RBuf2 ← T, UseCTask;*Save octal value.
T ← APCTask&APC;
RLink0 ← T;*Save procedure return
T ← LdF[RBuf2,15,3];
RBuf0 ← T;*Accumulate result in RBuf0.
RBuf1 ← HiA[1750];*1000d = 1750b
RBuf1 ← (RBuf1) or (LoA[1750]);
T ← LdF[RBuf2,4,3], Call[imMulAdd];
RBuf1 ← 144C;*100d = 144b
T ← LdF[RBuf2,7,3], Call[imMulAdd];
RBuf1 ← 12C;*10d = 12b
T ← LdF[RBuf2,12,3], Call[imMulAdd];
LoadPage[0], GoToP[imMPDelay1];

imMulAdd:
RBuf3 ← T;
imMulAddLp:
T ← RBuf1, Skip[ALU#0];
Return;
RBuf0 ← (RBuf0) + T;
RBuf3 ← (RBuf3) - 1, GoTo[imMulAddLp];

%Loop to delay for a period of time so that user can notice warning
on maintenance panel. imMPDelay delays for
(OuterLoopCount+2)*(2*InnerLoopCount+13)+5 cycles
and then returns.
Values below make the total delay 32,514 x 311 + 5 = 10,111,859
cycles, or about 1.01 sec with a 40 MHz crystal.
%
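
%Illustrative sketch, not part of the microcode: the delay arithmetic
from the comment above, in Python. The 100 ns cycle time is an
assumption inferred from the 1.01 sec figure; it is not stated in this
file. The function names are invented for this example.

```python
OUTER_LOOP_COUNT = 0o77400   # MC[OuterLoopCount,77400]
INNER_LOOP_COUNT = 0o225     # MC[InnerLoopCount,225]
CYCLE_SECONDS = 100e-9       # assumed cycle time; not stated in this file


def mp_delay_cycles(outer=OUTER_LOOP_COUNT, inner=INNER_LOOP_COUNT):
    """Cycles used by imMPDelay according to the formula in the comment."""
    return (outer + 2) * (2 * inner + 13) + 5


def mp_delay_seconds():
    """Delay in seconds under the assumed cycle time."""
    return mp_delay_cycles() * CYCLE_SECONDS
```

Changing either loop constant and calling mp_delay_cycles again gives
a quick check of a new delay before reassembling.
%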
MC[OuterLoopCount,77400];
*= 32,512d
MC[InnerLoopCount,225];
*This makes 2*InnerLoopCount+13 = 311

imMPDelay:
RBuf0 ← T, UseCTask;
T ← APCTask&APC;
RLink0 ← T, LoadPage[0];
imMPDelay1:
T ← RBuf0, CallP[PNIP];
RBuf1 ← OuterLoopCount;
imSweepDelay:
RBuf0 ← InnerLoopCount, Call[imMPD1];
RBuf1 ← (RBuf1) - 1, Skip[R<0];
GoTo[.-2];
APCTask&APC ← RLink0, GoTo[imMPDRet];

*This subroutine returns after 2*RBuf0 + 7 cycles.
imMPD1:
RBuf0 ← (RBuf0) - 1, GoTo[.,R>=0];
imMPDRet:
Return;


OnPage[InitialPage1];

imRL0Ret:
APCTask&APC ← RLink0;
imRet:
Return;

imSaveRL0:
RLink0 ← T, Return;

%Change timer period from its current slowed value to the standard
value used by the emulators--see comment in Initial.Mc. Then, if
PageCount .ls. MinGoodPages (.eq. 256d pages or 64k words), crash
with the NotEnoughMemory MP code.
%

imDone2:
RLink0 ← IP[RTimer]C, Call[imSetStkP];
T ← (RBuf0) or (LoA[RefConstant]), Call[imPush];
ZWord ← HiA[MinGoodPages];
RLink0 ← Sub[IP[xPageCount],1]C, Call[imSetStkP];
*Save test results in high RM registers where emulator can avoid
*smashing them more easily.
T ← PageCount, Call[imPush];
T ← StorageFaults ← (StorageFaults) - (HiA[TestingStorage]), Call[imPush];
T ← HardBadPages, Call[imPush];
T ← SoftBadPages, Call[imPush];
T ← (ZWord) or (LoA[MinGoodPages]);
LU ← (PageCount) - T, LoadPage[InitialPage];
T ← NotEnoughMemory, DblGoTo[InitFail,MemInitDone,ALU<0];

*Make the fault handler send control to MC2ErrRet0 on a fault.
*(This is below the label "imTloop".)
imSetFaultRet:
RLink1 ← (RLink1) or (HiA[MC2ErrRet0]);
RLink0 ← IP[RLink1]C;
imSetStkP:
StkP ← RLink0, Return;

imPush:
Stack&+1 ← T, Return;

imInitForwardPage:
ZWord ← 0C, Skip;
imInitBackPage:
ZWord ← 374C;
XMAdLo ← T;
Transient ← 100000C, Return;

imBSNewPage:
Transient ← LdF[Transient,1,17], Skip[R<0];
UseCTask, Call[imMarkSector];*One or more failures
T ← SoftQThreshold;
LU ← (Transient) - T - 1;
Skip[ALU<0];
*Too many failures or uncorrectable failure
UseCTask, Call[imPageBad];
ZPage ← (ZPage) - 1, GoTo[imBSBegin];
imBackSweep:
T ← (PageCount) - 1;
ZPage ← T, UseCTask;
*RLink0 ← where to go to compare data; RLink0+1 is where to go
*at end of sweep.
T ← APCTask&APC, Call[imSaveRL0];
imBSBegin:
T ← LHMask[ZPage], Skip[R>=0];
RLink0 ← (RLink0) + 1, GoTo[imRL0Ret];
XMAdHi ← T;
T ← LSh[ZPage,10], Call[imInitBackPage];
*Return here after each successful data compare
T ← ZWord, GoTo[imBSNewPage,R<0];
PFetch4[XMAdLo,RBuf0];
ZWord ← (ZWord) - (4C);
imFSDisp:
APCTask&APC ← RLink0;
PStore4[XMAdLo,WBuf0], Return;

*MC2 fault return
APCTask&APC ← RLink0, At[MC2ErrRet1];
Transient ← (LdF[Transient,1,17]) + 1, Return;

imForwardSweep:
UseCTask;
*RLink0 ← where to go to compare data; RLink0+1 is where to go
*at end of sweep.
T ← APCTask&APC, Call[imSaveRL0];
ZPage ← 0C, GoTo[imFSBegin];
imFSNewPage:
Transient ← LdF[Transient,1,17], Skip[R<0];
UseCTask, Call[imMarkSector];*One or more failures
T ← SoftQThreshold;
LU ← (Transient) - T - 1;
ZPage ← (ZPage) + 1, Skip[ALU<0];
*Too many failures or uncorrectable one; since the page at the
*last good map entry will be substituted for the one removed,
*continue testing at the same page.
ZPage ← (ZPage) - 1, UseCTask, Call[imPageBad];
imFSBegin:
T ← (PageCount) - 1;
LU ← (ZPage) - T;
T ← LHMask[ZPage], Skip[ALU<0];
RLink0 ← (RLink0) + 1, GoTo[imRL0Ret];
XMAdHi ← T;
T ← LSh[ZPage,10], Call[imInitForwardPage];
*Return here after data compare for each word of page.
LU ← (ZWord) + (177400C);
T ← ZWord, GoTo[imFSNewPage,Carry];
PFetch4[XMAdLo,RBuf0];
ZWord ← (ZWord) + (4C), GoTo[imFSDisp];

%Come here when data on the page cannot be read correctly or when
more than SoftQThreshold correctable errors occurred on the page
during one of the sweeps.
ZPage is the map entry affected; UseCTask in calling mi.
%
imPageBad:
T ← APCTask&APC, Call[imSaveRL2];
*Exchange Vacant with the map entry at PageCount-1.
XMBuf0 ← 60000C;
T ← PageCount ← (PageCount) - 1, Call[imWriteMapT];
*Exchange the value obtained from the map entry at PageCount-1
*with the contents of the map entry at ZPage.
XMBuf0 ← T;
T ← ZPage, Call[imWriteMapT];
imRL2Ret:
APCTask&APC ← RLink2, GoTo[imRet];

imSaveRL2:
RLink2 ← T, Return;

%Mark the 128k word sector in StorageFaults.
NOTE: imMarkSector must be called before imPageBad.
ZPage is the map entry affected; UseCTask in calling mi.
%
imMarkSector:
T ← APCTask&APC, Call[imSaveRL2];
*Exchange garbage with the contents of the map entry at ZPage.
T ← ZPage, Call[imWriteMapT];
*Rewrite the map entry.
XMBuf0 ← T, Call[imWriteMap];
*’OR’ 1 LShift[128k-word region number] into StorageFaults.
XMBuf1 ← 100000C;
XMBuf0 ← LdF[XMBuf0,4,3];
XMBuf0 ← (XMBuf0) - 1;
T ← XMBuf1 ← LCy[XMBuf1,1], GoTo[.-1,ALU>=0];
%Only count bad pages during the first pass of the storage test
when any failure is treated as a bad page. During the second pass,
pages already counted in SoftBadPages will be retested and may be
counted again, screwing up the statistics.
%
LU ← SoftQThreshold;
LU ← (Transient) - (100C), Skip[ALU=0];
StorageFaults ← (StorageFaults) or T, GoTo[imRL2Ret];
StorageFaults ← (StorageFaults) or T, Skip[ALU<0];
SoftBadPages ← (SoftBadPages) + 1, GoTo[imRL2Ret];
HardBadPages ← (HardBadPages) + 1, GoTo[imRL2Ret];

*imBlockSet1 puts T into WBuf0-2.
imBlockSet1:
WBuf2 ← T;
imBlockSet2:
WBuf1 ← T;
WBuf0 ← T, UseCTask;
T ← APCTask&APC, Call[imSaveRL0];
%Now delay about twice the refresh period to allow leaky RAMs to
lose charge. During this delay, the only storage references are
those by the Timer task’s Refresh; at other times, ordinary storage
references refresh all words in the referenced row except the one
being read. Since the refresh period is 3840*32 = 122880 cycles,
the required constant here is (245760-1003)/311 = 787 = 1423b.
%
RBuf1 ← HiA[1423];
LoadPage[opPage2];
RBuf1 ← (RBuf1) or (LoA[1423]), GoToP[imSweepDelay];

:END[MemInit];