:TITLE[MesaOP0];*Mesa opcodes 0b - 77b
%
Ed Fiala 19 March 1984: Fix bug in failing to reload IBuf on return from
NopInt with interrupts disabled for 2 byte or longer opcodes; move
refill trap to MesaOP2.mc; add absolute placement to BackPC for Fault.mc;
change register used by BackPC from WW (conflict with prData) to xBuf2.
Ed Fiala 8 November 1983: Speed up @BNDCK by 3 cycles and add BndCk1
label for BNDCKL; move P4NoTrap; change at BackTrap; change TrapParm to
xfTrapParm0.
Ed Fiala 27 April 1983: change WW to prIntSaveStkP in IdleInt for MesaP;
define P4Push for MesaP when CacheLocals=0.
Ed Fiala 21 December 1982: Removed UNLESS[WithCedar] conditional from trap
for opcodes 76-77; Cedar now uses opcodes 372-375.
25 August 1982 by Ed Fiala.
%

%When a device needs interrupt service, its service routine sets some bit
in NWW and sets IntPending true in the RS232 interface. IntPending causes
NextInst to send control to @NOP below rather than to the next opcode’s
starting address. Since interrupt testing is handled automatically by the
hardware, NOP is the only short-running opcode that needs to test for
interrupts. Long or indefinitely executing opcodes jump to NopInt when
IntPending is true. BLTstop (MesaOP3), @CKSUM (MesaESC), BitBlt, and
TextBlt do this; BitBlt and TextBlt call NopInt only when xfWDC .eq. 0.

NopInt backs up the PC, turns off IntPending in RSImage and in the RS232
hardware, and then restarts the opcode if xfWDC .ne. 0 or services the
interrupt if xfWDC .eq. 0.

IdleInt is called by the idle loop in MesaP.
%
@NOP:
GoTo[NopInt,IntPending], Opcode[0];
P4Tail:
LU ← NextInst[IBuf];
P4Tailx:
NIRet;

NopInt:
T ← (PCXreg) - 1, Call[BackPC];
T ← (SStkP&NStkP) xor (377C);
IdleInt:
prIntSaveStkP ← IP[RSImage]C, Task;
StkP ← prIntSaveStkP, prIntSaveStkP ← T, NoRegILockOK;
LU ← xfWDC;
*Turn off IntPending in RSImage and in the hardware RS232.
T ← Stack ← (Stack) and not (IntPendingBit), Skip[ALU#0];
Stack&+1, RS232 ← T, GoTo[Interrupt];*Point at NWW
*IBuf must be refilled in case BackPC has backed into the previous
*quadword. Since 1 .le. PCF .le. 10b, this can only happen on opcodes
*which are 2 or more bytes long and get here by manually checking for
*interrupts.
PFetch4[PCB,IBuf,0], Call[P4Ret];
StkP ← prIntSaveStkP, RS232 ← T, GoTo[P4Tail];

P4IncT:
T ← (Zero) + T + 1, Return;


%Call from prBackPCFault (MesaP); CallExternal from Fault.mc; 2 calls here.
Always do PCF←PCX-1; only PCB is in doubt. PCB is backed up to the previous
quadword when (PCX .ne. 0 & PCF .ne. PCX-1) or (PCF < PCX). The check for
PCF .ne. PCX-1 is needed because nested faults and traps may call BackPC
several times.
%
BackPC:
LU ← (PCFreg) - T, Skip[ALU>=0], At[BackupPCLoc];
*PCX was 0; opcode started in prev. quadword
T ← 7C, GoTo[.-1];
BackPCa:
xBuf2 ← T, Skip[ALU>=0];
PCB ← (PCB) - (4C);*PCX large, PCF small => prev. quadword
PCF ← xBuf2, Return;

*DblGoTo[P4Trap,P4NoTrap,...] from NILCKL and NILCK in MesaESC.
P4NoTrap:
LU ← NextInst[IBuf], CallX[P4Tailx];*Odd; paired with P4Trap

%UDIV, LUDIV, NILCK, NILCKL and BNDCK jump to P4Trap with SD offset in
RTemp. Cedar and MesaFP Esc opcodes have (EscTrapOffset - SDOffset)/2 +
alpha in RTemp; Cedar AssignRef has (EscTrapOffset - SDOffset)/2 +
ARFakeAlpha (=0) in RTemp.

@MW in MesaP jumps to BackSPPCandTrap.

Unimplemented regular opcodes jump to SDTrap with the opcode number in
xfTrapParm0 and sOpcodeTrap in RTemp.

Unimplemented Esc/EscL opcodes jump to BackTrap with alpha in xfTrapParm0
and EscTrapOffset + 2*alpha in RTemp+T.

Timing from BackTrap to SavePCXfer = 16 to 24 cycles.
%
P4Trap:
T ← SStkP;*Even; paired with P4NoTrap
BackSPPCandTrap:
WW ← T, Task;
StkP ← WW;
SDTrap:
T ← (RTemp) + (SDOffset);
BackTrap:
T ← (RTemp) + T;*T ← MDS displacement of ControlLink
PFetch2[MDS,xfTemp];*Fetch control link
T ← (PCXreg) - 1, Call[BackPC];
*Since xfTypeCall=1 and xfTypeTrap=5, the OR of these values in SavePCXfer
*will correctly equal 5.
MemStat ← Trap;
LoadPage[xfPage1];
MemStat ← (MemStat) or (Or[EarlyXfer!,xfTypeTrap!]C), GoToP[SavePCXfer];

:IF[CacheLocals]; *************************************
P4PushT:
LU ← NextInst[IBuf];
Stack&+1 ← T, NIRet;

*Timing: 8.25 cycles.
@LL0:
T ← LocalCache0, GoTo[P4PushT], Opcode[1];*Load Local 0-3
@LL1:
T ← LocalCache1, GoTo[P4PushT], Opcode[2];
@LL2:
T ← LocalCache2, GoTo[P4PushT], Opcode[3];
@LL3:
T ← LocalCache3, GoTo[P4PushT], Opcode[4];

P4PushTR:
Stack&+1 ← T, Return;

*Timing: 12.25 cycles.
@LLD0:
T ← LocalCache0, Call[P4PushTR], Opcode[16];*Load Local Double 0-3
T ← LocalCache1, GoTo[P4PushT];
@LLD1:
T ← LocalCache1, Call[P4PushTR], Opcode[17];
T ← LocalCache2, GoTo[P4PushT];
@LLD2:
T ← LocalCache2, Call[P4PushTR], Opcode[20];
T ← LocalCache3, GoTo[P4PushT];
@LLD3:
T ← LocalCache3, Call[P4PushTR], Opcode[21];
PFetch1[LOCAL,Stack,4], GoTo[P4Tail];

*Timing: 16.25 cycles.
SL0:
T ← Stack&-1, GoTo[SL0x], Opcode[31];*Store Local 0-3
SL1:
T ← Stack&-1, GoTo[SL1x], Opcode[32];
SL2:
T ← Stack&-1, GoTo[SL2x], Opcode[33];
SL3:
T ← Stack&-1, GoTo[SL3x], Opcode[34];

StoreLocalCache:
LU ← NextInst[IBuf];
PStore4[LOCAL,LocalCache0,0], NIRet;

P4Push:
Stack&+1, FreezeResult, Return;

*Store Local Double 0-3. Timing: 20.25 cycles SLD0-2, 27.25 cycles SLD3
*OK to jump to the opcode starting addresses because page faults are
*impossible.
@SLD0:
T ← Stack&-1, Opcode[45];
LocalCache1 ← T, GoTo[SL0];

@SLD1:
T ← Stack&-1, Opcode[46];
LocalCache2 ← T, GoTo[SL1];

@SLD2:
T ← Stack&-1, Opcode[47];
LocalCache3 ← T, GoTo[SL2];

@SLD3:
PStore1[LOCAL,Stack,4], GoTo[SL3], Opcode[50];

*Put Local 0-3, Byte. Timing 18.25 cycles.
*Have to do Stack&-1 to interlock possible PFetch2 to the stack.
@PL0:
T ← Stack&-1, Call[P4Push], Opcode[55];
SL0x:
LocalCache0 ← T, GoTo[StoreLocalCache];
@PL1:
T ← Stack&-1, Call[P4Push], Opcode[56];
SL1x:
LocalCache1 ← T, GoTo[StoreLocalCache];
@PL2:
T ← Stack&-1, Call[P4Push], Opcode[57];
SL2x:
LocalCache2 ← T, GoTo[StoreLocalCache];
@PL3:
T ← Stack&-1, Call[P4Push], Opcode[60];
SL3x:
LocalCache3 ← T, GoTo[StoreLocalCache];

*Timing: 20.25 cycles.
@PLD0:
T ← Stack&-1, Opcode[62];
LocalCache1 ← T;
T ← Stack&+1, GoTo[SL0x];

:ELSE; ************************************************

*Load Local 0-3. Timing: 12.25 cycles if next opcode(s) doesn’t reference
*TOS, else 16.25 cycles
@LL0:
PFetch1[LOCAL,Stack,0], GoTo[P4Tail], Opcode[1];
@LL1:
PFetch1[LOCAL,Stack,1], GoTo[P4Tail], Opcode[2];
@LL2:
PFetch1[LOCAL,Stack,2], GoTo[P4Tail], Opcode[3];
@LL3:
PFetch1[LOCAL,Stack,3], GoTo[P4Tail], Opcode[4];

*Load Local Double 0-3
*Timing: 11.25 cycles if next opcode doesn’t reference TOS,
*else 16.25 cycles.
@LLD0:
PFetch2[LOCAL,Stack,0], GoTo[P4Tail], Opcode[16];
@LLD1:
PFetch2[LOCAL,Stack,1], GoTo[P4Tail], Opcode[17];
@LLD2:
PFetch2[LOCAL,Stack,2], GoTo[P4Tail], Opcode[20];
*Timing: 16.25 cycles if next opcode(s) doesn’t reference TOS, else
*22.25 cycles.
@LLD3:
PFetch1[LOCAL,Stack,3], Opcode[21];
PFetch1[LOCAL,Stack,4], GoTo[P4Tail];

*Store Local 0-3. Timing: 11.25 to 19.25 cycles.
@SL0:
PStore1[LOCAL,Stack,0], GoTo[P4Tail], Opcode[31];
@SL1:
PStore1[LOCAL,Stack,1], GoTo[P4Tail], Opcode[32];
@SL2:
PStore1[LOCAL,Stack,2], GoTo[P4Tail], Opcode[33];
@SL3:
PStore1[LOCAL,Stack,3], GoTo[P4Tail], Opcode[34];

*Store Local Double 0-3.
*Timing: 12.25 to 19.25 cycles.
@SLD0:
PStore2[LOCAL,Stack,0], GoTo[P4Tail], Opcode[45];
@SLD1:
PStore2[LOCAL,Stack,1], GoTo[P4Tail], Opcode[46];
@SLD2:
PStore2[LOCAL,Stack,2], GoTo[P4Tail], Opcode[47];
*Timing: 28.25 to 36.25 cycles.
@SLD3:
PStore1[LOCAL,Stack,4], Call[P4Ret], Opcode[50];
PStore1[LOCAL,Stack,3], GoTo[P4Tail];

*Put Local 0-3
@PL0:
PStore1[LOCAL,Stack,0], GoTo[PutTail], Opcode[55];
@PL1:
PStore1[LOCAL,Stack,1], GoTo[PutTail], Opcode[56];
@PL2:
PStore1[LOCAL,Stack,2], GoTo[PutTail], Opcode[57];
@PL3:
PStore1[LOCAL,Stack,3], GoTo[PutTail], Opcode[60];

*Put Local Double 0
@PLD0:
PStore2[LOCAL,Stack,0], GoTo[Put2Tail], Opcode[62];

P4Push:
Stack&+1, FreezeResult, Return;*Used by MesaP.Mc
:ENDIF; ***********************************************

*Load Local 4-11, Byte
@LL4:
PFetch1[LOCAL,Stack,4], GoTo[P4Tail], Opcode[5];
@LL5:
PFetch1[LOCAL,Stack,5], GoTo[P4Tail], Opcode[6];
@LL6:
PFetch1[LOCAL,Stack,6], GoTo[P4Tail], Opcode[7];
@LL7:
PFetch1[LOCAL,Stack,7], GoTo[P4Tail], Opcode[10];
@LL8:
PFetch1[LOCAL,Stack,10], GoTo[P4Tail], Opcode[11];
@LL9:
PFetch1[LOCAL,Stack,11], GoTo[P4Tail], Opcode[12];
@LL10:
PFetch1[LOCAL,Stack,12], GoTo[P4Tail], Opcode[13];
@LL11:
PFetch1[LOCAL,Stack,13], GoTo[P4Tail], Opcode[14];

@LLB:
T ← NextData[IBuf], CallX[LocalF], Opcode[15];

*Load Local Double 4-8, 10, Byte
@LLD4:
PFetch2[LOCAL,Stack,4], GoTo[P4Tail], Opcode[22];
@LLD5:
PFetch2[LOCAL,Stack,5], GoTo[P4Tail], Opcode[23];
@LLD6:
PFetch2[LOCAL,Stack,6], GoTo[P4Tail], Opcode[24];
@LLD7:
PFetch1[LOCAL,Stack,7], Opcode[25];
PFetch1[LOCAL,Stack,10], GoTo[P4Tail];
@LLD8:
PFetch2[LOCAL,Stack,10], GoTo[P4Tail], Opcode[26];
@LLD10:
PFetch2[LOCAL,Stack,12], GoTo[P4Tail], Opcode[27];

*The P4 double reads (LLDB and LGDB) are restartable regardless of which
*reference faults because the address of the fetch does not depend on
*the first stack word overwritten, so we can do these more straightforwardly
*than the ones on P5.
P4FQT:
Skip[QuadOvf];
LU ← NextInst[IBuf], Call[P4Tailx];
Stack&-2, Return;*Point StkP at 1st word

*In LLDB and LGDB, where there are no complications with fault handling,
*the double fetch to the stack averages 2.75 cycles faster than 2 single
*fetches, assuming that .25 of the doublewords cross quadword boundaries.

@LLDB:
T ← NextData[IBuf], Opcode[30];
PFetch2[LOCAL,Stack], Call[P4FQT];
*If P4FQT returns, QuadOvf occurred and StkP has been restored to its value
*prior to the PFetch2.
PFetch1[LOCAL,Stack], Call[P4IncT];*Fetch & push 1st word
LocalF:
PFetch1[LOCAL,Stack], GoTo[P4Tail];*T+1 and push 2nd word

*Store Local 4-10, Byte
@SL4:
PStore1[LOCAL,Stack,4], GoTo[P4Tail], Opcode[35];
@SL5:
PStore1[LOCAL,Stack,5], GoTo[P4Tail], Opcode[36];
@SL6:
PStore1[LOCAL,Stack,6], GoTo[P4Tail], Opcode[37];
@SL7:
PStore1[LOCAL,Stack,7], GoTo[P4Tail], Opcode[40];
@SL8:
PStore1[LOCAL,Stack,10], GoTo[P4Tail], Opcode[41];
@SL9:
PStore1[LOCAL,Stack,11], GoTo[P4Tail], Opcode[42];
@SL10:
PStore1[LOCAL,Stack,12], GoTo[P4Tail], Opcode[43];
**Require alpha > 4 here if local cache is used.
@SLB:
T ← NextData[IBuf], Opcode[44];
PStore1[LOCAL,Stack], GoTo[P4Tail];

*Store Local Double 4-6, 8, Byte
@SLD4:
PStore2[LOCAL,Stack,4], GoTo[P4Tail], Opcode[51];
@SLD5:
PStore2[LOCAL,Stack,5], GoTo[P4Tail], Opcode[52];
@SLD6:
PStore2[LOCAL,Stack,6], GoTo[P4Tail], Opcode[53];
@SLD8:
PStore2[LOCAL,Stack,10], GoTo[P4Tail], Opcode[54];

*Stack&+n, GoTo[P4Tail] would invoke the interlock, so the Put tails
*increment StkP by hand to avoid it.
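*Worked check of the by-hand increment in PutTail and Put2Tail (illustrative;
*assumes NStkP reads as the ones complement of StkP in the low byte and that
*StkP ← uses only the low byte): 0 - X = (not X) + 1, so the low byte of
*0 - NStkP is StkP + 1.
*With StkP = 5: NStkP = 372b, and 400b - 372b = 6 (PutTail).
*With the extra - 1: 372b - 1 = 371b, and 400b - 371b = 7 = StkP + 2 (Put2Tail).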
:IF[CacheLocals]; *************************************
*Jump here only from Put2Tail, where T contains LOCAL+0 to 4 if locals 0 to 3
*were touched.
Put2Tail:
T ← (LOCAL) - T;
LU ← (Form4[AllOnes]) + T;
T ← (SStkP&NStkP) - 1, FreezeResult;
RTemp ← (Zero) - T, Skip[ALU<0];
PFetch4[LOCAL,LocalCache0,0];
PutTy:
LU ← NextInst[IBuf];
StkP ← RTemp, NIRet;

PutTail:
T ← SStkP&NStkP;
RTemp ← (Zero) - T, GoTo[PutTy];
:ELSE; ************************************************
Put2Tail:
T ← (SStkP&NStkP) - 1, GoTo[PutTx];
PutTail:
T ← SStkP&NStkP;
PutTx:
RTemp ← (Zero) - T;
LU ← NextInst[IBuf];
StkP ← RTemp, NIRet;
:ENDIF; ***********************************************

**Require alpha > 4 here if local cache is used.
@PLB:
T ← NextData[IBuf], Opcode[61];*Put Local Byte
PStore1[LOCAL,Stack], GoTo[PutTail];

@PLDB:
T ← NextData[IBuf], Opcode[63];
PStore2[LOCAL,Stack];
T ← (RZero) + T + 1, GoTo[Put2Tail,QuadOvf’];*Bypass kludge
Stack&+2;
PStore1[MDS,Stack], Task;
T ← (AllOnes) + T;
PStore1[MDS,Stack], GoTo[Put2Tail];

*Global opcodes
@LG0:
PFetch1[GLOBAL,Stack,0], GoTo[P4Tail], Opcode[64];
@LG1:
PFetch1[GLOBAL,Stack,1], GoTo[P4Tail], Opcode[65];
@LG2:
PFetch1[GLOBAL,Stack,2], GoTo[P4Tail], Opcode[66];

@LGB:
T ← NextData[IBuf], CallX[GlobalF], Opcode[67];

@LGD0:
PFetch2[GLOBAL,Stack,0], GoTo[P4Tail], Opcode[70];
@LGD2:
PFetch2[GLOBAL,Stack,2], GoTo[P4Tail], Opcode[71];

@LGDB:
T ← NextData[IBuf], Opcode[72];
PFetch2[GLOBAL,Stack], Call[P4FQT];
*If P4FQT returns, QuadOvf occurred. See LLDB.
PFetch1[GLOBAL,Stack], Call[P4IncT];
GlobalF:
PFetch1[GLOBAL,Stack], GoTo[P4Tail];

@SGB:
T ← NextData[IBuf], Opcode[73];
PStore1[GLOBAL,Stack], GoTo[P4Tail];

*Bounds Check--Trap if 2OS >= TOS (unsigned)
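*Worked check (illustrative; assumes Carry is the carry out of the 16-bit
*subtract 2OS + not(TOS) + 1): 2OS = 5, TOS = 12b gives no carry, so
*Skip[Carry’] skips the trap; 2OS = 12b, TOS = 12b gives a carry, so it traps.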
@BNDCK:
T ← Stack&-1, Opcode[74];
LU ← (Stack) - T;
BndCk1:
RTemp ← sBoundsTrap, Skip[Carry’];
T ← SStkP, GoTo[BackSPPCandTrap];
LU ← NextInst[IBuf], CallX[P4Tailx];

*xfBrkByte contains 40400b + the break byte = (2001b+4*opcode) RCy 2.
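*Worked check (illustrative; assumes RCy and LCy are 16-bit cycles): 2001b +
*4*opcode has 01 in its low two bits, so RCy 2 moves that 01 to the top two
*bits (40000b) and leaves (2000b + 4*opcode)/4 = 400b + opcode below it.
*For opcode 1: 2005b RCy 2 = 40000b + 401b = 40401b, and LCy[40401b,2]
*restores 2005b, the value loaded into APCTask&APC below.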
@BRK:
LU ← (xfBrkByte) - (40400C), Opcode[75];*Break
T ← 40400C, Skip[ALU#0];
RTemp ← sBreakTrap, GoTo[SDTrap];
*Manually dispatch to the entry mi for the opcode.
xfBrkByte ← LCy[xfBrkByte,2];
*P4IncT is used only for its "Return."
APCTask&APC ← xfBrkByte, xfBrkByte ← T, NoRegILockOK, GoTo[P4IncT];

P4Undef:
RTemp ← sOpcodeTrap, GoTo[SDTrap];

@OP76:
xfTrapParm0 ← 76C, GoTo[P4Undef], Opcode[76];
@OP77:
xfTrapParm0 ← 77C, GoTo[P4Undef], Opcode[77];

:END[MesaOP0];