:TITLE[MesaOP1];*Opcodes 100b-177b
%
Ed Fiala 7 February 1984: Changes to FetchLink subr and link opcodes
for Klamath; change TrapParm to xfTrapParm0; deimplement LPChecking;
move TOSDoubleRead; move refill trap to MesaOP2.mc; add P5TtoRTemp1 label;
bum 5b mi with nop changes for WF, W0F, PSF, P5Ret, RF.
Ed Fiala 27 May 1982: StackLP0x for @RL0, @RL0F, and @WL0F;
WriteTail extension for CacheLocals=1.
Ed Fiala 21 May 1982: Fix WriteLongTail and at WFLx for CacheLocals=1;
change at CLCR0.
Ed Fiala 2 April 1982: bummed 6 mi in RF/WF opcodes; bummed 4 cycles out of
RLF, 2 cycles each out of RL0F, RF, R0F, W0F, WF, PSF, WL0F; eliminated
CLCR0 tag and RTemp1 ← T at P5SQT when CacheLocals=0; eliminated RFLy tag.
%
%This file contains opcodes originating on opPage1. Almost all opcodes
using long pointers or doing doubleword operations are here.

Long pointers are truncated mod 2↑24, which produces a map fault only if
the 2↑23 bit (bit 9) is 1--otherwise, no map fault will occur and the
reference will take place to the address mod 2↑22 of the long pointer.

PFetch2[LP,Stack] pushes the word at LP onto the stack first and then
the word at LP+1, and PStore2[LP,Stack] writes TOS into LP+1 and 2OS into
LP. Long pointers then wind up with the high-order word at TOS and the
low-order word at 2OS on the stack. When a doubleword crosses a quadword
boundary, the two PFetch1’s/PStore1’s that replace the PFetch2/PStore2 have
to be done in a scrambled order because of this stack ordering.

Timing comments given here include 2.25 cycles/byte for buffer refill and
all memory wait assuming no io tasks run when tasking occurs. Variable
times are given for opcodes that end with a PFetch/PStore to the stack
because total opcode time depends upon memory wait of subsequent opcodes,
and this depends upon when Stack&-1, Stack&+1, or a reference occurs.
Also, some opcodes offset 2.25 cycles for buffer refill by using the BPCChk
branch condition to carry out buffer refill inline--adjustments are made
for this. Sometimes memory wait of the next opcode and buffer refill wait
cancel out--adjustments are made for this.

NOTES:
1) To move mi off of opPage1, consider a LoadPage at StackLPx, ?
2) Timing of @PS0F, @W0F can be improved.
3) Save an average of 1.5 cycles/opcode that calls LocalLP or GlobalLP
by checking for BPCChk in the PFetch1[MDS,LP] opcode.
4) Variant StackLP subroutine saves 2 cycles in @PSLB, 4 in @PSDLB.
5) Replicating DoubleWrite for LOCAL and GLOBAL saves 2 cycles in @SLDB
and in @SGDB.
6) Bypass kludge at decLPTR should be eliminated if we start using LogSE.
7) Save 1 mi but cost 2 cycles in @PSLF by doing Stack&+3 earlier.
8) Need to fix CacheLocals=1 bug at LPDoubleWrite, where bypass kludge
is ambiguous, so local cache refill check is bad.
%

OnPage[opPage1];

%StkP ← StkP+2, T ← T+1
This subroutine is used between the two PStore1’s which happen when a
PStore2 from the Stack crosses a quadword boundary; the Stack&+2 will abort
until the PStore1 finishes, so could save two cycles with:
P5IncS2T1:
RTemp1 ← T;
T ← (SStkP&NStkP) - 1;
RTemp ← (Zero) - T;
StkP ← RTemp;
T ← (RTemp1) + 1, Return;
P5IncS2T1 called by DoubleWrite and LPDoubleWrite.
%
P5IncS2T1:
T ← (Zero) + T + 1;*Reversing these 2 mi saves 1 mi but
Stack&+2, Return;*costs 2 cycles.

*P5IncT called by DoubleRead and LPDoubleRead.
P5IncT:
T ← (Zero) + T + 1, Return;

*FetchLPToRTemp called by @WLF, @PSLF, and @WLFS.
FetchLPToRTemp:
PFetch1[LP,RTemp];*allow time to write T
*P5Ret called by DoubleReadx and RLI0x opcodes.
P5Ret:
Return;*allow time for T to be written

*P5FQT called by DoubleRead, @RDL0, and LPDoubleRead.
*On quadword overflow adjust StkP modified by the failed fetch to point one
*below the 2nd word to be fetched.
P5FQT:
GoTo[P5Tail,QuadOvf’];
*P5Pop called by @PSB and @PSF.
P5Pop:
Stack&-1, Return;

*RWSTRLx called by @RLS and @WLS.
*TtoR1StackLP called by @RLFS and @WLFS.
*StackLP0x called by @RL0, @RL0F, and @WL0F.
*StackLP called by @PSLB, @PSDLB, and @PSLF.
*StackLPx called by @RLB, @RDL0, @RDLB, @WLB, @WDLB, @RLF, and @WLF, and by
*BLTL in MesaOP3.
*StackLPz called by @CKSUM in MesaESC.
RWSTRLx:
T ← (Stack&-1) + T;
TtoR1StackLP:
RTemp1 ← T, GoTo[StackLP];*RTemp1 ← String index
StackLP0x:
LPhi ← T, GoTo[StackLPz];

*Timing through Return is 10 cycles.
StackLP:
T ← RHMask[Stack&-1];
StackLPx:
LPhi ← T;
StackLPy:
LPhi ← (LSh[LPhi,10]) + T + 1;
StackLPz:
T ← Stack&-1;
LP ← T, Return;
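%Worked example of the LPhi packing done at StackLP/StackLPy (a sketch only,
assuming shift counts are octal so that LSh[LPhi,10] shifts left 8 bits, and
taking a hypothetical high pointer word whose right byte is 3):
	T ← RHMask[3] = 3
	LPhi ← 3
	LPhi ← LSh[3,10] + 3 + 1 = 1400b + 4 = 1404b
StackLP0x skips the low-byte term: its callers pass T = LSh[3,10] = 1400b.
%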

*GlobalLP called by @RGILP.
*Format a long pointer from a global selector,,offset pair.
*Timing = 23 cycles.
GlobalLP:
T ← (LdF[Cycle&PCXF,0,4]) + 1;*Note: LPhi fetched first
PFetch1[GLOBAL,LPhi], GoTo[decLPTR];
*LocalLP called by @RLILP, @RLDILP, @WLILP, @WLDILP, and @RLILPF.
*Format a long pointer from a local selector,,offset pair.
*Timing = 23 cycles.
LocalLP:
T ← (LdF[Cycle&PCXF,0,4]) + 1; *High half of pointer fetched first
PFetch1[LOCAL,LPhi];
decLPTR:
*Bypass kludge (ok because MDShi=GLOBALhi=LOCALhi)
T ← (AllOnes) + T;
*2 cycle abort occurs here--this reference can’t fault unless the previous
*reference also faults.
PFetch1[MDS,LP];
***5 (?) cycle abort occurs here.
T ← RHMask[LPhi];
LPhi ← (LSh[LPhi,10]) + T + 1;
LPtrx:
T ← LdF[Cycle&PCXF,4,4], Return;

*GPtr called by @RGIP.
GPtr:
T ← LdF[Cycle&PCXF,0,4];
PFetch1[GLOBAL,RTemp], GoTo[LPtrx];

*LPtr called by @RLIP, @RLDIP, @WLIP, and @RLIPF.
LPtr:
T ← LdF[Cycle&PCXF,0,4];
PFetch1[LOCAL,RTemp], GoTo[LPtrx];

*RWSTRx called by @RS and @WS.
RWSTRx:
T ← (Stack&-1) + T;*string index
RTemp1 ← T;
T ← RSh[RTemp1,1];
*AddTtoStack called by @RB, @RDB, @WB, @WDB, @RF, and @WF.
AddTtoStack:
T ← (Stack&-1) + T, Return;

*FetchMDSToRTemp jumped to from WS0Fa (for @PS0F and @WS0F).
*P5TtoRTemp1 called by WF1 (for @W0F, @WF, and @PSF).
FetchMDSToRTemp:
PFetch1[MDS,RTemp];
P5TtoRTemp1:
RTemp1 ← T, Return;*Bypass kludge = no bypass kludge here

*Read n, n = 0, 1, Byte
*Timing: 4.25 cycles + ReadTail.
@R0:
T ← (Stack&-1) + (0C), GoTo[ReadTail], Opcode[100];
@R1:
T ← (Stack&-1) + (1C), GoTo[ReadTail], Opcode[101];

*Timing: 8.5 cycles + ReadTail.
@RB:
T ← CNextData[IBuf], Call[AddTtoStack], Opcode[102];
*Timing: 9 (best) to 11.75 cycles (next opcode refs stack in 1st mi).
ReadTail:
PFetch1[MDS,Stack], GoTo[P5Tail];

*Read Long Zero. Push [TOS,,2OS](0).
*Timing: 17.25 (best) to 22 cycles (next opcode refs stack in 1st mi).
@RL0:
T ← LSh[Stack&-1,10], Call[StackLP0x], Opcode[103];
PFetch1[LP,Stack,0], GoTo[P5Tail];

*Read Long Byte
*Timing: 25.5 (best) to 30.25 cycles (Stack&-1 in 1st mi of next opcode).
@RLB:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[104];
T ← NextData[IBuf], CallX[ReadLongTail];

*Read Double 0
*Timing 4.25 cycles + DoubleRead.
@RD0:
T ← Stack&-1, GoTo[DoubleRead], Opcode[105];

*Read Double Byte (old except Call[P5Ret] replaces two Nops)
*Timing 8.5 cycles + DoubleRead.
@RDB:
T ← CNextData[IBuf], Call[AddTtoStack], Opcode[106];
*DoubleRead is jumped to from @RD0, @RLDI00, and @RLDIP; @RDB falls into it.
*Timing = 10 cycles if no quadword boundary crossed and next opcode doesn’t
*touch TOS, or 14 cycles if next opcode touches TOS in 1st mi; 26 cycles if
*quadword boundary crossed.
DoubleRead:
PFetch2[MDS,Stack], Call[P5FQT];
%If P5FQT Returns, QuadOvf occurred and StkP is 1 larger than it was before
the PFetch2. Since the fetch address is a function of the 1st word
overwritten, the two single fetches have to be done without overwriting
the 1st stack word on a fault.
%
PFetch1[MDS,RTemp], Call[P5IncT];*Fetch 1st word
PFetch1[MDS,Stack];*Fetch 2nd word directly to stack
*DoubleReadx is jumped to from LPDoubleRead.
DoubleReadx:
Call[P5Ret];
T ← RTemp;*Move 1st word to stack
*Must allow the PFetch1[MDS,Stack] above time to fault before smashing the
*1st stack word; a fault on that PFetch1 will abort this next mi.
***Stack&-1 ← T below will abort until PFetch1 finishes costing 2 cycles.
Stack&-1 ← T;*& point StkP at 2nd word
LU ← NextInst[IBuf];
Stack&+1, NIRet;

*Read Double Long Zero. Push doubleword at [TOS,,2OS](0).
*Timing: 12 cycles + LPDoubleRead.
@RDL0:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[107];
RKDIBa:
PFetch2[LP,Stack,0], Call[P5FQT];
*See comments for @RDLB.
PFetch1[LP,RTemp,0];
T ← 0C, GoTo[RDL0x];

*Read Double Long Byte
*Timing: 16.25 cycles + LPDoubleRead.
@RDLB:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[110];
T ← NextData[IBuf], CallX[LPDoubleRead];

*Write n, n = 0, byte
*Timing: 4.25 cycles + WriteTail.
@W0:
T ← Stack&-1, GoTo[WriteTail], Opcode[111];

*Timing: 8.5 cycles + WriteTail.
@WB:
T ← CNextData[IBuf], Call[AddTtoStack], Opcode[112];
:IF[CacheLocals]; *************************************
*PStore1’s can’t cause QuadOvf, so this checks the local cache
WriteTail:
PStore1[MDS,Stack];
CLCR3:
T ← (LOCAL) xor T;
LU ← (Form-4[AllOnes]) and T, GoTo[CLCR2];
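%Worked example of the check at CLCR3/CLCR2 (a sketch with hypothetical
addresses, assuming LOCAL holds the low 16 bits of the local frame base and
T holds the address just stored into):
	LOCAL = 1000b, T = 1002b: xor = 2b, and with -4 = 0, so the store hit
	the cached quadword and PFetch4 reloads LocalCache0.
	LOCAL = 1000b, T = 1004b: xor = 4b, and with -4 = 4b, so ALU#0 and
	the PFetch4 is skipped.
%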

%P5SQT is called by DoubleWrite and LPDoubleWrite and jumped to by WS0x
(where QuadOvf is impossible, so it won’t return).
Bypass kludge (or don’t care about bypass kludge after MDS reference)
%
P5SQT:
RTemp1 ← T, Skip[QuadOvf’];
P5Push:
Stack&+1, Return;*Point StkP to the 1st word to be stored
CLCR0:
T ← RSh[RTemp1,2];*Check for local cache refill
CLCR1:
LU ← (RSh[LOCAL,2]) xor T;
*Entry here from MesaESC.
CLCR2:
Skip[ALU#0];
PFetch4[LOCAL,LocalCache0,0];
LU ← NextInst[IBuf], Call[P5Tailx];

WSDBx:
Stack&-1, GoTo[CLCR0]; *back up StkP over pointer

WS0x:
PStore1[MDS,RTemp], GoTo[P5SQT];
:ELSE; ************************************************
*Timing = 17 cycles if next opcode does a reference or a Stack&+1, 9 cycles
*in the best case. Timing is discussed in the beginning of MesaOP2.
WriteTail:
PStore1[MDS,Stack], GoTo[P5Tail];

P5SQT:
Skip[QuadOvf’];
P5Push:
Stack&+1, Return;*Point StkP to the 1st word to be stored
LU ← NextInst[IBuf], Call[P5Tailx];

WSDBx:
LU ← NextInst[IBuf];*Back StkP over pointer
Stack&-1, NIRet;

WS0x:
PStore1[MDS,RTemp], GoTo[P5Tail];
:ENDIF; ***********************************************

*Store TOS at MDS[2OS+alpha], Recover.
*Timing: 10.5 cycles + WriteTail.
@PSB:
T ← CNextData[IBuf], Call[P5Pop], Opcode[113];*Put Swapped Byte
T ← (Stack&+1) + T, GoTo[WriteTail];

*Write Long Byte.
*Timing: 18.5 cycles + WriteLongTail.
@WLB:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[114];
PSLBa:
T ← NextData[IBuf], CallX[WriteLongTail];

*Store TOS at [2OS,,3OS][alpha], Recover2.
*Timing: 22.5 cycles + WriteLongTail.
@PSLB:
Stack&-1, Call[StackLP], Opcode[115];*Put Swapped Long Byte
Stack&+3, GoTo[PSLBa];

*Write Double Byte.
*Timing: 8.5 cycles + DoubleWrite.
@WDB:
T ← CNextData[IBuf], Call[AddTtoStack], Opcode[116];
*Timing: 10 cycles if no quadword boundary crossed and next opcode does
*neither a reference nor Stack&+1 for 4 mi; 17 cycles if a reference or
*Stack&+1 is in next opcode’s 1st mi; 30 cycles if quadword boundary
*crossed but no reference or Stack&+1; or 41 cycles if quadword boundary
*crossed and either reference or Stack&+1 in next opcode’s 1st mi.
DoubleWrite:
PStore2[MDS,Stack], Call[P5SQT];
*If P5SQT Returns, StkP is one smaller than it was before the PStore2.
*Do two single stores
PStore1[MDS,Stack], Call[P5IncS2T1];
PStore1[MDS,Stack], GoTo[WSDBx];

*Put Swapped Double Zero.
*TOS,,2OS stored at MDS[3OS], Recover.
*Timing: 6.5 cycles + DoubleWrite.
@PSD0:
Stack&-2, T ← Zero, GoTo[PSD0a], Opcode[117];

*Put Swapped Double Byte.
*TOS,,2OS stored at MDS[3OS+alpha], Recover.
*Timing: 10.5 cycles + DoubleWrite.
@PSDB:
T ← NextData[IBuf], Opcode[120];
Stack&-2;
PSD0a:
T ← (Stack&+2) + T, GoTo[DoubleWrite];

*Write Double Long Byte.
*Timing: 18.5 cycles + LPDoubleWrite.
@WDLB:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[121];
PSDLB0:
T ← NextData[IBuf], CallX[LPDoubleWrite];

*Put Swapped Double Long Byte
*TOS,,2OS stored at [3OS,,4OS][alpha], Recover2.
*Timing: 24.5 cycles + LPDoubleWrite.
***Save 4 cycles with non-standard StackLP.
@PSDLB:
Stack&-2, Call[StackLP], Opcode[122];
Stack&+2;
Stack&+2, GoTo[PSDLB0];

*Read Indirect Local n = 0 to 3.
:IF[CacheLocals]; *************************************
*Timing: 4.25 cycles + ReadTail.
@RLI00:
T ← LocalCache0, GoTo[ReadTail], Opcode[123];
@RLI01:
T ← (LocalCache0) + 1, GoTo[ReadTail], Opcode[124];
@RLI02:
T ← (LocalCache0) + (2C), GoTo[ReadTail], Opcode[125];
@RLI03:
T ← (LocalCache0) + (3C), GoTo[ReadTail], Opcode[126];
:ELSE; ************************************************
*Timing: 18.25 cycles + ReadTail.
@RLI00:
PFetch1[LOCAL,RTemp,0], Call[P5Ret], Opcode[123];
T ← RTemp, GoTo[ReadTail];
@RLI01:
PFetch1[LOCAL,RTemp,0], Call[P5Ret], Opcode[124];
T ← (RTemp) + 1, GoTo[ReadTail];
@RLI02:
PFetch1[LOCAL,RTemp,0], Call[P5Ret], Opcode[125];
T ← (RTemp) + (2C), GoTo[ReadTail];
@RLI03:
PFetch1[LOCAL,RTemp,0], Call[P5Ret], Opcode[126];
T ← (RTemp) + (3C), GoTo[ReadTail];
:ENDIF; ***********************************************

*Read Local Indirect Pair.
*Timing: 24.5 cycles + ReadTail.
@RLIP:
LU ← CycleControl ← CNextData[IBuf], Call[LPtr], Opcode[127];
RILPx:
T ← (RTemp) + T, GoTo[ReadTail];

*Read Local Indirect Long Pair
*Timing: 31.5 cycles + ReadLongTail.
@RLILP:
LU ← CycleControl ← CNextData[IBuf], Call[LocalLP], Opcode[130];
*Timing: 9 (best) to 11.75 cycles (next opcode does Stack&-1 in 1st mi).
ReadLongTail:
PFetch1[LP,Stack], GoTo[P5Tail];

*Read Local Double Indirect Zero Zero.
*Push the double word at LOCAL[0].
:IF[CacheLocals]; *************************************
*Timing: 2 cycles + DoubleRead.
@RLDI00:
T ← LocalCache0, GoTo[DoubleRead], Opcode[131];
:ELSE; ************************************************
*Timing: 16 cycles + DoubleRead.
@RLDI00:
PFetch1[LOCAL,Stack,0], Opcode[131];
T ← Stack&-1, GoTo[DoubleRead];
:ENDIF; ***********************************************

*Read Local Double Indirect Pair.
*Push the double word at MDS[LOCAL[alpha[0:3]]+alpha[4:7]].
*Timing: 24.5 cycles + DoubleRead.
@RLDIP:
LU ← CycleControl ← CNextData[IBuf], Call[LPtr], Opcode[132];
T ← (RTemp) + T, GoTo[DoubleRead];

*Read Local Double Indirect Long Pair.
*LOCAL[alpha[0:3]] is a long pointer P; push the double word at P[alpha[4:7]].
*Timing: 33.5 cycles + LPDoubleRead.
@RLDILP:
LU ← CycleControl ← CNextData[IBuf], Call[LocalLP], Opcode[133];
*LPDoubleRead is called by @RDLB.
*Timing = 10 cycles if no quadword boundary is crossed and the next opcode
*doesn’t do Stack&-1 in its 1st two mi, 14 cycles if Stack&-1 is done in the
*1st mi, 36 cycles if quadword boundary crossed.
LPDoubleRead:
PFetch2[LP,Stack], Call[P5FQT];
*Unlike other double reads to the stack, the fetch address for RDLB depends
*on BOTH stack words overwritten, so the code must tolerate a fault on either
*PFetch1 below. The problem case is where the 2nd PFetch1 starts before the
*1st faults but the 2nd doesn’t fault because it is on a different page.
*To avoid this, ensure 3 mi between the two fetches.
PFetch1[LP,RTemp];
Nop;
RDL0x:
Call[P5IncT];
PFetch1[LP,Stack], GoTo[DoubleReadx];

*Read Global Indirect Pair
*Timing: 24.5 cycles + ReadTail.
@RGIP:
LU ← CycleControl ← CNextData[IBuf], Call[GPtr], Opcode[134];
T ← (RTemp) + T, GoTo[ReadTail];

*Read Global Indirect Long Pair
*Timing: 31.5 cycles + ReadLongTail.
@RGILP:
LU ← CycleControl ← CNextData[IBuf], Call[GlobalLP], Opcode[135];
PFetch1[LP,Stack], GoTo[P5Tail];

*Write Local Indirect Pair
*Timing: 24.5 cycles + WriteTail.
@WLIP:
LU ← CycleControl ← CNextData[IBuf], Call[LPtr], Opcode[136];
WILPx:
T ← (RTemp) + T, GoTo[WriteTail];

*Write Local Indirect Long Pair
*Timing: 31.5 cycles + WriteTail.
@WLILP:
LU ← CycleControl ← CNextData[IBuf], Call[LocalLP], Opcode[137];
:IF[CacheLocals]; *************************************
WriteLongTail:
PStore1[LP,Stack], Call[P5Ret];
T ← (LP) + T, GoTo[CLCR3];
:ELSE; ************************************************
WriteLongTail:
PStore1[LP,Stack], GoTo[P5Tail];
:ENDIF; ***********************************************

*Write Local Double Indirect Long Pair
*LOCAL[alpha[0:3]] is a long pointer P; write the double word at TOS,,2OS
*into P[alpha[4:7]].
*Timing: 31.5 cycles + DoubleWrite.
@WLDILP:
LU ← CycleControl ← CNextData[IBuf], Call[LocalLP], Opcode[140];
*Timing: see DoubleWrite comment.
LPDoubleWrite:
PStore2[LP,Stack], Call[P5SQT];
*Do two single stores
PStore1[LP,Stack], Call[P5IncS2T1];
PStore1[LP,Stack], GoTo[WSDBx];

*Convert RTemp1 byte pointer to field pointer in CycleControl; put
*original T in RTemp1.
R1BtoF:
RTemp1 ← LSh[RTemp1,7];
RTemp1 ← (RTemp1) or (7C);
CycleControl ← RTemp1, RTemp1 ← T, NoRegILockOK, Return;

*Read String
*Timing: 30.5 cycles.
@RS:
T ← CNextData[IBuf], Call[RWSTRx], Opcode[141];
PFetch1[MDS,Stack], Call[R1BtoF];
*Jump to RFLx from @RF, @RL0F, @RLFS, and @RLIPF.
RFLx:
LU ← NextInst[IBuf];
*Call from ShiftRight, @RLS, @ROTATE in MesaESC.Mc
RFLxx:
Stack ← RF[Stack], NIRet;

*Read Long String
*Timing: 40.5 cycles.
@RLS:
T ← CNextData[IBuf], Call[RWSTRLx], Opcode[142];
*The Nops below prevent occasional stack errors on some machines.
*No one knows exactly why.
*It’s related to the RDC Idle loop running during the tasking above
*and to fetching to the stack below.
*
Nop;
T ← RSh[RTemp1,1];
*
Nop;
PFetch1[LP,Stack], Call[R1BtoF];
LU ← NextInst[IBuf], Call[RFLxx];

*Write String
*Timing: 32.5 cycles + WS0x.
@WS:
T ← CNextData[IBuf], Call[RWSTRx], Opcode[143];
PFetch1[MDS,RTemp], Call[R1BtoF];
*Jump here from @WF, @W0F, @PSF.
WFz:
T ← WFA[Stack&-1];
*Jump here from @WS0F.
WFy:
RTemp ← (WFB[RTemp]) or T;*Do insert
T ← RTemp1, GoTo[WS0x];

*Write Long String
*Timing: 42.5 cycles + WS0x.
@WLS:
T ← CNextData[IBuf], Call[RWSTRLx], Opcode[144];
T ← RSh[RTemp1,1];
PFetch1[LP,RTemp], Call[R1BtoF];
T ← WFA[Stack&-1], GoTo[WFLz];

*Read Zero Field.
*Timing: 22.25 cycles.
@R0F:
T ← Stack&-1, GoTo[R0F1], Opcode[145];

*Read Field.
*Timing: 26.5 cycles.
*T ← Word displacement + pointer to record
@RF:
T ← CNextData[IBuf], Call[AddTtoStack], Opcode[146];
R0F1:
PFetch1[MDS,Stack];
*Setup field descriptor
RLILPFa:
LU ← CycleControl ← NextData[IBuf], CallX[RFLx];

*Read Long Zero Field.
*Timing: 32.25 cycles.
@RL0F:
T ← LSh[Stack&-1,10], Call[StackLP0x], Opcode[147];
PFetch1[LP,Stack,0], GoTo[RLILPFa];

*Read Long Field.
*Timing: 36.5 cycles.
@RLF:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[150];
T ← NextData[IBuf];
PFetch1[LP,Stack], GoTo[RLILPFa];

*Read Long Field Stack.
*Timing: 36.25 cycles.
@RLFS:
T ← CycleControl ← Stack&-1, Call[TtoR1StackLP], Opcode[151];
T ← RSh[RTemp1,10];
PFetch1[LP,Stack], GoTo[RFLx];

*Read Local Indirect Pair Field.
*Timing: 42.75 cycles.
@RLIPF:
LU ← CycleControl ← CNextData[IBuf], Call[LPtr], Opcode[152];
T ← (RTemp) + T, GoTo[R0F1];

*Read Local Indirect Long Pair Field.
*Timing: 47.5 cycles.
@RLILPF:
LU ← CycleControl ← CNextData[IBuf], Call[LocalLP], Opcode[153];
PFetch1[LP,Stack], GoTo[RLILPFa];

*Write Zero Field.
*Timing: 24.5 cycles + WriteTail.
@W0F:
T ← Stack&-1, GoTo[WF1], Opcode[154];

*Write Field.
*Timing: 28.75 cycles + WriteTail.
@WF:
T ← CNextData[IBuf], Call[AddTtoStack], Opcode[155];
WF1:
PFetch1[MDS,RTemp], Call[P5TtoRTemp1];
LU ← CycleControl ← NextData[IBuf], CallX[WFz];*Get field descriptor

*Put Swapped Field.
*WSF, Recover.
*Timing: 22.75 cycles + WriteTail.
@PSF:
T ← CNextData[IBuf], Call[P5Pop], Opcode[156];
T ← (Stack&+1) + T, GoTo[WF1];

WS0Fa:
MNBR ← WFA[Stack&-1];
T ← Stack&-1, GoTo[FetchMDSToRTemp];

*Put Swapped Zero Field.
*WS0F, Recover.
*Timing: 28.5 cycles + WriteTail.
@PS0F:
LU ← CycleControl ← CNextData[IBuf], Call[WS0Fa], Opcode[157];
Stack&+1, GoTo[PS0Fa];

*Write Swapped Zero Field.
*Alpha is field descriptor, TOS is data to write, MDS[2OS] is word
*containing field.
*Timing: 28.5 cycles + WriteTail.
@WS0F:
LU ← CycleControl ← CNextData[IBuf], Call[WS0Fa], Opcode[160];
PS0Fa:
T ← MNBR, GoTo[WFy];

*Write Long Zero Field.
*Timing: 34.5 cycles + WriteTail.
@WL0F:
T ← LSh[Stack&-1,10], Call[StackLP0x], Opcode[161];
PFetch1[LP,RTemp,0];
T ← 0C, GoTo[PSLFa];

*Write Long Field
*Timing: 38.75 cycles + WriteTail.
@WLF:
T ← RHMask[Stack&-1], Call[StackLPx], Opcode[162];
T ← CNextData[IBuf], Call[FetchLPToRTemp];
*Jump here from @WL0F, @PSLF.
PSLFa:
RTemp1 ← T;
LU ← CycleControl ← NextData[IBuf], CallX[WFLy];

*Put Swapped Long Field
*Timing: 40.75 cycles + WriteTail.
@PSLF:
Stack&-1, Call[StackLP], Opcode[163];
T ← CNextData[IBuf], Call[FetchLPToRTemp];
Stack&+3, GoTo[PSLFa];

*Write Long Field Stack
*Timing: 38.25 cycles + WriteTail.
@WLFS:
T ← CycleControl ← Stack&-1, Call[TtoR1StackLP], Opcode[164];
RTemp1 ← T ← RSh[RTemp1,10], Call[FetchLPToRTemp];
*Jump here from @WLF, @WL0F, @PSLF.
WFLy:
T ← WFA[Stack&-1];
*Jump here from @WLS.
WFLz:
RTemp ← (WFB[RTemp]) or T;
WFLx:
T ← RTemp1;
:IF[CacheLocals]; **********************************
PStore1[LP,RTemp], GoTo[CLCR3];
:ELSE; ************************************************
PStore1[LP,RTemp], GoTo[P5Tail];
:ENDIF; ***********************************************

*Store Local Double Byte.
*Timing: 8.5 cycles + DoubleWrite.
**Require alpha .ge. 4 here if local cache is used.
@SLDB:
T ← LOCAL, Opcode[165];*Store Local Double Byte
T ← (NextData[IBuf]) + T, CallX[DoubleWrite];
*Store Global Double Byte.
@SGDB:
T ← GLOBAL, GoTo[.-1], Opcode[166];

PushLink:
T ← (MNBR) + T + 1;
*T ← (2xalpha+1)’+1-1 = -(2xalpha)-2.
T ← (Zero) or not T, xfGFIWord, Skip[R Even];
PFetch2[CODE,Stack], Return;*Code link
*T ← -4-(2xalpha)-2 = -6-(2xalpha)
T ← (Form-4[AllOnes]) + T;
PFetch2[GLOBAL,Stack], Return;*Frame link
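%Worked example of the offset arithmetic in PushLink (a sketch for checking
the comments above, assuming 16-bit two's-complement arithmetic and a
hypothetical alpha = 3):
	T ← alpha + alpha + 1 = 7
	T ← not T = -8 = 177770b = -(2xalpha)-2		(code link offset)
	T ← -4 + T = -12 = 177764b = -6-(2xalpha)	(frame link offset)
%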

*Load Link Byte
@LLKB:
T ← MNBR ← CNextData[IBuf], Call[PushLink], Opcode[167];
P5Tail:
LU ← NextInst[IBuf];
P5Tailx:
NIRet;

FetchLink:
T ← (MNBR) + T + 1;
*T ← (2xalpha+1)’+1-1 = -(2xalpha)-2.
T ← (Zero) or not T, xfGFIWord, Skip[R Even];
PFetch2[CODE,LP], Return;*Code link
*T ← -4-(2xalpha)-2 = -6-(2xalpha)
T ← (Form-4[AllOnes]) + T;
PFetch2[GLOBAL,LP], Return;*Frame link

*Read Link Indirect Byte
@RKIB:
T ← MNBR ← CNextData[IBuf], Call[FetchLink], Opcode[170];
LPhi ← LSh[LPhi,10], Call[P5Ret];
PFetch1[LP,Stack,0], GoTo[P5Tail];

*Read Link Double Indirect Byte
@RKDIB:
T ← MNBR ← CNextData[IBuf], Call[FetchLink], Opcode[171];
LPhi ← LSh[LPhi,10];
GoTo[RKDIBa];

*Link Byte. LOCAL[0] ← -1OS - alpha; destination link is at -1OS.
:IF[CacheLocals]; **********************************
@LKB:
T ← MNBR ← CNextData[IBuf], Call[P5Push], Opcode[172];
Stack ← T ← (Stack) - T, LoadPage[opPage0];
Stack&-1, GoToP[SL0x];
:ELSE; ************************************************
@LKB:
T ← MNBR ← CNextData[IBuf], Call[P5Push], Opcode[172];
Stack ← (Stack) - T;
PStore1[LOCAL,Stack,0], GoTo[P5Tail];
:ENDIF; ***********************************************

ShiftRight:
RTemp ← (RTemp) + (17C);
CycleControl ← RTemp, Skip[Carry];
Stack ← Zero, GoTo[P5Tail];*shift count > 17, use zero
LU ← NextInst[IBuf], CallX[RFLxx];

ShiftLeft:
LU ← (RTemp) and not (17C);
ShiftLeft1:
RTemp ← (RTemp) xor (17C), Skip[ALU=0];
Stack ← Zero, GoTo[P5Tail];*shift count > 17, use zero
*T has positive count. Form 0,,-count, then use WFA.
CycleControl ← RTemp;
LU ← NextInst[IBuf];
Stack ← WFA[Stack], NIRet;

@SHIFT:
T ← Stack&-1, Opcode[173];
RTemp ← T, DblGoTo[ShiftRight,ShiftLeft,ALU<0];

@SHIFTSB:
T ← NextData[IBuf], Opcode[174];*Shift Signed Byte
RTemp ← T, Skip[H2Bit8’];
RTemp ← (RTemp) or (177400C), GoTo[ShiftRight];
LU ← (RTemp) and not (17C), GoTo[ShiftLeft1];

P5Undef:
LoadPage[opPage0];
RTemp ← sOpcodeTrap, GoToP[SDTrap];

@OP175:
xfTrapParm0 ← 175C, GoTo[P5Undef], Opcode[175];
@OP176:
xfTrapParm0 ← 176C, GoTo[P5Undef], Opcode[176];
@OP177:
xfTrapParm0 ← 177C, GoTo[P5Undef], Opcode[177];

:END[MesaOP1];