DragonRC.mesa
Copyright © 1984, 1986 by Xerox Corporation. All rights reserved.
Created by Russ Atkinson in 1984.
Russ Atkinson (RRA) May 14, 1986 9:27:44 pm PDT
This module is an initial pseudo-implementation of the reference counting primitives for Cedar as implemented for Dragon. There are a number of unimplemented features right now. This code is intended as a template for hand-coding, since performance in these routines is important. This code will not run, since we assume that the order of halfwords in arithmetic corresponds to Dragon conventions, not to PrincOps conventions.
We need to maintain a large number of invariants. The trickiest one is forced by the inability of Dragon to atomically modify anything more than one word. This implies that reference counts may be too high from time to time, which means that during that time the trace and sweep collector must be prevented from running, because it would incorrectly reconstruct certain reference counts. The invariant can be stated informally as "whenever a reference count is incorrect, trace and sweep must be disabled". Note that it is disastrous to have a reference count too low, since a legitimate decrement by another process could force the RC below 0, which has no real meaning and cannot be handled by our structures.
This implementation is also oriented towards mixing Cedar and Lisp objects. In Cedar, various quantities are 32 bits wide, and the compiler performs static type analysis to determine how to interpret the bits. In Lisp, it is more common to use a general purpose reference, which we will call a TaggedRef. The top 3 bits of a TaggedRef are used to interpret the quantity, and are decoded according to the following table:
0: +INT 4: CONS
1: +REAL 5: -REAL
2: +REAL 6: -REAL
3: REF 7: -INT
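As a purely illustrative sketch of this decoding, a helper in terms of the Tag and TaggedRefParts types declared below might look as follows; the name GetTag is an assumption, not part of this module.
GetTag: PROC [tref: TaggedRef] RETURNS [Tag] = {
parts: TaggedRefParts = LOOPHOLE[tref, TaggedRefParts];
RETURN[parts.tag];
};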
Unlike the older versions of Cedar, there is some provision for REFs that are not counted. If these REFs constitute a significant percentage of RC objects, then the average cost of RC operations will decrease. These REFs are allocated from the "permanent" counted zone, and lie in the address range [LowestUncountedRef..LowestCountedRef). In the type system, however, they are treated as any other REFs. Candidates for such REFs include ROPE and ATOM literals. There are also possibilities for taking checkpoints, treating all objects in the checkpoint as permanent.
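A minimal sketch of the corresponding range test follows; the name IsUncountedRef and the comparison via LONG CARDINAL are assumptions of ours.
IsUncountedRef: PROC [tref: TaggedRef] RETURNS [BOOL] = {
n: LONG CARDINAL = LOOPHOLE[tref, LONG CARDINAL];
RETURN[n >= LOOPHOLE[LowestUncountedRef, LONG CARDINAL] AND n < LOOPHOLE[LowestCountedRef, LONG CARDINAL]];
};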
DIRECTORY
DragOpsCross;
DragonRC: PROGRAM = BEGIN
Types from the outside.
Word: TYPE = DragOpsCross.Word;
Types and constants used in reference-counting
wordsPerPage: NAT = 1024;
unitsPerWord: NAT = 1;
addressing units per word
unitsPerPage: NAT = wordsPerPage/unitsPerWord;
addressing units per page
bytesPerWord: NAT = 4;
bytesPerPage: NAT = bytesPerWord*wordsPerPage;
TaggedRef: TYPE = Word;
TaggedRefPtr: TYPE = LONG POINTER TO TaggedRef;
Tag: TYPE = MACHINE DEPENDENT {pInt, pReal1, pReal2, ref, cons, nReal5, nReal6, nInt};
TaggedRefParts: TYPE = MACHINE DEPENDENT RECORD[
tag: Tag ← pInt,
highPad: HighPad ← 0,
lowPad: LowPad ← 0,
indexInPage: IndexInPage ← 0
];
HighPad: TYPE = [0..LAST[CARDINAL]/8];
LowPad: TYPE = [0..LAST[CARDINAL]/wordsPerPage];
IndexInPage: TYPE = [0..wordsPerPage);
CedarNil: TaggedRef = LOOPHOLE[LONG[0]];
LispNil: TaggedRef = LOOPHOLE[TaggedRefParts[
cons, LAST[HighPad], LAST[LowPad], LAST[IndexInPage] ]];
LowestCountedRef: TaggedRef = LOOPHOLE[TaggedRefParts[ref, 256, 0, 0]];
LowestUncountedRef: TaggedRef = LOOPHOLE[TaggedRefParts[ref, 0, 0, 0]];
ConsCellPtr: TYPE = LONG POINTER TO ConsCellRep;
ConsCellRep: TYPE = MACHINE DEPENDENT RECORD [car, cdr: TaggedRef];
RefCountPtr: TYPE = LONG POINTER TO RefCountWord;
RefCountWord: TYPE = MACHINE DEPENDENT RECORD[
type: [0..177777B] ← 0, -- (16 bits) type index
finalizable: BOOL ← FALSE, -- (1 bit) TRUE iff finalization enabled for this object
spare: BOOL ← FALSE, -- (1 bit) available for the taking
sizeIndex: SizeIndex ← 0, -- (6 bits) gives size of object
rcb: RefCountByte ← [] -- (8 bits) gives RC of object
];
A RefCountWord precedes every object from Cedar heap space. Very large objects have an additional prefix word before the RefCountWord, but we are not really concerned with them right now.
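The implied header arithmetic is the idiom used by IncRC and DecRC below; as an illustrative sketch (the name RcWordOf is ours):
RcWordOf: PROC [tref: TaggedRef] RETURNS [RefCountPtr] = {
RETURN[LOOPHOLE[tref, RefCountPtr] - SIZE[RefCountWord]];
};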
SizeIndex: TYPE = [0..MaxSizeIndex];
MaxSizeIndex: CARDINAL = 77B;
There is a table that translates between object size and size index, and another table that translates between size index and object size. These tables are only used at allocation / collection time.
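A minimal sketch of one such table; the name indexToSize and its initialization are assumptions.
indexToSize: ARRAY SizeIndex OF CARDINAL ← ALL[0];
indexToSize[i] gives the size in words of an object with size index i; it is filled in at initialization, and the inverse table would be consulted at allocation time.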
RefCountByte: TYPE = MACHINE DEPENDENT RECORD[
count: RefCount ← 0, -- (5 bits) the ref count
onStack: BOOL ← FALSE, -- (1 bit) TRUE iff ref may be on stack
over: BOOL ← FALSE, -- (1 bit) TRUE iff ref is present in RC overflow table
inZCT: BOOL ← FALSE -- (1 bit) TRUE iff ref is on ZCT
];
It simplifies matters to have the RefCountByte be in the same format for both LispCell and RefCountWord. However, this is not strictly necessary.
RefCount: TYPE = [0..MaxRC];
MaxRC: CARDINAL = 37B;
Types for Lisp CONS cell support
RefCountByteArray: TYPE = PACKED ARRAY [0..bytesPerWord) OF RefCountByte;
RCBAptr: TYPE = LONG POINTER TO RefCountByteArray;
LispPagePtr: TYPE = LONG POINTER TO LispPage;
LispPage: TYPE = RECORD [
nextPage: LispPagePtr,
pointer to the next page with a free cell
nextFree: LispCellPtr,
pointer to the next free cell on the page
pageOwner: Word,
process owning the page when it is locked (e.g. during alloc), zero if not locked
padding: PACKED ARRAY [3*bytesPerWord..firstCellIndex) OF RefCountByte,
These bytes are currently not used
rcBytes: PACKED ARRAY [firstCellIndex..cellsPerPage) OF RefCountByte,
These bytes are the ref counts for the Lisp cells
cells: ARRAY [firstCellIndex..cellsPerPage) OF LispCell
];
The LispPage type is primarily for documentation. Each LispCell really occupies 9 bytes on every page, with the RefCountByte not being contiguous with the two reference fields. The above declarations waste 16-3 = 13 words per 1024-word page (less than 1.3%), but that waste could be reduced by more clever encoding. Right now it does not seem worthwhile to be that clever.
LispCellPtr: TYPE = LONG POINTER TO LispCell;
LispCell: TYPE = RECORD [car, cdr: TaggedRef];
wordsPerCell: NAT = 2;
cellsPerPage: NAT = wordsPerPage/wordsPerCell;
bytesPerCell: NAT = bytesPerWord*wordsPerCell;
firstCellIndex: NAT = cellsPerPage / bytesPerCell;
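The cell-to-RC-byte arithmetic implied by this layout is the one IncRC and DecRC use below: the cell at word w of a page is cell w/wordsPerCell, and its RC byte is byte number w/wordsPerCell of the page, which lives in byte (w/2) MOD 4 of word w/8. For example, the cell at word 130 is cell 65, whose RC byte is byte 1 of word 16. A hypothetical sketch (the name CellRcByte is ours):
CellRcByte: PROC [indexInPage: IndexInPage] RETURNS [wordInPage: CARDINAL, byteInWord: [0..bytesPerWord)] = {
cell: CARDINAL = indexInPage / wordsPerCell;
the cell number is also the RC byte index within the page
RETURN[cell / bytesPerWord, cell MOD bytesPerWord];
};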
Global variables (mostly in registers)
collecting: BOOL ← FALSE;
TRUE when the incremental collector is running, FALSE when it is not running.
noTandS: INT ← 0;
noTandS > 0 when the trace and sweep collector is disabled. noTandS = 0 when T and S is permitted.
highestUncountedRef: TaggedRef ← LowestCountedRef;
The highest address for an uncounted reference is here. All counted references are numerically higher than this reference. This could very well be the reference to Lisp 0.0, which needs to be easily accessible.
highestCons: TaggedRef ← LispNil;
The highest address for a cons cell is here. All counted cons cells are numerically below this address. LispNil need not be counted, of course.
RC primitives seen by Cedar
AssignRef: PROC [lhsP: TaggedRefPtr, rhs: TaggedRef] = {
... is the most common RC primitive. It performs lhsP^ ← rhs, decrementing the RC of the object previously referenced by lhsP^ and incrementing the RC of rhs. This operation must appear to be atomic to all fetches and to all other RC assignments. The RC for rhs or lhsP^ may be temporarily too high.
Preliminary estimates for timing (minimal # of cycles) indicate that the cost of AssignRef is 20 cycles plus the cost of the IncRC and the DecRC. This leads to a cost of about 120 cycles for two cons cells, 85 cycles for two refs, 55 cycles for one ref destination (no rhs RC), and 50 cycles for one ref source (no lhs RC). We suspect that the normal time for an AssignRef will be 10-20 usecs with a 100ns cycle time, and that Lisp will spend slightly more here than Cedar.
DO
lhs: TaggedRef ← lhsP^;
IF lhs = rhs THEN RETURN;
no change anywhere for x ← x (not only is this faster, but it keeps us out of trouble)
DisableTandS[];
IncRC[rhs];
The RC for rhs is now too high, but we will get it right soon.
IF CStoreRef[lhsP, rhs, lhs] THEN {
The assignment was successful, which means that the RC for rhs is now correct, but the RC for lhs is too high. We fix that, then exit in triumph.
DecRC[lhs, collecting];
EnableTandS[];
Indicate that T&S is reenabled
RETURN;
};
A race occurred, so undo the (now incorrect) increment of the rhs and enable T&S.
DecRC[rhs, collecting];
EnableTandS[];
ENDLOOP;
};
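As a hypothetical usage example (AssignExample is ours, for illustration only): a counted Cedar assignment x^ ← y compiles into a single AssignRef call.
AssignExample: PROC [x: TaggedRefPtr, y: TaggedRef] = {
AssignRef[x, y];
performs x^ ← y while keeping both reference counts correct and apparently atomic
};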
InitRef: PROC [lhsP: TaggedRefPtr, rhs: TaggedRef] = {
... performs an RC assignment as if the destination held NIL, which is checked. This is the kind of RC assignment used when a REF is assigned to a component of a newly allocated object. We do not need to use CStore, since no other process can have its mitts on the new object (the compiler had better be sure about this). We use InitRef instead of AssignRef for this special case because it is faster to do so.
lhs: TaggedRef ← lhsP^;
IF lhs # CedarNil THEN CRASH[];
IF rhs # CedarNil THEN {
DisableTandS[];
IncRC[rhs];
RC for rhs is now temporarily too high
lhsP^ ← rhs;
init the designated word
EnableTandS[];
};
};
SwapRefs: PROC [lhsP: TaggedRefPtr, rhsP: TaggedRefPtr] = {
... swaps the contents of lhsP^ and rhsP^ without normally altering the reference counts. The following intermediate states may be observed: {lhsP^ = CedarNil, lhsP^ = rhsP^, complete}. The reference count for lhsP^ or rhsP^ may be temporarily too high.
DisableTandS[];
DO
lhs: TaggedRef = lhsP^;
IF CStoreRef[lhsP, CedarNil, lhs] THEN DO
lhsP^ ← NIL has been done, try to do rhsP^ ← lhs {lhs RC is too high by 1}
rhs: TaggedRef = rhsP^;
IF CStoreRef[rhsP, lhs, rhs] THEN {
rhsP^ ← lhs has been done, try lhsP^ ← rhs {rhs RC is too high by 1}
IF NOT CStoreRef[lhsP, rhs, CedarNil] THEN DecRC[rhs, collecting];
If can't update, correct the rhs RC
EnableTandS[];
RETURN;
};
ENDLOOP;
ENDLOOP;
};
RC primitives special for Lisp
GeneralCons: PROC [car, cdr: TaggedRef] RETURNS [new: TaggedRef] = {
DisableTandS[];
new ← ConsAllocSamePage[cdr];
IncRC[car];
IncRC[cdr];
LOOPHOLE[new, ConsCellPtr]^ ← [car, cdr];
EnableTandS[];
};
ConsWithNil: PROC [car: TaggedRef] RETURNS [new: TaggedRef] = {
DisableTandS[];
new ← ConsAlloc[];
IncRC[car];
LOOPHOLE[new, ConsCellPtr]^ ← [car, LispNil];
EnableTandS[];
};
TailCons: PROC [old: TaggedRef] RETURNS [new: TaggedRef] = {
new ← ConsAllocSamePage[old];
IF IsACons[old] THEN AssignRef[LOOPHOLE[old, TaggedRefPtr]+SIZE[TaggedRef], new];
};
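As a hypothetical client-level example of how these primitives compose (MakeList2 is ours, not part of this module), building the Lisp list (a b):
MakeList2: PROC [a, b: TaggedRef] RETURNS [TaggedRef] = {
tail: TaggedRef = ConsWithNil[b];
tail is the cell (b . NIL)
RETURN[GeneralCons[a, tail]];
};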
Increment and Decrement RC
IncRC: PROC [tref: TaggedRef] = {
SELECT TRUE FROM
IsARef[tref] =>
DO
rp: RefCountPtr = LOOPHOLE[tref, RefCountPtr] - SIZE[RefCountWord];
rc: RefCountWord = rp^;
nrc: RefCountWord ← rc;
c: RefCount = rc.rcb.count;
IF c = MaxRC
THEN {IncOver[tref]; nrc.rcb.count ← 0; nrc.rcb.over ← TRUE}
ELSE nrc.rcb.count ← c + 1;
nrc.rcb.onStack ← FALSE;
IF CStoreRC[rp, nrc, rc] THEN RETURN;
IF c = MaxRC THEN [] ← DecOver[tref];
There is a vulnerable window in here such that the false IncOver will have been observed by at least 31 net decrements, and that rcb.over will be falsely set to TRUE. If DecOver is not able to find the reference, it should turn off rcb.over to compensate for this nonsense.
ENDLOOP;
IsACons[tref] => {
parts: TaggedRefParts ← LOOPHOLE[tref, TaggedRefParts];
bx: [0..bytesPerWord) ← (parts.indexInPage / wordsPerCell) MOD bytesPerWord;
parts.indexInPage ← parts.indexInPage / (wordsPerCell*bytesPerWord);
DO
rcba: RefCountByteArray = LOOPHOLE[parts, RCBAptr]^;
nrcba: RefCountByteArray ← rcba;
rcb: RefCountByte ← rcba[bx];
c: RefCount = rcb.count;
IF c = MaxRC
THEN {IncOver[tref]; rcb.count ← 0; rcb.over ← TRUE}
ELSE rcb.count ← c + 1;
rcb.onStack ← FALSE;
nrcba[bx] ← rcb;
IF CStoreRCBA[LOOPHOLE[parts, RCBAptr], nrcba, rcba] THEN RETURN;
IF c = MaxRC THEN [] ← DecOver[tref];
ENDLOOP;
};
ENDCASE;
};
DecRC: PROC [tref: TaggedRef, duringCollection: BOOL] = {
If we are decrementing during collection, we have to assume that the REF could be on the stack; therefore we put it on the ZCT anyway, and we will clear the onStack bit when the collection is over.
SELECT TRUE FROM
IsARef[tref] =>
DO
needZCT: BOOL ← FALSE;
rp: RefCountPtr = LOOPHOLE[tref, RefCountPtr] - SIZE[RefCountWord];
rc: RefCountWord = rp^;
nrc: RefCountWord ← rc;
c: RefCount = rc.rcb.count;
SELECT c FROM
0 => {
IF NOT rc.rcb.over THEN CRASH[];
nrc.rcb.over ← DecOver[tref];
nrc.rcb.count ← LAST[RefCount];
};
1 => {
IF NOT (nrc.rcb.over OR nrc.rcb.inZCT) THEN nrc.rcb.inZCT ← needZCT ← TRUE;
nrc.rcb.count ← 0;
};
ENDCASE => nrc.rcb.count ← c - 1;
IF duringCollection THEN {
nrc.rcb.onStack ← TRUE;
IF NOT nrc.rcb.inZCT THEN nrc.rcb.inZCT ← needZCT ← TRUE;
};
IF CStoreRC[rp, nrc, rc] THEN
The count has been altered, so check for needing ZCT entry.
IF needZCT THEN GO TO doZct ELSE RETURN;
IF c = 0 THEN IncOver[tref];
We cannot alter the refCount, so we have to undo the bogus overflow decrement.
ENDLOOP;
IsACons[tref] => {
parts: TaggedRefParts ← LOOPHOLE[tref, TaggedRefParts];
bx: [0..bytesPerWord) = (parts.indexInPage / wordsPerCell) MOD bytesPerWord;
parts.indexInPage ← parts.indexInPage / (wordsPerCell*bytesPerWord);
DO
needZCT: BOOL ← FALSE;
rcba: RefCountByteArray = LOOPHOLE[parts, RCBAptr]^;
nrcba: RefCountByteArray ← rcba;
rcb: RefCountByte ← rcba[bx];
c: RefCount = rcb.count;
SELECT c FROM
0 => {
IF NOT rcb.over THEN CRASH[];
rcb.over ← DecOver[tref];
rcb.count ← MaxRC;
};
1 => {
IF NOT (rcb.over OR rcb.inZCT) THEN rcb.inZCT ← needZCT ← TRUE;
rcb.count ← 0;
};
ENDCASE => rcb.count ← c - 1;
IF duringCollection THEN {
rcb.onStack ← TRUE;
IF NOT rcb.inZCT THEN rcb.inZCT ← needZCT ← TRUE;
};
nrcba[bx] ← rcb;
IF CStoreRCBA[LOOPHOLE[parts, RCBAptr], nrcba, rcba] THEN
The count has been altered, so check for needing ZCT entry.
IF needZCT THEN GO TO doZct ELSE RETURN;
IF c = 0 THEN IncOver[tref];
We cannot alter the refCount, so we have to undo the bogus overflow decrement.
ENDLOOP;
};
ENDCASE;
EXITS
doZct => PutOnZct[tref];
};
Increment and Decrement RC overflow routines
Global variables for the overflow table
headOvTab: OvTab ← NIL;
tailOvTab: OvTab ← NIL;
freeOvTab: OvTab ← NIL;
lockOvTab: INT ← 0; -- a dummy for the lock
OvTab: TYPE = LONG POINTER TO OvTabRep;
OvTabRep: TYPE = RECORD [
next: OvTab ← NIL,
used: INT ← 0,
data: ARRAY [0..OvTabEntries] OF OvTabEntry
];
OvTabEntries: NAT = 250;
OvTabEntry: TYPE = RECORD [tref: TaggedRef, count: INT];
IncOver: PROC [tref: TaggedRef] = {
Increments the metacount for the given tref.
NYI[];
};
DecOver: PROC [tref: TaggedRef] RETURNS [BOOL ← FALSE] = {
Decrements the metacount for the given tref. Returns TRUE iff the metacount removed was NOT the last for this reference.
Funny case: If the given reference is NOT in the table, it is because there was a false IncOver. If this occurs, the rcb.over bit in the object should be turned off, and FALSE returned. We do not expect this to happen very frequently.
NYI[];
};
OverflowMonitorEntry: PROC = {
NYI[];
};
OverflowMonitorExit: PROC = {
NYI[];
};
ZCT management primitives
Global variables for the ZCT
headZ: ZctBlock ← NIL;
headPtr: TaggedRefPtr ← NIL;
headLim: TaggedRefPtr ← NIL;
tailZ: ZctBlock ← NIL;
tailPtr: TaggedRefPtr ← NIL;
tailLim: TaggedRefPtr ← NIL;
freeZ: ZctBlock ← NIL;
lockZ: INT ← 0; -- a dummy for the lock
ZctBlock: TYPE = LONG POINTER TO ZctBlockRep;
ZctBlockRep: TYPE = RECORD [
next: ZctBlock ← NIL,
data: ARRAY [0..ZCTBlockEntries] OF TaggedRef
];
ZCTBlockEntries: NAT = 250;
PutOnZct: PROC [tref: TaggedRef] = {
For now we just approximate the cost and behavior.
If we get too many collisions on the monitor lock, we can put addresses into the ZCT based on some hash value, and therefore have many ZCT head blocks and many ZCT locks. By adding enough fanout we can make the likelihood of collision on the ZCT quite low. Right now we don't know if this is necessary; a sketch of one possible hash appears after PutOnZct.
ZctMonitorEntry[];
IF tailPtr = tailLim THEN {
We need a new block from the free list
IF freeZ = NIL THEN {
We need a new block from the allocator.
freeZ ← AllocZctBlock[];
freeZ.next ← NIL;
};
tailZ.next ← freeZ;
tailZ ← freeZ;
freeZ ← freeZ.next;
tailZ.next ← NIL;
don't drag the rest of the free list along behind the new tail block
tailPtr ← @tailZ.data[0];
tailLim ← @tailZ.data[ZCTBlockEntries];
reset the block limit to match the new tail block
};
tailPtr^ ← tref;
tailPtr ← tailPtr + SIZE[TaggedRef];
ZctMonitorExit[];
};
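As a purely illustrative sketch of the fanout idea mentioned above (zctFanout and the hash are assumptions; each of the zctFanout ZCT heads would get its own lock and block chain):
zctFanout: NAT = 16;
assumed fanout; a power of two keeps the hash cheap
ZctHash: PROC [tref: TaggedRef] RETURNS [[0..zctFanout)] = {
parts: TaggedRefParts = LOOPHOLE[tref, TaggedRefParts];
RETURN[parts.indexInPage MOD zctFanout];
};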
InitZctBlocks: PROC = {
headZ ← tailZ ← AllocZctBlock[];
headPtr ← tailPtr ← @headZ.data[0];
headLim ← tailLim ← @headZ.data[ZCTBlockEntries];
};
AllocZctBlock: PROC RETURNS [new: ZctBlock ← NIL] = {
IF freeZ = NIL THEN RETURN[PermanentZone[].NEW[ZctBlockRep]];
new ← freeZ;
freeZ ← freeZ.next;
new.next ← NIL;
new.data ← ALL[CedarNil];
};
FreeZctBlock: PROC [zctb: ZctBlock] = {
zctb.next ← freeZ;
freeZ ← zctb;
};
ZctMonitorEntry: PROC = {
NYI[];
};
ZctMonitorExit: PROC = {
NYI[];
};
Lisp primitive place holders
ConsAlloc: PROC RETURNS [TaggedRef ← LispNil] = {
Allocates a new Lisp cons cell, trying first from the most recently used page in cons space. We guarantee that both CAR and CDR = LispNil.
NYI[];
};
ConsAllocSamePage: PROC [tref: TaggedRef] RETURNS [TaggedRef ← LispNil] = {
Allocates a new Lisp cons cell, trying first from the same page as tref. We guarantee that both CAR and CDR = LispNil.
NYI[];
};
Really primitive place holders
CRASH: PROC = {ERROR};
We use CRASH to indicate some kind of disaster in maintaining our invariants.
NYI: PROC = {ERROR};
We use NYI to indicate some feature that is not yet implemented.
DisableTandS: PROC = {noTandS ← noTandS + 1};
EnableTandS: PROC = {noTandS ← noTandS - 1};
noTandS is a per-process auxiliary register that indicates whether that process has temporarily made a reference count too high, but will decrement it shortly. In such a case, the trace and sweep collector must not run, since it would incorrectly reconstruct the reference counts of objects that are temporarily too high. Determining whether all processes have noTandS = 0 is going to be cute. What must happen is that some processor must obtain the scheduler lock, then cause all other processors to panic stop, which saves the state of all processes. Then we can examine noTandS for all processes safely.
CStoreRef: PROC [lhsP: TaggedRefPtr, new, old: TaggedRef] RETURNS [BOOL ← TRUE] = {
NYI[];
};
CStoreRC: PROC [lhsP: RefCountPtr, new, old: RefCountWord] RETURNS [BOOL ← TRUE] = {
NYI[];
};
CStoreRCBA: PROC [lhsP: RCBAptr, new, old: RefCountByteArray] RETURNS [BOOL ← TRUE] = {
NYI[];
};
PermanentZone: PROC [] RETURNS [UNCOUNTED ZONE ← NIL] = {
NYI[];
};
IsACons: PROC [tref: TaggedRef] RETURNS [BOOL ← TRUE] = {
... compares the given word against the highest address for a cons cell, returning TRUE iff the address is in the cons range.
NYI[];
};
IsARef: PROC [tref: TaggedRef] RETURNS [BOOL ← TRUE] = {
... compares the given word against the lowest address for a heap object, returning TRUE iff the address is in the ref range.
NYI[];
};
END.