Heading:
IDL DESIGN: Reference counts
Page Numbers: Yes X: 527 Y: 10.5"
XEROX Palo Alto Research Center July 12, 1977
Inter-Office Memorandum
ToIDL GroupIDL Design Note Number: 12
FromRon KaplanFile:<IDLDOC>Design.RefCounts
Subject Reference Counts
This memo is intended to make explicit the current conventions governing the use and interpretation of reference counts. Reference counts are maintained for rows that are referenced, directly or indirectly, by array frames. Their purpose is to permit secret sharing of structures that logically should be copies, while permitting the copying semantics to apply when necessary. Specifically, the copy nature of the structures must be manifested when one of the "copies" is altered, for example, by ASSIGN or by some other system operation (about which more later). We expect such modification requests to be relatively rare, so that many physical copies can be avoided.
The first observation to make about reference counts is that they need not be maintained for all rows. They are only needed for rows that (1) can be secretly shared but logically copied; and (2) can be modified. In the current implementation, the only rows that have both these properties are:
Shapes
Element blocks
Dimension label blocks
Level label blocks.
All these rows may be passed back to the user (via SHAPE, AT, LABEL) in such a way that ASSIGN can be invoked to change them. Moreover, the rows of label information can be shared (via LAB.COPYALL and LAB.COPYDIM) and modified (via SETDIMLAB etc.) internal to system code.
There are a number of other rows in the system that lack one or both of these properties:
Translation table ROWPTRs -- never modified, not shared
Offset ROWINTs -- never modified, should (and will) be shared
TTABLINLOC ROWINTs -- never shared
VIRTSUBSCRIPT ROWINTs -- never shared
It is foolish to suffer the code complexity and run-time costs of maintaining reference counts for these objects, and therefore, no such maintenance is done.
The reference counts for all rows are set to zero at create time. The counts for reference counted rows are incremented whenever a new pointer to them is established. They are decremented whenever a change operation is to be performed and their reference count is more than one. In that situation, the structure containing the pointer to the row through which the modification will be visible is altered to point to a copy of the row. The copy has reference count 1, since it is pointed to by the structure in question. But a pointer to the original has been dropped, as is its reference count.
This simple design is, of course, only an approximation to what we would really like to have. Ideally, the reference count on an object would reflect only references through which that object can still be accessed. Thus, if an arrayframe is created pointing to an element block, and that element block is shared with another arrayframe, and then all pointers to the first frame are destroyed so that it is (or may be) garbage collected, the reference count to the element block should be decremented. The fact that the garbage collector is not smart enough to decrement the count means that the row may be copied unnecessarily. If this happens infrequently, it will cause only slight inefficiencies, since the reference count of the copy will be 1 and subsequent changes will be made in place.
We should probably embellish this scheme to eliminate known, systematic instances of inflated reference counts. In particular, there are occasions in system code where two arrayframes are created pointing to the same row, but we know that only one of them will migrate to the user. As an example, consider the expression
RESHAPE(1 (1 2))
RESHAPE converts its first argument to an array using CONV.ARRAY. This returns an arrayframe pointing to an element block with reference count 1. RESHAPE then creates a new frame with the new shape pointing to the element block, giving it a count of 2. The second frame is returned to the user; the first frame falls out of existence. An attempt to ASSIGN into the value of RESHAPE will result in an unnecessary copy.
An embellished definition of RESHAPE would notice that with the given arguments, CONV.ARRAY was producing a new arrayframe and element block, to which there were no other pointers. It could thus either smash the new shape into the old frame, or build a new frame in the ordinary way, but not increment the reference count.
There might be other common, special situations where we might want to take special action. However, we are probably unable to capture such cases as the following:
(RESHAPE (COUNT X) ’(1 2))
Here, COUNT is producing an array that will never emerge into the light of day, yet the element block of RESHAPE’s value will have a reference count of 2. Oh well.