CircularGarbageDoc.tioga
Bob Hagmann August 13, 1985 2:33:29 pm PDT
Circular Garbage
CEDAR 6.0 — FOR INTERNAL XEROX USE ONLY
CircularGarbageDoc
Bob Hagmann
© Copyright 1985 Xerox Corporation. All rights reserved.
Abstract: This program is a modification of the trace and sweep program to find circular data structures that are unreferenced except from within themselves.
Created by: Bob Hagmann
Maintained by: Hagmann.pa
Keywords: garbage collection, performance
XEROX  Xerox Corporation
   Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, California 94304

For Internal Xerox Use Only
1. Algorithm
This program is a rework of the original implementation done by Paul Rovner.
Recall that the garbage collectors in Cedar rely on the ``conservative scan'': REFs in stack frames are not referenced counted, and which fields in stack frames contain REFs is never computed.
FindCircularGarbage and FindCircularGarbageTypes described below work as follows. First a series of incremental collections occur to try to get everything that the incremental collector can find to be reclaimed, and all finalizable objects finalized. When further incremental collections seem useless, a variant of a trace and sweep is launched. When an object if found that is not free and is not referenced, directly or recursively, from permanent data structures (global frames, stack frames), then it is an object that the normal trace and sweep would collect — unless, of course, the object is finalizable. Instead of collecting it, CircularGarbage remembers it in a table and set the reference count to four. The table is of size specified by the caller — overflows are not recorded. Also, objects of type RTTypesPrivate.TypedVariable are not recorded — they are created by the interpreter during evaluation, and don't print well. When the collection finishes, the table is processed, and a list is constructed of the REFs or the types of the objects found respectively.
WhoPointsTo and WhoPointsToRef follow up the referrer chain to find who refers to an object given by header (Allocator.NHeaderP) or by REF. This uses the trace and sweep to find these. Real trace and sweeps are done.
FindConnectedComponent takes an object and recursively finds all objects that it can reach. When given an object in a non-trivial strongly connected component ("circular part"), FindConnectedComponent will return all objects in the entire connected component plus all objects reachable from the component (up to the limits provided by nObjects). When given an object in a weakly connected component (the "leaf part"), FindConnectedComponent will return all objects that are reachable from this object (up to the limits provided). By using WhoPointsTo or WhoPointsToRef, you can find an object in a strongly connected component from objects in weakly connected components. For example, you may find that some poor unsuspecting ROPE is listed as circular. Since the ROPE cannot (should not) be circular all by itself, you would use WhoPointsTo or WhoPointsToRef to find who points to the ROPE. Note that an object can be circular if it points to itself.
To review, you can walk forward from objects using the interpreter. You can do transitive closures of walking forward via FindConnectedComponent. You can find objects that point to a given object (walking backwards) via WhoPointsTo and WhoPointsToRef.
See the example below for interesting uses of PrintTV.PrintType andSafeStorage.GetReferentType used with this package.
Warning: trace and sweep can take a long time if your collectible storage is much bigger than your physical storage.
Warning: the conservative scan and finalization may make some objects appear to be garbage that in reality are not. Also, objects created or (recursively) dereferenced between the last incremental collection and the trace and sweep may appear to be garbage when in fact they are not. We launch a series of incremental collections before doing the sweep, but this is probalistic.
Warning: the reference counts on objects suspected of being garbage is set artificially high. To make the world consistent, a real trace and sweep should be run (right click GC in the watch tool).
Warning: certain circular structures seem to freak out the interpreter while printing. I've tried to prune these out.
Warning: objects of type "LIST OF AMModelContextImpl.GFContextRec" and of type "RTTypesPrivate.TypedVariableRec" have their reference counts set artifically high and are not reported. This avoids some unsaftey in the interpreter (I think).
Warning: never try to do two of the garbage finders at once.
Warning: objects with finalization enabled are never reported.
Warning: FindConnectedComponent relies on my incomplete understanding of the Abstract Machine. It may not handle objects of all types.
2. User Interface (hah!)
Run CircularGarbageTraceAndSweepImpl. From the Interpreter (not the CommandTool for some obscure reason!), start up the garbage finder. CircularGarbageTraceAndSweep interface has five interesting procedures:
WhoPointsTo: PROC [nhp: Allocator.NHeaderP, cycleLimit: NAT ← 20] RETURNS [ referers: REF APNodes, gfh: PrincOps.GlobalFrameHandle, foundInLocalFrame, foundInCircularStructure: BOOL, cycles: NAT ← 0] ;
Look up the referer chain. Stop if a gfh found, or if foundInLocalFrame, or if a loop is found, or if more than 1 referer is found, or if no referers are found (changed conserv scan). This does a real trace and sweep to get its information.
WhoPointsToRef: PROC [ref: REF, cycleLimit: NAT ← 20] RETURNS [ referers: REF APNodes, gfh: PrincOps.GlobalFrameHandle, foundInLocalFrame, foundInCircularStructure: BOOL, cycles: NAT ← 0] ;
Same as WhoPointsTo but for REF's
FindCircularGarbageTypes: PROC [nObjects: CARDINAL, reportRopes: BOOLTRUE] RETURNS [ nGarbage, nSeen: INT, garbage: LIST OF SafeStorage.Type ← NIL] ;
This procedure first does a standard incremental collection. Then it goes thru the motions of a TandS, but does not reclaim any garbage. Instead, it records the Types in a list. Do not do two of these at once!!
FindCircularGarbage: PROC [nObjects: CARDINAL, reportRopes: BOOLTRUE] RETURNS [ nGarbage, nSeen: INT, garbage: LIST OF REFNIL] ;
This procedure first does a standard incremental collection. Then it goes thru the motions of a TandS, but does not reclaim any garbage. Instead, it records the REFs in a list. Do not do two of these at once!!
FindConnectedComponent: PROC [rootObject: REF ANY, nObjects: CARDINAL, reportRopes: BOOLTRUE] RETURNS [ allOnList: BOOL, structure: LIST OF REF ];
This procedure takes the rootObject, and find all object that it recursively points to. nObjects is the maximum number of objects to return. ROPEs will not be returned if reportRopes is FALSE. allOnList indicates that the list is complete: it was not stopped because nObjects was reached.
Call with nObjects as the number of objects you want to worry about. Ropes will not be returned in the list if reportRopes is FALSE (but they are counted). ``nGarbage'' is the number of objects seen that are garbage. ``nSeen'' is how many total objects are in the system. Examples ( what you type in bold) :


&1 ← CircularGarbageTraceAndSweepImpl.FindCircularGarbage[10]
=> [nGarbage: 333, nSeen: 35087, garbage: LIST[" 9 Aug 85 HARTMA... Report Requests", "HARTMANN.PA $ 24#117@ 9 Aug 85 10:12:43 PDT", ^[msg: "donahue.pa $ 3#235@12 Aug 85 12:51:50 PDT", tocName: "\t12 Aug 85 donahue.pa A Walnut bulletin b...", startOfSubject: 29, hasBeenRead: TRUE], ^[streamProcs: 2361562B^, streamData: NIL, propList: NIL, backingStream: 7047046B^], "summonerserveron", ^[getChar: IOCommonImpl.DefaultGetChar, endOf: IOCommonImpl.DefaultEndOf, charsAvail: IOCommonImpl.DefaultCharsAvail, getBlock: IOCommonImpl.GetBlockViaGetChar, unsafeGetBlock: IOCommonImpl.UnsafeGetBlockViaGetChar, backup: IOCommonImpl.DefaultBackup, putChar: IOCommonImpl.DefaultPutChar, putBlock: IOCommonImpl.PutBlockViaPutChar, unsafePutBlock: IOCommonImpl.UnsafePutBlockViaPutChar, flush: IOCommonImpl.DefaultFlush, reset: IOCommonImpl.DefaultReset, close: IOCommonImpl.ClosedClose, getIndex: IOCommonImpl.DefaultGetIndex, setIndex: IOCommonImpl.DefaultSetIndex, propList: NIL, variety: input, class: $EditedViewer], ^[getChar: IOCommonImpl.DefaultGetChar, endOf: IOCommonImpl.DefaultEndOf, charsAvail: IOCommonImpl.DefaultCharsAvail, getBlock: IOCommonImpl.GetBlockViaGetChar, unsafeGetBlock: IOCommonImpl.UnsafeGetBlockViaGetChar, backup: IOCommonImpl.DefaultBackup, putChar: IOCommonImpl.DefaultPutChar, putBlock: IOCommonImpl.PutBlockViaPutChar, unsafePutBlock: IOCommonImpl.UnsafePutBlockViaPutChar, flush: IOCommonImpl.DefaultFlush, reset: IOCommonImpl.DefaultReset, close: IOCommonImpl.ClosedClose, getIndex: IOCommonImpl.DefaultGetIndex, setIndex: IOCommonImpl.DefaultSetIndex, propList: NIL, variety: output, class: $ViewersOutput], ^[proc: WalnutMsgSetDisplayerImpl.MsgSetSelectionProc, clientData: 2155302B^, q: 4760662B^], ^[proc: WalnutMsgSetDisplayerImpl.MsgSetSelectionProc, clientData: 2155002B^, q: 4760662B^], ^[proc: NIL, clientData: NIL, q: 4760662B^]]]
&2 ← &1.garbage.rest.rest.first
=> ^[msg: "donahue.pa $ 3#235@12 Aug 85 12:51:50 PDT", tocName: "\t12 Aug 85 donahue.pa A Walnut bulletin board service", startOfSubject: 29, hasBeenRead: TRUE]
&3 ← CircularGarbageTraceAndSweepImpl.WhoPointsToRef[&1.garbage.rest.rest.first, 2]
=> [referers: LIST[^[next: 12434322B^, prev: 2147702B^, iButton: 12160370B^, hbrButton: NIL, tocButton: 12160406B^, container: 10475232B^, msgInfo: 1702212B^], ^1702212B^, ^[ref: 1702212B^, cycleLimit: 2], ^1702212B^, ^[ref: 1702212B^, cycleLimit: 2], LIST[^[msg: "donahue.pa $ 3#235@12 Aug 85 12:51:50 PDT", tocName: "\t12 Aug 85 donahue.pa A Walnut bulletin b...", startOfSubject: 29, hasBeenRead: TRUE], ^[streamProcs: 2361562B^, streamData: NIL, propList: NIL, backingStream: 7047046B^], "summonerserveron", ^[getChar: IOCommonImpl.DefaultGetChar, endOf: IOCommonImpl.DefaultEndOf, charsAvail: IOCommonImpl.DefaultCharsAvail, getBlock: IOCommonImpl.GetBlockViaGetChar, unsafeGetBlock: IOCommonImpl.UnsafeGetBlockViaGetChar, backup: IOCommonImpl.DefaultBackup, putChar: IOCommonImpl.DefaultPutChar, putBlock: IOCommonImpl.PutBlockViaPutChar, unsafePutBlock: IOCommonImpl.UnsafePutBlockViaPutChar, flush: IOCommonImpl.DefaultFlush, reset: IOCommonImpl.DefaultReset, close: IOCommonImpl.ClosedClose, getIndex: IOCommonImpl.DefaultGetIndex, setIndex: IOCommonImpl.DefaultSetIndex, propList: NIL, variety: input, class: $EditedViewer], ^[getChar: IOCommonImpl.DefaultGetChar, endOf: IOCommonImpl.DefaultEndOf, charsAvail: IOCommonImpl.DefaultCharsAvail, getBlock: IOCommonImpl.GetBlockViaGetChar, unsafeGetBlock: IOCommonImpl.UnsafeGetBlockViaGetChar, backup: IOCommonImpl.DefaultBackup, putChar: IOCommonImpl.DefaultPutChar, putBlock: IOCommonImpl.PutBlockViaPutChar, unsafePutBlock: IOCommonImpl.UnsafePutBlockViaPutChar, flush: IOCommonImpl.DefaultFlush, reset: IOCommonImpl.DefaultReset, close: IOCommonImpl.ClosedClose, getIndex: IOCommonImpl.DefaultGetIndex, setIndex: IOCommonImpl.DefaultSetIndex, propList: NIL, variety: output, class: $ViewersOutput], ^[proc: WalnutMsgSetDisplayerImpl.MsgSetSelectionProc, clientData: 2155302B^, q: 4760662B^], ^[proc: WalnutMsgSetDisplayerImpl.MsgSetSelectionProc, clientData: 2155002B^, q: 4760662B^], ^[proc: NIL, clientData: NIL, q: 4760662B^]], ^[ref: 1702212B^, cycleLimit: 2]], gfh: NIL, foundInLocalFrame: TRUE, foundInCircularStructure: FALSE, cycles: 1]
&4 ← CircularGarbageTraceAndSweepImpl.FindCircularGarbageTypes[30, FALSE]
=> [nGarbage: 2116, nSeen: 41644, garbage: LIST[[709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [709], [1507], [1507], [1507], [1507], [550], [1507], [789], [1507], [1507], [1507], [1507], [1507], [1507], [1507]]]
&5 ← PrintTV.PrintType[SafeStorage.GetReferentType[&1.garbage.rest.rest.first], &H.tsOutStream]
WalnutMsgSetDisplayerPrivate.MsgInfoRec
&6 ← PrintTV.PrintType[[709], &H.tsOutStream]
MBQueuePrivate.MyClickInfoObj
&7 ← CircularGarbageTraceAndSweepImpl.FindCircularGarbage[6, FALSE]
=> [nGarbage: 3663, nSeen: 43116, garbage: LIST[^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 12161202B^, q: 4760662B^], ^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 7120362B^, q: 4760662B^], ^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 3142222B^, q: 4760662B^], ^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 12161002B^, q: 4760662B^], ^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 12161022B^, q: 4760662B^], ^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 12161122B^, q: 4760662B^]]]
&8 ← CircularGarbageTraceAndSweepImpl.FindConnectedComponent[&7.garbage.first, 10]
=> [allOnList: FALSE, countReturned: 10, circularObjects: LIST[^[proc: WalnutMsgSetButtonsImpl.SelectMsgSetProc, clientData: 12161202B^, q: 4760662B^], ^[msgSet: [...], spaceButton: 12157576B^, button: 12157614B^, msViewer: NIL, next: 12161222B^, selected: NIL], ^[LOCK: [...], firstEvent: NIL, pushModel: FALSE, newEvent: [...], notifierRunning: FALSE], ^[startLoc: [...], endLoc: [...], proc: NIL, clientData: NIL, fork: TRUE], ^[startLoc: [...], endLoc: [...], proc: MBQueueImpl.UserClick, clientData: 2146052B^, fork: TRUE], NIL, ^[msgSet: [...], spaceButton: 12157632B^, button: 12157650B^, msViewer: NIL, next: 12161242B^, selected: NIL], ^[startLoc: [...], endLoc: [...], proc: NIL, clientData: NIL, fork: TRUE], ^[startLoc: [...], endLoc: [...], proc: MBQueueImpl.UserClick, clientData: 12134442B^, fork: TRUE], ^[msgSet: [...], spaceButton: 12157704B^, button: 12157722B^, msViewer: NIL, next: 12161322B^, selected: NIL]]]
&9 ← &8.circularObjects.first.clientData
=> ^[msgSet: [name: "VM", version: -1], spaceButton: 12157576B^, button: 12157614B^, msViewer: NIL, next: 12161222B^, selected: NIL]
&10 ←
You can now paw around in the circular structure and try to figure out why is isn't collected.
CircularGarbageTraceAndSweep is an interface that allows for this all to be called from a program to automate some sifting.
You might consider first doing a trace and sweep, running the program that you what to measure, then finding the garbage. This avoids the garbage left around by other parts of the system.