{Begin SubSec Storage Allocation and Garbage Collection} {Title Storage Allocation and Garbage Collection} {Text {index Fragmentation of data space} {index Data fragmentation} {index Working set} {index Garbage collection} {index Storage allocation} As an Interlisp-D applications program runs, it creates data structures (allocated out of free storage space), manipulates them, and then discards them. If there were no way of reclaiming this space, over time the Interlisp-D memory (both the physical memory in the machine and the virtual memory stored on the disk) would fill up, and the computation would come to a halt. Actually, long before this could happen the system would probably become intolerably slow, due to "data fragmentation," which occurs when the data currently in use are spread over many virtual memory pages, so that most of the computer time must be spent swapping disk pages into physical memory. The problem of fragmentation will occur in any situation where the virtual memory is significantly larger than the real physical memory. To reduce swapping, it is desirable to keep the "working set" (the set of pages containing actively referenced data) as small as possible. {index Mark-and-sweep garbage collection} It is possible to write programs that don't generate much "garbage" data, or which recycle data, but such programs tend to be overly complicated and difficult to debug. Spending effort writing such programs defeats the whole point of using a system with automatic storage allocation. An important part of any Lisp implementation is the "garbage collector" which identifies discarded data and reclaims its space. There are several well-known approaches to garbage collection. One method is the traditional mark-and-sweep garbage collection algorithm, which identifies "garbage" data by marking all accessible data structures, and then sweeping through the data spaces to find all unmarked objects (i.e., not referenced by any other object). Although this method is guaranteed to reclaim all garbage, it takes time proportional to the number of allocated objects, which may be very large. (Some allocated objects will have been marked during the "mark" phase, and the remainder will be collected during the "sweep" phase; so all will have to be touched in some way.) Also, the time that a mark-and-sweep garbage collection takes is independent of the amount of garbage collected; it is possible to sweep through the whole virtual memory, and only recover a small amount of garbage. {index Reference-counting garbage collection} For interactive applications, it is not acceptable to have long interruptions in a computation for the purpose of garbage collection. Interlisp-D solves this problem by using a reference-counting garbage collector. With this scheme, there is a table containing counts of how many times each object is referenced. This table is incrementally updated as pointers are created and discarded, incurring a small overhead distributed over the computation as a whole. (Note: References from the stack are not counted, but are handled separately at "sweep" time; thus the vast majority of data manipulations do not cause updates to this table.) At opportune moments, the garbage collector scans this table, and reclaims all objects that are no longer accessible (have a reference count of zero). The pause while objects are reclaimed is only the time for scanning the reference count tables (small) plus time proportional to the amount of garbage that has to be collected (typically less than a second). "Opportune" times occur when a certain number of cells have been allocated or when the system has been waiting for the user to type something for long enough. The frequency of garbage collection is controlled by the functions and variables described below. For the best system performance, it is desirable to adjust these parameters for frequent, short garbage collections, which will not interrupt interactive applications for very long, and which will have the added benefit of reducing data fragmentation, keeping the working set small. One problem with the Interlisp-D garbage collector is that not all garbage is guaranteed to be collected. Circular data structures, which point to themselves directly or indirectly, are never reclaimed, since their reference counts are always at least one. With time, this unreclaimable garbage may increase the working set to unacceptable levels. Some users have worked with the same Interlisp-D virtual memory for a very long time, but it is a good idea to occasionally save all of your functions in files, reinitialize Interlisp-D, and rebuild your system. Many users end their working day by issuing a command to rebuild their system and then leaving the machine to perform this task in their absence. If the system seems to be spending too much time swapping (an indication of fragmented working set), this procedure is definitely recommended. Garbage collection in Interlisp-D is controlled by the following functions and variables: {FnDef {Name RECLAIM} {Args} {Text Initiates a garbage collection. Returns 0. }} {FnDef {Name RECLAIMMIN} {Args N} {Text Sets the frequency of garbage collection. Interlisp keeps track of the number of cells of any type that have been allocated; when it reaches a given number, a garbage collection occurs. If {arg N} is non-{lisp NIL}, this number is set to {arg N}. Returns the current setting of the number. }} {VarDef {Name RECLAIMWAIT} {Text Interlisp-D will invoke a {fn RECLAIM} if the system is idle and waiting for user input for {var RECLAIMWAIT} seconds (currently set for 4 seconds). }} {FnDef {Name GCGAG} {Args MESSAGE} {Text Sets the behavior that occurs while a garbage collection is taking place. If {arg MESSAGE} is non-{lisp NIL}, the cursor is complemented during a {fn RECLAIM}; if {arg MESSAGE}={lisp NIL}, nothing happens. The value of {fn GCGAG} is its previous setting. }} {FnDef {Name GCTRP} {Args} {Text Returns the number of cells until the next garbage collection, according to the {fn RECLAIMMIN} number. }} The amount of storage allocated to different data types, how much of that storage is in use, and the amount of data fragmentation can be determined using the following function: {FnDef {FnName STORAGE} {FnArgs TYPES PAGETHRESHOLD} {Text {fn STORAGE} prints out a summary, for each data type, of the amount of space allocated to the data type, and how much of that space is currently in use. If {arg TYPES} is non-{lisp NIL}, {fn STORAGE} only lists statistics for the specified types. {arg TYPES} can be a litatom or a list of types. If {arg PAGETHRESHOLD} is non-{lisp NIL}, then {fn STORAGE} only lists statistics for types that have at least {arg PAGETHRESHOLD} pages allocated to them. {fn STORAGE} prints out a table with the column headings {lisp Type}, {lisp Assigned}, {lisp Free Items}, {lisp In use}, and {lisp Total alloc}. {lisp Type} is the name of the data type. {lisp Assigned} is how much of your virtual memory is set aside for items of this type. Currently, memory is allocated in quanta of two pages (1024 bytes). The numbers under {lisp Assigned} show the number of pages and the total number of items that fit on those pages. {lisp Free Items} shows how many items are available to be allocated (using the {lisp create} construct, {PageRef (Record Operator) CREATE}); these constitute the "free list" for that data type. {lisp In use} shows how many items of this type are currently in use, i.e., have pointers to them and hence have not been garbage collected. If this number is higher than your program seems to warrant, you may want to look for storage leaks. The sum of {lisp Free Items} and {lisp In use} is always the same as the total {lisp Assigned} items. {lisp Total alloc} is the total number of items of this type that have ever been allocated (see {fn BOXCOUNT}, {PageRef Fn BOXCOUNT}). Note: The information about the number of items of type {lisp LISTP} is only approximate, because list cells are allocated in a special way that precludes easy computation of the number of items per page. Note: When a data type is redeclared, the data type name is reassigned. Pages which were assigned to instances of the old data type are labeled {index **DEALLOC** (Data Type Name)}{lisp **DEALLOC**}. At the end of the table printout, {fn STORAGE} prints a "Data Spaces Summary" listing the number of pages allocated to the major data areas in the virtual address space: the space for fixed-length items (including datatypes), the space for variable-length items, and the space for litatoms. Variable-length data types such as arrays have fixed-length "headers," which is why they also appear in the printout of fixed-length data types. Thus, the line printed for the {lisp BITMAP} data type says how many bitmaps have been allocated, but the "assigned pages" column counts only the headers, not the space used by the variable-length part of the bitmap. This summary also lists "Remaining Pages" in relation to the largest possible virtual memory, not the size of the virtual memory backing file in use. This file may fill up, causing a {lisp STORAGE FULL} error, long before the "Remaining Pages" numbers reach zero. {fn STORAGE} also prints out information about the sizes of the entries on the variable-length data free list. The block sizes are broken down by the value of the variable {var STORAGE.ARRAYSIZES},{index STORAGE.ARRAYSIZES Var} initially {lisp (4 16 64 256 1024 4096 16384 NIL)}, which yields a printout of the form: {lispcode variable-datum free list: le 4 26 items; 104 cells. le 16 72 items; 783 cells. le 64 36 items; 964 cells. le 256 28 items; 3155 cells. le 1024 3 items; 1175 cells. le 4096 5 items; 8303 cells. le 16384 3 items; 17067 cells. others 1 items; 17559 cells.} This information can be useful in determining if the variable-length data space is fragmented. If most of the free space is composed of small items, then the allocator may not be able to find room for large items, and will extend the variable datum space. If this is extended too much, this could cause an {lisp ARRAYS FULL} error,{index ARRAYS FULL Error} even if there is a lot of space left in little chunks. }} {note is there any reason to document this??? (ATOMHASH#PROBES STRING) Returns the number of probes it takes to find an atom in Interlisp's atom hash table with print name = STRING, or NIL if no such atom is found.} {FnDef {Name STORAGE.LEFT} {Args } {Text Provides a programmatic way of determining how much storage is left in the major data areas in the virtual address space. Returns a list of the form {lisp ({arg MDSFREE} {arg MDSFRAC} {arg 8MBFRAC} {arg ATOMFREE} {arg ATOMFRAC})}, where the elements are interpreted as follows: {Begin Labeledlist STORAGE.LEFT values} {Label {arg MDSFREE}} {Text The number of free pages left in the main data space (which includes both fixed-length and variable-length data types).} {Label {arg MDSFRAC}} {Text The fraction of the total possible main data space that is free.} {Label {arg 8MBFRAC}} {Text The fraction of the total main data space that is free, relative to eight megabytes. This number is useful when using Interlisp-D on some early computers where the hardware limits the address space to eight megabytes. The function {index 32MBADDRESSABLE Fn}{fn 32MBADDRESSABLE} returns non-{lisp NIL} if the currently running Interlisp-D system can use the full 32 megabyte address space.} {Label {arg ATOMFREE}} {Text The number of free pages left in the litatom space.} {Label {arg ATOMFRAC}} {Text The fraction of the total litatom space that is free.} {End Labeledlist STORAGE.LEFT values} }} Note: Another important space resource is the amount of the virtual memory backing file in use (see {fn VMEMSIZE}, {PageRef Fn VMEMSIZE}). The system will crash if the virtual memory file is full, even if the address space is not exhausted. }{End SubSec Storage Allocation and Garbage Collection} ?1(DEFAULTFONT 1 (GACHA 10) (GACHA 8) (TERMINAL 8)) ?1(DEFAULTFONT 1 (GACHA 10) (GACHA 8) (TERMINAL 8)) /X/Xzē