{Begin SubSec Hash Arrays} {Title Hash Arrays} {Text {Note Interlisp-D doesn't GC any hash keys. } {Note Interlisp-Vax doesn't throw away any hash keys} {Note Will the garbage collector ever delete entries from a hash table? Personal opinion: I think that the GC should NOT delete entries, and that you should always be able to retrieve everything you put in a hash array using MAPHASH ---mjs} {Note Exactly how are hash arrays different from arrays? Do we want to document this here? Personal opinion: we want to treat hash arrays as a totally different data type ---mjs} {Tag HashArrays} {index *PRIMARY* hash arrays} {index *PRIMARY* hash keys} {index *PRIMARY* hash values} Hash arrays provide a mechanism for associating arbitrary lisp objects ("hash keys") with other objects ("hash values"), such that the hash value associated with a particular hash key can be quickly obtained. A set of associations could be represented as a list or array of pairs, but these schemes are very inefficient when the number of associations is large. There are functions for creating hash arrays, putting a hash key/value pair in a hash array, and quickly retrieving the hash value associated with a given hash key. Hash keys can be any lisp object, but is should be noted that the hash array functions use {fn EQ} for comparing hash keys. Therefore, if non-atoms are used as hash keys, the exact same object (not a copy) must be used to retrieve the hash value. {note paragraph about GC of non-atomic hash keys (done in I-10, not in I-D)?} In the description of the functions below, the argument {arg HARRAY} has one of three forms: {lisp NIL}, in which case a hash array provided by the system, {index SYSHASHARRAY Var}{var SYSHASHARRAY}, is used; a hash-array created by the function {fn HARRAY}; or a list, {fn CAR} of which is a hash array. The latter form is used for specifying what is to be done on overflow, as described below. {Note What does 'size' MEAN?? Is size='max number hash links', or 2 times that? vaguely, size=# of "cells" of storage ---lmm do we want to doc this??} {FnDef {FnName HARRAY} {FnArgs LEN} {Text Creates a hash array containing at least {arg LEN} hash keys. }} {FnDef {FnName HARRAYSIZE} {FnArgs HARRAY} {Text Returns the size of {arg HARRAY}; the number of hash keys it can hold before becoming "full". }} {FnDef {FnName CLRHASH} {FnArgs HARRAY} {Text Clears all hash keys/values from {arg HARRAY}. Returns {arg HARRAY}. }} {FnDef {FnName PUTHASH} {FnArgs KEY VAL HARRAY} {Text Associates the hash value {arg VAL} with the hash key {arg KEY} in {arg HARRAY}. Replaces the previous hash value, if any. If {arg VAL} is {lisp NIL}, any old association is removed (hence a hash value{index hash value} of {lisp NIL} is not allowed). If {arg HARRAY} is full when {fn PUTHASH} is called with a key not already in the hash array, the function {fn HASHOVERFLOW} is called, and the {fn PUTHASH} is done to the value returned (see below). Returns {arg VAL}. }} {FnDef {FnName GETHASH} {FnArgs KEY HARRAY} {Text Returns the hash value associated with the hash key {arg KEY} in {arg HARRAY}. Returns {lisp NIL}, if {arg KEY} is not found. }} {FnDef {FnName REHASH} {FnArgs OLDHARRAY NEWHARRAY} {Text Hashes all hash keys and values in {arg OLDHARRAY} into {arg NEWHARRAY}. The two hash arrays do not have to be (and usually aren't) the same size. Returns {arg NEWHARRAY}. {note in Interlisp-Vax, if OLDHARRAY=NEWHARRAY, rehashes OLDHARRAY "in place".} }} {FnDef {FnName MAPHASH} {FnArgs HARRAY MAPHFN} {Text {arg MAPHFN} is a function of two arguments. For each hash key in {arg HARRAY}, {arg MAPHFN} will be applied to (1) the hash value, and (2) the hash key. For example, {lispcode [MAPHASH A (FUNCTION (LAMBDA (VAL KEY) (if (LISTP KEY) then (PRINT VAL)]} will print the hash value for all hash keys that are lists. {fn MAPHASH} returns {arg HARRAY}. }} {FnDef {FnName DMPHASH} {FnArgs HARRAY{SUB 1} HARRAY{SUB 2} {ellipsis} HARRAY{SUB N}} {Type NOSPREAD NLAMBDA} {Text Prints on the primary output file {fn LOAD}able forms which will restore the hash-arrays contained as the values of the atoms {arg HARRAY{SUB 1}}, {arg HARRAY{SUB 2}}, {ellipsis} {arg HARRAY{SUB N}}. Example: {lisp (DMPHASH SYSHASHARRAY)} will dump the system hash-array. Note: all {fn EQ} identities except atoms and small integers are lost by dumping and loading because {fn READ} will create new structure for each item. Thus if two lists contain an {fn EQ} substructure, when they are dumped and loaded back in, the corresponding substructures while {fn EQUAL} are no longer {fn EQ}. The {lisp HORRIBLEVARS} file package command ({PageRef FileCom HORRIBLEVARS}) provides a way of dumping hash tables such that these identities are preserved. }} {note use CIRCULARVARS or GENERALVARS instead of HORRIBLEVARS ---lmm} {Begin SubSec Hash Overflow} {Title Hash Overflow} {Text {index hash overflow} {note who created this kludgy way of specifying overflow? Why not simply have the overflow action be determined by a property of an HARRAY? Then, you wouldn't have to pass around these dumb lists.} When a hash array becomes full, attempting to add another hash key will cause the function {fn HASHOVERFLOW}{index HASHOVERFLOW Fn} to be called. This will either automatically enlarge the hash array, or cause the error {lisp HASH TABLE FULL}.{index HASH TABLE FULL Error} How hash overflow is handled is determined by the form that was passed to {fn PUTHASH}: {Begin LabeledList the form that is passed to the hash function} {Name {arg HARRAY}} {Text If a plain hash array is passed to a hash function, and it overflows, the error {lisp HASH ARRAY FULL}{index HASH ARRAY FULL Error} is generated. } {Name {lisp NIL}} {Text If a hash function is passed {lisp NIL} as its {arg HARRAY} argument, the system hash array {var SYSHASHARRAY}{index SYSHASHARRAY Var} is used. This array is not used by the system, but is provided for the user. If {var SYSHASHARRAY} overflows, it is automatically enlarged by 1.5. } {Name {lisp ({arg HARRAY} . {arg N})}} {Text {arg N} is a positive integer. This form specifies that upon hash overflow, a new hash-array is created with {arg N} more cells than the current hash-array. } {Name {lisp ({arg HARRAY} . {arg F})}} {Text {arg F} is a floating point number. This form specifies that upon hash overflow, the new hash array will be {arg F} times the size of the current hash-array. } {Name {lisp ({arg HARRAY} . {arg FN})}} {Text {arg FN} is a function name or a lambda expression. This form specifies that upon hash overflow, {arg FN} is called with {lisp ({arg HARRAY} . {arg FN})} as its argument. If {arg FN} returns a number, the number will be the size of the new hash array. Otherwise, the new size defaults to 1.5 times the size of the old hash array. {arg FN} could be used to print a message, or perform some monitor function. } {Name {lisp ({arg HARRAY})}} {Text Equivalent to {lisp ({arg HARRAY} . 1.5)}. } {End LabeledList the form that is passed to the hash function} If a list form is used, upon hash overflow the new hash-array is {fn RPLACA}ed into the dotted pair, and {fn HASHOVERFLOW} returns it. }{End SubSec Hash Overflow} }{End SubSec Hash Arrays}