{Begin SubSec Hash Arrays}
{Title Hash Arrays}
{Text
{Note Interlisp-D doesn't GC any hash keys. }
{Note Interlisp-Vax doesn't throw away any hash keys}
{Note Will the garbage collector ever delete entries from a hash table?
Personal opinion: I think that the GC should NOT delete entries, and that you should always be able to retrieve everything you put in a hash array using MAPHASH ---mjs}
{Note Exactly how are hash arrays different from arrays? Do we want to document this here? Personal opinion: we want to treat hash arrays as a totally different data type ---mjs}
{Tag HashArrays}
{index *PRIMARY* hash arrays}
{index *PRIMARY* hash keys}
{index *PRIMARY* hash values}
Hash arrays provide a mechanism for associating arbitrary lisp objects ("hash keys") with other objects ("hash values"), such that the hash value associated with a particular hash key can be quickly obtained. A set of associations could be represented as a list or array of pairs, but these schemes are very inefficient when the number of associations is large. There are functions for creating hash arrays, putting a hash key/value pair in a hash array, and quickly retrieving the hash value associated with a given hash key.
Hash keys can be any lisp object, but is should be noted that the hash array functions use {fn EQ} for comparing hash keys. Therefore, if non-atoms are used as hash keys, the exact same object (not a copy) must be used to retrieve the hash value.
{note paragraph about GC of non-atomic hash keys (done in I-10, not in I-D)?}
In the description of the functions below, the argument {arg HARRAY} has one of three forms:
{lisp NIL}, in which case a hash array provided by the system, {index SYSHASHARRAY Var}{var SYSHASHARRAY}, is used;
a hash-array created by the function {fn HARRAY};
or a list, {fn CAR} of which is a hash array.
The latter form is used for specifying what is to be done on overflow, as described below.
{Note What does 'size' MEAN?? Is size='max number hash links', or 2 times that?
vaguely, size=# of "cells" of storage ---lmm
do we want to doc this??}
{FnDef {FnName HARRAY} {FnArgs LEN}
{Text
Creates a hash array containing at least {arg LEN} hash keys.
}}
{FnDef {FnName HARRAYSIZE} {FnArgs HARRAY}
{Text
Returns the size of {arg HARRAY}; the number of hash keys it can hold before becoming "full".
}}
{FnDef {FnName CLRHASH} {FnArgs HARRAY}
{Text
Clears all hash keys/values from {arg HARRAY}.
Returns {arg HARRAY}.
}}
{FnDef {FnName PUTHASH} {FnArgs KEY VAL HARRAY}
{Text
Associates the hash value {arg VAL} with the hash key {arg KEY} in {arg HARRAY}. Replaces the previous hash value, if any. If {arg VAL} is {lisp NIL}, any old association is removed (hence a hash value{index hash value} of {lisp NIL} is not allowed). If {arg HARRAY} is full when {fn PUTHASH} is called with a key not already in the hash array, the function {fn HASHOVERFLOW} is called, and the {fn PUTHASH} is done to the value returned (see below). Returns {arg VAL}.
}}
{FnDef {FnName GETHASH} {FnArgs KEY HARRAY}
{Text
Returns the hash value associated with the hash key {arg KEY} in {arg HARRAY}. Returns {lisp NIL}, if {arg KEY} is not found.
}}
{FnDef {FnName REHASH} {FnArgs OLDHARRAY NEWHARRAY}
{Text
Hashes all hash keys and values in {arg OLDHARRAY} into {arg NEWHARRAY}. The two hash arrays do not have to be (and usually aren't) the same size. Returns {arg NEWHARRAY}.
{note in Interlisp-Vax, if OLDHARRAY=NEWHARRAY, rehashes OLDHARRAY "in place".}
}}
{FnDef {FnName MAPHASH} {FnArgs HARRAY MAPHFN}
{Text
{arg MAPHFN} is a function of two arguments.
For each hash key in {arg HARRAY}, {arg MAPHFN} will be
applied to (1) the hash value, and (2) the hash key. For example,
{lispcode
[MAPHASH A
(FUNCTION (LAMBDA (VAL KEY)
(if (LISTP KEY) then (PRINT VAL)]}
will print the hash value for all hash keys that are lists. {fn MAPHASH} returns {arg HARRAY}.
}}
{FnDef {FnName DMPHASH} {FnArgs HARRAY{SUB 1} HARRAY{SUB 2} {ellipsis} HARRAY{SUB N}}
{Type NOSPREAD NLAMBDA}
{Text
Prints on the primary output file {fn LOAD}able forms which will restore the hash-arrays contained as the values of the atoms
{arg HARRAY{SUB 1}}, {arg HARRAY{SUB 2}},
{ellipsis} {arg HARRAY{SUB N}}.
Example: {lisp (DMPHASH SYSHASHARRAY)} will dump the system hash-array.
Note: all {fn EQ} identities except atoms and small integers are lost by dumping and loading because {fn READ} will create new structure for each item.
Thus if two lists contain an {fn EQ} substructure, when they are dumped and loaded back in, the corresponding substructures while {fn EQUAL} are no longer {fn EQ}.
The {lisp HORRIBLEVARS} file package command ({PageRef FileCom HORRIBLEVARS}) provides a way of dumping hash tables such that these identities are preserved.
}}
{note use CIRCULARVARS or GENERALVARS instead of HORRIBLEVARS ---lmm}
{Begin SubSec Hash Overflow}
{Title Hash Overflow}
{Text
{index hash overflow}
{note who created this kludgy way of specifying overflow? Why not simply have the overflow action be determined by a property of an HARRAY? Then, you wouldn't have to pass around these dumb lists.}
When a hash array becomes full, attempting to add another hash key will cause the function {fn HASHOVERFLOW}{index HASHOVERFLOW Fn} to be called. This will either automatically enlarge the hash array, or cause the error {lisp HASH TABLE FULL}.{index HASH TABLE FULL Error} How hash overflow is handled is determined by the form that was passed to {fn PUTHASH}:
{Begin LabeledList the form that is passed to the hash function}
{Name {arg HARRAY}}
{Text
If a plain hash array is passed to a hash function, and it overflows, the error {lisp HASH ARRAY FULL}{index HASH ARRAY FULL Error} is generated.
}
{Name {lisp NIL}}
{Text
If a hash function is passed {lisp NIL} as its {arg HARRAY} argument, the system hash array {var SYSHASHARRAY}{index SYSHASHARRAY Var} is used. This array is not used by the system, but is provided for the user. If {var SYSHASHARRAY} overflows, it is automatically enlarged by 1.5.
}
{Name {lisp ({arg HARRAY} . {arg N})}}
{Text
{arg N} is a positive integer. This form specifies that upon hash overflow, a new hash-array is created with {arg N} more cells than the current hash-array.
}
{Name {lisp ({arg HARRAY} . {arg F})}}
{Text
{arg F} is a floating point number. This form specifies that upon hash overflow, the new hash array will be {arg F} times the size of the current hash-array.
}
{Name {lisp ({arg HARRAY} . {arg FN})}}
{Text
{arg FN} is a function name or a lambda expression. This form specifies that upon hash overflow, {arg FN} is called with {lisp ({arg HARRAY} . {arg FN})} as its argument. If {arg FN} returns a number, the number will be the size of the new hash array. Otherwise, the new size defaults to 1.5 times the size of the old hash array. {arg FN} could be used to print a message, or perform some monitor function.
}
{Name {lisp ({arg HARRAY})}}
{Text
Equivalent to {lisp ({arg HARRAY} . 1.5)}.
}
{End LabeledList the form that is passed to the hash function}
If a list form is used, upon hash overflow the new hash-array is {fn RPLACA}ed into the dotted pair, and {fn HASHOVERFLOW} returns it.
}{End SubSec Hash Overflow}
}{End SubSec Hash Arrays}