{Begin SubSec Data Types}
{Title Data Types}
{Text

{index BEGIN data-types}


{note do we want to change this list of data types?}


The data-types{index *PRIMARY* data-types} of Interlisp are
lists,{index lists}
atoms,{index *PRIMARY* atoms}
pnames,{index pnames}
value cells,
arrays,{index arrays}
large and small integers,{index large integers}{index small integers}
floating point numbers,{index floating point numbers}
stack pointers,
string characters{index string characters}
and
string pointers.{index string pointers}
There is also a way to define new data-types.
Compiled code, read tables, terminal tables, and hash arrays{index hash arrays} are currently included with arrays.{true?}



In the descriptions of the various data-types given below, the input syntax and output format of each data-type are described:  what input sequence will cause the Interlisp read program to construct an element of that type, and how the Interlisp print program will print such an element.  The functions that construct elements of that data-type are also given.  Note that some data-types (e.g. arrays) cannot be input; they can only be constructed.



{Begin SubSec Literal Atoms}
{Title Literal Atoms}
{Text

{index *PRIMARY* literal atoms}

A literal atom is a datum consisting of the following components:  a property list, a value, a function definition, and a pname (not directly accessible).

A literal atom is input as any string of non-delimiting characters that cannot be interpreted as a number.
The syntactic characters that delimit atoms are called separator or break characters ({SectionRef L!SeparatorCharacters}) and normally are
space,{index space}
end-of-line,{foot
An end-of-line{index end-of-line} character is transmitted by {index TENEX  NIL}TENEX when it sees a carriage-return.{index carriage-return}
}{comment endfootnote}{index end-of-line}
line-feed,{index line-feed}
{lisp %}{index % (escape character)},
{lisp (}{index (} (left paren),
{lisp )}{index )} (right paren),
{lisp "}{index "} (double quote),
{lisp [}{index [ (use in input)} (left bracket),
and {lisp ]} (right bracket).{index ]}
However, these characters may be included in atoms by preceding them with the escape character{index escape character} {lisp %}.

Literal atoms are printed by {fn PRINT} and {fn PRIN2} as a sequence of characters with {lisp %}s inserted before all delimiting characters (so that the atom will read back in properly).  Literal atoms are printed by {fn PRIN1} as a sequence of characters without these extra {lisp %}s.
For example, the atom consisting of the five characters {lisp A}, {lisp B}, {lisp C}, {lisp (}, and {lisp D} will be printed as {lisp ABC%(D} by {fn PRINT} and {lisp ABC(D} by {fn PRIN1}.
The extra {lisp %}s are an artifact of the print program; they are not stored in the atom's pname.{index pnames}

Literal atoms can be constructed by {fn PACK}, {fn MKATOM}, and {fn GENSYM} (which uses {fn MKATOM}).

Literal atoms{index literal atoms} are unique.  In other words, if two literal atoms have the same pname, i.e. print the same, they will {it always} be the same atom, that is, they will always have the same address in memory, or equivalently, they will always be {lisp EQ}.{foot
Note that this is {it not} true for strings, large integers, floating point numbers, and lists, i.e. they all can print the same without being {lisp EQ}.
}{comment endfootnote}
Thus if {fn PACK} or {fn MKATOM} is given a list of characters corresponding to a literal atom that already exists, it returns a pointer to that atom, and does {it not} make a new atom.  Similarly, if the read program is given as input a sequence of characters for which an atom already exists, it returns a pointer to that atom.
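
As an illustrative sketch of this uniqueness (the atom {lisp ABC} is an arbitrary example, and the last line assumes that the two strings are read separately, so that each read constructs a new string):

{lispcode
(EQ (QUOTE ABC) (PACK (LIST (QUOTE A) (QUOTE B) (QUOTE C))))   = T
(EQ (QUOTE ABC) (MKATOM "ABC"))                                = T
(EQ "ABC" "ABC")                                               = NIL}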

}{End SubSec Literal Atoms}



{Begin SubSec Pnames}
{Title Pnames}
{Text

{index *PRIMARY* pnames}

{note should we include this as a data type here?  If so, we should explain it more clearly.}

The pnames of atoms{foot
All Interlisp pointers have pnames, since we define a pname simply to be how that pointer is printed.  However, only literal atoms and strings have their pnames explicitly stored.  Thus, the use of the term pname in a discussion of data-types or storage allocation means pnames of atoms or strings, and refers to a sequence of characters stored in a certain part of Interlisp's memory.
}{comment endfootnote}
comprise another data-type with storage assigned as it is needed.
This data-type only occurs as a component of an atom or a string.
It does not appear, for example, as an element of a list.

{note max length of pnames?}

}{End SubSec Pnames}


{Begin SubSec Value Cells}
{Title Value Cells}
{Text

{index value cells}

The value cells of atoms comprise another data-type.  Value cells occur only as part of an atom.  If an atom is ever referenced as a variable, either by {lisp SETQ}, or by use as the argument of a function, or bound as a {lisp PROG} variable, a value cell is created and assigned to the atom.  The value cell will then always contain the current value of the atom.  Note that an atom having no value cell is indistinguishable from an atom whose value cell contains the atom {atom NOBIND}.  The latter case could occur if the user types {lisp SET(FOO {arg EXP})}, then undoes ({SectionRef FnDef UNDO}) the set operation, thereby restoring the atom to its original "value".


Value cells have no input syntax or output format as they cannot be directly referenced by user programs.

}{End SubSec Value Cells}


{Begin SubSec Numeric Atoms}
{Title Numeric Atoms}
{Text

Numerical atoms, or simply numbers, do not have property lists, value cells, function definition cells, or explicit pnames.
There are currently two types of numbers in Interlisp: integers, and floating point numbers.


{Begin SubSec Integers}
{Title Integers}
{Text

{index *PRIMARY* integers}

The input syntax for an integer is an optional sign ({lisp +} or {lisp -}) followed by a sequence of digits, followed by an optional {lisp Q}, and terminated by a delimiting character.{foot
Note that some data-types are self-delimiting, e.g. lists.
}{comment endfootnote}
{index Q (following a number)}If the {lisp Q} is present, the digits are interpreted in octal,{index octal} otherwise in decimal, e.g. {lisp 77Q} and {lisp 63} both correspond to the same integer, and in fact are indistinguishable internally since no record is kept of how integers were created.

The setting of {fn RADIX} ({SectionRef FnDef RADIX}) determines how integers are printed: signed or unsigned, octal or decimal.

Integers are created by {fn PACK} and {fn MKATOM} when given a sequence of characters observing the above syntax, e.g.
{lisp (PACK (LIST 1 2 (QUOTE Q)))} = {lisp 10}.
Integers are also created as a result of arithmetic operations, as described in {SectionRef L!ArithmeticOperations}.

{note is the following true?}

An integer is stored in one 36 bit word; thus its magnitude must be less than 2{super 35}.{foot
If the sequence of digits used to create the integer is too large, the high order portion is discarded.  The handling of overflow as a result of arithmetic operations is discussed in {SectionRef L!OverFlow}.
}{comment endfootnote}
To avoid having to store (and hence garbage collect) the values of small integers,{index *PRIMARY* small integers} a few pages of address space, overlapping the Interlisp-10 machine language code, are reserved for their representation.
The small number pointer {it itself}, minus a constant, is the value of the number.
Currently the range of "small" integers is -1536 thru +1535.{note true}
The predicate {fn SMALLP} is used to test whether an integer is "small".

While small integers have a unique representation, large integers{index *PRIMARY* large integers} do not.  In other words, two large integers may have the same value, but not the same address in memory, and therefore not be {fn EQ}.  For this reason the function {fn IEQP} (or {fn EQUAL}) should be used to test equality of large integers.
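
The following sketch illustrates the distinction.  It assumes the range of small integers given above, that integers are printed in decimal, and that the two occurrences of {lisp 100000} are read separately, so that each is stored as a distinct large integer:

{lispcode
(EQ 77Q 63)                      = T
(PACK (LIST 7 7 (QUOTE Q)))      = 63
(EQ 100000 100000)               = NIL
(IEQP 100000 100000)             = T}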

}{End SubSec Integers}


{Begin SubSec Floating Point Numbers}
{Title Floating Point Numbers}
{Text

{index *PRIMARY* floating point numbers}

A floating point number is input as a signed integer, followed by a decimal point, followed by another sequence of digits called the fraction, followed by an exponent
(represented by {lisp E}{index *PRIMARY* E (in a floating point number)} followed by a signed integer) and terminated by a delimiter.

Both signs are optional, and either the fraction following the decimal point, or the integer preceding the decimal point may be omitted.{index . (in a floating point number)}  One or the other of the decimal point or exponent may also be omitted, but at least one of them must be present to distinguish a floating point number from an integer.  For example, the following will be recognized as floating point numbers:

{lispcode
5.      5.00    5.01    .3      5E2     5.1E2
        5E-3    -5.2E+6}


Floating point numbers are printed using the format control specified by the function {fn FLTFMT} ({SectionRef FnDef FLTFMT}).{foot Interlisp-10 calls the TENEX{index TENEX} floating point number to string conversion routines.  Additional information concerning these conversions may be obtained from the
TENEX JSYS Manual.
}{comment endfootnote}
{fn FLTFMT} is initialized to {lisp T}, or free format.
For example, the above floating point numbers would be printed free format as:

{lispcode
5.0     5.0     5.01    .3      500.0   510.0
        .005    -5.2E6}


The {fn PRINTNUM} package ({SectionRef FnDef PRINTNUM}) permits greater control over the printed appearance of floating point numbers, allowing such things as left-justification, suppression of trailing decimals, etc.


Floating point numbers are also created by {fn PACK} and {fn MKATOM}, and as a result of arithmetic operations as described in {SectionRef L!ArithmeticOperations}.

A floating point number is stored in one 36 bit word in standard PDP-10 format.
The range is {plusminus}2.94E-39 thru {plusminus}1.69E38 (or 2{super -128} thru 2{super 127}).{true?}

}{End SubSec Floating Point Numbers}


}{End SubSec Numeric Atoms}



{Begin SubSec Lists}
{Title Lists}
{Text

{index *PRIMARY* lists}

The input syntax for a list is a sequence (at least one) of Interlisp data elements, e.g. literal atoms, numbers, other lists, etc., enclosed in parentheses or brackets.
(Note that {lisp ()}{index ()} is read as the atom {atom NIL}.)
A bracket can be used to terminate several lists, e.g. {lisp (A (B (C]}, as described in {SectionRef L!ClosingBracket}.

If there are two or more elements in a list, the final element can be preceded by a {lisp .}{index .} (delimited on both sides), indicating that {fn CDR} of the final node in the list is to be the element immediately following the {lisp .}, e.g. {lisp (A . B)} or {lisp (A B C . D)}.  Otherwise, {fn CDR} of the last node in a list will be {lisp NIL}.  Note that a list does {it not} have to end in {atom NIL}; it is simply a structure composed of one or more {lisp CONS} cells.
The input sequence {lisp (A B C . NIL)} is equivalent to {lisp (A B C)}, and {lisp (A B . (C D))} is equivalent to {lisp (A B C D)}.
Note however that {lisp (A B . C D)} will create a list containing the five literal atoms {lisp A}, {lisp B}, {lisp %.}, {lisp C}, and {lisp D}.

Lists are constructed by the primitive functions {fn CONS} and {fn LIST}.

Lists are printed by printing a left parenthesis, and then printing the first element of the list, then printing a space, then printing the second element, etc. until the final node is reached.  The individual elements of a list are printed using {fn PRIN2} if the list is being printed by {fn PRINT} or {fn PRIN2}, and by {fn PRIN1} if the list is being printed by {fn PRIN1}.
Lists are considered to terminate when {fn CDR} of some node is not a list.
If {fn CDR} of this terminal node is {atom NIL} (the usual case),
{fn CAR} of the terminal node is printed followed by a right parenthesis.
If {fn CDR} of the terminal node is {it not} {atom NIL}, {fn CAR} of the terminal node is printed, followed by a space, a period, another space, {fn CDR} of the terminal node, and then the right parenthesis.
Note that a list input as {lisp (A B C . NIL)} will print as {lisp (A B C)}, and a list input as {lisp (A B . (C D))} will print as {lisp (A B C D)}.
Note also that {fn PRINTLEVEL} affects the printing of lists, and that carriage returns may be inserted where dictated by {fn LINELENGTH}, as described in {SectionRef FnDef LINELENGTH}.
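
The construction and printing rules above can be illustrated by the following sketch, in which the value of each form is shown as it would be printed (the atoms {lisp A} thru {lisp D} are arbitrary examples):

{lispcode
(CONS (QUOTE A) (QUOTE B))                        = (A . B)
(CONS (QUOTE A) (CONS (QUOTE B) NIL))             = (A B)
(LIST (QUOTE A) (QUOTE B) (QUOTE C))              = (A B C)
(CDR (QUOTE (A B C . D)))                         = (B C . D)
(EQUAL (QUOTE (A B . (C D))) (QUOTE (A B C D)))   = T}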

}{End SubSec Lists}




{Begin SubSec Arrays}
{Title Arrays}
{Text

{index *PRIMARY* arrays}

An array in Interlisp is a one dimensional block of contiguous storage of arbitrary length.  Arrays do not have input syntax; they can only be created by the function {fn ARRAY}.
Arrays are printed by {fn PRINT}, {fn PRIN2}, and {fn PRIN1} as {lisp #}{index # (followed by a number)} followed by the address of the array pointer{index array pointer} in octal.{index octal}  Array elements can be referenced by the functions {fn ELT} and {fn ELTD}, and set by the functions
{fn SETA} and {fn SETD}, as described in {SectionRef FnDef ELT}.{foot
In Interlisp-10, arrays are partitioned into four sections: a header,{index *PRIMARY* array header} a section containing unboxed numbers,{index *PRIMARY* unboxed numbers (in arrays)} a section containing Interlisp pointers, and a section containing relocation information.{index relocation information (in arrays)}  The last three sections can each be of arbitrary length (including 0); the header is two words long and contains the lengths of the other sections.  The unboxed number region of an array is used to store 36 bit quantities that are not Interlisp pointers (e.g. machine instructions), and are therefore not to be traced during garbage collection.  The relocation information is used when the array contains the definition of a compiled function, and specifies which locations in the {it unboxed} region of the array must be changed if the array is moved during a garbage collection.
}{comment endfootnote}
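
A minimal sketch of creating and using an array follows.  The variable {lisp MYARRAY} is a hypothetical name, and the sketch assumes that {fn ARRAY} called with only a size returns an array of that many pointer cells, numbered from 1; see {SectionRef FnDef ELT} for the actual argument conventions.

{lispcode
(SETQ MYARRAY (ARRAY 5))
(SETA MYARRAY 1 (QUOTE FOO))
(ELT MYARRAY 1)                  = FOO}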


}{End SubSec Arrays}



{Begin SubSec Strings}
{Title Strings}
{Text

{index *PRIMARY* strings}

The input syntax for a string is a double quote ({lisp "}),{index *PRIMARY* "} followed by a sequence of any characters except double quote and {lisp %}, terminated by a double quote.  {lisp %} and double quote characters may be included in a string by preceding them with the escape character {lisp %}.

Strings are printed by {fn PRINT} and {fn PRIN2} with initial and final double quotes, and {lisp %}s{index % (escape character)} inserted where necessary for the string to read back in properly.
Strings are printed by {fn PRIN1} without the delimiting double quotes and extra {lisp %}s.

Strings are created by {fn MKSTRING}, {fn SUBSTRING}, and {fn CONCAT}.

{note the following is still true?}
{note is there a better term than "string characters"}

Internally a string is stored in two parts: a string pointer and the sequence of
characters.  The Interlisp pointer to a string is the address of the string pointer.  The string pointer, in turn, contains the character position at which the string
characters{index string characters} begin, and the number of characters.  String pointers{index *PRIMARY* string pointers} and string characters{index *PRIMARY* string characters} are two separate data-types,{foot
String characters are not directly accessible by user programs.
}{comment endfootnote}
and several string pointers may reference the same characters.
This method of storing strings permits the creation of a substring by creating a
new string pointer, thus avoiding copying of the characters.
For more details, see {SectionRef L!Strings}.
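
As an illustrative sketch of the functions and print conventions described above (the strings are arbitrary examples; the values returned by the print functions are not shown):

{lispcode
(CONCAT "ABC" "DEF")             = "ABCDEF"
(SUBSTRING "ABCDEF" 2 4)         = "BCD"
(MKSTRING (QUOTE FOO))           = "FOO"
(PRIN1 "A%"B")              prints   A"B
(PRIN2 "A%"B")              prints   "A%"B"}

Per the storage scheme described above, the value of {fn SUBSTRING} here would be a new string pointer referencing characters of the original string, rather than a copy of them.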


{note max length of string?}


}{End SubSec Strings}



{index END data-types}


}{End SubSec Data Types}