Page Numbers: Yes X: 306 Y: 1.0" First Page: 25
Margins: Top: 1.0" Bottom: 1.3"
Heading:
STRUCTURE & NOTATION 3-LISP REFERENCE MANUAL March 18, 1984
——————————————————−
3. 3-LISP Structures and Notation
——————————————————−
3-LISP is based on a serial model of computation, consisting of a topology or graph of structures collectively called a structural field, examined and manipulated by a single active processor. This section describes the elements of the structural field, and the notation used to present them.
3.a. Structural Field
Objects in the structural field are called internal structures or, when it is not confusing, just structures; mathematical entities like numbers and sequences are called external structures or abstractions (but never just structures). It is probably misleading to think of internal structures either as syntactic expressions or as mathematical abstractions; they fall somewhere between those two in terms of concreteness. The world consists of objects of all sorts, including both internal and external structures, and undoubtedly many other things as well.
There are at least eleven types (kinds, categories) of internal structures that populate the structural field. This immutable property of each structure may be interrogated with the standard procedure TYPE. The standard procedure = can be used to test whether two structures are (is?) one and the same.
TypeDesignationNormalConstructorStandard Notation
NumeralsNumbersYesa sequence of digits
Booleans
Truth-ValuesYes$TRUE or $FALSE
Charats
CharactersYes#character
Stringer
StringsYes"a sequence of characters"
Streamers
StreamsYes{streamer}
Handles
Internal StructuresYesEXP
Env-DesignatorsEnvironmentsYesECONS{environment ... }
ClosuresFunctionsYesCCONS{closure ... }
Atoms
(Designation of Binding)NoACONS a sequence of alphanumerics
Pairs
(Value of Application)NoPCONS(EXP . EXP)
Rails
SequencesSomeRCONS[EXP EXP ... EXP]
A structure is said to be in normal form if it has three properties: it is stable (meaning that it normalises to itself in all contexts), it is side-effect free (meaning that normalising it will produce no processor, field, or external side effects), and context-independent (meaning that its semantics, procedural and declarative, are independent of context). A normal-form structure S1 is canonical if it is the unique normal-form designator of its designation. The claim on the 3-LISP processor is that, given a structure S1, it will yield a normal-form co-designating structure S2. Because of the stability of normal-form structures, it follows that the processor is idempotent.
Of the eleven structural types, eight and a half are normal-form structures. Also, all six of the non-constructible (i.e., permanent) structure types are canonical.
Each of these eleven structure types can be briefly described:
Numerals: There are an infinite number of 3-LISP integer numerals, set in one-to-one correspondence to the abstract external numbers (ultimately we intend to support full rational or repeating fraction arithmetic, but at the moment only integers are defined). All numerals are canonical normal-form designators of numbers.
Booleans: There are just two boolean structures, notated as ‘$TRUE’ and ‘$FALSE’, that are constants (rigid designators) of Truth and Falsity, respectively. These normal-form structures may be viewed as the canonical true and false statements.
Charats: We do not claim to know what characters are, but charats are their normal-form designators. More precisely, a charat is an atomic structure associated one-to-one with character types (in the linguists’ sense) — i.e., there is only one charat for the character ‘+’, although there of course may be an arbitrary number of tokens (occurences) of that character.
Stringers: Stringers, often used but rarely mentioned as such, are canonical, normal-form designators of strings, associated with the string type (in the linguists’ sense) — e.g., there is only one stringer designating the string tokens of which would be printed ‘This is a 29 character string’.
Streamers: Streams are intended to serve as the interface with the outside world (i.e., to function essentially as communication channels); as a consequence we say virtually nothing about them, other than that they are legimitate arguments for reading and printing (i.e., by such procedures as CHAR-IN, READ, PRESENT, etc.). Streamers are their normal-form designators. Note that no field relationships are defined over streamers.
Handles: Handles are unique normal-form designators of other internal structures — they are the 3-LISP field’s canonical form of structure quotation (but not of lexical quotation — see section 3.b). Thus for the atom X there is a single handle, written ’X. All 3-LISP structures have handles (including handles themselves; thus the handle of the handle of the atom X is written ’’X).
Environment-Designators: Environment-designators, often used but rarely mentioned as such, designate environments — mappings from atoms to normal-form bindings.
Closures: Closures are normal-form function designators; because we have no adequate theory of procedural intension, they retain all the relevant contextual information from the point of function definition (expression and enclosing environment). Although closures, being first-class structures, can be inspected and compared, closure identity is far more fine grained than function identity.
Atoms: As in standard LISPs, atoms are atomic structures used as variables (schematic names). Atoms are associated with identifiers (lexical spellings) by the INTERNALISE and EXTERNALISE procedures — see ATOM-NOTATION and ATOM-NOTATED.
Pairs: Pairs (as in LISP 1.5) are ordered pairs, consisting of a first and second element (which may in turn be any structure in the field). Unlike standard LISPs, however, 3-LISP pairs are used for only one purpose: to encode function applications (a pair is therefore sometimes called a redex, for ‘reducible expression’).
Rails: Rails, derivative from standard LISP’s lists, are used to designate abstract sequences. Like the lists of LISP 1.5, isomorphic rails may be distinct. Those rails all of whose elements are in normal-form are by definition themselves in normal-form; thus the rail [1 2] is in normal-form, whereas the rail [1 (+ 1 1)] is not.
The nine first-order locality relationships defined over internal structures are summarized in the following table:
NameTypeTotal AccessibleStandard Procedure
PPROCPairs StructuresYesYesNoPPROC
PARGSPairs StructuresYesYesNoPARGS
FIRSTRails StructuresNoYesNoFIRST
RESTRails RailsNoYesNoREST
PROC-TYPEClosures AtomsYesYesNoPROCEDURE-TYPE
ENVClosures RailsYesYesNoCLOSURE-ENVIRONMENT
PATTERNClosures StructuresYesYesNoPATTERN
BODYClosures StructuresYesYesNoBODY
DENOTATIONHandles StructuresYesYesYesDOWN ()
All of these relations are total functions, with the exception of FIRST and REST, which are partial, being undefined together on empty rails. DENOTATION is one-to-one and onto; DENOTATION-1 is a therefore total function on structures, called HANDLE, a subset of the function designated by the standard procedure UP ().
Some structures — all numerals, charats, booleans, streamers, and their handles — are permanent members of any structural field configuration. Others — pairs, rails, atoms, and closures — can be brought into existence and connected to existing structures through the use of one of the primitive constructors. For example, the standard procedures called PCONS creates a new pair and establishes a PPROC and PARGS relationship between this pair and the two structures given to PCONS as arguments.
A structure X is accessible from structure Y if X can be reached from Y through a series of FIRST, REST, PPROC, etc. connections. In addition, the handles of all structures are accessible from their designations. When a so-called ‘new’ structure is generated (by CONS, RCONS, ACONS, CCONS, or PCONS) it is guaranteed to be otherwise inaccessible, meaning that it cannot be accessed from any other accessible structure. A rail is considered to be completely inaccessible if it and all of its tails (i.e., rails reachable via one or more applications of REST) are inaccessible. Thus RCONS returns an otherwise completely inaccessible rail, whereas CONS returns an inaccessible, but not completly inaccessible rail.
Once created, a structure will remain a part of the structural field permanently.
3.b. Standard Notation
The 3-LISP internalisation function (the notational interpretation function Q that maps notations into internal structures) is not, strictly speaking, a primitive part of the language definition, since it is not used in internal processing (i.e., discarding it will not topple the tower). There is, however, a standard notation that is used in all documentation (including this reference guide) and provided with 3-LISP upon initialisation. (A user may, however, completely replace it with his/her own version, if desired). This section explains that notation.
The lexical notation is designed to satisfy three goals:
1.In so far as possible, to resemble standard LISP notational practice;
2.To maintain category alignment with the field (one lexical type per structural type);
3.To be convenient.
The goal of category alignment is met by having the standard notation for each type be identifiable in the first character (except for "notational escapes", described below), as indicated in the following chart:
TypeLeading CharacterExamples
Numeraldigit0, 1, -24, +100, 007
Atom
letterA, REDUCE, CAR, ATOM
Boolean
$$T, $F
Pair
((A . B), (PLUS 2 3)
Rail
[[1 2 3], []
Handle
’A, ’(+ 2 3), ’’[]
Charat
##A, #/
Stringer
""This is a string"
Other{{closure},{streamer}
Examples: ‘(A.1)’ notates a pair whose PPROC is the atom notated ‘A and whose PARGS is the numeral notated ‘1’; ‘[]’ notates an empty rail; ‘[123456]’ notates a rail whose FIRST and REST would be notated ‘1’ and ‘[23456]’, respectively; ‘’100’ notates the handle for the numeral that would be notated ‘100’.
Some subtleties complicate this clean correspondence. Specifically:
1.Numerals can have a leading ‘+’ or ‘-’ (i.e., ‘-24’).
2.An atom label may begin with a digit (or sign) providing it contains at least one non-digit (i.e., ‘6N237E’, ‘-X’ and ‘1+’ are valid atom labels). Any atom label that also satisfies the rules for numeral tokens will be taken to be the latter. For example, ‘1-’ and ‘+’ notate atoms, whereas ‘+1’ notates a numeral.
3.Left brace (‘{’) is used as a general notational escape, not only for closures and streamers, but also for un-labelled atoms, errors, and other notational commentary. This notation is employed only on output.
4.Case is ignored in atom labels (converted to upper case on input) For example, ‘Koyannisqatsi’, ‘KOYYANNISQATSI, ‘koyannisqatsi’, and ‘KoYaNnIsQaTsI’ all notate the same atom.
5.A single quote mark (‘’) is used to notate general structural quotations, which include not only the handles but also more complex cases; see section 3.b.??.
6.Some lexical abbreviations (so-called notational sugar) are supported:
(exp1 exp2 ... expk)abbreviates(exp1 . [exp2 ... expk])
‘"char1 char2 ... chark"’abbreviates’[#char1 #char2 ... #chark]
expabbreviates’(UP exp)
exp’’abbreviates’(DOWN exp)
Any notational fragment beginning with a comma (‘;) up to the next carriage-return is treated as a lexical comment, ignored by the internalisation routines.
The following grammar, for those who like such things, presents the essence of the standard 3-LISP notation:
Extended BNF Grammar for 3-LISP Standard Notation
1.Expression::=Regular | Abbreviation | Escape
2.Regular::=Numeral | Boolean | Charat | Stringer | Atom | Pair | Rail
Numeral
::=[Sign-character] Digit-character+
Boolean
::=$TRUE’ | ‘$FALSE’ | ‘$t’ | ‘$f
Charat
::=#’ Any-character
Stringer
::="’ String-character*"
Atom
::=Digit-character* Atom-character+
Pair
::=(’ Expression ‘.’ Expression ‘)
Rail
::=[’ Expression*]
3.Abbreviation::=Up | Down | Extended-pair | String | Quotation | Comma
Up
::=’ Expression
Down
::=’ Expression
Extended-pair
::=(’ Expression+)
Quotation
::=’ Expression
Comma
::=,’ Expression
4.Escape::={Unspecified information}
5.Sign-character::=+’ | ‘-
Digit-character
::=0’ | ‘1’ | ‘2’ | ‘3’ | ‘4’ | ‘5’ | ‘6’ | ‘7’ | ‘8’ | ‘9
Any-character
::=Any character in the character set: i.e., carriage-return, space, or
any of:ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
1234567890!@#$%~&*-=+\|<>()[]{}←↑;:’",./?
Atom-character::=Any character except Space, carriage-return,$’, ‘#’, ‘"’, ‘;’, or Special
String-character
::=Any character except"and%’ (see note)
Normal-character
::=Any character except % and carriage-return (see note)
Special
::=(’ | ‘)’ | ‘[’ | ‘]’ | ‘’ | ‘’ | ‘.’ | ‘,’ | ‘{’ | ‘}’ | ‘’ | ‘%
6.Token-sequence::=(Separator* Token Separator*)*
Separator
::=Space-character | Carriage-Return | Comment
Comment
::=;’ Normal-character* Carriage-Return
Note: A percent sign (‘%’) may be used as a character-level quote to insert double quotes and percent signs into strings; thus the notation for the stringer designating the string ‘Is "The 7% Solution" playing?’ would be ‘"Is %"The 7%% Solution%" playing?"’. Similarly, ‘%%’ may be used to insert percent signs into atom names or comments.
In the standard notation, structures are notated with sequences of lexical tokens, each of which is composed of a sequence of one or more characters chosen from a collection of characters called the character set, listed above.
Sequences of characters are broken down into tokens in the conventional way, with the rule that there must always be at least one token separator between adjacent non-special tokens. For example, the character stream ‘(foo[1$T↑#x’’100]) consists of the ten tokens: ‘(’, ‘foo’, ‘[’, ‘1’, ‘$T’, ‘’, ‘#x’, ‘’’100’ ‘]’, and ‘)’.
Special tokens do not notate structures by themselves; rather, they are used to punctuate the notation for composite structures.
For convenience, the following table lists all "special" (i.e., non alpha-numeric) characters that are used for some special purpose. Note that the standard notation uses one character (down-arrow: ‘’) that is not part of the standard ASCII character set, but we reserve ASCII backslash (‘\’) so that it can be used in its stead. We assume, in other words, that the 3-LISP standard notation is indeed based on the standard ASCII sequence, but simply choose to print backslash as a down-arrow.
Code, Character, and UseCode, Character, and Use
(starts pairsabbreviation for UP
)ends pairsabbreviation for DOWN (same as ‘\’)
.separates CAR and CDR%character quotation operator
[starts rails,normalised expression within a quote
]ends rails;starts comments (to CRLF)
handles-starts negative numerals
#charats+starts some positive numerals
$booleans ($T and $F){starts notational escapes
"starts and ends strings}ends notational escapes
A general comment: The internalization function described above is not "onto"; there are structures that are not the result of internalising any lexical expression, for two reasons. First, upon internalisation, all pairs and rails notated are created from previously-inaccessible structures. Hence, any structure with a shared sub-structure will have no lexical counterpart. Second, closures and un-named atoms (those created by ACONS) have no standard lexical counterparts. The standard version of EXTERNALISE, approximately the inverse of INTERNALISE, makes no attempt to deal in a sophisticated way with either of these problems. In particular, no attempt is made to show shared substructure, and un-notatable structures — closures, nameless atoms, and circular structures — are marked with a standard lexical escape: a note enclosed in braces (e.g., ‘{closure}’, ‘{streamer}’, etc.).
On quotation: Simple quoted expressions (specifically, those not containing commas) notate handles: ‘expression, if expression contains no commas, notates the handle for the structure notated by expression. Comma’ed expressions may be used within the scope of the quote, however, allowing one to conveniently notate expressions for constructing structures that resemble the structure that would be notated by the given lexical expression, except that certain parts are substituted in. I.e., the comma notation does is to give you a fill-in-the-blanks-like capability within quotations. These so-called complex quotations are particularly useful when defining macros. (The idea is based on the back-quote feature, borrowed from MACLISP [Moon 74], and is reminiscent of Quine’s "corner quotes" [Quine ??].)
In order to understand this feature, note that ‘’(A.B)’ is roughly equivalent to ‘(PCONS’A’B)’; the latter notates a structure that normalises to the structure that would be notated by the first. Trading on this correspondence, the notation ‘’(A.,X)’ is lexical shorthand for ‘(PCONS’AX)’, which notates a structure that would normalize to a structure that would be notated ‘(A.HELLO)’ in an environment where X was bound to ’HELLO. Assume in the following examples that B is bound to ’2 and C to ’’3:
Notationabbreviatesand normalises to
’1’1’1
’(A . ,B)
(PCONS ’A B)’(A . 2)
’[A ,B C]
(RCONS ’A B ’C)’[A 2 C]
’[[A B][,C D]]
(RCONS ’[A B] (RCONS C ’D))’[[A B][’3 D]]
’’[A ,B ,,C]
(PCONS (’RCONS’[’A B ,C]))’(RCONS [’A B ’3]) (i.e.,’[A ,B 3])
To express the workings of this mechanism precisely requires a little care, since both notation and designation must be spoken of explicitly. It can be summarised as follows:
Lexical Quotation Principle: A lexical expression E1 preceded by a quote will notate a structure S1 that designates a structure S2 that would be notated by E1, with the exception that those fragments of S2 that would be notated by portions of E1 that are preceded by a comma will in fact be designated by the structures that those portions notate, rather than notated by them directly.
An intensional note: the quotation expander will not use a token tail of a rail if any part of that rail has a comma’ed expression within it. Specifically, we have:
1> (DEFINE TEST
(LAMBDA SIMPLE [A]
’[,A 2 3]))
=>
TEST
1> (LET [[X (TEST ’1)]
[Y (TEST ’2)]]
(= (REST X) (REST Y)))
=>
$FALSE
since ’[,A23] abbreviates (RCONSA’2’3), not (CONSA’[2 3]).
See the Appendix A definitions of DEFINE, LET, LETSEQ, and other macros for futher examples.