[phylum]<3-LISP>course>manual>primer.bravo!4

Page Numbers: Yes X: 306 Y: 0.9" First Page: 1 Not-on-first-page
Margins: Top: 1.1" Bottom: 1.2"
Heading:

3-LISP PRIMER LISP: LANGUAGE AND LITERATURE April 13, 1984
————————————————————————————————————————————

3-LISP PRIMER

Filed as:[phylum]<3-lisp>course>manual>primer.bravo
User.cm:[phylum]<BrianSmith>system>user.classic
Last edited:April 13, 1984 4:23 PM

————————————————————————————————————————————

This primer is a guided introduction for first time 3-LISP users. For detailed questions about 3-LISP, especially once you get started, you will want to use the 3-LISP manual. For questions about using 3-LISP on the Dandelions — how to "boot" the machine, how to use the editor and load files, etc. — you should refer to the 3-LISP System Guide.

A — Overview

A.1. A Declarative, Extensional View

Being at least in part a language, all 3-LISP expressions are taken as designating something (think of designating as a kind of naming). For example, the expressions 1, 100, and -6 designate the abstract numbers one, ten and negative six, respectively; $TRUE designates truth and $FALSE designates falsity; [] designates the empty sequence, and [1 2 3] the sequence whose elements are the numbers one, two, and three in that order. More generally, expressions of the form [x1 x2 ... xn] designate the abstract sequence of length n consisting of the objects designated by the xi in the specified order. And expressions of the form (f . a) designate the result of applying the function designated by f to the argument object designated by a. (thus function applications are expressed in prefix form). The common case of applying a function to a sequence of n (>0) arguments (f . [x1 x2 ... xn]) is abbreviated (f x1 x2 ... xn).

So-called variables, such as Knights, f, and Calculus (case is unimportant) designate different objects in different contexts. We use the term ‘variable’ here rather generally, and include within its scope what you might think of as constants, as well, such as + and lambda. This works because there is a most general context, called the global context, in which many variables are taken to be names of all the standard functions. For example, in the global context, + designates the addition function, and therefore the expression (+ 1 3), in the global context, designates the number that is the square of two.

The global context includes names for a wide assortment of standard functions defined over numbers, truth-values, and sequences. For numbers, +, -, *, and / designate the conventional arithmetic functions; ** designates exponentiation, and 1+ and 1- are (slightly confusing) names for increment and decrement functions, respectively. Thus (- 100 1) designates ninety-nine, (1+ 3) designates four, and (1- 3) designates two. The standard sequence operations are named NULL, LENGTH, FIRST, REST, and CONS. Thus (FIRST [1 2 3]) designates the number one; (REST [10 20 30]) designates the two-element sequence of twenty and thirty in that order; (LENGTH [10 20 30 40]) designates four; and (CONS 1 [2 3]) and [1 2 3] designate the same sequence. In general, (FIRST (CONS x s)) and x will be co-designating expressions, as are (REST (CONS x s)) and s; (NULL s) will designate truth if and only if s designates an empty sequence. Finally, = designates the equality function — a function that has the value truth just in case its two arguments are in fact the same object. Thus (= 1 1) and (= 2 (1+ 1)) both designate truth.

There are several special operators (including LAMBDA, IF, and COND) that do not fit the standard mold in that they cannot be characterized solely in terms of what their arguments designate. So-called lambda-expressions, of the form (LAMBDA pattern body), are commonly used to designate functions for which one does not already have a simple name. In general, the pattern part will be written as a sequence of variables (called the formal parameters), and the body part will be a general expression, usually one involving some of the formal parameters. For example, (LAMBDA [X] X) designates the identity function (of one argument); (LAMBDA [X] (FIRST (REST X))) designates the function that yields the second element of its (single) argument, which is assumed to be a sequence. LAMBDA is not extensional because the function is not built up out of the designations of the pattern and body expressions. Those expressions are really acting as templates; in order to determine what function is designated, one must look at the expressions themselves, not at what they designate. (The pattern [X] in the foregoing definition of the identity function, for example, doesn’t designate anything in the global context, because X is not the name of anything in that context.)

Uses of free variables are always satisfied in the lexically enclosing environment. (Unless otherwise indicated, all uses of standard names will assume their global meaning.)

IF and COND are the two standard conditional operators in 3-LISP. IF designates a function (or at least something like a function — but lets not worry about technicalities yet) that selects its second or third argument, depending on whether its first argument is true or false, respectively. For example, the expression (IF (= 1 2) 100 200) designates two hundred, whereas (IF $TRUE [] [1]) designates the empty sequence. COND is a little more complex; its general form is (COND [p1 x1] [p2 x2] ... [pn xn]), and it designates the designation of the first xi for which the corresponding pi designates truth. Thus for example (COND [(= 1 2) 3] [$TRUE 4]) designates four; the following expression designates sixteen:

(COND [(= 1 2) (+ 3 4)]
[(= 1 3) (* 3 4)]
[(= 1 4) (** 3 4]
[$TRUE (/ 3 4)])

Note how much easier it is to read this expression when it is printed it out with each of the clauses of the conditional lined up vertically. This general style of printing expressions in a structured indented fashion is called pretty-printing.

We said that (LAMBDA pattern body) designated a function, and that (f x1 x2 ... xn) designated the value of applying the function designated by f to the arguments designated by x1 through xn. These two expression forms can be combined. For example, the following four expressions all designate the number sixteen:

1. ((LAMBDA [X] (* X X)) 4)

2. ((LAMBDA [X] (FIRST (REST X)))
[(+ 8 2) (* 8 2) (/ 8 2)])

3. ((LAMBDA [F N] (F N N))
+
8)

4. ((LAMBDA [X Y Z] (X Y Z))
(LAMBDA [N M] (* N (1+ M)))
(FIRST [2 7])
(FIRST (REST [2 7])))

The third and fourth of these illustrate the fact that 3-LISP is what is called higher-order: functions, as well as more normal arguments, can be used as arguments to other functions.

There is a standard abbreviation supported in 3-LISP, that makes expressions like the ones just listed much easier to write. In general, the expression:

(LET [[v1 x1]
...
[vn xn]]
body)

is an abbreviation for:

((LAMBDA [v1 ... vn] body)
x1 ... xn)

Thus the four expressions given just above could equally well have been written as follows:

1. (LET [[X 4]] (* X X))

2. (LET [[X [(+ 8 2) (* 8 2) (/ 8 2)]]
(FIRST (REST X)))

3. (LET [[F +] [N 8]]
(F N N))

4. (LET [[X (LAMBDA [N M] (* N (1+ M)))]
[Y (FIRST [2 7])]
[Z (FIRST (REST [2 7]))]]
(X Y Z))

Two more types of facility will complete this simple introduction. First, 3-LISP contains primitive functions and notations for dealing with strings. In particular, the expression "string-of-characters" designates the string of characters within the double quote marks. For example, the expression "Ethelred" designates the eight-character name of a famous old English king; the expression "" designates a string consisting of no characters at all. Similarly, the expression #c designates the character c. For example, #e designates the last letter in this sentence. Some functions defined on strings include: STRING-LENGTH, STRING-CONS, and STRING-NULL. For example, (STRING-CONS #B "ale") designates the string ‘Bale’.

There is also an interesting function in 3-LISP called TYPE (i.e., the variable TYPE, in the global context, designates this function), which maps all kinds of objects onto variable names that indicate what type they are. For example, (TYPE 3) designates the variable NUMBER; (TYPE [1 2 3]) designates the variable SEQUENCE, and (TYPE +) designates the variable FUNCTION.

A.2. A Procedural, but still Extensional, View

We could go on in the spirit of the last section for quite a while, gradually introducing almost all the standard 3-LISP functions. But we still haven’t talked about anything computational — about anything to do with procedures or activity, or about what happens when you type something in to a 3-LISP process. In this section we will shift our attention from what expressions designate to what 3-LISP does with expressions — i.e., from a declarative to a procedural viewpoint. But we will still describe those procedures primarily extensionally, in terms of designated objects. Only in the next section will talk about how 3-LISP works.

When you are typing things to 3-LISP, we will say you are talking to the 3-LISP processor — a particular process inside the computer that you can have conversations with about particular kinds of behaviour you would like to create. We will start with exceedingly simple interactions, but build up the complexity fairly quickly.

If you type (+ 2 3) to the 3-LISP processor (just processor, for short), it will respond with the expression 5. If you simply type 5 back to the processor, it will respond with the same expression 5. In general, the processor always responds with an expression that designates the same thing that your input expression designates (in this case, the number five), but that is in some simple or canonical form. We will call the expression that the processor types back the response. Just what is viewed as simple or canonical is a complex story in general, but the answer is straightforward for simple cases. In particular, for numbers the response will be the corresponding numeral expression (5 for the number five, in the previous example); for truth-values it will be one of the two boolean expressions $TRUE or $FALSE; for sequences it will be an expression of the form [x1 x2 ... xn], where the xi are, correspondingly, appropriate responses for each of the elements of the sequence; for strings, the response will be the string enclosed in double quote marks: "string-of-characters".

To put this more formally, the 3-LISP processor simplifies input expressions, returning responses that are in normal-form; it is therefore called a normalizing processor. The types of response just described are the normal-form designators for numbers, truth-values, sequences, and strings. We will talk about normal-form designators of functions in the next section. Normal-form expressions have a number of properties: the most important one is that they are independent of context; thus no response from the processor will ever be a variable, or contain a free occurence of a variable.

The general thing to remember is this: if you type an expression x1 to the 3-LISP processor, it will respond with an expression x2 that is a normal-form co-designator of x1.

The way that interaction with the processor happens is this: it types out a prompt, which is usually of the form ‘1> ’, and awaits your input. When you have typed one full expression, it will type ‘1= ’ followed by the normal-form response, and then repeat. Thus, the processor performs what is called a read-normalize-print loop. The following gives an example of some typical interactions. For pedagogical convenience of this primer, we have put the responses in italics, but this convention will probably not be supported by the implementation. One detail: anything on a line to the right of a semi-colon is taken as a comment, and is entirely ignored by the processor. This makes it convenient for you to annotate what you are doing, make notes to yourself, etc.

;;; This is a comment. The 3-LISP reader ignores everything between a ";"
;;; and a carriage-return. Here we use three semi-colons for aesthetics.
;;; The semi-colon needn’t be the first character in the line, but it
;;; is convenient for general comments like this one.

1> (+ 2 5) ; "1> " is the standard 3-LISP prompt. The user typed "(+ 2 5)".
; The processor reads (+ 2 2) and responds with a normal-form
1= 7 ; expression that designates the same number — seven.
1> [1 2 3] ; Then it prompts the user for another input.
1= [1 2 3]

We can try out some more complex function applications:

1> (* 100 (IF (= [] []) 3 4))
1= 300

1> [(* 1 1) (+ 1 1) (= 1 1)]
1= [1 2 $TRUE]

1> (TYPE 1)
1= ’number ; Ignore the single quote mark now; it will be explained later.
1> (TYPE +)
1= ’function
1> (TYPE TYPE)
1= ’function

As a notational convenience in this primer, we will use the symbol ‘g’ to mean "normalizes to"; thus we will say (+ 1 2) g 3; [(= 3 4)] g $FALSE, etc.

Given that we have started talking about behaviour and processors, we can start defining names to use in other expressions. In the purely declarative account we gave in section A.1, we talked about variables being bound in the scope of lambda expressions, and we used some convenient globally defined names like + and IF. But we didn’t have any way to define new names and have them stick. But we can do this. There are, in particular, two ways of defining new names in the global environment: SET and DEFINE. As a rule of thumb, you should use SET for simple setting of variables to designate objects like numbers and sequences and strings, and DEFINE for defining functions.

The form (SET var x) will cause the name var to designate the designation of x from now on, in the global environment. The response from a SET form is just the response from the x part, which, although sometimes useful, is by and large irrelevant. The expression (SET var x), in other words, is primarily used for its (procedural) effect, not for its (declarative) designation. It is more like a command than like a name (SET is sort of like a verb, whereas + and LENGTH are more like common nouns). Thus we have:

1> (SET X 3)
1= 3

1> X
1= 3

1> (+ X X)
1= 6

1> (SET L (LENGTH [10 20 30 40 50]))
1= 5

Variables that are bound in the global context, but that are also bound by patterns in lambda expressions, take their designation from the most local enclosing context (3-LISP is said to be lexically scoped, because it is always the lexical context that determines bindings). Some examples:

1> (SET X 3)
1= 3 ; X now designates three in the global context.

1> X
1= 3 ; Confirmation of this fact.

1> (LET [[X 4]] (+ X 1)) ; The local binding of X takes precedence.
1= 5

1> ((LAMBDA [Y] (+ X Y)) ; but only locally; the original binding is still
100) ; in force globally.
1= 103

With DEFINE we can define functions using lambda expressions. For example, the following defines a function called APPEND that appends two sequences. (Note: when talking declaratively, we would have said that APPEND was a function from two sequences to a sequence that is their concatenation, or something static like that. Now we say it "takes" two arguments, or "appends" two sequences — a more active way of speaking. This reflects the fact that we are now viewing (APPEND s s) in a procedural light.)

1> (DEFINE APPEND
(LAMBDA [SEQ1 SEQ2]
(IF (NULL SEQ1)
SEQ2
(CONS (FIRST SEQ1)
(APPEND (REST SEQ1) SEQ2)))))
1= ’APPEND ; DEFINE responds with the name of the function defined;
; Ignore the single quote mark for now.

1> (APPEND [1 2 3] [4 5 6])
1= [1 2 3 4 5 6]

1> (APPEND [(= 1 2) (= 2 2)] [])
1= [$FALSE $TRUE]

The definition of APPEND is recursive, because the name APPEND was used within the definition. Recursion is an extremely powerful mechanism for defining functions.

The definitions just given illustrate a very important point about names. When defining a function, like APPEND, the names of variables are chosen to reflect what happens one level of designation away. Thus we called the formal parameters of the APPEND function by the names SEQ1 and SEQ2, because they will designate sequences when the function is applied. We did not call them SEQUENCE-DESIGNATING-EXPRESSION-1 and SEQUENCE-DESIGNATING-EXPRESSION-2, even though in some sense they are sequence designating expressions. We, as external observers and theorists, look at these two variables, and use the phrases "sequence-designating-expression", in our theoretic language (English, as it happens), to describe those expressions in 3-LISP. Those 3-LISP expressions, however, are looking at sequences, and so are named in ways that are appropriate for the things they look at. This practice — of having expressions named in ways that correspond to the things they are looking at — works out very well indeed. (Imagine how complicated it would be if my name were ‘Brian’s-name’, rather than ‘Brian’.) It is also a pretty automatic practice for us native language users, in simple cases (there is very little tendency for you to call me ‘Brian’s name’). When we get to complex intensional situations, however, when we are using language to look at language, we will need to have this practice — which we will maintain — very clearly in mind, in order to avoid confusion. Also, there is an ambiguity in computer science about whether we are theoreticians looking at computer languages, or whether we are theoreticians using computer languages as extensions of our own language to look at the world. This ambiguity can also lead to some confusion. Although in some cases the latter is a powerful strategy, it must be our first priority, in this class, to do the former.

Another issue that comes up, once we adopt the procedural stance, has to do with how long things take. In the problem set, for example, you will encounter two different definitions of the fibonnaci function. In terms of designated function, these two definitions are identical; in terms of procedural consequence, however, they are quite different. The 3-LISP processor, for example, will respond to the input (FIBONNACI 100) with the number ??? in just a few seconds, given one definition; given the other, it would take it thousands of years to complete its calculation. Computational complexity — a large area of active research in computer science — is something we will not consider in too much detail in this course. See the discussion at the end of the first chapter of Abelson and Sussman’s book for more information.

More complex functions, and more complex ways of combining them, are explored in the first problem set, which you will be working through in sections and labs.

A.3. The Intensional Account

Our next task, given a) some understanding of what 3-LISP expressions designate, and b) some understanding of what behaviour the 3-LISP processor exhibits, is to ask c) about how the 3-LISP processor works. This requires that we no longer speak extensionally, in terms of designated numbers, sequences, and functions, because the 3-LISP processor — a finite, physically realized, formal device — has not access to such abstract, "in-the-world" objects. Rather, it must deal with the expressions themselves, and with things it can make out of those expressions.

We have already talked some about expressions as such, and about a normalization relationship between them, and about the relationship between normalization and designation. In order to discuss the internal workings of 3-LISP, however, we must make yet another distinction: between 3-LISP expressions as notations — i.e., as strings of characters that can be printed on a page or on a terminal screen or edited in a buffer — and various internal structures inside the machine that the 3-LISP processor essentially translates those notations into. This distinction, between notations and internal structures, is like a distinction between a public language that a person speaks — say English — and an internal "mentalese" or brain-writing, in terms of which a person thinks.

Right away, we must emphasize that the ontological status of these internal structures — whether, for example, they are syntactic objects like notations, or whether they are more like set-theoretic or mathematical objects — is a matter of considerable dispute. One of the goals of the course is to convey intuitions about the status of these internal structures. But for the time being we will treat them rather simply. There are, in particular, internal structures that we call numerals, which correspond with the numeral expressions we have already seen. These internal numerals correspond one-to-one with the numbers they designate, but there can be more than one numeral expression corresponding to each numeral. For example, there are lots of occurences (tokens) of the numeral 16; one in this sentence, several at the beginning of this primer, etc. Also, there are other ways of writing that numeral: XVI, for example.

You will have noticed that we have already started speaking as if internal structures designate, whereas earlier we said that expressions designate. In fact it was our earlier verbiage that was sloppy; henceforth (and throughout the manual, etc.), we will not say that expressions designate things directly; rather, expressions are internalized into internal structures (often just called structures), which in turn designate objects of many kinds. Similarly, internal structures can be externalized into expressions. So whereas we said that the 3-LISP processor was a "read-normalize-print" loop, it should more accurately be described as a "read-internalize-normalize-externalize-print" loop.

The issues relating expressions and internal structures form a complex subject in their own right; one we will examine in some detail in section 3 of the course. In general, however, it is convenient to ignore expressions and talk directly in terms of internal structures. Thus, we will from this point onwards, unless explicitly noted, we will talk as if 3 was an internal structure called a numeral, and as if $TRUE was an internal boolean constant. When we want to speak of notation directly, we will say such things as that ‘3’ is a numeral-expression that notates a numeral that in turn designates a number. Similarly, ‘=’ is a variable expression that notates a variable that designates the equality predicate (in the global context). Note the convention, which we have just now started, of using single quotes (as well as a sans-serif font) in our theoretic language to refer to 3-LISP expressions; the sans-serif font on its own will be used to refer to 3-LISP internal structures.

A variety of types of (internal) structure have already been implicitly involved in our discussions. The numerals have already been mentioned. Atoms are atomic (i.e., non-divisible) internal structures that are notated by variable expressions of the sort we have seen: ‘Testing’, ‘+’, and ‘LAMBDA’. Rails are internal structures that designate sequences, that are notated by a series of expressions enclosed in square brackets, such as: ‘[10 20 30]’, ‘[]’, and ‘[(= 2 3)]’. There are two internal structures called booleans — indivisible and context-independent constants designating truth and falsity, notated as ‘$TRUE’ and ‘$FALSE’. Pairs are the internal structures that represent function application; they are notated with parentheses, as in ‘(+ 1 2)’, ‘(IF (= 1 2) 3 4)’, and ‘(LAMBDA [X] Y)’. Strings and characters are designated by internal structures called stringers and charats that are notated with double quote marks and a hash sign, respectively; we have already seen such examples as ‘"ale"’ and ‘#B’. Fortunately, stringers and charats are rather recherché objects — we won’t have much need to refer to them explicitly.

All of these are relatively straightforward. The complexities have to do with internal structures that designate functions. So far, we have used two kinds of expressions to "designate" functions: variables (atoms) and lambda-expressions. Neither of these, as it happens, can correspond very closely to the standard internal structures that designate functions (for reasons that will become clear as you get more familiar with the workings of the 3-LISP processor). Instead, there is a type of internal structure called a closure, for which there is no good standard notation. When closures are printed out, braces are used saying that this is a closure, but (unlike other cases) the notation carries far less information than is born by the closure itself. In particular, whereas from the notations for the other structures you can determine what structure is notated, and therefore what object is designated, from the notation used for a closure you cannot determine what closure it is, and therefore what function is designated. For this reason, closure notation is not accepted on input (i.e., there is no way to type a closure in to 3-LISP). But you can easily enough have closure notations printed out:

1> +
1= {+ closure}

1> (LAMBDA [X] (+ X Y))
1= {simple closure}

1> (let [[x 1]] ; Remember that case makes no difference
(lambda [n] (+ x n)))
1= {simple closure}

When we explore in detail how the 3-LISP processor works, we will examine the internals of closures; they basically contain the pattern and body expression that occur in the lambda expression, plus more information representing the context in which the defining lambda expression occured. For the time being, we needn’t bother with details.

A.4. Quotation and Meta-Structural Access

xxx

1> xxx
1= yyy

xxx

A.5. The 3-LISP Processor

xxx

1> xxx
1= yyy

xxx

code

xxx