LECTURE NOTES #1 LISP: LANGUAGE AND LITERATURE April 5, 1984
————————————————————————————————————————————
Lecture Notes #1 — Introduction
Filed as:[phylum]<3-lisp>course>notes>Lecture-01.notes
User.cm:[phylum]<BrianSmith>system>user.classic
Last edited:April 5, 1984 11:09 AM
————————————————————————————————————————————
Introduction
Do four things today:
A.Philosophical orientation;
—Important to set theoretical assumptions on the table at the outset because, in the early stages of dealing with computation, things will look disarmingly simple. Can go a long way on a variety of different misunderstandings, then land in big trouble later.
B.Survey of the course itself
—Structure: Classes, sections, and labs.
—Content: what we will cover, and when; what we will learn.
C.A first encounter with a computational expression;
D.Matters of orchestration: sections, machines, problem-sets, etc.
————————————————————————————————————————————
A. Philosophical Orientation
1.Three overall goals:
a.Convey computational intuitions, so that people emerge with their own original and authentic sense of computation, rather than a derivative one.
b.Provide an explicit theoretic language in terms of which to understand the intuitions conveyed in (a). I.e., by the end people should understand computation both tacitly and theoretically, in a coordinated way.
—Without this, the course would do no more than develop a new set of hackers.
c.Have the theoretic language of (b) be continuous with, and consonant with, the theoretic language they would be disposed to use in describing the use of language by any finite agent.
—I.e., ask of all participants that they not allow me to introduce any theoretic term that they are not prepared to admit they would be forced to use in their own work if they were studying situated language use. So no "fexprs", "left-hand side values", etc.
—This means that it will have to be a collaborative project, because I am not an expert in the attendant fields.
—Also means that I will do violence to virtually everybody’s assumptions. Hopefully equally; make no one happy. Will make terminology tricky.
—Can’t have pandemonium: so will:
—Try to identify explicitly the most salient places where we are parting company with assumptions made in the major surrounding traditions. I.e., provide running commentary from the perspectives of a variety of fields.
—Admit when a new term is being introduced.
—Will be no more familiar to computer scientists than to anyone else. The claim, or desire, is to do justice to computational practice, not necessarily to standard reconstruction.
—Terminological variances across fields run pretty deep.
—For example, by the very term language computer scientists don’t mean what linguists or logicians mean. There is ambiguity about whether we mean a set of expressions, or those plus an interpretation, or those plus processes that use them, or a more abstract thing. In particular, what formal language theory might call a language some computer scientists will call (perhaps disparagingly, but I think that judgment is deeply mistaken) a notation or a syntax.
—To borrow a technique from the beginning of Situations and Attitudes, note some words whose meaning you might think you knew:
—Language, evaluation, extension, value, interpretation, function, ...
These are words from the theory of language, not from language itself; no a priori reason to suppose they will suffice for this reconstruction.
2.What is Computation About?
a.My claim: semantic processes
—Specifically not about machines, or about computers, or about static languages.
—Not about just any kind of process; only those that bear information about some world, some reality, something, above and beyond them.
—E.g.: abacuses, hand-held calculators, fuel-injection systems, flight systems that land planes at the SF Airport.
—In fact this whole approach is based on a rather particular philosophy of computation, which we won’t examine here; maybe at the end of the course. (Rather, the philosophy of computation has emerged from this reconstruction.)
—One line summary: computational processes are always about something.
—Computation, therefore, is basically language use in a microcosm. As I claimed to the board, computers are the "frictionless pucks and inclined planes" of the theory of language.
b.Five fundamental dialectics or balances (we will see these over and over):
i.Between processes and languages:
—Computer science primarily studies processes, and secondarily the languages used to describe and interact with them. Linguistics (am I right?) primarily studies languages, and only secondarily the processes that underlie them. In this course we will focus on a pretty balanced combination of the two.
—There will be lots of different languages (of communication, of computational process, of theory). Part of the point will be to keep them clear and distinct, and to describe the relationships between them.
—As opposed to astronomy, where phenomena (stars), models (brass), and theories (written in lab books) were easy to distinguish; in our case virtually all will be linguistic.
—This is one reason we will have to be so frustratingly precise about avoiding use/mention confusions.
—For that matter, there will be lots of different processes, too; we will try to be clear about them as well. The real subject will be at the juncture between them — where language and process interact.
—In this sense studying computation is much more like studying the mind (and its use of language) than studying language on its own.
—LISP is neither a language nor a process, but an amalgam.
ii.Between behaviour and "aboutness":
—Perhaps the most importantly different thing about computer science, especially when compared with other fields represented at CSLI, is that it studies active, temporal processes.
—This temporality is so important, so prevalent, that much of the theoretic analysis in computer science is devoted to properties of the behaviour.
—On the other hand, or so I will argue, it is the information that those processes bear that leads us (programmers) to organize them, structure them, understand them, classify them, in the ways that we do.
iii.Between (so-called) "intensional" and "extensional" properties of languages and processes:
—Will deal with very fine-grained issues of notation: spelling, type/token distinctions, grammars and parsing.
—Will also deal with functions designated, objects in the world, and truth-values: all of the motherhood objects that are the normal truck of extensional semantical theories.
—Will talk about "middle realm" objects, certainly not "meanings" (I personally doubt that there are any such things), but nonetheless objects more finely individuated than extensions, and yet more coarsely individuated than sentences or expressions.
iv.Between "external" languages of communication and "internal" languages of computation itself.
—The external languages (computer analogs of English or French) serve as a medium of communication between or among agents;
—The internal "languages" (the machine analogs of mentalese) are the structures with (or over) which the actual computational processes are defined.
—Actually I will argue, in the end, that these internal structures probably shouldn’t be viewed as languages at all.
In passing: we will mix these last two up in wonderful combination: we will have expressions that refer (extensionally) to the intensions of other expressions; internal (mentalese) structures that are roughly the translation of external expressions (i.e., co-extensional and co-intensional with them); internal (mentalese) structures that refer (extensionally) to other internal structures, etc. etc.
—Kind of an erector set of possibilities (an algebra?)
—Leading to a whole bunch of different kinds of quotation.
v.Between what is explicit and what is implicit:
—aspects of computational processes, information embedded in them, etc.
—Far more of interest about any computational process (and presumably any linguistic expression) arises out of implicit relationships that it bears with its context, its use, etc., than is written down explicitly in its surface form.
—Whether information is implicitly or explicitly represented, in a computational process, can be of tremendous importance. Many programming techniques can be viewed as attempts to deal directly with this complex issue.
(–Charles Taylor has argued that people can’t be computers because everything about computers is explicit. I deny that there is any notion of "explicitness" (about which, admittedly, we don’t have a theory) such that this claim is true.)
—Relates, in the end, to whether computation is even syntactic: cf. Rosenschein and Pereira’s current project, which seems to me exactly right in this regard.
c.Why all of these different levels? Wouldn’t it be simpler to simplify, deal purely with extensional languages, or with finite state or Turing machines?
—No, we can’t, because computation is fundamentally intensional; that’s in fact what it is (at least on my view).
—Of course the word ‘intension’ is theory-relative. Nonetheless, the intensional aspects of symbolic or linguistic structures are crucial, both computationally and in terms of language and mind. This, of course, is one of the reasons (the reason?) they are potentially related.
3.Some Additional Working Assumptions:
a.On the role of mathematics (a personal view):
—Terribly useful as a way of classifying things; i.e., as an aid in theorizing.
—We will use the notion of a mathematical "model", in much the way that Barwise and Perry do in more recent versions of situation semantics, and as Goguen did in his talk on computational semantics just yesterday.
—But terrible as a basis for semantic intuitions. Problems are two:
—There are often isomorphisms between the syntactic structures and the semantic interpretations, tending to lead us, either explicitly or implicitly, to identify (i.e., fail to distinguish between) the two. This is a disaster for developing clean semantical discriminations, which are crucial in dealing with real language, real reasoning, etc.
—There are almost always isomorphisms between the real mathematical objects (numbers, sets, whatever) that are being studied, and the mathematical objects used to model them. Thus, there is even less motivation to distinguish model from reality, something that is again crucial, if we are to understand the relationships between computational process, theories of them, and the world they are about.
—It is crucial that, for purposes of the course, we adopt a platonist view of mathematics: i.e., that there really are numbers and sets and all those good things, out there in some "beyond" that may be the world and may be Plato’s heaven but certainly isn’t just the page of mathematical language that describes them. It is not that I really care about your views of mathematics (indeed, I am not sure what my own are); the point, rather, is that we will be using mathematics to classify things; the analogy will go through only if you assume that mathematical abstractions transcend their embodiment in language or machine.
—This will all become clear as we proceed.
b.On the "language and literature" part of the course name:
—The stuff that we will primarily deal with, in the course, is above the level of individual syntactic constructs. We will be concerned, rather, with the way in which these structures and expressions are put together to form large scale patterns.
—Thus a "literature" orientation, rather than focusing just on the details of the linguistic constructs themselves.
—A&S don’t even teach the LISP language itself, in any explicit way: it’s so simple that they just assume the students (M.I.T. freshpeople) will learn it automatically (and, presumably, tacitly).
—That is why the specific subject matter will be things like abstraction, modularity, and complexity.
—The overall goal (to quote A&S) is to learn how "to control the intellectual complexity of large software systems".
—Programs, furthermore, are meant to be read by people; i.e., they serve a communication role as well as generating processes inside the machine.
—I.e., the difference between a "good" program and a "bad" one has in part to do with human understandability.
c.On formal theorizing, and semantical terminology:
—Footnote on "semantic" and "semantical".
—Standard semantical notions are simply not complex enough (interpretation, designation, etc.).
—Will not resist mathematical theorizing about what we do, but will put it off until we have satisfied both of the first two goals; want people to have the intuitions before we try to formalize them.
—In due course (in about 6 weeks, perhaps) we will present and examine a formal analysis of the (3-LISP) language: its syntax, semantics, etc. What I hope will be striking about this reconstruction will be the extent to which it fails to get at what this course will be teaching. Not proud of this; rather, take it as a lesson in how far we have yet to go.
————————————————————————————————————————————
B. The Course
1.Structure. What will we do? Answer is to some extent mandated by the three overall goals.
—Practical programming experience:
—One cannot develop these intuitions without some marination in the practice itself. Therefore have prepared an implementation of 3-LISP on the Dandelions for use by students.
—3-LISP is a simple dialect of LISP I developed in 1981, in line with this basic orientation. You may know of it because it is "reflective", but we won’t deal with any of that here (except maybe at the end of the course).
—Owe Jim des Rivières an enormous debt here: he has done a staggering amount of work over the last few months preparing this. Also, more recently, David Wallace and David Christman.
—Problem sets: of a sort common in physics and mathematics courses. A graded series of problems, with instructions, commentary, etc.; handed out every week or two; due a week or two following. They will endeavour to lead you through all of the material we talk about in class.
—Three kinds of sessions:
—These classes: aim will be to develop the theoretic terminology with which to understand what is going on.
—Sections (or "recitations" — I may use the latter term), where specific programming details will be presented, problem-sets reviewed, etc. I.e., these will be much more pragmatically oriented. And more technical. In class I won’t bother talking at all about how to plug in your machine, nor will I describe in detail the particular syntax of the 3-LISP conditional statement.
—But: analysis in class will assume material that is covered in the recitations. No way around it.
—Lab time, at the machines. Will make assistants available for hand-holding, getting people out of trouble, etc.
—Times for the latter two of these to be arranged. Will have several sections per week; people need go to only one (I will be at all of them). Lab times basically open as much as possible, with designated times at which lab instructors will be available.
—Blunt: things will get more interesting as course goes on. First have to lay some groundwork (get used to notation, procedures, etc.). Requires a little patience.
2.An annotated overview of topics and themes (i.e., what are we going to learn?):
—Topic outline (from syllabus):
1.Introduction
2.Abstraction and Modularity I
a.Procedural abstraction
b.Simple data representation
c.Models of processing
Primary intent is to convey a procedural, intensional viewpoint.
—get mathematicians to see numerals, not numbers;
—get linguists, when they are presented with S ← NP VP, to ask how long it takes.
—get philosophers to see how easy it is to swallow a "two-factor" theory of semantics.
—get computer scientists and AI’ers to realize that adverting to the external world in one’s accounts of computation doesn’t challenge the integrity of the computational device.
Also, make clear the role of mathematical models in classifying computational devices.
3.Abstraction and Modularity II
a.Objects, modularity, state, and encapsulation
b.Input/output, notation, and communication protocols
c.Meta-linguistic abstraction, and problems of intensional grain
In this section the state of the process and the state of the world will seriously part company. Also, will see the difference between internal codes and external notation; develop a caution about assuming that things that hold of external syntactic structures necessarily hold of internal ones.
4.Explicit theory
a.Notation, structure, and semantics
b.Specification, verification, and correctness
Show in detail the difference between entities in the theory of a computational process and entities in that process itself. Show, for example, why it is a category error to say that one can "implement a theory". Consider, also, the implicit/explicit distinction as it relates theory to process.
5.Architecture and implementation
a.Abstract machines: implementation and realisation
b.Declarative formalisms: sentences, claims, and constraints
Show, among other things, why semantics doesn’t cross implementation boundaries: one of the most important things we will get to. Define notions of interpreter, compiler, realisation, etc.: computation in the large.
6.Introspection, self-reference, meta-circular processors, and reflection
Fun and games: once we have everything straight, mix it all up again ("pull the cork on a conundrum, and watch the paradoxes fizz").
—Permeating themes (also from syllabus):
—Procedural and declarative notions of semantics
—Interpretation, compilation, and other models of processing
—Implicit vs. explicit representation of information
—Contextual relativity, scoping mechanisms, and locality
—Varieties of language: internal, external, theoretic
—Programs vs. processes: "description of" vs. "communication with"
—Syntax and abstract structure: functionalism and representationalism
—Use/mention confusions!
————————————————————————————————————————————
C. First Encounter
1.How do we start?
—Subject matter
—Suppose we walk up to a computer running 3-LISP. What’s there to talk about? The problem is that, as it stands, the computer "knows" about only three sorts of object (where "knows" means something like "has representations that designate or are taken by us as designating"):
—Its internal structures
—Mathematical abstractions;
—Notation (strings) and external language;
To "tell it" about anything else (i.e., to represent any other kind of information) requires that we deal explicitly with questions of representation. To do that would involve us in too much complexity (in part because the subject is too important). So we are sort of stuck with these three. Given them, the problem with the first is we don’t yet know anything about it; with the third, that it involves us with both using and mentioning language, with the concomitant tendency to slip into use/mention confusions (we can and will keep such distinctions straight, but it makes life complex). So: we will start (with considerable apology, and violating a basic assumption we just made) with simple mathematical examples. We will use a certain number of "notational" examples in due course. And as soon as we get to more adequate representational structures, we will leave mathematical examples entirely.
—The reason we can’t take up the representational questions right away is that before we can do that (or really anything much else) we need to develop some fluency with procedures.
—Language
—To do this, we will begin with an overly simplistic model. We will assume that we communicate with the machine in a simple language about mathematical objects: numbers, functions, sequences, and truth-values. From a purely formal (syntactical) point of view, a language generally consists of three things:
—Primitive expressions
—Means of combination
—Means of abstraction (or naming)
—Semantically, we have to say what expressions "mean". We will use the term significance for the most general semantic notion (whatever it turns out to be). Thus: we must ask what the significance is of each instance of each of these syntactic types.
—Note: We just introduced two technical terms.
2.A Simple Declarative Language
—We will have these primitive expressions:
—Numerals (signifying numbers): 3, -100, etc.
—Note: use a different font (GACHA) for object-level expressions, in these notes. In class (blackboard, etc.) use an over-line for formal linguistic expressions not in the object language. I.e., quote by default (pace Barwise).
—Boolean constant expressions (signifying truth-values): $TRUE and $FALSE
—Variables: A, X, LongVariableName, etc.
—Primitive constants (+, *, max, =, etc.), signifying a number of specific (i.e., pre-interpreted) functions.
these means of combination:
—Square-bracketed expressions, called "sequence expressions" (signifying sequences): ‘[1 2 3]’, ‘[]’, etc.
—Cambridge Polish notation (signifying function application): parenthesized expressions containing both function expression and argument expression: ‘(+ 2 3)’, ‘(F (G X))’, etc.
—A conditional operator: if, such that (if P E1 E2) will signify what E1 signifies if P signifies truth, and what E2 signifies if P signifies falsity.
and these means of abstraction:
—lambda expressions (signifying functions in the usual fashion).
A footnote: in computer science it is traditional to take a whole lot of words as primitives or constants, in this sense; it is rarer in mathematics. INTERLISP, which runs on the Dandelion, has something like 11,000 "lexical items" bound in the initial environment: approaches a real natural language, not at all like a typical formalism studied in logic.
—This, I take it, we would all consider an incredibly simple language.
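The syntactic inventory above (primitive expressions, sequence expressions, Cambridge Polish applications) can be sketched as a small reader. This is a hedged illustration in Python, not the 3-LISP reader: the class names (Numeral, Boolean, Name, Sequence, Application) and the function read are assumptions introduced here, and if and lambda are simply read as parenthesized expressions whose first element is a name.

```python
import re
from dataclasses import dataclass

# Hypothetical classes for the syntactic types; none of these names come from 3-LISP.
@dataclass(frozen=True)
class Numeral:          # primitive expression such as 3 or -100
    text: str

@dataclass(frozen=True)
class Boolean:          # $TRUE or $FALSE
    text: str

@dataclass(frozen=True)
class Name:             # variables and primitive constants: X, +, max
    text: str

@dataclass(frozen=True)
class Sequence:         # square-bracketed sequence expression
    items: tuple

@dataclass(frozen=True)
class Application:      # parenthesized (Cambridge Polish) expression
    items: tuple

TOKEN = re.compile(r"[()\[\]]|[^\s()\[\]]+")
CLOSERS = {"(": ")", "[": "]"}

def read(text):
    """Turn a notation string into a structure built from the classes above."""
    tokens = TOKEN.findall(text)
    expression, position = _read_from(tokens, 0)
    if position != len(tokens):
        raise SyntaxError("extra input after expression")
    return expression

def _read_from(tokens, position):
    if position >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[position]
    if token in CLOSERS:
        closer = CLOSERS[token]
        items = []
        position += 1
        # Collect sub-expressions until the matching closing bracket.
        while position < len(tokens) and tokens[position] != closer:
            item, position = _read_from(tokens, position)
            items.append(item)
        if position >= len(tokens):
            raise SyntaxError("missing " + closer)
        node = Application if token == "(" else Sequence
        return node(tuple(items)), position + 1
    if token in CLOSERS.values():
        raise SyntaxError("unexpected " + token)
    if re.fullmatch(r"-?\d+", token):
        return Numeral(token), position + 1
    if token in ("$TRUE", "$FALSE"):
        return Boolean(token), position + 1
    return Name(token), position + 1
```

Note that Numeral("3") holds the text of the numeral and nothing else; the reader says nothing about the number three. Keeping the notation and what it signifies apart in this way is a choice made for the sketch.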
—So for example, we would expect each of the following expressions to signify the (platonic) number 3, in some fashion:
—3
—(+ 1 2)
—(* 1 (- 4 1))
—(if (= [] [])
3
4)
—((lambda [x] (+ x 1)) 2)
—Could now set out and do a complete semantics for this.
—I.e., given an environment type E mapping variables onto values, and a semantic domain D of numbers, functions (from D to D), truth-values, and sequences, and given a particular E0 which maps the constant function symbols onto the functions they signify, define the designation of an expression exp to be [| exp |]E, as defined by:
—[| exp |]E = E(exp) if exp is a variable;
—[| $TRUE |]E = Truth and [| $FALSE |]E = Falsity
—[| exp |]E = the corresponding number, if exp is a numeral;
—[| [e1 e2 ... ek] |]E = < [| e1 |]E, [| e2 |]E, ..., [| ek |]E >, where "< ... >" are sequence brackets in the theoretic language
—What is often called the "meta-language", but we will avoid that term: it is too confusing, because it is not unique: we will end up with more than one.
—[| (e1 e2 ... ek) |]E = the function [| e1 |]E applied to the sequence of arguments < [| e2 |]E, ..., [| ek |]E >.
—[| (if e1 e2 e3) |]E = [| e2 |]E if [| e1 |]E is Truth, [| e3 |]E if [| e1 |]E is Falsity.
—[| (lambda [v1 ... vk] body) |]E = λ d1, ..., dk . [| body |]E’, where E’ is just like E except that v1 ... vk are bound to d1 ... dk.
—Hope it is clear that this is an absolutely standard, "extensional" semantic account.
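The clauses above can be transcribed almost line for line. The following is a minimal Python sketch under stated assumptions, not part of the course materials: expressions are represented as Python strings (atoms), tuples (parenthesized expressions), and lists (sequence expressions); the names designation and E0 are chosen here; and Python integers, booleans, tuples, and functions stand in for the semantic domain D. They are a model of that domain, not the platonic objects themselves.

```python
import operator
import re

# Assumed initial environment E0: constant function symbols mapped to functions.
E0 = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "max": max,
    "=": operator.eq,
}

def designation(exp, env):
    """Return the stand-in for [| exp |]E, following the clauses one by one."""
    if isinstance(exp, list):
        # [e1 ... ek] designates the sequence of the designations.
        return tuple(designation(e, env) for e in exp)
    if isinstance(exp, tuple):
        if not exp:
            raise ValueError("empty application")
        head = exp[0]
        if head == "if":
            _, premise, consequent, alternative = exp
            truth = designation(premise, env)
            if truth is True:
                return designation(consequent, env)
            if truth is False:
                return designation(alternative, env)
            raise ValueError("premise does not designate a truth-value")
        if head == "lambda":
            _, params, body = exp

            def function(*args):
                if len(args) != len(params):
                    raise TypeError("wrong number of arguments")
                # E' is just like E except that the parameters are bound to the arguments.
                extended = {**env, **dict(zip(params, args))}
                return designation(body, extended)

            return function
        # (e1 e2 ... ek): the function designated by e1 applied to the other designations.
        function = designation(head, env)
        return function(*(designation(e, env) for e in exp[1:]))
    if exp == "$TRUE":
        return True
    if exp == "$FALSE":
        return False
    if re.fullmatch(r"-?\d+", exp):
        return int(exp)        # the number corresponding to the numeral
    return env[exp]            # variables and constant symbols

# The five example expressions from the notes, in the assumed representation.
examples = [
    "3",
    ("+", "1", "2"),
    ("*", "1", ("-", "4", "1")),
    ("if", ("=", [], []), "3", "4"),
    (("lambda", ["x"], ("+", "x", "1")), "2"),
]
print([designation(e, E0) for e in examples])
```

Nothing in this function mentions time, replies, or behaviour: it only says what each expression signifies relative to an environment.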
4.Computation
—The crucial fact about all of this is that the situation described above is entirely static. Suppose we type the expression (+ 2 3) to the computer; we will get back the expression 5. Normally, we say that the computer has "evaluated" the expression (+ 2 3). I argue that this either doesn’t mean anything or else is false. But it is crucial to realize that something has happened (it took up time, for example; we could even measure how much), whereas nothing we have said so far about what this language meant said anything whatsoever about behaviour.
—There will be lots of behaviour
—In the first problem set, for example, we will type the expression (FORWARD 100) and a little turtle icon will appear on the screen of the Dandelion and draw a line about two inches long.
Could make the behaviour be the significance, and we will do this. But we don’t want to make the expression 5 be the significance of (+ 2 3), because (+ 2 3) signifies the number five: we all know that!
—We could say that the computer replies with some expression that also designates the number five. But what if it simply echoed our input: i.e., it returned the expression (+ 2 3). We would think it pretty dumb. No, there has to be some other story going on, that tells a more complete story about at least three things: the expression (+ 2 3), the number five, and the expression 5 that got returned.
—Another example: what is the longest river in Egypt? Answer: the Nile
—Also: what desk are you using? Answer: the one that used to be in Joan’s office.
—Also: just type in +. Do you expect + back?
Moral: it is one thing to say what an expression is about, another to say what reply (more generally: what behaviour) it engenders.
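A small Python sketch may make the distinction concrete. It is an illustration under assumptions, not a description of what 3-LISP does: the function names designation, echo, and simplify are invented here, expressions are nested tuples with numeral strings at the leaves, and Python integers stand in for numbers. Two candidate repliers are compared; both return an expression that designates what the input designates, so preserving designation alone does not tell us which reply to expect.

```python
import operator

# Assumed primitive function symbols for this illustration only.
PRIMITIVES = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def designation(exp):
    """The number an expression is about (numeral string or nested tuple)."""
    if isinstance(exp, tuple):
        function = PRIMITIVES[exp[0]]
        return function(*(designation(e) for e in exp[1:]))
    return int(exp)

def echo(exp):
    """A 'dumb' replier: hand back the very expression that was typed."""
    return exp

def simplify(exp):
    """A replier that hands back a numeral for what the input designates."""
    return str(designation(exp))

typed = ("+", "2", "3")

# Both replies are expressions that designate the number five ...
print(designation(echo(typed)) == designation(typed))      # True
print(designation(simplify(typed)) == designation(typed))  # True

# ... but they are different expressions, and only one is the numeral 5.
print(echo(typed))       # ('+', '2', '3')
print(simplify(typed))   # 5
```

What separates echo from simplify is a fact about the form of the reply, which the account of designation does not mention at all.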
—When we say that a computer evaluates an expression, we use a massive pun: by simplifying the expression (or something analogous to it), the computer can be said to return something that can be classified by the significance of the input expression.
—Actually, this pun works only in very simple cases.
—Even more significantly, what the computer does (i.e., its overall behaviour) can be classified as following the procedure that we would take the program as intensionally representing.
—Note that it doesn’t actually follow the procedure, because it can’t: to do that would require trafficking in numbers, which computers don’t have access to (remember our platonic assumption).
—Note also that it is only because of this pun that the language we used (and programming languages in general) is restricted to constructive notions. I.e., this is why we don’t have formal analogs of "the x such that ... ": no procedure is intensionally associated with it, so there is nothing the computer can do such that the pun will hold.
—Once we clear up the pun, we can allow non-constructive language back into the game.
—Moral: here is our first instance, on the most utterly simple example, of "behaviour" and "aboutness" getting mixed up.
—Note that when we get to the point of "saying" to the computer: "John walked to the store by way of the cathedral", what we say will have an intensional reading, but it doesn’t correspond to the intension of anything that the computer will be expected to do by way of response or comprehension. Something special about the program we have used so far (hint: has to do with the fact that it is a program).
—Programs: theoretic descriptions of processes, or languages that processes use to communicate. Ambiguity: we will characterise both.
—Actually, programs are conversations with one process, about another one. More than one process showing up already.
—Introduce full diagram? Show why mathematical examples are SO misleading.
—Resolve: keep program, behaviour (and process), and designation, distinct. Also, all of these from the mathematical structures used to classify them.
5. Next time:
—Today, we started thinking declaratively, but saw right away that this won’t suffice. Next time will start thinking procedurally, but still by and large extensionally, i.e., in terms of the mathematical objects we use to classify the machine state. In due course, will start thinking intensionally as well.
————————————————————————————————————————————
D. Matters of Orchestration
—A&S Book
—Next Thursday: 3-LISP manual, problem-set, primers − all to be distributed;
—Will also arrange sections, set up lab times, etc.