2.b. Computational Semantics
In the model of computation just presented, we said that the attribution of semantic significance to the ingredients of a process was a distinguishing mark of computer science. Informally, no one could possibly understand 3-LISP without knowing that the boolean $T stands for truth and $F for falsity. From the fact that computer science is thought to involve formal symbol manipulation, we admit not only that the subject matter includes symbols, but also that the computations over them occur in explicit ignorance of their semantical weight (you cannot treat a non-semantical object, such as an eggplant or a waterfall, formally; simply by using the term formal you admit that you attribute significance to it on the side). Even at the very highest levels, when saying that a process — human or computational — is reasoning about a given subject, or reasoning about its own thought processes, we implicate semantics, for the term "semantics" can be viewed as a fancy word for aboutness. It is necessary, therefore, to set straight our semantical assumptions and techniques.
2.b.i. Pre-Theoretic Assumptions
In engaging in semantical analysis, our goal is not simply to provide a mathematically adequate specification of the behaviour of one or more procedural calculi — one that would enable us, for example, to prove programs correct, given some specification of what they were designed to do. In particular, by "semantics" we do not simply mean a mathematical formulation of the properties of a system, formulated from a meta-theoretic vantage point. Rather, we take semantics to have fundamentally to do with meaning and reference and so forth — whatever they come to — emerging from the paradigmatic human use of language. We are interested in semantics for two reasons: first, because, as we said at the end of the last section, all computational systems are marked by external semantical attribution, and second, because semantics is the study that will reveal what a computational system is reasoning about, and a theory of what a computational process is reasoning about is a pre-requisite to a proper characterisation of reflection.
Given this agenda, we will require that our semantical analyses of computational systems answer to the following two requirements, which emerge from the facts about processes and structural fields laid out at the end of section 2.a: 1) they should manifest the fact that we understand computational structures in virtue of attributing to them semantical import; 2) they should make evident that, in spite of such attribution, computational processes are formal, in that they must be defined over structures independent of their semantical weight.
Strikingly, from just these two principles we will be able to defend our requirement of a double semantics, since the attributed semantics mentioned in the first premise includes not only a pre-theoretic understanding of what happens to computational symbols, but also a pre-computational intuition as to what those symbols stand for. We will therefore have to make clear the declarative semantics of the elements of (in our case) the 3-LISP structural field, as well as establishing their procedural import.
We will explore these results in more detail below, but in its barest outlines, the form of the argument is quite simple. Most of the results are consequences of the following basic tenet (we have relativised the discussion to 3-LISP, but the same would hold for any other calculus):
What 3-LISP structures mean is not a function of how they are treated by the 3-LISP processor; rather, how they are treated is a function of what they mean.
For example, the expression "(+ 2 3)" in 3-LISP normalises to 5; the reason is that "(+ 2 3)" is understood as a complex name of the number that is the successor of 4. We arrange things — we defined 3-LISP in the way that we did — so that the numeral 5 is the result because we know what (+ 2 3) stands for. To borrow a phrase from Barwise and Perry, our reconstruction is an attempt to regain our semantic innocence — an innocence that still permeates traditional formal systems (logic, the λ-calculus, and so forth), but that has been lost in the attempt to characterise the so-called "semantics" of computer programming languages.
2.b.ii. Semantics in a Computational Setting
In a general sense of the term, semantics can be taken as the study of the relationship between entities or phenomena in a syntactic domain S and corresponding entities in a semantic domain D, as pictured in the following diagram.
(5)  [Figure 2-5: an interpretation function mapping entities of a syntactic domain S into entities of a semantic domain D.]
We call the function mapping elements from the first domain into elements of the second an interpretation function (to be sharply distinguished from what in computer science is called an interpreter). Note that the question of whether an element is syntactic or semantic is a function of the point of view; the syntactic domain for one interpretation function can readily be the semantic domain of another (and a semantic domain may of course include its own syntactic domain).
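By way of a schematic illustration only (the mappings and names below are invented for this sketch, and are no part of 3-LISP itself), an interpretation function can be modelled as nothing more than a mapping from one domain to another; chaining two such mappings shows how the semantic domain of one interpretation can serve as the syntactic domain of the next:

    # Minimal sketch (invented names): interpretation functions as plain mappings.
    # English number-names (syntactic) -> numerals (semantic for this mapping)
    name_to_numeral = {"one": "1", "two": "2", "three": "3"}

    # numerals (now playing the syntactic role) -> numbers (semantic)
    numeral_to_number = {"1": 1, "2": 2, "3": 3}

    def interpret(expression, interpretation_function):
        """Apply an interpretation function; here, simple table lookup."""
        return interpretation_function[expression]

    assert interpret("two", name_to_numeral) == "2"
    assert numeral_to_number[interpret("two", name_to_numeral)] == 2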
Not all relationships, of course, count as semantical; the "grandmother" relationship fits into the picture just sketched, but stakes no claim on being semantical. Though the question of what constraints on such a relationship characterise genuinely semantical ones has often been discussed (compositionality or recursive specifiability, and a certain kind of formal character to the syntactic domain, are among those typically mentioned), we will not pursue such questions here. Rather, we will complicate our diagram as follows, so as to enable us to characterise a rather large class of computational and linguistic formalisms:
(6)  [Figure 2-6: notational expressions N1 and N2 related by Θ to internal structures S1 and S2; S1 and S2 related by Φ to elements D1 and D2 of the semantic domain, and to each other by Ψ.]
N1 and N2 are intended to be notational or communicational expressions, in some externally observable and consensually established medium of interaction, such as strings of characters, streams of words, or sequences of display images on a computer terminal. The relationship Θ is an interpretation function mapping notations into internal elements of some process over which the primary semantical and processing regimens are defined. In first-order logic, S1 and S2 would be something like abstract derivation tree types of first-order formulae.
In contrast, Φ is the interpretation function that makes explicit the standard denotational significance of linguistic terms, relating, we may presume, expressions in S to the world of discourse. The relationship between one’s mental token for T. S. Eliot, for example, and the poet himself, would be formulated as part of Φ. Again, we speak very broadly; Φ is intended to manifest what, paradigmatically, expressions are about, however that might best be formulated (Φ includes for example the interpretation functions of standard model theories). Ψ, in contrast, relates some internal structures or states to others — one can imagine it specifically as the formally computed derivability relationship in a logic, as the function computed by the primitive language processor in a computational machine, or more generally as the function that relates one configuration of a field of symbols to another, in terms of the modifications engendered by some internal processor computing over those states. For mnemonic convenience, we use the name "Ψ" by analogy with psychology, since a study of Ψ is a study of the internal relationships between symbols, all of which are within the machine. The label "Φ", on the other hand, chosen to suggest philosophy, signifies the relationship between a set of symbols and the world. (Philosophy takes you "out of your mind").
Some simple comments. First, N1, N2, S1, S2, D1, and D2 need not all necessarily be distinct: in a case where S1 is a self-referential designator, for example, D1 would be the same as S1; similarly, in a case where Ψ computed a function that was designation-preserving, then D1 and D2 would be identical. Secondly, we need not take a stand on which of Ψ and Φ has a prior claim to being the semantics of S1. In standard logic, Ψ (i.e., derivability: ⊢) is a relationship, but is far from a function, and there is little tendency to think of it as semantical; a study of Ψ is called proof theory. In computational systems, on the other hand, Ψ is typically much more constrained, and is also, by and large, analysed mathematically in terms of functions and so forth, in a manner much more like standard model theories. For discussion, we will refer to the Φ-semantics of a symbol or expression as its declarative import, and refer to its Ψ-semantics as its procedural consequence.
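As a rough, purely illustrative sketch (the names and representations are invented here, and are not the manual's formal apparatus), the three relationships can be kept apart by giving each its own mapping over a toy arithmetic language:

    # THETA: notations (character strings) -> internal structures
    # PHI:   internal structures           -> entities in the world (here, numbers)
    # PSI:   internal structures           -> internal structures (the processor)

    def theta(notation):
        """Parse a notation such as "(+ 2 3)" into an internal structure."""
        tokens = notation.strip("()").split()
        return tokens[0] if len(tokens) == 1 else tuple(tokens)

    def phi(structure):
        """Declarative import: the number a structure designates."""
        if isinstance(structure, str):
            return int(structure)                 # numerals designate numbers
        _, left, right = structure
        return phi(left) + phi(right)             # designation of a sum from its parts

    def psi(structure):
        """Procedural consequence: what the toy processor returns."""
        return structure if isinstance(structure, str) else str(phi(structure))

    s = theta("(+ 2 3)")
    assert phi(s) == 5          # the structure designates the number five
    assert psi(s) == "5"        # the processor returns the numeral "5"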
It is possible to use this diagram to characterise a variety of standard formal systems. In the standard models of the λ-calculus, for example, the designation function Φ takes λ-expressions onto functions; the procedural regimen Ψ, usually consisting of α- and β-reductions, can be shown to be Φ-preserving. Similarly, if in a standard predicate logic we take Φ to be (the inverse of the) satisfaction relationship, with each element of S being a sentence or set of sentences, and elements of D being those possible worlds in which those sentences are true, and similarly take Ψ as the derivability relationship, then soundness and completeness can be expressed as the equation Ψ(S1,S2) ≡ [ D1 ⊆ D2 ]. As for all formal systems (these presumably subsume the computational ones), it is crucial that Ψ be specifiable independent of Φ. The λ-calculus and predicate logic systems, furthermore, have no notion of a processor with state; thus the appropriate Ψ involves what we may call local procedural consequence, relating a simple symbol or set of symbols to another set. In a more complex computational circumstance, as we will see below, it is appropriate to characterise a more complex full procedural consequence involving not only simple expressions, but fuller encodings of the state of various aspects of the computational machine (for example, at least environments and continuations in the typical computational case).
An important consequence of the analysis illustrated in the last figure is that it enables one to ask a question not typically asked in computer science, about the (Φ-) semantic character of the function computed by Ψ.
One of the important properties of 3-LISP is called the normalisation property (the formulation used here is simplified for perspicuity, ignoring contextual relativisation; S is the set of structural field elements):
∀S ∈ S [[ Φ(S) = Φ(Ψ(S)) ] & NORMAL-FORM(Ψ(S)) ]
Diagrammatically, the circumstance it describes is pictured as follows:
(7)  [Figure 2-7: both S and Ψ(S) are mapped by Φ onto the same designation, with Ψ(S) in normal form.]
Such a Ψ, in other words, is always Φ-preserving. It relies, in addition, on a notion of normal form, which we will have to define.
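A toy illustration of the property (the representation is invented, and the contextual relativisation that the real definition requires is suppressed): take digit strings as the only normal-form structures, and check that the processor is Φ-preserving and always lands in normal form:

    def phi(s):                                   # declarative import
        return int(s) if isinstance(s, str) else phi(s[1]) + phi(s[2])

    def psi(s):                                   # toy normalising processor
        return s if isinstance(s, str) else str(phi(s))

    def normal_form(s):                           # here: exactly the numerals
        return isinstance(s, str)

    for s in ["5", ("+", "2", "3"), ("+", ("+", "1", "1"), "3")]:
        assert phi(s) == phi(psi(s))              # PHI(S) = PHI(PSI(S))
        assert normal_form(psi(s))                # NORMAL-FORM(PSI(S))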
In the λ-calculus, Ψ(S) would definitionally be in normal-form, since the concept of that name is defined in terms of the non-applicability of any further β-reductions. In 3-LISP, we will in contrast define normal-formedness in terms of the following three independent properties:
1. A normal-form structure must be context-independent, in the sense of having the same declarative and procedural import independent of its context of use;
2. It must be side-effect free, implying that its procedural treatment will have no effect on the structural field or state of the processor;
3. It must be stable, by which we mean that it must normalise to itself in all contexts.
It then requires a proof that all 3-LISP results (all expressions Ψ(S)) are in normal-form. In addition, from the third property, and this proof that the range of Ψ includes only normal-form expressions, one is able to show that Ψ is idempotent, as was suggested earlier (Ψ = Ψ∘Ψ, or equivalently, ∀S Ψ(S) = Ψ(Ψ(S))) — a property of 3-LISP that has substantial practical benefits.
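The step from stability to idempotence can be made vivid with a deliberately crude stand-in (there is nothing 3-LISP-specific about it): if the processor's range contains only normal-form structures, and normal-form structures are left unchanged by the processor, then processing twice is the same as processing once:

    def psi(s):
        """Stand-in processor: 'normal form' here means upper-case; psi always
        returns a normal form, and is the identity on normal forms (stability)."""
        return s.upper()

    for s in ["abc", "aBc", "ABC"]:
        assert psi(psi(s)) == psi(s)              # PSI = PSI o PSI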
In designing 3-LISP we insisted that the structural category of each normal form designator be determinable from the type of object designated, independent of the structural type of the original designator, and independent as well of any of the machinery involved in implementing Ψ. For example, we will be able to demonstrate that any term that designates a number will be taken by Ψ into a numeral, since numerals will be defined as the normal-form designators of numbers. In other words, from just the designation of a term X the structural category of Ψ(X) will be predictable, independent of the form of X itself (although the token identity of Ψ(X) cannot be predicted on such information alone, since normal-form designators are not necessarily unique or canonical). This category result, however, also has to be proved: we call it the semantical type property.
That normal-form function designators cannot be canonical arises, of course, from computability considerations: one cannot decide in general whether two expressions designate the same function, and therefore if normal-form function designators were required to be unique it would follow that expressions that designated functions could not necessarily be normalised. Instead of pursuing that approach, we will adopt a non-unique notion of normal-form function designator, still satisfying the three requirements specified above: such a designator will by definition be called a closure.
Some 3-LISP examples will illustrate all of these points. We include the numbers in our semantical domain, and have a syntactic class of numerals, which are taken to be normal-form number designators. The numerals are canonical (one per number), and as usual they are side-effect free and context independent; thus they satisfy the requirements on normal-formedness. The semantical type theorem says that any term that designates a number will normalise to a numeral: thus if X designates five and Y designates six, and if + designates the addition function, then we know (can prove) that (+ X Y), since it designates eleven, will normalise to the numeral 11. Similarly, there are two boolean constants $T and $F that are normal-form designators of Truth and Falsity, and a canonical set of rigid structure designators called handles that are normal-form designators of all internal structures (including other handles). And so on: closures are normal-form function designators, as mentioned in the last paragraph; we will also have to specify normal-form designators for sequences and other types of mathematical objects included in the semantic domain.
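The semantical type property can be summarised in a small table; the sketch below uses invented names, and the entry for sequences is an assumption anticipating material later in the manual. It shows the sense in which the category of Ψ(X) is predictable from the type of X's designation alone:

    # Type of object designated  ->  structural category of its normal-form designator
    NORMAL_FORM_CATEGORY = {
        "number":             "numeral",
        "truth value":        "boolean ($T or $F)",
        "internal structure": "handle",
        "function":           "closure",
        "sequence":           "rail of normal-form elements (assumed here)",
    }

    def predicted_category(designation_type):
        """Predict the structural category of PSI(X) from X's designation alone."""
        return NORMAL_FORM_CATEGORY[designation_type]

    assert predicted_category("number") == "numeral"   # e.g. (+ X Y) normalises to 11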
2.b.iii. Recursive and Compositional Formulations
The previous sections have suggested briefly the work that we would like our semantics to do; they do not reveal how this is to be accomplished. Beginning very simply, standard approaches suffice. For example, we begin with declarative import (Φ), and initially posit the designation of each primitive object type (saying for instance that the numerals designate the numbers, and that the primitively recognised closures designate a certain set of functions, and so forth), and then specify recursive rules that show how the designation of each composite expression emerges from the designation of its ingredients. Similarly, in a rather parallel fashion we can specify the procedural consequence (Ψ) of each primitive type (saying in particular that the numerals and booleans are self-evaluating, that atoms evaluate to their bindings, and so forth), and then once again specify recursive rules showing how the value or result of a composite expression is formed from the results of processing its constituents.
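The recursive pattern just described can be sketched for a tiny, purely extensional fragment (the representation and names are invented, and atoms' bindings are assumed to be already in normal form):

    # Structures: digit strings are numerals; other strings are atoms;
    # tuples ('+', e1, e2) are composite expressions.

    def phi(expr, env):
        """Declarative import: what the expression designates in environment env."""
        if isinstance(expr, str) and expr.isdigit():
            return int(expr)                       # numerals designate numbers
        if isinstance(expr, str):
            return phi(env[expr], env)             # atoms designate what their bindings designate
        _, e1, e2 = expr
        return phi(e1, env) + phi(e2, env)         # composite: built from its ingredients

    def psi(expr, env):
        """Local procedural consequence: what the expression normalises to."""
        if isinstance(expr, str) and expr.isdigit():
            return expr                            # numerals are self-evaluating
        if isinstance(expr, str):
            return env[expr]                       # atoms evaluate to their bindings
        _, e1, e2 = expr
        return str(int(psi(e1, env)) + int(psi(e2, env)))

    env = {"X": "3", "Y": "4"}
    assert phi(("+", "X", "Y"), env) == 7          # designates the number seven
    assert psi(("+", "X", "Y"), env) == "7"        # returns the numeral 7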
If we were considering only purely extensional, side-effect free, functional languages, the story might end there. However there are a variety of complications that will demand resolution, of which two may be mentioned here. First, 3-LISP is not purely extensional: there are intensional constructs of various sorts (LAMBDA, for example, which we will view as a standard intensional procedure, rather than as a syntactic mark). As in any system, the ability to deal with so-called intensional constructs causes havoc in the semantics, requiring that extensional procedures be recast in appropriate ways. This is a minor complexity, but no particular difficulty emerges.
The second difficulty has to do with side-effects and contexts. All standard model-theoretic techniques of course allow for the general fact that the semantical import of a term may depend in part on the context in which it is used (variables are the classic simple example). However the question of side-effects — which are part of the total procedural consequence of an expression — impinges on the appropriate context for declarative purposes as well as for procedural. For example, in a context in which X is bound to the numeral 3 and Y is bound to the numeral 4, it is straightforward to say that the term (+ X Y) designates the number seven, and returns the numeral 7. However, consider the semantics of the more complex:
(+ 3 (BLOCK (SET Y 14) Y))
It would be hopeless (to say nothing of false) to have the formulation of declarative import ignore procedural consequence, and claim that the above designates seven, even though it patently returns the numeral 17 (although note that we are under no pre-theoretic obligation to make the declarative and procedural stories cohere). On the other hand, to include the procedural effect of the SET within the specification of Φ would seem to violate the ground intuition which argued that the designation of this term, and the structure to which it evaluates, are different.
The approach we adopt is one in which we define what we call a general (or total) significance function Σ, which embodies declarative import (designation), local procedural consequence (what an expression evaluates to, to use LISP jargon), and full procedural consequence (the complete contextual effects of an expression, including side-effects to the environment, modifications to the field, and so forth). Only the total significance of our dialects will be strictly compositional; the components of that total significance, such as the designation, will be recursively specified in terms of the designation of the constituents, relativised to the total context of use specified by the encompassing function. In this way we will be able to formulate precisely the intuition that the aforementioned expression designates seventeen, as well as returning the corresponding numeral 17.
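A schematic sketch of such a function (everything here — the representation, the threading of context, and the treatment of SET and BLOCK — is invented for illustration) shows how designation, local result, and contextual effect can be delivered together, so that the example above comes out designating seventeen and returning the numeral 17:

    def sigma(expr, env):
        """Toy 'general significance': return (designation, result, new environment)."""
        if isinstance(expr, str) and expr.isdigit():
            return int(expr), expr, env                      # numerals
        if isinstance(expr, str):
            return int(env[expr]), env[expr], env            # atoms: current binding
        op, *args = expr
        if op == "+":
            d1, _, env = sigma(args[0], env)
            d2, _, env = sigma(args[1], env)                 # context threads left to right
            return d1 + d2, str(d1 + d2), env
        if op == "SET":
            var, rhs = args
            d, r, env = sigma(rhs, env)
            new_env = dict(env)
            new_env[var] = r                                 # side-effect: rebind var
            return d, r, new_env
        if op == "BLOCK":
            d = r = None
            for form in args:
                d, r, env = sigma(form, env)                 # significance of the last form
            return d, r, env

    env0 = {"Y": "4"}
    expr = ("+", "3", ("BLOCK", ("SET", "Y", "14"), "Y"))
    designation, result, env1 = sigma(expr, env0)
    assert (designation, result, env1["Y"]) == (17, "17", "14")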
2.b.iv. The Role of a Declarative Semantics
One brief final point about this double semantics must be brought out. It should be clear that it is impossible to specify a normalising processor without a pre-computational theory of semantics. If you do not have an account of what structures mean, independent of how they are treated by the processor, there is no way to say anything substantial about the semantical import of the function that the processor computes. On the standard approach, for example, it is impossible to say that the processor is correct, or semantically coherent, or semantically incoherent, or anything: it is merely what it is. Given some account of what it does, one can compare this to other accounts: thus it is possible for example to prove that a specification of it is correct, or that an implementation of it is correct, or that it has certain other independently definable properties (such as that it always terminates, or uses certain resources in certain ways). In addition, given such an account, one can prove properties of programs written in the language — thus, from a mathematical specification of the processor of ALGOL, plus the listing of an ALGOL program, it might be possible to prove that that program met some specifications (such as that it sorted its input, or whatever). However none of these questions is the question we are trying to answer, namely: what is the semantical character of the processor itself?
2.b.v. On Programming Language Semantics
Programming language semantics, for reasons that can at least be explored, if not wholly explained, has focused primarily on Ψ, although in ways that tend to confuse it with Φ. Except for PROLOG, which borrows its Φ straight from a subset of first-order logic, it is unusual to find a semantical account of a programming language that gives independent accounts of Φ and Ψ. There are complexities, furthermore, in knowing just what the proper treatment of general languages should be. It can be argued that the notion of a program is inherently defined as a set of expressions whose (Φ-) semantic domain includes data structures (and set-theoretic entities built up over them). In other words, in a computational process that deals with finance, say, the general data structures will likely designate individuals and money and relationships among them, but the terms in that part of the process called a program will not designate these people and their money, but will instead designate the data structures that designate people and money (plus of course relationships and functions over those data structures). Even on a declarative view like ours, in other words, the appropriate semantic domain for programs is built up over data structures — a situation strikingly like the standard semantical accounts that take abstract records or locations or whatever as elements of the otherwise mathematical domain for programming language semantics. It may be this fact — that all base terms in programs are meta-syntactic — that has spawned the confusion between operations and reference in the computational setting.
Although the details of a general story remain to be worked out, the LISP case is instructive, by way of suggestion as to how a more complete computational theory of language semantics might go. In particular, because of the context relativity and non-local effects that can emerge from processing a LISP expression, Φ is not specifiable in a strict compositional way. Ψ — when taken to include the broadest possible notion that maps entire configurations of the field of symbols and of the processor itself onto other configurations and states — is of course recursively specifiable (the same fact, in essence, as saying that LISP is a deterministic formal calculus). A pure characterisation of Ψ without a concomitant account of Φ, however, is unmotivated — as empty as a specification of a derivability relationship would be for a calculus for which no semantics had been given. Of more interest is the ability to specify what we call a general significance function Σ, that recursively specifies Ψ and Φ together. In particular, given any expression S1, any configuration of the rest of the symbols, and any state of the processor, the function Σ will specify the configuration and state that would result (i.e., it will specify the use of S1), and also the relationship to the world that the whole signifies. For example, given a LISP expression of the form (+ 1 (PROG (SETQ A 2) A)), Σ would specify that the whole expression designated the number three, that it would return the numeral "3", and that the machine would be left in a state in which the binding of the variable A was changed to the numeral "2". A modest result; what is important is merely a) that both declarative import and procedural significance must be reconstructed in order to tell a full story about LISP; and b) that they must be formulated together.
Several points emerge from analyses developed within this framework:
a. In most programming languages, Θ can be specified compositionally and independently of Φ or Ψ. In the case of formal systems, Θ is often context free and compositional, but not always (reader macros can render it opaque, or at least intensional, and some languages such as ALGOL are apparently context-sensitive). It is noteworthy, however, that there have been computational languages for which Θ could not be specified independently of Ψ — a fact that is often stated as the fact that the programming language "cannot be parsed except at runtime" (TECO and the first versions of SMALLTALK had this character).
b. Since LISP is computational, it follows that a full account of its Ψ can be specified independent of Φ; this is in essence the formality condition. It is important to bring out, however, that a local version of Ψ will typically not be compositional in a modern computational formalism, even though such locality holds in purely extensional, context-free, side-effect-free languages such as the λ-calculus.
c. It is widely agreed that Ψ does not uniquely determine Φ. However this fact is compatible with our foundational claim that computational systems are distinguished in virtue of having some version of Φ as part of their characterisation. A very similar point can be made for logic: although any given logic can (presumably) be given a mathematically-specified model theory, that theory doesn’t typically tie down what is often called the standard model or interpretation — the interpretation that we use. This fact does not release us, however, from positing as a candidate logic only a formalism that humans can interpret.
d. The declarative interpretation Φ cannot be wholly determined independent of Ψ, except in purely declarative languages (such as the λ-calculus and logic and so forth). This is to say that without some account of the effect on the processor of one fragment of a whole linguistic structure, it may be impossible to say what that processor will take another fragment as designating. The use of SETQ in LISP is an example.
This last point needs a word of explanation. It is of course possible to specify Φ in mathematical terms without any explicit mention of a Ψ-like function; the approach we use in LISP defines both Ψ and Φ in terms of the overarching function Σ mentioned above, and we could of course simply define Φ without defining Ψ at all. Our point, rather, is that any successful definition of Φ will effectively have to do the work of Ψ, more or less explicitly, either by defining some identifiable relationship, or else by embedding that relationship within the meta-theoretic machinery. We are arguing, in other words, only that the subject we intend Ψ to cover must be treated in some fashion or other.
What is perhaps surprising about all of this machinery is that it must be brought to bear on a purely procedural language — all three relationships (Θ, Φ, and Ψ) figure crucially in an account even of LISP.