CSLI RESEARCH SEMINARS FALL 1984

CONTENTS
1. Bush: A Generalized Framework for Speech Recognition
2. Cohen: Speech Acts as Summaries of Plans
3. Hobbs: Report from Commonsense Summer
4. Grosz: The Structures of Discourse Structure
5. McCarthy: Natural Language from the Standpoint of Artificial Intelligence
6. Rosenschein:
7. Sells: Intensionality and Scope in Relative Clauses
8. Smith: Abstract Data Types and the Representational Theory of Mind
9. Uszkoreit: Morphology vs. Syntax: The Case of German Separable Prefix Verbs
10. Witkin: Modeling Primitives for Perception and Graphics
11. Withgott: Parsing Acoustic Events
12. Zalta: Axiomatic Foundations for the Semantics of Natural Language

A Generalized Framework for Speech Recognition
Marcia Bush

This talk will describe a framework for speaker-independent, large-vocabulary and/or continuous speech recognition being developed at Schlumberger (Fairchild). The framework consists of three components: 1) a finite-state pronunciation network which models relevant acoustic-phonetic events in the recognition vocabulary; 2) a set of generalized acoustic pattern matchers; and 3) an optimal search strategy based on a dynamic programming algorithm. The framework is designed to accommodate a variety of (typically disparate) approaches to the speech recognition problem, including spectral template matching, acoustic-phonetic feature extraction, and lexical pruning based on broad-category segmentation. A working system developed within this framework and tailored to the digits vocabulary will also be described. The system achieves high recognition accuracy on a corpus spoken by approximately 250 talkers from 22 "dialect groups" within the continental United States.

Speech Acts as Summaries of Plans
Phil Cohen

Cohen's main claim was that illocutionary act recognition is not necessary for engaging in communicative interaction. Rather, engaging in such interaction requires intent/plan recognition. In support of this thesis, he presented a formalism, being developed with Hector Levesque (Univ.
of Toronto), that showed how illocutionary acts can be defined in terms of plans --- i.e., as beliefs about the conversants' shared knowledge of the speaker's and hearer's goals and the causal consequences of achieving those goals. In this formalism, illocutionary acts are no longer conceptually primitive, but rather amount to theorems that can be proven about a state of affairs. As an illustration, the definition of a direct request was derived from an independently motivated theory of action, and thus did not need to be stipulated. Cohen argued that just as one need not determine if a proof corresponds to a prior lemma, a hearer need not actually characterize the consequences of each utterance in terms of the illocutionary act theorems, but can simply infer and respond to the speaker's goals. However, the hearer can retrospectively summarize a complex of utterances as satisfying an illocutionary act. It was also claimed that the framework can characterize a range of indirect speech acts as lemmas, which can be derived from and integrated with plan-based reasoning. The discussant, Ivan Sag, related the theory to Gricean maxims of conversation, and to the "standard" view of how pragmatics fits into a theory of linguistic communication.

The Structures of Discourse Structure
Barbara J. Grosz

Grosz introduced a theory of discourse structure which addressed two questions: What is discourse? What is discourse structure? It was argued that discourse structure must be described in terms of attention and intention. Intention and attention are both non-linguistic notions. The former plays a role in defining discourse coherence and in providing a coherent notion of the term "discourse" itself. A main thesis of the theory is that the structure of any discourse is a composite of three interacting constituents: the structure of the actual sequence of utterances in the discourse, a structure of intentions, and an attentional state.
Each of these affects and is affected by the individual utterances in the discourse. Grosz demonstrated how the separation of discourse structure into these three components permits generalizing and simplifying the interpretation of a number of previous results and, she argued, is essential in explaining certain discourse phenomena, including interruptions and certain types of referring expressions. The theory is being developed with C. Sidner from BBN. Ray Perrault served as the discussant.

Report from Commonsense Summer
Jerry R. Hobbs

``Commonsense Summer'' was a summer-long workshop sponsored by CSLI and held at SRI International. It has long been agreed that intelligent behavior requires a great deal of knowledge about the commonsense world, but before this year no one had embarked on a large-scale effort to encode this knowledge. The aim of Commonsense Summer was to do the first three months of such an effort. Eight graduate students from several universities participated in the workshop full-time, and a number of other active researchers in the fields of knowledge representation, natural language, and vision participated as well. An attempt was made to axiomatize in formal logic significant amounts of commonsense knowledge about the physical, psychological, and social worlds, concentrating on eight domains: spatial relationships, shape, motion, properties of materials, belief states, certain speech acts, relations between textual entities and entities in the world, and responsibility. In this talk I will discuss the problem of encoding commonsense knowledge in general, outline the approach taken in the workshop, and describe some of the results of the summer. Johan de Kleer gave a critique of the general program, probing the question of how much one learns from this approach in comparison to traditional approaches in physics.
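To give a flavor of the kind of formalization attempted, first-order axioms over the eight domains might look like the following. These two axioms are purely illustrative, invented here for exposition; they are not taken from the workshop's actual axiomatizations:

```latex
% Transitivity of spatial containment (illustrative axiom)
\forall x\,\forall y\,\forall z\;
  \bigl[\mathit{in}(x,y) \wedge \mathit{in}(y,z) \rightarrow \mathit{in}(x,z)\bigr]

% A simple property-of-materials axiom (illustrative)
\forall x\;\bigl[\mathit{liquid}(x) \rightarrow \lnot\,\mathit{rigid}(x)\bigr]
```

Even such small axioms raise the questions the workshop confronted: choosing predicates, fixing their intended interpretations, and keeping axioms from different domains mutually consistent.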
Natural Language from the Standpoint of Artificial Intelligence
John McCarthy

An intelligent individual, human or computer program, must act on the basis of what it believes in advance, modified by what it observes and what it learns from linguistic communication. McCarthy argued that thinking about how the achievement of goals is helped by communication leads to a somewhat different point of view from one derived mainly from study of the corpus of spoken and written language. Namely:
1. Communication should be regarded as a modifier of state of mind.
2. The most basic form of communication is the single-word sentence uttered under conditions in which the speaker and hearer share enough knowledge so that the single word suffices. The complete sentence develops under conditions in which the speaker and the hearers share less context.
3. Many of the characteristics of language are determined by so-far-unrecognized requirements of the communication situation. They will apply to machines as well as people.
4. An effort to make a common Business Communication Language for commercial communication among machines belonging to different organizations exhibits interesting problems of the semantics of language.

The seminar last week was given by Stan Rosenschein, from the Artificial Intelligence Center at SRI. Rosenschein discussed natural language processing and considered three distinct roles logic might play, not only in the description of individual utterances but in a general description of the exchange of information. Bob Moore was the discussant.

I will present a part of my dissertation, ``Syntax and Semantics of Resumptive Pronouns,'' that deals with the semantics of relative clauses.
Montague's treatment of intensionality runs into problems with the kinds of intensional examples I will discuss, which are of the form ``the unicorn that John seeks (should be 5 years old).'' This kind of example, I will argue, does indeed have an intensional reading, one that Montague's treatment cannot get. Another kind of example with a quantifier is something like ``the grade that every student gets (is determined solely by the final exam),'' where intuitively ``every student'' gets widest scope. This is problematic in that relative clauses are normally scope islands. I will propose an analysis of these data in the framework of Discourse Representation Structures as developed by Hans Kamp and use data from English and Hebrew to support the particular theoretical assumptions. --Peter Sells

Peter Sells of CSLI discussed problems of intensionality and scope in relative clauses and proposed the beginnings of an account of them in terms of Discourse Representation Structures, which, it was claimed, yield a more satisfactory account than is available in Montague Grammar. Edit Doron and Lauri Karttunen were the discussants and pointed out several problem areas in the analysis and directions for future research.

Abstract Data Types and the Representational Theory of Mind
Brian Smith

Though the two were developed largely independently, there is a striking similarity between i) the development, in theoretical computer science, of the theory of abstract data types; and ii) the push, in AI and the philosophy of mind, towards a less explicitly representational model of computation (the latter most recently endorsed, in various ways, by Barwise and Perry and by Rosenschein and Pereira). On the other hand, there is also a striking DISsimilarity between the two approaches, having to do with what each of them calls "semantics". In this talk I will attempt to clarify the relationship between the two approaches by applying them both to the same simple example.
In conclusion I will suggest that what we are all headed towards is a representational, but non-syntactic, model of computation and/or mind.

Brian Smith, of Xerox PARC, led last week's research seminar in a discussion of abstract data types and the representational theory of mind. Smith pointed out similarities between abstract data types and recent, less explicitly representational, models of computation, as well as certain differences in their "semantics". He suggested that the common direction is a representational, but non-syntactic, model of computation and/or mind. Jon Barwise was the discussant.

Hans Uszkoreit will speak this Thursday, August 30, at 2 p.m. on "Morphology vs. Syntax: The Case of German Separable Prefix Verbs". The talk will be held in the trailer's conference room next to Ventura.

ABSTRACT

The phenomenon of the separable prefix verb in German poses a challenging problem in distinguishing the components of word structure (morphology) and sentence structure (syntax). One of the most solid generalizations about the separation of syntax and morphology, known as the Lexical Integrity Hypothesis, is that syntactic processes should be blind to the internal structures of words, which are created by the morphology. On the one hand, the combination of verb and prefix appears to form a lexical unit: very often the meaning of the complex verb cannot be compositionally derived from the meanings of prefix and stem. Yet on the other hand, prefix and verb often behave syntactically as separate constituents. There is no contradiction if one assumes that the lexical processes that combine verbs and separable prefixes do not build structure but instead use a rich system of categorial information like the one available in GPSG to modify the verb entry. The prefix is then introduced separately in the syntax. The Lexical Integrity Hypothesis does not need to be relaxed.
It will be shown why neither a purely syntactic nor a purely morphological solution to the problem can account for the syntactic and semantic facts. The "mixed analysis" smoothly integrates with a GPSG treatment of German word order. The strategy also carries over to similar phenomena in other languages, such as English verb-particle constructions.

Modeling Primitives for Perception and Graphics

In the first weekly research seminar for members of CSLI, Andy Witkin of Fairchild Research Center presented work on visual models intended to support both image analysis and image generation. Witkin started by introducing his notion of generic modeling through examples from graphics. He described, for example, recursive particle process models (developed at LucasFilm) that generate diverse realistic natural forms. By adjusting its parameters, a single particle model can be used to synthesize fire, splashing water, and plants. The hope is that researchers can devise a small "palette" of such generic models that can be combined to both create and perceive a wide range of complex structures. Witkin went on to describe work at Fairchild on modeling and perception of flow patterns, including striated structures such as wood-grain. Such phenomena can be modeled uniformly by means of deformations applied to linear or isotropic patterns. Decomposing a flow pattern into a "straight" pattern and a deformation supports physically interesting transformations (such as undoing the deformation) and measures of similarity (e.g., pattern A is a deformed version of pattern B). For example, a wood-grain pattern was analyzed (with Michael Kass) by making point-by-point estimates of flow direction, and obtaining a deformation grid by numerical integration. This analysis permits abstract transformations such as removing a knot and the deformation it produced, showing what the grain would have looked like if the knot hadn't been there.
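The pattern-plus-deformation decomposition can be sketched in one dimension as follows. Everything specific here, the deformation function, its amplitude, and the use of finite differences in place of image-based flow-direction estimates, is an assumption of this illustration, not a detail from the talk:

```python
import math

N = 200          # number of sample points
h = 1.0 / N      # grid spacing

def deform(x):
    # assumed smooth, monotone deformation warping a "straight" pattern
    return x + 0.1 * math.sin(2 * math.pi * x)

# observed flow pattern: uniform stripes seen through the deformation
observed = [math.cos(2 * math.pi * 10 * deform(i * h)) for i in range(N)]

# point-by-point estimates of local stretch (a stand-in for the talk's
# image-based estimates of flow direction), via finite differences
stretch = [(deform((i + 1) * h) - deform(i * h)) / h for i in range(N)]

# deformation grid recovered by numerical integration (trapezoid rule)
grid = [0.0]
for i in range(1, N):
    grid.append(grid[-1] + 0.5 * (stretch[i - 1] + stretch[i]) * h)

# grid[i] now approximates deform(i*h); resampling `observed` at points
# where `grid` is uniformly spaced would "undo" the deformation and
# recover the straight stripes.
```

In two dimensions the same idea applies with a vector field of flow directions and a deformation grid obtained by integrating it, which is what makes transformations such as removing a knot and its deformation expressible.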
Stanley Peters commented on the task of establishing what kinds of patterns we abstract out from the "retinal situation", i.e., the visual input the human being has at a given place and instant of time.

Summary of last week's seminar
"Parsing Acoustic Events"
Meg Withgott, CSLI

Can a language-independent representation be formulated of the acoustic events present in natural, continuous speech? This report argues that it can, and that such a representation can serve as input to a general parser which will recover linguistic structure, leading to the formulation of word hypotheses. This decomposition of the problem permits a representation tha