Jim des Rivières, Functional Programming: A Prospectus (Final Draft)
—————————————————————————————————————————————————
To be presented at the IEEE Spring COMPCON ’84
—————————————————————————————————————————————————

Functional Programming: A Prospectus

Jim des Rivières

Xerox Palo Alto Research Center
3333 Coyote Hill Road, Palo Alto, CA 94304; and
Center for the Study of Language and Information
Stanford University, Stanford, CA 94305

—————————————————————————————————————————————————
Filed: [phylum]<desrivieres>fp-prospectus.bravo
Last Edited: December 7, 1983 7:08 AM
—————————————————————————————————————————————————
Abstract
A prospectus is presented on a venture that has been afoot for over twenty years. Functional programming languages are those whose primary control structure is the application of a function to arguments to yield a result. This paper explores some of the issues surrounding the design, implementation, and use of functional programming languages.
Introduction
Functional, or function-oriented programming languages are those whose primary control structure is the application of a function to arguments to yield a result. In contrast to conventional imperative (or procedural) programming languages, such as Pascal, C, and FORTRAN, pure functional programming languages have no analogue of assignment statements or side effect operations (i.e., constructs that change the state of the computation). The whole notion of state is as foreign to the pure functional programming world as it is to conventional mathematics.
There are two attractive aspects of functional programming that are not shared with mainstream programming languages. First, side effects can make programs difficult to understand, debug, maintain, and prove properties about (either formally or informally). By shunning side effects, and thereby returning to the rules of conventional mathematics, functional programming frees itself from a prime contributor to program inscrutability.
Second, eliminating side effects is tantamount to giving up the notion of change, and giving up the notion of change sets one free from the concept of time. With programs that do not depend on time, one has considerable freedom in distributing the parts of the computation through time. It is even perfectly reasonable to evaluate several expressions at the same time. Contrast this to the imperative programming languages where each statement potentially alters the state of the computation, making it essential that the statements of a program be executed in a more-or-less fixed serial order. It is especially significant that the potential for parallelism is inherent to functional programs, as opposed to the kind of parallelism found in conventional programming languages that the programmer has to ask for explicitly. Functional programming languages are offered as one way to harness some of the power of new machines with significant parallel processing capabilities, made possible by VLSI technology [1, 2].
This paper will explore some of the issues surrounding the design, implementation, and use of functional programming languages. It begins by presenting FPL, a simple functional programming language that will serve to ground later discussions. Then it is shown how various issues arise because of the lack of side effects. The last section presents other aspects of functional programming languages that have received attention.
Functional Programming
In conventional mathematics one writes
ssq(x, y) =def x² + y²
to describe the abstract function that sums the squares of its two arguments. The function symbol ‘ssq’ is introduced to name this function; ‘x’ and ‘y’ are variables used to name the arguments; ‘+’ and ‘²’ are standard names for the addition and squaring functions, respectively. One writes ‘ssq(2,3)’ to signify the application of the sum of squares function to the arguments 2 and 3. This function application yields 13 because ssq(2,3) = 2² + 3² = 13.
FPL is a programming language based on the notions of function and function application. Notational issues involving operator precedence and the like are sidestepped by using, in place of this traditional notation, the fully-parenthesized prefix notation employed by the LISP family of languages [8, 9, 10, 11]. In FPL, the above definition would be written
(define ssq
(lambda [x y] (+ (sqr x) (sqr y))))
and the application of the function will be written (ssq 2 3). (For those unfamiliar with Church's λ-notation, read (lambda [x y] ...) as "the (nameless) function of two arguments, x and y, that yields ...".)
Some function symbols, like + and sqr, name primitive functions for manipulating numbers. Along with numbers, FPL has other types of data objects, including truth values (named true and false), and sequences. Expressions of the form [x1 x2 ... xn] designate sequences whose elements are designated by the sub-expressions xi. For example, [1 2 (+ 1 2)] designates the sequence whose elements are the numbers one, two, and three; [] designates the empty sequence. The elementary functions for manipulating sequences, named first, rest, cons, and empty?, are illustrated in the following examples:
(first [1 2 3]) = 1
(rest [1 2 3]) = [2 3]
(cons 0 [1 2 3]) = [0 1 2 3]
(empty? []) = true
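These four primitives behave like pure operations on immutable sequences, so they can be mimicked in most languages. A minimal Python sketch for comparison (tuples stand in for FPL sequences, and the name empty replaces empty?, which is not a legal Python identifier):

```python
# Pure sequence primitives in the style of FPL, modeled on immutable tuples.
def first(s):
    return s[0]            # (first [1 2 3]) = 1

def rest(s):
    return s[1:]           # (rest [1 2 3]) = [2 3]

def cons(x, s):
    return (x,) + s        # (cons 0 [1 2 3]) = [0 1 2 3]

def empty(s):
    return s == ()         # (empty? []) = true

print(first((1, 2, 3)))    # 1
print(rest((1, 2, 3)))     # (2, 3)
print(cons(0, (1, 2, 3)))  # (0, 1, 2, 3)
print(empty(()))           # True
```

None of these functions mutates its argument; each returns a fresh value, just as the FPL primitives do.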
The conditional expression (if e1 e2 e3) is used to select the value of e2 or e3 depending on whether the truth value designated by e1 is true or false, respectively.
Recursive function definitions are allowed. That is, it is permissible to use the name of the function being defined within the body of its own λ-expression, as in the following definition of the factorial function.
(define fact
(lambda [n]
(if (zero? n)
1
(* n (fact (- n 1))))))
Recursion makes it possible for functional programming languages like FPL to get by without special iteration constructs, such as for and while statements.
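The factorial definition transcribes directly into any language with recursion; a Python sketch, for comparison, with no loop constructs:

```python
def fact(n):
    # Mirrors (if (zero? n) 1 (* n (fact (- n 1)))).
    return 1 if n == 0 else n * fact(n - 1)

print(fact(5))  # 120
```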
Some of the more common patterns of recursion/iteration, such as applying a function to each element of a sequence, can be captured with functions that take functions as arguments or yield functions as results, commonly called higher order functions.
(define apply-to-all
(lambda [f]
(lambda [s]
(if (empty? s) s
(cons (f (first s))
((apply-to-all f) (rest s)))))))

(define firsts (apply-to-all first))
(firsts [[1 2] [3 4] [5 6]])
= [1 3 5]

(define add1-to-all
(apply-to-all (lambda [x] (+ x 1))))
(add1-to-all [1 2 3])
= [2 3 4]
The argument to the higher order function named apply-to-all is termed a functional argument since it is expected to designate a function. The value returned is also a function, one that depends on the argument to apply-to-all. While functional arguments are a feature commonly found in conventional programming languages (e.g., Pascal), functions that yield other functions as their result are rare.
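In Python, apply-to-all corresponds to a function that builds and returns another function. A sketch mirroring the recursive FPL definition (tuples again stand in for sequences; built-in map is deliberately avoided):

```python
def apply_to_all(f):
    # Returns a new function that maps f over a sequence recursively,
    # following the FPL definition element by element.
    def mapper(s):
        if not s:                       # (empty? s) => return s unchanged
            return s
        return (f(s[0]),) + mapper(s[1:])   # cons (f (first s)) onto the mapped rest
    return mapper

firsts = apply_to_all(lambda pair: pair[0])
print(firsts(((1, 2), (3, 4), (5, 6))))  # (1, 3, 5)

add1_to_all = apply_to_all(lambda x: x + 1)
print(add1_to_all((1, 2, 3)))            # (2, 3, 4)
```

Note that apply_to_all itself never touches a sequence; it only manufactures mappers, which is exactly the sense in which it yields a function as its result.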
Computing with Functions
As described so far, FPL is very much in the spirit of the language of mathematics: it provides a means of describing abstract mathematical functions. However, if that were the entire story of the language, there would be no basis for a claim that FPL was a programming language. To be worthy of that title, it is essential that the descriptions of functions also prescribe computations. Consider the following definition of square root.
sqrt(x) =def the y > 0 such that y² = x
While this may be a complete and precise characterization of the square root function, this definition cannot be viewed in any computationally useful way as prescribing how one could compute a square root. Designers of functional programming languages must be careful to provide only those modes of expression that can be viewed as prescribing a way of computing the value of an application of a function.
For functional programming languages like FPL, the prime notion of computing is the process of simplifying expressions. Not only do we say that (+ (fact 2) 2) designates the number four, but it is also expected that some computing device can simplify this expression to the expression 4, which is the simplest expression that designates the number four. (Note that while there is a principled difference between the FPL expression 4 and the abstract number four, no real problems arise by blurring this distinction.) The natural unit of work is the reduction of a function application expression by substituting the arguments for occurrences of the variables within the body of the λ-expression. For example, the expression
((lambda [a b] (f a a b)) x y)
simplifies to the expression (f x x y) in a single reduction step. Reductions involving primitive functions usually produce expressions that cannot be simplified further; e.g., (+ 2 3) reduces to the expression 5. It will usually require many reduction steps to convert an expression to its simplest form, and there are usually many different ways to proceed. An all-important result, first proven by Church and Rosser for the λ-calculus [4] (the original functional programming language) but also applicable to languages like FPL, is that the order in which reductions are carried out does not make any difference in the resulting expression (although some orders may encounter errors or fail to terminate). In particular, since there are no side effects, one is even free to perform several reduction steps at the same time. This is the ultimate key to efficiency since a machine might be able to assign a different processing element to each expression that needs to be reduced.
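The order-independence can be seen concretely: with no side effects, reducing the argument expressions of an application in either order yields the same value. A small Python illustration (Python serving purely as a stand-in for an FPL processor):

```python
def fact(n):
    return 1 if n == 0 else n * fact(n - 1)

# Consider (+ (fact 2) (fact 3)). Reduce the left argument first...
left_first = (lambda a: (lambda b: a + b)(fact(3)))(fact(2))

# ...or the right argument first. Absent side effects, the results agree.
right_first = (lambda b: (lambda a: a + b)(fact(2)))(fact(3))

print(left_first, right_first)  # 8 8
```

With assignment statements in play, such a reordering could change the answer; here it cannot, which is precisely what licenses parallel reduction.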
This concludes the introduction to FPL. The question of precisely what will be the prescribed reduction order for FPL will remain unanswered. In the next section, it will be shown that there are at least two distinct solutions.
Basic Issues
There are a number of issues that arise due primarily to freedoms inherent in a language without side effects. Although several of the more important issues are discussed under separate headings in this section, it will become apparent that there are interdependencies among them. (For a more complete picture of the problems currently receiving attention, the reader may wish to consult the proceedings to some recent major functional programming conferences and workshops [3, 5, 6, 7].)
Strict vs. Non-Strict Functions
A function is said to be strict if it yields a meaningful value only when presented with well-defined argument values.
(define first-argument (lambda [x y] x))
(first-argument 100 (fact -1))
If the function first-argument is non-strict, then the second argument to this function is irrelevant, and the above application would yield 100 despite the fact that (fact -1) does not signify any meaningful value. On the other hand, if it is considered to be strict, then both arguments to it must be meaningful, and the above application should not yield a meaningful value.
The strict vs. non-strict decision is concerned with the functions designated by expressions in the language, and not about how those expressions are viewed computationally. However, the strictness issue is intimately related to the choice of reduction strategy. For an application expression involving a strict function, it is necessary to fully simplify all argument expressions. By doing so, the well-definedness of all arguments is verified; should one of the argument expressions fail to terminate, it is straightforward to ensure that no meaningful result is returned. On the other hand, the reduction strategy best suited to an application expression involving a non-strict function is one that immediately makes the reduction step, substituting in the yet-to-be-simplified argument expressions.
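Non-strictness can be simulated in a strict host language by passing unevaluated thunks (zero-argument functions) and forcing only those that are needed. A Python sketch (the thunk-passing convention is an assumption for illustration, not a feature of FPL):

```python
def first_argument(x, y):
    # Non-strict in y: both arguments arrive as thunks,
    # but only x is ever forced.
    return x()

def diverge():
    # Stands in for an expression like (fact -1) with no meaningful value.
    raise RuntimeError("no meaningful value")

print(first_argument(lambda: 100, diverge))  # 100; diverge is never called
```

Had first_argument forced both thunks, as a strict function must, the application would have raised the error instead of yielding 100.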
Infinite Objects
On a topic very closely related to strictness, consider the following definitions.
(define integers-from
(lambda [n]
(cons n (integers-from (+ n 1)))))
(define naturals (integers-from 0))
Should the definition of naturals be considered erroneous, or should it be taken as designating the infinite sequence of all natural numbers? If infinite sequences are legitimate then the following elegant program (attributed to Quarendon [14]) can be used to find the first n primes. (Assume that (filter p s) yields the sequence s with all elements satisfying the predicate p removed, and that (first-n n s) yields the first n elements of the sequence s.)
(define sieve
(lambda [s]
(cons (first s)
(sieve
(filter
(lambda [x] (divides? x (first s)))
(rest s))))))

(define primes (sieve (integers-from 2)))
(first-n 5 primes)
= [2 3 5 7 11]
The question of whether or not infinite sequences are allowed in the above functions is equivalent to asking whether or not cons is non-strict. If it is non-strict, infinite sequences are allowed; the task of simplifying the arguments to cons is shifted to first, rest, and the other sequence inspection primitives. Numerous discussions of this elegant feature can be found in the literature [15, 16, 17, 18].
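Python's generators provide exactly this kind of non-strict cons, so the prime sieve can be sketched directly. This is an illustration rather than FPL itself, and it assumes the filtering step keeps only elements not divisible by the head of the sequence:

```python
from itertools import islice

def integers_from(n):
    # An "infinite sequence": elements are produced only on demand.
    while True:
        yield n
        n += 1

def sieve(s):
    head = next(s)
    yield head
    # Remove every later element divisible by head, then sieve the rest.
    yield from sieve(x for x in s if x % head != 0)

primes = sieve(integers_from(2))
print(list(islice(primes, 5)))  # [2, 3, 5, 7, 11]
```

As in the FPL version, nothing here ever constructs the whole sequence; islice plays the role of first-n, demanding only as many elements as are asked for.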
Applicative vs. Lazy Reduction Order
On the issue of what computational prescription to assign to expressions in the language, there is a sharp polarization into two camps. One opts for the more-or-less standard, sequential reduction order; the other side takes a more liberal approach that is best described as "get the correct answer while doing as little work as possible". Each will be considered in turn.
The reduction strategy used by most LISP dialects, and the one used in most conventional programming languages (i.e., call-by-value), is called applicative order. Quite simply, the argument expressions are always simplified prior to reducing the function application. Most dialects go on to state that the argument expressions are processed sequentially in left-to-right order. This prescription has several implications.
• By completely spelling out the reduction order, the potential for parallelism has been precluded.
• Since there is a well-defined reduction order that the programmer can grasp, it is not problematic to allow functions to have side effects (see below).
• It is possible to design implementation processors that do not "build up stack" when making non-embedded function applications [19]. This allows true iterative behaviour, where going around a loop does not consume space.
• It is possible to employ standard techniques for predicting and controlling the space and time performance of a program.
• The task of debugging is not much different from the debugging of conventional programs.
The second scheme for doing reductions can be characterized as a fairly complex, non-sequential reduction order usually known as lazy processing [14]. Described by one researcher [20] as a "surprising" reduction order, it is so non-obvious that it is unlikely that programmers could understand the processing of their programs in terms of any particular explicit reduction order. The crucial point is that the actual reduction order is unimportant; the processor guarantees that it performs only those reductions that must be made. To program effectively, the programmer needs only local knowledge of what causes expressions to be processed. Assuming that the application expression under consideration must be simplified, there are only two situations where a reduction step is forced: the function expression of a function application expression (in order to determine which function to apply), and the argument expressions of an application of a (strict) primitive function (in order to compute the result). In the case of applying a λ-defined function, the fate of the argument expressions is entirely dependent on what happens to the corresponding variables in the course of processing the λ-expression's body.
Some of the merits and drawbacks of lazy processing regimes are as follows.
• The programmer can pay minimal attention to issues of control.
• Lazy reduction order entails the same number or fewer reduction steps than applicative order.
• Implementations are free to exploit the parallelism inherent in reducing multiple expressions at the same time, as long as the minimal amount of work is done (overall).
• Side effects and other state change operations cannot be easily supported because the programmer has no notion of processing order.
• By freeing the programmer from many of the mundane concerns of control, one makes it much more difficult to predict the space and time performance of his programs.
• Debugging can be difficult without an adequate model of how the program is being processed.
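The "reduce only what is demanded, and at most once" behaviour of lazy processing is commonly implemented with memoized thunks, sometimes called call-by-need. A minimal Python sketch (the class name Thunk and its interface are invented for illustration):

```python
class Thunk:
    """Call-by-need: the suspended expression is reduced at most once,
    on first demand, and the result is remembered thereafter."""
    def __init__(self, expr):
        self.expr = expr       # a zero-argument function (the suspension)
        self.done = False
    def force(self):
        if not self.done:
            self.value = self.expr()   # the one and only reduction
            self.done = True
        return self.value

calls = []
t = Thunk(lambda: calls.append("reduced") or 42)
print(t.force(), t.force())  # 42 42
print(calls)                 # ['reduced'] -- the expression was reduced once
```

An unforced Thunk costs nothing, which is how a lazy processor avoids work on arguments whose variables are never used.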
Side Effects
The decision to do away with side effects cannot be viewed as lightly as the one to banish GOTO statements from a language. Think of a program controlling an industrial robot. To that program, the ability to monitor and change the state of the world in which it lives is fundamental. Obviously, this poses a serious problem for any programming language that is not built around the notions of time, state, and side effects.
For side effects to be useable, it must be possible for the programmer to write his expressions in such a way that they happen in the particular order he wants. Therefore, if a language is to allow side effects, there must be a well-defined and obvious protocol by which programs are processed. The more determinate the protocol, the more freedom the programmer has to use side effects willy-nilly. However, the determinism is not without penalty. Every additional constraint on the reduction order decreases the potential for parallelism. In the limiting case of a completely determinate reduction order such as left-to-right sequential applicative order, all potential for parallelism has been lost.
The LISP family of languages tends to include side effects, and therefore lives without potential parallelism. This pragmatic approach has allowed LISP to be used as a powerful tool by a large community of users. On the other hand, languages such as HOPE [12] and KRC [13] are pure functional programming languages. While admittedly incomplete, they serve as vehicles for exploration into both application and implementation techniques. A third approach, called applicative state transition systems [1], has a purely functional subsystem as its major component. The global state is explicitly encoded and passed to a purely functional program, which computes both a result and a new global state. The global state is updated, and the process repeats. Since the state only changes between computations, there are still opportunities to exploit parallelism.
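The applicative state transition idea can be sketched as a loop around a pure step function. In this Python sketch the step function and the shape of the state are invented for illustration; Backus's formulation differs in detail:

```python
def step(state, inp):
    # Pure: computes a result and a *new* state; the old state is untouched.
    count = state["count"] + inp
    return count, {"count": count}

state = {"count": 0}
results = []
for inp in [1, 2, 3]:
    result, state = step(state, inp)  # state changes only *between* steps
    results.append(result)

print(results)  # [1, 3, 6]
```

Within each call to step there is no mutation at all, so the body of step remains open to parallel reduction; only the thin outer loop is sequential.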
Other Aspects
This section looks at three other frontiers of exploration in the space of functional programming languages.
Combining Forms vs. Application-based
There are probably as many different functional programming styles that can be supported by a functional programming language as there are different styles for working within the confines of conventional programming languages (e.g., GOTO-less style, use-lots-of-procedures style, FORTRAN style, etc.). The FP language [1] illustrates a style of functional programming that is quite different from the more conventional λ-based (or application-based) approach to functional programming illustrated by FPL. FP is based on a broad set of data-manipulating primitive functions, and a (fixed) collection of combining forms (higher order functions), as exemplified by the following.
((compose f g) x) = (f (g x))
((construct f1 ... fn) x) = [(f1 x) ... (fn x)]
((conditional p f g) x) = (if (p x) (f x) (g x))
((constant n) x) = n
((apply-to-all f) [x1 ... xn]) = [(f x1) ... (f xn)]
((insert f) [x1 ... xn]) = (f x1 (... (f xn-1 xn) ...))
There is no λ-like construct in FP; consequently, there are no variables in the language. Function names can only be introduced through defines, and although recursive definitions are permitted, their use is discouraged. As an example, consider the following FP-style definition of the sum of squares function first presented in an earlier section (assume that the single argument to ssq will be a sequence of numbers).
(define ssq
(compose (insert +) (apply-to-all sqr)))
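The combining forms themselves are ordinary higher-order functions, so the FP-style ssq can be sketched in Python. One caveat: FP's insert is a right-fold, while the left-fold functools.reduce used below agrees with it only for associative operations such as +:

```python
from functools import reduce

def compose(f, g):
    return lambda x: f(g(x))

def apply_to_all(f):
    return lambda s: tuple(f(x) for x in s)

def insert(f):
    # Left fold; coincides with FP's right-insert for associative f.
    return lambda s: reduce(f, s)

def sqr(x):
    return x * x

# (define ssq (compose (insert +) (apply-to-all sqr)))
ssq = compose(insert(lambda a, b: a + b), apply_to_all(sqr))
print(ssq((2, 3)))  # 13
```

Note that ssq is assembled entirely from combining forms; no variable ever names its argument, which is the point of the FP style.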
Advantages and disadvantages of this approach compared to l-based functional programming languages include the following.
• By offering powerful combining forms and data-manipulating primitives, one is encouraged to write in an APL-like style, instead of the word-at-a-time style natural to other functional programming languages. This style tends to increase the possibility of performing many data manipulation operations in parallel.
• Programs expressed in this style tend to be easier to manipulate and reason about formally. General theorems involving the combining forms and the primitives provide a rich algebra of programs.
• Functions are not "first-class citizens": one cannot designate a sequence of functions, nor can the combining forms be used to define new combining forms. There is a firm boundary between data and functions, which does not exist in most λ-based languages.
Data Types and Type Checking
It is widely agreed that programming languages should support data type abstraction, and there is no reason to believe that functional programming languages should be exempt. This problem has been addressed in at least two functional languages: HOPE [12] and ML [21]. Each provides data type definition facilities and comprehensive static type checking. Of special relevance in the context of interactive languages is ML's polymorphic type inference system that automatically deduces the type of most expressions without requiring explicit type definitions. A type abstraction facility has also been developed for FP [22].
Implementation Technologies
There has been much interest in both software and hardware implementations of functional programming languages. Until recently it had been widely believed that implementations of lazy processing schemes were inherently less efficient on conventional machines than applicative order dialects. However, by compiling λ-based functional programs into combinator trees, it is possible to provide lazy processing at competitive prices [23, 24].
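The flavour of the combinator approach can be glimpsed with the classical S and K combinators, from which variable-free code can in principle be assembled. A tiny Python sketch (a demonstration of the combinators themselves, not a compiler):

```python
# S f g x = f x (g x)   and   K x y = x, written in curried style.
S = lambda f: lambda g: lambda x: f(x)(g(x))
K = lambda x: lambda y: x

# A classical identity: S K K behaves as the identity function,
# since S K K x = K x (K x) = x.
I = S(K)(K)

print(I(42))        # 42
print(I("lambda"))  # lambda
```

A combinator-tree implementation compiles away all variables in this fashion, leaving a tree of S's, K's, and primitives that can be reduced lazily by simple graph rewriting.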
Of the many proposals for novel machine architectures to support functional programming, we mention two. SCHEME-79 [25] is a single-chip VLSI implementation of the SCHEME dialect of LISP. In contrast to this sequential machine, Magó [26] has proposed a cellular VLSI-based machine that exploits the parallelism inherent in pure FFP [1] programs. See Treleaven [2] for a discussion of various computer architectures suitable for functional programming.
Summary
The basic ideas underlying functional programming languages have been presented, and, hopefully, some of the more important issues facing their designers have been illuminated. Two textbooks on functional programming, one by Henderson [18], the other by Burge [15], are highly recommended as introductions to the subject.
Acknowledgements
The author would like to thank Brian Smith, Jonl White, Hector Levesque, Greg Nuyens, John Lamping, and Kathryn Finter for their helpful comments on earlier drafts. This research was conducted in the Cognitive and Instructional Sciences Group at Xerox PARC, as part of the Situated Language Program of Stanford’s Center for the Study of Language and Information.
References
[1] Backus, J. "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs", CACM 21, 8, pp. 613-641 (August 1978).
[2] Treleaven, P. "Computer Architecture for Functional Programming", in Darlington et al. [3], pp. 281-306.
[3] Darlington, J., Henderson, P., and Turner, D. (eds.) Functional Programming and its Applications. Cambridge, U.K.: Cambridge University Press (1982).
[4] Barendregt, H. The Lambda Calculus. Amsterdam: North-Holland (1981).
[5] Conference Record of the 1980 LISP Conference, Stanford (August 1980).
[6] Proceedings of the 1981 ACM Conference on Functional Programming Languages and Computer Architecture, Portsmouth, New Hampshire (1981).
[7] Conference Record of the 1982 LISP and Functional Programming Conference, Pittsburgh (August 1982).
[8] McCarthy, J., et al. LISP 1.5 Programmer's Manual. Cambridge, Mass.: The MIT Press (1965).
[9] Allen, J. Anatomy of LISP. New York: McGraw-Hill (1978).
[10] Steele, G., and Sussman, G. "The Revised Report on SCHEME, A Dialect of LISP", M.I.T. Artificial Intelligence Laboratory Memo AIM-452 (1978).
[11] Smith, B. Reflection and Semantics in a Procedural Language, M.I.T. Laboratory for Computer Science Report MIT-TR-272 (1982).
[12] Burstall, R., et al. "HOPE: An Experimental Applicative Language", in 1980 LISP Conference [5], pp. 136-143.
[13] Turner, D. "Recursive Equations as a Programming Language", in Darlington et al. [3], pp. 1-28.
[14] Henderson, P., and Morris, J. "A Lazy Evaluator", Third ACM Symposium on Principles of Programming Languages, Atlanta, pp. 95-103 (1976).
[15] Burge, W. Recursive Programming Techniques. Reading, Mass.: Addison-Wesley (1975).
[16] Friedman, D., and Wise, D. "CONS Should Not Evaluate Its Arguments", Automata, Languages, and Programming. Edinburgh University Press, pp. 257-284 (1976).
[17] Friedman, D., and Wise, D. "Unbounded Computational Structures", Software: Practice and Experience 8, pp. 407-416 (1978).
[18] Henderson, P. Functional Programming: Application and Implementation. Englewood Cliffs: Prentice-Hall (1980).
[19] Steele, G. "Debunking the 'Expensive Procedure Call' Myth", M.I.T. Artificial Intelligence Laboratory Memo AIM-443 (1977).
[20] Morris, J. "Real Programming in Functional Languages", in Darlington et al. [3], pp. 129-176.
[21] Cardelli, L. "ML under Unix", Bell Labs, Murray Hill, N.J. (draft, 1983).
[22] Guttag, J. "Notes on Using Types and Type Abstraction in Functional Programming", in Darlington et al. [3], pp. 103-128.
[23] Turner, D. "A New Implementation Technique for Applicative Languages", Software: Practice and Experience 9, pp. 31-49 (1979).
[24] Peyton Jones, S. "An Investigation of the Relative Efficiencies of Combinators and Lambda Expressions", in 1982 LISP and Functional Programming Conference [7], pp. 150-158.
[25] Sussman, G., Holloway, J., Steele, G., and Bell, A. "SCHEME-79: LISP on a Chip", IEEE Computer, pp. 10-21 (July 1981).
[26] Magó, G. "A Cellular Computer Architecture for Functional Programming", Spring COMPCON '80, IEEE Computer Society, pp. 179-187 (February 1980).