[phylum]<3-LISP>course>PROBLEM-SETS>ps-1c.bravo!1

Page Numbers: Yes X: 306 Y: 0.9" First Page: 14
Margins: Top: 1.1" Bottom: 1.2"
Heading:

PROBLEM SET #1 LISP: LANGUAGE AND LITERATURE April 16, 1984
————————————————————————————————————————————

1−4. Sequences and Higher-Order Functions

Note: In working this problem, you may want to consult section 2.a.3 of the manual, for functions that are helpful in dealing with sequences. 1

Especially when modelling complex or composite objects, it is often important to be able to manipulate sequences in flexible ways. Some sequence operations are provided in the standard language: FIRST, REST, APPEND, etc. Some of these are primitive (such as FIRST and REST), but others could have been defined by the user. For example, the following is a plausible definition for APPEND:

(define APPEND
(lambda [seq1 seq2]
(if (null seq1)
seq2
(cons (first seq1)
(append (rest seq1) seq2)))))

If you type this in, however, the system will complain, because you would be trying to redefine the name APPEND, which it has already defined. (We could let you redefine it, but other routines, including system routines, probably embed expectations about APPEND, and your redefinition might therefore cause some other parts of the system to "break". This is a design choice: some system designers favour a "give the user lots of rope, and let her hang herself if she wants"; others try to protect the user from her own errors. By and large 3-LISP was designed under the "more rope" philosophy, but prohibiting a user from changing standard procedure definitions seemed a reasonable cautionary stance to take, especially at the beginning.)

1-4-a. Define your own version of APPEND just like the one above, but under a different name, and convince yourself that it works correctly. Also, define your own version of REVERSE, so that (REVERSE [1 2 3 4 5]) will return [5 4 3 2 1]. Check both of your definitions on null arguments (in both positions in the case of APPEND).

1-4-b. Define a procedure called FRINGE that takes a sequence consisting of numbers or sequences of numbers or ... — i.e., that takes what is essentially a tree, and returns a simple sequence of the numbers at the leaves of the tree. I.e., it should support the following sorts of behaviour:

1> (fringe [1 2 3])
1= [1 2 3]
1> (fringe [[1] [2] [3 4]])
1= [1 2 3 4]
1> (fringe [[[[1 2] 3 [4 5]]] [] [[[[[[[7]]]]]]]])
1= [1 2 3 4 5 7]

It turns out that whereas the definition of APPEND that you developed in problem 1-4-a accepts exactly two arguments, the standard version supplied in the system is defined over lots of arguments. In particular, we have:

1> (let [[x [1 2 3]]
[y []]
[z [4 5 6 7]]
[w [8]]]
(append x y z w x))
1= [1 2 3 4 5 6 7 8 1 2 3]

and

1> (append)
1= []

The way this works is that every 3-LISP procedure really takes just one argument, which is (almost always) a sequence of some length. So the addition function +, for example, which looks as if it takes two arguments, is in fact defined over a single sequence of two numbers. This convention is supported by the lexical notation; in general, the expression

(fun a1 a2 ... an)

is a lexical (notational) abbreviation for

(fun . [a1 a2 ... an])

(You can in fact try typing the more complex version on your terminal, to verify that it works in the same way as the simpler version. Try, for example, (+ . [1 2]) and ’(* . [A B]).) In fact you already have a clue that this is the way things work, because the pattern in a lambda expression is usually given as a sequence of variable names. This sequence is matched against the sequence of arguments, in such a way so that in the resulting environment the pattern will designate exactly what the arguments designated in the context of the original call.

This power of the language is sometimes quite useful. Suppose, for example, that the name X designated a sequence of three integers, and that you wanted to add the second and third. Up till now you would probably have written something like:

(+ (second x) (third x))

But it turns out that a simpler expression will do the same thing:

(+ . (rest x))

Note the general pattern here. Pairs (the structures notated ’(exp1 exp2 ... exp3)’ or ’(exp1 . exp2)’, have two parts, called the procedure part and the arguments part. In standard cases, a pair designates the value of applying the function designated by the procedure part to the objects designated by the argument part. Thus (+ 2 3), which is really an abbreviation for (+ . [2 3]), designates the value of applying the addition function to the sequence of numbers two and three, which is five. That is why (+ 2 3) designates the number five and returns the numeral 5. But (supposing X is bound to [10 20 30]), the form (+ . (rest x)) designates the value of applying the addition function to the sequence designated by the form (rest x), which is of course just the sequence of the numbers twenty and thirty. So everything works as expected.

Haskell Curry is known in part for a process of "currying" the arguments to a multi-argument function; thus instead of having addition be defined as:

(define +
(lambda [x y] (+ x y)))

a "curried" version of addition would be:

(define curried-addition
(lambda [x]
(lambda [y] (+ x y))))

You would have to use this as follows:

1> ((curried-addition 3) 4)
1= 7

What 3-LISP does is sort of the opposite: rather than taking multi-argument functions and defining a series of single-argument functions that collect the arguments one by one, we instead define multi-argument functions as single-argument functions that take all of the arguments wrapped up together into a single object. When we deal explicitly with the arguments as a whole (as in the case of (+ . (rest x)), we will say that we have "objectified" the sequence of arguments.

To get familiar with using objectified arguments, try out the following examples:

(+ . (rest (rest [1 2 3 4])))

(let [[x [10 20 30]]]
(+ . (rest x)))

(** 2 3)

(let [[x [2 3]]]
(** . x))

(let [[x [2 3]]]
(** . (reverse x)))

In order to define a procedure to take an arbitrary number of arguments, we can use a simple atom (variable name) in place of a sequence of variable names. Type in the following definition, and try it on some examples, such as (TEST 1 2 3) and (TEST):

(define TEST
(lambda ARGS args))

1-4-c. Define a simple procedure called TEST that returns just the number of arguments it was called with. I.e., you should be able to get such behaviour as:

1> (how-many)
1= 0
1> (how-many 10 20 30)
1= 3

Given all of this, you should now be able to define a multi-argument version of addition, called +!, so that you have:

1> (+! 1 2 3)
1= 6
1> (+! 10)
1= 10
1> (+! 100 200 300 400 500)
1= 1500

1-4-d. Define a multi-argument multiplication procedure (call it *!). What about subtraction and division; do multi-argument versions make sense?

Bugs in programs often occur when dealing with limiting cases. These errors are often called "fence-post" errors, by analogy with correctly staking out the main piece of intellectual territory you want to have your program deal with, but failing to have the fences — the markers delimiting the edges of the territory — set up properly.

1-4-e. What do your multi-argument procedures do if given no arguments at all? I.e., what do (+!) and (*!) designate? Modify your definitions, if necessary, to return the correct answer in these cases. Define a multi-argument version of MAXIMUM. Does (MAXIMUM!) make sense?

1-4-f. Why did we go on about objectified arguments before getting to problems 1-4-d and 1-4-e? Can you define a multi-argument addition procedure without using this feature? If not, say why; if so, do so and test it on the examples you tried above.

1-4-g. The MAP function provided standardly in 3-LISP applies a function repeatedly to a sequence of arguments, returning a sequence of values. For example, given the following definitions:

(define INCREMENT (lambda [n] (+ n 1)))
(define DOUBLE (lambda [n] (+ n n)))
(define SQUARE (lambda [n] (* n n)))

we would have

(map increment [10 20 30]) => [11 21 31]
(map double [10 20 30]) => [20 40 60]
(map square [10 20 30]) => [100 400 900]

(map rest [[1 2 3] [4 5 6] [7 8 9]])=>[[2 3] [5 6] [8 9]]

MAP can also be used with functions of more than one argument; thus we have

(map + [1 2 3] [10 20 30]) => [11 22 33]

(map nth [1 2 3] [[10 20 30]
[40 50 60]
[70 80 90]])=> [10 50 90]

1-4-h. Define a version of MAP that works correctly on the examples just given. Note that (as before) you should call your version something other than MAP (say, MAP*), since the system will not let you redefine MAP itself.

1-4-i. Try the standard version of MAP on your multi-argument addition:

(map +! [1 2 3])
(map +! [1] [2] [3])
(map +! [])
(map +! [1 2 3] [4 5 6] [7 8 9])

What about

(map +!)

Does this return anything? Should it?

Try these same examples using your definition of MAP*. Fix your definition if it doesn’t handle these cases correctly.

In defining all of these multi-argument functions, we are essentially repeating ourselves. In line with our constant goal of obtaining power and modularity, it would seem useful to define a (higher-order) function that would automatically generate +!, given +, and similarly for the other examples. I.e., we would like a procedure called SPREAD such that (SPREAD +) would designate what +! designates, etc. This would enable us to write such things as:

1> ((spread +) 1 2 3 4 5)
1= 15

1-4-j. Define such a SPREAD. What happens to your definition in the following cases:

((spread +))
((spread *))

Define a better version of SPREAD that takes two arguments, a function and a limiting case, that would enable us to define our multi-argument addition and multiplication procedures as follows:

(define +! (spread + 0))
(define *! (spread * 1))

1-4-k. Supposing that the standard APPEND were able to take only two arguments, define a multi-argument APPEND! using this new SPREAD.

A curious thing about these sorts of higher-order functions is that, once defined, they can often be used in turn to define procedures that would otherwise be defined using explicit recursion. For example, we could define LENGTH in the following manner:

(define LENGTH
(lambda [seq]
((spread + 0) . (map (lambda [e] 1) seq))))

or, again more modularly, as:

(define LENGTH
(lambda [seq]
((spread + 0) . (map (constant 1) seq))))

where CONSTANT is the higher-order definition:

(define CONSTANT
(lambda [constant-value]
(lambda args constant-value)))

Programming in this way is particularly developed by Henderson, Backus, and others, under the name "functional programming". If this style appeals to you, you should read Peter Henderson’s "Functional Programming"; you might even consider a career programming in APL.

1−5. Paragraph Filling

In this exercise we will step a little outside the domain of pure mathematical objects, and deal with simple strings of text. The goal is to develop some experience with a program that is larger than a single function, so as to see how what looks initially like a rather complex problem can be broken down coherently into manageable parts. In the overall scale of things, this will still be a very small program, but some important ideas will be illustrated.

The goal is to write a program to "fill" or "justify" a paragraph of text, rather the way that is done in text-editors like EMACS or the editor you use in preparing material for this course. In a real situation, the editor would deal with updating a live screen, but for simplicity we will ignore the "real-time" aspects of interaction.

We will assume that a "paragraph" is a string of characters, including normal characters and spaces and possibly carriage returns. The carriage-returns will be taken to separate "lines"; we needn’t take a position on the question of whether a carriage return is part of the line before it, or part of the line after it, or neither.

The exercise will make use of various standard 3-LISP functions that deal with strings; this is a good opportunity to look them up in section 2.c of the manual. Also, we will need to read and print strings from and to the terminal; the procedures STRING-IN and STRING-OUT can be used to do this. These two procedures take two arguments each; in both cases the first argument is an indicator of where the input or output should go; for now just use PS, which directs the reading and printing to go on in the main interaction window (PS stands for "primary stream" — something we will explain in a later section of the course). More specifically,

(string-out ps string)

will print the string designated by the form string on the terminal;

(string-in ps delimiter-character)

will read in a string of characters in from the terminal up to the character designated by delimiter-character. Finally, the atom CR is bound globally to a designator of the carriage-return character, which will be useful.

Designators of string are printed as strings contained within double quotes. Thus we have:

1> "This is a string"
1= "This is a string"
1> (string-in ps cr)Some words to be read
1= "Some words to be read"

The first thing we need to do is to define a routine that will read in a paragraph, which we will take as being a sequence of lines of text up to a blank line.

1-5-a. Define two procedures, called READ-LINE and READ-PARAGRAPH, that read a line and a paragraph from the terminal, respectively. Assume that a paragraph ends with a blank line (two successive carriage returns). Be careful about what happens to the delimiter character (the carriage return). You may need to use functions like STRING-CONS, STRING-LENGTH, SUB-STRING, and STRING-APPEND that you will have read about in section 2.c of the manual. In preparation for the next step, have READ-PARAGRAPH return a string of characters in which all carriage-returns are replaced with spaces.

READ-PARAGRAPH does half of what we want; it gets rid of the line breaks in the original paragraph. What remains is to "break" the paragraph appropriately, substituting some new carriage returns for spaces in the appropriate places. In order to know what "appropriate" means, we need to know how wide the newly formatted paragraph should be. We will parameterize our ultimate paragraph-filling procedure to take this "fill-column" as an argument. For example, if the fill column is set to 72, then we will be able to construct paragraphs no line of which is more than 72 characters.

1-5-b. Figure out and write down how you think we should proceed, proposing functionality for the various modules we will assemble into a coherent whole. Think first about what is wanted (i.e., what behaviour the routine should yield), and only subsequently about how to achieve it. For example, it is not enough to say that no line should be longer than the fill column; we want each line to be as long as possible without overflowing. So it seems plausible that one component in an appropriate solution will be a procedure that takes a string as an argument and identifies the first position in that string where there is a space that ought to be converted into a carriage return.

Note: Don’t think there is only one way to build such a paragraph filling program. In the problem-set answers we will present a number of good choices. The point, rather, is that your solution should be modular, clear, and straight-forward. 1

1-5-c. Once you have worked out what you think is a workable design, write and debug a procedure called FILL-PARAGRAPH that takes a paragraph (with carriage-returns removed) and a fill-column as arguments, and returns an appropriately filled paragraph as a result, with carriage returns inserted as appropriate, and a final carriage return at the end. Hint: It is likely that you will want a procedure called something like REVERSE-STRING-SEARCH — something, as you will note, that is not provided with the system. It is possible to define such a routine by defining a STRING-REVERSE procedure, and then defining REVERSE-STRING-SEARCH in terms of it:

(define REVERSE-STRING-SEARCH
(lambda [key string]
(string-search key (string-reverse string))))

You can test your version of FILL-PARAGRAPH by combining it with the READ-PARAGRAPH you designed above. Specifically, you should be able to generate something like the following console session:

1> (define TEST
(lambda [fill-column]
(string-out ps
(fill-paragraph (read-paragraph) fill-column))))
1= ’TEST
1> (test 35)
This is a first try.
We will
have a couple of relatively short
lines
including
one or two with only a single word, and then this long one that definitely goes too far.
That’s all this time around.

This is a first try. We will have
a couple of relatively short lines
including one or two with only a
single word, and then this long
one that definitely goes too far.
That’s all this time around.
1= ’ok

It is always good to test programs against limiting cases. One case that you may not have considered while writing your program was what to do if a single word was longer than the maximum number of characters you were allowing per line. This is easy to test:

1> (test 10)
This is a test to see whether if we were
to have a very long word like Koyaanisqatsi
that is longer than the maximum line length
we allow, whether things will still work
out ok. Whether we know what ok means we
won’t divulge at this point.

You may have a feeling, as you fix your program to deal with this and other limiting cases, that all of your hard-earned elegance and modularity is slipping away, and that your program is becoming a mass of checks for this and that "strange condition" that will rarely arise. This is a genuine and important worry — one we will address explicitly later in the course. For the time being, you can console yourself with the realisation that without a modular and understandable program, you would have had trouble figure out how to incorporate checks for all these special cases. But this isn’t much of a consolation. The real moral to be drawn is different: that the real world is a messy place. The more we develop programs to deal with real-world situations, the more we will have to recognize that its complexity will always transcend the complexity of our formal programs. This inescapable fact will have a substantial impact on the notions of "correctness" we examine later in the course.