*start*
00510 00024 US 
Date: 11 July 1981 5:38 pm PDT (Saturday)
From: Stewart.PA
Subject: Reorganization of Cedarlib>Real
To: "@[Ivy]<Cedarlib>Real>Users.dl"
cc: Stewart
Reply-To: Stewart

The files on [Ivy]<Cedarlib>Real> have been moved around a bit.  The following
subdirectories exist:

Doc>		Documentation
AltoI>		Special stuff for Alto I users
Alto>		The standard Mesa 6 floating point for Altos
		  and D* machines running in Alto mode.
Rubicon>	Rubicon Pilot floating point (used by Cedar)

	-Larry

*start*
00189 00024 US 
Date: 13 JUL 1981 0537-PDT
From: RWEAVER
Subject: Files archived from directory <STEWART>
To:   STEWART

MESA70REALS.DM;1 297 pages
           1 file      297 pages
*start*
01728 00024 US 
Date: 11 Aug. 1981 8:57 pm PDT (Tuesday)
From: Fiala.PA
Subject: Floating point opcodes
To: LStewart, Satterthwaite
cc: Wilhelm, Rovner, Taft, Fiala

I am proposing to implement floating point opcodes for square root (FSqRt = MISC
36b) and floating scale (FSc = MISC 37b) in both Alto and Cedar Dolphin
microcode in the near future (I have them both coded and am about to start
debugging).  The following comments are relevant to these.

1) The Cedar unimplemented opcode trap implementation must be changed so
that the procedure reached when an opcode (such as FSqRt) is not defined is
entered after one Xfer rather than two.  I was shocked to learn from Willie-Sue
today that this is apparently not the case.  When this becomes true, it will be
practical to produce the MISC 36b opcode for FSqRt (etc.) when that is
implemented on the Dolphin but not on the Dorado (or vice versa).

2) FSqRt takes one real on TOS,,2OS and returns a positive real on TOS,,2OS. 
The sign of its argument is ignored--LS, if this is wrong, please tell me.  My
thought is that the user can check for <0 before execution if he cares.  Timing
for FSqRt is about 57 microseconds.

3) FSc takes a real on TOS,,2OS and an integer N on 3OS where -202b <= N
<= 200b.  It returns the real on TOS,,2OS with the exponent scaled by N.  My
thought on the use of FSc is as follows:  (a) Define "shifts" of reals to use the
FSc opcode--make this legal in Mesa source programs; (b) Have the compiler
detect real multiplication and division by constants that are powers of 2 and
use FSc instead of FDiv or FMul in this case.  Note that the timing of FSc is
about 8 microseconds compared to 15 for an FAdd, 38 for FMul, or 42 for FDiv.
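
The identity behind (b) is easy to see in C, using ldexp as a stand-in for FSc
(illustration only, not Mesa or microcode):

#include <stdio.h>
#include <math.h>   /* ldexpf: scale a float's exponent by N, i.e. multiply by 2^N */

int main(void)
{
    float x = 3.25f;
    /* FMul by the constant 8 = 2^3 versus an FSc-style scale with N = 3 */
    printf("%g %g\n", x * 8.0f, ldexpf(x, 3));    /* 26 26 */
    /* FDiv by the constant 4 = 2^2 versus a scale with N = -2 */
    printf("%g %g\n", x / 4.0f, ldexpf(x, -2));   /* 0.8125 0.8125 */
    return 0;
}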


*start*
01043 00024 US 
Date: 12 Aug. 1981 5:57 pm PDT (Wednesday)
From: Fiala.PA
Subject: Re: Floating point opcodes
In-reply-to: Stewart's message of 12 Aug. 1981 10:55 am PDT (Wednesday)
To: Stewart, Wilhelm
cc: Fiala, LStewart, Satterthwaite, Rovner, Taft

The scaling opcode will correctly generate overflow and underflow so long as the
scaling integer is within the range specified in my message (-202b <= N <=
200b).  It will also work correctly for integers -377b < N < +377b provided that
overflow or underflow DOES NOT HAPPEN.  However, if N is outside the
required range, overflow and underflow indications will be REVERSED.

When underflow or overflow happens, the default event is a trap to software;
however, the substitute-0-on-underflow option will cause 0 to be the result of an
underflow.  For this reason, an argument outside the required range is verboten.

The microcode does not presently check for N outside the required range, and I
do not propose to make such a check.

I will make FSqRt trap on a negative argument.

*start*
00509 00024 US 
Date: 12 Aug. 1981 6:07 pm PDT (Wednesday)
From: Fiala.PA
Subject: Re: Floating point opcodes
In-reply-to: Taft's message of 12 Aug. 1981 11:02 am PDT (Wednesday)
To: Taft
cc: Stewart, Wilhelm, Fiala, LStewart, Satterthwaite, Rovner

FSc does behave the same as FDiv, FMul, etc.  In case you wondered, the reason
why it takes over 8 microseconds to execute is that it is going through the
standard unpacking and packing microcode where all the errors, traps, and sticky
stuff happen.

*start*
01839 00024 US 
Date: 14 Aug. 1981 8:45 am PDT (Friday)
From: Satterthwaite.PA
Subject: Floating point issues
In-reply-to: Fiala's message of 11 Aug. 1981 8:57 pm PDT (Tuesday)
To: Stewart, Warnock, Wilhelm
cc: Fiala, Satterthwaite, Rovner, Taft

Never in my wildest dreams did I think that any important Mesa programs would
spend much time crunching floating-point numbers.  Since I was wrong, let me
(re)raise the following issues now that we have more experience with
performance (and lack thereof):

- How closely do we want to try to adhere to the IEEE standard?  There are two
problems for the compiler:

 (1) Because of all the modes and exceptions that are possible, the current
compiler does very few floating-point operations at compile-time even when all
operands are constant -- I believe that it attempts only fixed-to-float, unary
negation and abs.

 (2) Because of exceptions, cheap "trick" implementations of some operations
aren't strictly legitimate, and apparently redundant computations cannot strictly
be eliminated.  For example, recent compilers have implemented ABS by masking
off the sign bit -- probably not really correct for NaNs, etc.; on the other hand,
the current compiler does not discard addition of (constant) zero, multiplication
by one, etc.
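
A small C illustration of both points (illustrative only, not code from any of
the compilers discussed): the "trick" ABS just clears the IEEE sign bit, and
x + 0.0 is not a no-op once -0 is representable, which is why addition of
constant zero cannot simply be discarded.

#include <stdio.h>
#include <string.h>
#include <stdint.h>

static float abs_by_mask(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the 32-bit pattern */
    bits &= 0x7FFFFFFFu;              /* clear the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

int main(void)
{
    printf("%g\n", abs_by_mask(-2.5f));   /* 2.5 */
    printf("%g\n", -0.0f + 0.0f);         /* prints 0, not -0: the sign is lost */
    return 0;
}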

- What is the current state of support for LONG REAL?  Is it worth adding to
the language sooner rather than later?  Note that, even with the larger stack,
code for evaluating deeply nested LONG REAL expressions will not be wonderful
(only two operands can be pushed onto the stack at a time).

I suppose it's too heretical at this point to suggest 48-bit arithmetic for LONG
REAL (or for REAL, with 64 bit LONG REAL).  The numerical analysts at
Stanford who used the B5500 many years ago seemed to think that 48 bits was a
wonderful compromise.

Ed

*start*
01003 00024 US 
Date: 14 Aug. 1981 8:54 am PDT (Friday)
From: Satterthwaite.PA
Subject: Re: Floating point opcodes
In-reply-to: Fiala's message of 11 Aug. 1981 8:57 pm PDT (Tuesday)
To: Fiala
cc: Stewart, Satterthwaite, Wilhelm, Rovner, Taft

If we still plan to do strict IEEE standard arithmetic (see my previous message),
and if you want the compiler to generate FSc, please make sure that it behaves
identically to multiplication/division by powers of 2 within some specified
range.  ([-202b .. 200b] seems good enough; the compiler would only generate
this instruction for multiplication/division by constants.)
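
Something along these lines is all the compiler would need (a hypothetical C
sketch, not compiler source): recognize a constant that is an exact power of
two and turn the multiply or divide into a scale by an in-range N.

#include <math.h>        /* frexpf */
#include <stdbool.h>

/* Returns true and sets *n if multiplying (or dividing) by the constant c can
 * be replaced by scaling the exponent by n, with n inside the [-202b .. 200b]
 * (octal) range mentioned above, i.e. -130 .. 128 decimal. */
static bool scale_exponent(float c, bool is_divide, int *n)
{
    int e;
    float f = frexpf(c, &e);          /* c = f * 2^e with 0.5 <= |f| < 1 */
    if (f != 0.5f) return false;      /* only positive, exact powers of two qualify */
    *n = e - 1;                       /* c == 2^(e-1) */
    if (is_divide) *n = -*n;
    return *n >= -130 && *n <= 128;
}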

Would it be hard to change the conventions for FSc so that N comes from TOS
and the real comes from 2OS,,3OS?  This would follow the convention for
fixed-point shift and is somewhat less awkward for the compiler.

I don't expect the compiler to get involved in FSqRt at all; unless
constant-folding of square roots is a big deal, a MACHINE CODE inline seems
entirely adequate.

*start*
00903 00024 US 
Date: 14 Aug. 1981 9:46 am PDT (Friday)
From: Wilhelm.PA
Subject: Re: Floating point issues
In-reply-to: Satterthwaite's message of 14 Aug. 1981 8:45 am PDT (Friday)
To: Satterthwaite
cc: Stewart, Warnock, Wilhelm, Fiala, Rovner, Taft

I couldn't care less about the IEEE "standard"; I think it's a crock.  It would
really be nice if the compiler could easily produce efficient code and do as much
as possible at compile time.

There is no question that we will (do) need LONG REAL, and I suspect if it's to
be added to Mesa, it might as well be now as later.  I wouldn't worry about
lengthy expressions; they don't happen very often anyway.  As for the size of
the numbers, they will have to be 64 bits in length.  Forty-eight bit numbers
would be somewhat short on precision in some cases, and dealing with
three-word quantities on future machines would be a real pain.
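
For calibration, the significand widths of the two IEEE formats, as a C
compiler reports them (background numbers only):

#include <stdio.h>
#include <float.h>

int main(void)
{
    /* IEEE single: 24 significand bits, roughly 6-7 decimal digits;
     * IEEE double: 53 significand bits, roughly 15-16 decimal digits. */
    printf("single: %d significand bits, %d decimal digits\n", FLT_MANT_DIG, FLT_DIG);
    printf("double: %d significand bits, %d decimal digits\n", DBL_MANT_DIG, DBL_DIG);
    return 0;
}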

Neil

*start*
03095 00024 US 
Date: 14 Aug. 1981 11:28 am PDT (Friday)
From: Fiala.PA
Subject: Re: Floating point issues
In-reply-to: Satterthwaite's message of 14 Aug. 1981 8:45 am PDT (Friday)
To: Satterthwaite
cc: Stewart, Warnock, Wilhelm, Fiala, Rovner, Taft

I don't think optimizing constant expressions at compile time is important because
of infrequency and because the user can do this by hand if he really cares. 
However, using FSc instead of FMul or FDiv would be useful when it can be
used--FSc is in fact identical to FMul or FDiv by a power-of-two.  Although my
implementation accommodates -202b to +200b, it might be simpler to confine the
compiler's use of FSc to be -200b through +177b (i.e., 8 bit integers).

It is possible to reverse arguments for FSc.  This will slow it from about 8.9
microseconds to about 9.3 microseconds and increase code size by 2
microinstructions.  This is not a big deal and I am willing to change the
implementation if you insist.  Do you insist?

I think it is desirable to obey the IEEE floating point standard, though I have
been willing to add modifications such as Wilhelm's substitute-0-on-underflow as
options.  At the moment, however, it is impossible to allow users anything other
than system-wide options because parameters controlling rounding modes,
infinity handling, etc. are global rather than part of the process state.  For this
reason, Pilot should allocate larger blocks for process state dumps, so that I can
save and reload the extra registers as required.  The immediate requirement is for
two extra words in the process state; however, it might be useful to increase the
size by 4 words to allow for future growth.

-----

I don't think we presently have microcode space for 48 or 64-bit reals IN
ADDITION TO 32-bit reals.  Even if we had space, there might not be enough
registers to support 64-bit reals on the Dolphin, though there are enough
registers for 48-bit reals.  However, I agree with Wilhelm that, if we change to
longer reals, we should go to 64-bit reals.

If 32-bit reals were REPLACED by 64-bit reals, microcode required would grow
substantially from 541b microinstructions now to perhaps 720b microinstructions
with 64-bit reals.  Execution time might average 200% longer.

My feeling is that 64-bit reals would be best, but changing to these would
require so much new programming that I don't want to write microcode for this
now--a software implementation is up to Larry Stewart.  The graphics people
should indicate their feelings since this appears to be an EITHER-OR choice for
microcode implementation.  If both 32 and 64-bit reals are wanted, a possible
approach is to do something like the PDP-10 implementation which has, in
addition to the usual stuff, unnormalized floating add and the long-mode
opcodes (FMPL, FADL, FDVL, etc.) which produce a double-precision result
from single precision arguments.  Typically four or five of these opcodes are
executed to complete a double-precision operation.

If 64-bit reals are wanted, we might want to use the floating point board for
these on the Dolphin.

*start*
03515 00024 US 
Date: 14 Aug. 1981 2:15 pm PDT (Friday)
From: Taft.PA
Subject: Re: Floating point issues
In-reply-to: Satterthwaite's message of 14 Aug. 1981 8:45 am PDT (Friday)
To: Satterthwaite
cc: Stewart, Warnock, Wilhelm, Fiala, Rovner, Taft

I have to take exception to the assertion that the IEEE floating point standard is a
"crock".  In my opinion, it's the cleanest and easiest to understand floating point
convention I have seen.  There are minor changes that I might desire to simplify
software/microcode implementations; but overall I believe they "got it right" in
the sense that IEEE floating point is free of most of the pathological problems
common to other floating point implementations.

Leaving that aside, I believe we can't afford to diverge from the IEEE standard,
for the simple reason that it's the standard that will be cast in VLSI by the chip
manufacturers whether we like it or not.

Now, to address the specific issues that have been raised:

To make it possible to do constant folding at compile time, it will be necessary
for the various modes (particularly rounding) to be determined at compile time,
presumably by static declarations that obey block structure.  As for eliminating
redundant computation such as addition of zero, this seems fairly hopeless as
you say; but I think it's far less important than simple constant folding.
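
The rounding-mode point is easy to demonstrate (a C sketch using the modern
fenv.h interface, illustrative only; a compiler may need a flag such as
-frounding-math before it honors the mode changes):

#include <stdio.h>
#include <fenv.h>

int main(void)
{
    volatile float a = 1.0f, b = 3.0f;   /* volatile: keep the compiler from folding a/b */
    fesetround(FE_DOWNWARD);
    float down = a / b;
    fesetround(FE_UPWARD);
    float up = a / b;
    /* The two quotients differ in the last place, so folding 1.0/3.0 at
     * compile time is only safe if the rounding mode is known then. */
    printf("%.9g %.9g %s\n", down, up, down == up ? "same" : "differ");
    fesetround(FE_TONEAREST);
    return 0;
}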

Dynamic floating point modes are a real problem.  Originally I believed we could
store the modes as part of the process state simply by enlarging the PrincOps
StateVector -- something that is allowed by the PrincOps and is now actually
possible as of the Rubicon release of Pilot.  Unfortunately (also as of Rubicon), a
process is allocated a StateVector only when it is preempted.  The state of a
process that gives up control voluntarily is contained entirely in the PSB, which
is completely full.  I suppose increasing the size of the PSB from 4 to 8 words is
practical, but it would not be a trivial change and would be incompatible with
current Pilot.

As for LONG REALs, I concur with most of Fiala's remarks.  But I would go
further and say that the Mesa stack-machine architecture is probably not the
right basis for implementing LONG REALs.  The overhead of pushing and
popping LONG REAL values may entirely dominate the cost of the floating point
operations themselves.  This is particularly true in the case of the Dolphin using
the floating point hardware, since communication with that hardware is via the
Dolphin's I/O system.  I think we should seriously consider an alternative
architecture, such as one involving "floating point accumulators" or something. 
In any event, there's no point in introducing LONG REALs into the language
until we have a better understanding of what the underlying architecture is
going to be and how we are going to implement it on the machines we care
about.

Which brings me to a final point.  On the basis of hallway discussions, I know
Larry Stewart has already given most of these issues a great deal of thought; and
I imagine other people have also.  If we're really serious about a comprehensive
Mesa floating point architecture, I think we should have a series of meetings to
do a serious design rather than continuing this hit-or-miss correspondence via
messages.  Participants in such a design should include Mesa language experts,
implementors of microcode and hardware for all machines of interest (Dolphin,
Dorado, Dragon), and of course a representative sample of users.
	Ed

*start*
01926 00024 US 
Date: 24 Aug. 1981 12:56 pm PDT (Monday)
From: Satterthwaite.PA
Subject: Mesa 6 Compiler vs. Altos
To: Brotz, Barth
cc: Stewart, Satterthwaite

I have determined that the problems with compiling floating-point expressions
disappear if programs are compiled with the /l switch.

The /l switch is a signal to the compiler that the intended object machine is
emulating an Alto but can support long pointers, i.e., a Dolphin or Dorado.  The
D-machines also remove a number of tiresome restrictions imposed by the real
Alto hardware/microcode, and the compiler knows this.  Thus the /l switch
affects the generated code in a number of ways unrelated to long pointers.

In particular, the Alto microcode assumes that a number of stack-oriented
operations are "minimal stack", i.e., that there is nothing else on the stack when
the byte code is executed.  The D-machines do not assume this, even when
emulating Altos.  I believe that the observed fatal compiler errors arise in the
code that is intended to check the minimal stack condition.

According to Larry Stewart, the floating point operations provided by Alto
microcode do NOT actually assume the minimal stack convention.  If this is true,
there should be an easy fix to the 6.1 compiler that involves just changing a few
table entries.  Unfortunately, in converting to the latest Cedar release I just
overwrote my Mesa 6 partition that contained the files used to build the Mesa 6.1
compiler.  Even so, rebuilding should be a relatively small deal if I can find the
appropriate command files.  Is this a worthwhile exercise?

Whether or not this (hopefully) trivial fix is made, I think the question of
long-term maintenance of Alto/Mesa software (and other Alto software) remains
open.  I hear rumors that people are still building and buying Altos.  Are they
aware of the problem?  What do THEY think is going to happen?  Does anyone
else agree?

Ed

*start*
00856 00024 US 
Date: 24 Aug. 1981 5:23 pm PDT (Monday)
From: Brotz.PA
Subject: Re: Mesa 6 Compiler vs. Altos
In-reply-to: Satterthwaite's message of 24 Aug. 1981 12:56 pm PDT (Monday)
To: Satterthwaite
cc: Brotz, Barth, Stewart

Ed,

If the fix to the 6.1 compiler is indeed easy, then the exercise will be very worthwhile.  This is the only serious bug that has surfaced in ICL in the 6 months or so that they have been using that compiler, so it seems that shoring up the Alto Mesa world for our friends there would be the right thing to do.  As long as the fixes to the Alto Mesa compiler are very infrequent and simple, it would be nice of us to keep ICL happy with their Alto investment.

I have it from Chuck Thacker that the rumor of people still building Altos is blatantly false.  In fact, he has had a hand in making it false.

-- Doug

*start*
00939 00024 US 
Date: 25 Aug. 1981 9:07 am PDT (Tuesday)
From: Satterthwaite.PA
Subject: Re: Mesa 6 Compiler vs. Altos
In-reply-to: Brotz's message of 24 Aug. 1981 5:23 pm PDT (Monday)
To: Brotz, Barth
cc: Satterthwaite, Stewart
cc: Levin, Sweet

There is a new version of the Mesa 6.1 compiler on

     [Ivy]<Mesa>Temp>Compiler.image.

It seems to compile some test programs that previously produced fatal compiler
errors, and the generated code looks ok (assuming that the Alto floating-point
microcode really can handle operations with other stuff on the stack below the
operands).

The edits were pretty trivial and shouldn't introduce any new problems, but it's
possible that I blew the compiler-building process somehow (it's been a long
time).  Please give this version a try.  If it seems to be an improvement, I'll move
it to [Ivy]<Mesa>Compiler.image and announce it (suggestions about who should
know are welcome).

Ed

*start*
00672 00024 US 
Date: 28 Aug. 1981 1:41 pm PDT (Friday)
From: Wilhelm.PA
Subject: Square Roots
To: Stewart
cc: Fiala, Wilhelm

Here it is at last - the software square root stuff I promised.  It's on
[ivy]<wilhelm>testsqrt.mesa.  You can ignore the procedure "test" since its
purpose is only to exercise the SqRt function.  The "initSqRt" procedure is what
generated the table entries assigned to the array "guesses."  It, too, is probably of
historical value only.  You should give out a SIGNAL at the indicated place; if I
recall correctly, RealImpl currently doesn't signal any errors in functions, so you
will have to add a SIGNAL definition somewhere.
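
Roughly, the technique looks like this (an illustrative C sketch of the general
idea, not the contents of testsqrt.mesa):

#include <math.h>      /* frexpf, ldexpf */

static float guesses[8];             /* plays the role of the "guesses" array */

static void init_sqrt(void)          /* plays the role of "initSqRt"; call once */
{
    for (int i = 0; i < 8; i++) {
        float mid = 0.5f + (i + 0.5f) / 16.0f;   /* midpoint of a slice of [0.5, 1) */
        float y = mid;
        for (int k = 0; k < 10; k++) y = 0.5f * (y + mid / y);
        guesses[i] = y;              /* ~ sqrt of the slice midpoint */
    }
}

static float soft_sqrt(float x)
{
    if (x <= 0.0f) return 0.0f;      /* the indicated place for a SIGNAL */
    int e;
    float m = frexpf(x, &e);                       /* x = m * 2^e, 0.5 <= m < 1 */
    float y = guesses[(int)((m - 0.5f) * 16.0f)];  /* table guess for sqrt(m) */
    for (int k = 0; k < 3; k++)                    /* Newton: each step ~doubles the good bits */
        y = 0.5f * (y + m / y);
    float r = ldexpf(y, e / 2);                    /* sqrt(m * 2^e) = sqrt(m) * 2^(e/2) */
    if (e & 1) r *= (e > 0) ? 1.41421356f : 0.70710678f;   /* leftover factor of sqrt(2) */
    return r;
}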

Neil

*start*
01237 00024 US 
Date: 9 Sept. 1981 11:20 am PDT (Wednesday)
From: Barth.PA
Subject: Re: Mesa 6 Compiler vs. Altos
In-reply-to: Satterthwaite's message of 25 Aug. 1981 9:07 am PDT (Tuesday)
To: Satterthwaite
cc: Brotz, Barth, Stewart, Levin, Sweet

Pardon the delay in sending this message, but I was waiting until I really got the whole spectrophotometer package rolling, and that, for various obscure and frustrating reasons, is unlikely to happen soon.

I have used the new compiler on the software that the old one blew up on, and the compiler accepted everything.  The runtime results also look good, so I think we can presume that it works correctly.

As to who should know about this release, I believe that is up to you.  It would be fine with me if no one else were informed, or you could advertise it to the world.  The latter course of action might result in the belief that you are maintaining the Alto/Mesa compiler, which could lead to frowns from management and the cutting off of my support, so I guess I do lean towards not telling anybody.  You could wait until someone comes around and complains and then let them in on the secret.

In any case I thank you very much for taking care of this little problem.

Rick

*start*
00224 00024 US 
Date: 30 Sept. 1981 4:08 pm PDT (Wednesday)
From: Stewart.PA
Subject: Cedar Release: PilotReals.df
To: Levin
cc: "@[Ivy]<Cedarlib>Real>Support.dl"

[Ivy]<Cedarlib>Real>Rubicon>PilotReals.df   exists

*start*
00388 00024 US 
Date: 2 Oct 1981 10:06 PDT
From: Stewart at PARC-MAXC
Subject: Re: IEEE Floating Point Std.
In-reply-to: Geschke's message of 1 Oct 1981 15:45 PDT
To: Sproull@CMU-10A
cc: Stewart

The best reference I know of is still the January 1980 issue of Computer
magazine.  There are also lots of documents from the standards committee, some
of which are useful.

	-Larry

*start*
00258 00024 US 
Date: 8 Oct 1981 09:50 PDT
From: Stewart at PARC-MAXC
Subject: Re: IEEE Floating Point Std
To: Sproull@CMU-10A
cc: Stewart

There is a newer reference as well. . .

January 1980 issue of Computer
March 1981 issue of Computer

-Larry

*start*
00483 00024 US 
Date: 20-Oct-81 17:20:43 PDT (Tuesday)
From: Glassman.PA
Subject: Floating point arithmetic on DLion
To: Stewart
Reply-To: Glassman

Larry,
	I'm trying to do some basic real arithmetic stuff.  Through asking people I've found your Real package on [Indigo]<CedarLib>REal>Rubicon>, but I've been unable to get it to work.  I'm running in Tajo on a Dandelion, with no floating point microcode that I know of.  If you can give me any ideas, I'd appreciate it.
					Steve
*start*
01691 00024 US 
Date: 21 Oct. 1981 9:45 am PDT (Wednesday)
From: Stewart.PA
Subject: Re: Floating point arithmetic on DLion
In-reply-to: Your message of 20-Oct-81 17:20:43 PDT (Tuesday)
To: Glassman
cc: Stewart

Steve,

The Mesa 6 Floating point (Rubicon is essentially identical) requires at least some
microcode.  The compiler outputs ESC instructions for REAL operations, which
are either executed by floating point microcode or trap to the RealImpl package. 
In addition, the Rubicon floating point uses the "Cedar" trapping conventions
rather than the Pilot convention, so it won't work with vanilla Tajo.

There are several possibilities:

Make a version of RealImpl that does not try to set up the trap handlers
(otherwise your MDS gets clobbered somewhere) and use the interface RealOps,
as in x ← RealOps.FAdd[a, b];.  If you can stand it...

Make a version of RealImpl that uses the Pilot trapping mechanism, via
sUnimplemented in SD.  That should work, but I am not sure whether Pilot
preserves enough information to let one implement "instructions" with software. 
This would require changes to the Pilot sUnimplemented handler, or some
cleverness about snarfing the FP related traps before passing control to the
standard handler.

Make a version of RealImpl that fills in the old FADD, etc. entries in SD, then
use RealOps for Fix to Integer (rather than the Real.bcd version, which is an ESC
INLINE).  This would require you to use the /-f compiler switch, if it still works.

Get the DLion microcoders to implement enough of the Floating Point opcodes to
trap to software with enough information to resume.  (This approach only if
sUnimplemented loses state.)

	-Larry

*start*
00983 00024 US 
Date: 13 Nov. 1981 12:31 pm PST (Friday)
From: Stevens.PA
Subject: Real Functions
To: LStewart
cc: Oki, Stevens

Larry,

We have many applications involving Real numbers and we are very much
interested in having a complete (correct) library of floating point routines to aid
in solving these applications in Mesa.

One recent experience in the use of Power showed an error when raising a
negative number to a power.  It appears that negative numbers as bases may not
be covered by the implementation in RealFns.  Is a more general implementation
planned for the future?

Are there any other functions which have nonobvious limitations?  And can
your routines be programmed to flag the odd cases rather than returning results
and leaving it to the program to find out if they are good or not?

Again, we look forward to extensive use of these capabilities and we will be
very cooperative in working out and learning the guidelines for their proper use.

	Jim

*start*
01082 00024 US 
Date: 13 Nov. 1981 2:24 pm PST (Friday)
From: Stewart.PA
Subject: Re: Real Functions
In-reply-to: Stevens' message of 13 Nov. 1981 12:31 pm PST (Friday)
To: Stevens
cc: Stewart, Oki, Wyatt, Fiala, Wilhelm, Warnock, Morris, Satterthwaite, Thacker,
 Lampson

I can well believe there are loose ends in RealFns.  I don't know where they are,
nor do I have any particular plans to look for or fix them.  (I like to be blunt up
front...)

There IS a better SqRt waiting for installation and I have better RealToString
facilities in a partial state of completion.

On a more global scale, I believe we need to design and implement double
precision REAL and get it (and single precision) included in the PrincOps. 
There are a number of problems having to do with process state etc.  We also
need (you are quite right) a better math package, with all the bits right for
single precision and we need a double precision implementation.  There is no 
one actively working on these things, although I think about it occasionally.

Any suggestions out there?   -Larry

*start*
00792 00024 US 
Date: 13 Nov. 1981 3:26 pm PST (Friday)
From: Wilhelm.PA
Subject: Re: Real Functions
In-reply-to: Stewart's message of 13 Nov. 1981 2:24 pm PST (Friday)
To: Stewart
cc: Stevens, Oki, Wyatt, Fiala, Wilhelm, Warnock, Morris, Satterthwaite, Thacker,
 Lampson

Jim Stevens' immediate problem is simply that the current implementations of the
real functions do not generate SIGNALs for exceptional situations.  SqRt is
typical; if you give it a negative argument, it simply returns 0.  Our community
of number crunching types is growing, and I think it well worth the time to add
some basic error checks to RealFns.
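
The kind of check in question is tiny (a hypothetical C sketch, not the RealFns
code; in Mesa the odd case would raise a SIGNAL rather than set errno):

#include <math.h>
#include <errno.h>

/* Flag the odd case instead of quietly returning 0 for a negative argument. */
double checked_sqrt(double x)
{
    if (x < 0.0) {
        errno = EDOM;    /* the caller can test this instead of trusting the result */
        return nan("");
    }
    return sqrt(x);
}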

Double precision is definitely needed, but I have a feeling that it will have to
wait a while for a number of uninteresting and frustrating reasons.

Neil

*start*
02354 00024 US 
Date: 7 Dec. 1981 5:42 pm PST (Monday)
From: Villano.ES
Subject: Mesa D0 Floating Point
To: LStewart.Pa
cc: Villano, Ruggles, Fiala.PA


I received the attached message from Ed Fiala regarding the need for Mesa level
support of the REAL operations.  Since your name was mentioned as the owner
of a Mesa support package, I hope you could answer a few questions for me.

I am writing the Interpress Interpreter for the 5700.  As you may know the 5700
has a D0 for the base processor and uses a modified version of Pilot as the
operating system.  The representation for numbers internal to the Interpress
Interpreter is best represented by floating point arithmetic (in fact the
specification, except for the exponent range, requires IEEE 32 bit floating point). 
I was originally under the (bad) impression that the microcode required no Mesa
level support but Ed's answer (below) set me straight.

I would like to understand the need for the Mesa support package and the
implications of using the Mesa package.  By this I mean:
	What function does the Mesa support package perform?
	How much (object) code is the package?
	Where do I go about getting the code?
	How is this bound in with Pilot?  (Is there anything special about this?)
	Are there any other floating point support packages available that I
		should know about?
	Is there any document showing the speed of the floating point operations
		within a D0/Pilot environment, either theoretically, experimentally,
		or both?


Thank you

Tony

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Date: 7 Dec. 1981 11:06 am PST (Monday)
From: Fiala.PA
Subject: Re: Mesa D0 Floating Point
In-reply-to: Your message of 4 Dec. 1981 5:17 pm PST (Friday)
To: Villano.ES
cc: Fiala

Larry Stewart (LStewart.Pa) has a Mesa package that implements the
single-precision 32-bit floating point which we use.  You MUST load this
package to use REAL arithmetic.  It will work correctly WITH OR WITHOUT
floating point microcode on any Mesa machine.  Floating point microcode will
NOT work correctly without this package.

If you also have floating point microcode, performance of the real arithmetic will
be improved substantially, but this is not required for correct operation.



*start*
01422 00024 US 
Date: 8 Dec 1981 0915-PST (Tuesday)
Sender: CSL-Notebook
Subject:  Pupwatch
From: Birrell
To: CSL↑, Carothers


Title:  Pupwatch
Authors: Birrell
Date: 7 Dec 81 
Attributes: informal, technical, Alto, Communication, Distributed computing.

Abstract:  

Pupwatch is a program for observing PUP packets on your local Ethernet
network.  It will run on any computer that can run Alto/Mesa 6.0.  Pupwatch
operates by making the computer promiscuous, so that it will accept every
well-formed Ethernet packet on the attached network.  It then filters these
packets to select only the packets which are from or to the particular PUP host
currently being watched.

Pupwatch has a reasonable understanding of the common PUP packet types, the
PUP Byte Stream Protocol and the Leaf "Sequin" protocol.  On the display,
Pupwatch shows you a summary of packets interactively as they arrive.  By
explicit command, you can cause Pupwatch to write a disk file containing details
of the packets received.

Pupwatch is available as a net-bootable program that can be installed on IFS or
Gateway boot servers, and as an Alto ".run" file, and as an Alto/Mesa 6 bcd. 
Please read the documentation if you are interested in using it.  The
documentation is [Indigo]<Grapevine>Pupwatch>Pupwatch.press (and .bravo). 
Pupwatch.boot is at present net-bootable on net 3.


File: [Indigo]<CSL-Notebook>Entries>81CSLN-0065.press

*start*
04209 00024 US 
Date: 8 Dec. 1981 10:54 am PST (Tuesday)
From: Stewart.PA
Subject: Re: Mesa D0 Floating Point
In-reply-to: Villano.ES's message of 7 Dec. 1981 5:42 pm PST (Monday)
To: Villano.ES
cc: Stewart, Ruggles.ES, Fiala

Mesa floating point comes in two pieces:  microcode and a Mesa package
presently called RealImpl.  Floating point instructions generated by the Mesa 6
(and up) Compiler are things like [zESC, 26B].  Since these instructions are not
interpreted by the "standard" microcode, at least SOME additional microcode is
needed.  The minimum needed is microcode to trap gracefully to Mesa when a
floating point instruction comes by so that the instruction can be implemented in
software.  The Alto II, D0, and Dorado have additional microcode that nearly
always executes a floating point instruction without trapping to software. 
Needless to say, this is much faster.  (I should remark that it is possible, by
modifying the Mesa package and by using the Compiler /-f switch, to avoid the
need for any microcode.)

The Mesa package does three jobs:  It implements the floating point operations in
their full generality, it handles floating point exceptions, and it provides some
additional floating point functions.

The IEEE standard is rather complex, and it would take an enormous amount of
microcode to handle all of it.  The approach taken has been to write the
microcode to handle all the "easy" cases and to trap on those operations or
operands which are difficult (and which rarely occur).

When a floating point exception occurs (divide by 0, overflow, etc.), the job of
raising a Mesa SIGNAL is too complex for microcode to reasonably handle.  In the
event of an exception, the microcode simply gives up and traps to software.

In order to make a reasonably complete set of floating point utilities available,
the RealImpl package also implements functions like Sine, Cosine, SquareRoot,
and utilities like StringToReal and RealToString.

In order to make a clean division between microcode and software, to allow the
use of the same Mesa package on all machine configurations, and to permit the
microcoding of ever more of the IEEE standard, the approach taken is that when
the floating point microcode decides an operation is too complex to handle (or
when an exception occurs), the microcode passes the original operands and
instruction to software by trapping -- as though the microcode were not there. 
For this reason, the Mesa package is prepared to handle all combinations of
operands and operations, even though the simpler ones are always handled by
the microcode.
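
The division of labor is roughly the following (a structural C sketch with
hypothetical names, not RealImpl or the microcode):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the general software routine that handles every case:
 * denormals, infinities, NaNs, exceptions, sticky bits, and so on. */
static float full_software_fadd(float a, float b)
{
    return a + b;
}

static int is_easy(float x)    /* zero or a finite, normalized value */
{
    uint32_t bits, exp;
    memcpy(&bits, &x, sizeof bits);
    exp = (bits >> 23) & 0xFFu;
    return exp != 0xFFu && (exp != 0 || (bits & 0x7FFFFFu) == 0);
}

float fadd(float a, float b)
{
    if (is_easy(a) && is_easy(b)) {
        float r = a + b;              /* the fast ("microcode") path */
        if (is_easy(r)) return r;     /* easy result too: done */
    }
    return full_software_fadd(a, b);  /* otherwise "trap" with the original operands */
}

int main(void)
{
    printf("%g\n", fadd(1.5f, 2.25f));       /* easy case, fast path: 3.75 */
    printf("%g\n", fadd(3.0e38f, 3.0e38f));  /* result overflows: falls back */
    return 0;
}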


In its present form, RealImpl.bcd is 11,000 bytes (no symbols).  I believe it could
be made much smaller by elimination of the functions (Sine, etc.), and by some
more coding effort.

All the code and documentation are on [Indigo]<Cedarlib>Real>Alto> or
[Indigo]<Cedarlib>Real>Rubicon> (for Pilot).  The documentation is on
[Indigo]<Cedarlib>Real>Doc>.  Alto microcode is on [Indigo]<Cedarlib>Real>Alto>. 
Ed Fiala keeps the D0 microcode and Ed Taft keeps the Dorado microcode.

As mentioned earlier, the microcode needs to trap to software on occasion.  In the
Alto world, this trapping is done by a special mechanism through SD[137B].  In
the CSL/Cedar/Pilot world, trapping is done by a general mechanism called
TrapSupport, which is built into the Cedar.boot files we use.  This mechanism
uses 2 pages of MDS space in order to run fast.  A smaller but slower mechanism
could be built with some amount of work.  TrapSupport is managed by Ed Taft
and Roy Levin.

As for other floating point packages, there are predecessors of the Mesa 6
package for Mesa 5 and possibly even Mesa 4.  There was also a package written
by Ashok Singhania of PSD a while back.

There are no up to date measurements of the speed of either Mesa or microcode
floating point.  The general ballpark figures are that the (D0) microcode is in the
30-60 microsecond range somewhere and the Mesa is in the several millisecond
range.  The Dorado runs about ten times faster and the Alto about half as fast.

There are things about the present package that need work, but no one here at
least has much time for it.

	-Larry

*start*
01184 00024 US 
Date: 8 Dec. 1981 11:11 am PST (Tuesday)
From: Fiala.PA
Subject: Re: Mesa D0 Floating Point
In-reply-to: Villano.ES's message of 7 Dec. 1981 5:42 pm PST (Monday)
To: Villano.ES
cc: LStewart, Ruggles.ES, Fiala

The following is the (nominal) speed of the D0/Pilot floating point opcodes
assuming that you are using the floating point microcode:

	FAD      154 cycles
	FSUB     156 cycles
	FMUL     387 cycles
	FDIV     420 cycles
	FCOMP     92 cycles
	FIX       70 cycles
	FIXI      74 cycles
	FIXC      68 cycles
	FSTICKY   24 cycles
	FREM      no microcode
	ROUND     83 cycles
	ROUNDI    87 cycles
	ROUNDC    81 cycles
	FSQRT    480 cycles
	FSC       88 cycles

You should not use FSC at the present time because the arrangement of its
arguments may change.

Generally speaking, the various opcodes will complete in microcode so long as no
denormalized arguments are received and no result overflows or underflows. 
One of the option bits in FSTICKY allows substitution of 0 on any underflow,
and if you use this option, opcodes which underflow will also complete in
microcode.  When an opcode is not completed in microcode it will trap to
software.
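
In IEEE terms, the operand classes behind that rule look like this (an
illustrative C classifier, not the microcode):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

static const char *classify(float x)
{
    uint32_t bits, exp, frac;
    memcpy(&bits, &x, sizeof bits);
    exp  = (bits >> 23) & 0xFFu;
    frac =  bits & 0x7FFFFFu;
    if (exp == 0xFFu) return frac ? "NaN" : "infinity";
    if (exp == 0)     return frac ? "denormalized (goes to software)" : "zero";
    return "normalized (stays in microcode)";
}

int main(void)
{
    printf("%s\n", classify(1.5f));     /* normalized */
    printf("%s\n", classify(1e-40f));   /* below the smallest normalized single */
    printf("%s\n", classify(0.0f));     /* zero */
    return 0;
}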

I leave the rest of your questions to Larry Stewart.

*start*
00337 00024 US 
Date: 3 March 1982 9:57 am EST (Wednesday)
From: Dattola.WBST
Subject: Pilot Reals
To: PilotUsers↑.PA
Reply-To: Dattola

I've been told that there is a software floating point package for Dandelions. Does
anyone know where to find this? Better yet, is there a microcoded version (or
will there be)?

Thanks,
Bob

*start*
01246 00024 US 
Date: 6 March 1982 5:19 pm PST (Saturday)
From: Stewart.PA
Subject: Re: Pilot Reals
In-reply-to: Dattola.WBST's message of 3 March 1982 9:57 am EST (Wednesday)
To: Dattola.WBST
cc: Stewart

I don't know if you have had any answers, so I thought I'd throw in my
information.

See press files on [Indigo]<Cedarlib>Real> and subdirectories.

There is FP support for Alto II's, Dolphins, and Dorados.  It requires at least some
microcode, either to handle the entire operation or to trap to software.

There is a less versatile package for Alto I's that uses no microcode.

All this stuff is Mesa 6.

The Pilot FP used in CSL on Dolphins and Dorados is (a) Rubicon, and (b) uses
CSL's peculiar microcode trapping machinery.

This stuff could be adapted to DLions, but the DLion microcode would have to
be taught to understand the ESC opcodes used for FP and the Mesa support
package would probably have to be modified for a new trapping convention.

By using the Mesa compiler /-f switch and modifying the Mesa part of the FP
package, a non-microcode version of FP for DLions could be put together in a
few hours, but everything that used FP would have to be compiled /-f.

As far as I know, no one is working on it.

	-Larry

*start*
00783 00024 US 
Date: 12-Mar-82 17:26:29 PST (Friday)
From: Evans.PA
Subject: Faster Software Floating Point
To: PerfProject↑.ES
Reply-To: Evans.PA
cc: Johnsson.PA, Newlin.PA, Weissman.PA, AlBrown.PA, Stewart.PA, PerformanceInterest↑.PA, Wick.PA, Ayers.PA, Lipkie.ES, Evans.PA, Sapsford.PA

Star Graphics stretching (a performance dog of the graphics world) is now about 5 times faster.  This appears to be solely due to the use of double-word microcode shifts (new for Trinity) in the software implementation of floating point now provided by Trinity Common Software.  The software is identical to the Rubicon version except for the use of these double-word shifts.

Thank you, Core Software.  (However, we still need a Dandelion microcode floating point package!)


Steve