RealFnsImpl.mesa
Copyright © 1984, 1985 by Xerox Corporation. All rights reserved.
Stewart on August 27, 1982 1:07 pm
Paul Rovner on May 4, 1983 10:13 am
Levin, August 8, 1983 4:39 pm
Plass, October 24, 1984 4:26:48 pm PDT
Russ Atkinson (RRA), May 29, 1985 1:16:23 am PDT
DIRECTORY
Ieee USING [CVExtended, ExponentBias, SingleReal, Unpack],
Real USING [Extended, Fix, FixC, FixI, RealError, RealException],
RealFns;
RealFnsImpl: CEDAR PROGRAM IMPORTS Ieee, Real EXPORTS RealFns = BEGIN
NOTE: The compiler is bloody incompetent about floating point! In particular, it cannot deal with constant folding. Therefore, various constants are given as variables to avoid excess computation. Please make these revert to constants when the compiler gets smarter. (RRA)
Some useful constants
Ln2: REAL = 0.693147181;
LogBase2ofE: REAL ← 1/Ln2;
smallnumber: REAL = 0.00000000005;
PI: REAL = 3.14159265;
twoPI: REAL ← PI*2;
PI3ovr2: REAL ← 3*PI/2;
PIovr2: REAL = 1.570796327;
PI3ovr8: REAL = 1.1780972;
PIovr4: REAL = .785398163;
PIovr8: REAL = .392699;
radiansPerDegree: REAL ← PI/180;
degreesPerRadian: REAL ← 180/PI;
rec2pi: REAL ← 1.0/twoPI;
rec360: REAL ← 1.0/360.0;
fourOverPI: REAL ← 4.0/PI;
Exp:
PUBLIC
PROC [x:
REAL]
RETURNS [
REAL] = {
compute e^x using continued fractions
ACM Algorithm 229, Morelock, James C., "Elementary functions by continued fractions"
r, f: REAL;
k: INTEGER;
fl: Ieee.SingleReal ← LOOPHOLE[1.0, Ieee.SingleReal];
To keep the error small near 0.0, use this Taylor's series
SELECT
ABS[x]
FROM
< 3e-4 => RETURN [1+x];
< 3e-3 => RETURN [1+x+x*x*0.5];
ENDCASE;
x ← x*LogBase2ofE; --exponent of 2 (instead of e)
IF x < -Ieee.ExponentBias THEN RETURN [0.0];
IF x > Ieee.ExponentBias THEN ERROR Real.RealError[];
k ← Real.FixI[x]; --integer part
x ← x - k; --fractional part
x ← x*Ln2; --now result = 2^k * exp(x) and 0<=x<Ln2
r ← x*x;
{
Expand the continued fraction loop a few times
LoopCount: NAT = 3;
f ← 4*LoopCount + 2;
f ← (4*(LoopCount-0) - 2) + r/f;
f ← (4*(LoopCount-1) - 2) + r/f;
f ← (4*(LoopCount-2) - 2) + r/f;
Note, if LoopCount > 3, then use the following comments
f ← (4*(LoopCount-3) - 2) + r/f;
f ← (4*(LoopCount-4) - 2) + r/f;
};
unscaled result is (f+x)/(f-x)
x ← (f + x)/(f - x);
scale: multipy by 2^k by adding k to the exponent
fl.exp ← k + Ieee.ExponentBias;
RETURN[x*LOOPHOLE[fl, REAL]];
};
Log:
PUBLIC
PROC [base, arg:
REAL]
RETURNS [
REAL] = {
logxy = Ln(y)/Ln(x)
RETURN[Ln[arg]/Ln[base]];
};
Ln:
PUBLIC
PROC [x:
REAL]
RETURNS [
REAL] = {
fl: Ieee.SingleReal;
m: INTEGER;
xm1: REAL ← x-1;
OneMinusX, OneMinusXPowerRep: REAL;
IF x <= 0 THEN Bomb[x];
To keep the error small near 1.0, use this Taylor's series
SELECT
ABS[xm1]
FROM
> 3e-3 => {};
> 3e-4 => RETURN [xm1 - 0.5*xm1*xm1];
ENDCASE => RETURN [xm1];
if x = z * 2^m then Ln(x) = m * Ln2 + Ln(z). Here, 1/2 <= z < 1
fl ← LOOPHOLE[x, Ieee.SingleReal];
m ← fl.exp - (Ieee.ExponentBias - 1);
fl.exp ← Ieee.ExponentBias - 1; --now 1/2 <= z < 1
x ← LOOPHOLE[fl, REAL];
special case for limit condition (x a power of 2)
IF x = 0.5 THEN RETURN[Ln2*(m-1)];
OneMinusX ← 1 - x;
x ← m*Ln2;
OneMinusXPowerRep ← 1;
FOR i:
CARDINAL ← 1, i+1
DO
PreviousVal: REAL ← x;
OneMinusXPowerRep ← OneMinusXPowerRep*OneMinusX;
x ← x - OneMinusXPowerRep/i;
IF x = PreviousVal THEN EXIT;
ENDLOOP;
RETURN[x];
};
Bomb:
PROC [x:
REAL] = {
[] ← SIGNAL Real.RealException[flags: [invalidOperation: TRUE], vp: NEW[Real.Extended ← Ieee.CVExtended[Ieee.Unpack[x]]]];
ERROR Real.RealError[];
};
SquareReal:
TYPE =
MACHINE
DEPENDENT
RECORD [
m2: CARDINAL,
sign: BOOL, expD2: [0..128), index: [0..32), m1: [0..8)];
This declaration obviously depends on IEEE format 32-bit numbers (Ieee.SingleReal). The idea is that the index field has the low bit of the exponent and the high 4 bits of the mantissa, which is used to index a table of initial guesses at the square root, each guess is roughly halfway between the lowest and highest root with the given index, which allows convergence with only two iterations. I don't know where the table originated.
SqRt:
PUBLIC PROC [a:
REAL]
RETURNS [b:
REAL ← 0.0] = {
RRA: The idea is to lookup a square root guess from the table, adjusting for the exponent, then perform a few iterations to get within a reasonable error. Someday we will get ambitious and share the table with IeeeFloatB.
SELECT
TRUE
FROM
a <= 0.0 => {
The guess is only valid when we have a normalized number, so we force it to be normalized, then correct for the shift.
IF a < 0.0 THEN Bomb[a];
b ← 0.0;
};
LOOPHOLE[a, Ieee.SingleReal].exp = 0 =>
The guess is only valid when we have a normalized number, so we force it to be normalized, then correct for the shift.
b ← SqRt[a * (LONG[20000B] * LONG[20000B])] / 20000B;
ENDCASE => {
aFmt: SquareReal ← LOOPHOLE[a, SquareReal];
bFmt: Ieee.SingleReal ←
LOOPHOLE[sqGuesses[aFmt.index], Ieee.SingleReal];
Get the initial guess based on the top 4 bits of the mantissa and the bottom bit of the exponent.
bFmt.exp ← bFmt.exp + aFmt.expD2 - 63;
Adjust the guess exponent based on adding half of the initial exponent (adjusted for the offset inherent in Ieee format).
b ←
LOOPHOLE[bFmt,
REAL];
The result is initially the guess
b ←
LOOPHOLE[
LOOPHOLE[b + a/b,
LONG
CARDINAL] - 40000000B,
REAL];
This is really b ← (b + a/b)/2, but much faster
b ←
LOOPHOLE[
LOOPHOLE[b + a/b,
LONG
CARDINAL] - 40000000B,
REAL];
This is really b ← (b + a/b)/2, but much faster
};
};
sqGuesses:
ARRAY[0..32)
OF
REAL ← [
LOOPHOLE[7715741440B, REAL], -- 0.7178211
LOOPHOLE[7717240700B, REAL],
LOOPHOLE[7720514140B, REAL],
LOOPHOLE[7721745140B, REAL],
LOOPHOLE[7723155200B, REAL],
LOOPHOLE[7724346000B, REAL],
LOOPHOLE[7725517500B, REAL],
LOOPHOLE[7726654000B, REAL], -- 0.8568115
LOOPHOLE[7727773400B, REAL], -- 0.8748627
LOOPHOLE[7731077100B, REAL],
LOOPHOLE[7732167300B, REAL],
LOOPHOLE[7733245000B, REAL],
LOOPHOLE[7734310400B, REAL],
LOOPHOLE[7735342500B, REAL],
LOOPHOLE[7736363400B, REAL],
LOOPHOLE[7737373700B, REAL], -- 0.9920616
LOOPHOLE[7740370240B, REAL], -- 1.015156
LOOPHOLE[7741351440B, REAL],
LOOPHOLE[7742314540B, REAL],
LOOPHOLE[7743243000B, REAL],
LOOPHOLE[7744155200B, REAL],
LOOPHOLE[7745054400B, REAL],
LOOPHOLE[7745741300B, REAL],
LOOPHOLE[7746614600B, REAL], -- 1.211716
LOOPHOLE[7747457040B, REAL], -- 1.237247
LOOPHOLE[7750310640B, REAL],
LOOPHOLE[7751132500B, REAL],
LOOPHOLE[7751745000B, REAL],
LOOPHOLE[7752550100B, REAL],
LOOPHOLE[7753344400B, REAL],
LOOPHOLE[7754132600B, REAL],
LOOPHOLE[7754712500B, REAL] -- 1.402992
];
Root:
PUBLIC
PROC [index, arg:
REAL]
RETURNS [
REAL] = {
yth root of x = e^(Ln(x)/y)
RETURN[Exp[Ln[arg]/index]];
};
Power:
PUBLIC
PROC [base, exponent:
REAL]
RETURNS [
REAL] = {
x^y = e^(y*Ln(x))
IF base = 0 THEN IF exponent = 0 THEN RETURN[1] ELSE RETURN[0];
RETURN[Exp[exponent*Ln[base]]];
};
Sin:
PUBLIC PROC [radians:
REAL]
RETURNS [sin:
REAL] = {
rem: REAL ← ABS[radians];
IF rem > twoPI
THEN rem ← GetWithinTwoPI[rem];
We need to have rem IN [0..twoPI]
{
Now select the appropriate formula given the octant.
SELECT Real.FixC[rem * fourOverPI]
FROM
0 => {GO TO posSin};
1 => {rem ← PIovr2 - rem; GO TO posCos};
2 => {rem ← rem - PIovr2; GO TO posCos};
3 => {rem ← PI - rem; GO TO posSin};
4 => {rem ← rem - PI; GO TO negSin};
5 => {rem ← PI3ovr2 - rem; GO TO negCos};
6 => {rem ← rem - PI3ovr2; GO TO negCos};
7 => {rem ← twoPI - rem; GO TO negSin};
ENDCASE => RETURN [0.0];
EXITS
posSin => {sin ← SinFirstOctant[rem]; IF radians < 0.0 THEN GO TO neg};
negSin => {sin ← SinFirstOctant[rem]; IF radians > 0.0 THEN GO TO neg};
posCos => {sin ← CosFirstOctant[rem]; IF radians < 0.0 THEN GO TO neg};
negCos => {sin ← CosFirstOctant[rem]; IF radians > 0.0 THEN GO TO neg};
};
EXITS neg => sin ← - sin;
};
Cos:
PUBLIC PROC [radians:
REAL]
RETURNS [cos:
REAL] = {
rem: REAL ← ABS[radians];
IF rem > twoPI
THEN rem ← GetWithinTwoPI[rem];
We need to have rem IN [0..twoPI]
{
Now select the appropriate formula given the octant.
SELECT Real.FixC[rem * fourOverPI]
FROM
0 => {GO TO posCos};
1 => {rem ← PIovr2 - rem; GO TO posSin};
2 => {rem ← rem - PIovr2; GO TO negSin};
3 => {rem ← PI - rem; GO TO negCos};
4 => {rem ← rem - PI; GO TO negCos};
5 => {rem ← PI3ovr2 - rem; GO TO negSin};
6 => {rem ← rem - PI3ovr2; GO TO posSin};
7 => {rem ← twoPI - rem; GO TO posCos};
ENDCASE => cos ← 1.0;
EXITS
posSin => {cos ← SinFirstOctant[rem]};
negSin => {cos ← SinFirstOctant[rem]; cos ← -cos};
posCos => {cos ← CosFirstOctant[rem]};
negCos => {cos ← CosFirstOctant[rem]; cos ← -cos};
};
};
SinDeg:
PUBLIC
PROC [degrees:
REAL]
RETURNS [sin:
REAL] = {
IF
ABS[degrees] > 360
THEN degrees ← GetWithin360[degrees];
We get better results if we compute in the degrees domain for a while.
sin ← Sin[degrees*radiansPerDegree];
};
CosDeg:
PUBLIC
PROC [degrees:
REAL]
RETURNS [cos:
REAL] = {
degrees ← ABS[degrees];
IF degrees > 360
THEN degrees ← GetWithin360[degrees];
We get better results if we compute in the degrees domain for a while.
cos ← Cos[degrees*radiansPerDegree];
};
Tan:
PUBLIC
PROC [radians:
REAL]
RETURNS [tan:
REAL] = {
IF ABS[radians] > twoPI THEN radians ← GetWithinTwoPI[radians];
tan ← Sin[radians] / Cos[radians];
};
TanDeg:
PUBLIC
PROC [degrees:
REAL]
RETURNS [tan:
REAL] = {
IF ABS[degrees] > 360 THEN degrees ← GetWithin360[degrees];
tan ← SinDeg[degrees] / CosDeg[degrees];
};
Constants for ArcTan
tanPI16: REAL = .1989123673;
x2: REAL ← 1/(.414213562);
x22: REAL ← x2*x2 + 1;
tan3PI16: REAL = .668178638;
x4: REAL = 1.0;
x44: REAL ← x4*x4 + 1;
tan5PI16: REAL = 1.496605763;
x6: REAL ← 1/(2.41421356);
x66: REAL ← x6*x6 + 1;
tan7PI16:
REAL = 5.02733949;
p0: REAL = .999999998;
p1: REAL = -.333331726;
p2: REAL = .1997952738;
p3: REAL = -.134450639;
ArcTan:
PUBLIC
PROC [y, x:
REAL]
RETURNS [radians:
REAL] = {
This function is good to 8.7 decimal places and is taken from:
Computer Approximations, John F. Hart et.al. pp129(top 2nd col).
s, v, t, t2, c, q: REAL;
IF
ABS[x] <=
ABS[smallnumber]
THEN
IF
ABS[y] <=
ABS[smallnumber]
THEN RETURN[0]
ELSE IF y < 0 THEN RETURN[-PIovr2] ELSE RETURN[PIovr2];
IF x < 0
THEN IF y < 0 THEN {q ← -PI; s ← 1} ELSE {q ← PI; s ← -1}
ELSE IF y < 0 THEN {q ← 0; s ← -1} ELSE {q ← 0; s ← 1};
v ← ABS[y/x];
SELECT v
FROM
< tanPI16 => {t ← v; c ← 0};
< tan3PI16 => {t ← x2 - x22/(x2 + v); c ← PIovr8};
< tan5PI16 => {t ← x4 - x44/(x4 + v); c ← PIovr4};
< tan7PI16 => {t ← x6 - x66/(x6 + v); c ← PI3ovr8};
ENDCASE => {t ← -1/v; c ← PIovr2};
t2 ← t*t;
radians ← s*(t*(p0 + t2*(p1 + t2*(p2 + t2*p3))) + c) + q;
};
ArcTanDeg:
PUBLIC
PROC [y, x:
REAL]
RETURNS [degrees:
REAL] = {
degrees ← ArcTan[y, x]*degreesPerRadian;
};
AlmostZero:
PUBLIC
PROC [x:
REAL, distance:
INTEGER [-126..127]]
RETURNS [
BOOL] = {
RETURN[LOOPHOLE[x, Ieee.SingleReal].exp < (distance + Ieee.ExponentBias)];
};
AlmostEqual:
PUBLIC
PROC [y, x:
REAL, distance:
INTEGER [-126..0]]
RETURNS [
BOOL] = {
fl: Ieee.SingleReal ← LOOPHOLE[(x - y)];
oldexp:
INTEGER ←
MAX[
LOOPHOLE[x, Ieee.SingleReal].exp, LOOPHOLE[y, Ieee.SingleReal].exp];
RETURN[fl.exp < (distance + oldexp)];
};
Helpful internal functions
Coefficients for the taylor series for sin and cos.
fact9recip: REAL ← 1.0/(INT[1]*2*3*4*5*6*7*8*9);
fact8recip: REAL ← 1.0/(INT[1]*2*3*4*5*6*7*8);
fact7recip: REAL ← 1.0/(INT[1]*2*3*4*5*6*7);
fact6recip: REAL ← 1.0/(INT[1]*2*3*4*5*6);
fact5recip: REAL ← 1.0/(INT[1]*2*3*4*5);
fact4recip: REAL ← 1.0/(INT[1]*2*3*4);
fact3recip: REAL ← 1.0/(INT[1]*2*3);
fact2recip: REAL ← 1.0/(INT[1]*2);
minEps:
REAL ← 1.0 -
LOOPHOLE[
LOOPHOLE[1.0,
LONG
CARDINAL]-1,
REAL];
roughly 5.96e-8, the minimum relative error between two different reals
sinCosEps:
REAL ← SqRt[minEps];
RRA: IF radians <= sinEps THEN SinFirstOctant[radians] = radians AND CosFirstOctant[radians] = 1.0 to the precision of our number system. The contribution of the smaller term of two consecutive terms in the Sin or Cos polynomials is down by the square from the larger (and a factor of at least two). So sinCosEps roughly equals 1.726e-4. Testing for this case improves accuracy at small values.
SinFirstOctant:
PROC [radians:
REAL]
RETURNS [
REAL] = {
This version of Sin is only guaranteed to be accurate within the first octant: [0.0..PI/4], which is approximately [0.0..0.785+]. When radians = PI/4 we have the maximum contribution of the term after the fact9recip term, for a truncation error of roughly 1.76e-9, which is within minEps*Sin[radians] (= 4.2e-8).
IF radians <= sinCosEps
THEN
RETURN [radians]
ELSE {
r2: REAL ~ radians*radians;
RETURN [(((fact9recip*r2-fact7recip)*r2+fact5recip)*r2-fact3recip)*r2*radians+radians];
};
};
CosFirstOctant:
PROC [radians:
REAL]
RETURNS [
REAL] = {
This version of Cos is only guaranteed to be accurate within the first octant: [0.0..PI/4]. When radians = PI/4 we have the maximum contribution of the terms after the fact8recip term, for a truncation error of roughly 2.46e-8, which is within minEps*Cos[PI/4] (= 4.2e-8).
IF radians <= sinCosEps
THEN
RETURN [1.0]
ELSE {
r2: REAL ~ radians*radians;
RETURN [((((fact8recip*r2-fact6recip)*r2+fact4recip)*r2-fact2recip)*r2+1)];
};
};
Two constants used to add N*PI with more than normal precision.
PIc1: REAL ← 6.28125; -- 402.0/64 (exact representation near twoPI)
PIc2: REAL = 1.9353072e-3; -- 2*PI - c1
GetWithinTwoPI:
PROC [radians:
REAL]
RETURNS [
REAL] = {
Let the returned value be R. Then
radians <= 0 ƒ radians H xn*twoPI + R ' -twoPI <= R <= 0
radians >= 0 ƒ radians H xn*twoPI + R ' 0 <= R <= twoPI
xn: REAL ← Real.Fix[radians * rec2pi];
RETURN [(radians - xn*PIc1) - xn*PIc2];
};
GetWithin360:
PROC [degrees:
REAL]
RETURNS [
REAL] = {
Let the returned value be R. Then
degrees <= 0 ƒ degrees H xn*360 + R ' -360 <= R <= 0
degrees >= 0 ƒ degrees H xn*360 + R ' 0 <= R <= 360
xn: REAL ← Real.Fix[degrees * rec360];
RETURN [degrees - xn*360];
};
September 16, 1980 3:13 PM, Stewart; Bug in Exp
September 28, 1980 8:28 PM, Stewart; Add AlmostEqual and AlmostZero, format
November 7, 1980 3:33 PM, Stewart; fix AlmostZero
August 27, 1982 1:07 pm, Stewart; CEDAR
October 24, 1984 4:37:17 pm PDT, Plass; New impls for Sin, Cos, SinDeg, CosDeg
November 2, 1984, Atkinson; Faster & more accurate SqRt, Sin, Cos, SinDeg, CosDeg