[Cyan]<Cedar6.0>Real>RealFnsImpl.mesa!5

RealFnsImpl.mesa

Stewart on August 27, 1982 1:07 pm

Paul Rovner on May 4, 1983 10:13 am

Levin, August 8, 1983 4:39 pm

Plass, October 24, 1984 4:26:48 pm PDT

Russ Atkinson (RRA), May 29, 1985 1:16:23 am PDT

DIRECTORY

Ieee USING [CVExtended, ExponentBias, SingleReal, Unpack],

Real USING [Extended, Fix, FixC, FixI, RealError, RealException],

RealFns;

RealFnsImpl: CEDAR PROGRAM IMPORTS Ieee, Real EXPORTS RealFns = BEGIN

NOTE: The compiler is bloody incompetent about floating point! In particular, it cannot deal with constant folding. Therefore, various constants are given as variables to avoid excess computation. Please make these revert to constants when the compiler gets smarter. (RRA)

Some useful constants

Ln2: REAL = 0.693147181;

LogBase2ofE: REAL ← 1/Ln2;

smallnumber: REAL = 0.00000000005;

PI: REAL = 3.14159265;

twoPI: REAL ← PI*2;

PI3ovr2: REAL ← 3*PI/2;

PIovr2: REAL = 1.570796327;

PI3ovr8: REAL = 1.1780972;

PIovr4: REAL = .785398163;

PIovr8: REAL = .392699;

radiansPerDegree: REAL ← PI/180;

degreesPerRadian: REAL ← 180/PI;

rec2pi: REAL ← 1.0/twoPI;

rec360: REAL ← 1.0/360.0;

fourOverPI: REAL ← 4.0/PI;

Exp: PUBLIC PROC [x: REAL] RETURNS [REAL] = {

compute e^x using continued fractions

ACM Algorithm 229, Morelock, James C., "Elementary functions by continued fractions"

r, f: REAL;

k: INTEGER;

fl: Ieee.SingleReal ← LOOPHOLE[1.0, Ieee.SingleReal];

To keep the error small near 0.0, use this Taylor's series

SELECT ABS[x] FROM

< 3e-4 => RETURN [1+x];

< 3e-3 => RETURN [1+x+x*x*0.5];

ENDCASE;

x ← x*LogBase2ofE; --exponent of 2 (instead of e)

IF x < -Ieee.ExponentBias THEN RETURN [0.0];

IF x > Ieee.ExponentBias THEN ERROR Real.RealError[];

k ← Real.FixI[x]; --integer part

x ← x - k; --fractional part

x ← x*Ln2; --now result = 2^k * exp(x) and 0<=x<Ln2

r ← x*x;

{

Expand the continued fraction loop a few times

LoopCount: NAT = 3;

f ← 4*LoopCount + 2;

f ← (4*(LoopCount-0) - 2) + r/f;

f ← (4*(LoopCount-1) - 2) + r/f;

f ← (4*(LoopCount-2) - 2) + r/f;

Note, if LoopCount > 3, then use the following comments

f ← (4*(LoopCount-3) - 2) + r/f;

f ← (4*(LoopCount-4) - 2) + r/f;

};

unscaled result is (f+x)/(f-x)

x ← (f + x)/(f - x);

scale: multipy by 2^k by adding k to the exponent

fl.exp ← k + Ieee.ExponentBias;

RETURN[x*LOOPHOLE[fl, REAL]];

};

Log: PUBLIC PROC [base, arg: REAL] RETURNS [REAL] = {

logxy = Ln(y)/Ln(x)

RETURN[Ln[arg]/Ln[base]];

};

Ln: PUBLIC PROC [x: REAL] RETURNS [REAL] = {

fl: Ieee.SingleReal;

m: INTEGER;

xm1: REAL ← x-1;

OneMinusX, OneMinusXPowerRep: REAL;

IF x <= 0 THEN Bomb[x];

To keep the error small near 1.0, use this Taylor's series

SELECT ABS[xm1] FROM

> 3e-3 => {};

> 3e-4 => RETURN [xm1 - 0.5*xm1*xm1];

ENDCASE => RETURN [xm1];

if x = z * 2^m then Ln(x) = m * Ln2 + Ln(z). Here, 1/2 <= z < 1

fl ← LOOPHOLE[x, Ieee.SingleReal];

m ← fl.exp - (Ieee.ExponentBias - 1);

fl.exp ← Ieee.ExponentBias - 1; --now 1/2 <= z < 1

x ← LOOPHOLE[fl, REAL];

special case for limit condition (x a power of 2)

IF x = 0.5 THEN RETURN[Ln2*(m-1)];

OneMinusX ← 1 - x;

x ← m*Ln2;

OneMinusXPowerRep ← 1;

FOR i: CARDINAL ← 1, i+1 DO

PreviousVal: REAL ← x;

OneMinusXPowerRep ← OneMinusXPowerRep*OneMinusX;

x ← x - OneMinusXPowerRep/i;

IF x = PreviousVal THEN EXIT;

ENDLOOP;

RETURN[x];

};

Bomb: PROC [x: REAL] = {

[] ← SIGNAL Real.RealException[flags: [invalidOperation: TRUE], vp: NEW[Real.Extended ← Ieee.CVExtended[Ieee.Unpack[x]]]];

ERROR Real.RealError[];

};

SquareReal: TYPE = MACHINE DEPENDENT RECORD [

m2: CARDINAL,

sign: BOOL, expD2: [0..128), index: [0..32), m1: [0..8)];

This declaration obviously depends on IEEE format 32-bit numbers (Ieee.SingleReal). The idea is that the index field has the low bit of the exponent and the high 4 bits of the mantissa, which is used to index a table of initial guesses at the square root, each guess is roughly halfway between the lowest and highest root with the given index, which allows convergence with only two iterations. I don't know where the table originated.

SqRt: PUBLIC PROC [a: REAL] RETURNS [b: REAL ← 0.0] = {

RRA: The idea is to lookup a square root guess from the table, adjusting for the exponent, then perform a few iterations to get within a reasonable error. Someday we will get ambitious and share the table with IeeeFloatB.

SELECT TRUE FROM

a <= 0.0 => {

The guess is only valid when we have a normalized number, so we force it to be normalized, then correct for the shift.

IF a < 0.0 THEN Bomb[a];

b ← 0.0;

};

LOOPHOLE[a, Ieee.SingleReal].exp = 0 =>

The guess is only valid when we have a normalized number, so we force it to be normalized, then correct for the shift.

b ← SqRt[a * (LONG[20000B] * LONG[20000B])] / 20000B;

ENDCASE => {

aFmt: SquareReal ← LOOPHOLE[a, SquareReal];

bFmt: Ieee.SingleReal ← LOOPHOLE[sqGuesses[aFmt.index], Ieee.SingleReal];

Get the initial guess based on the top 4 bits of the mantissa and the bottom bit of the exponent.

bFmt.exp ← bFmt.exp + aFmt.expD2 - 63;

Adjust the guess exponent based on adding half of the initial exponent (adjusted for the offset inherent in Ieee format).

b ← LOOPHOLE[bFmt, REAL];

The result is initially the guess

b ← LOOPHOLE[LOOPHOLE[b + a/b, LONG CARDINAL] - 40000000B, REAL];

This is really b ← (b + a/b)/2, but much faster

b ← LOOPHOLE[LOOPHOLE[b + a/b, LONG CARDINAL] - 40000000B, REAL];

This is really b ← (b + a/b)/2, but much faster

};

sqGuesses: ARRAY[0..32) OF REAL ← [

LOOPHOLE[7715741440B, REAL], -- 0.7178211

LOOPHOLE[7717240700B, REAL],

LOOPHOLE[7720514140B, REAL],

LOOPHOLE[7721745140B, REAL],

LOOPHOLE[7723155200B, REAL],

LOOPHOLE[7724346000B, REAL],

LOOPHOLE[7725517500B, REAL],

LOOPHOLE[7726654000B, REAL], -- 0.8568115

LOOPHOLE[7727773400B, REAL], -- 0.8748627

LOOPHOLE[7731077100B, REAL],

LOOPHOLE[7732167300B, REAL],

LOOPHOLE[7733245000B, REAL],

LOOPHOLE[7734310400B, REAL],

LOOPHOLE[7735342500B, REAL],

LOOPHOLE[7736363400B, REAL],

LOOPHOLE[7737373700B, REAL], -- 0.9920616

LOOPHOLE[7740370240B, REAL], -- 1.015156

LOOPHOLE[7741351440B, REAL],

LOOPHOLE[7742314540B, REAL],

LOOPHOLE[7743243000B, REAL],

LOOPHOLE[7744155200B, REAL],

LOOPHOLE[7745054400B, REAL],

LOOPHOLE[7745741300B, REAL],

LOOPHOLE[7746614600B, REAL], -- 1.211716

LOOPHOLE[7747457040B, REAL], -- 1.237247

LOOPHOLE[7750310640B, REAL],

LOOPHOLE[7751132500B, REAL],

LOOPHOLE[7751745000B, REAL],

LOOPHOLE[7752550100B, REAL],

LOOPHOLE[7753344400B, REAL],

LOOPHOLE[7754132600B, REAL],

LOOPHOLE[7754712500B, REAL] -- 1.402992

];

Root: PUBLIC PROC [index, arg: REAL] RETURNS [REAL] = {

yth root of x = e^(Ln(x)/y)

RETURN[Exp[Ln[arg]/index]];

};

Power: PUBLIC PROC [base, exponent: REAL] RETURNS [REAL] = {

x^y = e^(y*Ln(x))

IF base = 0 THEN IF exponent = 0 THEN RETURN[1] ELSE RETURN[0];

RETURN[Exp[exponent*Ln[base]]];

};

Sin: PUBLIC PROC [radians: REAL] RETURNS [sin: REAL] = {

rem: REAL ← ABS[radians];

IF rem > twoPI THEN rem ← GetWithinTwoPI[rem];

We need to have rem IN [0..twoPI]

{

Now select the appropriate formula given the octant.

SELECT Real.FixC[rem * fourOverPI] FROM

0 => {GO TO posSin};

1 => {rem ← PIovr2 - rem; GO TO posCos};

2 => {rem ← rem - PIovr2; GO TO posCos};

3 => {rem ← PI - rem; GO TO posSin};

4 => {rem ← rem - PI; GO TO negSin};

5 => {rem ← PI3ovr2 - rem; GO TO negCos};

6 => {rem ← rem - PI3ovr2; GO TO negCos};

7 => {rem ← twoPI - rem; GO TO negSin};

ENDCASE => RETURN [0.0];

EXITS

posSin => {sin ← SinFirstOctant[rem]; IF radians < 0.0 THEN GO TO neg};

negSin => {sin ← SinFirstOctant[rem]; IF radians > 0.0 THEN GO TO neg};

posCos => {sin ← CosFirstOctant[rem]; IF radians < 0.0 THEN GO TO neg};

negCos => {sin ← CosFirstOctant[rem]; IF radians > 0.0 THEN GO TO neg};

};

EXITS neg => sin ← - sin;

};

Cos: PUBLIC PROC [radians: REAL] RETURNS [cos: REAL] = {

rem: REAL ← ABS[radians];

IF rem > twoPI THEN rem ← GetWithinTwoPI[rem];

We need to have rem IN [0..twoPI]

{

Now select the appropriate formula given the octant.

SELECT Real.FixC[rem * fourOverPI] FROM

0 => {GO TO posCos};

1 => {rem ← PIovr2 - rem; GO TO posSin};

2 => {rem ← rem - PIovr2; GO TO negSin};

3 => {rem ← PI - rem; GO TO negCos};

4 => {rem ← rem - PI; GO TO negCos};

5 => {rem ← PI3ovr2 - rem; GO TO negSin};

6 => {rem ← rem - PI3ovr2; GO TO posSin};

7 => {rem ← twoPI - rem; GO TO posCos};

ENDCASE => cos ← 1.0;

EXITS

posSin => {cos ← SinFirstOctant[rem]};

negSin => {cos ← SinFirstOctant[rem]; cos ← -cos};

posCos => {cos ← CosFirstOctant[rem]};

negCos => {cos ← CosFirstOctant[rem]; cos ← -cos};

};

SinDeg: PUBLIC PROC [degrees: REAL] RETURNS [sin: REAL] = {

IF ABS[degrees] > 360 THEN degrees ← GetWithin360[degrees];

We get better results if we compute in the degrees domain for a while.

sin ← Sin[degrees*radiansPerDegree];

};

CosDeg: PUBLIC PROC [degrees: REAL] RETURNS [cos: REAL] = {

degrees ← ABS[degrees];

IF degrees > 360 THEN degrees ← GetWithin360[degrees];

We get better results if we compute in the degrees domain for a while.

cos ← Cos[degrees*radiansPerDegree];

};

Tan: PUBLIC PROC [radians: REAL] RETURNS [tan: REAL] = {

IF ABS[radians] > twoPI THEN radians ← GetWithinTwoPI[radians];

tan ← Sin[radians] / Cos[radians];

};

TanDeg: PUBLIC PROC [degrees: REAL] RETURNS [tan: REAL] = {

IF ABS[degrees] > 360 THEN degrees ← GetWithin360[degrees];

tan ← SinDeg[degrees] / CosDeg[degrees];

};

Constants for ArcTan

tanPI16: REAL = .1989123673;

x2: REAL ← 1/(.414213562);

x22: REAL ← x2*x2 + 1;

tan3PI16: REAL = .668178638;

x4: REAL = 1.0;

x44: REAL ← x4*x4 + 1;

tan5PI16: REAL = 1.496605763;

x6: REAL ← 1/(2.41421356);

x66: REAL ← x6*x6 + 1;

tan7PI16: REAL = 5.02733949;

p0: REAL = .999999998;

p1: REAL = -.333331726;

p2: REAL = .1997952738;

p3: REAL = -.134450639;

ArcTan: PUBLIC PROC [y, x: REAL] RETURNS [radians: REAL] = {

This function is good to 8.7 decimal places and is taken from:

Computer Approximations, John F. Hart et.al. pp129(top 2nd col).

s, v, t, t2, c, q: REAL;

IF ABS[x] <= ABS[smallnumber] THEN

IF ABS[y] <= ABS[smallnumber]

THEN RETURN[0]

ELSE IF y < 0 THEN RETURN[-PIovr2] ELSE RETURN[PIovr2];

IF x < 0

THEN IF y < 0 THEN {q ← -PI; s ← 1} ELSE {q ← PI; s ← -1}

ELSE IF y < 0 THEN {q ← 0; s ← -1} ELSE {q ← 0; s ← 1};

v ← ABS[y/x];

SELECT v FROM

< tanPI16 => {t ← v; c ← 0};

< tan3PI16 => {t ← x2 - x22/(x2 + v); c ← PIovr8};

< tan5PI16 => {t ← x4 - x44/(x4 + v); c ← PIovr4};

< tan7PI16 => {t ← x6 - x66/(x6 + v); c ← PI3ovr8};

ENDCASE => {t ← -1/v; c ← PIovr2};

t2 ← t*t;

radians ← s*(t*(p0 + t2*(p1 + t2*(p2 + t2*p3))) + c) + q;

};

ArcTanDeg: PUBLIC PROC [y, x: REAL] RETURNS [degrees: REAL] = {

degrees ← ArcTan[y, x]*degreesPerRadian;

};

AlmostZero: PUBLIC PROC [x: REAL, distance: INTEGER [-126..127]] RETURNS [BOOL] = {

RETURN[LOOPHOLE[x, Ieee.SingleReal].exp < (distance + Ieee.ExponentBias)];

};

AlmostEqual: PUBLIC PROC [y, x: REAL, distance: INTEGER [-126..0]] RETURNS [BOOL] = {

fl: Ieee.SingleReal ← LOOPHOLE[(x - y)];

oldexp: INTEGER ← MAX[

LOOPHOLE[x, Ieee.SingleReal].exp, LOOPHOLE[y, Ieee.SingleReal].exp];

RETURN[fl.exp < (distance + oldexp)];

};

Helpful internal functions

Coefficients for the taylor series for sin and cos.

fact9recip: REAL ← 1.0/(INT[1]*2*3*4*5*6*7*8*9);

fact8recip: REAL ← 1.0/(INT[1]*2*3*4*5*6*7*8);

fact7recip: REAL ← 1.0/(INT[1]*2*3*4*5*6*7);

fact6recip: REAL ← 1.0/(INT[1]*2*3*4*5*6);

fact5recip: REAL ← 1.0/(INT[1]*2*3*4*5);

fact4recip: REAL ← 1.0/(INT[1]*2*3*4);

fact3recip: REAL ← 1.0/(INT[1]*2*3);

fact2recip: REAL ← 1.0/(INT[1]*2);

minEps: REAL ← 1.0 - LOOPHOLE[LOOPHOLE[1.0, LONG CARDINAL]-1, REAL];

roughly 5.96e-8, the minimum relative error between two different reals

sinCosEps: REAL ← SqRt[minEps];

RRA: IF radians <= sinEps THEN SinFirstOctant[radians] = radians AND CosFirstOctant[radians] = 1.0 to the precision of our number system. The contribution of the smaller term of two consecutive terms in the Sin or Cos polynomials is down by the square from the larger (and a factor of at least two). So sinCosEps roughly equals 1.726e-4. Testing for this case improves accuracy at small values.

SinFirstOctant: PROC [radians: REAL] RETURNS [REAL] = {

This version of Sin is only guaranteed to be accurate within the first octant: [0.0..PI/4], which is approximately [0.0..0.785+]. When radians = PI/4 we have the maximum contribution of the term after the fact9recip term, for a truncation error of roughly 1.76e-9, which is within minEps*Sin[radians] (= 4.2e-8).

IF radians <= sinCosEps THEN RETURN [radians] ELSE {

r2: REAL ~ radians*radians;

RETURN [(((fact9recip*r2-fact7recip)*r2+fact5recip)*r2-fact3recip)*r2*radians+radians];

};

CosFirstOctant: PROC [radians: REAL] RETURNS [REAL] = {

This version of Cos is only guaranteed to be accurate within the first octant: [0.0..PI/4]. When radians = PI/4 we have the maximum contribution of the terms after the fact8recip term, for a truncation error of roughly 2.46e-8, which is within minEps*Cos[PI/4] (= 4.2e-8).

IF radians <= sinCosEps THEN RETURN [1.0] ELSE {

r2: REAL ~ radians*radians;

RETURN [((((fact8recip*r2-fact6recip)*r2+fact4recip)*r2-fact2recip)*r2+1)];

};

Two constants used to add N*PI with more than normal precision.

PIc1: REAL ← 6.28125; -- 402.0/64 (exact representation near twoPI)

PIc2: REAL = 1.9353072e-3; -- 2*PI - c1

GetWithinTwoPI: PROC [radians: REAL] RETURNS [REAL] = {

Let the returned value be R. Then

radians <= 0 ƒ radians H xn*twoPI + R ' -twoPI <= R <= 0

radians >= 0 ƒ radians H xn*twoPI + R ' 0 <= R <= twoPI

xn: REAL ← Real.Fix[radians * rec2pi];

RETURN [(radians - xn*PIc1) - xn*PIc2];

};

GetWithin360: PROC [degrees: REAL] RETURNS [REAL] = {

Let the returned value be R. Then

degrees <= 0 ƒ degrees H xn*360 + R ' -360 <= R <= 0

degrees >= 0 ƒ degrees H xn*360 + R ' 0 <= R <= 360

xn: REAL ← Real.Fix[degrees * rec360];

RETURN [degrees - xn*360];

};

END.

September 16, 1980 3:13 PM, Stewart; Bug in Exp

September 28, 1980 8:28 PM, Stewart; Add AlmostEqual and AlmostZero, format

November 7, 1980 3:33 PM, Stewart; fix AlmostZero

August 27, 1982 1:07 pm, Stewart; CEDAR

October 24, 1984 4:37:17 pm PDT, Plass; New impls for Sin, Cos, SinDeg, CosDeg

November 2, 1984, Atkinson; Faster & more accurate SqRt, Sin, Cos, SinDeg, CosDeg