[_CDCSL_93-16_]<1>Cedar>release>Rope>RopeImpl.mesa

RopeImpl.mesa, "Thick" string implementation

Willie-sue, March 3, 1992 10:18 pm PST

Russ Atkinson (RRA) October 13, 1989 3:16:42 pm PDT

Christian Jacobi, October 5, 1992 5:57 pm PDT

Mike Spreitzer, June 5, 1990 1:32:27 pm PDT

Carl Hauser, July 10, 1990 9:32 am PDT

Doug Wyatt, November 22, 1991 5:06 pm PST

Michael Plass, September 13, 1993 8:55 am PDT

This implementation supports "lazy evaluation" for Substr, Concat, and Replace operations. It also allows the user to create arbitrary implementations for compatible Rope objects by supplying procedures for the defining operations.

DIRECTORY

Ascii,

Basics,

RefText,

Rope,

RopePrivate,

SafeStorage;

RopeImpl: CEDAR PROGRAM

IMPORTS Ascii, Basics, RefText, RopePrivate, SafeStorage

EXPORTS Rope

SHARES Rope

= BEGIN OPEN Rope, RopePrivate;

For those who care, this is the official explanation of the XRopeRep variants:

Note: NIL is allowed as a valid ROPE.

Note: ALL integer components of the representation must be non-negative.

SELECT x: x FROM

eightbit => {

SELECT x: x FROM

text => {

[0..x.length) is the range of char indexes

[0..x.max) is the number of chars of storage reserved

all Rope operations creating new text objects init x.length = x.max

x.length <= x.max is required

the bit pattern of the text IS IDENTICAL to TEXT and StringBody!!!!

};

node => {

x.length is ignored

x.depth is the depth of the structure (for triggering Balance)

SELECT x:x FROM

substr => {

[0..x.size) is the range of char indexes

x.base contains chars indexed by [0..x.size)

[0..x.size) in x ==> [x.start..x.start+x.size) in x.base

Size[x.base] >= x.start + x.size

};

concat => {

[0..x.size) is the range of char indexes

x.base contains chars indexed by [0..x.pos)

[0..x.pos) in x ==> [0..x.pos) in x.base

x.rest contains the chars indexed by [x.pos..x.size)

[x.pos..x.size) in x ==> [0..x.size-x.pos) in x.base

x.pos = Size[x.base] AND x.size = x.pos + Size[x.rest]

};

replace => {

[0..x.size) is the range of char indexes

x.base contains chars indexed by [0..x.start), [x.newPos..x.size)

[0..x.start) in x ==> [0..x.start) in x.base

[x.newPos..x.size) in x ==> [x.oldPos..Size[x.base]) in x.base

x.rest contains the chars indexed by [x.start..x.newPos)

[x.start..x.newPos) in x => [0..x.newPos-x.start) in x.base

x.size >= x.newPos >= x.start AND x.oldPos >= x.start

x.size - x.newPos = Size[x.base] - x.oldPos

};

object => {

[0..x.size) is the range of char indexes

x.base is the data needed by the user-supplied operations

x.fetch[x.base, i] should fetch the ith char AND x.fetch # NIL

x.map[x.base, st, len, action] implements Map[x, st, len, action]

x.move[block, x.base, st] implements MoveChars[block, x, st]

it is OK to have x.map = NIL OR x.move = NIL

};

ENDCASE => ERROR NoRope

}

ENDCASE => ERROR NoRope

};

extended => {

[0..x.size) is the range of char indexes

ext is for use by the extended-char-set superclass of ROPE.

}

ENDCASE => ERROR NoRope

charsPerWord: NAT = BYTES[WORD]/BYTES[CHAR];

RawCharsPtr: TYPE = POINTER TO Basics.RawChars;

WordPtr: TYPE = POINTER TO WORD;

errors peculiar to Rope

NoRope: PUBLIC ERROR = CODE;

emptyRope: Text = NEW[TextRep[0]];

Watch out for startup problems with literals. Don't use "" here!

The following variables are here so Cirio could find ROPE and Text in the target world.

aXRope: XROPE ¬ NIL;

aRope: ROPE ¬ NIL;

aText: Rope.Text ¬ NIL;

untracedZone: ZONE ¬ NIL; -- initialized in RopeImpl's program body

NewText: PUBLIC PROC [size: TextBound] RETURNS [text: Text] = {

procedure to allocate new Text objects

IF size = 0 THEN RETURN [emptyRope];

text ¬ untracedZone.NEW[TextRep[size]];

text.length ¬ size;

};

Substr: PUBLIC PROC [base: ROPE, start: INT ¬ 0, len: INT ¬ MaxLen]
RETURNS [new: ROPE] = TRUSTED {

... returns a subrope of the base. BoundsFault occurs if start < 0 or start > Length[base].

size: INT ¬ InlineSize[base];

rem: INT ¬ NonNeg[size - NonNeg[start]];

depth: INTEGER ¬ 1;

IF len <= 0 THEN RETURN [emptyRope] ELSE IF len > rem THEN len ¬ rem;

IF start = 0 AND len = rem THEN RETURN [base];

IF len <= FlatMax THEN RETURN [Flatten[base, start, len]];

At this point the resulting rope is large enough to need a separate node. The idea is to dive down through as many levels of rope objects until we get to the deepest such object that fully contains the specified rope.

(note: change base last, since it is aliased with x!)

WITH x: base SELECT FROM

text => EXIT;

node =>

WITH x: x SELECT FROM

substr => {

start ¬ start + x.start;

base ¬ x.base;

};

concat => {

rem: INT ¬ x.pos - start;

IF rem > 0

THEN {IF len > rem THEN {depth ¬ x.depth; EXIT}; base ¬ x.base}

ELSE {start ¬ -rem; base ¬ x.rest};

};

replace => {

len1: INT ¬ x.start - start;

IF len1 > 0

THEN {

substr starts in 1st section

IF len > len1 THEN {depth ¬ x.depth; EXIT};

entirely in first section, so go deeper

base ¬ x.base;

}

ELSE {

substr starts in middle or last sections

xnew: INT ¬ x.newPos;

len2: INT ¬ xnew - start;

IF len2 > 0

THEN {

substr starts in middle section

IF len > len2 THEN {depth ¬ x.depth; EXIT};

entirely in middle section

start ¬ -len1;

base ¬ x.replace;

}

ELSE {

entirely in last section

start ¬ x.oldPos - len2;

base ¬ x.base;

};

object => EXIT;

no sub-structure

ENDCASE => ERROR NoRope;

IF start # 0 THEN LOOP;

IF len = InlineSize[base] THEN RETURN [base];

ENDLOOP;

[] ¬ NonNeg[start];

[] ¬ NonNeg[len];

new ¬ NEW[Tsubstr ¬

[eightbit[0, node[depth: depth + 1, cases: substr[size: len, base: base, start: start]]]]];

IF depth > MaxDepth THEN new ¬ NewBalance[new];

};

Cat: PUBLIC PROC [r1, r2, r3, r4, r5: ROPE] RETURNS [ROPE] = TRUSTED {

Return the concatenation of the given 5 ropes.

RETURN [Concat[Concat[r1, r2], Concat[Concat[r3, r4], r5]]];

};

Concat: PUBLIC PROC [base, rest: ROPE] RETURNS [ROPE] = {

Return the concatenation of the two ropes. If the result is small enough, then it will be flat. Otherwise we need to create a new node.

IF rest = NIL THEN RETURN [base];

RETURN[SafeConcat[base, rest, TRUE]];

};

SafeConcat: PROC [base, rest: ROPE, checkBalance: BOOL ¬ FALSE] RETURNS [new: ROPE] ~ TRUSTED {

This private version does not trigger a rebalancing unless asked; it is used inside of NewBalance to ensure that an infinite recursion does not happen.

baseStr, restStr: Text;

baseLen, restLen, size: INT;

depth: INTEGER ¬ 1;

IF rest = NIL THEN RETURN [base];

[baseLen, baseStr] ¬ SingleSize[base];

IF baseLen = 0 THEN RETURN [rest];

[restLen, restStr] ¬ SingleSize[rest];

IF restLen = 0 THEN RETURN [base];

size ¬ CheckLongAdd[baseLen,restLen];

IF size <= FlatMax THEN {

The result is small enough to make it flat.

str: Text ¬ NewText[QShort[size]];

index: CARDINAL ¬ 0;

AddChar: PROC [c: CHAR] RETURNS [BOOL] = TRUSTED {

QStore[c, str, index];

index ¬ index + 1;

RETURN [FALSE];

};

IF baseStr = NIL

THEN [] ¬ Map[base, 0, baseLen, AddChar]

ELSE FOR i: TextBound IN [0..QShort[baseLen]) DO

QStore[QFetch[baseStr, i], str, index];

index ¬ index + 1;

ENDLOOP;

IF restStr = NIL

THEN [] ¬ Map[rest, 0, restLen, AddChar]

ELSE FOR i: TextBound IN [0..QShort[restLen]) DO

QStore[QFetch[restStr, i], str, index];

index ¬ index + 1;

ENDLOOP;

RETURN [str];

};

SELECT TRUE FROM

restLen < FlatMax => {

Possibly can reduce depth by combining with a concat node.

WITH x: base SELECT FROM

node => WITH x: x SELECT FROM

concat =>

IF x.size-x.pos < FlatMax/2 THEN {

baseLen ¬ x.pos;

rest ¬ SafeConcat[x.rest, rest];

base ¬ x.base;

};

ENDCASE;

};

baseLen < FlatMax => {

Possibly can reduce depth by combining with a concat node.

WITH x: base SELECT FROM

node => WITH x: x SELECT FROM

concat =>

IF x.pos < FlatMax/2 THEN {

rest ¬ x.rest;

baseLen ¬ x.pos+baseLen;

base ¬ SafeConcat[base, x.base];

};

ENDCASE;

};

ENDCASE;

[] ¬ NonNeg[size-baseLen];

depth ¬ MAX[InlineDepth[base], InlineDepth[rest]] + 1;

new ¬ NEW[Tconcat ¬

[eightbit[0, node[depth: depth,
cases: concat[size: size, base: base, rest: rest, pos: baseLen]]]]];

IF checkBalance AND depth > MaxDepth THEN new ¬ NewBalance[new];

};

Replace: PUBLIC PROC [base: ROPE, start: INT ¬ 0, len: INT ¬ MaxLen, with: ROPE ¬ NIL] RETURNS [new: ROPE] = TRUSTED {

baseSize: INT ¬ InlineSize[base];

repSize: INT ¬ InlineSize[with];

rem: INT ¬ NonNeg[baseSize - NonNeg[start]];

depth: INTEGER ¬ 0;

oldPos: INT ¬ start +

(IF len < 0 THEN len ¬ 0 ELSE IF len > rem THEN len ¬ rem ELSE len);

newPos: INT ¬ CheckLongAdd[start,repSize];

size: INT ¬ CheckLongAdd[baseSize-len, repSize];

IF size = repSize THEN RETURN [with];

IF len = 0 AND repSize = 0 THEN RETURN [base]; -- identity check

IF size <= FlatMax THEN {

result is small enough to be flat

str: Text ¬ NewText[QShort[size]];

index: TextBound ¬ 0;

AddChar: PROC [c: CHAR] RETURNS [BOOL] = TRUSTED {

QStore[c, str, index];

index ¬ index + 1;

RETURN [FALSE];

};

IF start > 0 THEN [] ¬ Map[base, 0, start, AddChar];

IF repSize > 0 THEN [] ¬ Map[with, 0, repSize, AddChar];

IF oldPos < baseSize THEN [] ¬ Map[base, oldPos, baseSize, AddChar];

RETURN [str];

};

We need to make a new node. First, test for combining the replacement rope with a previous replacement rope, so that successive replacements at the same spot will not make ropes too deep.

(note: change base last, since it is aliased with x!)

WITH x: base SELECT FROM

node =>

WITH x: x SELECT FROM

replace => {

xnewPos: INT ¬ x.newPos;

xstart: INT ¬ x.start;

SELECT TRUE FROM

start <= xstart AND oldPos >= xnewPos => {

replacing the replacement string

oldPos ¬ x.oldPos + (oldPos - xnewPos);

base ¬ x.base;

};

start = xnewPos => {

adding to old replace string

IF repSize + (xnewPos - xstart) <= FlatMax THEN {

with ¬ Concat[x.replace, with];

start ¬ xstart;

oldPos ¬ x.oldPos + len;

base ¬ x.base;

};

ENDCASE;

};

ENDCASE;

[] ¬ NonNeg[NonNeg[newPos] - NonNeg[start]];

[] ¬ NonNeg[NonNeg[oldPos] - start];

[] ¬ NonNeg[NonNeg[size] - newPos];

depth ¬ MAX[InlineDepth[base], InlineDepth[with]] + 1;

new ¬ NEW[Treplace ¬

[eightbit[0, node[depth: depth, cases:
replace[size: size, base: base, replace: with, newPos: newPos, oldPos: oldPos, start: start]]]]];

IF depth > MaxDepth THEN new ¬ NewBalance[new];

};

InlineFetch: PROC [base: ROPE, index: INT] RETURNS [CHAR] = INLINE {

RETURN [Fetch[base, index]]

};

Fetch: PUBLIC PROC [base: ROPE, index: INT ¬ 0] RETURNS [CHAR] = TRUSTED {

... fetches indexed character from given ropes. BoundsFault occurs if index < 0 or index is >= Length[base].

This first discrimination does the bounds checking and the quick kill for a flat rope.

WITH base SELECT FROM

text: Text => RETURN [text[index]];

node: REF RopeRep.node => {

[] ¬ Basics.BoundsCheckInt[index, NodeSize[node]];

};

ENDCASE => Basics.RaiseBoundsFault[]; -- presumably NIL

Now we really don't need bounds checking, since checking the top level is sufficient. What we really need to do now is dive down to the right place as quickly as possible.

(note: change base last, since it is aliased with x!)

WITH base SELECT FROM

x: REF TextRep => {RETURN[QFetch[x, index]]};

x: REF RopeRep.node =>

WITH x: x SELECT FROM

substr => {index ¬ index + x.start; base ¬ x.base};

concat => {

IF index < x.pos THEN {base ¬ x.base; LOOP};

index ¬ index - x.pos;

base ¬ x.rest;

};

replace => {

IF index < x.start THEN {base ¬ x.base; LOOP};

IF index < x.newPos THEN {

index ¬ index - x.start;

base ¬ x.replace;

LOOP;

};

index ¬ index - x.newPos + x.oldPos;

base ¬ x.base;

};

object => RETURN [x.fetch[x.base, index]];

ENDCASE => ERROR NoRope;

ENDLOOP;

};

Map: PUBLIC PROC [base: ROPE, start: INT ¬ 0, len: INT ¬ MaxLen, action: ActionType] RETURNS [BOOL] = TRUSTED {

... applies the action to the given range of characters in the rope. Returns TRUE when some action returns TRUE. BoundsFault occurs when start < 0 or start > Length[base].

rem: INT ¬ NonNeg[InlineSize[base] - NonNeg[start]];

IF len > rem THEN len ¬ rem;

WHILE len > 0 DO

WITH x: base SELECT FROM

text => {

st: TextBound ¬ QShort[start];

FOR i: TextBound IN [st..st+QShort[len]) DO

IF action[QFetch[@x,i]] THEN RETURN [TRUE];

ENDLOOP;

RETURN [FALSE];

};

node =>

(note: change base last, since it is aliased with x!)

WITH x: x SELECT FROM

substr => {start ¬ start + x.start; base ¬ x.base; LOOP};

concat => {

xpos: INT ¬ x.pos;

IF start+len <= xpos THEN {base ¬ x.base; LOOP};

IF start < xpos THEN {

subLen: INT ¬ xpos-start;

IF Map[x.base, start, subLen, action] THEN RETURN [TRUE];

start ¬ xpos; len ¬ len - subLen;

};

start ¬ start - xpos;

base ¬ x.rest;

};

replace => {

xstart: INT ¬ x.start;

xnew: INT ¬ x.newPos;

IF start < xstart THEN {

subLen: INT ¬ xstart-start;

IF subLen >= len THEN {base ¬ x.base; LOOP};

IF Map[x.base, start, subLen, action] THEN RETURN [TRUE];

start ¬ xstart;

len ¬ len - subLen;

};

IF start < xnew THEN {

subLen: INT ¬ xnew-start;

st: INT ¬ start - xstart;

IF subLen >= len THEN {start ¬ st; base ¬ x.replace; LOOP};

IF Map[x.replace, st, subLen, action] THEN RETURN [TRUE];

start ¬ xnew;

len ¬ len - subLen;

};

start ¬ start - xnew + x.oldPos;

base ¬ x.base};

object => {

map: MapType ¬ x.map;

data: REF ¬ x.base;

IF map # NIL

THEN RETURN[map[data, start, len, action]]

ELSE {

fetch: FetchType ¬ x.fetch;

FOR i: INT IN [start..start+len) DO

IF action[fetch[data, i]] THEN RETURN [TRUE];

ENDLOOP;

RETURN [FALSE];

};

ENDCASE => ERROR NoRope;

ENDLOOP;

RETURN [FALSE];

};

Translate: PUBLIC PROC [base: ROPE, start: INT ¬ 0, len: INT ¬ MaxLen, translator: TranslatorType ¬ NIL] RETURNS [new: ROPE] = TRUSTED {

applies the translation to get a new rope

if the resulting size > 0, then new does not share with the original rope!

if translator = NIL, the identity translation is performed

index: INT ¬ start;

intRem: INT ¬ NonNeg[InlineSize[base] - NonNeg[start]];

rem: INT ¬ intRem;

text: Text ¬ NIL;

IF len <= 0 OR rem = 0 THEN RETURN [emptyRope];

IF len < rem THEN rem ¬ len;

WITH base SELECT FROM

t: Text => {

short: TextBound ¬ index;

text ¬ NewText[rem];

FOR i: TextBound IN [0..rem) DO

c: CHAR ¬ QFetch[t, short];

IF translator # NIL THEN c ¬ translator[c];

text[i] ¬ c;

short ¬ short + 1;

ENDLOOP;

new ¬ text;

};

ENDCASE => {

each: PROC RETURNS [CHAR] = TRUSTED {

c: CHAR ¬ InlineFetch[base, index];

index ¬ index + 1;

IF translator # NIL THEN c ¬ translator[c];

RETURN [c];

};

RETURN [FromProc[rem, each]];

};

Lower: PUBLIC TranslatorType ~ { RETURN [Ascii.Lower[old]] };

Upper: PUBLIC TranslatorType ~ { RETURN [Ascii.Upper[old]] };

Flatten: PUBLIC PROC [base: ROPE, start: INT ¬ 0, len: INT ¬ MaxLen]
RETURNS [Text] = TRUSTED {

size: INT ¬ InlineSize[base];

rem: INT ¬ NonNeg[size - NonNeg[start]];

IF len > rem THEN len ¬ rem;

IF start = 0 AND len = rem THEN {

IF base = NIL THEN RETURN [NIL];

IF base.tag = text THEN RETURN [LOOPHOLE[base]];

};

IF len <= 0

THEN RETURN [emptyRope]

ELSE {

rtn: Text = NewText[Short[len]];

rtn.length ¬ 0;

[] ¬ AppendChars[LOOPHOLE[rtn], base, start, len];

RETURN [rtn];

};

MakeRope: PUBLIC PROC
[base: REF, size: INT, fetch: FetchType, map: MapType, move: MoveType]
RETURNS [ROPE] = TRUSTED {

no optimization for user-supplied strings

IF size = 0 THEN RETURN [emptyRope];

RETURN [NEW[Tobject ¬

[eightbit[0, node[depth: 1, cases: object[size: size, base: base, fetch: fetch, map: map, move: move]]]]]];

};

FromProc: PUBLIC PROC
[len: INT, p: PROC RETURNS [CHAR], maxPiece: INT ¬ MaxLen]
RETURNS [ROPE] = TRUSTED {

IF len <= 0 THEN RETURN [emptyRope];

SELECT maxPiece FROM

< FlatMax => maxPiece ¬ FlatMax;

> TextBound.LAST => maxPiece ¬ TextBound.LAST;

ENDCASE;

IF len <= maxPiece

THEN {

rtn: Text ~ NewText[QShort[len]];

FOR i: TextBound IN [0..QShort[len]) DO rtn[i] ¬ p[]; ENDLOOP;

RETURN [rtn];

}

ELSE {

Force proper evaluation order, since the compiler might get it backwards if left to its own devices.

left: ROPE ~ FromProc[len/2, p, maxPiece];

right: ROPE ~ FromProc[(len+1)/2, p, maxPiece];

RETURN [Concat[left, right]];

};

FromChars: PUBLIC PROC [genChars: PROC [PROC [CHAR]]] RETURNS [ROPE] ~ {

genRopesFromChars: PROC [putRope: PROC [ROPE]] ~ {

buf: REF TEXT ~ RefText.ObtainScratch[512];

putChar: PROC [c: CHAR] ~ {

text: REF TEXT ~ buf;

len: NAT ¬ text.length;

IF NOT len<text.maxLength THEN { putRope[FromRefText[text]]; len ¬ 0; };

text[len] ¬ c;

text.length ¬ len+1;

};

genChars[putChar];

IF buf.length>0 THEN putRope[FromRefText[buf]];

RefText.ReleaseScratch[buf];

};

RETURN[FromRopes[genRopesFromChars]];

};

FromRopes: PUBLIC PROC [genRopes: PROC [PROC [ROPE]]] RETURNS [result: ROPE ¬ NIL] ~ {

stackSize: NAT ~ 10;

s: ARRAY [0..stackSize) OF ROPE ¬ ALL[NIL];

putRope: PROC [rope: ROPE] ~ {

FOR i: NAT DECREASING IN [0..stackSize) DO

IF s[i] = NIL THEN { s[i] ¬ rope; RETURN }

ELSE { rope ¬ Concat[s[i], rope]; s[i] ¬ NIL };

ENDLOOP;

s[0] ¬ rope;

};

genRopes[putRope];

FOR i: NAT DECREASING IN [0..stackSize) DO

IF s[i] # NIL THEN result ¬ Concat[s[i], result];

ENDLOOP;

};

fromChars: ARRAY CHAR OF Text ¬ ALL[NIL];

FromChar: PUBLIC PROC [c: CHAR] RETURNS [Text] = TRUSTED {

rtn: Text ¬ fromChars[c];

IF rtn=NIL THEN {

rtn ¬ NewText[1];

rtn[0] ¬ c;

fromChars[c] ¬ rtn; --ok not to monitor

};

RETURN [rtn];

};

MakeConstantRope: PUBLIC PROC [char: CHAR, length: INT] RETURNS [ROPE] = {

RETURN MakeRope[NEW [Constant ¬ [char, length]], length, ConstFetch, ConstMap, ConstMove]};

Constant: TYPE = RECORD [char: CHAR, length: INT];

ConstFetch: PROC [data: REF, index: INT] RETURNS [CHAR] = {

c: REF Constant = NARROW[data];

RETURN [c.char]};

ConstMap: PROC [base: REF, start, len: INT, action: ActionType] RETURNS [quit: BOOL ¬ FALSE] = {

c: REF Constant = NARROW[base];

THROUGH [1 .. MIN[c.length-start, len]] DO

IF action[c.char] THEN RETURN [TRUE];

ENDLOOP;

RETURN [FALSE]};

ConstMove: UNSAFE PROC [block: Basics.UnsafeBlock, data: REF, start: INT] RETURNS [charsMoved: INT ¬ 0] = UNCHECKED {

c: REF Constant = NARROW[data];

value: BYTE = ORD[c.char];

n: INT = MIN[c.length-start, block.count];

WHILE charsMoved<n DO

Basics.FillBytes[dstBase: block.base, dstStart: block.startIndex+charsMoved, count: count, value: value];

charsMoved ¬ charsMoved+count;

ENDLOOP;

};

FromRefText: PUBLIC PROC
[s: REF READONLY TEXT, start: TextBound ¬ 0, len: TextBound ¬ TextBound.LAST]
RETURNS [rtn: Text ¬ NIL] = TRUSTED {

IF s # NIL THEN TRUSTED {

rem: TextBound ~ s.length-start;

IF rem<len THEN len ¬ rem;

IF len = 0 THEN RETURN [emptyRope];

rtn ¬ NewText[len];

Basics.CopyBytes[

dstBase: RefText.BaseFromTextRope[rtn], dstStart: 0,

srcBase: RefText.BaseFromText[s], srcStart: start,

};

ToRefText: PUBLIC PROC [base: ROPE] RETURNS [REF TEXT] = TRUSTED {

len: TextBound ¬ Short[InlineSize[base]];

rtn: REF TEXT ¬ untracedZone.NEW[TEXT[len]];

IF len # 0 THEN {

WITH base SELECT FROM

txt: Text => TRUSTED {

Basics.CopyBytes[

dstBase: RefText.BaseFromText[rtn], dstStart: 0,

srcBase: RefText.BaseFromTextRope[txt], srcStart: 0,

rtn.length ¬ len;

};

ENDCASE =>

Sigh, do it the hard way.

[] ¬ AppendChars[rtn, base, 0, len];

};

RETURN [rtn];

};

DoMoveChars: UNSAFE PROC
[pointer: POINTER TO Basics.RawBytes, index: INT, rope: ROPE, start: INT, len: INT] ~ {

We assume start IN[0..size] AND len IN[0..size-start], where size=Size[rope]

The effect is ... FOR i: INT IN[0..len) DO pointer[index+i] ← rope.Fetch[start+i] ENDLOOP

WHILE len # 0 DO

base: ROPE; bStart, bLen: INT;

[base, bStart, bLen] ¬ ContainingPiece[rope, start];

grab the smallest piece of the rope containing the given start index

IF bLen > len THEN bLen ¬ len;

this piece is longer than we need to finish the transfer

IF bLen = 0 THEN ERROR NoRope;

this should not happen! it means that some calculation or invariant is bad!

WITH base SELECT FROM

txt: Text => TRUSTED {

We are moving chars from a flat rope.

Basics.CopyBytes[

dstBase: pointer, dstStart: index,

srcBase: RefText.BaseFromTextRope[txt], srcStart: bStart,

};

n: REF RopeRep.node => {

WITH n SELECT FROM

obj: REF RopeRep.node.object =>

We are moving chars from a user-defined object.

SELECT TRUE FROM

obj.move # NIL => TRUSTED {

The user has supplied a fast move routine, so use it.

moved: INT ¬ obj.move[[pointer, index, bLen], obj.base, bStart];

IF moved # bLen THEN ERROR NoRope;

We asked for a perfectly legitimate # of chars, and the user's move routine blew it! This rope is not reliable.

};

obj.map # NIL => {

The user has supplied a fast map routine, so use it.

toPointer: RawCharsPtr ¬ LOOPHOLE[pointer];

toIndex: INT ¬ index;

moved: INT ¬ 0;

action: ActionType = TRUSTED {

IF toIndex > TextBound.LAST THEN {

toPointer ¬ toPointer+(toIndex/charsPerWord)*UNITS[WORD];

toIndex ¬ toIndex MOD charsPerWord;

};

toPointer[toIndex] ¬ c;

toIndex ¬ toIndex + 1;

moved ¬ moved + 1;

};

[] ¬ obj.map[obj.base, bStart, bLen, action];

IF moved # bLen THEN ERROR NoRope;

We asked for a perfectly legitimate # of chars, and the user's map routine blew it! This rope is not reliable.

};

ENDCASE => TRUSTED {

Sigh. We have to do this one ourselves.

toPointer: RawCharsPtr ¬ LOOPHOLE[pointer];

toIndex: INT ¬ index;

fetch: FetchType ¬ obj.fetch;

data: REF ¬ obj.base;

FOR i: INT IN [0..bLen) DO

IF toIndex > TextBound.LAST THEN {

toPointer ¬ toPointer+(toIndex/charsPerWord)*UNITS[WORD];

toIndex ¬ toIndex MOD charsPerWord;

};

toPointer[toIndex] ¬ fetch[data, bStart+i];

toIndex ¬ toIndex + 1;

ENDLOOP;

};

ENDCASE => ERROR NoRope; -- this should not happen!

};

ENDCASE => ERROR NoRope; -- this should not happen!

len ¬ len - bLen;

update the # of chars left in the transfer

start ¬ start + bLen;

start is now the starting index for the next piece

index ¬ index + bLen;

index is now the destination index for the next piece

ENDLOOP;

};

AppendChars: PUBLIC PROC
[buffer: REF TEXT, rope: ROPE, start: INT ¬ 0, len: INT ¬ LAST[INT]]
RETURNS [charsMoved: TextBound ¬ 0] = {

... appends characters to the end of a REF TEXT buffer, starting at start within the rope. The move stops if there are no more characters from the rope OR len characters have been moved OR the buffer is full (buffer.length = buffer.maxLength). charsMoved is always the # of characters appended. NOTE: the user is responsible for protecting buffer from concurrent modifications.

rem: INT ¬ NonNeg[InlineSize[rope]-NonNeg[start]];

# of characters in rope after start

IF rem > len THEN rem ¬ len;

The user may have specified a shorter run of characters

IF rem > 0 AND buffer # NIL THEN TRUSTED {

bufPos: TextBound ¬ buffer.length;

position of next place to append character (cache for buffer.length)

bufRem: TextBound ¬ buffer.maxLength - bufPos;

# of chars remaining in the transfer

IF bufRem > rem THEN bufRem ¬ QShort[rem];

the caller may have specified a shorter amount

DoMoveChars[
pointer: RefText.BaseFromText[buffer],
index: bufPos, rope: rope, start: start, len: bufRem];

buffer.length ¬ bufPos+bufRem;

RETURN [bufRem];

};

UnsafeMoveChars: PUBLIC UNSAFE PROC [block: UnsafeBlock, rope: ROPE, start: INT ¬ 0]
RETURNS [charsMoved: INT ¬ 0] ~ {

... moves characters to an UnsafeBlock, starting at start within the rope. The move stops if there are no more characters from the rope OR block.count characters have been moved; charsMoved is always the # of characters moved. This UNSAFE operation is mainly for the implementation of IO.RIS. Most clients should use AppendChars.

rem: INT ¬ NonNeg[InlineSize[rope]-NonNeg[start]];

# of characters in rope after start

IF rem > block.count THEN rem ¬ block.count;

The user may have specified a shorter run of characters

IF rem > 0 THEN TRUSTED {

DoMoveChars[
pointer: block.base,
index: NonNeg[block.startIndex], rope: rope, start: start, len: rem];

RETURN [rem];

};

Equal: PUBLIC PROC [s1, s2: ROPE ¬ NIL, case: BOOL ¬ TRUE] RETURNS [BOOL] = TRUSTED {

contents equality of s1 and s2

len1,len2: INT;

str1, str2: Text;

[len1, str1] ¬ SingleSize[s1];

[len2, str2] ¬ SingleSize[s2];

SELECT TRUE FROM

len1 # len2 => RETURN [FALSE];

s1 = s2 OR len1 = 0 => RETURN [TRUE];

case AND str1 # NIL AND str2 # NIL => {

relatively cheap test for equality

len: TextBound ¬ QShort[len1];

wp1: WordPtr ¬ LOOPHOLE[str1, WordPtr] + SIZE[TEXT[0]];

wp2: WordPtr ¬ LOOPHOLE[str2, WordPtr] + SIZE[TEXT[0]];

WHILE len >= charsPerWord DO

Run through the words first

IF wp1 # wp2 THEN RETURN [FALSE];

wp1 ¬ wp1 + UNITS[WORD];

wp2 ¬ wp2 + UNITS[WORD];

len ¬ len - charsPerWord;

ENDLOOP;

FOR i: TextBound IN [0..len MOD charsPerWord) DO

Run through the remaining characters

IF LOOPHOLE[wp1, RawCharsPtr][i] # LOOPHOLE[wp2, RawCharsPtr][i] THEN

RETURN [FALSE];

ENDLOOP;

RETURN [TRUE];

};

ENDCASE =>

RETURN [Compare[s1,s2,case] = equal];

};

Compare: PUBLIC PROC [s1, s2: ROPE ¬ NIL, case: BOOL ¬ TRUE]
RETURNS [Comparison] = TRUSTED {

contents comparison of s1 and s2 (less(0), equal(1), greater(2))

if NOT case, then upper case chars = lower case chars

contents equality of s1 and s2

len1,len2: INT;

str1, str2: Text;

[len1, str1] ¬ SingleSize[s1];

[len2, str2] ¬ SingleSize[s2];

IF str1 # NIL AND str2 # NIL THEN {

The two ropes are both flat, so we don't have to discriminate for every character.

sz1: TextBound ¬ QShort[len1];

sz2: TextBound ¬ QShort[len2];

sz: TextBound ¬ MIN[sz1, sz2];

IF case

THEN

FOR i: TextBound IN [0..sz) DO

c1: CHAR ¬ QFetch[str1, i];

c2: CHAR ¬ QFetch[str2, i];

IF c1 = c2 THEN LOOP;

IF c1 < c2 THEN RETURN [less] ELSE RETURN [greater];

ENDLOOP

ELSE

FOR i: TextBound IN [0..sz) DO

c1: CHAR ¬ QFetch[str1, i];

c2: CHAR ¬ QFetch[str2, i];

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

IF c1 = c2 THEN LOOP;

IF c1 < c2 THEN RETURN [less] ELSE RETURN [greater];

ENDLOOP;

IF sz1 > sz2 THEN RETURN [greater];

IF sz1 < sz2 THEN RETURN [less];

RETURN [equal];

};

{

At least one rope is not flat, so we do it the hard way.

r1,r2: ROPE ¬ NIL;

pos1,st1,sz1,lm1: INT ¬ 0;

pos2,st2,sz2,lm2: INT ¬ 0;

c1,c2: CHAR;

IF st1 = lm1 THEN {

need a new piece from s1

IF (pos1 ¬ pos1 + sz1) = len1

THEN RETURN [IF pos1 = len2 THEN equal ELSE less];

[r1, st1, sz1] ¬ ContainingPiece[s1, pos1];

IF sz1 = 0 THEN ERROR;

lm1 ¬ st1 + sz1;

};

IF st2 = lm2 THEN {

need a new piece from s2

IF (pos2 ¬ pos2 + sz2) = len2 THEN RETURN [greater];

[r2, st2, sz2] ¬ ContainingPiece[s2, pos2];

IF sz2 = 0 THEN ERROR;

lm2 ¬ st2 + sz2;

};

c1 ¬ InlineFetch[r1, st1];

c2 ¬ InlineFetch[r2, st2];

IF NOT case THEN {

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

};

IF c1 # c2 THEN RETURN [IF c1 < c2 THEN less ELSE greater];

st1 ¬ st1 + 1; st2 ¬ st2 + 1;

ENDLOOP;

};

CompareSubstrs: PUBLIC PROC [
s1: ROPE, start1: INT ¬ 0, len1: INT ¬ MaxLen,
s2: ROPE, start2: INT ¬ 0, len2: INT ¬ MaxLen,
case: BOOL ¬ TRUE] RETURNS [Comparison] ~ {

rem1: INT ¬ IF len1 <= 0 THEN 0 ELSE MIN[len1, NonNeg[InlineSize[s1]-NonNeg[start1]]];

rem2: INT ¬ IF len2 <= 0 THEN 0 ELSE MIN[len2, NonNeg[InlineSize[s2]-NonNeg[start2]]];

rem: INT ¬ MIN[rem1, rem2];

r1, r2: ROPE ¬ NIL;

st1, sz1, lm1: INT ¬ 0;

st2, sz2, lm2: INT ¬ 0;

WHILE rem # 0 DO

IF st1 = lm1 THEN {

need a new piece from s1

[r1, st1, sz1] ¬ ContainingPiece[s1, start1 ¬ start1 + sz1];

IF sz1 = 0 THEN EXIT;

lm1 ¬ st1 + sz1;

};

IF st2 = lm2 THEN {

need a new piece from s2

[r2, st2, sz2] ¬ ContainingPiece[s2, start2 ¬ start2 + sz2];

IF sz2 = 0 THEN EXIT;

lm2 ¬ st2 + sz2;

};

{

c1: CHAR ¬ InlineFetch[r1, st1];

c2: CHAR ¬ InlineFetch[r2, st2];

IF c1 # c2 THEN {

IF NOT case THEN {

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

IF c1 = c2 THEN GO TO equal;

};

RETURN [Basics.CompareInt[ORD[c1], ORD[c2]]];

EXITS equal => {};

};

st1 ¬ st1 + 1;

st2 ¬ st2 + 1;

rem ¬ rem - 1;

ENDLOOP;

RETURN [Basics.CompareInt[rem1, rem2]];

};

EqualSubstrs: PUBLIC PROC [
s1: ROPE, start1: INT ¬ 0, len1: INT ¬ MaxLen,
s2: ROPE, start2: INT ¬ 0, len2: INT ¬ MaxLen,
case: BOOL ¬ TRUE] RETURNS [BOOL] ~ {

rem1: INT ¬ IF len1 <= 0 THEN 0 ELSE MIN[len1, NonNeg[InlineSize[s1]-NonNeg[start1]]];

rem2: INT ¬ IF len2 <= 0 THEN 0 ELSE MIN[len2, NonNeg[InlineSize[s2]-NonNeg[start2]]];

IF rem1 = rem2 THEN {

r1, r2: ROPE ¬ NIL;

st1, sz1, lm1: INT ¬ 0;

st2, sz2, lm2: INT ¬ 0;

WHILE rem1 # 0 DO

IF st1 = lm1 THEN {

need a new piece from s1

[r1, st1, sz1] ¬ ContainingPiece[s1, start1 ¬ start1 + sz1];

IF sz1 = 0 THEN EXIT;

lm1 ¬ st1 + sz1;

};

IF st2 = lm2 THEN {

need a new piece from s2

[r2, st2, sz2] ¬ ContainingPiece[s2, start2 ¬ start2 + sz2];

IF sz2 = 0 THEN EXIT;

lm2 ¬ st2 + sz2;

};

{

c1: CHAR ¬ InlineFetch[r1, st1];

c2: CHAR ¬ InlineFetch[r2, st2];

IF c1 # c2 THEN {

IF NOT case THEN {

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

IF c1 = c2 THEN GO TO equal;

};

RETURN [FALSE];

EXITS equal => {};

};

st1 ¬ st1 + 1;

st2 ¬ st2 + 1;

rem1 ¬ rem1 - 1;

ENDLOOP;

RETURN [TRUE];

};

RETURN [FALSE];

};

ContainingPiece: PUBLIC PROC [rope: ROPE, index: INT ¬ 0]
RETURNS [base: ROPE, start: INT, len: INT] = TRUSTED {

find the largest piece containg the given index

such that the resulting rope is either the text or the object variant

(NIL, 0, 0) is returned if the index is NOT in the given rope

len ¬ InlineSize[rope];

IF index < 0 OR index >= len THEN RETURN [NIL, 0, 0];

base ¬ rope;

start ¬ index;

len ¬ len - start;

nlen: INT ¬ len;

WITH x: base SELECT FROM

(note: change base last, since it is aliased with x!)

text => RETURN;

node =>

WITH x: x SELECT FROM

substr => {

nlen ¬ x.size - start;

start ¬ start + x.start;

base ¬ x.base;

};

concat => {

del1: INT ¬ x.pos - start;

IF del1 > 0

THEN {nlen ¬ del1; base ¬ x.base}

ELSE {nlen ¬ x.size - start; start ¬ -del1; base ¬ x.rest};

};

replace => {

del2: INT ¬ x.newPos - start;

del1: INT ¬ x.start - start;

SELECT TRUE FROM

del1 > 0 => {nlen ¬ del1; base ¬ x.base};

del2 > 0 => {start ¬ -del1; nlen ¬ del2; base ¬ x.replace};

ENDCASE => {

nlen ¬ x.size - start;

start ¬ x.oldPos - del2;

base ¬ x.base;

};

object => {RETURN};

ENDCASE => ERROR NoRope;

IF nlen < len THEN len ¬ NonNeg[nlen];

ENDLOOP;

};

IsEmpty: PUBLIC PROC [r: XROPE] RETURNS [BOOL] = {

RETURN [Size[r]=0];

};

Length: PUBLIC PROC [base: XROPE] RETURNS [INT] = {

RETURN [Size[base]];

};

InlineSize: PROC [base: ROPE] RETURNS [INT] = {

SELECT TRUE FROM

base = NIL => RETURN [0];

base.tag = text => RETURN [base.length];

ENDCASE => TRUSTED {

Relys on size field being in the same place in all node variants:

RETURN [LOOPHOLE[base, REF Tobject].size];

};

Size: PUBLIC PROC [base: XROPE] RETURNS [INT] = {

WITH base SELECT FROM

base: REF XRopeRep.eightbit => {

IF base.tag = text THEN RETURN [base.length];

TRUSTED {

Relys on size field being in the same place in all node variants:

RETURN [LOOPHOLE[base, REF Tobject].size];

};

base: REF XRopeRep.extended => RETURN [base.size];

ENDCASE => RETURN [0];

};

QShort: PROC [x: INT] RETURNS [TextBound] = TRUSTED INLINE {

This "routine" is meant to convert an INT to a TextBound without checking in cases where the INT is known from previous tests to be in bounds. Use Short to actually perform the checking.

RETURN [LOOPHOLE[x]];

};

Formerly in RopeImplExt

Run: PUBLIC PROC
[s1: ROPE, pos1: INT, s2: ROPE, pos2: INT, case: BOOL ¬ TRUE, len: INT ¬ MaxLen]
RETURNS [result: INT ¬ 0] = TRUSTED {

Returns the largest number of characters N <= len such that s1 starting at pos1 is equal to s2 starting at pos2 for N characters. More formally: FOR i IN [0..N): s1[pos1+i] = s2[pos2+i]. If case is true, then case matters, otherwise upper and lower case characters are considered equal.

len1: INT;

str1: Text;

[len1, str1] ¬ SingleSize[s1];

IF NonNeg[pos1] < len1 THEN {

len2: INT;

str2: Text;

[len2, str2] ¬ SingleSize[s2];

IF NonNeg[pos2] < len2 THEN {

r1,r2: ROPE ¬ NIL;

st1,sz1,lm1: INT ¬ 0;

st2,sz2,lm2: INT ¬ 0;

WHILE result<len DO

IF st1 = lm1 THEN {

need a new piece from s1

[r1, st1, sz1] ¬ ContainingPiece[s1, pos1 ¬ pos1 + sz1];

IF sz1 = 0 THEN RETURN;

lm1 ¬ st1 + sz1;

};

IF st2 = lm2 THEN {

need a new piece from s2

[r2, st2, sz2] ¬ ContainingPiece[s2, pos2 ¬ pos2 + sz2];

IF sz2 = 0 THEN RETURN;

lm2 ¬ st2 + sz2;

};

{

c1: CHAR ¬ InlineFetch[r1, st1];

c2: CHAR ¬ InlineFetch[r2, st2];

IF c1 # c2 THEN {

IF NOT case THEN {

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

IF c1 = c2 THEN GO TO equal;

};

RETURN;

EXITS equal => {};

};

result ¬ result + 1;

st1 ¬ st1 + 1; st2 ¬ st2 + 1;

ENDLOOP;

};

IsPrefix: PUBLIC PROC [prefix: ROPE, subject: ROPE, case: BOOL ¬ TRUE] RETURNS [BOOL] ~ {

RETURN [Run[prefix, 0, subject, 0, case]=InlineSize[prefix]];

};

Index: PUBLIC PROC
[s1: ROPE, pos1: INT, s2: ROPE, case: BOOL ¬ TRUE] RETURNS [INT] = TRUSTED {

Returns the smallest character position N such that s2 occurs in s1 at N and N >= pos1. If s2 does not occur in s1 at or after pos1, s1.length is returned. pos1 <= N < s1.length => FOR i IN [0..s2.length): s1[N+i] = s2[i]; N = s1.length => s2 does not occur in s1. If case is true, then case matters, otherwise upper and lower case characters are considered equal.

len1,len2, rem: INT; both: BOOL;

[len1,len2,both] ¬ DoubleSize[s1, s2];

rem ¬ IF pos1 >= len1 THEN 0 ELSE len1 - NonNeg[pos1];

IF rem >= len2 THEN {

c: CHAR;

IF len2 = 0 THEN RETURN [pos1];

c ¬ InlineFetch[s2, 0];

rem ¬ rem - len2 + 1;

IF case

THEN {

Cmp: PROC [cc: CHAR] RETURNS [BOOL] = TRUSTED {

IF c = cc AND Run[s1, pos1+1, s2, 1, case]+1 = len2 THEN RETURN [TRUE];

pos1 ¬ pos1 + 1; RETURN [FALSE];

};

IF Map[s1, pos1, rem, Cmp] THEN RETURN [pos1]

}

ELSE {

LCmp: PROC [cc: CHAR] RETURNS [BOOL] = TRUSTED {

IF cc <= 'Z AND cc >= 'A THEN cc ¬ cc + ('a-'A);

IF c = cc AND Run[s1, pos1+1, s2, 1, case]+1 = len2 THEN RETURN [TRUE];

pos1 ¬ pos1 + 1; RETURN [FALSE];

};

IF c <= 'Z AND c >= 'A THEN c ¬ c + ('a-'A);

IF Map[s1, pos1, rem, LCmp] THEN RETURN [pos1];

};

RETURN [len1];

};

Find: PUBLIC PROC [s1, s2: ROPE, pos1: INT ¬ 0, case: BOOL ¬ TRUE] RETURNS [INT] = {

index: INT ¬ Index[s1, pos1, s2, case];

IF index = InlineSize[s1] THEN RETURN [-1];

RETURN [index];

};

FindBackward: PUBLIC PROC
[s1, s2: ROPE, pos1: INT ¬ MaxLen, case: BOOL ¬ TRUE] RETURNS [INT] = {

len1,len2: INT; both: BOOL;

[len1,len2,both] ¬ DoubleSize[s1, s2];

IF NonNeg[pos1]>len1 THEN pos1 ¬ len1;

IF len2 = 0 THEN RETURN [pos1];

IF len1 >= len2 THEN {

c2: CHAR ¬ InlineFetch[s2, 0]; -- first char of pattern

rem2: INT ~ len2-1; -- remaining chars in pattern

IF (len1-pos1)<len2 THEN pos1 ¬ len1-len2;

IF case THEN {

FOR i: INT DECREASING IN[0..pos1] DO

c1: CHAR ~ InlineFetch[s1, i];

IF c1=c2 AND (rem2=0 OR Run[s1, i+1, s2, 1, case]=rem2) THEN RETURN [i];

ENDLOOP;

}

ELSE {

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

FOR i: INT DECREASING IN[0..pos1] DO

c1: CHAR ¬ InlineFetch[s1, i];

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c1=c2 AND (rem2=0 OR Run[s1, i+1, s2, 1, case]=rem2) THEN RETURN [i];

ENDLOOP;

};

RETURN [-1];

};

Match: PUBLIC PROC [pattern, object: ROPE, case: BOOL ¬ TRUE] RETURNS [BOOL] = TRUSTED {

Returns TRUE if the object matches the pattern, where the pattern may contain * to indicate that 0 or more characters will match. Returns FALSE otherwise. For example, using a*b as a pattern, some matching objects are: ab, a#b, a###b, and some not matching objects: abc, cde, bb, a, Ab. If case is true, then case matters, otherwise upper and lower case characters are considered equal.

submatch: PROC [i1: INT, len1: INT, i2: INT, len2: INT] RETURNS [BOOL] = TRUSTED {

WHILE len1 > 0 DO

c1: CHAR ¬ InlineFetch[pattern, i1];

IF c1 = '* THEN {

quick kill for * at end of pattern

IF len1 = 1 THEN RETURN [TRUE];

else must take all combinations

{-- first, accept the *

j1: INT ¬ i1 + 1;

nlen1: INT ¬ len1 - 1;

j2: INT ¬ i2;

nlen2: INT ¬ len2;

WHILE nlen2 >= 0 DO

IF submatch[j1, nlen1, j2, nlen2] THEN RETURN [TRUE];

j2 ¬ j2 + 1;

nlen2 ¬ nlen2 - 1;

ENDLOOP;

};

RETURN [FALSE];

};

IF len2 <= 0 THEN RETURN [FALSE];

at this point demand an exact match in both strings

{c2: CHAR ¬ InlineFetch[object, i2];

IF NOT case THEN {

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

};

IF c1 # c2 THEN RETURN [FALSE];

};

i1 ¬ i1 + 1;

len1 ¬ len1 - 1;

i2 ¬ i2 + 1;

len2 ¬ len2 - 1;

ENDLOOP;

RETURN [len2 = 0];

};

len1: INT ¬ InlineSize[pattern];

len2: INT ¬ InlineSize[object];

First, strip off the common tails until they differ (quick kill false), or the pattern has a * at the tail. This strip is easy, because we just decrement the lengths.

WHILE len1 > 0 DO

n: INT ¬ len1 - 1;

c1: CHAR ¬ InlineFetch[pattern, n];

c2: CHAR;

IF c1 = '* THEN EXIT;

IF len2 = 0 THEN RETURN [FALSE];

len1 ¬ n;

len2 ¬ len2 - 1;

c2 ¬ InlineFetch[object, len2];

IF NOT case THEN {

IF c1 <= 'Z AND c1 >= 'A THEN c1 ¬ c1 + ('a-'A);

IF c2 <= 'Z AND c2 >= 'A THEN c2 ¬ c2 + ('a-'A);

};

IF c1 # c2 THEN RETURN [FALSE];

ENDLOOP;

RETURN [submatch [0, len1, 0, len2]];

};

SkipTo: PUBLIC PROC [s: ROPE, pos: INT, skip: ROPE] RETURNS [INT] = TRUSTED {

... returns the lowest position N in s such that s[N] is in the skip string and N >= pos. If no such character occurs in s, then return s.length.

len: INT ¬ InlineSize[s];

skipText: Rope.Text = Flatten[skip];

skiplen: NAT ¬ IF skipText = NIL THEN 0 ELSE skipText.length;

IF pos < len AND skiplen # 0 THEN {

CharMatch: PROC [c: CHAR] RETURNS [BOOL] = TRUSTED {

FOR i: NAT IN [0..skiplen) DO

IF c = skipText[i] THEN RETURN [TRUE];

ENDLOOP;

pos ¬ pos + 1;

RETURN [FALSE];

};

IF Map[s, pos, len - pos, CharMatch] THEN RETURN [pos];

};

RETURN [len];

};

SkipOver: PUBLIC PROC [s: ROPE, pos: INT, skip: ROPE] RETURNS [INT] = TRUSTED {

... return the lowest position N in s such that s[N] is NOT in the skip string and N >= pos. If no such character occurs in s, then return s.length.

len: INT ¬ InlineSize[s];

skipText: Rope.Text = Flatten[skip];

skiplen: NAT ¬ IF skipText = NIL THEN 0 ELSE skipText.length;

IF pos >= len THEN RETURN [len];

IF skiplen # 0 THEN {

CharMatch: PROC [c: CHAR] RETURNS [BOOL] = TRUSTED {

FOR i: NAT IN [0..skiplen) DO

IF c = skipText[i] THEN GO TO found;

ENDLOOP;

RETURN [TRUE];

EXITS found => {pos ¬ pos + 1; RETURN [FALSE]};

};

IF Map[s, pos, len - pos, CharMatch] THEN RETURN [pos];

};

RETURN [pos];

};

VerifyStructure: PUBLIC PROC
[s: ROPE] RETURNS [leaves,nodes,maxDepth: INT ¬ 0] = TRUSTED {

... traverses the structure of the given rope object; extra checking is performed to verify invariants.

IF s = NIL THEN RETURN;

WITH x: s SELECT FROM

text => {

leaves ¬ 1;

IF x.length > x.max THEN ERROR VerifyFailed};

node =>

WITH x: x SELECT FROM

substr => {

ref: ROPE ¬ x.base;

len1: INT ¬ Size[x.base];

IF x.start < 0 OR x.size <= 0 THEN ERROR VerifyFailed;

IF len1 < x.start + x.size THEN ERROR VerifyFailed;

[leaves, nodes, maxDepth] ¬ VerifyStructure[ref];

nodes ¬ nodes + 1};

concat => {

leaves1,nodes1,maxDepth1: INT;

left: ROPE ¬ x.base;

lSize: INT ¬ Size[left];

right: ROPE ¬ x.rest;

rSize: INT ¬ Size[right];

[leaves1, nodes1, maxDepth1] ¬ VerifyStructure[left];

[leaves, nodes, maxDepth] ¬ VerifyStructure[right];

leaves ¬ leaves + leaves1;

nodes ¬ nodes + nodes1 + 1;

IF maxDepth1 > maxDepth THEN maxDepth ¬ maxDepth1;

IF x.size # lSize + rSize THEN ERROR VerifyFailed;

IF x.pos # lSize THEN ERROR VerifyFailed;

};

replace => {

leaves1,nodes1,maxDepth1: INT;

old: ROPE ¬ x.base;

oldSize: INT ¬ Size[old];

repl: ROPE ¬ x.replace;

replSize: INT ¬ Size[repl];

[leaves, nodes, maxDepth] ¬ VerifyStructure[old];

[leaves1, nodes1, maxDepth1] ¬ VerifyStructure[repl];

leaves ¬ leaves + leaves1;

nodes ¬ nodes + nodes1 + 1;

IF maxDepth < maxDepth1 THEN maxDepth ¬ maxDepth1;

IF x.start > x.oldPos OR x.start > x.newPos THEN ERROR VerifyFailed;

IF x.newPos - x.start # replSize THEN ERROR VerifyFailed;

IF x.start < 0 OR x.start > x.size THEN ERROR VerifyFailed;

IF x.oldPos > oldSize THEN ERROR VerifyFailed;

};

object => {

leaves ¬ 1;

IF x.size < 0 OR x.fetch = NIL THEN ERROR VerifyFailed;

};

ENDCASE => ERROR NoRope;

maxDepth ¬ maxDepth + 1;

IF maxDepth # InlineDepth[s] THEN ERROR VerifyFailed;

};

VerifyFailed: PUBLIC ERROR = CODE;

Balance implementation

This implementation is courtesy of Michael Plass (March 1985, revised September 1993).

Balance: PUBLIC PROC [base: ROPE, start: INT ¬ 0, len: INT ¬ MaxLen, flat: INT ¬ FlatMax]
RETURNS [new: ROPE ¬ NIL] = {

This procedure is here to match the new implementation against the old definition.

RETURN [NewBalance[Substr[base, start, len]]];

};

Stopper: TYPE = PROC [part: Part] RETURNS [BOOL];

Part: TYPE ~ RECORD [base: ROPE, start: INT, end: INT];

Must have 0 <= start <= end <= Size[base];

HeightArray: TYPE = ARRAY [0..MaxDepth) OF INT;

minSizeForHeight: REF HeightArray ¬ NIL;

InitMinSizeForHeight: PROC ~ {

Initializes the height array. Does not need monitoring, since the array is completely initialized by the time the assignment occurs, and REF assignment is atomic. Races can only cause a little extra allocation.

IF minSizeForHeight = NIL THEN {

h: REF HeightArray ¬ untracedZone.NEW[HeightArray];

h[0] ¬ 0;

A NIL rope has no characters and height 0.

h[1] ¬ 1;

A flat rope ought to have at least one character.

h[2] ¬ Rope.FlatMax+1;

Must be at least this big to warrant any non-flat structure.

FOR i: NAT IN [3..MaxDepth) DO

Use Fibonacci recurrence to compute rest.

Be careful about overflow here...

IF INT.LAST - h[i-1] < h[i-2]

THEN h[i] ¬ INT.LAST

ELSE h[i] ¬ h[i-1] + h[i-2];

ENDLOOP;

minSizeForHeight ¬ h;

};

PartIsBalanced: Stopper ~ {

Examines only the root.

size: INT ~ InlineSize[part.base];

height: INT ~ InlineDepth[part.base];

IF part.start # 0

OR part.end # size

OR height >= MaxDepth

THEN RETURN [FALSE];

IF minSizeForHeight = NIL THEN InitMinSizeForHeight[];

IF size < minSizeForHeight[height] THEN RETURN [FALSE];

WITH part.base SELECT FROM

substr: REF Rope.RopeRep.node.substr => RETURN [height<=1];

concat: REF Rope.RopeRep.node.concat => RETURN [TRUE];

replace: REF Rope.RopeRep.node.replace => RETURN [FALSE];

ENDCASE => RETURN [TRUE];

};

NewBalance: PROC [rope: ROPE] RETURNS [ROPE] ~ {

a: ARRAY [0..MaxDepth+6] OF ROPE ¬ ALL[NIL];

aN: INT ¬ 0;

The ordered concatenation of the a[i] is the result-so-far.

Assume and maintain this invariant: depth[a[i-1]] > depth[a[i]], for i IN (0..aN)

Collapse: PROC [d: INTEGER] ~ {

Makes depth[a[aN-1]] >= d, while maintaining the invariant.

UNTIL aN < 2 OR InlineDepth[a[aN-1]] >= d DO

r: ROPE = SafeConcat[a[aN-2], a[aN-1]];

a[aN-1] ¬ NIL;

a[aN-2] ¬ NIL;

aN ¬ aN - 2;

Collapse[InlineDepth[r]];

Build[r];

ENDLOOP;

};

Build: PROC [r: ROPE] ~ {

Requires the invariant, plus (aN = 0) OR (depth[a[aN-1]] >= depth[r])

Appends r to the current collection of ropes.

UNTIL aN = 0 OR InlineDepth[a[aN-1]] > InlineDepth[r] DO

r ¬ SafeConcat[a[aN - 1], r];

aN ¬ aN - 1;

ENDLOOP;

a[aN] ¬ r;

aN ¬ aN + 1;

};

SavePart: PROC [part: Part] ~ {

IF part.end > part.start THEN {

Since this part is balanced (it has passed PartIsBalanced), we know the depth is small enough so that the Substr call will not trigger a recursive call to NewBalance

r: ROPE ¬ Substr[part.base, part.start, part.end-part.start];

d: INTEGER ¬ InlineDepth[r];

Collapse[d];

UNTIL aN = 0 OR InlineDepth[a[aN-1]] > d DO

r ¬ SafeConcat[a[aN - 1], r];

d ¬ InlineDepth[r];

aN ¬ aN - 1;

ENDLOOP;

a[aN] ¬ r;

aN ¬ aN + 1;

};

part: Part ~ [rope, 0, Size[rope]];

stopper: Stopper ~ IF InlineDepth[rope] > MaxDepth+1 THEN NIL ELSE PartIsBalanced;

Note: If the depth exceeds MaxDepth+1, that means we have a rope that NewBalance was unable to iron out sufficiently using PartIsBalanced as the stopper. In this very unusual case, we will map all the way down to the flat pieces to guarantee a balanced result.

MapParts[part, SavePart, stopper];

WHILE aN > 1 DO

a[aN-2] ¬ SafeConcat[a[aN-2], a[aN-1]];

a[aN-1] ¬ NIL;

aN ¬ aN - 1;

ENDLOOP;

RETURN [a[0]]

};

BadPart: SIGNAL ~ CODE;

MapParts: PROC [part: Part, action: PROC[Part], stopDescent: Stopper ¬ NIL] ~{

IF stopDescent#NIL AND stopDescent[part]

THEN action[part]

ELSE {

size: INT ~ InlineSize[part.base];

depth: INT ~ InlineDepth[part.base]; -- debugging information

IF part.start < 0 OR part.end NOT IN [part.start..size] THEN ERROR BadPart;

WITH part.base SELECT FROM

substr: REF Rope.RopeRep.node.substr => {

MapParts[[substr.base, substr.start+part.start, substr.start+part.end], action, stopDescent];

};

concat: REF Rope.RopeRep.node.concat => {

IF part.start < concat.pos THEN {

MapParts[[concat.base, part.start, MIN[part.end, concat.pos]], action, stopDescent];

};

IF part.end > concat.pos THEN {

newStart: INT ¬ MAX[part.start-concat.pos, 0];

newEnd: INT ¬ part.end-concat.pos;

MapParts[[concat.rest, newStart, newEnd], action, stopDescent];

};

replace: REF Rope.RopeRep.node.replace => {

len1: INT ~ replace.start;

len2: INT ~ replace.newPos-replace.start;

len3: INT ~ replace.size-replace.newPos;

offset3: INT ~ replace.oldPos;

IF part.start < len1 THEN {

MapParts[[replace.base, part.start, MIN[part.end, len1]], action, stopDescent];

};

IF part.start < len1+len2 AND part.end > len1 THEN {

newStart: INT ~ MAX[part.start-len1, 0];

newEnd: INT ~ MIN[part.end-len1, len2];

MapParts[[replace.replace, newStart, newEnd], action, stopDescent];

};

IF part.end > len1+len2 THEN {

newStart: INT ¬ MAX[part.start-(len1+len2), 0]+offset3;

newEnd: INT ¬ MIN[part.end-(len1+len2), len3]+offset3;

MapParts[[replace.base, newStart, newEnd], action, stopDescent];

};

ENDCASE => action[part];

};

Package Initialization

TRUSTED {untracedZone ¬ SafeStorage.GetUntracedZone[];}

END.

26-Feb-81, Russ Atkinson, fixed bug in Substr (REF Tconcat case) found by Paxton

31-Mar-81, Russ Atkinson, fixed bug in Map (overlarge len to user's map) found by Morris

8-Apr-81, Russ Atkinson, fixed bug in Substr (REF Tconcat case) found by Paxton

11-May-81, Russ Atkinson, converted to use variant record representation

23-May-81, Russ Atkinson, added Balance, ContainingPiece

12-Jul-81, Russ Atkinson, added FromChar, fixed FromProc

22-Sep-81, Russ Atkinson, removed dependence on CedarString

14-Oct-81, Russ Atkinson, added stuff for depth maintenance

30-Oct-81 17:21:35, Russ Atkinson, added pz & changes to match new specs

November 24, 1981 12:11 pm, Russ Atkinson, fixed ContainingPiece and some defaults

17-Feb-82 14:50:00, Russ Atkinson, minor defs changes

19-Feb-82 12:00:25, Russ Atkinson, try to avoid returning NIL

April 8, 1982, Russ Atkinson, fix Compare bug

June 7, 1982, Russ Atkinson, convert to Cedar 3.2

September 9, 1982, Russ Atkinson, convert to Cedar 3.4 (Compare returns Comparison)

December 20, 1984, Russ Atkinson, formatting cleanup & performance improvements

Russ Atkinson (RRA) January 29, 1985 4:09:50 pm PST

Removed PieceMap and its support, added AppendChars and its support. Other small changes to use AppendChars internally (like in Flatten).

Christian Jacobi December 2, 1987 10:38:38 am PST

added debugging code; made inlines be real procedures.

Christian Jacobi December 2, 1987 10:39:14 am PST

copied the implementation of Balance from princops ropeimpl.

Carl Hauser: March 6, 1989: restored inlines to Rope and RopePrivate interfaces.

Michael Plass: October 10, 1991: Made changes to allow for XROPE in rope interface.

Michael Plass: September 7, 1993: Made internal calls to Balance use NewBalance; re-coded NewBalance to concatenate neighbors on the basis of depth, rather than size, and made a recursive invocation impossible; increased MaxDepth to allow enough depth for maximum-size ropes (MaxDepth is in RopePrivate).