RosaryDoc.tioga
Last edited by Michael Plass, July 22, 1988 9:36:00 am PDT
Rick Beach, March 13, 1987 11:23:43 am PST
Willie-Sue, November 25, 1987 4:16:02 pm PST
ROSARY
PCEDAR 1.0 —
Rosary
Michael Plass
© Copyright 1985, 1986, 1987, 1988 Xerox Corporation. All rights reserved.
A client package that provides the analogy for ROPEs, with REF ANYs in place of CHARs.
Keywords: data structures, lists, REF, ROPE, trees
XEROX Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304
1. Introduction
The ROPE type is the preferred way of representing immutable sequences of characters, and has proven to be very successful. This package provides the analogous data type where the base types are of type REF ANY. The fundamental operations on objects of this type are concatenation, substring extraction, and fetching the values by index. Additional operations are provided for more efficient sequential creation and scanning of the objects.
The client interface is called Rosary.mesa, and the basic data type is ROSARY. The Segment type is used to refer to part of a ROSARY without doing the extra work associated with substring extraction.
The Rosary implementation provides a print procedure to the Cedar system, so that when you print a value of type
ROSARY in the interpreter or with PutF, it will be displayed in the form
ROSARY[<item0>, <item1>, ...]
2. Operations
The basic operations are Size, Fetch, FromItem, Substr, and Concat. These are analogous to their counterparts in Rope.
Cat is a convenience for concatenating multiple ROSARYs together.
CatSegments provides a way of concatenating up to three segments to obtain a new ROSARY.
FromProc is analogous to Rope.FromProc.
FromProcProc is a more convoluted way of creating a
ROSARY, but is often applicable where
FromProc is not. It uses a double callback mechanism, where the client passes in a procedure
p; the package calls
p back, handing it a procedure
q, and then
p calls
q once for each item. If this seems confusing, don't be discouraged, because it is. The implementation of
FromList may make it all clear:
FromList:
PUBLIC
PROC [list:
LIST
OF
REF]
RETURNS [Ref] ~ {
p:
PROC[q:
PROC[item:
REF, repeat:
INT ← 1]] ~ {
FOR l:
LIST
OF
REF ← list, l.rest
UNTIL l =
NIL
DO
q[l.first];
ENDLOOP;
};
RETURN [FromProcProc[p]]
};
Notice that q takes a second parameter, namely repeat; this repeats the item in the ROSARY the specified number of times.
Map is analogous to Rope.Map, and MapRuns may be used where it is expected that repeated items will occur frequently. MapRuns does not promise to collapse adjacent runs for you, merely to take advantage of any encoded runs that occur in the data structure. FetchRun may be used to obtain a single, uncollapsed run.
FetchBuffer may be used to get a buffer-full of consecutive entries out of a
ROSARY; this is especially useful when more than one
ROSARY are being traversed in parallel, where Map is unusable. For example:
Equal:
PROC [r1, r2:
ROSARY]
RETURNS [
BOOL] ~ {
i: INT ← 0;
size: INT ← Rosary.Size[r1];
IF size # Rosary.Size[r2] THEN RETURN [FALSE];
WHILE i # size
DO
b1: Rosary.Buffer ← Rosary.FetchBuffer[r1, i];
b2: Rosary.Buffer ← Rosary.FetchBuffer[r2, i];
FOR j:
NAT
IN [0..b1.count)
DO
IF b1.a[j] # b2.a[j] THEN RETURN [FALSE];
ENDLOOP;
i ← i + b1.count;
ENDLOOP;
RETURN [TRUE]
};
FromList and ToList are conversion procedures between the types ROSARY and LIST OF REF.
FromObject provides a way of creating a
ROSARY with a non-standard implementation. To do this, the client provides a procedure for mapping the runs of the rosary object, along with the size and any data it needs. The client may also provide a substr procedure for implementing the substring operation; if this is omitted, a default one will be provided. As an example, consider the following lazy-reverse routine:
Reverse:
PROC [r:
ROSARY]
RETURNS [
ROSARY] ~ {
IF Rosary.Size[r] < 2 THEN RETURN [r]
ELSE RETURN [Rosary.FromObject[size: Rosary.Size[r], data: r, mapRuns: ReverseMap, substr: ReverseSubstr]]
};
ReverseMap: Rosary.MapRunsProc ~ {
data: ROSARY ~ NARROW[NARROW[s.base, REF Rosary.RosaryRep.object].data];
FOR i:
INT
DECREASING
IN [s.start..s.start+s.size)
DO
IF Rosary.MapRuns[[data, i, 1], action] THEN RETURN [TRUE];
ENDLOOP;
RETURN [FALSE];
};
ReverseSubstr: Rosary.SubstrProc ~ {
data: ROSARY ~ NARROW[NARROW[base, REF Rosary.RosaryRep.object].data];
RETURN [Reverse[Rosary.Substr[data, data.size-(start+len), len]]]
};
Note that the MapRunsProc and SubstrProc may assume that the start and size (len) fields are already clipped to the size of the base rosary. The data need not be a ROSARY, but can be any kind of data structure that the client wants. It is considered good form to build objects that always map the same sequence of REFs each time they are called, but there is nothing inside the Rosary implementation that requires this.
3. Implementation notes
The underlying representation of a ROSARY is a binary tree. The implementation tries to create trees that are not too high, so that excessive stack space is not used when traversing them. Unlike the ROPE representation, there is no substr node; this is done to guarantee that a ROSARY points only to the REFs that it contains. Thus the Substr operation needs to pick apart the subtrees it should contain, and glue them together with concat nodes. All of this work is actually done in CatSegments; Concat is lazier and only rebalances when necessary to keep the maximum height from being exceeded.
There is a compact representation provided for adjacent elements that are all equal, so ROSARYs that tend to have lots of runs will be more compact.
Neglecting the run encodings, a ROSARY takes between 2 and 3 times as much storage as the equivalent LIST OF REF, not counting the storage for the elements. The time to create a ROSARY using FromProcProc is roughly twice the time to create a LIST OF REF using CONS (when the elements are already allocated). The traversal time using Map is about 8 times slower than running through a list, neglecting the time spent processing each element. The random access time is, of course, much faster, growing as the height of the ROSARY rather than as its size.