DIRECTORY
Ascii USING [Upper],
Process USING [CheckForAbort],
RefID USING [ID, Release, Seal, Unseal],
RegularExpression USING [CreateFromRope, Finder, SearchRope],
Rope USING [Compare, Equal, Fetch, InlineFetch, Length, ROPE, SkipTo, Substr],
Soundex USING [NameToCode],
LoganBerryStub,
LoganQuery;
Basic operations
The following operations are syntactically and semantically similar to those provided by LoganBerry. Any errors raised by these operations are generated by LoganBerry.
NextEntry:
PUBLIC
PROC [cursor: ComplexCursor, dir: CursorDirection ← increasing]
RETURNS [entry: LoganBerry.Entry] = {
Retrieves the next entry relative to the given cursor. The cursor is automatically updated so that NextEntry may be repeatedly called to enumerate entries. NIL is returned if the cursor is at the end of the sequence and dir=increasing or at the start of the sequence and dir=decreasing.
Errors: IndexNotOpen, BadIndex, LogNotOpen, BadLogEntry
IF cursor.class.retrieve = NIL THEN RETURN[NIL];
Process.CheckForAbort[];
RETURN[cursor.class.retrieve[cursor, dir]];
};
EndGenerate:
PUBLIC
PROC [cursor: ComplexCursor]
RETURNS [] = {
Releases the cursor; no further operations may be performed using the given cursor. This must be called once for every cursor created.
IF cursor.class.destroy = NIL THEN RETURN;
cursor.class.destroy[cursor];
};
Predefined query classes
$Simple
SimpleClass: QueryClass = NEW[QueryClassRec ← [$Simple, SimpleRetrieve, SimpleDestroy]];
SimpleCursor: TYPE = LoganBerry.Cursor;
SimpleRetrieve: RetrieveProc = {
[cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry]
RETURN[LoganBerry.NextEntry[cursor: cursor.data, dir: dir]];
};
SimpleDestroy: DestroyProc = {
[cursor: ComplexCursor] RETURNS []
LoganBerry.EndGenerate[cursor: cursor.data];
};
GenerateEntries:
PUBLIC
PROC [db: LoganBerry.OpenDB, key: LoganBerry.AttributeType, start: LoganBerry.AttributeValue ←
NIL, end: LoganBerry.AttributeValue ←
NIL]
RETURNS [cursor: ComplexCursor] = {
Identical to the GenerateEntries operation provided by LoganBerry but returns a ComplexCursor with class=$Simple and data=LoganBerry.Cursor.
Errors: LoganBerry.NoIndex
cursor.class ← SimpleClass;
cursor.data ← LoganBerry.GenerateEntries[db: db, key: key, start: start, end: end];
};
$Filter
FilterClass: QueryClass = NEW[QueryClassRec ← [$Filter, FilterRetrieve, FilterDestroy]];
FilterCursor: TYPE = REF FilterCursorRec;
FilterCursorRec:
TYPE =
RECORD [
input: ComplexCursor,
pattern: ROPE,
filter: FilterProc,
atype: LoganBerry.AttributeType,
stopNG: BOOLEAN ← FALSE
];
FilterRetrieve: RetrieveProc = {
[cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry]
fc: FilterCursor = NARROW[RefID.Unseal[cursor.data]];
match, nothingGreater: BOOLEAN ← FALSE;
numAtypes: INT;
entry ← NextEntry[fc.input, dir];
UNTIL match
OR entry =
NIL
DO
could be multiple attributes of fc.atype per entry so check them all
numAtypes ← 0;
FOR e: LoganBerry.Entry ← entry, e.rest
WHILE e #
NIL
DO
IF e.first.type = fc.atype
THEN {
numAtypes ← numAtypes + 1;
[match, nothingGreater] ← fc.filter[e.first.value, fc.pattern];
IF match THEN EXIT;
};
ENDLOOP;
IF
NOT match
THEN
IF nothingGreater
AND fc.stopNG
AND dir = increasing
AND numAtypes = 1
THEN
entry ← NIL -- can stop before actual end of input
ELSE
entry ← NextEntry[fc.input, dir];
ENDLOOP;
};
FilterDestroy: DestroyProc = {
[cursor: ComplexCursor] RETURNS []
fc: FilterCursor = NARROW[RefID.Unseal[cursor.data]];
EndGenerate[fc.input];
[] ← RefID.Release[cursor.data];
};
FilterEntries:
PUBLIC
PROC [input: ComplexCursor, pattern:
ROPE, filter: FilterProc, atype: LoganBerry.AttributeType, stopIfNothingGreater:
BOOLEAN ←
FALSE]
RETURNS [output: ComplexCursor] = {
Returns a ComplexCursor of class $Filter. Retrievals using this new cursor apply the given pattern-matching filter to the appropriate attribute of entries identified by the input cursor. A retrieval returns NIL if the input is exhausted or stopIfNothingGreater=TRUE and the filter procedure returns nothingGreater=TRUE.
fc: FilterCursor ← NEW[FilterCursorRec];
fc.input ← input;
fc.pattern ← pattern;
fc.filter ← filter;
fc.atype ← atype;
fc.stopNG ← stopIfNothingGreater;
output.class ← FilterClass;
output.data ← RefID.Seal[fc];
};
$Merger
MergerClass: QueryClass = NEW[QueryClassRec ← [$Merger, MergerRetrieve, MergerDestroy]];
MergerCursor: TYPE = REF MergerCursorRec;
MergerCursorRec:
TYPE =
RECORD [
input: ARRAY [0..1] OF ComplexCursor,
atype: LoganBerry.AttributeType,
prefetch: BOOLEAN ← FALSE, -- is there a prefetched entry?
prefetchEntry: LoganBerry.Entry, -- prefetched entry if one exists
prefetchValue: LoganBerry.AttributeValue, -- value saved for performance reasons
prefetchInput: [0..1] ← 0, -- which input the prefetch came from
prefetchDir: CursorDirection ← increasing -- direction of prefetch
];
MergerRetrieve: RetrieveProc = {
[cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry]
mc: MergerCursor = NARROW[RefID.Unseal[cursor.data]];
newInput: [0..1];
newEntry: LoganBerry.Entry;
newValue: LoganBerry.AttributeValue;
returnNew: BOOLEAN;
IF mc.prefetch
AND
NOT mc.prefetchDir = dir
THEN {
-- prefetched in wrong direction
so move cursor back over prefetched entry
[] ← NextEntry[mc.input[mc.prefetchInput], dir];
mc.prefetch ← FALSE;
};
IF
NOT mc.prefetch
THEN {
-- go ahead and get an entry
mc.prefetchEntry ← NextEntry[mc.input[0], dir];
mc.prefetchValue ← GetAttributeValue[mc.prefetchEntry, mc.atype];
mc.prefetchInput ← 0;
mc.prefetchDir ← dir;
mc.prefetch ← TRUE;
};
newInput ← ABS[1-mc.prefetchInput];
newEntry ← NextEntry[mc.input[newInput], dir];
IF mc.prefetchEntry =
NIL
THEN
-- one input already exhausted
RETURN[newEntry];
IF newEntry =
NIL
THEN {
-- just exhausted input
entry ← mc.prefetchEntry;
mc.prefetchEntry ← NIL;
mc.prefetchInput ← newInput;
RETURN[entry];
};
newValue ← GetAttributeValue[newEntry, mc.atype];
returnNew ←
SELECT Rope.Compare[newValue, mc.prefetchValue,
FALSE]
FROM
less => (dir = increasing),
greater => (dir = decreasing),
ENDCASE => TRUE;
IF returnNew
THEN {
entry ← newEntry;
}
ELSE {
entry ← mc.prefetchEntry;
mc.prefetchEntry ← newEntry;
mc.prefetchValue ← newValue;
mc.prefetchInput ← newInput;
};
};
MergerDestroy: DestroyProc = {
[cursor: ComplexCursor] RETURNS []
mc: MergerCursor = NARROW[RefID.Unseal[cursor.data]];
EndGenerate[mc.input[0]];
EndGenerate[mc.input[1]];
[] ← RefID.Release[cursor.data];
};
MergeEntries:
PUBLIC
PROC [input1, input2: ComplexCursor, atype: LoganBerry.AttributeType]
RETURNS [output: ComplexCursor] = {
Returns a ComplexCursor of class $Merger. Retrievals using this new cursor merge the two input streams. The input streams should be ordered by the given attribute type. A retrieval returns NIL only when both inputs are exhausted.
mc: MergerCursor ← NEW[MergerCursorRec];
mc.input[0] ← input1;
mc.input[1] ← input2;
mc.atype ← atype;
mc.prefetch ← FALSE;
output.class ← MergerClass;
output.data ← RefID.Seal[mc];
};
Predefined filters
Equal:
PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
Compares the value and pattern for equality.
SELECT Rope.Compare[s1: value, s2: pattern, case:
FALSE]
FROM
less => match ← FALSE;
equal => match ← TRUE;
greater => {match ← FALSE; nothingGreater ← TRUE};
ENDCASE => ERROR;
};
Prefix:
PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
Checks if the pattern is a prefix of the value. Prefix[v, "p"] is the same as Wildcard[v, "p*"], though faster.
SELECT Rope.Compare[s1: Rope.Substr[value, 0, Rope.Length[pattern]], s2: pattern, case:
FALSE]
FROM
less => match ← FALSE;
equal => match ← TRUE;
greater => {match ← FALSE; nothingGreater ← TRUE}
ENDCASE => ERROR;
};
Wildcard:
PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
The pattern may contain zero or more wildcards (the character "*") that match anything.
This routine was adapted from FSMainImpl2.Match.
SubMatch:
PROC [pstart:
INT, plen:
INT, vstart:
INT, vlen:
INT]
RETURNS [match:
BOOLEAN, nothingGreater:
BOOLEAN ← FALSE] = {
See if there is a match between Rope.Substr[value,vstart,vlen] and Rope.Substr[pattern,pstart,plen].
pchar, vchar: CHAR;
WHILE plen > 0
DO
pchar ← Ascii.Upper[pattern.Fetch[pstart]];
IF pchar = '*
THEN {
quick kill for * at end of pattern
IF plen = 1 THEN RETURN [match: TRUE];
else must take all combinations
{
-- first, accept the *
newpstart: INT = pstart + 1;
newplen: INT = plen - 1;
newvstart: INT ← vstart;
newvlen: INT ← vlen;
WHILE newvlen >= 0
DO
IF SubMatch[newpstart, newplen, newvstart, newvlen].match THEN RETURN [match: TRUE];
newvstart ← newvstart + 1;
newvlen ← newvlen - 1;
ENDLOOP;
};
RETURN [match: FALSE];
};
IF vlen = 0 THEN RETURN [match: FALSE];
at this point demand an exact match in both strings
vchar ← Ascii.Upper[value.Fetch[vstart]];
IF pchar # vchar THEN RETURN [match: FALSE, nothingGreater: vchar > pchar];
pstart ← pstart + 1;
plen ← plen - 1;
vstart ← vstart + 1;
vlen ← vlen - 1;
ENDLOOP;
IF vlen = 0
THEN
RETURN [match:
TRUE]
ELSE RETURN [match: FALSE, nothingGreater: TRUE];
};
[match, nothingGreater] ← SubMatch [0, pattern.Length[], 0, value.Length[]];
};
finderCache:
RECORD[
-- cache last RE finder to improve performance
pattern: ROPE,
finder: RegularExpression.Finder
];
RE:
PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
The pattern is taken to be a regular expression as defined in RegularExpressionDoc.tioga.
finder: RegularExpression.Finder;
purePrefix: ROPE;
i: INT;
IF Rope.Equal[s1: pattern, s2: finderCache.pattern, case:
FALSE]
THEN
-- get from cache
finder ← finderCache.finder
ELSE {
finder ← RegularExpression.CreateFromRope[pattern: pattern, ignoreCase: TRUE, addBounds: TRUE];
finderCache.pattern ← pattern;
finderCache.finder ← finder;
};
match ← RegularExpression.SearchRope[finder, value].found;
IF
NOT match
THEN {
-- see if something greater might match
i ← Rope.SkipTo[s: pattern, skip: "\'#[^$*+(\\<{!"]; -- look for special chars
IF
NOT i = Rope.Length[pattern]
AND Rope.InlineFetch[pattern, i] = '*
THEN
-- could be zero of previous char
i ← i-1;
purePrefix ← Rope.Substr[base: pattern, len: i];
IF Rope.Compare[s1: Rope.Substr[value, 0, Rope.Length[purePrefix]], s2: purePrefix, case:
FALSE] = greater
THEN
nothingGreater ← TRUE;
};
};
Soundex:
PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
Compares the value and pattern based on their Soundex codes. The Soundex encoding tends to group together variants of the same name; for instance, Johnson, Jansen, and Johansen have identical Soundex codes.
ch: CHAR ← Ascii.Upper[Rope.InlineFetch[pattern,0]];
as an optimization, only compute codes if first characters match
SELECT Ascii.Upper[Rope.InlineFetch[value,0]]
FROM
= ch => match ← Rope.Equal[Soundx.NameToCode[value], Soundx.NameToCode[pattern]];
< ch => match ← FALSE;
> ch => {match ← FALSE; nothingGreater ← TRUE};
ENDCASE => ERROR;
};
Doug Terry November 7, 1985 3:11:59 pm PST
created
changes to: DIRECTORY, ILQueryImpl, EXPORTS, ~, NextEntry, EndGenerate, SimpleRetrieve, SimpleDestroy, GenerateEntries, FilterEntries, MergeEntries, Equal, SimpleRetrieve, FilterClass, FilterCursorRec, FilterRetrieve, FilterDestroy, FilterEntries, FilterCursor, FilterCursorRec, Soundex, MergerClass, MergerRetrieve, MergerDestroy, Prefix, Wildcard, RE
Doug Terry, November 8, 1985 7:36:34 pm PST
changes to: Wildcard, SubMatch (local of Wildcard), RE, Soundex, MergerCursorRec, MergerRetrieve, DIRECTORY, IMPORTS, Equal, Prefix, NextEntry, EndGenerate, GenerateEntries, FilterEntries, MergeEntries, FilterRetrieve
Doug Terry, November 26, 1985 6:15:38 pm PST
changes to: DIRECTORY, LoganQueryImpl, IMPORTS, EXPORTS, ~, NextEntry, SimpleRetrieve, SimpleDestroy, GenerateEntries, FilterCursorRec, FilterRetrieve, FilterEntries, MergerCursorRec, MergerRetrieve, MergeEntries, GetAttributeValue, FilterDestroy, MergerDestroy