LoganQueryImpl.mesa
Copyright © 1985 by Xerox Corporation. All rights reserved.
Doug Terry, February 12, 1986 11:29:36 am PST
LoganQuery allows more complicated queries to be performed on a LoganBerry database than the simple retrievals provided by the LoganBerry interface. Database queries of various classes can be registered; retrieval operations are class-specific. For instance, the $Filter class returns database entries that match a particular pattern, while the $Merger class merges the entries returned by two independent queries. Filters can be cascaded to perform multiple-attribute queries. The predefined $Simple class permits operations identical to those provided by LoganBerry. Clients are free to create their own class of queries.
DIRECTORY
Ascii USING [Upper],
RefID USING [ID, Release, Seal, Unseal],
RegularExpression USING [CreateFromRope, Finder, SearchRope],
Rope USING [Compare, Equal, Fetch, InlineFetch, Length, ROPE, SkipTo, Substr],
Soundex USING [NameToCode],
LoganBerry,
LoganQuery;
LoganQueryImpl: CEDAR PROGRAM
IMPORTS Ascii, RefID, RegularExpression, Rope, Soundx: Soundex, LoganBerry
EXPORTS LoganQuery
~ BEGIN OPEN LoganQuery;
ROPE: TYPE ~ Rope.ROPE;
Basic operations
The following operations are syntactically and semantically similar to those provided by LoganBerry. Any errors raised by these operations are generated by LoganBerry.
NextEntry: PUBLIC PROC [cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry] = {
Retrieves the next entry relative to the given cursor. The cursor is automatically updated so that NextEntry may be repeatedly called to enumerate entries. NIL is returned if the cursor is at the end of the sequence and dir=increasing or at the start of the sequence and dir=decreasing.
Errors: IndexNotOpen, BadIndex, LogNotOpen, BadLogEntry
IF cursor.class.retrieve = NIL THEN RETURN[NIL];
RETURN[cursor.class.retrieve[cursor, dir]];
};
EndGenerate: PUBLIC PROC [cursor: ComplexCursor] RETURNS [] = {
Releases the cursor; no further operations may be performed using the given cursor. This must be called once for every cursor created.
IF cursor.class.destroy = NIL THEN RETURN;
cursor.class.destroy[cursor];
};
Predefined query classes
$Simple
SimpleClass: QueryClass = NEW[QueryClassRec ← [$Simple, SimpleRetrieve, SimpleDestroy]];
SimpleCursor: TYPE = LoganBerry.Cursor;
SimpleRetrieve: RetrieveProc = {
[cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry]
RETURN[LoganBerry.NextEntry[cursor: cursor.data, dir: dir]];
};
SimpleDestroy: DestroyProc = {
[cursor: ComplexCursor] RETURNS []
LoganBerry.EndGenerate[cursor: cursor.data];
};
GenerateEntries: PUBLIC PROC [db: LoganBerry.OpenDB, key: LoganBerry.AttributeType, start: LoganBerry.AttributeValue ← NIL, end: LoganBerry.AttributeValue ← NIL] RETURNS [cursor: ComplexCursor] = {
Identical to the GenerateEntries operation provided by LoganBerry but returns a ComplexCursor with class=$Simple and data=LoganBerry.Cursor.
Errors: LoganBerry.NoIndex
cursor.class ← SimpleClass;
cursor.data ← LoganBerry.GenerateEntries[db: db, key: key, start: start, end: end];
};
$Filter
FilterClass: QueryClass = NEW[QueryClassRec ← [$Filter, FilterRetrieve, FilterDestroy]];
FilterCursor: TYPE = REF FilterCursorRec;
FilterCursorRec: TYPE = RECORD [
input: ComplexCursor,
pattern: ROPE,
filter: FilterProc,
atype: LoganBerry.AttributeType,
stopNG: BOOLEANFALSE
];
FilterRetrieve: RetrieveProc = {
[cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry]
fc: FilterCursor = NARROW[RefID.Unseal[cursor.data]];
match, nothingGreater: BOOLEANFALSE;
numAtypes: INT;
entry ← NextEntry[fc.input, dir];
UNTIL match OR entry = NIL DO
could be multiple attributes of fc.atype per entry so check them all
numAtypes ← 0;
FOR e: LoganBerry.Entry ← entry, e.rest WHILE e # NIL DO
IF e.first.type = fc.atype THEN {
numAtypes ← numAtypes + 1;
[match, nothingGreater] ← fc.filter[e.first.value, fc.pattern];
IF match THEN EXIT;
};
ENDLOOP;
IF NOT match THEN
IF nothingGreater AND fc.stopNG AND dir = increasing AND numAtypes = 1 THEN
entry ← NIL -- can stop before actual end of input
ELSE
entry ← NextEntry[fc.input, dir];
ENDLOOP;
};
FilterDestroy: DestroyProc = {
[cursor: ComplexCursor] RETURNS []
fc: FilterCursor = NARROW[RefID.Unseal[cursor.data]];
EndGenerate[fc.input];
[] ← RefID.Release[cursor.data];
};
FilterEntries: PUBLIC PROC [input: ComplexCursor, pattern: ROPE, filter: FilterProc, atype: LoganBerry.AttributeType, stopIfNothingGreater: BOOLEANFALSE] RETURNS [output: ComplexCursor] = {
Returns a ComplexCursor of class $Filter. Retrievals using this new cursor apply the given pattern-matching filter to the appropriate attribute of entries identified by the input cursor. A retrieval returns NIL if the input is exhausted or stopIfNothingGreater=TRUE and the filter procedure returns nothingGreater=TRUE.
fc: FilterCursor ← NEW[FilterCursorRec];
fc.input ← input;
fc.pattern ← pattern;
fc.filter ← filter;
fc.atype ← atype;
fc.stopNG ← stopIfNothingGreater;
output.class ← FilterClass;
output.data ← RefID.Seal[fc];
};
$Merger
MergerClass: QueryClass = NEW[QueryClassRec ← [$Merger, MergerRetrieve, MergerDestroy]];
MergerCursor: TYPE = REF MergerCursorRec;
MergerCursorRec: TYPE = RECORD [
input: ARRAY [0..1] OF ComplexCursor,
atype: LoganBerry.AttributeType,
prefetch: BOOLEANFALSE, -- is there a prefetched entry?
prefetchEntry: LoganBerry.Entry, -- prefetched entry if one exists
prefetchValue: LoganBerry.AttributeValue, -- value saved for performance reasons
prefetchInput: [0..1] ← 0, -- which input the prefetch came from
prefetchDir: CursorDirection ← increasing -- direction of prefetch
];
MergerRetrieve: RetrieveProc = {
[cursor: ComplexCursor, dir: CursorDirection ← increasing] RETURNS [entry: LoganBerry.Entry]
mc: MergerCursor = NARROW[RefID.Unseal[cursor.data]];
newInput: [0..1];
newEntry: LoganBerry.Entry;
newValue: LoganBerry.AttributeValue;
returnNew: BOOLEAN;
IF mc.prefetch AND NOT mc.prefetchDir = dir THEN { -- prefetched in wrong direction
so move cursor back over prefetched entry
[] ← NextEntry[mc.input[mc.prefetchInput], dir];
mc.prefetch ← FALSE;
};
IF NOT mc.prefetch THEN { -- go ahead and get an entry
mc.prefetchEntry ← NextEntry[mc.input[0], dir];
mc.prefetchValue ← GetAttributeValue[mc.prefetchEntry, mc.atype];
mc.prefetchInput ← 0;
mc.prefetchDir ← dir;
mc.prefetch ← TRUE;
};
newInput ← ABS[1-mc.prefetchInput];
newEntry ← NextEntry[mc.input[newInput], dir];
IF mc.prefetchEntry = NIL THEN -- one input already exhausted
RETURN[newEntry];
IF newEntry = NIL THEN { -- just exhausted input
entry ← mc.prefetchEntry;
mc.prefetchEntry ← NIL;
mc.prefetchInput ← newInput;
RETURN[entry];
};
newValue ← GetAttributeValue[newEntry, mc.atype];
returnNew ← SELECT Rope.Compare[newValue, mc.prefetchValue, FALSE] FROM
less => (dir = increasing),
greater => (dir = decreasing),
ENDCASE => TRUE;
IF returnNew THEN {
entry ← newEntry;
}
ELSE {
entry ← mc.prefetchEntry;
mc.prefetchEntry ← newEntry;
mc.prefetchValue ← newValue;
mc.prefetchInput ← newInput;
};
};
MergerDestroy: DestroyProc = {
[cursor: ComplexCursor] RETURNS []
mc: MergerCursor = NARROW[RefID.Unseal[cursor.data]];
EndGenerate[mc.input[0]];
EndGenerate[mc.input[1]];
[] ← RefID.Release[cursor.data];
};
MergeEntries: PUBLIC PROC [input1, input2: ComplexCursor, atype: LoganBerry.AttributeType] RETURNS [output: ComplexCursor] = {
Returns a ComplexCursor of class $Merger. Retrievals using this new cursor merge the two input streams. The input streams should be ordered by the given attribute type. A retrieval returns NIL only when both inputs are exhausted.
mc: MergerCursor ← NEW[MergerCursorRec];
mc.input[0] ← input1;
mc.input[1] ← input2;
mc.atype ← atype;
mc.prefetch ← FALSE;
output.class ← MergerClass;
output.data ← RefID.Seal[mc];
};
Predefined filters
Equal: PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
Compares the value and pattern for equality.
SELECT Rope.Compare[s1: value, s2: pattern, case: FALSE] FROM
less => match ← FALSE;
equal => match ← TRUE;
greater => {match ← FALSE; nothingGreater ← TRUE};
ENDCASE => ERROR;
};
Prefix: PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
Checks if the pattern is a prefix of the value. Prefix[v, "p"] is the same as Wildcard[v, "p*"], though faster.
SELECT Rope.Compare[s1: Rope.Substr[value, 0, Rope.Length[pattern]], s2: pattern, case: FALSE] FROM
less => match ← FALSE;
equal => match ← TRUE;
greater => {match ← FALSE; nothingGreater ← TRUE}
ENDCASE => ERROR;
};
Wildcard: PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
The pattern may contain zero or more wildcards (the character "*") that match anything.
This routine was adapted from FSMainImpl2.Match.
SubMatch: PROC [pstart: INT, plen: INT, vstart: INT, vlen: INT] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE] = {
See if there is a match between Rope.Substr[value,vstart,vlen] and Rope.Substr[pattern,pstart,plen].
pchar, vchar: CHAR;
WHILE plen > 0 DO
pchar ← Ascii.Upper[pattern.Fetch[pstart]];
IF pchar = '* THEN {
quick kill for * at end of pattern
IF plen = 1 THEN RETURN [match: TRUE];
else must take all combinations
{ -- first, accept the *
newpstart: INT = pstart + 1;
newplen: INT = plen - 1;
newvstart: INT ← vstart;
newvlen: INT ← vlen;
WHILE newvlen >= 0 DO
IF SubMatch[newpstart, newplen, newvstart, newvlen].match THEN RETURN [match: TRUE];
newvstart ← newvstart + 1;
newvlen ← newvlen - 1;
ENDLOOP;
};
RETURN [match: FALSE];
};
IF vlen = 0 THEN RETURN [match: FALSE];
at this point demand an exact match in both strings
vchar ← Ascii.Upper[value.Fetch[vstart]];
IF pchar # vchar THEN RETURN [match: FALSE, nothingGreater: vchar > pchar];
pstart ← pstart + 1;
plen ← plen - 1;
vstart ← vstart + 1;
vlen ← vlen - 1;
ENDLOOP;
IF vlen = 0 THEN RETURN [match: TRUE]
ELSE RETURN [match: FALSE, nothingGreater: TRUE];
};
[match, nothingGreater] ← SubMatch [0, pattern.Length[], 0, value.Length[]];
};
finderCache: RECORD[ -- cache last RE finder to improve performance
pattern: ROPE,
finder: RegularExpression.Finder
];
RE: PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
The pattern is taken to be a regular expression as defined in RegularExpressionDoc.tioga.
finder: RegularExpression.Finder;
purePrefix: ROPE;
i: INT;
IF Rope.Equal[s1: pattern, s2: finderCache.pattern, case: FALSE] THEN -- get from cache
finder ← finderCache.finder
ELSE {
finder ← RegularExpression.CreateFromRope[pattern: pattern, ignoreCase: TRUE, addBounds: TRUE];
finderCache.pattern ← pattern;
finderCache.finder ← finder;
};
match ← RegularExpression.SearchRope[finder, value].found;
IF NOT match THEN { -- see if something greater might match
i ← Rope.SkipTo[s: pattern, skip: "\'#[^$*+(\\<{!"]; -- look for special chars
IF NOT i = Rope.Length[pattern] AND Rope.InlineFetch[pattern, i] = '* THEN -- could be zero of previous char
i ← i-1;
purePrefix ← Rope.Substr[base: pattern, len: i];
IF Rope.Compare[s1: Rope.Substr[value, 0, Rope.Length[purePrefix]], s2: purePrefix, case: FALSE] = greater THEN
nothingGreater ← TRUE;
};
};
Soundex: PUBLIC FilterProc = {
[value: ROPE, pattern: ROPE] RETURNS [match: BOOLEAN, nothingGreater: BOOLEAN ← FALSE]
Compares the value and pattern based on their Soundex codes. The Soundex encoding tends to group together variants of the same name; for instance, Johnson, Jansen, and Johansen have identical Soundex codes.
ch: CHAR ← Ascii.Upper[Rope.InlineFetch[pattern,0]];
as an optimization, only compute codes if first characters match
SELECT Ascii.Upper[Rope.InlineFetch[value,0]] FROM
= ch => match ← Rope.Equal[Soundx.NameToCode[value], Soundx.NameToCode[pattern]];
< ch => match ← FALSE;
> ch => {match ← FALSE; nothingGreater ← TRUE};
ENDCASE => ERROR;
};
Miscellaneous
GetAttributeValue: PROC [entry: LoganBerry.Entry, type: LoganBerry.AttributeType] RETURNS [LoganBerry.AttributeValue] ~ {
FOR e: LoganBerry.Entry ← entry, e.rest WHILE e # NIL DO
IF e.first.type = type THEN
RETURN[e.first.value];
ENDLOOP;
RETURN[NIL];
};
END.
Doug Terry November 7, 1985 3:11:59 pm PST
created
Doug Terry, November 7, 1985 6:34:25 pm PST
changes to: DIRECTORY, ILQueryImpl, EXPORTS, ~, NextEntry, EndGenerate, SimpleRetrieve, SimpleDestroy, GenerateEntries, FilterEntries, MergeEntries, Equal, SimpleRetrieve, FilterClass, FilterCursorRec, FilterRetrieve, FilterDestroy, FilterEntries, FilterCursor, FilterCursorRec, Soundex, MergerClass, MergerRetrieve, MergerDestroy, Prefix, Wildcard, RE
Doug Terry, November 8, 1985 3:07:48 pm PST
changes to: Wildcard, SubMatch (local of Wildcard), RE, Soundex
Doug Terry, November 8, 1985 7:36:34 pm PST
changes to: Wildcard, SubMatch (local of Wildcard), RE, Soundex, MergerCursorRec, MergerRetrieve, DIRECTORY, IMPORTS, Equal, Prefix, NextEntry, EndGenerate, GenerateEntries, FilterEntries, MergeEntries, FilterRetrieve
Doug Terry, November 26, 1985 6:15:38 pm PST
changes to: DIRECTORY, LoganQueryImpl, IMPORTS, EXPORTS, ~, NextEntry, SimpleRetrieve, SimpleDestroy, GenerateEntries, FilterCursorRec, FilterRetrieve, FilterEntries, MergerCursorRec, MergerRetrieve, MergeEntries, GetAttributeValue, FilterDestroy, MergerDestroy
Doug Terry, December 17, 1985 11:15:09 am PST
changes to: FilterRetrieve, finderCache, RE
Doug Terry, December 23, 1985 1:58:05 pm PST
changes to: SubMatch (local of Wildcard), finderCache
Doug Terry, January 23, 1986 8:38:31 pm PST
changes to: SimpleRetrieve, SimpleDestroy, GenerateEntries
Doug Terry, February 12, 1986 11:06:34 am PST
changes to: SubMatch (local of Wildcard)