-- File: KeyNote.mesa
-- Copyright © 1987 by Xerox Corporation. All rights reserved.
-- Contents:
-- Last edited by:
--   Jack Kent, January 29, 1988 1:50:15 pm PST
DIRECTORY
Rope,
KeyNoteDatabase;
KeyNote: CEDAR DEFINITIONS
= BEGIN
ROPE: TYPE = Rope.ROPE;
FileName: TYPE = Rope.ROPE;
FileNameList: TYPE = LIST OF Rope.ROPE;
-- For every file name that matches the pattern, each word in that file is passed to a WordVerifierProc, which decides whether it counts as a word or should be ignored.
FileNamesToMatch: TYPE = REF FileNamesToMatchObject;
FileNamesToMatchObject: TYPE = RECORD [ pattern: ROPE, wDir: ROPE ← NIL ];
WordVerifierProc: TYPE = PROCEDURE [clientData: REF ANY, word: ROPE] RETURNS [BOOLEAN];
-- The MINIMUM number of times a token must appear within a document to be considered a bona fide keyword, and thus relevant.
TokenRelevanceThreshhold: TYPE = INT;
-- The top StopListCutOff words appearing in the document universe are placed in the stop list. The stop list can then be passed to the WordVerifierProc as clientData.
StopListCutOff: TYPE = INT;
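-- Example (a hedged sketch, not part of this interface): a client-side WordVerifierProc that rejects stop-list words.
-- Assumptions: the stop list built by BuildStopList is passed as clientData, its elements are ROPEs, and returning TRUE means "treat this as a word". IsNotStopWord is a hypothetical name.
--   IsNotStopWord: KeyNote.WordVerifierProc = {
--     stopList: KeyNote.ResultList = NARROW[clientData];
--     FOR l: KeyNote.ResultList ← stopList, l.rest UNTIL l = NIL DO
--       WITH l.first SELECT FROM
--         r: Rope.ROPE => IF Rope.Equal[r, word, FALSE] THEN RETURN [FALSE];
--         ENDCASE => NULL;
--       ENDLOOP;
--     RETURN [TRUE];
--     };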
Handle: TYPE = KeyNoteDatabase.DatabaseHandle;
-- ResultList is a List.LORA for easy sorting on overAllWeight.
ResultList: TYPE = LIST OF Result;
Result: TYPE = REF ANY;
-- Result: TYPE = REF ResultObject;
-- Result: TYPE = ROPE;
ResultObject: TYPE = RECORD [
fileName: ROPE,
overAllWeight: REAL,
tokenInfoList: TokenInfoList];
-- TokenInfoList is a List.LORA for easy sorting on the weight contributed by tokens.
TokenInfoList: TYPE = LIST OF TokenInfo;
TokenInfo: TYPE = REF ANY;
-- TokenInfo: TYPE = REF TokenInfoObject;
TokenInfoObject: TYPE = RECORD [
token: ROPE,
weight: REAL,
frequency: INTEGER];
BuildStopList: PROCEDURE[fileNamesToMatch: FileNamesToMatch, stopListCutOff: StopListCutOff] RETURNS [wordList: ResultList];
-- Used either to open or to build a database: if fileNamesToMatch = NIL, a plain open is presumed.
OpenDatabase: PROCEDURE [databaseName: ROPE, fileNamesToMatch: FileNamesToMatch ← NIL, wordVerifierProc: WordVerifierProc ← NIL, tokenRelevanceThreshhold: TokenRelevanceThreshhold ← 1, clientDataForVerifierProc: REF ANY ← NIL] RETURNS [db: Handle];
CloseDatabase: PROCEDURE [db: Handle];
AddDocumentToIndex: PROCEDURE [db: Handle, fileName: FileName, wordVerifierProc: WordVerifierProc];
RemoveDocumentFromIndex: PROCEDURE [db: Handle, fileName: FileName, wordVerifierProc: WordVerifierProc];
-- Document entries returned are sorted by weight.
FindDocumentsFromWords: PROCEDURE [db: Handle, ropeList: ResultList] RETURNS [resultList: ResultList];
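-- Example (a hedged sketch, not part of this interface): one way a client might query an existing database.
-- Assumptions: the database name and query words are hypothetical, the query list holds ROPEs, and each element of the returned list can be narrowed to REF ResultObject.
--   db: KeyNote.Handle ← KeyNote.OpenDatabase[databaseName: "Sample.keynote"];
--   words: KeyNote.ResultList ← LIST["database", "index"];
--   hits: KeyNote.ResultList ← KeyNote.FindDocumentsFromWords[db, words];
--   FOR l: KeyNote.ResultList ← hits, l.rest UNTIL l = NIL DO
--     WITH l.first SELECT FROM
--       r: REF KeyNote.ResultObject => NULL; (use r.fileName and r.overAllWeight here)
--       ENDCASE => NULL;
--     ENDLOOP;
--   KeyNote.CloseDatabase[db];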
-- Errors raised during the execution of an operation indicate uncommon or abnormal conditions. Clients should be prepared to catch these.
Error: ERROR [ec: ErrorCode, explanation: ROPE ← NIL];
ErrorCode: TYPE = ATOM;
-- $DatabaseProblems: look at KeyNoteDatabase.Error
END.
-- Jack Kent, December 21, 1987 2:20:15 pm PST