[Indigo]<Arpanet>MT>Lex822.mesa!2

Lex822.mesa, Lexical analyzer for RFC 822.

David Nichols, July 7, 1983 1:56 pm

Last Edited by: Taft, February 5, 1984 2:32:04 pm PST

DIRECTORY

Rope USING [ROPE],

IO USING [STREAM];

Lex822: CEDAR DEFINITIONS =

BEGIN

Types

TokenType: TYPE = {atomTok, commentTok, domainLiteralTok, EOLTok, errorTok, quotedStringTok, specialTok, whiteSpaceTok};

FieldNameType: TYPE = {nameField, dateField, textField};

ErrorCode: TYPE = {badCharacter, badFolding, badNesting};

Procedures

LexToken: PROC [s: IO.STREAM] RETURNS [token, whiteSpace: Rope.ROPE, tokType: TokenType];

Get a normal token from s and return the text of the token, the white space and comments following the token, and the type of token that was found. Since the white space is returned separately, tokType will never be whiteSpaceTok or commentTok.

LexFieldName: PROC [s: IO.STREAM] RETURNS [fieldName: Rope.ROPE, fieldNameOk: BOOL];

Get a token consisting of the name of a header field.

LexText: PROC [s: IO.STREAM] RETURNS [text: Rope.ROPE];

Return the contents of a field as text, i.e., the text that follows, up to a newline that is not followed by white space.
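The stopping rule for LexText can be modelled outside Cedar. The following is only a Python sketch, not the Cedar implementation: the name lex_text is hypothetical, and it assumes that line breaks inside a folded field are kept verbatim and that the terminating newline is consumed and dropped, neither of which this interface states.

```python
import io

def lex_text(stream):
    """Model of LexText: read up to a newline not followed by white space.

    Assumptions (not stated in the interface): folded line breaks are kept
    verbatim, and the final newline is consumed but not returned.
    """
    lines = []
    while True:
        line = stream.readline()
        if line == "":            # end of stream
            break
        lines.append(line)
        pos = stream.tell()       # peek at the next character
        nxt = stream.read(1)
        stream.seek(pos)
        if nxt not in (" ", "\t"):
            break                 # next line does not continue this field
    return "".join(lines).rstrip("\n")

# A field body folded over two lines, followed by the next header field.
s = io.StringIO(" first line\n\tcontinued\nTo: someone\n")
body = lex_text(s)
rest = s.read()
```

Here body holds both physical lines of the folded field, and rest holds the next header line untouched.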

CharProc: TYPE = PROC [char: CHAR] RETURNS [quit: BOOL ← FALSE, include: BOOL ← TRUE];

GetSequence: PROC [stream: IO.STREAM, charProc: CharProc] RETURNS [value: Rope.ROPE];

Reads characters from stream until EOF or until charProc returns quit = TRUE, including in value those characters for which charProc returns include = TRUE. Raises IO.EndOfStream only if end of stream occurs while reading the first included character.

This procedure was available in Cedar 4.4 and earlier but was eliminated in Cedar 5. Unfortunately there seems to be no straightforward way to obtain the exact semantics of GetSequence in Cedar 5. The implementation is a modification of the one in Cedar 4.4.
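The semantics described for GetSequence can be illustrated with a small model. This is a Python sketch rather than the Cedar code: get_sequence, EndOfStream and digits are hypothetical stand-ins, and it assumes that the character on which charProc quits is consumed from the stream and is still included if include is TRUE, points the description above leaves open.

```python
import io

class EndOfStream(Exception):
    """Stand-in for IO.EndOfStream."""

def get_sequence(stream, char_proc):
    """Model of GetSequence; char_proc(ch) returns (quit, include)."""
    included = []
    while True:
        ch = stream.read(1)
        if ch == "":
            # End of stream: an error only if nothing has been included yet.
            if not included:
                raise EndOfStream()
            break
        quit_, include = char_proc(ch)
        if include:
            included.append(ch)
        if quit_:                 # assumption: the quitting character is consumed
            break
    return "".join(included)

def digits(ch):
    """Example CharProc: include digits, quit on the first non-digit."""
    is_digit = ch.isdigit()
    return (not is_digit, is_digit)

year = get_sequence(io.StringIO("1984 PST"), digits)
```

With this char_proc, year is the leading run of digits; calling get_sequence on an empty stream raises EndOfStream because no character was ever included.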

END.