DIRECTORY
  SpellingWordMap,
  Rope USING [ROPE, Size],
  RopeReader USING [Create, SetPosition, Get, Ref, Backwards];

SpellingWordMapImpl: CEDAR MONITOR
  IMPORTS Rope, RopeReader
  EXPORTS SpellingWordMap =
BEGIN

ROPE: TYPE = Rope.ROPE;
reader: RopeReader.Ref ¬ RopeReader.Create[];  -- To make word parsing efficient.

MapWordsInRope: PUBLIC ENTRY PROC [
    words: ROPE,
    alphabetic: SpellingWordMap.CharSet,
    buffer: REF TEXT,
    f: PROC [REF TEXT, REF ANY, INT] RETURNS [BOOL],
    data: REF ANY,
    offset: INT]
  RETURNS [premature: BOOLEAN ¬ FALSE, wordStart, wordEnd: INT ¬ 0, newBuffer: REF TEXT] = {
  HandleExpansion: PROC [] = INLINE {
    IF wp >= buffer.maxLength THEN {
      newBuffer ¬ NEW[TEXT[2*buffer.maxLength + 1]];
      newBuffer.length ¬ newBuffer.maxLength;
      FOR j: NAT IN [0..wp) DO
        newBuffer[j] ¬ buffer[j];
      ENDLOOP;
      buffer ¬ newBuffer;
    };
  };
  ParseState: TYPE = {InWord, InJunk, Embeddable};
  state: ParseState ¬ InJunk;
  wp: NAT ¬ 0;
  size: INT ¬ words.Size[];
  c, e: CHAR ¬ 0C;
  newBuffer ¬ buffer;
  buffer.length ¬ buffer.maxLength;
  RopeReader.SetPosition[reader, words, 0];
  FOR pos: INT IN [0..size) DO
    c ¬ reader.Get[];
    IF alphabetic[c] THEN {
      IF state = InJunk THEN {
        wordStart ¬ pos;
        wp ¬ 0;
      }
      ELSE IF state = Embeddable THEN {
        HandleExpansion[];
        buffer[wp] ¬ e;
        wp ¬ wp + 1;
      };
      state ¬ InWord;
      HandleExpansion[];
      buffer[wp] ¬ c;
      wp ¬ wp + 1;
    }
    ELSE {
      IF state = InWord THEN {
        IF c = '' THEN {
          state ¬ Embeddable;
          e ¬ c;
        }
        ELSE {
          buffer.length ¬ wp;
          wordEnd ¬ pos;
          premature ¬ f[buffer, data, wordStart+offset];
          IF premature THEN RETURN;
          state ¬ InJunk;
        };
      }
      ELSE IF state = Embeddable THEN {
        buffer.length ¬ wp;
        wordEnd ¬ pos-1;
        premature ¬ f[buffer, data, wordStart+offset];
        IF premature THEN RETURN;
        state ¬ InJunk;
      };
    };
  ENDLOOP;
  IF state = InWord THEN {
    buffer.length ¬ wp;
    wordEnd ¬ size;
    premature ¬ f[buffer, data, wordStart+offset];
  }
  ELSE IF state = Embeddable THEN {
    buffer.length ¬ wp;
    wordEnd ¬ size-1;
    premature ¬ f[buffer, data, wordStart+offset];
  };
};

MapWordsInRopeBackward: PUBLIC ENTRY PROC [
    words: ROPE,
    alphabetic: SpellingWordMap.CharSet,
    buffer: REF TEXT,
    f: PROC [REF TEXT, REF ANY, INT] RETURNS [BOOL],
    data: REF ANY,
    offset: INT]
  RETURNS [premature: BOOLEAN ¬ FALSE, wordStart, wordEnd: INT ¬ 0, newBuffer: REF TEXT] = {
  HandleExpansion: PROC [] = INLINE {
    IF wp >= buffer.maxLength THEN {
      newBuffer ¬ NEW[TEXT[2*buffer.maxLength + 1]];
      newBuffer.length ¬ newBuffer.maxLength;
      FOR j: NAT IN [0..wp) DO
        newBuffer[j] ¬ buffer[j];
      ENDLOOP;
      buffer ¬ newBuffer;
    };
  };
  ReverseWord: PROC [] = INLINE {
    j: NAT ¬ buffer.length;
    c: CHAR;
    FOR i: NAT IN [0..j/2) DO
      j ¬ j - 1;
      c ¬ buffer[i];
      buffer[i] ¬ buffer[j];
      buffer[j] ¬ c;
    ENDLOOP;
  };
  ParseState: TYPE = {InWord, InJunk, Embeddable};
  state: ParseState ¬ InJunk;
  wp: NAT ¬ 0;
  size: INT ¬ words.Size[];
  c, e: CHAR ¬ 0C;
  newBuffer ¬ buffer;
  buffer.length ¬ buffer.maxLength;
  RopeReader.SetPosition[reader, words, words.Size[]];
  FOR pos: INT DECREASING IN [0..size) DO
    c ¬ reader.Backwards[];
    IF alphabetic[c] THEN {
      IF state = InJunk THEN {
        wordEnd ¬ pos+1;
        wp ¬ 0;
      }
      ELSE IF state = Embeddable THEN {
        HandleExpansion[];
        buffer[wp] ¬ e;
        wp ¬ wp + 1;
      };
      state ¬ InWord;
      HandleExpansion[];
      buffer[wp] ¬ c;
      wp ¬ wp + 1;
    }
    ELSE {
      IF state = InWord THEN {
        IF c = '' THEN {
          state ¬ Embeddable;
          e ¬ c;
        }
        ELSE {
          buffer.length ¬ wp;
          wordStart ¬ pos+1;
          ReverseWord[];
          premature ¬ f[buffer, data, wordStart+offset];
          IF premature THEN RETURN;
          state ¬ InJunk;
        };
      }
      ELSE IF state = Embeddable THEN {
        buffer.length ¬ wp;
        wordStart ¬ pos+2;
        ReverseWord[];
        premature ¬ f[buffer, data, wordStart+offset];
        IF premature THEN RETURN;
        state ¬ InJunk;
      };
    };
  ENDLOOP;
  IF state = InWord THEN {
    buffer.length ¬ wp;
    wordStart ¬ 0;
    ReverseWord[];
    premature ¬ f[buffer, data, wordStart+offset];
  }
  ELSE IF state = Embeddable THEN {
    buffer.length ¬ wp;
    wordStart ¬ 1;
    ReverseWord[];
    premature ¬ f[buffer, data, wordStart+offset];
  };
};

END.

CHANGE LOG
Created by Nix on May 1, 1987 10:41:54 am PDT, END

SpellingWordMapImpl.mesa
Copyright © 1987, 1990, 1992 by Xerox Corporation.
All rights reserved.
Last Edited by: Nix, October 24, 1983 10:47 am
Last tweaked by Mike Spreitzer on February 17, 1989 4:25:31 pm PST
Tim Diebert: January 26, 1987 4:53:12 pm PST
JKF, May 29, 1990 7:53:57 am PDT

Maps the function f over each of the words in the ROPE words, stopping either when f returns TRUE or when all of the words are exhausted. A word is defined to be the longest non-empty sequence of ['A..'Z]+['a..'z]+ + , with embedded apostrophes allowed: "abcd" is a word, "ab'cd" is a word, but "abcd'" is not a word. The accumulation and testing of words is carried out in the buffer, which is grown to accommodate long words, and which is returned (possibly longer) when the mapping is complete.

On return, premature is TRUE iff the function f returned TRUE on some word. If premature is TRUE, then wordStart and wordEnd are set to the starting index and stopping index of the word for which f returned TRUE. If premature is FALSE, then these values denote nothing of interest.

Last tweaked by Mike Spreitzer on May 1, 1987 10:41:27 am PDT
Made accents (chars 005 and 013) alphabetic.
changes to: embeds, MapWordsInRope, ParseState, MapWordsInRopeBackward, ParseState, Start, Start
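
The word definition described in the comments (alphabetic runs, with an apostrophe kept only when more alphabetic characters follow it) can be illustrated with a small state machine. The sketch below is in Python rather than Cedar so that it can be run directly. It is an approximation of the forward scan only, not the module's code: the names map_words and is_alpha are hypothetical, a Python list stands in for the growable REF TEXT buffer, and the offset parameter and the early exit on premature are omitted.

```python
from typing import Callable, Iterator, List, Tuple

IN_JUNK, IN_WORD, EMBEDDABLE = range(3)  # mirrors ParseState


def map_words(text: str,
              is_alpha: Callable[[str], bool]) -> Iterator[Tuple[str, int, int]]:
    """Yield (word, start, end) for each word; end is one past the last char."""
    state = IN_JUNK
    chars: List[str] = []  # stands in for the REF TEXT buffer (assumption)
    start = 0
    for pos, c in enumerate(text):
        if is_alpha(c):
            if state == IN_JUNK:
                # A new word begins here.
                start = pos
                chars = []
            elif state == EMBEDDABLE:
                # The pending apostrophe was embedded after all; keep it.
                chars.append("'")
            state = IN_WORD
            chars.append(c)
        elif state == IN_WORD:
            if c == "'":
                # Might be embedded; decide when the next character arrives.
                state = EMBEDDABLE
            else:
                yield "".join(chars), start, pos
                state = IN_JUNK
        elif state == EMBEDDABLE:
            # The apostrophe was trailing, so the word ended before it.
            yield "".join(chars), start, pos - 1
            state = IN_JUNK
    # Flush a word that runs to the end of the text.
    if state == IN_WORD:
        yield "".join(chars), start, len(text)
    elif state == EMBEDDABLE:
        yield "".join(chars), start, len(text) - 1
```

For instance, list(map_words("ab'cd abcd' x", str.isalpha)) yields the three words "ab'cd", "abcd" and "x": the apostrophe in "ab'cd" is kept because letters follow it, while the trailing apostrophe in "abcd'" is dropped and excluded from the reported end index.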