DIRECTORY Rope USING [ROPE, Size, Fetch, FromRefText], SpellingCorrection; SpellingCorrectionImpl: CEDAR MONITOR IMPORTS Rope EXPORTS SpellingCorrection = BEGIN ROPE: TYPE = Rope.ROPE; BruteForceCorrection: PUBLIC PROC [word: ROPE, f: PROC [REF TEXT] RETURNS [BOOL], buffer: REF TEXT] RETURNS [corrections: LIST OF ROPE _ NIL, newBuffer: REF TEXT] = { s: CARDINAL _ word.Size[]; w: REF TEXT; c, t: CHAR; Test: PROC [w: REF TEXT] = INLINE { IF f[w] THEN corrections _ CONS[Rope.FromRefText[w], corrections]; }; IF buffer.maxLength < s+1 THEN buffer _ NEW[TEXT[s+1]]; w _ buffer; w.length _ s; FOR i: CARDINAL IN [0..s) DO w[i] _ Rope.Fetch[word, i]; ENDLOOP; FOR i: CARDINAL IN [0..s-1) DO t _ w[i]; w[i] _ w[i+1]; w[i+1] _ t; Test[w]; w[i+1] _ w[i]; w[i] _ t; ENDLOOP; c _ w[s-1]; w.length _ s - 1; FOR i: CARDINAL DECREASING IN [0..s-1) DO Test[w]; t _ c; c _ w[i]; w[i] _ t; ENDLOOP; Test[w]; w.length _ s; FOR i: CARDINAL DECREASING IN [0..s-1) DO w[i+1] _ w[i]; ENDLOOP; w[0] _ c; w.length _ s + 1; FOR i: CARDINAL DECREASING IN [0..s] DO IF i < s THEN w[i+1] _ w[i]; IF i = s THEN c _ w[s-1] ELSE c _ w[i]; IF c IN ['A..'Z] THEN FOR x: CHAR IN ['A..'Z] DO w[i] _ x; Test[w]; ENDLOOP ELSE FOR x: CHAR IN ['a..'z] DO w[i] _ x; Test[w]; ENDLOOP; ENDLOOP; FOR i: CARDINAL IN [0..s) DO w[i] _ w[i+1]; ENDLOOP; w.length _ s; FOR i: CARDINAL IN [0..s) DO c _ w[i]; IF c IN ['A..'Z] THEN FOR x: CHAR IN ['A..'Z] DO IF x # c THEN { w[i] _ x; Test[w]; }; ENDLOOP ELSE FOR x: CHAR IN ['a..'z] DO IF x # c THEN { w[i] _ x; Test[w]; }; ENDLOOP; w[i] _ c; ENDLOOP; newBuffer _ buffer; }; END. CHANGE LOG Created by Nix on October 7, 1983 2:22 pm ΄File: SpellingCorrectionImpl.mesa Last Edited by: Nix, October 8, 1983 5:03 pm Generates all words close to "word" and returns a list of those for which f returns TRUE. The generation of close words proceeds in the usual brute-force manner: transpositions of adjacent letters in the word are considered first, then deletions of single letters, then insertions of single letters from 'a to 'z, and finally replacement of single letters by one of the other twenty five letters between 'a and 'z. The corrections proposed try to mimic the same capitalization style of the root word. Tests to see if a correction is in the word list and adds it to the correction list if it is. Try transpositions. Transpose each of the adjacent characters in the word and see if any of those are in the word list. Try deletions. Delete each of the characters in the word and see if the resulting word is in the word list. Try insertions. Insert a character between 'a and 'z in each of the s+1 possible locations in the word and see if the resulting word is in the dictionary. Try replacements. Replace each of the letters in the word with one of the other 25 letters and see if that word is in the dictionary. Κ˜– "Cedar" stylešœ!™!J™,—unitšΟk ˜ Icodešœœœ˜,L˜—KšΟbœœ˜%šœ˜Lšœ˜—š˜Lšœ˜—Lš˜Lšœœœ˜J˜šžœ˜Jšœœœœœœœœ œœ˜NJšœœœœœ œœ˜BJ™χJšœœ˜Jšœœœ˜ Jšœœ˜ J˜š Οnœœœœœ˜#Jšœ]™]šœ˜ Jšœœ#˜5—J˜J˜—šœ˜Jšœ œœ˜—J˜ J˜ šœœœ˜J˜Jšœ˜—˜Jšœx™x—šœœœ ˜J˜%Jšœ˜J˜Jšœ˜J˜Jšœl™l—Jšœ ˜ J˜š œœ œœ ˜)Jšœ˜J˜Jšœ˜—Jšœ˜J˜ š œœ œœ ˜)J˜Jšœ˜—J˜ ˜Jšœ›™›—J˜š œœ œœ˜'Jšœœ˜Jšœœ œ ˜'šœœ ˜šœœœ ˜J˜ Jšœ˜Jš˜——šœ˜šœœœ ˜J˜ Jšœ˜Jšœ˜——Jšœ˜—šœœœ˜J˜Jšœ˜—J˜ ˜Jšœ†™†—šœœœœ˜J˜ šœœ ˜šœœœ ˜šœœ˜J˜ Jšœ˜J˜—Jš˜——š˜šœœœ ˜šœœ˜J˜ Jšœ˜J˜—Jšœ˜——J˜ Jšœ˜—J˜J˜J˜—šœ˜Kšœ˜ Kšœ)˜)——…—x.