File: SpellingCorrectionImpl.mesa
Last Edited by: Nix, October 8, 1983 5:03 pm
DIRECTORY
Rope USING [ROPE, Size, Fetch, FromRefText],
SpellingCorrection;
SpellingCorrectionImpl: CEDAR MONITOR
IMPORTS
Rope
EXPORTS
SpellingCorrection =
BEGIN
ROPE: TYPE = Rope.ROPE;
BruteForceCorrection:
PUBLIC PROC [word: ROPE, f: PROC [REF TEXT] RETURNS [BOOL], buffer: REF TEXT]
RETURNS [corrections: LIST OF ROPENIL, newBuffer: REF TEXT] = {
Generates all words close to "word" and returns a list of those for which f returns TRUE. The generation of close words proceeds in the usual brute-force manner: transpositions of adjacent letters in the word are considered first, then deletions of single letters, then insertions of single letters from 'a to 'z, and finally replacement of single letters by one of the other twenty five letters between 'a and 'z. The corrections proposed try to mimic the same capitalization style of the root word.
s: CARDINAL ← word.Size[];
w: REF TEXT;
c, t: CHAR;
Test: PROC [w: REF TEXT] = INLINE {
Tests to see if a correction is in the word list and adds it to the correction list if it is.
IF f[w] THEN
corrections ← CONS[Rope.FromRefText[w], corrections];
};
IF buffer.maxLength < s THEN
buffer ← NEW[TEXT[s+1]];
w ← buffer;
w.length ← s;
FOR i: CARDINAL IN [0..s) DO
w[i] ← Rope.Fetch[word, i];
ENDLOOP;
Try transpositions. Transpose each of the adjacent characters in the word and see if any of those are in the word list.
FOR i: CARDINAL IN [0..s-1) DO
t ← w[i]; w[i] ← w[i+1]; w[i+1] ← t;
Test[w];
w[i+1] ← w[i]; w[i] ← t;
ENDLOOP;
Try deletions. Delete each of the characters in the word and see if the resulting word is in the word list.
c ← w[s-1];
w.length ← s - 1;
FOR i: CARDINAL DECREASING IN [0..s-1) DO
Test[w];
t ← c; c ← w[i]; w[i] ← t;
ENDLOOP;
Test[w];
w.length ← s;
FOR i: CARDINAL DECREASING IN [0..s-1) DO
w[i+1] ← w[i];
ENDLOOP;
w[0] ← c;
Try insertions. Insert a character between 'a and 'z in each of the s+1 possible locations in the word and see if the resulting word is in the dictionary.
w.length ← s + 1;
FOR i: CARDINAL DECREASING IN [0..s] DO
IF i < s THEN w[i+1] ← w[i];
IF i = s THEN c ← w[s-1] ELSE c ← w[i];
IF c IN ['A..'Z] THEN
FOR x: CHAR IN ['A..'Z] DO
w[i] ← x;
Test[w];
ENDLOOP
ELSE
FOR x: CHAR IN ['a..'z] DO
w[i] ← x;
Test[w];
ENDLOOP;
ENDLOOP;
FOR i: CARDINAL IN [0..s) DO
w[i] ← w[i+1];
ENDLOOP;
w.length ← s;
Try replacements. Replace each of the letters in the word with one of the other 25 letters and see if that word is in the dictionary.
FOR i: CARDINAL IN [0..s) DO
c ← w[i];
IF c IN ['A..'Z] THEN
FOR x: CHAR IN ['A..'Z] DO
IF x # c THEN {
w[i] ← x;
Test[w];
};
ENDLOOP
ELSE
FOR x: CHAR IN ['a..'z] DO
IF x # c THEN {
w[i] ← x;
Test[w];
};
ENDLOOP;
w[i] ← c;
ENDLOOP;
newBuffer ← buffer;
};
END.
CHANGE LOG
Created by Nix on October 7, 1983 2:22 pm