DIRECTORY Ascii, RefText, Rope, Soundex; SoundexImpl: CEDAR PROGRAM IMPORTS Ascii, RefText, Rope EXPORTS Soundex ~ BEGIN OPEN Soundex; ROPE: TYPE ~ Rope.ROPE; codeSize: CARDINAL = 8; -- the above algorithm assumes a codeSize of 4 NameToCode: PUBLIC PROC [name: ROPE] RETURNS [SoundexCode] ~ { LetterToDigit: Rope.TranslatorType = { new _ SELECT Ascii.Lower[old] FROM 'a, 'e, 'h, 'i, 'o, 'u, 'w, 'y => '0, 'b, 'f, 'p, 'v => '1, 'c, 'g, 'j, 'k, 'q, 's, 'x, 'z => '2, 'd, 't => '3, 'l => '4, 'm, 'n => '5, 'r => '6, ENDCASE => '7; }; codename: ROPE; scode: REF TEXT _ RefText.ObtainScratch[codeSize]; prevchar, c: CHAR; IF name = NIL THEN RETURN [NIL]; codename _ Rope.Translate[base: name, start: 1, translator: LetterToDigit]; scode _ RefText.InlineAppendChar[scode, Ascii.Upper[Rope.InlineFetch[name,0]]]; prevchar _ LetterToDigit[Rope.InlineFetch[name,0]]; FOR i: INT IN [0..Rope.InlineLength[codename]) DO c _ Rope.InlineFetch[codename, i]; IF c # prevchar AND c # '0 THEN { scode _ RefText.InlineAppendChar[scode, c]; IF RefText.Length[scode] = codeSize THEN EXIT; }; prevchar _ c; ENDLOOP; FOR i: INT IN [RefText.Length[scode]..codeSize) DO scode _ RefText.InlineAppendChar[scode, '0]; ENDLOOP; codename _ Rope.FromRefText[scode]; RefText.ReleaseScratch[scode]; RETURN[codename]; }; END. $SoundexImpl.mesa Copyright c 1985 by Xerox Corporation. All rights reserved. Doug Terry, November 26, 1985 4:39:52 pm PST The Soundex algorithm presented in Knuth's "The Art of Computer Programming", Vol. 3, page 392 is as follows: 1. Retain the first letter of the name, and drop all occurrences of a, e, h, i, o, u, w, y in other positions. 2. Assign the following numbers to the remaining letters after the first: b, f, p, v --> 1 c, g, j, k, q, s, x, z --> 2 d, t --> 3 l --> 4 m, n --> 5 r --> 6 3. If two or more letters with the same code were adjacent in the original name (before step 1), omit all but the first. 4. Convert to the form "letter, digit, digit, digit" by adding trailing zeros (if there are less than three digits), or by dropping rightmost digits (if there are more than three). [old: CHAR] RETURNS [new: CHAR] Doug Terry, October 18, 1985 1:46:04 pm PDT created. Doug Terry, November 8, 1985 6:50:06 pm PST changes to: NameToCode, DIRECTORY, EXPORTS, ~ Doug Terry, November 26, 1985 4:39:52 pm PST changes to: ~ Κέ˜codešœ™Kšœ Οmœ1™•StartOfExpansion# -- [old: CHAR] RETURNS [new: CHAR]šΟb œ˜&KšΠck™šœžœž˜"Kšœ%˜%Kšœ˜Kšœ%˜%Kšœ ˜ Kšœ ˜ Kšœ ˜ Kšœ ˜ Kšžœ˜—K˜—Kšœ žœ˜Kšœžœžœ#˜2Kšœ žœ˜Kš žœžœžœžœžœ˜ KšœK˜KKšœO˜OKšœ3˜3šžœžœžœ"ž˜1K˜"šžœžœžœ˜!Kšœ+˜+Kšžœ"žœžœ˜.K˜—Kšœ ˜ Kšžœ˜—šžœžœžœ#ž˜2Kšœ,˜,Kšžœ˜—Kšœ#˜#Kšœ˜Kšžœ ˜K˜—K˜—Kšžœ˜™+Kšœ™—™+Kšœ Οr!™-—K™™,Kšœ ₯™ —K™—…—