DIRECTORY Ascii, RefText, Rope, Soundex; SoundexImpl: CEDAR PROGRAM IMPORTS Ascii, RefText, Rope EXPORTS Soundex ~ BEGIN OPEN Soundex; ROPE: TYPE ~ Rope.ROPE; codeSize: CARDINAL = 8; -- the above algorithm assumes a codeSize of 4 NameToCode: PUBLIC PROC [name: ROPE] RETURNS [SoundexCode] ~ { LetterToDigit: Rope.TranslatorType = { new: CHAR ¬ SELECT Ascii.Lower[old] FROM 'a, 'e, 'h, 'i, 'o, 'u, 'w, 'y => '0, 'b, 'f, 'p, 'v => '1, 'c, 'g, 'j, 'k, 'q, 's, 'x, 'z => '2, 'd, 't => '3, 'l => '4, 'm, 'n => '5, 'r => '6, ENDCASE => '7; RETURN[new]; -- must explicitly return new here }; codename: ROPE; scode: REF TEXT ¬ RefText.ObtainScratch[codeSize]; prevchar, c: CHAR; IF name = NIL THEN RETURN [NIL]; codename ¬ Rope.Translate[base: name, start: 1, translator: LetterToDigit]; scode ¬ RefText.InlineAppendChar[scode, Ascii.Upper[Rope.Fetch[name,0]]]; prevchar ¬ LetterToDigit[Rope.Fetch[name,0]]; FOR i: INT IN [0..Rope.Length[codename]) DO c ¬ Rope.Fetch[codename, i]; IF c # prevchar AND c # '0 THEN { scode ¬ RefText.InlineAppendChar[scode, c]; IF RefText.Length[scode] = codeSize THEN EXIT; }; prevchar ¬ c; ENDLOOP; FOR i: INT IN [RefText.Length[scode]..codeSize) DO scode ¬ RefText.InlineAppendChar[scode, '0]; ENDLOOP; codename ¬ Rope.FromRefText[scode]; RefText.ReleaseScratch[scode]; RETURN[codename]; }; END. ͺSoundexImpl.mesa Copyright Σ 1985, 1992 by Xerox Corporation. All rights reserved. Doug Terry, November 26, 1985 4:39:52 pm PST Willie-s, April 27, 1992 11:40 am PDT The Soundex algorithm presented in Knuth's "The Art of Computer Programming", Vol. 3, page 392 is as follows: 1. Retain the first letter of the name, and drop all occurrences of a, e, h, i, o, u, w, y in other positions. 2. Assign the following numbers to the remaining letters after the first: b, f, p, v --> 1 c, g, j, k, q, s, x, z --> 2 d, t --> 3 l --> 4 m, n --> 5 r --> 6 3. If two or more letters with the same code were adjacent in the original name (before step 1), omit all but the first. 4. Convert to the form "letter, digit, digit, digit" by adding trailing zeros (if there are less than three digits), or by dropping rightmost digits (if there are more than three). [old: CHAR] RETURNS [new: CHAR] -- old Cedar7.0 impl [old: CHAR] RETURNS [CHAR] -- note "new" not declared as return type Doug Terry, October 18, 1985 1:46:04 pm PDT created. Doug Terry, November 8, 1985 6:50:06 pm PST changes to: NameToCode, DIRECTORY, EXPORTS, ~ Doug Terry, November 26, 1985 4:39:52 pm PST changes to: ~ Κ–(cedarcode) style•NewlineDelimiter ˜codešœ™Kšœ Οeœ6™BK™,K™%K™—šΟk ˜ K˜K˜Kšœ˜Kšœ˜—K˜KšΠbl œžœž˜Kšžœ˜Kšžœ˜šœž˜Kšžœ ˜ K˜Kšžœžœžœ˜K˜šΟim™mIitemš n™nš I™ILš ™Lš ™Lš  ™ Lš  ™ Lš  ™ Lš  ™ —Lš x™xLš ΄™΄—K˜Kšœ žœΟc.˜GK˜š Οn œžœžœžœžœ˜>•StartOfExpansion# -- [old: CHAR] RETURNS [new: CHAR]šΟb œ˜&KšΠck4™4Kš€œ*™Dšœžœžœž˜(Kšœ%˜%Kšœ˜Kšœ%˜%Kšœ ˜ Kšœ ˜ Kšœ ˜ Kšœ ˜ Kšžœ˜—Kšžœ*˜0K˜—Kšœ žœ˜Kšœžœžœ#˜2Kšœ žœ˜Kš žœžœžœžœžœ˜ K˜KK˜IK˜-šžœžœžœž˜+K˜šžœžœžœ˜!K˜+Kšžœ"žœžœ˜.K˜—K˜ Kšžœ˜—šžœžœžœ#ž˜2K˜,Kšžœ˜—K˜#Kšœ˜Kšžœ ˜K˜—K˜—Kšžœ˜™+Kšœ™—™+Kšœ Οr!™-—K™™,Kšœ ₯™ —K™—…—4 ΰ