File [Erinyes]<LispManual>SpellingCorrection.im!6



{Begin SubSec Spelling Correction}
{Title Spelling Correction}
{Text

{index *PRIMARY* Spelling correction}

{Tag SpellingCorrector}
{Tag SpellingCorrection}
{Tag SpellingLists}


{note 

Edited 7-Jun-83 12:10:08 by Masinter: minor typos, notes primarily. removed a couple of footnotes.

the Spelling Correction section needs to be reorganized.

Possible Organization:
(1) idea of correcting the spelling of a word using a spelling list.
(2) use of spelling correction in DWIM (DWIM spelling lists, etc.)
(3) user entries to spelling correction (FIXSPELL, etc.)
(4) spelling correction algorithm, and format of spelling lists

Note: also need to put in FNDEF and VARDEF format. Too many internal variables are defined in running text rather than as separate headings. }


The spelling corrector is given as arguments a misspelled word (word means literal atom), a spelling list (a list of words), and a number: {arg XWORD}, {arg SPLST}, and {arg REL} respectively.  Its task is to find that word on {arg SPLST} which is closest to {arg XWORD}, in the sense described below.  This word is called a {it respelling} of {arg XWORD}.  {arg REL} specifies the minimum "closeness" between {arg XWORD} and a respelling.  If the spelling corrector cannot find a word on {arg SPLST} closer to {arg XWORD} than {arg REL}, or if it finds two or more words equally close, its value is {lisp NIL}, otherwise its value is the respelling. The spelling corrector can also be given an optional functional argument, {arg FN}, to be used for selecting out a subset of {arg SPLST}, i.e., only those members of {arg SPLST} that satisfy {arg FN} will be considered as possible respellings.

The exact algorithm for computing the spelling metric is described later, but briefly "closeness" is inversely proportional to the number of disagreements between the two words, and directly proportional to the length of the longer word.  For example, {lisp PRTTYPRNT} is "closer" to {lisp PRETTYPRINT} than {lisp CS} is to {lisp CONS} even though both pairs of words have the same number of disagreements.  The spelling corrector operates by proceeding down {arg SPLST}, and computing the closeness between each word and {arg XWORD}, and keeping a list of those that are closest.  Certain differences between words are not counted as disagreements, for example a single transposition, e.g., {lisp CONS} to {lisp CNOS}, or a doubled letter, e.g., {lisp CONS} to {lisp CONSS}, etc.  In the event that the spelling corrector finds a word on {arg SPLST} with {it no} disagreements, it will stop searching and return this word as the respelling.  Otherwise, the spelling corrector continues through the entire spelling list.  Then if it has found one and only one "closest" word, it returns this word as the respelling.  For example, if {arg XWORD} is {lisp VONS}, the spelling corrector will probably return {lisp CONS} as the respelling.  However, if {arg XWORD} is {lisp CONZ}, the spelling corrector will not be able to return a respelling, since {lisp CONZ} is equally close to both {lisp CONS} and {lisp COND}.  If the spelling corrector finds an acceptable respelling, it interacts with the user as described earlier.
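As a rough illustration of the two cases above, the calls below use {fn FIXSPELL}, the user entry to the spelling corrector described later in this section.  The three-word spelling list is an assumption made up for this sketch, and {lisp NO-MESSAGE} is passed as {arg FLG} so that no interaction takes place.  Following the description above, the first call would be expected to return {lisp CONS} and the second {lisp NIL}.

{lispcode
(* Sketch only, with a hypothetical spelling list)
(FIXSPELL (QUOTE VONS) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE))
(FIXSPELL (QUOTE CONZ) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE))}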

In the special case that the misspelled word contains one or more {lisp $}s (escape),{index *PRIMARY* Escape ($) (in spelling correction)}{index *PRIMARY* $ (escape) (in spelling correction)} the spelling corrector searches for those words on {arg SPLST} that match {arg XWORD}, where a {lisp $} can match any number of characters (including 0), e.g., {lisp FOO$} matches {lisp FOO1} and {lisp FOO}, but not {lisp NEWFOO}.  {lisp $FOO$} matches all three.  Both completion and correction may be involved, e.g. {lisp RPETTY$} will match {lisp PRETTYPRINT}, with one mistake.  The entire spelling list is always searched, and if more than one respelling is found, the spelling corrector prints {lisp AMBIGUOUS},{index AMBIGUOUS (printed by DWIM)} and returns {lisp NIL}.  For example, {lisp CON$} would be ambiguous if both {lisp CONS} and {lisp COND} were on the spelling list.  If the spelling corrector finds one and only one respelling, it interacts with the user as described earlier.
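Completion can be tried out with the same kind of call to {fn FIXSPELL} (described later in this section).  In this sketch {lisp $} stands for the escape character, as in the text above, and both spelling lists are invented for the example.  Per the description above, the first call would be expected to complete to {lisp PRETTYPRINT}, while the second would be ambiguous and should return {lisp NIL}.

{lispcode
(* Sketch only, with hypothetical spelling lists)
(FIXSPELL (QUOTE PRETTY$) 70 (QUOTE (PRETTYPRINT PRINT CONS)) (QUOTE NO-MESSAGE))
(FIXSPELL (QUOTE CON$) 70 (QUOTE (CONS COND)) (QUOTE NO-MESSAGE))}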

For both spelling correction and spelling completion, regardless of whether or not the user approves of the spelling corrector's choice, the respelling is moved to the front of {arg SPLST}. Since many respellings are of the type with no disagreements, this procedure has the effect of considerably reducing the time required to correct the spelling of frequently misspelled words.



{Begin SubSec Synonyms}
{Title Synonyms}
{Text

{index *PRIMARY* Synonyms in spelling correction}

Spelling lists also provide a way of defining synonyms for a particular context.  If a dotted pair appears on a spelling list (instead of just an atom), {fn CAR} is interpreted as the correct spelling of the misspelled word, and {fn CDR} as the antecedent for that word.  If {fn CAR} is {it identical} with the misspelled word, the antecedent is returned without any interaction or approval being necessary.  If the misspelled word {it corrects} to {fn CAR} of the dotted pair, the usual interaction and approval will take place, and then the antecedent, i.e., {fn CDR} of the dotted pair, is returned.  For example, the user could make {lisp IFLG} synonymous with {lisp CLISPIFTRANFLG} by adding {lisp (IFLG . CLISPIFTRANFLG)} to {var SPELLINGS3}, the spelling list for unbound atoms.
Similarly, the user could make {lisp OTHERWISE} mean the same as {lisp ELSEIF} by adding {lisp (OTHERWISE . ELSEIF)} to {var CLISPIFWORDSPLST}, or make {lisp L} be synonymous with {lisp LAMBDA} by adding {lisp (L . LAMBDA)} to {var LAMBDASPLST}.  Note that {lisp L} could also be used as a variable without confusion, since the association of {lisp L} with {lisp LAMBDA} occurs only in the appropriate context.
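A synonym entry can be tried out on a private spelling list before it is added to one of the system lists.  The list below is invented for this sketch.  With {arg FLG}={lisp NO-MESSAGE}, {fn FIXSPELL} is documented later in this section as returning a synonym correction as a list of the corrected word and its synonym, so this call would be expected to return {lisp (OTHERWISE ELSEIF)}.

{lispcode
(* Sketch only, where OTHERWIZE is a deliberate misspelling)
(FIXSPELL (QUOTE OTHERWIZE) 70
          (QUOTE ((OTHERWISE . ELSEIF) IF THEN ELSE))
          (QUOTE NO-MESSAGE))}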


}{End SubSec Synonyms}




{Begin SubSec Spelling Lists}
{Title Spelling Lists}
{Text

{index *PRIMARY* Spelling lists}

Any list of atoms can be used as a spelling list, e.g., {var BROKENFNS}, {var FILELST}, etc.  Various system packages have their own spelling lists, e.g., {var LISPXCOMS}, {var CLISPFORWORDSPLST}, {var EDITCOMSA}, etc.  These are documented under their corresponding sections, and are also indexed under "spelling lists."  In addition to these spelling lists, the system maintains, i.e., automatically adds to, and occasionally prunes, four lists used solely for spelling correction: {var SPELLINGS1}, {var SPELLINGS2}, {var SPELLINGS3}, and {var USERWORDS}. These spelling lists are maintained {it only} when {index ADDSPELLFLG Var}{var ADDSPELLFLG} is non-{lisp NIL}. {var ADDSPELLFLG} is initially {lisp T}.


{VarDef {Name SPELLINGS1}
{Text
{var SPELLINGS1} is a list of functions used for spelling correction when an input is typed in apply format, and the function is undefined, e.g., {lisp EDTIF(FOO)}.  {var SPELLINGS1} is initialized to contain {fn DEFINEQ}, {fn BREAK}, {fn MAKEFILE}, {fn EDITF}, {fn TCOMPL}, {fn LOAD}, etc. Whenever {fn LISPX} is given an input in apply format, i.e., a function and arguments, the name of the function is added to {var SPELLINGS1} if the function has a definition.

For example, typing {lisp CALLS(EDITF)} will cause {lisp CALLS} to be added to {var SPELLINGS1}. Thus if the user typed {lisp CALLS(EDITF)} and later typed {lisp CALLLS(EDITV)}, since {var SPELLINGS1} would then contain {lisp CALLS}, DWIM would be successful in correcting {lisp CALLLS} to {lisp CALLS}. 

}}



{VarDef {Name SPELLINGS2}
{Text
{var SPELLINGS2} is a list of functions used for spelling correction for all other undefined functions. It is initialized to contain functions such as {fn ADD1}, {fn APPEND}, {fn COND}, {fn CONS}, {fn GO}, {fn LIST}, {fn NCONC}, {fn PRINT}, {fn PROG}, {fn RETURN}, {fn SETQ}, etc. Whenever {index LISPX FN}{fn LISPX} is given a non-atomic form, the name of the function is added to {var SPELLINGS2}. For example, typing {lisp (RETFROM (STKPOS (QUOTE FOO) 2))} to a break would add {fn RETFROM} to {var SPELLINGS2}. Function names are also added to {index SPELLINGS2 Var}{var SPELLINGS2} by {fn DEFINE}, {fn DEFINEQ}, {fn LOAD} (when loading compiled code), {fn UNSAVEDEF}, {fn EDITF}, and {fn PRETTYPRINT}.
}}



{VarDef {Name SPELLINGS3}
{Text
{var SPELLINGS3} is a list of words used for spelling correction on all unbound atoms. {var SPELLINGS3} is initialized to {var EDITMACROS}, {var BREAKMACROS}, {var BROKENFNS}, and {var ADVISEDFNS}. Whenever {fn LISPX} is given an atom to evaluate, the name of the atom is added to {var SPELLINGS3} if the atom has a value. Atoms are also added to {var SPELLINGS3} whenever they are edited by {fn EDITV}, and whenever they are set via {fn RPAQ} or {fn RPAQQ}. For example, when a file is loaded, all of the variables set in the file are added to {var SPELLINGS3}. Atoms are also added to {var SPELLINGS3} when they are set by a {fn LISPX} input, e.g., typing {lisp (SETQ FOO (REVERSE (SETQ FIE {ellipsis})))} will add both {lisp FOO} and {lisp FIE} to {var SPELLINGS3}.
}}



{VarDef {Name USERWORDS}
{Text
{var USERWORDS} is a list containing both functions and variables that the user has {it referred} to, e.g., by breaking or editing. {var USERWORDS} is used for spelling correction by {fn ARGLIST}, {fn UNSAVEDEF}, {fn PRETTYPRINT}, {fn BREAK}, {fn EDITF}, {fn ADVISE}, etc. {var USERWORDS} is initially {lisp NIL}. Function names are added to it by {fn DEFINE}, {fn DEFINEQ}, {fn LOAD} (when loading compiled code, or loading exprs to property lists), {fn UNSAVEDEF}, {fn EDITF}, {fn EDITV}, {fn EDITP}, {fn PRETTYPRINT}, etc. Variable names are added to {var USERWORDS} at the same time as they are added to {var SPELLINGS3}. In addition, the variable {index *PRIMARY* LASTWORD Var}{var LASTWORD} is always set to the last word added to {index USERWORDS Var}{var USERWORDS}, i.e., the last function or variable referred to by the user, and the respelling of {lisp NIL} is defined to be the value of {index LASTWORD Var}{var LASTWORD}. Thus, if the user has just defined a function, he can then prettyprint it by typing {lisp PP()}.
}}


Each of the above four spelling lists is divided into two sections separated by a special marker (the value of the variable {var SPELLSTR1}{index SPELLSTR1 Var}).  The first section contains the "permanent" words; the second section contains the temporary words.  New words are added to the corresponding spelling list at the front of its temporary section (except that functions added to {index SPELLINGS1 Var}{var SPELLINGS1} or {index SPELLINGS2 Var}{var SPELLINGS2} by {fn LISPX} are always added to the end of the permanent section).
If the word is already in the temporary section, it is moved to the front of that section; if the word is in the permanent section, no action is taken. If the length of the temporary section then exceeds a specified number, the last (oldest) word in the temporary section is forgotten, i.e., deleted. This procedure prevents the spelling lists from becoming cluttered with unimportant words that are no longer being used, and thereby slowing down spelling correction time. Since the spelling corrector usually moves each word selected as a respelling to the front of its spelling list, the word is thereby moved into the permanent section.
Thus once a word is misspelled and corrected, it is considered important and will never be forgotten.

Note:  The spelling correction algorithm will not alter a spelling list unless it contains the special marker (the value of {var SPELLSTR1}).  This provides a way to ensure that a spelling list will not be altered. 
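For example, a private spelling list with both sections might be set up as follows.  This is only a sketch: {lisp MYWORDS} and its contents are hypothetical, and it assumes that {fn ADDSPELL} (described later in this section) is given the list itself together with a maximum temporary-section length of 10.  After these calls the temporary section would be expected to hold {lisp DELTA} and {lisp GAMMA}, in that order.

{lispcode
(* Sketch only, where MYWORDS is a hypothetical list)
(* ALPHA and BETA form the permanent section and SPELLSTR1 marks its end)
(SETQ MYWORDS (LIST (QUOTE ALPHA) (QUOTE BETA) SPELLSTR1))
(* New words go to the front of the temporary section)
(ADDSPELL (QUOTE GAMMA) MYWORDS 10)
(ADDSPELL (QUOTE DELTA) MYWORDS 10)}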

{indexX {Name SPELLINGS1} {Type Variable}
{Text {lisp #SPELLINGS1}} }
{indexX {Name SPELLINGS2} {Type Variable}
{Text {lisp #SPELLINGS2}} }
{indexX {Name SPELLINGS3} {Type Variable}
{Text {lisp #SPELLINGS3}} }
{indexX {Name USERWORDS} {Type Variable}
{Text {lisp #USERWORDS}} }


{VarDef {Name #SPELLINGS1}}
{VarDef {Name #SPELLINGS2}}
{VarDef {Name #SPELLINGS3}}
{VarDef {Name #USERWORDS}
{Text
The maximum lengths of the temporary sections for {var SPELLINGS1}, {var SPELLINGS2}, {var SPELLINGS3} and {var USERWORDS} are given by the values of {var #SPELLINGS1}, {var #SPELLINGS2}, {var #SPELLINGS3}, and {var #USERWORDS}, initialized to 30, 30, 30, and 60 respectively.

Users can alter these values to modify the performance behavior of spelling correction. 

{note There is some heuristic here which I don't know how to describe which says how to muck with these vars to make spelling correction work for you. This used to have how many milliseconds of compute time it took in Interlisp-10 on a KA-10. The numbers are particularly meaningless now.}

}}
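For instance, a user who works with many functions at once might enlarge the temporary section of {var USERWORDS}.  The value 100 is an arbitrary choice for this sketch, not a recommendation.

{lispcode
(* Sketch only, where 100 is an arbitrary value)
(SETQ #USERWORDS 100)}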


}{End SubSec Spelling Lists}




{Begin SubSec Generators for Spelling Correction}
{Title Generators for Spelling Correction}
{Text

{index *PRIMARY* Generators for spelling correction}

For some applications, it is more convenient to {it generate} candidates for a respelling one by one, rather than construct a complete list of all possible candidates, e.g., spelling correction involving a large directory of files, or a natural language  data base.  For these purposes, {arg SPLST} can be an array (of any size).  The first element of this array is the generator function, which is called with the array itself as its argument.  Thus the function can use the remainder of the array to store "state" information, e.g., the last position on a file, a pointer into a data structure, etc.  The value returned by the function is the next candidate for respelling.  If {lisp NIL} is returned, the spelling "list" is considered to be exhausted, and the closest match is returned.  If a candidate is found with no disagreements, it is returned immediately without waiting for the "list" to exhaust.
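A minimal sketch of an array used as a spelling "list" follows.  All names here ({lisp NEXTWORD}, {lisp WORDGEN}) are hypothetical.  Element 1 holds the generator function and element 2 holds its state, which in this toy case is simply the list of words not yet produced; a real application would keep something like a file position there instead.  The final call uses {fn FIXSPELL}, described later in this section.

{lispcode
(DEFINEQ (NEXTWORD (LAMBDA (A)
   (* Hypothetical generator that returns the next word or NIL when done)
   (PROG ((REST (ELT A 2)))
         (SETA A 2 (CDR REST))
         (RETURN (CAR REST))))))

(SETQ WORDGEN (ARRAY 2))
(SETA WORDGEN 1 (QUOTE NEXTWORD))
(SETA WORDGEN 2 (QUOTE (CONS COND CAR)))
(FIXSPELL (QUOTE VONS) 70 WORDGEN (QUOTE NO-MESSAGE))}

Because the generator consumes its state, element 2 would have to be reset before the same array is used for another correction.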

{arg SPLST} can also be a generator, i.e. the value of the function {fn GENERATOR} ({PageRef Fn GENERATOR}). The generator {arg SPLST} will be started up whenever the spelling corrector needs the next candidate, and it should return candidates via the function {fn PRODUCE}. For example, the following could be used as a "spelling list" which effectively contains all functions in the system:


{lispcode
[GENERATOR
   (MAPATOMS (FUNCTION (LAMBDA (X) (if (GETD X) then (PRODUCE X]}


}{End SubSec Generators for Spelling Correction}




{Begin SubSec Spelling Corrector Algorithm}
{Title Spelling Corrector Algorithm}
{Text

{index Spelling correction}

The basic philosophy of DWIM spelling correction is to count the number of disagreements between two words, and use this number divided by the length of the longer of the two words as a measure of their relative disagreement.  One minus this number is then the relative agreement or closeness. For example, {lisp CONS} and {lisp CONX} differ only in their last character. Such substitution errors count as one disagreement, so that the two words are in 75% agreement. Most calls to the spelling corrector specify a relative agreement of 70, so that a single substitution error is permitted in words of four characters or longer.  However, spelling correction on shorter words is possible since certain types of differences such as single transpositions are not counted as disagreements.  For example, {lisp AND} and {lisp NAD} have a relative agreement of 100. Calls to the spelling corrector from {lisp DWIM} use the value of {var FIXSPELLREL}, which is initially 70. Note that by setting {var FIXSPELLREL} to 100, only spelling corrections with "zero" mistakes will be considered, e.g., transpositions, double characters, etc.

The central function of the spelling corrector is {index CHOOZ FN}{fn CHOOZ}.  {fn CHOOZ} takes as arguments: a word, a minimum relative agreement, a spelling list, and an optional functional argument, {arg XWORD}, {arg REL}, {arg SPLST}, and {arg FN} respectively.

{fn CHOOZ} proceeds down {arg SPLST} examining each word. Words not satisfying {arg FN} (if {arg FN} is non-{lisp NIL}), or those obviously too long or too short to be sufficiently close to {arg XWORD} are immediately rejected. For example, if {arg REL}=70, and {arg XWORD} is 5 characters long, words longer than 7 characters will be rejected.
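The arithmetic behind these figures can be sketched as a small function.  {lisp RELAGREE} is a hypothetical name, not part of the system, and it implements only the basic ratio described above (disagreements over the length of the longer word) using integer arithmetic; it ignores the special treatment of transpositions and doubled letters.

{lispcode
(DEFINEQ (RELAGREE (LAMBDA (NDIS LEN1 LEN2)
   (* Hypothetical helper giving percent agreement for NDIS disagreements)
   (IDIFFERENCE 100 (IQUOTIENT (ITIMES 100 NDIS) (IMAX LEN1 LEN2))))))

(* CONS versus CONX has one disagreement in four characters)
(RELAGREE 1 4 4)
(* A seven character word against a five character XWORD)
(RELAGREE 2 5 7)
(* An eight character word against a five character XWORD)
(RELAGREE 3 5 8)}

These evaluate to 75, 72 and 63 (the integer division rounds the last two slightly upward from 71.4 and 62.5).  A candidate three characters longer than a five-character {arg XWORD} must have at least three disagreements, so with {arg REL}=70 it cannot qualify, which is consistent with such words being rejected before they are scored.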

Special treatment is necessary for words shorter than {arg XWORD}, since doubled letters are not counted as disagreements. For example, {lisp CONNSSS} and {lisp CONS} have a relative agreement of 100. {fn CHOOZ} handles this by counting the number of doubled characters in {arg XWORD} before it begins scanning {arg SPLST}, and taking this into account when deciding whether to reject shorter words.

If {arg TWORD}, the current word on {arg SPLST}, is not rejected, {fn CHOOZ} computes the number of disagreements between it and {arg XWORD} by calling a subfunction, {fn SKOR}.

{fn SKOR}{index SKOR FN} operates by scanning both words from left to right one character at a time. {fn SKOR} operates on the list of character codes for each word. This list is computed by {fn CHOOZ} before calling {fn SKOR}. Characters are considered to agree if they are the same characters or appear on the same key (i.e., a shift mistake). The variable {var SPELLCASEARRAY} is a CASEARRAY which is used to determine equivalence classes for this purpose. It is initialized to treat lowercase and uppercase letters as equivalent, as well as the standard key transitions: for example, {lisp 1} with {lisp !}, {lisp 3} with {lisp #}, etc.

If the first character in {arg XWORD} and {arg TWORD} do {it not} agree, {fn SKOR} checks to see if either character is the same as one previously encountered, and not accounted-for at that time. (In other words, transpositions are not handled by lookahead, but by {it lookback}.) A displacement of two or fewer positions is counted as a transposition; a displacement by more than two positions is counted as a disagreement. In either case, both characters are now considered as accounted for and are discarded, and {fn SKOR}ing continues.

If the first character in {arg XWORD} and {arg TWORD} do not agree, and neither agrees with previously unaccounted-for characters, and {arg TWORD} has more characters remaining than {arg XWORD}, {fn SKOR} removes and saves the first character of {arg TWORD}, and continues by comparing the rest of {arg TWORD} with {arg XWORD} as described above. If {arg TWORD} has the same or fewer characters remaining than {arg XWORD}, the procedure is the same except that the character is removed from {arg XWORD}.  In this case, a special check is first made to see if that character is equal to the {it previous} character in {arg XWORD}, or to the {it next} character in {arg XWORD}, i.e., a double character typo, and if so, the character is considered accounted-for, and not counted as a disagreement. In this case, the "length" of {arg XWORD} is also decremented. Otherwise making {arg XWORD} sufficiently long by adding double characters would make it be arbitrarily close to {arg TWORD}, e.g., {lisp XXXXXX} would correct to {lisp PP}.

When {fn SKOR} has finished processing both {arg XWORD} and {arg TWORD} in this fashion, the value of {fn SKOR} is the number of unaccounted-for characters, plus the number of disagreements, plus the number of transpositions, with two qualifications:  (1) if both {arg XWORD} and {arg TWORD} have a character unaccounted-for in the same position, the two characters are counted only once, i.e., substitution errors count as only one disagreement, not two; and (2) if there are no unaccounted-for characters and no disagreements, transpositions are not counted.   This permits spelling correction on very short words, such as edit commands, e.g., {lisp XRT->XTR}. Transpositions are also not counted when {var FASTYPEFLG}={lisp T}, for example, {lisp IPULX} and {lisp IPLUS} will be in 80% agreement with {var FASTYPEFLG}={lisp T}, only 60% with {var FASTYPEFLG}={lisp NIL}.  The rationale behind this is that transpositions are much more common for fast typists, and should not be counted as disagreements, whereas more deliberate typists are not as likely to combine transpositions and other mistakes in a single word, and therefore can use a more conservative metric.  {index FASTYPEFLG Var}{var FASTYPEFLG} is initially {lisp NIL}.
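The effect of {var FASTYPEFLG} on the {lisp IPULX} example can be tried by rebinding the flag around a call to {fn FIXSPELL} (described later in this section).  This sketch assumes {fn RESETVAR} is used for the temporary binding and relies on an invented two-word spelling list.  Given the percentages quoted above and a {arg REL} of 70, the first call would be expected to return {lisp IPLUS} and the second {lisp NIL}.

{lispcode
(* Sketch only, with a hypothetical spelling list)
(RESETVAR FASTYPEFLG T
   (FIXSPELL (QUOTE IPULX) 70 (QUOTE (IPLUS ITIMES)) (QUOTE NO-MESSAGE)))
(RESETVAR FASTYPEFLG NIL
   (FIXSPELL (QUOTE IPULX) 70 (QUOTE (IPLUS ITIMES)) (QUOTE NO-MESSAGE)))}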

}{End SubSec Spelling Corrector Algorithm}



{Begin SubSec Spelling Corrector Functions and Variables}
{Title Spelling Corrector Functions and Variables}
{Text

{FnDef {FnName ADDSPELL} {FnArgs X SPLST N}
{Text
Adds {arg X} to one of the spelling lists as determined by the value of {arg SPLST}:

{Begin LabeledList SPLST arg of ADDSPELL}

{Label {lisp NIL}}
{Text
Adds {arg X} to {index USERWORDS Var}{var USERWORDS} and to {index SPELLINGS2 Var}{var SPELLINGS2}.  Used by {fn DEFINEQ}.
}

{Label {lisp 0}}
{Text
Adds {arg X} to {var USERWORDS}.  Used by {fn LOAD} when loading {lisp EXPR}s to property lists.
}

{Label {lisp 1}}
{Text
Adds {arg X} to {index SPELLINGS1 Var}{var SPELLINGS1} (at end of permanent section).  Used by {fn LISPX}.
}

{Label {lisp 2}}
{Text
Adds {arg X} to {index SPELLINGS2 Var}{var SPELLINGS2} (at end of permanent section).  Used by {fn LISPX}.
}

{Label {lisp 3}}
{Text
Adds {arg X} to {var USERWORDS} and {index SPELLINGS3 Var}{var SPELLINGS3}.
}

{Label a spelling list}
{Text
If {arg SPLST} is a spelling list, {arg X} is added to it.  In this case, {arg N} is the (optional) length of the temporary section.
}

{End LabeledList SPLST arg of ADDSPELL}

If {arg X} is already on the spelling list, and in its temporary section, {fn ADDSPELL} moves {arg X} to the front of that section. 

{fn ADDSPELL} sets {index LASTWORD Var}{var LASTWORD} to {arg X} when {arg SPLST}={lisp NIL}, {lisp 0} or {lisp 3}.

If {arg X} is not a literal atom, {index ADDSPELL FN}{fn ADDSPELL} takes no action.
}}
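A few calls to {fn ADDSPELL} corresponding to the cases above are sketched here.  {lisp MYFN} and {lisp MYVAR} are hypothetical names chosen for the example.

{lispcode
(* Sketch only, where MYFN and MYVAR are hypothetical)
(* Adds MYFN to USERWORDS and SPELLINGS2 and sets LASTWORD)
(ADDSPELL (QUOTE MYFN))
(* Adds MYVAR to USERWORDS and SPELLINGS3 and sets LASTWORD)
(ADDSPELL (QUOTE MYVAR) 3)
(* Not a literal atom so no action is taken)
(ADDSPELL (QUOTE (A B)) 3)}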


Note that the various system calls to {fn ADDSPELL}, e.g. from {fn DEFINE}, {fn EDITF}, {fn LOAD}, etc., can all be suppressed by setting or binding {index ADDSPELLFLG Var}{var ADDSPELLFLG} to {lisp NIL} ({PageRef Var ADDSPELLFLG}).



{FnDef {FnName MISSPELLED?} {FnArgs XWORD REL SPLST FLG TAIL FN}
{Text
If {arg XWORD}={lisp NIL} or {lisp $} (<esc>), {fn MISSPELLED?} prints {lisp =} followed by the value of {index LASTWORD Var}{var LASTWORD}, and returns this as the respelling, without asking for approval. Otherwise, {fn MISSPELLED?} checks to see if {arg XWORD} is really misspelled.  If {arg FN} applied to {arg XWORD} is true, or {arg XWORD} is already contained on {arg SPLST}, {arg XWORD} is not considered misspelled and {fn MISSPELLED?} simply returns {arg XWORD}. Otherwise {fn MISSPELLED?} computes and returns {lisp (FIXSPELL {arg XWORD} {arg REL} {arg SPLST} {arg FLG} {arg TAIL} {arg FN})}.
}}
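The difference between a correctly spelled and a misspelled argument to {fn MISSPELLED?} can be sketched as follows, with an invented spelling list.  The first call would be expected to return {lisp CONS} unchanged because it is already on the list.  The second falls through to {fn FIXSPELL}, which should return {lisp CONS}, since a single transposition is not counted as a disagreement.

{lispcode
(* Sketch only, with a hypothetical spelling list)
(MISSPELLED? (QUOTE CONS) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE))
(MISSPELLED? (QUOTE CNOS) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE))}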



{FnDef {FnName FIXSPELL} {FnArgs XWORD REL SPLST FLG TAIL FN TIEFLG DONTMOVETOPFLG {anonarg} {anonarg}}
{Text
The value of {fn FIXSPELL} is either the respelling of {arg XWORD} or {lisp NIL}. If for some reason {arg XWORD} itself is on {arg SPLST}, then {fn FIXSPELL} aborts and calls {fn ERROR!}.  If there is a possibility that {arg XWORD} is spelled correctly, {fn MISSPELLED?} should be used instead of {fn FIXSPELL}.  {fn FIXSPELL} performs all of the interactions described earlier, including requesting user approval if necessary.

If {arg XWORD}={lisp NIL} or {lisp $} (escape),{index Escape ($) (in spelling correction)}{index $ (escape) (in spelling correction)} the respelling is the value of {index LASTWORD Var}{var LASTWORD}, and no approval is requested.

If {arg XWORD} contains lowercase characters, and the corresponding uppercase word is correct, i.e. on {arg SPLST} or satisfies {arg FN}, the uppercase word is returned and no interaction is performed.  If {var FIXSPELL.UPPERCASE.QUIET}{index FIXSPELL.UPPERCASE.QUIET Var} is {lisp NIL} (the default), a warning "=XX" is printed when coercing from "xx" to "XX".  If {var FIXSPELL.UPPERCASE.QUIET} is non-{lisp NIL}, no warning is given. 

If {arg REL}={lisp NIL}, it defaults to the value of {index FIXSPELLREL Var}{var FIXSPELLREL} (initially 70).

If {arg FLG}={lisp NIL}, the correction is handled in type-in mode, i.e., approval is never requested, and {arg XWORD} is not typed.
If {arg FLG}={lisp T}, {arg XWORD} is typed (before the {lisp =}) and approval is requested if {index APPROVEFLG Var}{var APPROVEFLG}={lisp T}.  If {arg FLG}={lisp NO-MESSAGE}, the correction is returned with no further processing. In this case, a run-on correction will be returned as a dotted pair of the two parts of the word, and a synonym correction as a list of the form {lisp ({arg WORD1} {arg WORD2})}, where {arg WORD1} is (the corrected version of) {arg XWORD}, and {arg WORD2} is the synonym.  Note that the effect of the function {fn CHOOZ} can be obtained by calling {fn FIXSPELL} with {arg FLG}={lisp NO-MESSAGE}.

If {arg TAIL} is not {lisp NIL}, and the correction is successful,
{fn CAR} of {arg TAIL} is replaced by the respelling (using {fn /RPLACA}).


{index *PRIMARY* Run-on spelling corrections}

{fn FIXSPELL} will attempt to correct misspellings caused by running two words together, if the global variable {var RUNONFLG}{index RUNONFLG Var} is non-{lisp NIL} (default is {lisp NIL}).  In this case, approval is always requested.  When a run-on error is corrected, {fn CAR} of {arg TAIL} is replaced by the two words, and the value of {fn FIXSPELL} is the first one.  For example, if {fn FIXSPELL} is called to correct the edit command {lisp (MOVE TO AFTERCOND 3 2)} with {arg TAIL}={lisp (AFTERCOND 3 2)}, {arg TAIL} would be changed to {lisp (AFTER COND 3 2)}, and {fn FIXSPELL} would return {lisp AFTER} (subject to user approval where necessary). If {arg TAIL}={lisp T}, {fn FIXSPELL} will also perform run-on corrections, returning a dotted pair of the two words in the event the correction is of this type.

If {arg TIEFLG}={lisp NIL} and a tie occurs, i.e., more than one word on {arg SPLST} is found with the same degree of "closeness", {fn FIXSPELL} returns {lisp NIL}, i.e., no correction.  If {arg TIEFLG}={lisp PICKONE} and a tie occurs, the first word is taken as the correct spelling.  
If {arg TIEFLG}={lisp LIST}, the value of {fn FIXSPELL} is a list of the respellings (even if there is only one), and {fn FIXSPELL} will not perform any interaction with the user, nor modify {arg TAIL}, the idea being that the calling program will handle those tasks.  Similarly, if {arg TIEFLG}={lisp EVERYTHING}, a list of all candidates whose degree of closeness is above {arg REL} will be returned, regardless of whether some are better than others.  No interaction will be performed.
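The tie-breaking options can be compared on a word such as {lisp CONZ}, which is equally close to {lisp CONS} and {lisp COND}.  The spelling list is invented for this sketch.  Following the descriptions above, the first call would be expected to return {lisp NIL}, the second the first of the tied words, and the third a list of the tied respellings.

{lispcode
(* Sketch only, with a hypothetical spelling list)
(FIXSPELL (QUOTE CONZ) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE) NIL NIL NIL)
(FIXSPELL (QUOTE CONZ) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE) NIL NIL (QUOTE PICKONE))
(FIXSPELL (QUOTE CONZ) 70 (QUOTE (CONS COND CAR)) (QUOTE NO-MESSAGE) NIL NIL (QUOTE LIST))}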

If {arg DONTMOVETOPFLG}={lisp T} and a correction occurs, it will {it not} be moved to the front of the spelling list.  Also, the spelling list will not be altered unless it contains the special marker used to separate the temporary and permanent parts of the system spelling lists (the value of {var SPELLSTR1}).


{Begin Comment}
Functions that call the spelling corrector directly, such as {index EDITDEFAULT FN}{fn EDITDEFAULT} ({PageRef Fn EDITDEFAULT}), specify whether or not the correction is to be handled as type-in.  For example, in the case of {fn EDITDEFAULT}, commands typed directly to the editor are treated as type-in, so that corrections to them will never require approval.
Commands given as an argument to the editor, or resulting from macro expansions, or from {editcom IF}, {editcom LP}, {editcom ORR} commands etc. are not treated as type-in, and thus approval will be requested if {index APPROVEFLG Var}{var APPROVEFLG}={lisp T}.
{End Comment}

}}


{Begin Note}
I assume FIXSPELL1 should not be documented

Date: 25 SEP 1979 0032-PDT
From: TEITELMAN
Subject: experimental new feeature in spelling corrector

fixspell1 has been modified so that if the value returned from askuser is a list, it simply passes this back. (fixspell1 used to always return T or NIL).  the default keylst used by askuser and dwim have been modified to allow user to type U for Use and then a value (one value - having user be able to esentially type in two things meaning a run on correction was too complicated an interaction with fixspell).

fixspell has been modified so that if fixspell1 returns a list, it treats car of the list as the correct value.

this is probably not true on our machines:

The time required for a call to {fn FIXSPELL} with a spelling list of length 60 when the entire list must be searched is .5 seconds.  If {index FIXSPELL FN}{fn FIXSPELL} determines that the first word on the spelling list is the respelling and does not need to search any further, the time required is .02 seconds.  In other words, the time required is proportional to the number of words with which {arg XWORD} is compared, with the time for one comparison, i.e., one call to {fn SKOR}, being roughly .01 seconds (varies slightly with the number of characters in the words being compared).
{End Note}

{FnDef {FnName FNCHECK} {FnArgs FN NOERRORFLG SPELLFLG PROPFLG TAIL}
{Text
The task of {fn FNCHECK} is to check whether {arg FN} is the name of a function and if not, to correct its spelling.  If {arg FN} is the name of a function or spelling correction is successful, {fn FNCHECK} adds the (corrected) name of the function to {index USERWORDS Var}{var USERWORDS} using {index ADDSPELL FN}{fn ADDSPELL}, and returns it as its value.

Since {fn FNCHECK} is called by many low level functions such as {fn ARGLIST}, {fn UNSAVEDEF}, etc., spelling correction only takes place when {index DWIMFLG Var}{var DWIMFLG}={lisp T}, so that these functions can operate in a small Interlisp system which does not contain DWIM.

{arg NOERRORFLG} informs {fn FNCHECK} whether or not the calling function wants to handle the unsuccessful case: if {arg NOERRORFLG} is {lisp T}, {fn FNCHECK} simply returns {lisp NIL}, otherwise it prints {lisp fn NOT A FUNCTION} and generates a non-breaking error.

If {arg FN} does not have a definition, but does have an {prop EXPR} property, then spelling correction is not attempted.  Instead, if {arg PROPFLG}={lisp T}, {arg FN} is considered to be the name of a function, and is returned.  If {arg PROPFLG}={lisp NIL}, {arg FN} is {it not} considered to be the name of a function, and {lisp NIL} is returned or an error generated, depending on the value of {arg NOERRORFLG}.

{fn FNCHECK} calls {index MISSPELLED? FN}{fn MISSPELLED?} to perform spelling correction, so that if {arg FN}={lisp NIL}, the value of {index LASTWORD Var}{var LASTWORD} will be returned.  {arg SPELLFLG} corresponds to {fn MISSPELLED?}'s fourth argument, {arg FLG}.  If {arg SPELLFLG}={lisp T}, approval will be asked if DWIM was enabled in {index CAUTIOUS  (DWIM mode)}{lisp CAUTIOUS} mode, i.e., if {index APPROVEFLG Var}{var APPROVEFLG}={lisp T}.  {arg TAIL} corresponds to the fifth argument to {fn MISSPELLED?}.
}}
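A call to {fn FNCHECK} in which the caller handles the unsuccessful case itself might look like the sketch below.  {lisp PRETYPRINT} is a deliberate misspelling, and whether it is corrected depends on {var DWIMFLG} and on what the spelling list contains at the time.

{lispcode
(* Sketch only, where NOERRORFLG is T so failure returns NIL)
(* SPELLFLG is T so approval may be requested in CAUTIOUS mode)
(FNCHECK (QUOTE PRETYPRINT) T T)}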

{fn FNCHECK} is currently used by {fn ARGLIST}, {fn UNSAVEDEF}, {fn PRETTYPRINT}, {fn BREAK0}, {fn BREAKIN}, {fn ADVISE}, and {fn CALLS}.  For example, {fn BREAK0} calls {fn FNCHECK} with {arg NOERRORFLG}={lisp T} since if {fn FNCHECK} cannot produce a function, {fn BREAK0} wants to define a dummy one.  {fn CALLS} however calls {fn FNCHECK} with {arg NOERRORFLG}={lisp NIL}, since it cannot operate without a function.

Many other system functions call {fn MISSPELLED?}{index MISSPELLED? FN} or {index FIXSPELL FN}{fn FIXSPELL} directly.  For example, {index BREAK1 FN}{fn BREAK1} calls {fn FIXSPELL} on unrecognized atomic inputs before attempting to evaluate them, using as a spelling list a list of all break commands.  Similarly, {index LISPX FN}{fn LISPX} calls {fn FIXSPELL} on atomic inputs using a list of all {fn LISPX} commands.  When {fn UNBREAK} is given the name of a function that is not broken, it calls {fn FIXSPELL} with two different spelling lists, first with {index BROKENFNS Var}{var BROKENFNS}, and if that fails, with {index USERWORDS Var}{var USERWORDS}.  {index Spelling correction on file names}{index MAKEFILE FN}{fn MAKEFILE} calls {fn MISSPELLED?} using {index FILELST Var}{var FILELST} as a spelling list.  Finally, {fn LOAD}, {fn BCOMPL}, {fn BRECOMPILE}, {fn TCOMPL}, and {fn RECOMPILE} all call {fn MISSPELLED?} if their input file(s) won't open.


}{End SubSec Spelling Corrector Functions and Variables}



}{End SubSec Spelling Correction}