MT.tioga
Last Edited by: Nichols, August 31, 1983 5:38 pm
This file describes the changes that the RFC software performs on message headers.
The grammar for name fields has been extended to handle more than the syntax of RFC 822. The extensions are:
1. RFC 822 does not permit a phrase to contain periods. This is a needless restriction, and one that is routinely violated, e.g., in a mailbox of the form:
 Larry J. Loser <LJL@SAIL>
Therefore, this grammar redefines phrase so that it can include periods, and renames local-part to be dot-words since it is no longer being used solely as the local-part of an addr-spec.
2. The rule 'addr-spec = dot-words' is added in order to accommodate Grapevine mailbox names, which are not qualified by a domain ("@" domain).
3. The rule 'addr-spec = addr-spec "@" domain' is used, instead of the correct 'addr-spec = dot-words "@" domain', in order to permit addr-specs with multiple "@" to be accepted. This is done as a compatibility measure, until use of this construct has died out.
4. The scanner recognizes the atom "at" and returns a token whose type is the same as for "@". This is done as a backward compatibility measure. It has the disadvantage that "at" can no longer be used as an ordinary atom (it is difficult to allow this and keep the grammar LALR(1)).
5. The rule 'mailbox = route-addr' is added. Taft suspects its absence in RFC 822 is a bug.
6. The rule 'mailbox = "@" dot-words' is added to accommodate the construct used in Xerox mail programs to denote private distribution lists. This will be flushed as soon as the mail programs have been changed appropriately.
7. RFC 822 requires 'From' to be followed by just a single mailbox if no 'Sender' field is present. It is unclear how to incorporate this restriction into a real grammar. This restriction, if desired, must be enforced by semantic checking on the entire parsed header. At present, it is not enforced at all.
8. The rule 'address = empty' is added so that empty address-containing fields, and fields containing extraneous commas, will be accepted.
9. The rules 'mailbox-list = empty' and 'mailbox-list = mailbox-list "," empty' serve a purpose similar to (8).
10. The rules 'group = phrase ":"' and 'group = phrase ":" mailbox-list' are added because a lot of old mail programs send group expressions with the trailing ";" missing. This is ambiguous and is applied only in the context of an unexpected "," or EOL.
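The effect of extension 4 can be illustrated with a small scanner sketch. This is not the RFC scanner itself, only a minimal Python illustration: the function name scan and the token type names "AT", "SPECIAL" and "ATOM" are hypothetical, matching "at" without regard to case is an assumption, and quoted strings and comments are ignored.

```python
import re

# RFC 822 specials, each returned as a one-character token; anything else
# that is not whitespace accumulates into an atom.
TOKEN = re.compile(r'[()<>@,;:\\".\[\]]|[^\s()<>@,;:\\".\[\]]+')
SPECIALS = '()<>@,;:\\".[]'

def scan(text):
    """Return a list of (type, lexeme) pairs for a header field body."""
    result = []
    for lexeme in TOKEN.findall(text):
        if lexeme == "@" or lexeme.lower() == "at":
            # "at" gets the same token type as "@" (case-insensitivity
            # is an assumption of this sketch).
            result.append(("AT", lexeme))
        elif len(lexeme) == 1 and lexeme in SPECIALS:
            result.append(("SPECIAL", lexeme))
        else:
            result.append(("ATOM", lexeme))
    return result
```

With this sketch, scan("LJL at SAIL") yields the same token types as scan("LJL@SAIL"). The disadvantage noted in extension 4 also shows up: in a phrase such as "Meet at Noon", the word "at" comes back as an "AT" token and not as an atom.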
The following are differences in the date grammar from RFC 822.
1. The rules 'date = atom month "." atom' and 'time = hour ampm zone' are added in order to accommodate Laurel 6 format dates, in which the month is punctuated with a period and am/pm-style time is used. This will be flushed as soon as Laurel has been changed.
2. The rule 'date = atom' is added to accommodate dates of the form dd-mmm-yy, where hyphens replace the expected spaces. From the tokenizer's point of view, the whole date looks like a single atom, since "-" is not a special.
3. The rule 'time = hour' is added to accommodate Tenex/Tops20-format times, in which the zone is separated from the hour by a hyphen instead of a space. From the tokenizer's point of view, the final 2 digits of the hour, the hyphen, and the zone together constitute a single atom, since "-" is not a special.
4. The rule 'hour = atom' is added to accommodate 4-digit times without colons.
5. The rule 'date-time = day "," date "," time' is added.
6. Full month and weekday names are admitted as well as the standard abbreviations.
7. The rules
 date-time ::= day month atom hour atom
 date-time ::= day month atom hour zone atom
are added to accommodate the Unix date format.
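To see why rules 2 and 3 are needed, it helps to look at what a tokenizer produces for such dates. The following is a minimal Python sketch, not the actual tokenizer: the function name tokenize is hypothetical, and quoted strings and comments are not handled. It splits only on whitespace and on the RFC 822 specials, of which "-" is not one.

```python
# RFC 822 "specials"; note that "-" is not among them.
SPECIALS = set('()<>@,;:\\".[]')

def tokenize(text):
    """Split text into atoms and one-character specials, dropping whitespace."""
    tokens = []
    atom = ""
    for ch in text:
        if ch in SPECIALS or ch.isspace():
            if atom:                 # a delimiter ends the current atom
                tokens.append(atom)
                atom = ""
            if ch in SPECIALS:       # specials are tokens in their own right
                tokens.append(ch)
        else:
            atom += ch               # "-" lands here, so it stays inside the atom
    if atom:
        tokens.append(atom)
    return tokens
```

For the input "31-Aug-83 17:38-PDT" this gives the tokens 31-Aug-83, 17, ":" and 38-PDT: the whole date is one atom (rule 2), and the last two digits of the hour, the hyphen, and the zone form another (rule 3).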
Several transformations are made on the parsed name list. Some are done as the name list is being parsed in RFCEvalImpl, and the rest are done to the parse tree by routines in RFCTreeOpsImpl, called from RFCDriver. The changes done in RFCEvalImpl are:
1. Empty addresses and mailboxes cause the extra comma to be removed.
2. "At" is changed to "@" when an addr-spec is recognized.
3. A semicolon is added to group lists when necessary.
One of the operations applied to names by the tree transformations is appending default Arpanet domains (hosts) or Grapevine registries to unqualified names found in the headers. The default domain/registry to apply is determined by examining the Sender and From fields according to the following rules:
1. The Sender field is examined first. If no default is found, the From field is examined. If no default is found there, then the default registry "PA" is used.
2. Within a field, each mailbox is examined and the first one to provide a default domain/registry is used.
3. Within a single mailbox, if multiple hosts are present (the name is of the form foo@hostn@...@host2@host1 or is a route address of the form @host1,@host2,...:foo@hostn), then hosts are removed in order (as shown above) until the first host is not Parc-Maxc or until only one host remains.
4. If a host remains and it is not Parc-Maxc, then it is used as the default domain.
5. If a host remains and it is Parc-Maxc, then the name is examined for a default Grapevine registry. If one is found, it is used. If no registry is found, then a default registry of "PA" is used.
6. If no host remains then the name is examined for a Grapevine registry. If one is found, it is used, else no default is obtained from this mailbox.
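The rules above can be sketched as follows. This is a minimal Python illustration, not the code in RFCTreeOpsImpl, and it makes several assumptions: all function names are hypothetical; only the foo@hostn@...@host1 form is handled, not route addresses; hosts are removed starting from host1 (the rightmost); host names are compared without regard to case; if more than one host remains, the first remaining one is the one used; and a Grapevine registry is taken to be whatever follows the last period of the name.

```python
LOCAL_HOST = "parc-maxc"      # compared case-insensitively (an assumption)
FALLBACK_REGISTRY = "PA"

def strip_hosts(hosts):
    """Rule 3: hosts is ordered host1..hostn; drop leading Parc-Maxc
    entries until the first host is not Parc-Maxc or only one remains."""
    hosts = list(hosts)
    while len(hosts) > 1 and hosts[0].lower() == LOCAL_HOST:
        hosts.pop(0)
    return hosts

def registry_of(name):
    """Return the text after the last '.', or None (assumed name.registry form)."""
    head, dot, registry = name.rpartition(".")
    return registry if (dot and head and registry) else None

def mailbox_default(addr_spec):
    """Return ("domain", host), ("registry", reg), or None for one mailbox."""
    local, *rest = addr_spec.split("@")
    hosts = strip_hosts(reversed(rest))          # host1 is the rightmost host
    if hosts and hosts[0].lower() != LOCAL_HOST:
        return ("domain", hosts[0])              # rule 4
    registry = registry_of(local)
    if hosts:                                    # rule 5: remaining host is Parc-Maxc
        return ("registry", registry or FALLBACK_REGISTRY)
    return ("registry", registry) if registry else None   # rule 6

def header_default(sender, from_):
    """Rules 1 and 2: Sender first, then From; the first mailbox that
    provides a default wins, else fall back to registry "PA"."""
    for field in (sender, from_):
        for mailbox in field:
            found = mailbox_default(mailbox)
            if found is not None:
                return found
    return ("registry", FALLBACK_REGISTRY)
```

For example, header_default(["Nichols"], ["LJL@SAIL"]) finds no default in the Sender field and so takes the domain SAIL from the From field, while mailbox_default("Taft@Parc-Maxc") falls through to the registry "PA".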
After the translator determines the default registry/domain, it translates the address-containing header fields.
1. Only mailboxes are translated. Group names are not touched.
2. If the mailbox is of the form @dot-words, we quote the whole thing. We should also add a host, but we don't yet.
3. For addr-specs and route-addrs, we strip off hosts as for default domains (except that this doesn't yet work right for route-addrs; it just strips off one "Parc-Maxc"). If we have no remaining host, we apply the default registry/domain to it (see below). If there is a remaining host and it is local, we apply the default allowing only registry defaults. If there is a remaining host and it is not local, we apply no defaults.
4. If we are translating to Grapevine format, all @Parc-Maxc are removed. (Again, this is a little funny for route-addrs.)
5. All remaining host names have .ag and .arpagateway removed whenever present and .arpa added whenever absent.
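Step 5 can be sketched as below. This is only an illustration under stated assumptions: the function name normalize_host is hypothetical, ".ag" and ".arpagateway" are treated as trailing suffixes matched without regard to case, and at most one such suffix is removed.

```python
def normalize_host(host):
    """Strip a trailing .ag or .arpagateway, then make sure the name ends in .arpa."""
    lowered = host.lower()
    for suffix in (".arpagateway", ".ag"):
        if lowered.endswith(suffix):
            host = host[:-len(suffix)]   # remove the gateway suffix
            break
    if not host.lower().endswith(".arpa"):
        host += ".arpa"                  # add .arpa only when absent
    return host
```

Under these assumptions, "SAIL", "SAIL.ag" and "SAIL.ArpaGateway" all become "SAIL.arpa", and a name that already ends in ".ARPA" is returned unchanged.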
The procedure for applying a default is as follows:
1. If the default is an Arpa domain and we are not forced to use a registry default, we add the arpa domain to the addr-spec.
2. If the default is a registry or we are forced to use a registry default, we add a registry if one is not already present. We also add @Parc-Maxc if requested (when translating to Grapevine format, the translator simply requests that the @Parc-Maxc not be added, rather than removing it later).
To do:
Add productions for extra syntax: :include:, uucp, space in names (last two are the same. Quote?).
Catch errors
Logging.
Hook up to GV.