Date: 19-Aug-84 10':53':37

Submitter: Masinter

Source: Masinter

Subject: regularize Interlisp-D handling of Unix filenames

Assigned To: 

Attn: vanMelle

Status: Open


Problem Type: Design - Impl

Impact: Moderate

Difficulty: Moderate



System: Operating System

Subsystem: Generic File Operations



Lisp Version: 15-Aug-84 20':06':06

Source Files: 

Microcode Version: 5124

Memory Size: 4096

File Server: 

Server Software Version: 


Description: '
Maybe Interlisp-D could handle file names on remote Unix and VMS sites in the same way that ISI''s VAX Interlisp does. I dunno if this belongs on the Lisp end or the VAX end, but I suspect it is better to put it on the VAX end.'
The UNIX operating system''s facility for handling files posed two obstacles for ISI-INTERLISP.  First, the UNIX system does not save old versions of files that have been updated; each new file supersedes its old version in the user''s directory.  Second, the UNIX system recognizes distinct file names of only 14 characters or fewer; any longer names are truncated by the UNIX system to fit this format.'
The UNIX operating system''s file names are really pointers to files, which means that a file can have more than one name.  ISI-INTERLISP uses this feature to save old versions of files as new ones are created.  When a file is updated, ISI-INTERLISP designates it as a new file and gives it an appropriate version number.  It also passes along to each updated file an unnumbered name, so that the latest version of the file always has two names assigned to it, one numbered and one unnumbered.  The UNIX operating system recognizes the unnumbered names but not the numbered ones, so any manipulation of older versions of files must be done in Interlisp.'
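The two-names-per-file idea above can be sketched in Python with hard links. This is only an illustration: the function name write_new_version and the "name;N" spelling of numbered names are assumptions made here, not the on-disk encoding ISI-INTERLISP uses.

```python
import os
import tempfile

def write_new_version(directory, name, data):
    """Write data as the next numbered version of name, then make the
    unnumbered name a second name (hard link) for that newest version.
    The "name;N" spelling is hypothetical, chosen for readability."""
    prefix = name + ";"
    versions = [int(f[len(prefix):]) for f in os.listdir(directory)
                if f.startswith(prefix) and f[len(prefix):].isdigit()]
    next_version = max(versions, default=0) + 1
    numbered = os.path.join(directory, f"{name};{next_version}")
    with open(numbered, "w") as out:
        out.write(data)
    plain = os.path.join(directory, name)
    if os.path.lexists(plain):
        # Removes only the unnumbered name; the older numbered name
        # still points at the old contents.
        os.remove(plain)
    os.link(numbered, plain)
    return numbered

# Small demonstration in a scratch directory.
with tempfile.TemporaryDirectory() as d:
    write_new_version(d, "foo", "first\n")
    write_new_version(d, "foo", "second\n")
    names = sorted(os.listdir(d))
    with open(os.path.join(d, "foo")) as f:
        latest = f.read()
    with open(os.path.join(d, "foo;1")) as f:
        oldest = f.read()
    print(names, repr(latest), repr(oldest))
```

Because the unnumbered name is just another link, ordinary UNIX tools that open "foo" always see the newest version, while the older contents remain reachable under their numbered names.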
Under Berkeley Unix 4.1, to be able to handle numbered files at the UNIX shell level, one must use the special versions of csh and ls provided with ISI-INTERLISP.  Setting the environment variable VERSIONS in the modified C shell causes the shell to pass the "true" names of files to programs rather than the names with characters truncated to 7 bits.  Typing "setenv VERSIONS n", where n is an integer, allows the new ls to print the file names as "filename;versionnumber".'
Under 4.1, Interlisp implements the version number by setting the high order bit of the last character of the file name.  This convention limits the length of file names to 13 characters, since one character must be reserved for version numbers.  However, if the file is a .v (binary) file, or if the FORCEEXT flag is set to T, the file name is truncated to the first 11 characters to allow room for the extension as well as the version number.'
This method of handling version numbers is unnecessary under Berkeley Unix 4.2.'
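The name-length arithmetic described for 4.1 can be sketched as follows. The function fit_name and its arguments are hypothetical; the sketch assumes the extension is written as "." followed by the extension, and it does not model how the version number itself is stored in the reserved character.

```python
UNIX_NAME_LIMIT = 14  # distinct characters in a file name, per the text
VERSION_SLOT = 1      # one character reserved for the version number

def fit_name(base, extension=None):
    """Truncate base so the result leaves room for the version slot.
    Pass extension only when it must survive (a .v file, or FORCEEXT
    set to T); otherwise the whole budget goes to the base name."""
    if extension is None:
        return base[:UNIX_NAME_LIMIT - VERSION_SLOT]
    suffix = "." + extension
    room = UNIX_NAME_LIMIT - VERSION_SLOT - len(suffix)
    return base[:room] + suffix

plain_name = fit_name("ABCDEFGHIJKLMNOP")
binary_name = fit_name("ABCDEFGHIJKLMNOP", "v")
print(plain_name, binary_name)
```

For a .v file the budget works out to 14 - 1 - 2 = 11 characters of base name, which agrees with the figure given above.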
The UNIX operating system also, by nature, distinguishes between upper- and lower-case characters in file names.  ISI-INTERLISP, however, recognizes either upper or lower case and will match the upper- or lower-case pattern of the previous version when creating new files.'
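A minimal sketch of the case handling just described, assuming the existing names in the directory are available as a list. The helper name match_existing_case is hypothetical.

```python
def match_existing_case(requested, existing_names):
    """Return the spelling of requested that is already present in
    existing_names (compared without regard to case), or requested
    unchanged if no existing name matches."""
    wanted = requested.lower()
    for name in existing_names:
        if name.lower() == wanted:
            return name
    return requested

reused = match_existing_case("FOO.LSP", ["Foo.lsp", "bar"])
fresh = match_existing_case("NEW.LSP", ["Foo.lsp", "bar"])
print(reused, fresh)
```

A new version of FOO.LSP would then be written under the spelling Foo.lsp, so all versions of one file keep the same case pattern.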
Under VMS, no special handling is required for version numbers since this is taken care of by the operating system.  VMS permits two forms of punctuation in directory names': [DIRECTORY] and <DIRECTORY>.  Due to the difficulty of typein for "[]" style punctuation under Interlisp (since, by default, these characters represent meta-open and meta-close, they must be preceded by the escape character "%"), users of ISI-INTERLISP under VMS are encouraged to adopt "<" and ">" for directory name punctuation.  In fact, internally ISI-INTERLISP will convert to this form under most circumstances.'
Manipulating File Names'
The directory structure of the UNIX operating system is represented by separating the names of successive directories leading to the intended file by slashes (e.g., /usr/Interlisp/foo).  A user can change a TENEX- or TOPS-20-type file name to this format by using the following function':'
TRANSLATEFILENAME uses the alist FILENAMETRANSLATIONS to direct the translation of all or specific fields of FILENAME.'
The entries of FILENAMETRANSLATIONS can be any of the field names known to UNPACKFILENAME, plus the following':'
FIRST		Each form in FIRST is evaluated before any other processing is done. FILENAME and FILENAMETRANSLATIONS are bound to the file name and to the whole translation list, respectively.'
FULLNAME	A list of dotted pairs of full path names.  If the CAR of any of these pairs matches FILENAME, the corresponding CDR is returned as the translation.'
	Each field-name entry is a list of dotted pairs, which are matched against the corresponding field as returned by UNPACKFILENAME.  If any CAR matches the input, the CDR replaces it.'
LAST	After all changes have been made, each form in LAST is evaluated.'
Below is a sample showing the recommended format for setting FILENAMETRANSLATIONS':'
  ((DIRECTORY (DDYER . /lisp/ddyer/lisp)'
              (VORECK . /lisp/voreck/lisp)'
              (RBATES . /lisp/rbates/lisp)'
              (IGNATOWSKI . /lisp/ignatowski/lisp)'
              (LISPUSERS . /lisp/Interlisp/lispusers))'
   (EXTENSION (COM . v))'
   (NAME (FOO . newfoo)))'
With this format, if  FILENAME="<IGNATOWSKI>FOO.COM.3", TRANSLATEFILENAME would produce "/lisp/ignatowski/lisp/newfoo.v".'
FILENAME can also be a list of a file name''s properties as returned by UNPACKFILENAME, in which case TRANSLATEFILENAME returns that list with the changes specified by FILENAMETRANSLATIONS.
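
The field-by-field translation described above can be sketched in Python. This is an illustration only, not the ISI-INTERLISP code: unpack_filename is a simplified stand-in for UNPACKFILENAME, the alist is modeled as a dictionary of (old, new) pairs, the FIRST and LAST forms are omitted, and dropping the version number from the result is inferred from the sample output rather than stated in the text.

```python
import re

def unpack_filename(filename):
    """Split "<DIRECTORY>NAME.EXTENSION.VERSION" into a dict of fields.
    Simplified stand-in for UNPACKFILENAME; host and device are ignored."""
    m = re.fullmatch(r"(?:<([^>]*)>)?([^.]*)(?:\.([^.]*))?(?:\.(\d+))?",
                     filename)
    if m is None:
        raise ValueError("unrecognized file name: " + filename)
    directory, name, extension, version = m.groups()
    return {"DIRECTORY": directory, "NAME": name,
            "EXTENSION": extension, "VERSION": version}

def translate_filename(filename, translations):
    """Apply FULLNAME pairs first, then per-field pairs, and build a
    slash-separated path from the translated fields."""
    for old, new in translations.get("FULLNAME", []):
        if old == filename:
            return new
    fields = unpack_filename(filename)
    for field in ("DIRECTORY", "NAME", "EXTENSION"):
        for old, new in translations.get(field, []):
            if fields[field] == old:
                fields[field] = new   # the CDR replaces the matched CAR
                break
    path = fields["NAME"]
    if fields["DIRECTORY"]:
        path = fields["DIRECTORY"] + "/" + path
    if fields["EXTENSION"]:
        path += "." + fields["EXTENSION"]
    return path

# The sample translations from the text, written as Python data.
FILENAMETRANSLATIONS = {
    "DIRECTORY": [("DDYER", "/lisp/ddyer/lisp"),
                  ("VORECK", "/lisp/voreck/lisp"),
                  ("RBATES", "/lisp/rbates/lisp"),
                  ("IGNATOWSKI", "/lisp/ignatowski/lisp"),
                  ("LISPUSERS", "/lisp/Interlisp/lispusers")],
    "EXTENSION": [("COM", "v")],
    "NAME": [("FOO", "newfoo")],
}

result = translate_filename("<IGNATOWSKI>FOO.COM.3", FILENAMETRANSLATIONS)
print(result)
```

With the sample translations this reproduces the path quoted above, /lisp/ignatowski/lisp/newfoo.v.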


Test Case: