% This program by D. E. Knuth is not copyrighted and can be used freely.
% Version 0 was implemented in January 1982.
% In February 1982 a new restriction on ligature steps was added.
% In June 1982 the routines were divided into smaller pieces for IBM people.
% Hex was added in September 1982, and the result became "Version 1".
% Version 1.1 fixed a bug in section 28 (since eoln is undefined after eof).
% Slight changes were made in October, 1982, for version 0.6 of TeX.
% Version 1.2 fixed a bug in section 115 (TOP, MID, and BOT can be zero)
% Version 1.3 (April 1983) blanked out unused BCPL header bytes
% Version 2 (July 1983) was released with TeX version 0.999.
% Version 2.1 (September 1983) changed TEXINFO to FONTDIMEN.
% Version 2.2 (May 1985) added checksum computation to match METAFONT.
% Version 2.3 (August 1985) introduced `backup' to fix a minor bug.

% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=cmr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}
\font\logo=logo10 % for the METAFONT logo
\def\MF={\logo METAFONT}
\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index

\def\title{PL\lowercase{to}TF}
\def\contentspagenumber{301}
\def\topofcontents{\null
  \def\titlepage{F} % include headline on the contents page
  \def\rheader{\mainfont\hfil \contentspagenumber}
  \vfill
  \centerline{\titlefont The {\ttitlefont PLtoTF} processor}
  \vskip 15pt
  \centerline{(Version 2.3, August 1985)}
  \vfill}
\def\botofcontents{\vfill
  \centerline{\hsize 5in\baselineskip9pt
    \vbox{\ninerm\noindent
    The preparation of this report
    was supported in part by the National Science
    Foundation under grants IST-8201926 and MCS-8300984,
    and by the System Development Foundation. `\TeX' is a
    trademark of the American Mathematical Society.}}}
\pageno=\contentspagenumber \advance\pageno by 1

@* Introduction.
The \.{PLtoTF} utility program converts property-list (``\.{PL}'') files into equivalent \TeX\ font metric (``\.{TFM}'') files. It also makes a thorough check of the given \.{PL} file, so that the \.{TFM} file should be acceptable to \TeX. The first \.{PLtoTF} program was designed by Leo Guibas in the summer of 1978. Contributions by Frank Liang, Doug Wyatt, and Lyle Ramshaw also had a significant effect on the evolution of the present code. The |banner| string defined here should be changed whenever \.{PLtoTF} gets modified. @d banner=='This is PLtoTF, Version 2.3' {printed when the program starts} @ This program is written entirely in standard \PASCAL, except that it has to do some slightly system-dependent character code conversion on input. Furthermore, lower case letters are used in error messages; they could be converted to upper case if necessary. The input is read from |pl←file|, and the output is written on |tfm←file|; error messages and other remarks are written on the |output| file, which the user may choose to assign to the terminal if the system permits it. @↑system dependencies@> The term |print| is used instead of |write| when this program writes on the |output| file, so that all such output can be easily deflected. @d print(#)==write(#) @d print←ln(#)==write←ln(#) @p program PLtoTF(@!pl←file,@!tfm←file,@!output); const @<Constants in the outer block@>@/ type @<Types in the outer block@>@/ var @<Globals in the outer block@>@/ procedure initialize; {this procedure gets things started properly} var @<Local variables for initialization@>@/ begin print←ln(banner);@/ @<Set initial values@>@/ end; @ The following parameters can be changed at compile time to extend or reduce \.{PLtoTF}'s capacity. 
@<Constants...@>= @!buf←size=60; {length of lines displayed in error messages} @!max←header←bytes=100; {four times the maximum number of words allowed in the \.{TFM} file header block, must be 1024 or less} @!max←param←words=30; {the maximum number of \.{fontdimen} parameters allowed} @ Here are some macros for common programming idioms. @d incr(#) == #:=#+1 {increase a variable by unity} @d decr(#) == #:=#-1 {decrease a variable by unity} @d do←nothing == {empty statement} @* Property list description of font metric data. The idea behind \.{PL} files is that precise details about fonts, i.e., the facts that are needed by typesetting routines like \TeX, sometimes have to be supplied by hand. The nested property-list format provides a reasonably convenient way to do this. A good deal of computation is necessary to parse and process a \.{PL} file, so it would be inappropriate for \TeX\ itself to do this every time it loads a font. \TeX\ deals only with the compact descriptions of font metric data that appear in \.{TFM} files. Such data is so compact, however, it is almost impossible for anybody but a computer to read it. The purpose of \.{PLtoTF} is to convert from a human-oriented file of text to a computer-oriented file of binary numbers. @<Glob...@>= @!pl←file:text; @ @<Set init...@>= reset(pl←file); @ A \.{PL} file is a list of entries of the form $$\.{(PROPERTYNAME VALUE)}$$ where the property name is one of a finite set of names understood by this program, and the value may itself in turn be a property list. The idea is best understood by looking at an example, so let's consider a fragment of the \.{PL} file for a hypothetical font. 
$$\vbox{\halign{\.{#}\hfil\cr (FAMILY NOVA)\cr (FACE F MIE)\cr (CODINGSCHEME ASCII)\cr (DESIGNSIZE D 10)\cr (DESIGNUNITS D 18)\cr (COMMENT A COMMENT IS IGNORED)\cr (COMMENT (EXCEPT THIS ONE ISN'T))\cr (COMMENT (ACTUALLY IT IS, EVEN THOUGH\cr \qquad\qquad IT SAYS IT ISN'T))\cr (FONTDIMEN\cr \qquad (SLANT R -.25)\cr \qquad (SPACE D 6)\cr \qquad (SHRINK D 2)\cr \qquad (STRETCH D 3)\cr \qquad (XHEIGHT R 10.55)\cr \qquad (QUAD D 18)\cr \qquad )\cr (LIGTABLE\cr \qquad (LABEL C f)\cr \qquad (LIG C i O 200)\cr \qquad (LIG C f O 201)\cr \qquad (KRN O 51 R 1.5)\cr \qquad (STOP)\cr \qquad (LABEL O 201)\cr \qquad (LIG C i O 203)\cr \qquad (STOP)\cr \qquad )\cr (CHARACTER C f\cr \qquad (CHARWD D 6)\cr \qquad (CHARHT R 13.5)\cr \qquad (CHARIC R 1.5)\cr \qquad )\cr}}$$ This example says that the font whose metric information is being described belongs to the hypothetical \.{NOVA} family; its face code is medium italic extended; and the characters appear in ASCII code positions. The design size is 10 points, and all other sizes in this \.{PL} file are given in units such that 18 units equals the design size. The font is slanted with a slope of $-.25$ (hence the letters actually slant backward---perhaps that is why the family name is \.{NOVA}). The normal space between words is 6 units (i.e., one third of the 18-unit design size), with glue that shrinks by 2 units or stretches by 3. The letters for which accents don't need to be raised or lowered are 10.55 units high, and one em equals 18 units. The example ligature table specifies that the letter \.f followed by \.i is changed to code @'200, while \.f followed by \.f is changed to @'201; and if \.f is followed by the code @'51 (which is a right parenthesis) an additional 1.5 units of space should be inserted after the \.f. The character code @'201 is changed to @'203 if it is followed by \.i; thus, the sequence \.{ffi} leads to code @'203, which is presumably where the `ffi' ligature appears in the font. 
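
As a small worked illustration of the unit convention in this example (assuming the font is used at its design size, with no `\.{at}' scaling), a measurement of $u$ units corresponds to $u\times10/18$ points, because 18 units equal the 10-point design size. For the normal interword space this gives
$$6\times{10\over18}\,{\rm pt}\approx3.33\,{\rm pt},$$
and the 1.5-unit kern in the ligature table comes to $1.5\times10/18\approx0.83$ points.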
Character \.f itself is 6 units wide and 13.5 units tall, in this example. Its depth is zero (since \.{CHARDP} is not given), and its italic correction is 1.5 units. @ The example above illustrates most of the features found in \.{PL} files. Note that some property names, like \.{FAMILY} or \.{COMMENT}, take a string as their value; this string continues until the first unmatched right parenthesis. But most property names, like \.{DESIGNSIZE} and \.{SLANT} and \.{LABEL}, take a number as their value. This number can be expressed in a variety of ways, indicated by a prefixed code; \.D stands for decimal, \.H for hexadecimal, \.O for octal, \.R for real, \.C for character, and \.F for ``face.'' Other property names, like \.{LIG}, take two numbers as their value. And still other names, like \.{FONTDIMEN} and \.{LIGTABLE} and \.{CHARACTER}, have more complicated values that involve property lists. A property name is supposed to be used only in an appropriate property list. For example, \.{CHARWD} shouldn't occur on the outer level or within \.{FONTDIMEN}. The individual property-and-value pairs in a property list can appear in any order. For instance, `\.{SHRINK}' precedes `\.{STRETCH}' in the above example, although the \.{TFM} file always puts the stretch parameter first. One could even give the information about characters like `\.f' before specifying the number of units in the design size, or before specifying the ligature and kerning table. However, the \.{LIGTABLE} itself is an exception to this rule; the individual elements of the \.{LIGTABLE} property list can be reordered only to a certain extent without changing the meaning of that table. If property-and-value pairs are omitted, a default value is used. For example, we have already noted that the default for \.{CHARDP} is zero. The default for {\sl every\/} numeric value is, in fact, zero, unless otherwise stated below. 
If the same property name is used more than once, \.{PLtoTF} will not notice the discrepancy; it simply uses the final value given. Once again, however, the \.{LIGTABLE} is an exception to this rule; \.{PLtoTF} will complain if there is more than one label for some character. And of course many of the entries in the \.{LIGTABLE} property list have the same property name. From these rules, you can guess (correctly) that \.{PLtoTF} operates in four main steps. First it assigns the default values to all properties; then it scans through the \.{PL} file, changing property values as new ones are seen; then it checks the information and corrects any problems; and finally it outputs the \.{TFM} file. @ Instead of relying on a hypothetical example, let's consider a complete grammar for \.{PL} files. At the outer level, the following property names are valid: \yskip\hang\.{CHECKSUM} (four-byte value). The value, which should be a nonnegative integer less than $2↑{32}$, is used to identify a particular version of a font; it should match the check sum value stored with the font itself. A check sum of zero, which is the default, is used to bypass check sum testing. If no checksum is specified in the \.{PL} file, \.{PLtoTF} will compute the checksum that \MF\ would compute from the same data. \yskip\hang\.{DESIGNSIZE} (numeric value, default is 10). The value, which should be a real number in the range |1.0<=x<1024|, represents the default amount by which all quantities will be scaled if the font is not loaded with an `\.{at}' specification. For example, if one says `\.{\\font A=cmr10 at 15pt}' in \TeX\ language, the design size in the \.{TFM} file is ignored and effectively replaced by 15 points; but if one simply says `\.{\\font A=cmr10}' the stated design size is used. This quantity is always in units of printer's points. \yskip\hang\.{DESIGNUNITS} (numeric value, default is 1). 
The value should be a positive real number; it says how many units equals the design size (or the eventual `\.{at}' size, if the font is being scaled). For example, suppose you have a font that has been digitized with 600 pixels per em, and the design size is one em; then you could say `\.{(DESIGNUNITS D 600)}' if you wanted to give all of your measurements in units of pixels. \yskip\hang\.{CODINGSCHEME} (string value, default is `\.{UNSPECIFIED}'). The string should not contain parentheses, and its length must be less than 40. It identifies the correspondence between the numeric codes and font characters. (\TeX\ ignores this information, but other software programs make use of it.) \yskip\hang\.{FAMILY} (string value, default is `\.{UNSPECIFIED}'). The string should not contain parentheses, and its length must be less than 20. It identifies the name of the family to which this font belongs, e.g., `\.{HELVETICA}'. (\TeX\ ignores this information; but it is needed, for example, when converting \.{DVI} files to \.{PRESS} files for Xerox equipment.) \yskip\hang\.{FACE} (one-byte value). This number, which must lie between 0 and 255 inclusive, is a subsidiary ident\-ifi\-ca\-tion of the font within its family. For example, bold italic condensed fonts might have the same family name as light roman extended fonts, differing only in their face byte. (\TeX\ ignores this information; but it is needed, for example, when converting \.{DVI} files to \.{PRESS} files for Xerox equipment.) \yskip\hang\.{SEVENBITSAFEFLAG} (string value, default is `\.{FALSE}'). The value should start with either `\.T' (true) or `\.F' (false). If true, character codes less than 128 cannot lead to codes of 128 or more via ligatures or charlists or extensible characters. (\TeX82 ignores this flag, but older versions of \TeX\ would only accept \.{TFM} files that were seven-bit safe.) 
\.{PLtoTF} computes the correct value of this flag and gives an error message only if a claimed ``true'' value is incorrect. \yskip\hang\.{HEADER} (a one-byte value followed by a four-byte value). The one-byte value should be between 18 and a maximum limit that can be raised or lowered depending on the compile-time setting of |max←header←bytes|. The four-byte value goes into the header word whose index is the one-byte value; for example, to set |header[18]:=1|, one may write `\.{(HEADER D 18 O 1)}'. This notation is used for header information that is presently unnamed. (\TeX\ ignores it.) \yskip\hang\.{FONTDIMEN} (property list value). See below for the names allowed in this property list. \yskip\hang\.{LIGTABLE} (property list value). See below for the rules about this special kind of property list. \yskip\hang\.{CHARACTER}. The value is a one-byte integer followed by a property list. The integer represents the number of a character that is present in the font; the property list of a character is defined below. The default is an empty property list. @ Numeric property list values can be given in various forms identified by a prefixed letter. \yskip\hang\.C denotes an ASCII character, which should be a standard visible character that is not a parenthesis. The numeric value will therefore be between @'41 and @'176 but not @'50 or @'51. \yskip\hang\.D denotes a decimal integer, which must be nonnegative and less than 256. (Use \.R for larger values or for negative values.) \yskip\hang\.F denotes a three-letter Xerox face code; the admissible codes are \.{MRR}, \.{MIR}, \.{BRR}, \.{BIR}, \.{LRR}, \.{LIR}, \.{MRC}, \.{MIC}, \.{BRC}, \.{BIC}, \.{LRC}, \.{LIC}, \.{MRE}, \.{MIE}, \.{BRE}, \.{BIE}, \.{LRE}, and \.{LIE}, denoting the integers 0 to 17, respectively. \yskip\hang\.O denotes an unsigned octal integer, which must be less than $2↑{32}$, i.e., at most `\.{O 37777777777}'. 
\yskip\hang\.H denotes an unsigned hexadecimal integer, which must be less than $2↑{32}$, i.e., at most `\.{H FFFFFFFF}'. \yskip\hang\.R denotes a real number in decimal notation, optionally preceded by a `\.+' or `\.-' sign, and optionally including a decimal point. The absolute value must be less than 1024. @ The property names allowed in a \.{FONTDIMEN} property list correspond to various \TeX\ parameters, each of which has a (real) numeric value. All of the parameters except \.{SLANT} are in design-size units. The admissible names are \.{SLANT}, \.{SPACE}, \.{STRETCH}, \.{SHRINK}, \.{XHEIGHT}, \.{QUAD}, \.{EXTRASPACE}, \.{NUM1}, \.{NUM2}, \.{NUM3}, \.{DENOM1}, \.{DENOM2}, \.{SUP1}, \.{SUP2}, \.{SUP3}, \.{SUB1}, \.{SUB2}, \.{SUPDROP}, \.{SUBDROP}, \.{DELIM1}, \.{DELIM2}, and \.{AXISHEIGHT}, for parameters 1~to~22. The alternate names \.{DEFAULTRULETHICKNESS}, \.{BIGOPSPACING1}, \.{BIGOPSPACING2}, \.{BIGOPSPACING3}, \.{BIGOPSPACING4}, and \.{BIGOPSPACING5}, may also be used for parameters 8 to 13. The notation `\.{PARAMETER} $n$' provides another way to specify the $n$th parameter; for example, `\.{(PARAMETER} \.{D 1 R -.25)}' is another way to specify that the \.{SLANT} is $-0.25$. The value of $n$ must be positive and less than |max←param←words|. @ The elements of a \.{CHARACTER} property list can be of six different types. \yskip\hang\.{CHARWD} (real value) denotes the character's width in design-size units. \yskip\hang\.{CHARHT} (real value) denotes the character's height in design-size units. \yskip\hang\.{CHARDP} (real value) denotes the character's depth in design-size units. \yskip\hang\.{CHARIC} (real value) denotes the character's italic correction in design-size units. \yskip\hang\.{NEXTLARGER} (one-byte value), specifies the character that follows the present one in a ``charlist.'' The value must be the number of a character in the font, and there must be no infinite cycles of supposedly larger and larger characters. 
\yskip\hang\.{VARCHAR} (property list value), specifies an extensible character. This option and \.{NEXTLARGER} are mutually exclusive; i.e., they cannot both be used within the same \.{CHARACTER} list.
\yskip\noindent The elements of a \.{VARCHAR} property list are either \.{TOP}, \.{MID}, \.{BOT} or \.{REP}; the values are integers, which must be zero or the number of a character in the font. A zero value for \.{TOP}, \.{MID}, or \.{BOT} means that the corresponding piece of the extensible character is absent. A nonzero value, or a \.{REP} value of zero, denotes the character code used to make up the top, middle, bottom, or replicated piece of an extensible character.
@ A \.{LIGTABLE} property list contains elements of four kinds, specifying a program in a simple command language that \TeX\ uses for ligatures and kerns.
\yskip\hang\.{LABEL} (one-byte value) means that the program for the stated character value starts here. The integer must be the number of a character in the font; its \.{CHARACTER} property list must not have a \.{NEXTLARGER} or \.{VARCHAR} field.
\yskip\hang\.{LIG} (two one-byte values). The instruction `\.{(LIG} $c$ $r$\.)' means, ``If the next character is $c$, then replace both the current character and $c$ by the character $r$; otherwise go on to the next instruction.'' Character $r$ must be present in the font, but $c$ need not be.
\yskip\hang\.{KRN} (a one-byte value and a real value). The instruction `\.{(KRN} $c$ $r$\.)' means, ``If the next character is $c$, then insert a blank space of width $r$ between the current character and $c$; otherwise go on to the next instruction.'' The value of $r$, which is in units of the design size, is often negative. Character code $c$ must exist in the font.
\yskip\hang\.{STOP} (no value). This instruction ends a ligature/kern program. It must follow either a \.{LIG} or \.{KRN} instruction, not a \.{LABEL} or \.{STOP}.
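
The command language just described can be tried out with a short stand-alone sketch. It is written in ordinary Pascal rather than \.{WEB}, it is not part of \.{PLtoTF}, and every identifier in it (|LigDemo|, |TStep|, |SetStep|, |startOf|, and so on) is invented for this illustration. It hard-codes the \.{LIGTABLE} of the earlier \.{NOVA} example and applies it to the three codes of `ffi'. Kern steps are stored but not acted upon, and the sketch assumes that a character produced by a ligature is tested again against the character that follows it, as the `ffi' discussion in the example suggests.

```pascal
program LigDemo;
{ Hypothetical illustration: runs the example LIGTABLE on the codes of "ffi". }
const
  NSteps = 6;
type
  TKind = (kLig, kKrn, kStop);
  TStep = record
    kind: TKind;
    nextCode: integer; { the character c that must come next }
    repl: integer;     { the replacement r; meaningful for LIG steps only }
  end;
var
  prog: array[1..NSteps] of TStep;
  startOf: array[0..255] of integer; { 0 means that the code has no LABEL }
  codes: array[1..3] of integer;
  cur, i, p: integer;
  matched: boolean;

procedure SetStep(n: integer; k: TKind; c, r: integer);
begin
  prog[n].kind := k;
  prog[n].nextCode := c;
  prog[n].repl := r;
end;

begin
  for i := 0 to 255 do startOf[i] := 0;
  startOf[ord('f')] := 1;           { (LABEL C f) }
  SetStep(1, kLig, ord('i'), 128);  { (LIG C i O 200) }
  SetStep(2, kLig, ord('f'), 129);  { (LIG C f O 201) }
  SetStep(3, kKrn, 41, 0);          { (KRN O 51 R 1.5); the amount is not modelled }
  SetStep(4, kStop, 0, 0);          { (STOP) }
  startOf[129] := 5;                { (LABEL O 201) }
  SetStep(5, kLig, ord('i'), 131);  { (LIG C i O 203) }
  SetStep(6, kStop, 0, 0);          { (STOP) }

  codes[1] := ord('f'); codes[2] := ord('f'); codes[3] := ord('i');
  cur := codes[1];
  for i := 2 to 3 do
  begin
    matched := false;
    p := startOf[cur];
    if p > 0 then
      while (not matched) and (prog[p].kind <> kStop) do
        if (prog[p].kind = kLig) and (prog[p].nextCode = codes[i]) then
        begin
          cur := prog[p].repl; { both characters are replaced by r }
          matched := true
        end
        else
          p := p + 1; { otherwise go on to the next instruction }
    if not matched then
    begin
      writeln(cur:1); { no ligature applies, so the current code is final }
      cur := codes[i];
    end;
  end;
  writeln(cur:1);
end.
```

Traced by hand, the first \.f finds the step for a following \.f and becomes code 129 (octal 201); that code then finds the step for a following \.i and becomes 131 (octal 203), so the sketch should print the single number 131.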
@ In addition to all these possibilities, the property name \.{COMMENT} is allowed in any property list. Such comments are ignored. @ So that is what \.{PL} files hold. The next question is, ``What about \.{TFM} files?'' A complete answer to that question appears in the documentation of the companion program, \.{TFtoPL}, so it will not be repeated here. Suffice it to say that a \.{TFM} file stores all of the relevant font information in a sequence of 8-bit bytes. The number of bytes is always a multiple of 4, so we could regard the \.{TFM} file as a sequence of 32-bit words; but \TeX\ uses the byte interpretation, and so does \.{PLtoTF}. Note that the bytes are considered to be unsigned numbers. @<Glob...@>= @!tfm←file:packed file of 0..255; @ On some systems you may have to do something special to write a packed file of bytes. For example, the following code didn't work when it was first tried at Stanford, because packed files have to be opened with a special switch setting on the \PASCAL\ that was used. @↑system dependencies@> @<Set init...@>= rewrite(tfm←file); @* Basic input routines. For the purposes of this program, a |byte| is an unsigned eight-bit quantity, and an |ASCII←code| is an integer between @'40 and @'177. Such ASCII codes correspond to one-character constants like \.{"A"} in \.{WEB} language. @<Types...@>= @!byte=0..255; {unsigned eight-bit quantity} @!ASCII←code=@'40..@'177; {standard ASCII code numbers} @ One of the things \.{PLtoTF} has to do is convert characters of strings to ASCII form, since that is the code used for the family name and the coding scheme in a \.{TFM} file. An array |xord| is used to do the conversion from |char|; the method below should work with little or no change on most \PASCAL\ systems. 
@↑system dependencies@> @d first←ord=0 {ordinal number of the smallest element of |char|} @d last←ord=127 {ordinal number of the largest element of |char|} @<Global...@>= @!xord:array[char] of ASCII←code; {conversion table} @ @<Local variables for init...@>= @!k:first←ord..last←ord; {an index used for clearing |xord|} @ Characters that should not appear in \.{PL} files (except in comments) are mapped into @'177. @d invalid←code=@'177 {code deserving an error message} @<Set init...@>= for k:=first←ord to last←ord do xord[chr(k)]:=invalid←code; xord[' ']:=" "; xord['!']:="!"; xord['"']:=""""; xord['#']:="#"; xord['$']:="$"; xord['%']:="%"; xord['&']:="&"; xord['''']:="'"; xord['(']:="("; xord[')']:=")"; xord['*']:="*"; xord['+']:="+"; xord[',']:=","; xord['-']:="-"; xord['.']:="."; xord['/']:="/"; xord['0']:="0"; xord['1']:="1"; xord['2']:="2"; xord['3']:="3"; xord['4']:="4"; xord['5']:="5"; xord['6']:="6"; xord['7']:="7"; xord['8']:="8"; xord['9']:="9"; xord[':']:=":"; xord[';']:=";"; xord['<']:="<"; xord['=']:="="; xord['>']:=">"; xord['?']:="?"; xord['@@']:="@@"; xord['A']:="A"; xord['B']:="B"; xord['C']:="C"; xord['D']:="D"; xord['E']:="E"; xord['F']:="F"; xord['G']:="G"; xord['H']:="H"; xord['I']:="I"; xord['J']:="J"; xord['K']:="K"; xord['L']:="L"; xord['M']:="M"; xord['N']:="N"; xord['O']:="O"; xord['P']:="P"; xord['Q']:="Q"; xord['R']:="R"; xord['S']:="S"; xord['T']:="T"; xord['U']:="U"; xord['V']:="V"; xord['W']:="W"; xord['X']:="X"; xord['Y']:="Y"; xord['Z']:="Z"; xord['[']:="["; xord['\']:="\"; xord[']']:="]"; xord['↑']:="↑"; xord['←']:="←"; xord['`']:="`"; xord['a']:="a"; xord['b']:="b"; xord['c']:="c"; xord['d']:="d"; xord['e']:="e"; xord['f']:="f"; xord['g']:="g"; xord['h']:="h"; xord['i']:="i"; xord['j']:="j"; xord['k']:="k"; xord['l']:="l"; xord['m']:="m"; xord['n']:="n"; xord['o']:="o"; xord['p']:="p"; xord['q']:="q"; xord['r']:="r"; xord['s']:="s"; xord['t']:="t"; xord['u']:="u"; xord['v']:="v"; xord['w']:="w"; xord['x']:="x"; xord['y']:="y"; 
xord['z']:="z"; xord['{']:="{"; xord['|']:="|"; xord['}']:="}"; xord['~']:="~"; @ In order to help catch errors of badly nested parentheses, \.{PLtoTF} assumes that the user will begin each line with a number of blank spaces equal to some constant times the number of open parentheses at the beginning of that line. However, the program doesn't know in advance what the constant is, nor does it want to print an error message on every line for a user who has followed no consistent pattern of indentation. Therefore the following strategy is adopted: If the user has been consistent with indentation for ten or more lines, an indentation error will be reported. The constant of indentation is reset on every line that should have nonzero indentation. @<Glob...@>= @!line:integer; {the number of the current line} @!good←indent:integer; {the number of lines since the last bad indentation} @!indent: integer; {the number of spaces per open parenthesis, zero if unknown} @!level: integer; {the current number of open parentheses} @ @<Set init...@>= line:=0; good←indent:=0; indent:=0; level:=0; @ The input need not really be broken into lines of any maximum length, and we could read it character by character without any buffering. But we shall place it into a small buffer so that offending lines can be displayed in error messages. @<Glob...@>= @!left←ln,@!right←ln:boolean; {are the left and right ends of the buffer at end-of-line marks?} @!limit:0..buf←size; {position of the last character present in the buffer} @!loc:0..buf←size; {position of the last character read in the buffer} @!buffer:array[1..buf←size] of char; @!input←has←ended:boolean; {there is no more input to read} @ @<Set init...@>= limit:=0; loc:=0; left←ln:=true; right←ln:=true; input←has←ended:=false; @ Just before each \.{CHARACTER} property list is evaluated, the character code is printed in octal notation. Up to eight such codes appear on a line; so we have a variable to keep track of how many are currently there. 
@<Glob...@>= @!chars←on←line:0..8; {the number of characters printed on the current line}
@ @<Set init...@>= chars←on←line:=0;
@ The following routine prints an error message and an indication of where the error was detected. The error message should not include any final punctuation, since this procedure supplies its own.
@d err←print(#)==begin if chars←on←line>0 then print←ln(' '); print(#); show←error←context; end
@p procedure show←error←context; {prints the current scanner location}
var k:0..buf←size; {an index into |buffer|}
begin print←ln(' (line ',line:1,').');
if not left←ln then print('...');
for k:=1 to loc do print(buffer[k]); {print the characters already scanned}
print←ln(' ');
if not left←ln then print('   '); {three blanks, to match the width of `\.{...}'}
for k:=1 to loc do print(' '); {space out the second line}
for k:=loc+1 to limit do print(buffer[k]); {print the characters yet unseen}
if right←ln then print←ln(' ')@+else print←ln('...');
chars←on←line:=0;
end;
@ Here is a procedure that does the right thing when we are done reading the present contents of the buffer. It keeps |buffer[buf←size]| empty, in order to avoid range errors on certain \PASCAL\ compilers. An infinite sequence of right parentheses is placed at the end of the file, so that the program is sure to get out of whatever level of nesting it is in. On some systems it is desirable to modify this code so that tab marks in the buffer are replaced by blank spaces. (Simply setting |xord[chr(@'11)]:=" "| would not work; for example, two-line error messages would not come out properly aligned.)
@↑system dependencies@> @p procedure fill←buffer; begin left←ln:=right←ln; limit:=0; loc:=0; if left←ln then begin if line>0 then read←ln(pl←file); incr(line); end; if eof(pl←file) then begin limit:=1; buffer[1]:=')'; right←ln:=false; input←has←ended:=true; end else begin while (limit<buf←size-1)and(not eoln(pl←file)) do begin incr(limit); read(pl←file,buffer[limit]); end; buffer[limit+1]:=' '; right←ln:=eoln(pl←file); if left←ln then @<Set |loc| to the number of leading blanks in the buffer, and check the indentation@>; end; end; @ The interesting part about |fill←buffer| is the part that learns what indentation conventions the user is following, if any. @d bad←indent(#)==begin if good←indent>=10 then err←print(#); good←indent:=0; indent:=0; end @<Set |loc|...@>= begin while (loc<limit)and(buffer[loc+1]=' ') do incr(loc); if loc<limit then begin if level=0 then if loc=0 then incr(good←indent) else bad←indent('Warning: Indented line occurred at level zero') @.Warning: Indented line...@> else if indent=0 then if (loc div level)*level=loc then begin indent:=loc div level; good←indent:=1; end else good←indent:=0 else if indent*level=loc then incr(good←indent) else bad←indent('Warning: Inconsistent indentation; ', @.Warning: Inconsistent indentation...@> 'you are at parenthesis level ',level:1); end; end @* Basic scanning routines. The global variable |cur←char| holds the ASCII code corresponding to the character most recently read from the input buffer, or to a character that has been substituted for the real one. @<Global...@>= @!cur←char:ASCII←code; {we have just read this} @ Here is a procedure that sets |cur←char| to an ASCII code for the next character of input, if that character is a letter or digit. Otherwise it sets |cur←char:=" "|, and the input system will be poised to reread the character that was rejected, whether or not it was a space. Lower case letters are converted to upper case. 
@p procedure get←letter←or←digit; begin while (loc=limit)and(not right←ln) do fill←buffer; if loc=limit then cur←char:=" " {end-of-line counts as a delimiter} else begin cur←char:=xord[buffer[loc+1]]; if cur←char>="a" then cur←char:=cur←char-@'40; if ((cur←char>="0")and(cur←char<="9"))or ((cur←char>="A")and(cur←char<="Z")) then incr(loc) else cur←char:=" "; end; end; @ The following procedure sets |cur←char| to the next character code, and converts lower case to upper case. If the character is a left or right parenthesis, it will not be ``digested''; the character will be read again and again, until the calling routine does something like `|incr(loc)|' to get past it. Such special treatment of parentheses insures that the structural information they contain won't be lost in the midst of other error recovery operations. @d backup==begin if (cur←char>")")or(cur←char<"(") then decr(loc); end {undoes the effect of |get←next|} @p procedure get←next; {sets |cur←char| to next, balks at parentheses} begin while loc=limit do fill←buffer; incr(loc); cur←char:=xord[buffer[loc]]; if cur←char>="a" then if cur←char<="z" then cur←char:=cur←char-@'40 {uppercasify} else begin if cur←char=invalid←code then begin err←print('Illegal character in the file'); @.Illegal character...@> cur←char:="?"; end; end else if (cur←char<=")")and(cur←char>="(") then decr(loc); end; @ The next procedure is used to ignore the text of a comment, or to pass over erroneous material. As such, it has the privilege of passing parentheses. It stops after the first right parenthesis that drops the level below the level in force when the procedure was called. 
@p procedure skip←to←end←of←item; var l:integer; {initial value of |level|} begin l:=level; while level>=l do begin while loc=limit do fill←buffer; incr(loc); if buffer[loc]=')' then decr(level) else if buffer[loc]='(' then incr(level); end; if input←has←ended then err←print('File ended unexpectedly: No closing ")"'); @.File ended unexpectedly...@> cur←char:=" "; {now the right parenthesis has been read and digested} end; @ Sometimes we merely want to skip past characters in the input until we reach a left or a right parenthesis. For example, we do this whenever we have finished scanning a property value and we hope that a right parenthesis is next (except for possible blank spaces). @d skip←to←paren==repeat get←next@;@+ until (cur←char="(")or(cur←char=")") @d skip←error(#)==begin err←print(#); skip←to←paren; end {this gets to the right parenthesis if something goes wrong} @d flush←error(#)==begin err←print(#); skip←to←end←of←item; end {this gets past the right parenthesis if something goes wrong} @ After a property value has been scanned, we want to move just past the right parenthesis that should come next in the input (except for possible blank spaces). @p procedure finish←the←property; {do this when the value has been scanned} begin while cur←char=" " do get←next; if cur←char<>")" then err←print('Junk after property value will be ignored'); @.Junk after property value...@> skip←to←end←of←item; end; @* Scanning property names. We have to figure out the meaning of names that appear in the \.{PL} file, by looking them up in a dictionary of known keywords. Keyword number $n$ appears in locations |start[n]| through |start[n+1]-1| of an array called |dictionary|. 
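
The packed layout just described can be pictured with a short stand-alone sketch in ordinary Pascal. It is only an illustration: the program name, the bound |MaxLetters|, and the choice and order of the two keywords are assumptions made here, not the contents of the real table.

```pascal
program DictLayout;
{ Hypothetical illustration: two keywords stored end to end in one array. }
const
  MaxLetters = 20;
var
  start: array[1..3] of integer;
  dictionary: array[0..MaxLetters] of char;
  n, j: integer;
begin
  { keyword 1, 'LIG', occupies positions 0..2 }
  start[1] := 0;
  dictionary[0] := 'L'; dictionary[1] := 'I'; dictionary[2] := 'G';
  { keyword 2, 'STOP', occupies positions 3..6 }
  start[2] := 3;
  dictionary[3] := 'S'; dictionary[4] := 'T';
  dictionary[5] := 'O'; dictionary[6] := 'P';
  { start[3] is the first free position, which also marks the end of keyword 2 }
  start[3] := 7;
  for n := 1 to 2 do
  begin
    write('keyword ', n:1, ': ');
    for j := start[n] to start[n + 1] - 1 do write(dictionary[j]);
    writeln(' (length ', (start[n + 1] - start[n]):1, ')');
  end;
end.
```

Because each keyword ends where the next one begins, the length of keyword $n$ is simply the difference of two adjacent |start| entries, and no separate length table or terminator character is needed.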
@d max←name←index=66 {upper bound on the number of keywords} @d max←letters=500 {upper bound on the total length of all keywords} @<Global...@>= @!start:array[1..max←name←index] of 0..max←letters; @!dictionary:array[0..max←letters] of ASCII←code; @!start←ptr:0..max←name←index; {the first available place in |start|} @!dict←ptr:0..max←letters; {the first available place in |dictionary|} @ @<Set init...@>= start←ptr:=1; start[1]:=0; dict←ptr:=0; @ When we are looking for a name, we put it into the |cur←name| array. When we have found it, the corresponding |start| index will go into the global variable |name←ptr|. @d longest←name=20 {length of \.{DEFAULTRULETHICKNESS}} @<Glob...@>= @!cur←name:array[1..longest←name] of ASCII←code; {a name to look up} @!name←length:0..longest←name; {its length} @!name←ptr:0..max←name←index; {its ordinal number in the dictionary} @ A conventional hash table with linear probing (cf.\ Algorithm 6.4L in {\sl The Art of Computer Pro\-gram\-ming\/}) is used for the dictionary operations. If |hash[h]=0|, the table position is empty, otherwise |hash[h]| points into the |start| array. @d hash←prime=101 {size of the hash table} @<Glob...@>= @!hash:array[0..hash←prime-1] of 0..max←name←index; @!cur←hash:0..hash←prime-1; {current position in the hash table} @ @<Local...@>= @!h:0..hash←prime-1; {runs through the hash table} @ @<Set init...@>= for h:=0 to hash←prime-1 do hash[h]:=0; @ Since there is no chance of the hash table overflowing, the procedure is very simple. After |lookup| has done its work, |cur←hash| will point to the place where the given name was found, or where it should be inserted. 
@p procedure lookup; {finds |cur←name| in the dictionary} var k:0..longest←name; {index into |cur←name|} @!j:0..max←letters; {index into |dictionary|} @!not←found:boolean; {clumsy thing necessary to avoid |goto| statement} begin @<Compute the hash code, |cur←hash|, for |cur←name|@>; not←found:=true; while not←found do begin if cur←hash=0 then cur←hash:=hash←prime-1@+else decr(cur←hash); if hash[cur←hash]=0 then not←found:=false else begin j:=start[hash[cur←hash]]; if start[hash[cur←hash]+1]=j+name←length then begin not←found:=false; for k:=1 to name←length do if dictionary[j+k-1]<>cur←name[k] then not←found:=true; end; end; end; name←ptr:=hash[cur←hash]; end; @ @<Compute the hash...@>= cur←hash:=cur←name[1]; for k:=2 to name←length do cur←hash:=(cur←hash+cur←hash+cur←name[k]) mod hash←prime @ The ``meaning'' of the keyword that begins at |start[k]| in the dictionary is kept in |equiv[k]|. The numeric |equiv| codes are given symbolic meanings by the following definitions. @d comment←code=0 @d check←sum←code=1 @d design←size←code=2 @d design←units←code=3 @d coding←scheme←code=4 @d family←code=5 @d face←code=6 @d seven←bit←safe←flag←code=7 @d header←code= 8 @d font←dimen←code=9 @d lig←table←code=10 @d character←code=11 @d parameter←code=20 @d char←info←code=50 @d width=1 @d height=2 @d depth=3 @d italic=4 @d char←wd←code=char←info←code+width @d char←ht←code=char←info←code+height @d char←dp←code=char←info←code+depth @d char←ic←code=char←info←code+italic @d next←larger←code=55 @d var←char←code=56 @d label←code=70 @d lig←code=71 @d krn←code=72 @d stop←code=73 @<Glo...@>= @!equiv:array[0..max←name←index] of byte; @!cur←code:byte; {equivalent most recently found in |equiv|} @ We have to get the keywords into the hash table and into the dictionary in the first place (sigh). The procedure that does this has the desired |equiv| code as a parameter. 
In order to facilitate \.{WEB} macro writing for the initialization, the keyword being initialized is placed into the last positions of |cur←name|, instead of the first positions. @p procedure enter←name(v:byte); {|cur←name| goes into the dictionary} var k:0..longest←name; begin for k:=1 to name←length do cur←name[k]:=cur←name[k+longest←name-name←length]; {now the name has been shifted into the correct position} lookup; {this sets |cur←hash| to the proper insertion place} hash[cur←hash]:=start←ptr; equiv[start←ptr]:=v; for k:=1 to name←length do begin dictionary[dict←ptr]:=cur←name[k]; incr(dict←ptr); end; incr(start←ptr); start[start←ptr]:=dict←ptr; end; @ Here are the macros to load a name of up to 20 letters into the dictionary. For example, the macro |load5| is used for five-letter keywords. @d tail(#)==enter←name(#) @d t20(#)==cur←name[20]:=#;tail @d t19(#)==cur←name[19]:=#;t20 @d t18(#)==cur←name[18]:=#;t19 @d t17(#)==cur←name[17]:=#;t18 @d t16(#)==cur←name[16]:=#;t17 @d t15(#)==cur←name[15]:=#;t16 @d t14(#)==cur←name[14]:=#;t15 @d t13(#)==cur←name[13]:=#;t14 @d t12(#)==cur←name[12]:=#;t13 @d t11(#)==cur←name[11]:=#;t12 @d t10(#)==cur←name[10]:=#;t11 @d t9(#)==cur←name[9]:=#;t10 @d t8(#)==cur←name[8]:=#;t9 @d t7(#)==cur←name[7]:=#;t8 @d t6(#)==cur←name[6]:=#;t7 @d t5(#)==cur←name[5]:=#;t6 @d t4(#)==cur←name[4]:=#;t5 @d t3(#)==cur←name[3]:=#;t4 @d t2(#)==cur←name[2]:=#;t3 @d t1(#)==cur←name[1]:=#;t2 @d load3==name←length:=3;t18 @d load4==name←length:=4;t17 @d load5==name←length:=5;t16 @d load6==name←length:=6;t15 @d load7==name←length:=7;t14 @d load8==name←length:=8;t13 @d load9==name←length:=9;t12 @d load10==name←length:=10;t11 @d load11==name←length:=11;t10 @d load12==name←length:=12;t9 @d load13==name←length:=13;t8 @d load14==name←length:=14;t7 @d load15==name←length:=15;t6 @d load16==name←length:=16;t5 @d load17==name←length:=17;t4 @d load18==name←length:=18;t3 @d load19==name←length:=19;t2 @d load20==name←length:=20;t1 @ @<Enter all of the names and their 
equivalents, except the parameter names@>= equiv[0]:=comment←code; {this is used after unknown keywords} load8("C")("H")("E")("C")("K")("S")("U")("M")(check←sum←code);@/ load10("D")("E")("S")("I")("G")("N")("S")("I")("Z")("E")(design←size←code);@/ load11("D")("E")("S")("I")("G")("N") ("U")("N")("I")("T")("S")(design←units←code);@/ load12("C")("O")("D")("I")("N")("G") ("S")("C")("H")("E")("M")("E")(coding←scheme←code);@/ load6("F")("A")("M")("I")("L")("Y")(family←code);@/ load4("F")("A")("C")("E")(face←code);@/ load16("S")("E")("V")("E")("N")("B")("I")("T")@/@t\hskip2em@> ("S")("A")("F")("E")("F")("L")("A")("G")(seven←bit←safe←flag←code);@/ load6("H")("E")("A")("D")("E")("R")(header←code);@/ load9("F")("O")("N")("T")("D")("I")("M")("E")("N")(font←dimen←code);@/ load8("L")("I")("G")("T")("A")("B")("L")("E")(lig←table←code);@/ load9("C")("H")("A")("R")("A")("C")("T")("E")("R")(character←code);@/ load9("P")("A")("R")("A")("M")("E")("T")("E")("R")(parameter←code);@/ load6("C")("H")("A")("R")("W")("D")(char←wd←code);@/ load6("C")("H")("A")("R")("H")("T")(char←ht←code);@/ load6("C")("H")("A")("R")("D")("P")(char←dp←code);@/ load6("C")("H")("A")("R")("I")("C")(char←ic←code);@/ load10("N")("E")("X")("T")("L")("A")("R")("G")("E")("R")(next←larger←code);@/ load7("V")("A")("R")("C")("H")("A")("R")(var←char←code);@/ load3("T")("O")("P")(var←char←code+1);@/ load3("M")("I")("D")(var←char←code+2);@/ load3("B")("O")("T")(var←char←code+3);@/ load3("R")("E")("P")(var←char←code+4);@/ load3("E")("X")("T")(var←char←code+4); {compatibility with older \.{PL} format} load7("C")("O")("M")("M")("E")("N")("T")(comment←code);@/ load5("L")("A")("B")("E")("L")(label←code);@/ load3("L")("I")("G")(lig←code);@/ load3("K")("R")("N")(krn←code);@/ load4("S")("T")("O")("P")(stop←code); @ @<Enter the parameter names@>= load5("S")("L")("A")("N")("T")(parameter←code+1);@/ load5("S")("P")("A")("C")("E")(parameter←code+2);@/ load7("S")("T")("R")("E")("T")("C")("H")(parameter←code+3);@/ 
load6("S")("H")("R")("I")("N")("K")(parameter←code+4);@/ load7("X")("H")("E")("I")("G")("H")("T")(parameter←code+5);@/ load4("Q")("U")("A")("D")(parameter←code+6);@/ load10("E")("X")("T")("R")("A")("S")("P")("A")("C")("E")(parameter←code+7);@/ load4("N")("U")("M")("1")(parameter←code+8);@/ load4("N")("U")("M")("2")(parameter←code+9);@/ load4("N")("U")("M")("3")(parameter←code+10);@/ load6("D")("E")("N")("O")("M")("1")(parameter←code+11);@/ load6("D")("E")("N")("O")("M")("2")(parameter←code+12);@/ load4("S")("U")("P")("1")(parameter←code+13);@/ load4("S")("U")("P")("2")(parameter←code+14);@/ load4("S")("U")("P")("3")(parameter←code+15);@/ load4("S")("U")("B")("1")(parameter←code+16);@/ load4("S")("U")("B")("2")(parameter←code+17);@/ load7("S")("U")("P")("D")("R")("O")("P")(parameter←code+18);@/ load7("S")("U")("B")("D")("R")("O")("P")(parameter←code+19);@/ load6("D")("E")("L")("I")("M")("1")(parameter←code+20);@/ load6("D")("E")("L")("I")("M")("2")(parameter←code+21);@/ load10("A")("X")("I")("S")("H")("E")("I")("G")("H")("T")(parameter←code+22);@/ load20("D")("E")("F")("A")("U")("L")("T")("R")("U")("L")("E")@/@t\hskip2em@> ("T")("H")("I")("C")("K")("N")("E")("S")("S")(parameter←code+8);@/ load13("B")("I")("G")("O")("P") ("S")("P")("A")("C")("I")("N")("G")("1")(parameter←code+9);@/ load13("B")("I")("G")("O")("P") ("S")("P")("A")("C")("I")("N")("G")("2")(parameter←code+10);@/ load13("B")("I")("G")("O")("P") ("S")("P")("A")("C")("I")("N")("G")("3")(parameter←code+11);@/ load13("B")("I")("G")("O")("P") ("S")("P")("A")("C")("I")("N")("G")("4")(parameter←code+12);@/ load13("B")("I")("G")("O")("P") ("S")("P")("A")("C")("I")("N")("G")("5")(parameter←code+13);@/ @ When a left parenthesis has been scanned, the following routine is used to interpret the keyword that follows, and to store the equivalent value in |cur←code|. 
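The parameter names entered above all map onto \.{FONTDIMEN} positions by an offset from |parameter←code|, and some positions carry two names: \.{NUM1} and \.{DEFAULTRULETHICKNESS} are both entered with |parameter←code+8|. The following Python sketch shows the mapping for a few of the names; the dictionary \.{offsets} and the function \.{param\_index} are hypothetical, and the list is deliberately incomplete.

```python
PARAMETER_CODE = 20     # the value of parameter_code in the text

# A few of the names and offsets entered above (not the complete list).
offsets = {
    "SLANT": 1, "SPACE": 2, "STRETCH": 3, "SHRINK": 4,
    "XHEIGHT": 5, "QUAD": 6, "EXTRASPACE": 7,
    "NUM1": 8, "DEFAULTRULETHICKNESS": 8,
    "NUM2": 9, "BIGOPSPACING1": 9,
}

# equiv code of each name, as enter_name would record it
equiv = {name: PARAMETER_CODE + k for name, k in offsets.items()}


def param_index(name):
    """Parameter number obtained by subtracting parameter_code from the equiv code."""
    return equiv[name] - PARAMETER_CODE
```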
@p procedure get←name; begin incr(loc); incr(level); {pass the left parenthesis} cur←char:=" "; while cur←char=" " do get←next; if (cur←char>")")or(cur←char<"(") then decr(loc); {back up one character} name←length:=0; get←letter←or←digit; {prepare to scan the name} while cur←char<>" " do begin if name←length=longest←name then cur←name[1]:="X" {force error} else incr(name←length); cur←name[name←length]:=cur←char; get←letter←or←digit; end; lookup; if name←ptr=0 then err←print('Sorry, I don''t know that property name'); @.Sorry, I don't know...@> cur←code:=equiv[name←ptr]; end; @* Scanning numeric data. The next thing we need is a trio of subroutines to read the one-byte, four-byte, and real numbers that may appear as property values. These subroutines are careful to stick to numbers between $-2↑{31}$ and $2↑{31}-1$, inclusive, so that a computer with two's complement 32-bit arithmetic will not be interrupted by overflow. @ The first number scanner, which returns a one-byte value, surely has no problems of arithmetic overflow. @p function get←byte:byte; {scans a one-byte property value} var acc:integer; {an accumulator} @!t:ASCII←code; {the type of value to be scanned} begin repeat get←next; until cur←char<>" "; {skip the blanks before the type code} t:=cur←char; acc:=0; repeat get←next; until cur←char<>" "; {skip the blanks after the type code} if t="C" then @<Scan an ASCII character code@> else if t="D" then @<Scan a small decimal number@> else if t="O" then @<Scan a small octal number@> else if t="H" then @<Scan a small hexadecimal number@> else if t="F" then @<Scan a face code@> else skip←error('You need "C" or "D" or "O" or "H" or "F" here'); @.You need "C" or "D" ...here@> cur←char:=" "; get←byte:=acc; end; @ The |get←next| routine converts lower case to upper case, but it leaves the character in the buffer, so we can unconvert it. 
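A small Python sketch of this idea follows, assuming an ASCII host so that |xord| is the identity; the function name \.{scan\_c\_value} is hypothetical. The upper-case copy plays the role of |cur←char| and is used only for the validity test, while the returned code comes from the untouched buffer.

```python
def scan_c_value(buffer, loc):
    """Return the code of buffer[loc] as a "C" value, keeping its original case."""
    cur_char = buffer[loc].upper()        # what the upper-casing scanner reports
    code = ord(cur_char)
    # Visible ASCII only (octal 41 through 176), and no parentheses.
    if 0o41 <= code <= 0o176 and cur_char not in "()":
        return ord(buffer[loc])           # the unconverted character
    raise ValueError('"C" value must be standard ASCII and not a paren')
```

For instance, the property value \.{C a} yields 97, the code of the lower-case letter, even though the scanner has reported an upper-case \.{A}.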
@<Scan an ASCII...@>= if (cur←char>=@'41)and(cur←char<=@'176)and ((cur←char<"(")or(cur←char>")")) then acc:=xord[buffer[loc]] else skip←error('"C" value must be standard ASCII and not a paren') @:C value}\.{"C" value must be...@> @ @<Scan a small dec...@>= begin while (cur←char>="0")and(cur←char<="9") do begin acc:=acc*10+cur←char-"0"; if acc>255 then begin skip←error('This value shouldn''t exceed 255'); @.This value shouldn't...@> acc:=0; cur←char:=" "; end else get←next; end; backup; end @ @<Scan a small oct...@>= begin while (cur←char>="0")and(cur←char<="7") do begin acc:=acc*8+cur←char-"0"; if acc>255 then begin skip←error('This value shouldn''t exceed ''377'); @.This value shouldn't...@> acc:=0; cur←char:=" "; end else get←next; end; backup; end @ @<Scan a small hex...@>= begin while ((cur←char>="0")and(cur←char<="9"))or ((cur←char>="A")and(cur←char<="F")) do begin if cur←char>="A" then cur←char:=cur←char+"0"+10-"A"; acc:=acc*16+cur←char-"0"; if acc>255 then begin skip←error('This value shouldn''t exceed "FF'); @.This value shouldn't...@> acc:=0; cur←char:=" "; end else get←next; end; backup; end @ @<Scan a face...@>= begin if cur←char="B" then acc:=2 else if cur←char="L" then acc:=4 else if cur←char<>"M" then acc:=18; get←next; if cur←char="I" then incr(acc) else if cur←char<>"R" then acc:=18; get←next; if cur←char="C" then acc:=acc+6 else if cur←char="E" then acc:=acc+12 else if cur←char<>"R" then acc:=18; if acc>=18 then begin skip←error('Illegal face code, I changed it to MRR'); @.Illegal face code...@> acc:=0; end; end @ The routine that scans a four-byte value puts its output into |cur←bytes|, which is a record containing (yes, you guessed it) four bytes. 
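The face-code arithmetic above adds three independent contributions, one per letter. Here is a sketch of the same table in Python; the function name \.{face\_code} and the use of lookup tables are choices made for this illustration, not the program's method.

```python
def face_code(text):
    """Translate a three-letter face code such as 'MRR' or 'BIE' into a number."""
    weight = {"M": 0, "B": 2, "L": 4}         # first letter
    slope = {"R": 0, "I": 1}                  # second letter
    expansion = {"R": 0, "C": 6, "E": 12}     # third letter
    if (len(text) != 3 or text[0] not in weight
            or text[1] not in slope or text[2] not in expansion):
        return 0    # an illegal code is changed to MRR, as the error message says
    return weight[text[0]] + slope[text[1]] + expansion[text[2]]
```

The largest legal result is 17 (for \.{LIE}), which is why the scanner can use 18 as its internal error marker.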
@<Types...@>= @!four←bytes=record @!b0:byte;@+@!b1:byte;@+@!b2:byte;@+@!b3:byte;@+end; @ @d c0==cur←bytes.b0 @d c1==cur←bytes.b1 @d c2==cur←bytes.b2 @d c3==cur←bytes.b3 @<Glob...@>= cur←bytes:four←bytes; {a four-byte accumulator} @ Since the |get←four←bytes| routine is used very infrequently, no attempt has been made to make it fast; we only want it to work. @p procedure get←four←bytes; {scans an octal or hex constant and sets |cur←bytes|} var c:integer; {leading byte} @!r:integer; {radix} @!q:integer; {|256/r|} begin repeat get←next; until cur←char<>" "; {skip the blanks before the type code} r:=0; c0:=0; c1:=0; c2:=0; c3:=0; {start with the accumulator zero} if cur←char="H" then r:=16 else if cur←char="O" then r:=8 else skip←error('An octal ("O") or hex ("H") value is needed here'); @.An octal ("O") or hex ("H")...@> if r>0 then begin q:=256 div r; repeat get←next; until cur←char<>" "; {skip the blanks after the type code} while ((cur←char>="0")and(cur←char<="9"))or@| ((cur←char>="A")and(cur←char<="F")) do @<Multiply by |r|, add |cur←char-"0"|, and |get←next|@>; end; end; @ @<Multiply by |r|...@>= begin if cur←char>="A" then cur←char:=cur←char+"0"+10-"A"; c:=(r*c0)+(c1 div q); if c>255 then begin c0:=0; c1:=0; c2:=0; c3:=0; if r=8 then skip←error('Sorry, the maximum octal value is O 37777777777') @.Sorry, the maximum octal...@> else skip←error('Sorry, the maximum hex value is H FFFFFFFF'); @.Sorry, the maximum hex...@> end else if cur←char>="0"+r then skip←error('Illegal digit') @.Illegal digit@> else begin c0:=c; c1:=(r*(c1 mod q))+(c2 div q); c2:=(r*(c2 mod q))+(c3 div q); c3:=(r*(c3 mod q))+cur←char-"0"; get←next; end; end; @ The remaining scanning routine is the most interesting. It scans a real constant and returns the nearest |fix←word| approximation to that constant. A |fix←word| is a 32-bit integer that represents a real value that has been multiplied by $2↑{20}$.
Since \.{PLtoTF} restricts the magnitude of reals to 1024, the |fix←word| will have a magnitude less than $2↑{30}$. @d unity==@'4000000 {$2↑{20}$, the |fix←word| 1.0} @<Types...@>= @!fix←word=integer; {a scaled real value with 20 bits of fraction} @ When a real value is desired, we might as well treat `\.D' and `\.R' formats as if they were identical. @p function get←fix:fix←word; {scans a real property value} var negative:boolean; {was there a minus sign?} @!acc:integer; {an accumulator} @!int←part:integer; {the integer part} @!j:0..7; {the number of decimal places stored} begin repeat get←next; until cur←char<>" "; {skip the blanks before the type code} negative:=false; acc:=0; {start with the accumulators zero} if (cur←char<>"R")and(cur←char<>"D") then skip←error('An "R" or "D" value is needed here') @.An "R" or "D" ... needed here@> else begin @<Scan the blanks and/or signs after the type code@>; while (cur←char>="0") and (cur←char<="9") do @<Multiply by 10, add |cur←char-"0"|, and |get←next|@>; int←part:=acc; acc:=0; if cur←char="." then @<Scan the fraction part and put it in |acc|@>; if (acc>=unity)and(int←part=1023) then skip←error('Real constants must be less than 1024') @.Real constants must be...@> else acc:=int←part*unity+acc; end; if negative then get←fix:=-acc@+else get←fix:=acc; end; @ @<Scan the blanks...@>= repeat get←next; if cur←char="-" then begin cur←char:=" "; negative:=true; end else if cur←char="+" then cur←char:=" "; until cur←char<>" " @ @<Multiply by 10...@>= begin acc:=acc*10+cur←char-"0"; if acc>=1024 then begin skip←error('Real constants must be less than 1024'); @.Real constants must be...@> acc:=0; cur←char:=" "; end else get←next; end @ To scan the fraction $.d←1d←2\ldots\,$, we keep track of up to seven of the digits $d←j$. A correct result is obtained if we first compute $f↑\prime=\lfloor 2↑{21}(d←1\ldots d←j)/10↑j\rfloor$, after which $f=\lfloor(f↑\prime+1)/2\rfloor$. It is possible to have $f=1.0$. 
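The rounding rule can be checked numerically. The sketch below, in Python, accumulates the digits from right to left with each digit pre-multiplied by $2↑{21}$ and then performs the final halving with rounding; the function name \.{scan\_fraction} is hypothetical, and taking the digits as a string is an assumption made for this illustration.

```python
UNITY = 2**20     # the fix_word 1.0


def scan_fraction(digits):
    """Round the fraction .d1d2... to the nearest fix_word, using at most seven digits."""
    kept = [2**21 * int(ch) for ch in digits[:7]]   # 2**21 times each digit
    acc = 0
    for term in reversed(kept):      # last stored digit first
        acc = term + acc // 10
    return (acc + 10) // 20          # divide by 10, then halve with rounding
```

The fraction \.{.5} gives exactly $2↑{19}$, \.{.1} gives 104858 (the true value is about 104857.6), and \.{.9999999} rounds up to |unity|, which confirms that $f=1.0$ can occur.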
@<Glob...@>= @!fraction←digits:array[1..7] of integer; {$2↑{21}$ times $d←j$} @ @<Scan the frac...@>= begin j:=0; get←next; while (cur←char>="0")and(cur←char<="9") do begin if j<7 then begin incr(j); fraction←digits[j]:=@'10000000*(cur←char-"0"); end; get←next; end; acc:=0; while j>0 do begin acc:=fraction←digits[j]+(acc div 10); decr(j); end; acc:=(acc+10) div 20; end @* Storing the property values. When property values have been found, they are squirreled away in a bunch of arrays. The header information is unpacked into bytes in an array called |header←bytes|. The ligature/kerning program is stored in an array of type |four←bytes|; note that such a program is at most 511 steps long, since a label after step 255 may begin a program of length 256. Another |four←bytes| array holds the specifications of extensible characters. The kerns and parameters are stored in separate arrays of |fix←word| values. Instead of storing the design size in the header array, we will keep it in a |fix←word| variable until the last minute. The number of units in the design size is also kept in a |fix←word|. 
@<Glob...@>= @!header←bytes:array[header←index] of byte; {the header block} @!header←ptr:header←index; {the number of header bytes in use} @!design←size:fix←word; {the design size} @!design←units:fix←word; {reciprocal of the scaling factor} @!seven←bit←safe←flag:boolean; {does the file claim to be seven-bit-safe?} @!lig←kern:array[0..511] of four←bytes; {the ligature program} @!nl:0..511; {the number of ligature/kern instructions so far} @!unused←label:boolean; {was the last lig/kern step a label?} @!kern:array[0..256] of fix←word; {the distinct kerning amounts} @!nk:0..256; {the number of entries of |kern|} @!exten:array[0..255] of four←bytes; {extensible character specs} @!ne:0..256; {the number of extensible characters} @!param:array[1..max←param←words] of fix←word; {\.{fontdimen} parameters} @!np:0..max←param←words; {the largest parameter set nonzero} @!check←sum←specified:boolean; {did the user name the check sum?} @ @<Types...@>= @!header←index=0..max←header←bytes; @ @<Local...@>= @!d:header←index; {an index into |header←bytes|} @ We start by setting up the default values. @d check←sum←loc=0 @d design←size←loc=4 @d coding←scheme←loc=8 @d family←loc=coding←scheme←loc+40 @d seven←flag←loc=family←loc+20 @d face←loc=seven←flag←loc+3 @<Set init...@>= check←sum←specified:=false; for d:=0 to 18*4-1 do header←bytes[d]:=0; header←bytes[8]:=11; header←bytes[9]:="U"; header←bytes[10]:="N"; header←bytes[11]:="S"; header←bytes[12]:="P"; header←bytes[13]:="E"; header←bytes[14]:="C"; header←bytes[15]:="I"; header←bytes[16]:="F"; header←bytes[17]:="I"; header←bytes[18]:="E"; header←bytes[19]:="D"; @.UNSPECIFIED@> for d:=family←loc to family←loc+11 do header←bytes[d]:=header←bytes[d-40]; design←size:=10*unity; design←units:=unity; seven←bit←safe←flag:=false;@/ header←ptr:=18*4; nl:=0; unused←label:=false; nk:=0; ne:=0; np:=0; @ Most of the dimensions, however, go into the |memory| array. 
There are at most 257 widths, 257 heights, 257 depths, and 257 italic corrections, since the value 0 is required but it need not be used. So |memory| has room for 1028 entries, each of which is a |fix←word|. An auxiliary table called |link| is used to link these words together in linear lists, so that sorting and other operations can be done conveniently. We also add four ``list head'' words to the |memory| and |link| arrays; these are in locations |width| through |italic|, i.e., 1 through 4. For example, |link[height]| points to the smallest element in the sorted list of distinct heights that have appeared so far, and |memory[height]| is the number of distinct heights. @d mem←size=1028+4 {number of nonzero memory addresses} @<Types...@>= @!pointer=0..mem←size; {an index into memory} @ The arrays |char←wd|, |char←ht|, |char←dp|, and |char←ic| contain pointers to the |memory| array entries where the corresponding dimensions appear. Two other arrays, |char←tag| and |char←remainder|, hold the other information that \.{TFM} files pack into a |char←info←word|. 
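The list-head layout of |memory| and |link| can be modelled in a few lines of Python. This is only a sketch: the helper names \.{insert\_sorted} and \.{walk} are hypothetical, the arrays grow on demand instead of having |mem←size| entries, and the special treatment of zero and of memory overflow is left out.

```python
INFINITY = 2**31 - 1                         # the value kept in memory[0]
WIDTH, HEIGHT, DEPTH, ITALIC = 1, 2, 3, 4    # locations of the list heads

memory = [INFINITY, 0, 0, 0, 0]   # memory[head] counts the entries of that list
link = [0, 0, 0, 0, 0]            # link[head] = 0 means the list is empty


def insert_sorted(head, d):
    """Put d into the list at head unless it is already there; return its location."""
    p = head
    while d >= memory[link[p]]:    # the sentinel in memory[0] stops the scan
        p = link[p]
    if p != head and memory[p] == d:
        return p                   # d was already present
    memory.append(d)
    link.append(link[p])
    link[p] = len(memory) - 1
    memory[head] += 1
    return link[p]


def walk(head):
    """List the values reachable from a list head, smallest first."""
    values, p = [], link[head]
    while p != 0:
        values.append(memory[p])
        p = link[p]
    return values


first = insert_sorted(HEIGHT, 30)
insert_sorted(HEIGHT, 10)
again = insert_sorted(HEIGHT, 30)   # a repeated value creates no new node
insert_sorted(WIDTH, 5)
```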
@d no←tag=0 {vanilla character} @d lig←tag=1 {character has a ligature/kerning program} @d list←tag=2 {character has a successor in a charlist} @d ext←tag=3 {character is extensible} @<Glob...@>= @!memory:array[pointer] of fix←word; {character dimensions and kerns} @!mem←ptr:pointer; {largest |memory| word in use} @!link:array[pointer] of pointer; {to make lists of |memory| items} @!char←wd:array[byte] of pointer; {pointers to the widths} @!char←ht:array[byte] of pointer; {pointers to the heights} @!char←dp:array[byte] of pointer; {pointers to the depths} @!char←ic:array[byte] of pointer; {pointers to italic corrections} @!char←tag:array[byte] of no←tag..ext←tag; {character tags} @!char←remainder:array[byte] of 0..255; {pointers to ligature labels, next larger characters, or extensible characters} @ @<Local...@>= @!c:byte; {runs through all character codes} @ @<Set init...@>= for c:=0 to 255 do begin char←wd[c]:=0; char←ht[c]:=0; char←dp[c]:=0; char←ic[c]:=0;@/ char←tag[c]:=no←tag; char←remainder[c]:=0; end; memory[0]:=@'17777777777; {an ``infinite'' element at the end of the lists} memory[width]:=0; link[width]:=0; {width list is empty} memory[height]:=0; link[height]:=0; {height list is empty} memory[depth]:=0; link[depth]:=0; {depth list is empty} memory[italic]:=0; link[italic]:=0; {italic list is empty} mem←ptr:=italic; @ As an example of these data structures, let us consider the simple routine that inserts a potentially new element into one of the dimension lists. The first parameter indicates the list head (i.e., |h=width| for the width list, etc.); the second parameter is the value that is to be inserted into the list if it is not already present. The procedure returns the value of the location where the dimension appears in |memory|. The fact that |memory[0]| is larger than any legal dimension makes the algorithm particularly short. We do have to handle two somewhat subtle situations. 
A width of zero must be put into the list, so that a zero-width character in the font will not appear to be nonexistent (i.e., so that its |char←wd| index will not be zero), but this does not need to be done for heights, depths, or italic corrections. Furthermore, it is necessary to test for memory overflow even though we have provided room for the maximum number of different dimensions in any legal font, since the \.{PL} file might foolishly give any number of different sizes to the same character. @p function sort←in(@!h:pointer;@!d:fix←word):pointer; {inserts into list} var p:pointer; {the current node of interest} begin if (d=0)and(h<>width) then sort←in:=0 else begin p:=h; while d>=memory[link[p]] do p:=link[p]; if (d=memory[p])and(p<>h) then sort←in:=p else if mem←ptr=mem←size then begin err←print('Memory overflow: more than 1028 widths, etc'); @.Memory overflow...@> print←ln('Congratulations! It''s hard to make this error.'); sort←in:=p; end else begin incr(mem←ptr); memory[mem←ptr]:=d; link[mem←ptr]:=link[p]; link[p]:=mem←ptr; incr(memory[h]); sort←in:=mem←ptr; end; end; end; @ When these lists of dimensions are eventually written to the \.{TFM} file, we may have to do some rounding of values, because the \.{TFM} file allows at most 256 widths, 16 heights, 16 depths, and 64 italic corrections. The following procedure takes a given list head |h| and a given dimension |d|, and returns the minimum $m$ such that the elements of the list can be covered by $m$ intervals of width $d$. It also sets |next←d| to the smallest value $d↑\prime>d$ such that the covering found by this procedure would be different. In particular, if $d=0$ it computes the number of elements of the list, and sets |next←d| to the smallest distance between two list elements. (The covering by intervals of width |next←d| is not guaranteed to have fewer than $m$ elements, but in practice this seems to happen most of the time.) 
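As a cross-check of this description, here is a sketch in Python that works on an ordinary sorted list instead of the |memory| and |link| arrays. The name \.{min\_cover} mirrors the procedure, but returning |next←d| as a second result instead of setting a global is a choice made for this illustration.

```python
INFINITY = 2**31 - 1


def min_cover(values, d):
    """Return (m, next_d) for a sorted list of distinct integers and interval width d."""
    vals = list(values) + [INFINITY]    # the sentinel plays the role of memory[0]
    m = 0
    next_d = INFINITY
    i = 0
    while i < len(values):
        m += 1
        low = vals[i]                   # least element of the current interval
        while vals[i + 1] <= low + d:   # absorb everything within distance d
            i += 1
        i += 1                          # first element outside the interval
        if vals[i] - low < next_d:
            next_d = vals[i] - low
    return m, next_d
```

With the values 0, 10, 12, 30, the call with $d=2$ reports three intervals and suggests 10 as the next width; trying $d=10$ still needs three intervals, which illustrates the parenthetical remark above.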
@<Glob...@>= @!next←d:fix←word; {the next larger interval that is worth trying} @ Once again we can make good use of the fact that |memory[0]| is ``infinite.'' @p function min←cover(@!h:pointer;@!d:fix←word):integer; var p:pointer; {the current node of interest} @!l:fix←word; {the least element covered by the current interval} @!m:integer; {the current size of the cover being generated} begin m:=0; p:=link[h]; next←d:=memory[0]; while p<>0 do begin incr(m); l:=memory[p]; while memory[link[p]]<=l+d do p:=link[p]; p:=link[p]; if memory[p]-l<next←d then next←d:=memory[p]-l; end; min←cover:=m; end; @ The following procedure uses |min←cover| to determine the smallest $d$ such that a given list can be covered with at most a given number of intervals. @p function shorten(@!h:pointer;m:integer):fix←word; {finds best way to round} var d:fix←word; {the current trial interval length} @!k:integer; {the size of a minimum cover} begin if memory[h]>m then begin k:=min←cover(h,0); d:=next←d; {now the answer is at least |d|} repeat d:=d+d; k:=min←cover(h,d); until k<=m; {first we ascend rapidly until finding the range} d:=d div 2; k:=min←cover(h,d); {now we run through the feasible steps} while k>m do begin d:=next←d; k:=min←cover(h,d); end; shorten:=d; end else shorten:=0; end; @ When we are nearly ready to output the \.{TFM} file, we will set |index[p]:=k| if the dimension in |memory[p]| is being rounded to the |k|th element of its list. @<Glob...@>= @!index:array[pointer] of byte; @ Here is the procedure that sets the |index| values. It also shortens the list so that there is only one element per covering interval; the remaining elements are the midpoints of their clusters. 
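The clustering step can be sketched in Python on a plain sorted list. The function name \.{set\_indices} mirrors the procedure, but the return values are assumptions of this illustration: a list of cluster representatives and a mapping from each original value to its 1-based cluster number (the program indexes by node location instead). Nonnegative values are assumed, since Python's \.{//} rounds toward minus infinity whereas \PASCAL's |div| truncates toward zero.

```python
def set_indices(values, d):
    """Cluster a sorted list of distinct integers into intervals of width d."""
    reps = []          # one representative (the midpoint) per cluster
    index_of = {}      # original value -> 1-based cluster number
    i = 0
    n = len(values)
    while i < n:
        low = values[i]                  # least value in the current interval
        m = len(reps) + 1
        index_of[low] = m
        while i + 1 < n and values[i + 1] <= low + d:
            i += 1
            index_of[values[i]] = m
        reps.append((low + values[i]) // 2)   # midpoint of least and greatest
        i += 1
    return reps, index_of
```

For the values 0, 10, 12, 30 and $d=2$, the middle pair collapses into one entry with value 11.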
@p procedure set←indices(@!h:pointer;@!d:fix←word); {reduces and indexes a list} var p:pointer; {the current node of interest} @!q:pointer; {trails one step behind |p|} @!m:byte; {index number of nodes in the current interval} @!l:fix←word; {least value in the current interval} begin q:=h; p:=link[q]; m:=0; while p<>0 do begin incr(m); l:=memory[p]; index[p]:=m; while memory[link[p]]<=l+d do begin p:=link[p]; index[p]:=m; end; link[q]:=p; memory[p]:=(l+memory[p]) div 2; q:=p; p:=link[p]; end; memory[h]:=m; end; @* The input phase. We're ready now to read and parse the \.{PL} file, storing property values as we go. @<Glob...@>= @!c:byte; {the current character or byte being processed} @ @<Read all the input@>= cur←char:=" "; repeat while cur←char=" " do get←next; if cur←char="(" then @<Read a font property value@> else if (cur←char=")")and not input←has←ended then begin err←print('Extra right parenthesis'); incr(loc); cur←char:=" "; end @.Extra right parenthesis@> else if not input←has←ended then junk←error; until input←has←ended @ The |junk←error| routine just referred to is called when something appears in the forbidden area between properties of a property list. @p procedure junk←error; {gets past no man's land} begin err←print('There''s junk here that is not in parentheses'); @.There's junk here...@> skip←to←paren; end; @ For each font property, we are supposed to read the data from the left parenthesis that is the current value of |cur←char| to the right parenthesis that matches it in the input. The main complication is to recover with reasonable grace from various error conditions that might arise. 
@<Read a font property value@>= begin get←name; if cur←code=comment←code then skip←to←end←of←item else if cur←code>character←code then flush←error('This property name doesn''t belong on the outer level') @.This property name doesn't belong...@> else begin @<Read the font property value specified by |cur←code|@>; finish←the←property; end; end @ @<Read the font property value spec...@>= case cur←code of check←sum←code: begin check←sum←specified:=true; read←four←bytes(check←sum←loc); end; design←size←code: @<Read the design size@>; design←units←code: @<Read the design units@>; coding←scheme←code: read←BCPL(coding←scheme←loc,40); family←code: read←BCPL(family←loc,20); face←code:header←bytes[face←loc]:=get←byte; seven←bit←safe←flag←code: @<Read the seven-bit-safe flag@>; header←code: @<Read an indexed header word@>; font←dimen←code: @<Read font parameter list@>; lig←table←code: read←lig←kern; character←code: read←char←info; end @ The |case| statement just given makes use of two subroutines that we haven't defined yet. The first of these puts a 32-bit octal or hexadecimal quantity into four specified bytes of the header block. @p procedure read←four←bytes(l:header←index); begin get←four←bytes; header←bytes[l]:=c0; header←bytes[l+1]:=c1; header←bytes[l+2]:=c2; header←bytes[l+3]:=c3; end; @ The second little procedure is used to scan a string and to store it in the ``{\mc BCPL} format'' required by \.{TFM} files. The string is supposed to contain at most |n| bytes, including the first byte (which holds the length of the rest of the string).
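The byte layout of such a string can be sketched in Python as follows. The function name \.{to\_bcpl} is hypothetical, and the sketch takes an already scanned string as its argument, so the character-by-character input and the error message for overlong strings are left out.

```python
def to_bcpl(text, n):
    """Pack text into n bytes: a length byte, the characters, then null padding."""
    kept = text[:n - 1]                  # at most n-1 characters follow the length byte
    body = [ord(ch) for ch in kept]
    return [len(kept)] + body + [0] * (n - 1 - len(kept))


packed = to_bcpl("UNSPECIFIED", 40)      # the default coding scheme, in its 40 bytes
```

The result always has exactly |n| bytes, so a coding scheme occupies 40 bytes of the header and a family name 20, whatever the length of the string.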
@p procedure read←BCPL(l:header←index;n:byte); var k:header←index; begin k:=l; while cur←char=" " do get←next; while (cur←char<>"(")and(cur←char<>")") do begin if k<l+n then incr(k); if k<l+n then header←bytes[k]:=cur←char; get←next; end; if k=l+n then begin err←print('String is too long; its first ',n-1:1, @.String is too long...@> ' characters will be kept'); decr(k); end; header←bytes[l]:=k-l; while k<l+n-1 do {tidy up the remaining bytes by setting them to nulls} begin incr(k); header←bytes[k]:=0; end; end; @ @<Read the design size@>= begin next←d:=get←fix; if (next←d<unity)or(next←d>=@'10000000000) then err←print('The design size must be between 1 and 1024') @.The design size must...@> else design←size:=next←d; end @ @<Read the design units@>= begin next←d:=get←fix; if next←d<=0 then err←print('The number of units per design size must be positive') @.The number of units...@> else design←units:=next←d; end @ @<Read the seven-bit-safe...@>= begin while cur←char=" " do get←next; if cur←char="T" then seven←bit←safe←flag:=true else if cur←char="F" then seven←bit←safe←flag:=false else err←print('The flag value should be "TRUE" or "FALSE"'); @.The flag value should be...@> skip←to←paren; end @ @<Read an indexed header word@>= begin c:=get←byte; if c<18 then skip←error('HEADER indices should be 18 or more') @.HEADER indices...@> else if 4*c+4>max←header←bytes then skip←error('This HEADER index is too big for my present table size') @.This HEADER index is too big...@> else begin while header←ptr<4*c do begin header←bytes[header←ptr]:=0; incr(header←ptr); end; read←four←bytes(4*c); header←ptr:=4*c+4; end; end @ The remaining kinds of font property values that need to be read are those that involve property lists on higher levels. Each of these has a loop similar to the one that was used at level zero. Then we put the right parenthesis back so that `|finish←the←property|' will be happy; there is probably a more elegant way to do this. 
@d finish←inner←property←list==begin decr(loc); incr(level); cur←char:=")"; end @<Read font parameter list@>= begin while level=1 do begin while cur←char=" " do get←next; if cur←char="(" then @<Read a parameter value@> else if cur←char=")" then skip←to←end←of←item else junk←error; end; finish←inner←property←list; end @ @<Read a parameter value@>= begin get←name; if cur←code=comment←code then skip←to←end←of←item else if (cur←code<parameter←code)or(cur←code>=char←wd←code) then flush←error('This property name doesn''t belong in a FONTDIMEN list') @.This property name doesn't belong...@> else begin if cur←code=parameter←code then c:=get←byte else c:=cur←code-parameter←code; if c=0 then flush←error('PARAMETER index must not be zero') @.PARAMETER index must not...@> else if c>max←param←words then flush←error('This PARAMETER index is too big for my present table size') @.This PARAMETER index is too big...@> else begin while np<c do begin incr(np); param[np]:=0; end; param[c]:=get←fix; finish←the←property; end; end; end @ @<Read ligature/kern list@>= begin while level=1 do begin while cur←char=" " do get←next; if cur←char="(" then @<Read a ligature/kern command@> else if cur←char=")" then skip←to←end←of←item else junk←error; end; finish←inner←property←list; end @ @<Read a ligature/kern command@>= begin get←name; if cur←code=comment←code then skip←to←end←of←item else if (cur←code<label←code)or(cur←code>stop←code) then flush←error('This property name doesn''t belong in a LIGTABLE list') @.This property name doesn't belong...@> else begin case cur←code of label←code:@<Read a label step@>; lig←code:@<Read a ligature step@>; krn←code:@<Read a kerning step@>; stop←code:@<Read a stop step@>; end;@/ finish←the←property; end; end @ When a character is about to be tagged, we call the following procedure so that an error message is given in case of multiple tags. 
@p procedure check←tag(c:byte); {print error if |c| already tagged} begin case char←tag[c] of no←tag: do←nothing; lig←tag: err←print('This character already appeared in a LIGTABLE LABEL'); @.This character already...@> list←tag: err←print('This character already has a NEXTLARGER spec'); ext←tag: err←print('This character already has a VARCHAR spec'); end; end; @ @<Read a label step@>= begin c:=get←byte; check←tag(c); if nl>255 then err←print('LIGTABLE with more than 255 commands cannot have further labels') @.LIGTABLE with more than 255...@> else begin char←tag[c]:=lig←tag; char←remainder[c]:=nl; unused←label:=true; end; end @ @<Read a ligature step@>= begin lig←kern[nl].b0:=0; lig←kern[nl].b1:=get←byte; lig←kern[nl].b2:=0; lig←kern[nl].b3:=get←byte; if nl=511 then err←print('LIGTABLE should never exceed 511 LIG/KRN commands') @.LIGTABLE should never...@> else incr(nl); unused←label:=false; end @ @d stop←flag=128 {value indicating `\.{STOP}' in a lig/kern program} @d kern←flag=128 {op code for a kern step} @<Globals...@>= @!krn←ptr:0..256; {an index into |kern|} @ @<Read a kerning step@>= begin lig←kern[nl].b0:=0; lig←kern[nl].b1:=get←byte; lig←kern[nl].b2:=kern←flag; kern[nk]:=get←fix; krn←ptr:=0; while kern[krn←ptr]<>kern[nk] do incr(krn←ptr); if krn←ptr=nk then begin if nk<256 then incr(nk) else begin err←print('At most 256 different kerns are allowed'); @.At most 256 different kerns...@> krn←ptr:=255; end; end; lig←kern[nl].b3:=krn←ptr; if nl=511 then err←print('LIGTABLE should never exceed 511 LIG/KRN commands') @.LIGTABLE should never...@> else incr(nl); unused←label:=false; end @ @<Read a stop step@>= begin if nl=0 then err←print('Why STOP? 
You haven''t started') @.Why STOP?...@> else begin if unused←label then begin err←print('STOP after LABEL invalidates the label'); @.STOP after LABEL...@> for c:=0 to 255 do if (char←tag[c]=lig←tag)and(char←remainder[c]=nl) then char←tag[c]:=no←tag; unused←label:=false; end; lig←kern[nl-1].b0:=stop←flag; end; end @ Finally we come to the part of \.{PLtoTF}'s input mechanism that is used most, the processing of individual character data. @<Read character info list@>= begin c:=get←byte; {read the character code that is being specified} @<Print |c| in octal notation@>; while level=1 do begin while cur←char=" " do get←next; if cur←char="(" then @<Read a character property@> else if cur←char=")" then skip←to←end←of←item else junk←error; end; if char←wd[c]=0 then char←wd[c]:=sort←in(width,0); {legitimatize |c|} finish←inner←property←list; end @ @<Read a character prop...@>= begin get←name; if cur←code=comment←code then skip←to←end←of←item else if (cur←code<char←wd←code)or(cur←code>var←char←code) then flush←error('This property name doesn''t belong in a CHARACTER list') @.This property name doesn't belong...@> else begin case cur←code of char←wd←code:char←wd[c]:=sort←in(width,get←fix); char←ht←code:char←ht[c]:=sort←in(height,get←fix); char←dp←code:char←dp[c]:=sort←in(depth,get←fix); char←ic←code:char←ic[c]:=sort←in(italic,get←fix); next←larger←code:begin check←tag(c); char←tag[c]:=list←tag; char←remainder[c]:=get←byte; end; var←char←code:@<Read an extensible recipe for |c|@>; end;@/ finish←the←property; end; end @ @<Read an extensible r...@>= begin if ne=256 then err←print('At most 256 VARCHAR specs are allowed') @.At most 256 VARCHAR specs...@> else begin check←tag(c); char←tag[c]:=ext←tag; char←remainder[c]:=ne;@/ exten[ne].b0:=0; exten[ne].b1:=0; exten[ne].b2:=0; exten[ne].b3:=0; while level=2 do begin while cur←char=" " do get←next; if cur←char="(" then @<Read an extensible piece@> else if cur←char=")" then skip←to←end←of←item else junk←error; end; incr(ne); 
finish←inner←property←list; end; end @ @<Read an extensible p...@>= begin get←name; if cur←code=comment←code then skip←to←end←of←item else if (cur←code<var←char←code+1)or(cur←code>var←char←code+4) then flush←error('This property name doesn''t belong in a VARCHAR list') @.This property name doesn't belong...@> else begin case cur←code-(var←char←code+1) of 0:exten[ne].b0:=get←byte; 1:exten[ne].b1:=get←byte; 2:exten[ne].b2:=get←byte; 3:exten[ne].b3:=get←byte; end;@/ finish←the←property; end; end @ The input routine is now complete except for the following code, which prints a progress report as the file is being read. @p procedure print←octal(c:byte); {prints three octal digits} begin print('''',(c div 64):1,((c div 8) mod 8):1,(c mod 8):1); end; @ @<Print |c| in octal...@>= begin if chars←on←line=8 then begin print←ln(' '); chars←on←line:=1; end else begin if chars←on←line>0 then print(' '); incr(chars←on←line); end; print←octal(c); {progress report} end @* The checking and massaging phase. Once the whole \.{PL} file has been read in, we must check it for consistency and correct any errors. This process consists mainly of running through the characters that exist and seeing if they refer to characters that don't exist. We also compute the true value of |seven←unsafe|; we make sure that the charlists contain no loops; and we shorten the lists of widths, heights, depths, and italic corrections, if necessary, to keep from exceeding the required maximum sizes. 
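The loop check mentioned above can be sketched in a few lines of Python. This is only an illustration and is not part of \.{PLtoTF}: the function name is hypothetical, and the tag values 0, 1, 2, 3 (no tag, lig, list, ext) are an assumption taken from the \.{TFM} format. The scan from |c| follows \.{NEXTLARGER} links only through smaller codes, so each cycle is detected, and broken, at its largest element.

```python
# Assumed tag values, following the TFM format.
NO_TAG, LIG_TAG, LIST_TAG, EXT_TAG = 0, 1, 2, 3


def break_charlist_cycles(char_tag, char_remainder):
    """Break every NEXTLARGER cycle at its largest member.

    Both arguments are 256-entry lists and are modified in place.
    Returns the character codes at which a cycle was broken.
    """
    broken = []
    for c in range(256):
        if char_tag[c] == LIST_TAG:
            g = char_remainder[c]
            # Follow the chain only while it stays below c.
            while g < c and char_tag[g] == LIST_TAG:
                g = char_remainder[g]
            if g == c:  # the chain came back to c, so c closes a cycle
                char_tag[c] = NO_TAG
                broken.append(c)
    return broken
```

A chain such as 1, 2, 3, 1 is cut at character 3, leaving the acyclic list 1, 2, 3.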
@<Glob...@>= @!seven←unsafe:boolean; {do seven-bit characters generate eight-bit ones?} @ @<Correct and check the information@>= @<Make sure the ligature/kerning program ends with `\.{STOP}'@>; seven←unsafe:=false; for c:=0 to 255 do if char←wd[c]<>0 then @<For all characters |g| generated by |c|, make sure that |char←wd[g]| is nonzero, and set |seven←unsafe| if |c<128<=g|@>; if seven←bit←safe←flag and seven←unsafe then print←ln('The font is not really seven-bit-safe!'); @.The font is not...safe@> @<Doublecheck the lig/kern commands and the extensible recipes@>; for c:=0 to 255 do @<Make sure that |c| is not the largest element of a charlist cycle@>; @<Put the width, height, depth, and italic lists into final form@> @ @<Make sure the ligature/kerning program ends...@>= if unused←label then begin for c:=0 to 255 do if (char←tag[c]=lig←tag)and(char←remainder[c]=nl) then char←tag[c]:=no←tag; print←ln('Last LIGTABLE LABEL was not used.'); @.Last LIGTABLE LABEL...@> end; if nl>0 then lig←kern[nl-1].b0:=stop←flag @ The checking that we need in several places is accomplished by two macros that are only slightly tricky. 
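The trick is that the first macro ends with the name of the second, so that a call written with two parenthesized arguments supplies one argument to each. A loose Python analogue, using a closure, might look as follows. It is a sketch only: the names \.{make\_check\_existence}, \.{flags}, and \.{log} are hypothetical, and the constant \.{ZERO\_WIDTH} merely stands in for the pointer that the real program obtains from |sort←in(width,0)|.

```python
ZERO_WIDTH = 1  # assumed stand-in for the pointer returned by sort_in(width, 0)


def octal(c):
    """Format a byte as a quote mark followed by three octal digits."""
    return "'%03o" % c


def make_check_existence(char_wd, flags, c, log):
    """Build a two-stage checker for characters generated by c."""

    def check_existence(g):
        # First stage: note a seven-bit character producing an eight-bit one.
        if g >= 128 and c < 128:
            flags["seven_unsafe"] = True

        def existence_tail(message):
            # Second stage: legitimize g and report, if g had no width.
            if char_wd[g] == 0:
                char_wd[g] = ZERO_WIDTH
                log.append(message + " " + octal(c) + " had no CHARACTER spec.")

        return existence_tail

    return check_existence
```

A call then reads \.{check(g)(message)}, which mirrors the way the two macros are chained.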
@d existence←tail(#)==begin char←wd[g]:=sort←in(width,0); print(#,' '); print←octal(c); print←ln(' had no CHARACTER spec.'); end; end @d check←existence(#)==begin g:=#; if (g>=128)and(c<128) then seven←unsafe:=true; if char←wd[g]=0 then existence←tail @<For all characters |g| generated by |c|...@>= case char←tag[c] of no←tag: do←nothing; lig←tag: @<Check ligature program of |c|@>; list←tag: check←existence(char←remainder[c])('The character NEXTLARGER than'); @.The character NEXTLARGER...@> ext←tag:@<Check the pieces of |exten[c]|@>; end @ @<Glo...@>= @!lig←ptr:0..511; {an index into |lig←kern|} @ @<Check lig...@>= begin if char←wd[c]=0 then begin print('There''s a LABEL but no CHARACTER spec for '); @.There's a LABEL but...@> print←octal(c); print←ln('.'); char←wd[c]:=sort←in(width,0); end; lig←ptr:=char←remainder[c]; repeat if lig←kern[lig←ptr].b2<kern←flag then begin check←existence(lig←kern[lig←ptr].b1)('LIG character generated by'); check←existence(lig←kern[lig←ptr].b3)('LIG character generated by'); @.LIG character generated...@> end else check←existence(lig←kern[lig←ptr].b1)('KRN character generated by'); @.KRN character generated...@> incr(lig←ptr); until lig←kern[lig←ptr-1].b0=stop←flag; end @ @<Check the pieces...@>= begin if exten[char←remainder[c]].b0>0 then check←existence(exten[char←remainder[c]].b0)('TOP piece of character'); @.TOP piece of character...@> if exten[char←remainder[c]].b1>0 then check←existence(exten[char←remainder[c]].b1)('MID piece of character'); @.MID piece of character...@> if exten[char←remainder[c]].b2>0 then check←existence(exten[char←remainder[c]].b2)('BOT piece of character'); @.BOT piece of character...@> check←existence(exten[char←remainder[c]].b3)('REP piece of character'); @.REP piece of character...@> end @ The lig/kern program may still contain references to nonexistent characters, if parts of that program are never used. 
Similarly, there may be extensible characters that are never used, because they were overridden by \.{NEXTLARGER}, say. This would produce an invalid \.{TFM} file; so we must fix such errors. @d double←check←tail(#)==@t\1@>if char←wd[0]=0 then char←wd[0]:=sort←in(width,0); print('Unused ',#,' refers to nonexistent character '); print←octal(c); print←ln('!'); end; end @d double←check←lig(#)==begin c:=lig←kern[lig←ptr].#; if char←wd[c]=0 then begin lig←kern[lig←ptr].#:=0; double←check←tail @d double←check←ext(#)==begin c:=exten[g].#; if c>0 then if char←wd[c]=0 then begin exten[g].#:=0; double←check←tail @d double←check←rep(#)==begin c:=exten[g].#; if char←wd[c]=0 then begin exten[g].#:=0; double←check←tail @<Doublecheck...@>= if nl>0 then for lig←ptr:=0 to nl-1 do if lig←kern[lig←ptr].b2<kern←flag then double←check←lig(b3)('LIG step') else double←check←lig(b1)('KRN step'); @.Unused LIG step...@> @.Unused KRN step...@> if ne>0 then for g:=0 to ne-1 do begin double←check←ext(b0)('VARCHAR TOP'); double←check←ext(b1)('VARCHAR MID'); double←check←ext(b2)('VARCHAR BOT'); double←check←rep(b3)('VARCHAR REP'); @.Unused VARCHAR...@> end @ @<Make sure that |c| is not the largest element of a charlist cycle@>= if char←tag[c]=list←tag then begin g:=char←remainder[c]; while (g<c)and(char←tag[g]=list←tag) do g:=char←remainder[g]; if g=c then begin char←tag[c]:=no←tag; print('A cycle of NEXTLARGER characters has been broken at '); @.A cycle of NEXTLARGER...@> print←octal(c); print←ln('.'); end; end @ @<Glob...@>= @!delta:fix←word; {size of the intervals needed for rounding} @ @d round←message(#)==if delta>0 then print←ln('I had to round some ', @.I had to round...@> #,'s by',(((delta+1) div 2)/@'4000000):1:7,' units.') @<Put the width, height, depth, and italic lists into final form@>= delta:=shorten(width,255); set←indices(width,delta); round←message('width');@/ delta:=shorten(height,15); set←indices(height,delta); round←message('height');@/ delta:=shorten(depth,15); 
set←indices(depth,delta); round←message('depth');@/
delta:=shorten(italic,63);
set←indices(italic,delta); round←message('italic correction');

@* The output phase.
Now that we know how to get all of the font data correctly stored in
\.{PLtoTF}'s memory, it only remains to write the answers out.
First of all, it is convenient to have an abbreviation for output to the
\.{TFM} file:

@d out(#)==write(tfm←file,#)

@ The general plan for producing \.{TFM} files is long but simple:

@<Do the output@>=
@<Compute the twelve subfile sizes@>;
@<Output the twelve subfile sizes@>;
@<Output the header block@>;
@<Output the character info@>;
@<Output the dimensions themselves@>;
@<Output the ligature/kern program@>;
@<Output the extensible character recipes@>;
@<Output the parameters@>

@ A \.{TFM} file begins with 12 numbers that tell how big its subfiles are.
We already know most of these numbers; for example, the number of distinct
widths is |memory[width]+1|, where the $+1$ accounts for the zero width that
is always supposed to be present. But we still should compute the beginning
and ending character codes (|bc| and |ec|), the number of header words (|lh|),
and the total number of words in the \.{TFM} file (|lf|).

@<Gl...@>=
@!bc:byte; {the smallest character code in the font}
@!ec:byte; {the largest character code in the font}
@!lh:byte; {the number of words in the header block}
@!lf:0..32767; {the number of words in the entire \.{TFM} file}
@!not←found:boolean; {has a font character been found?}
@!temp←width:fix←word; {width being used to compute a check sum}

@ It might turn out that no characters exist at all. But \.{PLtoTF} keeps
going and writes the \.{TFM} anyway. In this case |ec| will be~0 and
|bc| will be~1.
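The search for |bc| and |ec| can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's own code: the function name is hypothetical, and the width table is taken to be a list of 256 nonnegative integers in which 0 means that the character is absent.

```python
def char_range(char_wd):
    """Return (bc, ec) for a 256-entry table of width indices."""
    bc = 0
    # Scan upward for the first existing character, stopping at 255.
    while not (char_wd[bc] > 0 or bc == 255):
        bc += 1
    ec = 255
    # Scan downward for the last existing character, stopping at 0.
    while not (char_wd[ec] > 0 or ec == 0):
        ec -= 1
    if bc > ec:
        bc = 1  # empty font: the scans ended at bc = 255 and ec = 0
    return bc, ec
```

With the convention of the empty case, the quantity ec - bc + 1 is zero, so an empty font contributes no character words to the file length.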
@<Compute the twelve...@>=
lh:=header←ptr div 4;@/
not←found:=true; bc:=0;
while not←found do
  if (char←wd[bc]>0)or(bc=255) then not←found:=false
  else incr(bc);
not←found:=true; ec:=255;
while not←found do
  if (char←wd[ec]>0)or(ec=0) then not←found:=false
  else decr(ec);
if bc>ec then bc:=1;
incr(memory[width]); incr(memory[height]); incr(memory[depth]);
incr(memory[italic]);@/
lf:=6+lh+(ec-bc+1)+memory[width]+memory[height]+memory[depth]+
memory[italic]+nl+nk+ne+np;

@ @d out←size(#)==out((#) div 256); out((#) mod 256)

@<Output the twelve subfile sizes@>=
out←size(lf); out←size(lh); out←size(bc); out←size(ec);
out←size(memory[width]); out←size(memory[height]);
out←size(memory[depth]); out←size(memory[italic]);
out←size(nl); out←size(nk); out←size(ne); out←size(np);

@ The routines that follow need a few temporary variables of different types.

@<Gl...@>=
@!j:0..max←header←bytes; {index into |header←bytes|}
@!p:pointer; {index into |memory|}
@!q:width..italic; {runs through the list heads for dimensions}
@!par←ptr:0..max←param←words; {runs through the parameters}

@ The header block follows the subfile sizes. The necessary information all
appears in |header←bytes|, except that the design size and the seven-bit-safe
flag must still be set, and a check sum must be computed if none was specified.
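Here is a minimal Python sketch of the two stores that remain to be made. It is not the program's own code: the function name is hypothetical, and the default offsets 4 and 68 for the design size and the seven-bit-safe flag are assumptions made for this example. The design size is a positive |fix←word|, that is, the real value times $2^{20}$, split into four bytes with the most significant byte first.

```python
def finish_header(header_bytes, design_size, seven_unsafe,
                  design_size_loc=4, seven_flag_loc=68):
    """Store the design size and the seven-bit-safe flag in place.

    design_size is a positive fix_word (the real value times 2**20).
    The two default offsets are assumptions made for this sketch.
    """
    header_bytes[design_size_loc] = design_size // 2**24
    header_bytes[design_size_loc + 1] = (design_size // 2**16) % 256
    header_bytes[design_size_loc + 2] = (design_size // 256) % 256
    header_bytes[design_size_loc + 3] = design_size % 256
    if not seven_unsafe:
        header_bytes[seven_flag_loc] = 128  # the font is seven-bit-safe
    return header_bytes
```

A design size of 10 units, for instance, is the integer 10 times $2^{20}$ and yields the bytes 0, 160, 0, 0.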
@<Output the header block@>= if not check←sum←specified then @<Compute the check sum@>; header←bytes[design←size←loc]:=design←size div @'100000000; {this works since |design←size>0|} header←bytes[design←size←loc+1]:=(design←size div @'200000) mod 256; header←bytes[design←size←loc+2]:=(design←size div 256) mod 256; header←bytes[design←size←loc+3]:=design←size mod 256; if not seven←unsafe then header←bytes[seven←flag←loc]:=128; for j:=0 to header←ptr-1 do out(header←bytes[j]); @ @<Compute the check sum@>= begin c0:=bc; c1:=ec; c2:=bc; c3:=ec; for c:=bc to ec do if char←wd[c]>0 then begin temp←width:=memory[char←wd[c]]; if design←units<>unity then temp←width:=trunc((temp←width/design←units)*1048576.0); temp←width:=temp←width + (c+4)*@'20000000; {this should be positive} c0:=(c0+c0+temp←width) mod 255; c1:=(c1+c1+temp←width) mod 253; c2:=(c2+c2+temp←width) mod 251; c3:=(c3+c3+temp←width) mod 247; end; header←bytes[check←sum←loc]:=c0; header←bytes[check←sum←loc+1]:=c1; header←bytes[check←sum←loc+2]:=c2; header←bytes[check←sum←loc+3]:=c3; end @ The next block contains packed |char←info|. @<Output the character info@>= index[0]:=0; for c:=bc to ec do begin out(index[char←wd[c]]); out(index[char←ht[c]]*16+index[char←dp[c]]); out(index[char←ic[c]]*4+char←tag[c]); out(char←remainder[c]); end @ When a scaled quantity is output, we may need to divide it by |design←units|. The following subroutine takes care of this, using floating point arithmetic only if |design←units<>1.0|. 
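The conversion can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, it returns the four bytes instead of writing them, and it silently substitutes zero where the real routine would also print an error message. The constant \.{UNITY} is the |fix←word| representation of 1.0.

```python
import math

UNITY = 1 << 20  # fix_word representation of 1.0


def scaled_bytes(x, design_units=UNITY):
    """Return the four bytes of x divided by design_units, as a fix_word."""
    if abs(x / design_units) >= 16.0:
        x = 0  # out of range; the real routine reports this as an error
    first = 255 if x < 0 else 0  # sign byte
    if design_units == UNITY:
        # Pure integer arithmetic when no rescaling is needed.
        if x < 0:
            x += 1 << 24
        n, m = divmod(x, 1 << 16)
    else:
        # Floating point is used only when design_units differs from 1.0.
        z = (x / design_units) * 16.0
        if z < 0:
            z += 256.0
        n = math.trunc(z)
        m = math.trunc(65536.0 * (z - n))
    return [first, n, m // 256, m % 256]
```

For example, a value of 500 units in a font with 1000 design units comes out as the bytes 0, 8, 0, 0, which is the |fix←word| for 0.5.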
@p procedure out←scaled(x:fix←word); {outputs a scaled |fix←word|} var z:real; {a number to output after conversion to fixed point} @!n:byte; {the first byte after the sign} @!m:0..65535; {the two least significant bytes} begin if abs(x/design←units)>=16.0 then begin print←ln('The relative dimension',x/@'4000000:1:3, ' is too large.'); @.The relative dimension...@> print(' (Must be less than 16*designsize'); if design←units<>unity then print(' =',design←units/@'200000:1:3, ' designunits'); print←ln(')'); x:=0; end; if x<0 then out(255) else out(0); if design←units=unity then begin if x<0 then x:=x+@'100000000; n:=x div @'200000; m:=x mod @'200000; end else begin z:=(x/design←units)*16.0; if z<0 then z:=z+256.0; n:=trunc(z); m:=trunc(65536.0*(z-n)); end; out(n); out(m div 256); out(m mod 256); end; @ We have output the packed indices for individual characters. The scaled widths, heights, depths, and italic corrections are next. @<Output the dimensions themselves@>= for q:=width to italic do begin out(0); out(0); out(0); out(0); {output the zero word} p:=link[q]; {head of list} while p>0 do begin out←scaled(memory[p]); p:=link[p]; end; end; @ @<Output the ligature/kern program@>= if nl>0 then for lig←ptr:=0 to nl-1 do begin out(lig←kern[lig←ptr].b0); out(lig←kern[lig←ptr].b1); out(lig←kern[lig←ptr].b2); out(lig←kern[lig←ptr].b3); end; if nk>0 then for krn←ptr:=0 to nk-1 do out←scaled(kern[krn←ptr]) @ @<Output the extensible character recipes@>= if ne>0 then for c:=0 to ne-1 do begin out(exten[c].b0); out(exten[c].b1); out(exten[c].b2); out(exten[c].b3); end; @ For our grand finale, we wind everything up by outputting the parameters. 
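The slant is the one parameter that is written as it stands, without division by the design units. A short Python sketch of that byte split follows; the function name is hypothetical, and the sketch assumes a slant whose magnitude is less than $2^{30}$ when expressed as a |fix←word|.

```python
def slant_bytes(slant):
    """Return the four bytes of a slant given as a fix_word (value times 2**20)."""
    if slant < 0:
        # Shift into the nonnegative range, then restore the two top bits.
        slant += 2**30
        first = slant // 2**24 + 256 - 64
    else:
        first = slant // 2**24
    return [first, (slant // 2**16) % 256, (slant // 256) % 256, slant % 256]
```

The term 256-64 sets the two leading bits that were lost when $2^{30}$ was added, so the result is the usual two's complement form.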
@<Output the parameters@>=
for par←ptr:=1 to np do
  begin if par←ptr=1 then
    @<Output the slant (|param[1]|) without scaling@>
  else out←scaled(param[par←ptr]);
  end

@ @<Output the slant...@>=
begin if param[1]<0 then
  begin param[1]:=param[1]+@'10000000000;
  out((param[1] div @'100000000)+256-64);
  end
else out(param[1] div @'100000000);
out((param[1] div @'200000) mod 256);
out((param[1] div 256) mod 256);
out(param[1] mod 256);
end

@* The main program.
The routines sketched out so far need to be packaged into separate procedures,
on some systems, since some \PASCAL\ compilers place a strict limit on the
size of a routine. The packaging is done here in an attempt to avoid some
system-dependent changes.

@p procedure param←enter;
begin @<Enter the parameter names@>;
end;
@#
procedure name←enter; {enter all names and their equivalents}
begin @<Enter all of the names...@>;
param←enter;
end;
@#
procedure read←lig←kern;
var krn←ptr:0..256; {an index into |kern|}
@!c:byte; {runs through all character codes}
begin @<Read ligature/kern list@>;
end;
@#
procedure read←char←info;
begin @<Read character info list@>;
end;
@#
procedure read←input;
begin @<Read all the input@>;
end;
@#
procedure corr←and←check;
var c:byte; {runs through all character codes}
@!lig←ptr:0..511; {an index into |lig←kern|}
@!g:byte; {a character generated by the current character |c|}
begin @<Correct and check the information@>
end;

@ Here is where \.{PLtoTF} begins and ends.

@p begin initialize;@/
name←enter;@/
read←input; print←ln('.');@/
corr←and←check;@/
@<Do the output@>;
end.

@* System-dependent changes.
This section should be replaced, if necessary, by changes to the program
that are necessary to make \.{PLtoTF} work at a particular installation.
It is usually best to design your change file so that all changes to
previous sections preserve the section numbering; then everybody's version
will be consistent with the printed program.
More extensive changes, which introduce new sections, can be inserted
here; then only the index itself will get a new section number.
@↑system dependencies@>

@* Index.
Pointers to error messages appear here together with the section numbers
where each ident\-i\-fier is used.