File [Cyan]<CedarChest6.1>PLtoTF>PLtoTF.web!1

% This program by D. E. Knuth is not copyrighted and can be used freely.
% Version 0 was implemented in January 1982.
% In February 1982 a new restriction on ligature steps was added.
% In June 1982 the routines were divided into smaller pieces for IBM people.
% Hex was added in September 1982, and the result became "Version 1".
% Version 1.1 fixed a bug in section 28 (since eoln is undefined after eof).
% Slight changes were made in October, 1982, for version 0.6 of TeX.
% Version 1.2 fixed a bug in section 115 (TOP, MID, and BOT can be zero)
% Version 1.3 (April 1983) blanked out unused BCPL header bytes
% Version 2 (July 1983) was released with TeX version 0.999.
% Version 2.1 (September 1983) changed TEXINFO to FONTDIMEN.
% Version 2.2 (May 1985) added checksum computation to match METAFONT.
% Version 2.3 (August 1985) introduced `backup' to fix a minor bug.

% Here is TeX material that gets inserted after \input webmac
\def\hang{\hangindent 3em\indent\ignorespaces}
\font\ninerm=cmr9
\let\mc=\ninerm % medium caps for names like SAIL
\def\PASCAL{Pascal}
\font\logo=logo10 % for the METAFONT logo
\def\MF={\logo METAFONT}

\def\(#1){} % this is used to make section names sort themselves better
\def\9#1{} % this is used for sort keys in the index

\def\title{PL\lowercase{to}TF}
\def\contentspagenumber{301}
\def\topofcontents{\null
  \def\titlepage{F} % include headline on the contents page
  \def\rheader{\mainfont\hfil \contentspagenumber}
  \vfill
  \centerline{\titlefont The {\ttitlefont PLtoTF} processor}
  \vskip 15pt
  \centerline{(Version 2.3, August 1985)}
  \vfill}
\def\botofcontents{\vfill
  \centerline{\hsize 5in\baselineskip9pt
    \vbox{\ninerm\noindent
    The preparation of this report
    was supported in part by the National Science
    Foundation under grants IST-8201926 and MCS-8300984,
    and by the System Development Foundation. `\TeX' is a
    trademark of the American Mathematical Society.}}}
\pageno=\contentspagenumber \advance\pageno by 1

@* Introduction.
The \.{PLtoTF} utility program converts property-list (``\.{PL}'')
files into equivalent \TeX\ font metric (``\.{TFM}'') files. It also
makes a thorough check of the given \.{PL} file, so that the \.{TFM}
file should be acceptable to \TeX.

The first \.{PLtoTF} program was designed by Leo Guibas in the summer of
1978. Contributions by Frank Liang, Doug Wyatt, and Lyle Ramshaw
also had a significant effect on the evolution of the present code.

The |banner| string defined here should be changed whenever \.{PLtoTF}
gets modified.

@d banner=='This is PLtoTF, Version 2.3' {printed when the program starts}

@ This program is written entirely in standard \PASCAL, except that
it has to do some slightly system-dependent character code conversion
on input. Furthermore, lower case letters are used in error messages;
they could be converted to upper case if necessary. The input is read
from |pl←file|, and the output is written on |tfm←file|; error messages and
other remarks are written on the |output| file, which the user may
choose to assign to the terminal if the system permits it.
@↑system dependencies@>

The term |print| is used instead of |write| when this program writes on
the |output| file, so that all such output can be easily deflected.

@d print(#)==write(#)
@d print←ln(#)==write←ln(#)

@p program PLtoTF(@!pl←file,@!tfm←file,@!output);
const @<Constants in the outer block@>@/
type @<Types in the outer block@>@/
var @<Globals in the outer block@>@/
procedure initialize; {this procedure gets things started properly}
  var @<Local variables for initialization@>@/
  begin print←ln(banner);@/
  @<Set initial values@>@/
  end;

@ The following parameters can be changed at compile time to extend or
reduce \.{PLtoTF}'s capacity.

@<Constants...@>=
@!buf←size=60; {length of lines displayed in error messages}
@!max←header←bytes=100; {four times the maximum number of words allowed in
  the \.{TFM} file header block, must be 1024 or less}
@!max←param←words=30; {the maximum number of \.{fontdimen} parameters allowed}

@ Here are some macros for common programming idioms.

@d incr(#) == #:=#+1 {increase a variable by unity}
@d decr(#) == #:=#-1 {decrease a variable by unity}
@d do←nothing == {empty statement}

@* Property list description of font metric data.
The idea behind \.{PL} files is that precise details about fonts, i.e., the
facts that are needed by typesetting routines like \TeX, sometimes have to
be supplied by hand. The nested property-list format provides a reasonably
convenient way to do this.

A good deal of computation is necessary to parse and process a
\.{PL} file, so it would be inappropriate for \TeX\ itself to do this
every time it loads a font. \TeX\ deals only with the compact descriptions
of font metric data that appear in \.{TFM} files. Such data is so compact,
however, it is almost impossible for anybody but a computer to read it.
The purpose of \.{PLtoTF} is to convert from a human-oriented file of text
to a computer-oriented file of binary numbers.

@<Glob...@>=
@!pl←file:text;

@ @<Set init...@>=
reset(pl←file);

@ A \.{PL} file is a list of entries of the form
$$\.{(PROPERTYNAME VALUE)}$$
where the property name is one of a finite set of names understood by
this program, and the value may itself in turn be a property list.
The idea is best understood by looking at an example, so let's consider
a fragment of the \.{PL} file for a hypothetical font.
$$\vbox{\halign{\.{#}\hfil\cr
(FAMILY NOVA)\cr
(FACE F MIE)\cr
(CODINGSCHEME ASCII)\cr
(DESIGNSIZE D 10)\cr
(DESIGNUNITS D 18)\cr
(COMMENT A COMMENT IS IGNORED)\cr
(COMMENT (EXCEPT THIS ONE ISN'T))\cr
(COMMENT (ACTUALLY IT IS, EVEN THOUGH\cr
\qquad\qquad IT SAYS IT ISN'T))\cr
(FONTDIMEN\cr
\qquad   (SLANT R -.25)\cr
\qquad   (SPACE D 6)\cr
\qquad   (SHRINK D 2)\cr
\qquad   (STRETCH D 3)\cr
\qquad   (XHEIGHT R 10.55)\cr
\qquad   (QUAD D 18)\cr
\qquad   )\cr
(LIGTABLE\cr
\qquad   (LABEL C f)\cr
\qquad   (LIG C i O 200)\cr
\qquad   (LIG C f O 201)\cr
\qquad   (KRN O 51 R 1.5)\cr
\qquad   (STOP)\cr
\qquad   (LABEL O 201)\cr
\qquad   (LIG C i O 203)\cr
\qquad   (STOP)\cr
\qquad   )\cr
(CHARACTER C f\cr
\qquad   (CHARWD D 6)\cr
\qquad   (CHARHT R 13.5)\cr
\qquad   (CHARIC R 1.5)\cr
\qquad   )\cr}}$$
This example says that the font whose metric information is being described
belongs to the hypothetical
\.{NOVA} family; its face code is medium italic extended;
and the characters appear in ASCII code positions. The design size is 10 points,
and all other sizes in this \.{PL} file are given in units such that 18 units
equals the design size. The font is slanted with a slope of $-.25$ (hence the
letters actually slant backward---perhaps that is why the family name is
\.{NOVA}). The normal space between words is 6 units (i.e., one third of
the 18-unit design size), with glue that shrinks by 2 units or stretches by 3.
The letters for which accents don't need to be raised or lowered are 10.55
units high, and one em equals 18 units.

The example ligature table specifies that the letter \.f followed by \.i
is changed to code @'200, while \.f followed by \.f is changed to @'201; and
if \.f is followed by the code @'51 (which is a right parenthesis) an
additional 1.5 units of space should be inserted after the \.f. The character
code @'201 is changed to @'203 if it is followed by \.i; thus, the sequence
\.{ffi} leads to code @'203, which is presumably where the `ffi' ligature
appears in the font.

Character \.f itself is 6 units wide and 13.5 units tall, in this example.
Its depth is zero (since \.{CHARDP} is not given), and its italic correction
is 1.5 units.

@ The example above illustrates most of the features found in \.{PL} files.
Note that some property names, like \.{FAMILY} or \.{COMMENT}, take a
string as their value; this string continues until the first unmatched
right parenthesis. But most property names, like \.{DESIGNSIZE} and \.{SLANT}
and \.{LABEL}, take a number as their value. This number can be expressed in
a variety of ways, indicated by a prefixed code; \.D stands for decimal,
\.H for hexadecimal, \.O for octal, \.R for real, \.C for character, and
\.F for ``face.''  Other property names, like \.{LIG}, take two numbers as
their value.  And still other names, like \.{FONTDIMEN} and \.{LIGTABLE} and
\.{CHARACTER}, have more complicated values that involve property lists.

A property name is supposed to be used only in an appropriate property
list.  For example, \.{CHARWD} shouldn't occur on the outer level or
within \.{FONTDIMEN}.

The individual property-and-value pairs in a property list can appear in
any order. For instance, `\.{SHRINK}' precedes `\.{STRETCH}' in the above
example, although the \.{TFM} file always puts the stretch parameter first.
One could even give the information about characters like `\.f' before
specifying the number of units in the design size, or before specifying the
ligature and kerning table. However, the \.{LIGTABLE} itself is an exception
to this rule; the individual elements of the \.{LIGTABLE} property list
can be reordered only to a certain extent without changing the meaning
of that table.

If property-and-value pairs are omitted, a default value is used. For example,
we have already noted that the default for \.{CHARDP} is zero. The default
for {\sl every\/} numeric value is, in fact, zero, unless otherwise stated
below.

If the same property name is used more than once, \.{PLtoTF} will not notice
the discrepancy; it simply uses the final value given. Once again, however, the
\.{LIGTABLE} is an exception to this rule; \.{PLtoTF} will complain if there
is more than one label for some character. And of course many of the
entries in the \.{LIGTABLE} property list have the same property name.

From these rules, you can guess (correctly) that \.{PLtoTF} operates in four
main steps. First it assigns the default values to all properties; then it scans
through the \.{PL} file, changing property values as new ones are seen; then
it checks the information and corrects any problems; and finally it outputs
the \.{TFM} file.

@ Instead of relying on a hypothetical example, let's consider a complete
grammar for \.{PL} files. At the outer level, the following property names
are valid:

\yskip\hang\.{CHECKSUM} (four-byte value). The value, which should be a
nonnegative integer less than $2↑{32}$, is used to identify a particular
version of a font; it should match the check sum value stored with the font
itself. A check sum of zero, which is the default, is used to bypass
check sum testing. If no checksum is specified in the \.{PL} file,
\.{PLtoTF} will compute the checksum that \MF\ would compute from the
same data.

\yskip\hang\.{DESIGNSIZE} (numeric value, default is 10). The value, which
should be a real number in the range |1.0<=x<1024|, represents the default
amount by which all quantities will be scaled if the font is not loaded
with an `\.{at}' specification. For example, if one says
`\.{\\font A=cmr10 at 15pt}' in \TeX\ language, the design size in the \.{TFM}
file is ignored and effectively replaced by 15 points; but if one simply
says `\.{\\font A=cmr10}' the stated design size is used. This quantity is
always in units of printer's points.

\yskip\hang\.{DESIGNUNITS} (numeric value, default is 1). The value
should be a positive real number; it says how many units equals the design
size (or the eventual `\.{at}' size, if the font is being scaled). For
example, suppose you have a font that has been digitized with 600 pixels per
em, and the design size is one em; then you could say `\.{(DESIGNUNITS D 600)}'
if you wanted to give all of your measurements in units of pixels.

\yskip\hang\.{CODINGSCHEME} (string value, default is `\.{UNSPECIFIED}').
The string should not contain parentheses, and its length must be less than 40.
It identifies the correspondence between the numeric codes and font characters.
(\TeX\ ignores this information, but other software programs make use of it.)

\yskip\hang\.{FAMILY} (string value, default is `\.{UNSPECIFIED}').
The string should not contain parentheses, and its length must be less than 20.
It identifies the name of the family to which this font belongs, e.g.,
`\.{HELVETICA}'.  (\TeX\ ignores this information; but it is needed, for
example, when converting \.{DVI} files to \.{PRESS} files for Xerox
equipment.)

\yskip\hang\.{FACE} (one-byte value). This number, which must lie between
0 and 255 inclusive, is a subsidiary ident\-ifi\-ca\-tion of the font within its
family. For example, bold italic condensed fonts might have the same family name
as light roman extended fonts, differing only in their face byte.  (\TeX\
ignores this information; but it is needed, for example, when converting
\.{DVI} files to \.{PRESS} files for Xerox equipment.)

\yskip\hang\.{SEVENBITSAFEFLAG} (string value, default is `\.{FALSE}'). The
value should start with either `\.T' (true) or `\.F' (false). If true, character
codes less than 128 cannot lead to codes of 128 or more via ligatures or
charlists or extensible characters. (\TeX82 ignores this flag, but older
versions of \TeX\ would only accept \.{TFM} files that were seven-bit safe.)
\.{PLtoTF} computes the correct value of this flag and gives an error message
only if a claimed ``true'' value is incorrect.

\yskip\hang\.{HEADER} (a one-byte value followed by a four-byte value).
The one-byte value should be between 18 and a maximum limit that can be
raised or lowered depending on the compile-time setting of |max←header←bytes|.
The four-byte value goes into the header word whose index is the one-byte
value; for example, to set |header[18]:=1|, one may write
`\.{(HEADER D 18 O 1)}'. This notation is used for header information that
is presently unnamed. (\TeX\ ignores it.)

\yskip\hang\.{FONTDIMEN} (property list value). See below for the names
allowed in this property list.

\yskip\hang\.{LIGTABLE} (property list value). See below for the rules
about this special kind of property list.

\yskip\hang\.{CHARACTER}. The value is a one-byte integer followed by
a property list. The integer represents the number of a character that is
present in the font; the property list of a character is defined below.
The default is an empty property list.

@ Numeric property list values can be given in various forms identified by
a prefixed letter.

\yskip\hang\.C denotes an ASCII character, which should be a standard visible
character that is not a parenthesis. The numeric value will therefore be
between @'41 and @'176 but not @'50 or @'51.

\yskip\hang\.D denotes a decimal integer, which must be nonnegative and
less than 256. (Use \.R for larger values or for negative values.)

\yskip\hang\.F denotes a three-letter Xerox face code; the admissible codes
are \.{MRR}, \.{MIR}, \.{BRR}, \.{BIR}, \.{LRR}, \.{LIR}, \.{MRC}, \.{MIC},
\.{BRC}, \.{BIC}, \.{LRC}, \.{LIC}, \.{MRE}, \.{MIE}, \.{BRE}, \.{BIE},
\.{LRE}, and \.{LIE}, denoting the integers 0 to 17, respectively.

\yskip\hang\.O denotes an unsigned octal integer, which must be less than
$2↑{32}$, i.e., at most `\.{O 37777777777}'.

\yskip\hang\.H denotes an unsigned hexadecimal integer, which must be less than
$2↑{32}$, i.e., at most `\.{H FFFFFFFF}'.

\yskip\hang\.R denotes a real number in decimal notation, optionally preceded
by a `\.+' or `\.-' sign, and optionally including a decimal point. The
absolute value must be less than 1024.

@ The property names allowed in a \.{FONTDIMEN} property list correspond to
various \TeX\ parameters, each of which has a (real) numeric value. All
of the parameters except \.{SLANT} are in design-size units. The admissible
names are \.{SLANT}, \.{SPACE}, \.{STRETCH}, \.{SHRINK}, \.{XHEIGHT},
\.{QUAD}, \.{EXTRASPACE}, \.{NUM1}, \.{NUM2}, \.{NUM3}, \.{DENOM1},
\.{DENOM2}, \.{SUP1}, \.{SUP2}, \.{SUP3}, \.{SUB1}, \.{SUB2}, \.{SUPDROP},
\.{SUBDROP}, \.{DELIM1}, \.{DELIM2}, and \.{AXISHEIGHT}, for parameters
1~to~22. The alternate names \.{DEFAULTRULETHICKNESS},
\.{BIGOPSPACING1}, \.{BIGOPSPACING2}, \.{BIGOPSPACING3},
\.{BIGOPSPACING4}, and \.{BIGOPSPACING5}, may also be used for parameters
8 to 13.

The notation `\.{PARAMETER} $n$' provides another way to specify the
$n$th parameter; for example, `\.{(PARAMETER} \.{D 1 R -.25)}' is another way
to specify that the \.{SLANT} is $-0.25$. The value of $n$ must be positive
and less than |max←param←words|.

@ The elements of a \.{CHARACTER} property list can be of six different types.

\yskip\hang\.{CHARWD} (real value) denotes the character's width in
design-size units.

\yskip\hang\.{CHARHT} (real value) denotes the character's height in
design-size units.

\yskip\hang\.{CHARDP} (real value) denotes the character's depth in
design-size units.

\yskip\hang\.{CHARIC} (real value) denotes the character's italic correction in
design-size units.

\yskip\hang\.{NEXTLARGER} (one-byte value), specifies the character that
follows the present one in a ``charlist.'' The value must be the number of a
character in the font, and there must be no infinite cycles of supposedly
larger and larger characters.

\yskip\hang\.{VARCHAR} (property list value), specifies an extensible character.
This option and \.{NEXTLARGER} are mutually exclusive; i.e., they cannot
both be used within the same \.{CHARACTER} list.

\yskip\noindent
The elements of a \.{VARCHAR} property list are either \.{TOP}, \.{MID},
\.{BOT} or \.{REP}; the values are integers, which must be zero or the number
or a character in the font. A zero value for \.{TOP}, \.{MID}, or \.{BOT} means
that the corresponding piece of the extensible character is absent. A nonzero
value, or a \.{REP} value of zero, denotes the character code used to make
up the top, middle, bottom, or replicated piece of an extensible character.

@ A \.{LIGTABLE} property list contains elements of four kinds, specifying a
program in a simple command language that \TeX\ uses for ligatures and kerns.

\yskip\hang\.{LABEL} (one-byte value) means that the program for the
stated character value starts here. The integer must be the number of a
character in the font; its \.{CHARACTER} property list must not have a
\.{NEXTLARGER} or \.{VARCHAR} field.

\yskip\hang\.{LIG} (two one-byte values). The instruction `\.{(LIG} $c$ $r$\.)'
means, ``If the next character is $c$, then replace both the current character
and $c$ by the character $r$; otherwise go on to the next instruction.''
Character $r$ must be present in the font, but $c$ need not be.

\yskip\hang\.{KRN} (a one-byte value and a real value). The instruction
`\.{(KRN} $c$ $r$\.)' means, ``If the next character is $c$, then insert
a blank space of width $r$ between the current character character and $c$;
otherwise go on to the next intruction.'' The value of $r$, which is in
units of the design size, is often negative. Character code $c$ must exist
in the font.

\yskip\hang\.{STOP} (no value). This instruction ends a ligature/kern program.
It must follow either a \.{LIG} or \.{KRN} instruction, not a \.{LABEL}
or \.{STOP}.

@ In addition to all these possibilities, the property name \.{COMMENT} is
allowed in any property list. Such comments are ignored.

@ So that is what \.{PL} files hold. The next question is, ``What about
\.{TFM} files?'' A complete answer to that question appears in the
documentation of the companion program, \.{TFtoPL}, so it will not
be repeated here. Suffice it to say that a \.{TFM} file stores all of the
relevant font information in a sequence of 8-bit bytes. The number of
bytes is always a multiple of 4, so we could regard the \.{TFM} file
as a sequence of 32-bit words; but \TeX\ uses the byte interpretation,
and so does \.{PLtoTF}. Note that the bytes are considered to be unsigned
numbers.

@<Glob...@>=
@!tfm←file:packed file of 0..255;

@ On some systems you may have to do something special to write a
packed file of bytes. For example, the following code didn't work
when it was first tried at Stanford, because packed files have to be
opened with a special switch setting on the \PASCAL\ that was used.
@↑system dependencies@>

@<Set init...@>=
rewrite(tfm←file);

@* Basic input routines.
For the purposes of this program, a |byte| is an unsigned eight-bit quantity,
and an |ASCII←code| is an integer between @'40 and @'177. Such ASCII codes
correspond to one-character constants like \.{"A"} in \.{WEB} language.

@<Types...@>=
@!byte=0..255; {unsigned eight-bit quantity}
@!ASCII←code=@'40..@'177; {standard ASCII code numbers}

@ One of the things \.{PLtoTF} has to do is convert characters of strings
to ASCII form, since that is the code used for the family name and the
coding scheme in a \.{TFM} file. An array |xord| is used to do the
conversion from |char|; the method below should work with little or no change
on most \PASCAL\ systems.
@↑system dependencies@>

@d first←ord=0 {ordinal number of the smallest element of |char|}
@d last←ord=127 {ordinal number of the largest element of |char|}

@<Global...@>=
@!xord:array[char] of ASCII←code; {conversion table}

@ @<Local variables for init...@>=
@!k:first←ord..last←ord; {an index used for clearing |xord|}

@ Characters that should not appear in \.{PL} files (except in comments)
are mapped into @'177.

@d invalid←code=@'177 {code deserving an error message}

@<Set init...@>=
for k:=first←ord to last←ord do xord[chr(k)]:=invalid←code;
xord[' ']:=" "; xord['!']:="!"; xord['"']:=""""; xord['#']:="#";
xord['$']:="$"; xord['%']:="%"; xord['&']:="&"; xord['''']:="'";
xord['(']:="("; xord[')']:=")"; xord['*']:="*"; xord['+']:="+"; xord[',']:=",";
xord['-']:="-"; xord['.']:="."; xord['/']:="/"; xord['0']:="0"; xord['1']:="1";
xord['2']:="2"; xord['3']:="3"; xord['4']:="4"; xord['5']:="5"; xord['6']:="6";
xord['7']:="7"; xord['8']:="8"; xord['9']:="9"; xord[':']:=":"; xord[';']:=";";
xord['<']:="<"; xord['=']:="="; xord['>']:=">"; xord['?']:="?";
xord['@@']:="@@"; xord['A']:="A"; xord['B']:="B"; xord['C']:="C";
xord['D']:="D"; xord['E']:="E"; xord['F']:="F"; xord['G']:="G"; xord['H']:="H";
xord['I']:="I"; xord['J']:="J"; xord['K']:="K"; xord['L']:="L"; xord['M']:="M";
xord['N']:="N"; xord['O']:="O"; xord['P']:="P"; xord['Q']:="Q"; xord['R']:="R";
xord['S']:="S"; xord['T']:="T"; xord['U']:="U"; xord['V']:="V"; xord['W']:="W";
xord['X']:="X"; xord['Y']:="Y"; xord['Z']:="Z"; xord['[']:="["; xord['\']:="\";
xord[']']:="]"; xord['↑']:="↑"; xord['←']:="←"; xord['`']:="`"; xord['a']:="a";
xord['b']:="b"; xord['c']:="c"; xord['d']:="d"; xord['e']:="e"; xord['f']:="f";
xord['g']:="g"; xord['h']:="h"; xord['i']:="i"; xord['j']:="j"; xord['k']:="k";
xord['l']:="l"; xord['m']:="m"; xord['n']:="n"; xord['o']:="o"; xord['p']:="p";
xord['q']:="q"; xord['r']:="r"; xord['s']:="s"; xord['t']:="t"; xord['u']:="u";
xord['v']:="v"; xord['w']:="w"; xord['x']:="x"; xord['y']:="y"; xord['z']:="z";
xord['{']:="{"; xord['|']:="|"; xord['}']:="}"; xord['~']:="~";

@ In order to help catch errors of badly nested parentheses, \.{PLtoTF}
assumes that the user will begin each line with a number of blank spaces equal
to some constant times the number of open parentheses at the beginning of
that line. However, the program doesn't know in advance what the constant
is, nor does it want to print an error message on every line for a user
who has followed no consistent pattern of indentation.

Therefore the following strategy is adopted: If the user has been consistent
with indentation for ten or more lines, an indentation error will be
reported. The constant of indentation is reset on every line that should
have nonzero indentation.

@<Glob...@>=
@!line:integer; {the number of the current line}
@!good←indent:integer; {the number of lines since the last bad indentation}
@!indent: integer; {the number of spaces per open parenthesis, zero if unknown}
@!level: integer; {the current number of open parentheses}

@ @<Set init...@>=
line:=0; good←indent:=0; indent:=0; level:=0;

@ The input need not really be broken into lines of any maximum length, and
we could read it character by character without any buffering. But we shall
place it into a small buffer so that offending lines can be displayed in error
messages.

@<Glob...@>=
@!left←ln,@!right←ln:boolean; {are the left and right ends of the buffer
  at end-of-line marks?}
@!limit:0..buf←size; {position of the last character present in the buffer}
@!loc:0..buf←size; {position of the last character read in the buffer}
@!buffer:array[1..buf←size] of char;
@!input←has←ended:boolean; {there is no more input to read}

@ @<Set init...@>=
limit:=0; loc:=0; left←ln:=true; right←ln:=true; input←has←ended:=false;

@ Just before each  \.{CHARACTER} property list is evaluated, the character
code is printed in octal notation. Up to eight such codes appear on a line;
so we have a variable to keep track of how many are currently there.

@<Glob...@>=
@!chars←on←line:0..8; {the number of characters printed on the current line}

@ @<Set init...@>=
chars←on←line:=0;

@ The following routine prints an error message and an indication of
where the error was detected. The error message should not include any
final punctuation, since this procedure suppies its own.

@d err←print(#)==begin if chars←on←line>0 then print←ln(' ');
  print(#); show←error←context;
  end

@p procedure show←error←context; {prints the current scanner location}
var k:0..buf←size; {an index into |buffer|}
begin print←ln(' (line ',line:1,').');
if not left←ln then print('...');
for k:=1 to loc do print(buffer[k]); {print the characters already scanned}
print←ln(' ');
if not left←ln then print('   ');
for k:=1 to loc do print(' '); {space out the second line}
for k:=loc+1 to limit do print(buffer[k]); {print the characters yet unseen}
if right←ln then print←ln(' ')@+else print←ln('...');
chars←on←line:=0;
end;

@ Here is a procedure that does the right thing when we are done
reading the present contents of the buffer. It keeps |buffer[buf←size]|
empty, in order to avoid range errors on certain \PASCAL\ compilers.

An infinite sequence of right parentheses is placed at the end of the
file, so that the program is sure to get out of whatever level of nesting
it is in.

On some systems it is desirable to modify this code so that tab marks
in the buffer are replaced by blank spaces. (Simply setting
|xord[chr(@'11)]:=" "| would not work; for example, two-line
error messages would not come out properly aligned.)
@↑system dependencies@>

@p procedure fill←buffer;
begin left←ln:=right←ln; limit:=0; loc:=0;
if left←ln then
  begin if line>0 then read←ln(pl←file);
  incr(line);
  end;
if eof(pl←file) then
  begin limit:=1; buffer[1]:=')'; right←ln:=false; input←has←ended:=true;
  end
else  begin while (limit<buf←size-1)and(not eoln(pl←file)) do
    begin incr(limit); read(pl←file,buffer[limit]);
    end;
  buffer[limit+1]:=' '; right←ln:=eoln(pl←file);
  if left←ln then @<Set |loc| to the number of leading blanks in
    the buffer, and check the indentation@>;
  end;
end;

@ The interesting part about |fill←buffer| is the part that learns what
indentation conventions the user is following, if any.

@d bad←indent(#)==begin if good←indent>=10 then err←print(#);
  good←indent:=0; indent:=0;
  end

@<Set |loc|...@>=
begin while (loc<limit)and(buffer[loc+1]=' ') do incr(loc);
if loc<limit then
  begin if level=0 then
    if loc=0 then incr(good←indent)
    else bad←indent('Warning: Indented line occurred at level zero')
@.Warning: Indented line...@>
  else if indent=0 then
    if (loc div level)*level=loc then
      begin indent:=loc div level; good←indent:=1;
      end
    else good←indent:=0
  else if indent*level=loc then incr(good←indent)
  else bad←indent('Warning: Inconsistent indentation; ',
@.Warning: Inconsistent indentation...@>
    'you are at parenthesis level ',level:1);
  end;
end

@* Basic scanning routines.
The global variable |cur←char| holds the ASCII code corresponding to the
character most recently read from the input buffer, or to a character that
has been substituted for the real one.

@<Global...@>=
@!cur←char:ASCII←code; {we have just read this}

@ Here is a procedure that sets |cur←char| to an ASCII code for the
next character of input, if that character is a letter or digit. Otherwise
it sets |cur←char:=" "|, and the input system will be poised to reread the
character that was rejected, whether or not it was a space.
Lower case letters are converted to upper case.

@p procedure get←letter←or←digit;
begin while (loc=limit)and(not right←ln) do fill←buffer;
if loc=limit then cur←char:=" " {end-of-line counts as a delimiter}
else  begin cur←char:=xord[buffer[loc+1]];
  if cur←char>="a" then cur←char:=cur←char-@'40;
  if ((cur←char>="0")and(cur←char<="9"))or
    ((cur←char>="A")and(cur←char<="Z")) then incr(loc)
  else cur←char:=" ";
  end;
end;

@ The following procedure sets |cur←char| to the next character code,
and converts lower case to upper case. If the character is a left or
right parenthesis, it will not be ``digested''; the character will
be read again and again, until the calling routine does something
like `|incr(loc)|' to get past it. Such special treatment of parentheses
insures that the structural information they contain won't be lost in
the midst of other error recovery operations.

@d backup==begin if (cur←char>")")or(cur←char<"(") then decr(loc);
  end {undoes the effect of |get←next|}

@p procedure get←next; {sets |cur←char| to next, balks at parentheses}
begin while loc=limit do fill←buffer;
incr(loc); cur←char:=xord[buffer[loc]];
if cur←char>="a" then
  if cur←char<="z" then cur←char:=cur←char-@'40 {uppercasify}
  else  begin if cur←char=invalid←code then
      begin err←print('Illegal character in the file');
@.Illegal character...@>
      cur←char:="?";
      end;
    end
else if (cur←char<=")")and(cur←char>="(") then decr(loc);
end;

@ The next procedure is used to ignore the text of a comment, or to pass over
erroneous material. As such, it has the privilege of passing parentheses.
It stops after the first right parenthesis that drops the level below
the level in force when the procedure was called.

@p procedure skip←to←end←of←item;
var l:integer; {initial value of |level|}
begin l:=level;
while level>=l do
  begin while loc=limit do fill←buffer;
  incr(loc);
  if buffer[loc]=')' then decr(level)
  else if buffer[loc]='(' then incr(level);
  end;
if input←has←ended then err←print('File ended unexpectedly: No closing ")"');
@.File ended unexpectedly...@>
cur←char:=" "; {now the right parenthesis has been read and digested}
end;

@ Sometimes we merely want to skip past characters in the input until we
reach a left or a right parenthesis. For example, we do this whenever we
have finished scanning a property value and we hope that a right parenthesis
is next (except for possible blank spaces).

@d skip←to←paren==repeat get←next@;@+ until (cur←char="(")or(cur←char=")")
@d skip←error(#)==begin err←print(#); skip←to←paren;
  end {this gets to the right parenthesis if something goes wrong}
@d flush←error(#)==begin err←print(#); skip←to←end←of←item;
  end {this gets past the right parenthesis if something goes wrong}

@ After a property value has been scanned, we want to move just past the
right parenthesis that should come next in the input (except for possible
blank spaces).

@p procedure finish←the←property; {do this when the value has been scanned}
begin while cur←char=" " do get←next;
if cur←char<>")" then err←print('Junk after property value will be ignored');
@.Junk after property value...@>
skip←to←end←of←item;
end;

@* Scanning property names.
We have to figure out the meaning of names that appear in the \.{PL} file,
by looking them up in a dictionary of known keywords. Keyword number $n$
appears in locations |start[n]| through |start[n+1]-1| of an array called
|dictionary|.

@d max←name←index=66 {upper bound on the number of keywords}
@d max←letters=500 {upper bound on the total length of all keywords}

@<Global...@>=
@!start:array[1..max←name←index] of 0..max←letters;
@!dictionary:array[0..max←letters] of ASCII←code;
@!start←ptr:0..max←name←index; {the first available place in |start|}
@!dict←ptr:0..max←letters; {the first available place in |dictionary|}

@ @<Set init...@>=
start←ptr:=1; start[1]:=0; dict←ptr:=0;

@ When we are looking for a name, we put it into the |cur←name| array.
When we have found it, the corresponding |start| index will go into
the global variable |name←ptr|.

@d longest←name=20 {length of \.{DEFAULTRULETHICKNESS}}

@<Glob...@>=
@!cur←name:array[1..longest←name] of ASCII←code; {a name to look up}
@!name←length:0..longest←name; {its length}
@!name←ptr:0..max←name←index; {its ordinal number in the dictionary}

@ A conventional hash table with linear probing (cf.\ Algorithm 6.4L
in {\sl The Art of Computer Pro\-gram\-ming\/}) is used for the dictionary
operations. If |hash[h]=0|, the table position is empty, otherwise |hash[h]|
points into the |start| array.

@d hash←prime=101 {size of the hash table}

@<Glob...@>=
@!hash:array[0..hash←prime-1] of 0..max←name←index;
@!cur←hash:0..hash←prime-1; {current position in the hash table}

@ @<Local...@>=
@!h:0..hash←prime-1; {runs through the hash table}

@ @<Set init...@>=
for h:=0 to hash←prime-1 do hash[h]:=0;

@ Since there is no chance of the hash table overflowing, the procedure
is very simple. After |lookup| has done its work, |cur←hash| will point
to the place where the given name was found, or where it should be inserted.

@p procedure lookup; {finds |cur←name| in the dictionary}
var k:0..longest←name; {index into |cur←name|}
@!j:0..max←letters; {index into |dictionary|}
@!not←found:boolean; {clumsy thing necessary to avoid |goto| statement}
begin @<Compute the hash code, |cur←hash|, for |cur←name|@>;
not←found:=true;
while not←found do
  begin if cur←hash=0 then cur←hash:=hash←prime-1@+else decr(cur←hash);
  if hash[cur←hash]=0 then not←found:=false
  else  begin j:=start[hash[cur←hash]];
    if start[hash[cur←hash]+1]=j+name←length then
      begin not←found:=false;
      for k:=1 to name←length do
      if dictionary[j+k-1]<>cur←name[k] then not←found:=true;
      end;
    end;
  end;
name←ptr:=hash[cur←hash];
end;

@ @<Compute the hash...@>=
cur←hash:=cur←name[1];
for k:=2 to name←length do
  cur←hash:=(cur←hash+cur←hash+cur←name[k]) mod hash←prime

@ The ``meaning'' of the keyword that begins at |start[k]| in the
dictionary is kept in |equiv[k]|. The numeric |equiv| codes are given
symbolic meanings by the following definitions.

@d comment←code=0
@d check←sum←code=1
@d design←size←code=2
@d design←units←code=3
@d coding←scheme←code=4
@d family←code=5
@d face←code=6
@d seven←bit←safe←flag←code=7
@d header←code= 8
@d font←dimen←code=9
@d lig←table←code=10
@d character←code=11
@d parameter←code=20
@d char←info←code=50
@d width=1
@d height=2
@d depth=3
@d italic=4
@d char←wd←code=char←info←code+width
@d char←ht←code=char←info←code+height
@d char←dp←code=char←info←code+depth
@d char←ic←code=char←info←code+italic
@d next←larger←code=55
@d var←char←code=56
@d label←code=70
@d lig←code=71
@d krn←code=72
@d stop←code=73

@<Glo...@>=
@!equiv:array[0..max←name←index] of byte;
@!cur←code:byte; {equivalent most recently found in |equiv|}

@ We have to get the keywords into the hash table and into the dictionary in
the first place (sigh). The procedure that does this has the desired
|equiv| code as a parameter. In order to facilitate \.{WEB} macro writing
for the initialization, the keyword being initialized is placed into the
last positions of |cur←name|, instead of the first positions.

@p procedure enter←name(v:byte); {|cur←name| goes into the dictionary}
var k:0..longest←name;
begin for k:=1 to name←length do
  cur←name[k]:=cur←name[k+longest←name-name←length];
{now the name has been shifted into the correct position}
lookup; {this sets |cur←hash| to the proper insertion place}
hash[cur←hash]:=start←ptr; equiv[start←ptr]:=v;
for k:=1 to name←length do
  begin dictionary[dict←ptr]:=cur←name[k]; incr(dict←ptr);
  end;
incr(start←ptr); start[start←ptr]:=dict←ptr;
end;

@ Here are the macros to load a name of up to 20 letters into the
dictionary. For example, the macro |load5| is used for five-letter keywords.

@d tail(#)==enter←name(#)
@d t20(#)==cur←name[20]:=#;tail
@d t19(#)==cur←name[19]:=#;t20
@d t18(#)==cur←name[18]:=#;t19
@d t17(#)==cur←name[17]:=#;t18
@d t16(#)==cur←name[16]:=#;t17
@d t15(#)==cur←name[15]:=#;t16
@d t14(#)==cur←name[14]:=#;t15
@d t13(#)==cur←name[13]:=#;t14
@d t12(#)==cur←name[12]:=#;t13
@d t11(#)==cur←name[11]:=#;t12
@d t10(#)==cur←name[10]:=#;t11
@d t9(#)==cur←name[9]:=#;t10
@d t8(#)==cur←name[8]:=#;t9
@d t7(#)==cur←name[7]:=#;t8
@d t6(#)==cur←name[6]:=#;t7
@d t5(#)==cur←name[5]:=#;t6
@d t4(#)==cur←name[4]:=#;t5
@d t3(#)==cur←name[3]:=#;t4
@d t2(#)==cur←name[2]:=#;t3
@d t1(#)==cur←name[1]:=#;t2
@d load3==name←length:=3;t18
@d load4==name←length:=4;t17
@d load5==name←length:=5;t16
@d load6==name←length:=6;t15
@d load7==name←length:=7;t14
@d load8==name←length:=8;t13
@d load9==name←length:=9;t12
@d load10==name←length:=10;t11
@d load11==name←length:=11;t10
@d load12==name←length:=12;t9
@d load13==name←length:=13;t8
@d load14==name←length:=14;t7
@d load15==name←length:=15;t6
@d load16==name←length:=16;t5
@d load17==name←length:=17;t4
@d load18==name←length:=18;t3
@d load19==name←length:=19;t2
@d load20==name←length:=20;t1

@ @<Enter all of the names and their equivalents, except the parameter names@>=
equiv[0]:=comment←code; {this is used after unknown keywords}
load8("C")("H")("E")("C")("K")("S")("U")("M")(check←sum←code);@/
load10("D")("E")("S")("I")("G")("N")("S")("I")("Z")("E")(design←size←code);@/
load11("D")("E")("S")("I")("G")("N")
  ("U")("N")("I")("T")("S")(design←units←code);@/
load12("C")("O")("D")("I")("N")("G")
  ("S")("C")("H")("E")("M")("E")(coding←scheme←code);@/
load6("F")("A")("M")("I")("L")("Y")(family←code);@/
load4("F")("A")("C")("E")(face←code);@/
load16("S")("E")("V")("E")("N")("B")("I")("T")@/@t\hskip2em@>
  ("S")("A")("F")("E")("F")("L")("A")("G")(seven←bit←safe←flag←code);@/
load6("H")("E")("A")("D")("E")("R")(header←code);@/
load9("F")("O")("N")("T")("D")("I")("M")("E")("N")(font←dimen←code);@/
load8("L")("I")("G")("T")("A")("B")("L")("E")(lig←table←code);@/
load9("C")("H")("A")("R")("A")("C")("T")("E")("R")(character←code);@/
load9("P")("A")("R")("A")("M")("E")("T")("E")("R")(parameter←code);@/
load6("C")("H")("A")("R")("W")("D")(char←wd←code);@/
load6("C")("H")("A")("R")("H")("T")(char←ht←code);@/
load6("C")("H")("A")("R")("D")("P")(char←dp←code);@/
load6("C")("H")("A")("R")("I")("C")(char←ic←code);@/
load10("N")("E")("X")("T")("L")("A")("R")("G")("E")("R")(next←larger←code);@/
load7("V")("A")("R")("C")("H")("A")("R")(var←char←code);@/
load3("T")("O")("P")(var←char←code+1);@/
load3("M")("I")("D")(var←char←code+2);@/
load3("B")("O")("T")(var←char←code+3);@/
load3("R")("E")("P")(var←char←code+4);@/
load3("E")("X")("T")(var←char←code+4); {compatibility with older \.{PL} format}
load7("C")("O")("M")("M")("E")("N")("T")(comment←code);@/
load5("L")("A")("B")("E")("L")(label←code);@/
load3("L")("I")("G")(lig←code);@/
load3("K")("R")("N")(krn←code);@/
load4("S")("T")("O")("P")(stop←code);

@ @<Enter the parameter names@>=
load5("S")("L")("A")("N")("T")(parameter←code+1);@/
load5("S")("P")("A")("C")("E")(parameter←code+2);@/
load7("S")("T")("R")("E")("T")("C")("H")(parameter←code+3);@/
load6("S")("H")("R")("I")("N")("K")(parameter←code+4);@/
load7("X")("H")("E")("I")("G")("H")("T")(parameter←code+5);@/
load4("Q")("U")("A")("D")(parameter←code+6);@/
load10("E")("X")("T")("R")("A")("S")("P")("A")("C")("E")(parameter←code+7);@/
load4("N")("U")("M")("1")(parameter←code+8);@/
load4("N")("U")("M")("2")(parameter←code+9);@/
load4("N")("U")("M")("3")(parameter←code+10);@/
load6("D")("E")("N")("O")("M")("1")(parameter←code+11);@/
load6("D")("E")("N")("O")("M")("2")(parameter←code+12);@/
load4("S")("U")("P")("1")(parameter←code+13);@/
load4("S")("U")("P")("2")(parameter←code+14);@/
load4("S")("U")("P")("3")(parameter←code+15);@/
load4("S")("U")("B")("1")(parameter←code+16);@/
load4("S")("U")("B")("2")(parameter←code+17);@/
load7("S")("U")("P")("D")("R")("O")("P")(parameter←code+18);@/
load7("S")("U")("B")("D")("R")("O")("P")(parameter←code+19);@/
load6("D")("E")("L")("I")("M")("1")(parameter←code+20);@/
load6("D")("E")("L")("I")("M")("2")(parameter←code+21);@/
load10("A")("X")("I")("S")("H")("E")("I")("G")("H")("T")(parameter←code+22);@/
load20("D")("E")("F")("A")("U")("L")("T")("R")("U")("L")("E")@/@t\hskip2em@>
  ("T")("H")("I")("C")("K")("N")("E")("S")("S")(parameter←code+8);@/
load13("B")("I")("G")("O")("P")
  ("S")("P")("A")("C")("I")("N")("G")("1")(parameter←code+9);@/
load13("B")("I")("G")("O")("P")
  ("S")("P")("A")("C")("I")("N")("G")("2")(parameter←code+10);@/
load13("B")("I")("G")("O")("P")
  ("S")("P")("A")("C")("I")("N")("G")("3")(parameter←code+11);@/
load13("B")("I")("G")("O")("P")
  ("S")("P")("A")("C")("I")("N")("G")("4")(parameter←code+12);@/
load13("B")("I")("G")("O")("P")
  ("S")("P")("A")("C")("I")("N")("G")("5")(parameter←code+13);@/

@ When a left parenthesis has been scanned, the following routine
is used to interpret the keyword that follows, and to store the
equivalent value in |cur←code|.

@p procedure get←name;
begin incr(loc); incr(level); {pass the left parenthesis}
cur←char:=" ";
while cur←char=" " do get←next;
if (cur←char>")")or(cur←char<"(") then decr(loc); {back up one character}
name←length:=0; get←letter←or←digit; {prepare to scan the name}
while cur←char<>" " do
  begin if name←length=longest←name then cur←name[1]:="X" {force error}
  else incr(name←length);
  cur←name[name←length]:=cur←char;
  get←letter←or←digit;
  end;
lookup;
if name←ptr=0 then err←print('Sorry, I don''t know that property name');
@.Sorry, I don't know...@>
cur←code:=equiv[name←ptr];
end;

@* Scanning numeric data.
The next thing we need is a trio of subroutines to read the one-byte,
four-byte, and real numbers that may appear as property values.
These subroutines are careful to stick to numbers between $-2↑{31}$
and $2↑{31}-1$, inclusive, so that a computer with two's complement
32-bit arithmetic will not be interrupted by overflow.

@ The first number scanner, which returns a one-byte value, surely has
no problems of arithmetic overflow.

@p function get←byte:byte; {scans a one-byte property value}
var acc:integer; {an accumulator}
@!t:ASCII←code; {the type of value to be scanned}
begin repeat get←next;
until cur←char<>" "; {skip the blanks before the type code}
t:=cur←char; acc:=0;
repeat get←next;
until cur←char<>" "; {skip the blanks after the type code}
if t="C" then @<Scan an ASCII character code@>
else if t="D" then @<Scan a small decimal number@>
else if t="O" then @<Scan a small octal number@>
else if t="H" then @<Scan a small hexadecimal number@>
else if t="F" then @<Scan a face code@>
else skip←error('You need "C" or "D" or "O" or "H" or "F" here');
@.You need "C" or "D" ...here@>
cur←char:=" "; get←byte:=acc;
end;

@ The |get←next| routine converts lower case to upper case, but it leaves
the character in the buffer, so we can unconvert it.

@<Scan an ASCII...@>=
if (cur←char>=@'41)and(cur←char<=@'176)and
 ((cur←char<"(")or(cur←char>")")) then
  acc:=xord[buffer[loc]]
else skip←error('"C" value must be standard ASCII and not a paren')
@:C value}\.{"C" value must be...@>

@ @<Scan a small dec...@>=
begin while (cur←char>="0")and(cur←char<="9") do
  begin acc:=acc*10+cur←char-"0";
  if acc>255 then
    begin skip←error('This value shouldn''t exceed 255');
@.This value shouldn't...@>
    acc:=0; cur←char:=" ";
    end
  else get←next;
  end;
backup;
end

@ @<Scan a small oct...@>=
begin while (cur←char>="0")and(cur←char<="7") do
  begin acc:=acc*8+cur←char-"0";
  if acc>255 then
    begin skip←error('This value shouldn''t exceed ''377');
@.This value shouldn't...@>
    acc:=0; cur←char:=" ";
    end
  else get←next;
  end;
backup;
end

@ @<Scan a small hex...@>=
begin while ((cur←char>="0")and(cur←char<="9"))or
   ((cur←char>="A")and(cur←char<="F")) do
  begin if cur←char>="A" then cur←char:=cur←char+"0"+10-"A";
  acc:=acc*16+cur←char-"0";
  if acc>255 then
    begin skip←error('This value shouldn''t exceed "FF');
@.This value shouldn't...@>
    acc:=0; cur←char:=" ";
    end
  else get←next;
  end;
backup;
end

@ @<Scan a face...@>=
begin if cur←char="B" then acc:=2
else if cur←char="L" then acc:=4
else if cur←char<>"M" then acc:=18;
get←next;
if cur←char="I" then incr(acc)
else if cur←char<>"R" then acc:=18;
get←next;
if cur←char="C" then acc:=acc+6
else if cur←char="E" then acc:=acc+12
else if cur←char<>"R" then acc:=18;
if acc>=18 then
  begin skip←error('Illegal face code, I changed it to MRR');
@.Illegal face code...@>
  acc:=0;
  end;
end

@ The routine that scans a four-byte value puts its output into |cur←bytes|,
which is a record containing (yes, you guessed it) four bytes.

@<Types...@>=
@!four←bytes=record @!b0:byte;@+@!b1:byte;@+@!b2:byte;@+@!b3:byte;@+end;

@ @d c0==cur←bytes.b0
@d c1==cur←bytes.b1
@d c2==cur←bytes.b2
@d c3==cur←bytes.b3

@<Glob...@>=
cur←bytes:four←bytes; {a four-byte accumulator}

@ Since the |get←four←bytes| routine is used very infrequently, no attempt
has been made to make it fast; we only want it to work.

@p procedure get←four←bytes; {scans an octal constant and sets |four←bytes|}
var c:integer; {leading byte}
@!r:integer; {radix}
@!q:integer; {|256/r|}
begin repeat get←next;
until cur←char<>" "; {skip the blanks before the type code}
r:=0; c0:=0; c1:=0; c2:=0; c3:=0; {start with the accumulator zero}
if cur←char="H" then r:=16
else if cur←char="O" then r:=8
else skip←error('An octal ("O") or hex ("H") value is needed here');
@.An octal ("O") or hex ("H")...@>
if r>0 then
  begin q:=256 div r;
  repeat get←next;
  until cur←char<>" "; {skip the blanks after the type code}
  while ((cur←char>="0")and(cur←char<="9"))or@|
      ((cur←char>="A")and(cur←char<="F")) do
    @<Multiply by |r|, add |cur←char-"0"|, and |get←next|@>;
  end;
end;

@ @<Multiply by |r|...@>=
begin if cur←char>="A" then cur←char:=cur←char+"0"+10-"A";
c:=(r*c0)+(c1 div q);
if c>255 then
  begin c0:=0; c1:=0; c2:=0; c3:=0;
  if r=8 then
    skip←error('Sorry, the maximum octal value is O 37777777777')
@.Sorry, the maximum octal...@>
  else skip←error('Sorry, the maximum hex value is H FFFFFFFF');
@.Sorry, the maximum hex...@>
  end
else if cur←char>="0"+r then skip←error('Illegal digit')
@.Illegal digit@>
else  begin c0:=c;
  c1:=(r*(c1 mod q))+(c2 div q);
  c2:=(r*(c2 mod q))+(c3 div q);
  c3:=(r*(c3 mod q))+cur←char-"0";
  get←next;
  end;
end;

@ The remaining scanning routine is the most interesting. It scans a real
constant and returns the nearest |fix←word| approximation to that constant.
A |fix←word| is a 32-bit integer that represents a real value that
has been multiplied by $2↑{20}$. Since \.{PLtoTF} restricts the magnitude
of reals to 1024, the |fix←word| will have a magnitude less than $2↑{30}$.

@d unity==@'4000000 {$2↑{20}$, the |fix←word| 1.0}

@<Types...@>=
@!fix←word=integer; {a scaled real value with 20 bits of fraction}

@ When a real value is desired, we might as well treat `\.D' and `\.R'
formats as if they were identical.

@p function get←fix:fix←word; {scans a real property value}
var negative:boolean; {was there a minus sign?}
@!acc:integer; {an accumulator}
@!int←part:integer; {the integer part}
@!j:0..7; {the number of decimal places stored}
begin repeat get←next;
until cur←char<>" "; {skip the blanks before the type code}
negative:=false; acc:=0; {start with the accumulators zero}
if (cur←char<>"R")and(cur←char<>"D") then
  skip←error('An "R" or "D" value is needed here')
@.An "R" or "D" ... needed here@>
else  begin @<Scan the blanks and/or signs after the type code@>;
  while (cur←char>="0") and (cur←char<="9") do
    @<Multiply by 10, add |cur←char-"0"|, and |get←next|@>;
  int←part:=acc; acc:=0;
  if cur←char="." then @<Scan the fraction part and put it in |acc|@>;
  if (acc>=unity)and(int←part=1023) then
    skip←error('Real constants must be less than 1024')
@.Real constants must be...@>
  else acc:=int←part*unity+acc;
  end;
if negative then get←fix:=-acc@+else get←fix:=acc;
end;

@ @<Scan the blanks...@>=
repeat get←next;
if cur←char="-" then
  begin cur←char:=" "; negative:=true;
  end
else if cur←char="+" then cur←char:=" ";
until cur←char<>" "

@ @<Multiply by 10...@>=
begin acc:=acc*10+cur←char-"0";
if acc>=1024 then
  begin skip←error('Real constants must be less than 1024');
@.Real constants must be...@>
  acc:=0; cur←char:=" ";
  end
else get←next;
end

@ To scan the fraction $.d←1d←2\ldots\,$, we keep track of up to seven
of the digits $d←j$. A correct result is obtained if we first compute
$f↑\prime=\lfloor 2↑{21}(d←1\ldots d←j)/10↑j\rfloor$, after which
$f=\lfloor(f↑\prime+1)/2\rfloor$. It is possible to have $f=1.0$.

@<Glob...@>=
@!fraction←digits:array[1..7] of integer; {$2↑{21}$ times $d←j$}

@ @<Scan the frac...@>=
begin j:=0; get←next;
while (cur←char>="0")and(cur←char<="9") do
  begin if j<7 then
    begin incr(j); fraction←digits[j]:=@'10000000*(cur←char-"0");
    end;
  get←next;
  end;
acc:=0;
while j>0 do
  begin acc:=fraction←digits[j]+(acc div 10); decr(j);
  end;
acc:=(acc+10) div 20;
end

@* Storing the property values.
When property values have been found, they are squirreled away in a bunch
of arrays. The header information is unpacked into bytes in an array
called |header←bytes|. The ligature/kerning program is stored in an array
of type |four←bytes|; note that such a program is at most 511 steps
long, since a label after step 255 may begin a program of length 256.
Another |four←bytes| array holds the specifications of extensible characters.
The kerns and parameters are stored in separate arrays of |fix←word| values.

Instead of storing the design size in the header array, we will keep it
in a |fix←word| variable until the last minute. The number of units in the
design size is also kept in a |fix←word|.

@<Glob...@>=
@!header←bytes:array[header←index] of byte; {the header block}
@!header←ptr:header←index; {the number of header bytes in use}
@!design←size:fix←word; {the design size}
@!design←units:fix←word; {reciprocal of the scaling factor}
@!seven←bit←safe←flag:boolean; {does the file claim to be seven-bit-safe?}
@!lig←kern:array[0..511] of four←bytes; {the ligature program}
@!nl:0..511; {the number of ligature/kern instructions so far}
@!unused←label:boolean; {was the last lig/kern step a label?}
@!kern:array[0..256] of fix←word; {the distinct kerning amounts}
@!nk:0..256; {the number of entries of |kern|}
@!exten:array[0..255] of four←bytes; {extensible character specs}
@!ne:0..256; {the number of extensible characters}
@!param:array[1..max←param←words] of fix←word; {\.{fontdimen} parameters}
@!np:0..max←param←words; {the largest parameter set nonzero}
@!check←sum←specified:boolean; {did the user name the check sum?}

@ @<Types...@>=
@!header←index=0..max←header←bytes;

@ @<Local...@>=
@!d:header←index; {an index into |header←bytes|}

@ We start by setting up the default values.

@d check←sum←loc=0
@d design←size←loc=4
@d coding←scheme←loc=8
@d family←loc=coding←scheme←loc+40
@d seven←flag←loc=family←loc+20
@d face←loc=seven←flag←loc+3

@<Set init...@>=
check←sum←specified:=false;
for d:=0 to 18*4-1 do header←bytes[d]:=0;
header←bytes[8]:=11; header←bytes[9]:="U";
header←bytes[10]:="N";
header←bytes[11]:="S";
header←bytes[12]:="P";
header←bytes[13]:="E";
header←bytes[14]:="C";
header←bytes[15]:="I";
header←bytes[16]:="F";
header←bytes[17]:="I";
header←bytes[18]:="E";
header←bytes[19]:="D";
@.UNSPECIFIED@>
for d:=family←loc to family←loc+11 do header←bytes[d]:=header←bytes[d-40];
design←size:=10*unity; design←units:=unity; seven←bit←safe←flag:=false;@/
header←ptr:=18*4; nl:=0; unused←label:=false; nk:=0; ne:=0; np:=0;

@ Most of the dimensions, however, go into the |memory| array. There are
at most 257 widths, 257 heights, 257 depths, and 257 italic corrections,
since the value 0 is required but it need not be used. So |memory| has room
for 1028 entries, each of which is a |fix←word|. An auxiliary table called
|link| is used to link these words together in linear lists, so that
sorting and other operations can be done conveniently.

We also add four ``list head'' words to the |memory| and |link| arrays;
these are in locations |width| through |italic|, i.e., 1 through 4.
For example, |link[height]| points to the smallest element in
the sorted list of distinct heights that have appeared so far, and
|memory[height]| is the number of distinct heights.

@d mem←size=1028+4 {number of nonzero memory addresses}

@<Types...@>=
@!pointer=0..mem←size; {an index into memory}

@ The arrays |char←wd|, |char←ht|, |char←dp|, and |char←ic| contain
pointers to the |memory| array entries where the corresponding dimensions
appear. Two other arrays, |char←tag| and |char←remainder|, hold
the other information that \.{TFM} files pack into a |char←info←word|.

@d no←tag=0 {vanilla character}
@d lig←tag=1 {character has a ligature/kerning program}
@d list←tag=2 {character has a successor in a charlist}
@d ext←tag=3 {character is extensible}

@<Glob...@>=
@!memory:array[pointer] of fix←word; {character dimensions and kerns}
@!mem←ptr:pointer; {largest |memory| word in use}
@!link:array[pointer] of pointer; {to make lists of |memory| items}
@!char←wd:array[byte] of pointer; {pointers to the widths}
@!char←ht:array[byte] of pointer; {pointers to the heights}
@!char←dp:array[byte] of pointer; {pointers to the depths}
@!char←ic:array[byte] of pointer; {pointers to italic corrections}
@!char←tag:array[byte] of no←tag..ext←tag; {character tags}
@!char←remainder:array[byte] of 0..255; {pointers to ligature labels,
  next larger characters, or extensible characters}

@ @<Local...@>=
@!c:byte; {runs through all character codes}

@ @<Set init...@>=
for c:=0 to 255 do
  begin char←wd[c]:=0; char←ht[c]:=0; char←dp[c]:=0; char←ic[c]:=0;@/
  char←tag[c]:=no←tag; char←remainder[c]:=0;
  end;
memory[0]:=@'17777777777; {an ``infinite'' element at the end of the lists}
memory[width]:=0; link[width]:=0; {width list is empty}
memory[height]:=0; link[height]:=0; {height list is empty}
memory[depth]:=0; link[depth]:=0; {depth list is empty}
memory[italic]:=0; link[italic]:=0; {italic list is empty}
mem←ptr:=italic;

@ As an example of these data structures, let us consider the simple
routine that inserts a potentially new element into one of the dimension
lists. The first parameter indicates the list head (i.e., |h=width| for
the width list, etc.); the second parameter is the value that is to be
inserted into the list if it is not already present.  The procedure
returns the value of the location where the dimension appears in |memory|.
The fact that |memory[0]| is larger than any legal dimension makes the
algorithm particularly short.

We do have to handle two somewhat subtle situations. A width of zero must be
put into the list, so that a zero-width character in the font will not appear
to be nonexistent (i.e., so that its |char←wd| index will not be zero), but
this does not need to be done for heights, depths, or italic corrections.
Furthermore, it is necessary to test for memory overflow even though we
have provided room for the maximum number of different dimensions in any
legal font, since the \.{PL} file might foolishly give any number of
different sizes to the same character.

@p function sort←in(@!h:pointer;@!d:fix←word):pointer; {inserts into list}
var p:pointer; {the current node of interest}
begin if (d=0)and(h<>width) then sort←in:=0
else begin p:=h;
  while d>=memory[link[p]] do p:=link[p];
  if (d=memory[p])and(p<>h) then sort←in:=p
  else if mem←ptr=mem←size then
    begin err←print('Memory overflow: more than 1028 widths, etc');
@.Memory overflow...@>
    print←ln('Congratulations! It''s hard to make this error.');
    sort←in:=p;
    end
  else  begin incr(mem←ptr); memory[mem←ptr]:=d;
    link[mem←ptr]:=link[p]; link[p]:=mem←ptr; incr(memory[h]);
    sort←in:=mem←ptr;
    end;
  end;
end;

@ When these lists of dimensions are eventually written to the \.{TFM}
file, we may have to do some rounding of values, because the \.{TFM} file
allows at most 256 widths, 16 heights, 16 depths, and 64 italic
corrections. The following procedure takes a given list head |h| and a
given dimension |d|, and returns the minimum $m$ such that the elements of
the list can be covered by $m$ intervals of width $d$.  It also sets
|next←d| to the smallest value $d↑\prime>d$ such that the covering found
by this procedure would be different.  In particular, if $d=0$ it computes
the number of elements of the list, and sets |next←d| to the smallest
distance between two list elements. (The covering by intervals of width
|next←d| is not guaranteed to have fewer than $m$ elements, but in practice
this seems to happen most of the time.)

@<Glob...@>=
@!next←d:fix←word; {the next larger interval that is worth trying}

@ Once again we can make good use of the fact that |memory[0]| is ``infinite.''

@p function min←cover(@!h:pointer;@!d:fix←word):integer;
var p:pointer; {the current node of interest}
@!l:fix←word; {the least element covered by the current interval}
@!m:integer; {the current size of the cover being generated}
begin m:=0; p:=link[h]; next←d:=memory[0];
while p<>0 do
  begin incr(m); l:=memory[p];
  while memory[link[p]]<=l+d do p:=link[p];
  p:=link[p];
  if memory[p]-l<next←d then next←d:=memory[p]-l;
  end;
min←cover:=m;
end;

@ The following procedure uses |min←cover| to determine the smallest $d$
such that a given list can be covered with at most a given number of
intervals.

@p function shorten(@!h:pointer;m:integer):fix←word; {finds best way to round}
var d:fix←word; {the current trial interval length}
@!k:integer; {the size of a minimum cover}
begin if memory[h]>m then
  begin k:=min←cover(h,0); d:=next←d; {now the answer is at least |d|}
  repeat d:=d+d; k:=min←cover(h,d);
  until k<=m; {first we ascend rapidly until finding the range}
  d:=d div 2; k:=min←cover(h,d); {now we run through the feasible steps}
  while k>m do
    begin d:=next←d; k:=min←cover(h,d);
    end;
  shorten:=d;
  end
else shorten:=0;
end;

@ When we are nearly ready to output the \.{TFM} file, we will set
|index[p]:=k| if the dimension in |memory[p]| is being rounded to the
|k|th element of its list.

@<Glob...@>=
@!index:array[pointer] of byte;

@ Here is the procedure that sets the |index| values. It also shortens
the list so that there is only one element per covering interval;
the remaining elements are the midpoints of their clusters.

@p procedure set←indices(@!h:pointer;@!d:fix←word); {reduces and indexes a list}
var p:pointer; {the current node of interest}
@!q:pointer; {trails one step behind |p|}
@!m:byte; {index number of nodes in the current interval}
@!l:fix←word; {least value in the current interval}
begin q:=h; p:=link[q]; m:=0;
while p<>0 do
  begin incr(m); l:=memory[p]; index[p]:=m;
  while memory[link[p]]<=l+d do
    begin p:=link[p]; index[p]:=m;
    end;
  link[q]:=p; memory[p]:=(l+memory[p]) div 2; q:=p; p:=link[p];
  end;
memory[h]:=m;
end;

@* The input phase.
We're ready now to read and parse the \.{PL} file, storing property
values as we go.

@<Glob...@>=
@!c:byte; {the current character or byte being processed}

@ @<Read all the input@>=
cur←char:=" ";
repeat while cur←char=" " do get←next;
if cur←char="(" then @<Read a font property value@>
else if (cur←char=")")and not input←has←ended then
  begin err←print('Extra right parenthesis');
  incr(loc); cur←char:=" ";
  end
@.Extra right parenthesis@>
else if not input←has←ended then junk←error;
until input←has←ended

@ The |junk←error| routine just referred to is called when something
appears in the forbidden area between properties of a property list.

@p procedure junk←error; {gets past no man's land}
begin err←print('There''s junk here that is not in parentheses');
@.There's junk here...@>
skip←to←paren;
end;

@ For each font property, we are supposed to read the data from the
left parenthesis that is the current value of |cur←char| to the right
parenthesis that matches it in the input. The main complication is
to recover with reasonable grace from various error conditions that might arise.

@<Read a font property value@>=
begin get←name;
if cur←code=comment←code then skip←to←end←of←item
else if cur←code>character←code then
  flush←error('This property name doesn''t belong on the outer level')
@.This property name doesn't belong...@>
else  begin @<Read the font property value specified by |cur←code|@>;
  finish←the←property;
  end;
end

@ @<Read the font property value spec...@>=
case cur←code of
check←sum←code: begin check←sum←specified:=true; read←four←bytes(check←sum←loc);
  end;
design←size←code: @<Read the design size@>;
design←units←code: @<Read the design units@>;
coding←scheme←code: read←BCPL(coding←scheme←loc,40);
family←code: read←BCPL(family←loc,20);
face←code:header←bytes[face←loc]:=get←byte;
seven←bit←safe←flag←code: @<Read the seven-bit-safe flag@>;
header←code: @<Read an indexed header word@>;
font←dimen←code: @<Read font parameter list@>;
lig←table←code: read←lig←kern;
character←code: read←char←info;
end

@ The |case| statement just given makes use of two subroutines that we
haven't defined yet. The first of these puts a 32-bit octal quantity
into four specified bytes of the header block.

@p procedure read←four←bytes(l:header←index);
begin get←four←bytes;
header←bytes[l]:=c0;
header←bytes[l+1]:=c1;
header←bytes[l+2]:=c2;
header←bytes[l+3]:=c3;
end;

@ The second little procedure is used to scan a string and to store it in
the ``{\mc BCPL} format'' required by \.{TFM} files. The string is supposed
to contain at most |n| bytes, including the first byte (which holds the
length of the rest of the string).

@p procedure read←BCPL(l:header←index;n:byte);
var k:header←index;
begin k:=l;
while cur←char=" " do get←next;
while (cur←char<>"(")and(cur←char<>")") do
  begin if k<l+n then incr(k);
  if k<l+n then header←bytes[k]:=cur←char;
  get←next;
  end;
if k=l+n then
  begin err←print('String is too long; its first ',n-1:1,
@.String is too long...@>
    ' characters will be kept'); decr(k);
  end;
header←bytes[l]:=k-l;
while k<l+n-1 do {tidy up the remaining bytes by setting them to nulls}
  begin incr(k); header←bytes[k]:=0;
  end;
end;

@ @<Read the design size@>=
begin next←d:=get←fix;
if (next←d<unity)or(next←d>=@'10000000000) then
  err←print('The design size must be between 1 and 1024')
@.The design size must...@>
else design←size:=next←d;
end

@ @<Read the design units@>=
begin next←d:=get←fix;
if next←d<=0 then
  err←print('The number of units per design size must be positive')
@.The number of units...@>
else design←units:=next←d;
end

@ @<Read the seven-bit-safe...@>=
begin while cur←char=" " do get←next;
if cur←char="T" then seven←bit←safe←flag:=true
else if cur←char="F" then seven←bit←safe←flag:=false
else err←print('The flag value should be "TRUE" or "FALSE"');
@.The flag value should be...@>
skip←to←paren;
end

@ @<Read an indexed header word@>=
begin c:=get←byte;
if c<18 then skip←error('HEADER indices should be 18 or more')
@.HEADER indices...@>
else if 4*c+4>max←header←bytes then
  skip←error('This HEADER index is too big for my present table size')
@.This HEADER index is too big...@>
else  begin while header←ptr<4*c do
    begin header←bytes[header←ptr]:=0; incr(header←ptr);
    end;
  read←four←bytes(4*c);
  header←ptr:=4*c+4;
  end;
end

@ The remaining kinds of font property values that need to be read are
those that involve property lists on higher levels. Each of these has a
loop similar to the one that was used at level zero. Then we put the
right parenthesis back so that `|finish←the←property|' will be happy;
there is probably a more elegant way to do this.

@d finish←inner←property←list==begin decr(loc); incr(level); cur←char:=")";
  end

@<Read font parameter list@>=
begin while level=1 do
  begin while cur←char=" " do get←next;
  if cur←char="(" then @<Read a parameter value@>
  else if cur←char=")" then skip←to←end←of←item
  else junk←error;
  end;
finish←inner←property←list;
end

@ @<Read a parameter value@>=
begin get←name;
if cur←code=comment←code then skip←to←end←of←item
else if (cur←code<parameter←code)or(cur←code>=char←wd←code) then
  flush←error('This property name doesn''t belong in a FONTDIMEN list')
@.This property name doesn't belong...@>
else  begin if cur←code=parameter←code then c:=get←byte
  else c:=cur←code-parameter←code;
  if c=0 then flush←error('PARAMETER index must not be zero')
@.PARAMETER index must not...@>
  else if c>max←param←words then
    flush←error('This PARAMETER index is too big for my present table size')
@.This PARAMETER index is too big...@>
  else  begin while np<c do
      begin incr(np); param[np]:=0;
      end;
    param[c]:=get←fix;
    finish←the←property;
    end;
  end;
end

@ @<Read ligature/kern list@>=
begin while level=1 do
  begin while cur←char=" " do get←next;
  if cur←char="(" then @<Read a ligature/kern command@>
  else if cur←char=")" then skip←to←end←of←item
  else junk←error;
  end;
finish←inner←property←list;
end

@ @<Read a ligature/kern command@>=
begin get←name;
if cur←code=comment←code then skip←to←end←of←item
else if (cur←code<label←code)or(cur←code>stop←code) then
  flush←error('This property name doesn''t belong in a LIGTABLE list')
@.This property name doesn't belong...@>
else  begin case cur←code of
  label←code:@<Read a label step@>;
  lig←code:@<Read a ligature step@>;
  krn←code:@<Read a kerning step@>;
  stop←code:@<Read a stop step@>;
  end;@/
  finish←the←property;
  end;
end

@ When a character is about to be tagged, we call the following
procedure so that an error message is given in case of multiple tags.

@p procedure check←tag(c:byte); {print error if |c| already tagged}
begin case char←tag[c] of
no←tag: do←nothing;
lig←tag: err←print('This character already appeared in a LIGTABLE LABEL');
@.This character already...@>
list←tag: err←print('This character already has a NEXTLARGER spec');
ext←tag: err←print('This character already has a VARCHAR spec');
end;
end;

@ @<Read a label step@>=
begin c:=get←byte;
check←tag(c);
if nl>255 then
  err←print('LIGTABLE with more than 255 commands cannot have further labels')
@.LIGTABLE with more than 255...@>
else  begin char←tag[c]:=lig←tag; char←remainder[c]:=nl;
  unused←label:=true;
  end;
end

@ @<Read a ligature step@>=
begin lig←kern[nl].b0:=0; lig←kern[nl].b1:=get←byte; lig←kern[nl].b2:=0;
lig←kern[nl].b3:=get←byte;
if nl=511 then
  err←print('LIGTABLE should never exceed 511 LIG/KRN commands')
@.LIGTABLE should never...@>
else incr(nl);
unused←label:=false;
end

@ @d stop←flag=128 {value indicating `\.{STOP}' in a lig/kern program}
@d kern←flag=128 {op code for a kern step}

@<Globals...@>=
@!krn←ptr:0..256; {an index into |kern|}

@ @<Read a kerning step@>=
begin lig←kern[nl].b0:=0; lig←kern[nl].b1:=get←byte; lig←kern[nl].b2:=kern←flag;
kern[nk]:=get←fix; krn←ptr:=0;
while kern[krn←ptr]<>kern[nk] do incr(krn←ptr);
if krn←ptr=nk then
  begin if nk<256 then incr(nk)
  else  begin err←print('At most 256 different kerns are allowed');
@.At most 256 different kerns...@>
    krn←ptr:=255;
    end;
  end;
lig←kern[nl].b3:=krn←ptr;
if nl=511 then
  err←print('LIGTABLE should never exceed 511 LIG/KRN commands')
@.LIGTABLE should never...@>
else incr(nl);
unused←label:=false;
end

@ @<Read a stop step@>=
begin if nl=0 then err←print('Why STOP? You haven''t started')
@.Why STOP?...@>
else  begin if unused←label then
    begin err←print('STOP after LABEL invalidates the label');
@.STOP after LABEL...@>
    for c:=0 to 255 do
      if (char←tag[c]=lig←tag)and(char←remainder[c]=nl) then
        char←tag[c]:=no←tag;
    unused←label:=false;
    end;
  lig←kern[nl-1].b0:=stop←flag;
  end;
end

@ Finally we come to the part of \.{PLtoTF}'s input mechanism
that is used most, the processing of individual character data.

@<Read character info list@>=
begin c:=get←byte; {read the character code that is being specified}
@<Print |c| in octal notation@>;
while level=1 do
  begin while cur←char=" " do get←next;
  if cur←char="(" then @<Read a character property@>
  else if cur←char=")" then skip←to←end←of←item
  else junk←error;
  end;
if char←wd[c]=0 then char←wd[c]:=sort←in(width,0); {legitimatize |c|}
finish←inner←property←list;
end

@ @<Read a character prop...@>=
begin get←name;
if cur←code=comment←code then skip←to←end←of←item
else if (cur←code<char←wd←code)or(cur←code>var←char←code) then
  flush←error('This property name doesn''t belong in a CHARACTER list')
@.This property name doesn't belong...@>
else  begin case cur←code of
  char←wd←code:char←wd[c]:=sort←in(width,get←fix);
  char←ht←code:char←ht[c]:=sort←in(height,get←fix);
  char←dp←code:char←dp[c]:=sort←in(depth,get←fix);
  char←ic←code:char←ic[c]:=sort←in(italic,get←fix);
  next←larger←code:begin check←tag(c); char←tag[c]:=list←tag;
    char←remainder[c]:=get←byte;
    end;
  var←char←code:@<Read an extensible recipe for |c|@>;
  end;@/
  finish←the←property;
  end;
end

@ @<Read an extensible r...@>=
begin if ne=256 then
  err←print('At most 256 VARCHAR specs are allowed')
@.At most 256 VARCHAR specs...@>
else  begin check←tag(c); char←tag[c]:=ext←tag; char←remainder[c]:=ne;@/
  exten[ne].b0:=0; exten[ne].b1:=0; exten[ne].b2:=0; exten[ne].b3:=0;
  while level=2 do
    begin while cur←char=" " do get←next;
    if cur←char="(" then @<Read an extensible piece@>
    else if cur←char=")" then skip←to←end←of←item
    else junk←error;
    end;
  incr(ne);
  finish←inner←property←list;
  end;
end

@ @<Read an extensible p...@>=
begin get←name;
if cur←code=comment←code then skip←to←end←of←item
else if (cur←code<var←char←code+1)or(cur←code>var←char←code+4) then
  flush←error('This property name doesn''t belong in a VARCHAR list')
@.This property name doesn't belong...@>
else  begin case cur←code-(var←char←code+1) of
  0:exten[ne].b0:=get←byte;
  1:exten[ne].b1:=get←byte;
  2:exten[ne].b2:=get←byte;
  3:exten[ne].b3:=get←byte;
  end;@/
  finish←the←property;
  end;
end

@ The input routine is now complete except for the following code,
which prints a progress report as the file is being read.

@p procedure print←octal(c:byte); {prints three octal digits}
begin print('''',(c div 64):1,((c div 8) mod 8):1,(c mod 8):1);
end;

@ @<Print |c| in octal...@>=
begin if chars←on←line=8 then
  begin print←ln(' '); chars←on←line:=1;
  end
else  begin if chars←on←line>0 then print(' ');
  incr(chars←on←line);
  end;
print←octal(c); {progress report}
end

@* The checking and massaging phase.
Once the whole \.{PL} file has been read in, we must check it for consistency
and correct any errors. This process consists mainly of running through
the characters that exist and seeing if they refer to characters that
don't exist. We also compute the true value of |seven←unsafe|; we make sure
that the charlists contain no loops; and we shorten the lists of widths,
heights, depths, and italic corrections, if necessary, to keep from
exceeding the required maximum sizes.

@<Glob...@>=
@!seven←unsafe:boolean; {do seven-bit characters generate eight-bit ones?}

@ @<Correct and check the information@>=
@<Make sure the ligature/kerning program ends with `\.{STOP}'@>;
seven←unsafe:=false;
for c:=0 to 255 do if char←wd[c]<>0 then
    @<For all characters |g| generated by |c|,
    make sure that |char←wd[g]| is nonzero, and
    set |seven←unsafe| if |c<128<=g|@>;
if seven←bit←safe←flag and seven←unsafe then
  print←ln('The font is not really seven-bit-safe!');
@.The font is not...safe@>
@<Doublecheck the lig/kern commands and the extensible recipes@>;
for c:=0 to 255 do
  @<Make sure that |c| is not the largest element of a charlist cycle@>;
@<Put the width, height, depth, and italic lists into final form@>

@ @<Make sure the ligature/kerning program ends...@>=
if unused←label then
  begin for c:=0 to 255 do
    if (char←tag[c]=lig←tag)and(char←remainder[c]=nl) then
      char←tag[c]:=no←tag;
  print←ln('Last LIGTABLE LABEL was not used.');
@.Last LIGTABLE LABEL...@>
  end;
if nl>0 then lig←kern[nl-1].b0:=stop←flag

@ The checking that we need in several places is accomplished by two
macros that are only slightly tricky.

@d existence←tail(#)==begin char←wd[g]:=sort←in(width,0);
    print(#,' '); print←octal(c);
    print←ln(' had no CHARACTER spec.');
    end;
  end
@d check←existence(#)==begin g:=#;
  if (g>=128)and(c<128) then seven←unsafe:=true;
  if char←wd[g]=0 then existence←tail

@<For all characters |g| generated by |c|...@>=
case char←tag[c] of
no←tag: do←nothing;
lig←tag: @<Check ligature program of |c|@>;
list←tag: check←existence(char←remainder[c])('The character NEXTLARGER than');
@.The character NEXTLARGER...@>
ext←tag:@<Check the pieces of |exten[c]|@>;
end

@ @<Glo...@>=
@!lig←ptr:0..511; {an index into |lig←kern|}

@ @<Check lig...@>=
begin if char←wd[c]=0 then
  begin print('There''s a LABEL but no CHARACTER spec for ');
@.There's a LABEL but...@>
  print←octal(c); print←ln('.'); char←wd[c]:=sort←in(width,0);
  end;
lig←ptr:=char←remainder[c];
repeat if lig←kern[lig←ptr].b2<kern←flag then
  begin check←existence(lig←kern[lig←ptr].b1)('LIG character generated by');
  check←existence(lig←kern[lig←ptr].b3)('LIG character generated by');
@.LIG character generated...@>
  end
else check←existence(lig←kern[lig←ptr].b1)('KRN character generated by');
@.KRN character generated...@>
incr(lig←ptr);
until lig←kern[lig←ptr-1].b0=stop←flag;
end

@ @<Check the pieces...@>=
begin if exten[char←remainder[c]].b0>0 then
  check←existence(exten[char←remainder[c]].b0)('TOP piece of character');
@.TOP piece of character...@>
if exten[char←remainder[c]].b1>0 then
  check←existence(exten[char←remainder[c]].b1)('MID piece of character');
@.MID piece of character...@>
if exten[char←remainder[c]].b2>0 then
  check←existence(exten[char←remainder[c]].b2)('BOT piece of character');
@.BOT piece of character...@>
check←existence(exten[char←remainder[c]].b3)('REP piece of character');
@.REP piece of character...@>
end

@ The lig/kern program may still contain references to nonexistent characters,
if parts of that program are never used. Similarly, there may be extensible
characters that are never used, because they were overridden by
\.{NEXTLARGER}, say. This would produce an invalid \.{TFM} file; so we
must fix such errors.

@d double←check←tail(#)==@t\1@>if char←wd[0]=0
      then char←wd[0]:=sort←in(width,0);
    print('Unused ',#,' refers to nonexistent character ');
    print←octal(c); print←ln('!');
    end;
  end
@d double←check←lig(#)==begin c:=lig←kern[lig←ptr].#;
  if char←wd[c]=0 then
    begin lig←kern[lig←ptr].#:=0; double←check←tail
@d double←check←ext(#)==begin c:=exten[g].#;
  if c>0 then if char←wd[c]=0 then
    begin exten[g].#:=0; double←check←tail
@d double←check←rep(#)==begin c:=exten[g].#;
  if char←wd[c]=0 then
    begin exten[g].#:=0; double←check←tail

@<Doublecheck...@>=
if nl>0 then for lig←ptr:=0 to nl-1 do
  if lig←kern[lig←ptr].b2<kern←flag then
    double←check←lig(b3)('LIG step')
  else double←check←lig(b1)('KRN step');
@.Unused LIG step...@>
@.Unused KRN step...@>
if ne>0 then for g:=0 to ne-1 do
  begin double←check←ext(b0)('VARCHAR TOP');
  double←check←ext(b1)('VARCHAR MID');
  double←check←ext(b2)('VARCHAR BOT');
  double←check←rep(b3)('VARCHAR REP');
@.Unused VARCHAR...@>
  end

@ @<Make sure that |c| is not the largest element of a charlist cycle@>=
if char←tag[c]=list←tag then
  begin g:=char←remainder[c];
  while (g<c)and(char←tag[g]=list←tag) do g:=char←remainder[g];
  if g=c then
    begin char←tag[c]:=no←tag;
    print('A cycle of NEXTLARGER characters has been broken at ');
@.A cycle of NEXTLARGER...@>
    print←octal(c); print←ln('.');
    end;
  end

@ @<Glob...@>=
@!delta:fix←word; {size of the intervals needed for rounding}

@ @d round←message(#)==if delta>0 then print←ln('I had to round some ',
@.I had to round...@>
  #,'s by',(((delta+1) div 2)/@'4000000):1:7,' units.')

@<Put the width, height, depth, and italic lists into final form@>=
delta:=shorten(width,255); set←indices(width,delta); round←message('width');@/
delta:=shorten(height,15); set←indices(height,delta); round←message('height');@/
delta:=shorten(depth,15); set←indices(depth,delta); round←message('depth');@/
delta:=shorten(italic,63); set←indices(italic,delta);
  round←message('italic correction');

@* The output phase.
Now that we know how to get all of the font data correctly stored in
\.{PLtoTF}'s memory, it only remains to write the answers out.

First of all, it is convenient to have an abbreviation for output to the
\.{TFM} file:

@d out(#)==write(tfm←file,#)

@ The general plan for producing \.{TFM} files is long but simple:

@<Do the output@>=
@<Compute the twelve subfile sizes@>;
@<Output the twelve subfile sizes@>;
@<Output the header block@>;
@<Output the character info@>;
@<Output the dimensions themselves@>;
@<Output the ligature/kern program@>;
@<Output the extensible character recipes@>;
@<Output the parameters@>

@ A \.{TFM} file begins with 12 numbers that tell how big its subfiles are.
We already know most of these numbers; for example, the number of distinct
widths is |memory[width]+1|, where the $+1$ accounts for the zero width that
is always supposed to be present. But we still should compute the beginning
and ending character codes (|bc| and |ec|), the number of header words (|lh|),
and the total number of words in the \.{TFM} file (|lf|).

@<Gl...@>=
@!bc:byte; {the smallest character code in the font}
@!ec:byte; {the largest character code in the font}
@!lh:byte; {the number of words in the header block}
@!lf:0..32767; {the number of words in the entire \.{TFM} file}
@!not←found:boolean; {has a font character been found?}
@!temp←width:fix←word; {width being used to compute a check sum}

@ It might turn out that no characters exist at all. But \.{PLtoTF} keeps
going and writes the \.{TFM} anyway. In this case |ec| will be~0 and |bc|
will be~1.

@<Compute the twelve...@>=
lh:=header←ptr div 4;@/
not←found:=true; bc:=0;
while not←found do
  if (char←wd[bc]>0)or(bc=255) then not←found:=false
  else incr(bc);
not←found:=true; ec:=255;
while not←found do
  if (char←wd[ec]>0)or(ec=0) then not←found:=false
  else decr(ec);
if bc>ec then bc:=1;
incr(memory[width]); incr(memory[height]); incr(memory[depth]);
incr(memory[italic]);@/
lf:=6+lh+(ec-bc+1)+memory[width]+memory[height]+memory[depth]+
memory[italic]+nl+nk+ne+np;

@ @d out←size(#)==out((#) div 256); out((#) mod 256)

@<Output the twelve subfile sizes@>=
out←size(lf); out←size(lh); out←size(bc); out←size(ec);
out←size(memory[width]); out←size(memory[height]);
out←size(memory[depth]); out←size(memory[italic]);
out←size(nl); out←size(nk); out←size(ne); out←size(np);

@ The routines that follow need a few temporary variables of different types.

@<Gl...@>=
@!j:0..max←header←bytes; {index into |header←bytes|}
@!p:pointer; {index into |memory|}
@!q:width..italic; {runs through the list heads for dimensions}
@!par←ptr:0..max←param←words; {runs through the parameters}

@ The header block follows the subfile sizes. The necessary information all
appears in |header←bytes|, except that the design size and the seven-bit-safe
flag must still be set.

@<Output the header block@>=
if not check←sum←specified then @<Compute the check sum@>;
header←bytes[design←size←loc]:=design←size div @'100000000;
  {this works since |design←size>0|}
header←bytes[design←size←loc+1]:=(design←size div @'200000) mod 256;
header←bytes[design←size←loc+2]:=(design←size div 256) mod 256;
header←bytes[design←size←loc+3]:=design←size  mod 256;
if not seven←unsafe then header←bytes[seven←flag←loc]:=128;
for j:=0 to header←ptr-1 do out(header←bytes[j]);

@ @<Compute the check sum@>=
begin c0:=bc; c1:=ec; c2:=bc; c3:=ec;
for c:=bc to ec do if char←wd[c]>0 then
  begin temp←width:=memory[char←wd[c]];
  if design←units<>unity then
    temp←width:=trunc((temp←width/design←units)*1048576.0);
  temp←width:=temp←width + (c+4)*@'20000000; {this should be positive}
  c0:=(c0+c0+temp←width) mod 255;
  c1:=(c1+c1+temp←width) mod 253;
  c2:=(c2+c2+temp←width) mod 251;
  c3:=(c3+c3+temp←width) mod 247;
  end;
header←bytes[check←sum←loc]:=c0;
header←bytes[check←sum←loc+1]:=c1;
header←bytes[check←sum←loc+2]:=c2;
header←bytes[check←sum←loc+3]:=c3;
end

@ The next block contains packed |char←info|.

@<Output the character info@>=
index[0]:=0;
for c:=bc to ec do
  begin out(index[char←wd[c]]);
  out(index[char←ht[c]]*16+index[char←dp[c]]);
  out(index[char←ic[c]]*4+char←tag[c]);
  out(char←remainder[c]);
  end

@ When a scaled quantity is output, we may need to divide it by |design←units|.
The following subroutine takes care of this, using floating point arithmetic
only if |design←units<>1.0|.

@p procedure out←scaled(x:fix←word); {outputs a scaled |fix←word|}
var z:real; {a number to output after conversion to fixed point}
@!n:byte; {the first byte after the sign}
@!m:0..65535; {the two least significant bytes}
begin if abs(x/design←units)>=16.0 then
  begin print←ln('The relative dimension',x/@'4000000:1:3,
    ' is too large.');
@.The relative dimension...@>
  print('  (Must be less than 16*designsize');
  if design←units<>unity then print(' =',design←units/@'200000:1:3,
      ' designunits');
  print←ln(')'); x:=0;
  end;
if x<0 then out(255) else out(0);
if design←units=unity then
  begin if x<0 then x:=x+@'100000000;
  n:=x div @'200000; m:=x mod @'200000;
  end
else  begin z:=(x/design←units)*16.0;
  if z<0 then z:=z+256.0;
  n:=trunc(z); m:=trunc(65536.0*(z-n));
  end;
out(n); out(m div 256); out(m mod 256);
end;

@ We have output the packed indices for individual characters.
The scaled widths, heights, depths, and italic corrections are next.

@<Output the dimensions themselves@>=
for q:=width to italic do
  begin out(0); out(0); out(0); out(0); {output the zero word}
  p:=link[q]; {head of list}
  while p>0 do
    begin out←scaled(memory[p]);
    p:=link[p];
    end;
  end;

@ @<Output the ligature/kern program@>=
if nl>0 then for lig←ptr:=0 to nl-1 do
  begin out(lig←kern[lig←ptr].b0);
  out(lig←kern[lig←ptr].b1);
  out(lig←kern[lig←ptr].b2);
  out(lig←kern[lig←ptr].b3);
  end;
if nk>0 then for krn←ptr:=0 to nk-1 do out←scaled(kern[krn←ptr])

@ @<Output the extensible character recipes@>=
if ne>0 then for c:=0 to ne-1 do
  begin out(exten[c].b0);
  out(exten[c].b1);
  out(exten[c].b2);
  out(exten[c].b3);
  end;

@ For our grand finale, we wind everything up by outputting the parameters.

@<Output the parameters@>=
for par←ptr:=1 to np do
  begin if par←ptr=1 then
    @<Output the slant (|param[1]|) without scaling@>
  else out←scaled(param[par←ptr]);
  end

@ @<Output the slant...@>=
begin if param[1]<0 then
  begin param[1]:=param[1]+@'10000000000;
  out((param[1] div @'100000000)+256-64);
  end
else out(param[1] div @'100000000);
out((param[1] div @'200000) mod 256);
out((param[1] div 256) mod 256);
out(param[1] mod 256);
end

@* The main program.
The routines sketched out so far need to be packaged into separate procedures,
on some systems, since some \PASCAL\ compilers place a strict limit on the
size of a routine. The packaging is done here in an attempt to avoid some
system-dependent changes.

@p procedure param←enter;
begin @<Enter the parameter names@>;
end;
@#
procedure name←enter; {enter all names and their equivalents}
begin @<Enter all of the names...@>;
param←enter;
end;
@#
procedure read←lig←kern;
var krn←ptr:0..256; {an index into |kern|}
@!c:byte; {runs through all character codes}
begin @<Read ligature/kern list@>;
end;
@#
procedure read←char←info;
begin @<Read character info list@>;
end;
@#
procedure read←input;
begin @<Read all the input@>;
end;
@#
procedure corr←and←check;
var c:byte; {runs through all character codes}
@!lig←ptr:0..511; {an index into |lig←kern|}
@!g:byte; {a character generated by the current character |c|}
begin @<Correct and check the information@>
end;

@ Here is where \.{PLtoTF} begins and ends.

@p begin initialize;@/
name←enter;@/
read←input; print←ln('.');@/
corr←and←check;@/
@<Do the output@>;
end.

@* System-dependent changes.
This section should be replaced, if necessary, by changes to the program
that are necessary to make \.{PLtoTF} work at a particular installation.
It is usually best to design your change file so that all changes to
previous sections preserve the section numbering; then everybody's version
will be consistent with the printed program. More extensive changes,
which introduce new sections, can be inserted here; then only the index
itself will get a new section number.
@↑system dependencies@>

@* Index.
Pointers to error messages appear here together with the section numbers
where each ident\-i\-fier is used.