Number: 2260 Date: 25-Sep-84 13':03':21 Submitter: Sannella.PA Source: jonl.pa Subject: improve the litatom hashing algorithm Assigned To: Attn: jonl Status: Open In/By: Problem Type: Performance Impact: Moderate Difficulty: Frequency: Priority: Perhaps System: Language Support Subsystem: Storage Formats/Mgt Machine: 1132 Disk: Lisp Version: 24-Sep-84 09':49':01 Source Files: Microcode Version: 5124 Memory Size: 4096 File Server: Server Software Version: Disposition: Description: ' ' Date': 3 Jun 84 23':11 PDT' From': Jonl.pa' Subject': FOO -- still the biggest loser in the world' To': LispCore^.pa' cc': Jonl.pa' ' Mitch Model''s recent note to LispSUpport emphasizes the problems we have with litatoms -- ' ...' 2) the hashing algorithm performs poorly when one nears that limit' ' It certainly would be nice if atom hashing could simply be an application of Interlisp''s general hashtable facility, although the "wired-in" approach does make the common case (low number of probes) go a bit faster.' ...' ' Now for the bad news again -- FOO is "still the biggest loser in the world". In my abc.sysout made from the recent full.sysout and having about 25K atoms, it takes *** 207 *** probes to find the atom FOO. I was wondering why it swapped so much when I merely did' (SETQ FOO 5)' and had trouble blaming the swapping on EVAL, or DWIM, or LISPX, or . . . ' ' See the file {Phylum}ATOMHIST.TXT for the printout of a histogram of ATOMHASH#PROBES (disregard the 0 entry -- thats for the single-character atoms); it seems to be a nicely-shaped exponential-decay curve for the interval from 1 to about 35 probes, and dribbles on randomly up to 207 probes. I have a log-plot for which 1-35 is not far off from a straight line with non-zero slope; from there on, it appears to be a random (i.e., "even") distribution.' ' -- JonL --' ' -----' ' Date': 5 Jun 84 21':36 PDT' From': JonL.pa' Subject': Re': Efficiency Hacks?' In-reply-to': LispSupport.pa''s message of 4 Jun 84 17':12':29 PDT (Monday)' To': LispSupport.pa' cc': JonL.pa' ' If Mitch Model''s conjectures are true, then we have a a few more good reasons to proceed with the extension to litatom capabilities in Interlisp-D. I have three main points to make, the second of which is rather lengthy, so I''ll warn you in advance':' ...' 2) The importance of swapping in the litatom-hash case' 3) Performance tools': how to find out if you are being bitten by (2).' ...' ' 2) They say that their subjective impression is that litatom hashing is slow. They could well be right. They should be pointed to ATOMHASH#PROBES, which is in the Lisp.sysout now (but may still be lacking in documentation?). ' ' A good hashing algorithm, such as that in Knuth''s book on sorting and searching, will have a small average-number-of-probes and a fairly short "tail", even when the table is around 85% full; as I mentioned in a note to LispCore^ last week, Interlisp-D''s litom-hash algorithm produces a fairly small average-number-of-probes but an *** incredibly long "tail" ***. So the problem isn''t just that they have a lot of litatoms; rather, it''s that they are referencing ones that live on the tail (of course, the problem would be moot if there were enough real memory to hold the full virtual space).' ' The importance of knowing how many probes it will take to locate an atom can''t be overlooked. If found immediately, then the time will be bounded (at worst) by the time to swap in the relevant page of the litatom hasharray, the page of pname-pointer space, and the page of pname-character space; if we take 40ms as an average page-fault time, then MKATOM shouldn''t take longer than about 1/10 of a second. [on a Dorado, when no swapping is involved, litatom hashing takes about 110us plus 25us for each probe.] On the other hand, there are 256 pages of litatom hashtable, 256 pages of pname-pointer space, and at least as many of pname-character space; a "worst case" could take over 750 faults, with a time of about 30 seconds **** due entirely to a poorly-performing hash algorithm ****.' ' That is, we have three-orders-of-magnitude slowdown; and we can''t blame "pageing" in general, because a litom-hash should cause at most a couple of page faults -- not a couple hundred faults!' ' Is such a "worst case" realistic? Will it ever happen in any notable frequency? The answer, unfortunately, is yes. Litatoms created "last" have a higher probability of falling into the long "tail"; thus the ones of "your" application, which you use all the time, will likely be the ones to suffer the "whip of the tail".' ' Consider some statistics taken from the recent Full.sysout, with my working environment loaded': ' 2.9 is the expected number of probes, averaged over all atoms' 75% of all atoms were found in 2 or fewer probes ' 95% of all atoms were found in 12 or fewer probes' Those statistics look "pretty good", *** but the remainder 5% were about equally distributed on the interval up to 207 probes ***. In fact, my favorite meta variable, FOO, is the one entry at 207! It took a long time to convince myself, let alone convince others, that the 10 second pause I experienced when typing' (SETQ FOO 5)' was due to litatom hashing, and not to DWIM, or GC or any of "the usual suspects". This is on a Dorado where, other things being equal, the average page-fault time is more like 25ms [I observed about 49ms for Dolphin and 37ms for DLion with 40MB Quantum disk. But remember, these are "one-shot" averages.]' ' Compare those 10 seconds with the usual few 10''s of milliseconds for this action!' ' ' ' 3) There is a difficulty in even noticing that litatom hashing is the problem, because of the interaction with virtual memory problems; one has to have a "nose" for this sort of smelly problem.' ' I tried the usual metering tools on the forms (MKATOM "NIL"), which takes only one probe, (MKATOM "T") which takes 0 probes (single character litatoms are handled separately), and (MKATOM "FOO"), which takes 207 probes.' ' -- SPY''s output is useless; it merely said that 100% of the time was being spent in MKATOM.' ' -- DOSTATS, if it were working correctly, would be equally useless. It conveniently filters out the page swap time; only by a "bug" in the filtering did it leak out that 43% of the time went into \WRITEMAP (and as it happens, \WRITEMAP time seems to be less than 25% of time spent in swapping).' ' -- TIMEALL at least told me about the gross page faulting behaviour ' ' Without suspecting pageing, one might be tempted to interpret the above results as saying that the probe comparison is poorly coded; yet more persistent analysis shows that it costs only about 61us per probe (on a DLion) when faulting isn''t involved -- so 207 probes on a DLion should cost at most 14ms, rather than 14+ seconds which does happen to me frequently.' ' By judicious use of \RELEASEWORKINGSET and a sub-piece of TIMEALL, I found that, in my environment, a "fresh" lookup of T takes 5 faults, NIL takes 8, and FOO takes 323. ' ' The confounding thing is that in a Full.sysout, before I load my applications, there are 20800 litatoms, and the lookup of the worst-case litatom (again, FOO!) costs only 55 probes and 95 page faults. Why should loading a "small" application, with only about 5000 litatoms, blow the whole thing up? The answer must be that the hash algorithm begins to break down at an occupancy level of about 70%' ' So one can merely recommend that a user snoop around with ATOMHASH#PROBES to "indict" the lookup of some of his favorite atoms.' ' ' -- JonL --' ' -----' ' Date': 11 Jun 84 15':00 PDT' From': JonL.pa' Subject': FOO' To': Sannella' cc': JonL.pa' ' I added the line' (QUOTE FOO)' to NEXT>LOADFULL.CM, just after the LOADUP(HUGE], for the obvious reason.' ' -- JonL --' ' Workaround: Test Case: Edit-By: Edit-Date: