Number: 1178
Date: 21-May-84 17:17:22
Submitter: Sannella.PA
Source: masinter.pa
Subject: Want hash tables that have more than 32K elements
Lisp Version:
Description:

From: masinter.pa
Date: 14-May-84 21:13:13 PDT
Subject: can we make hash tables that have more than 32K elements?
To: jellinek, kaplan
cc: lispsupport

I think we can. We simply need to change things so that hash arrays keep buckets of entries rather than reprobe. Maybe we need another flavor of hash array, but that's about it...

----------

Return-Path: kipps@rand-unix
Received: from rand-unix (RAND-UNIX.ARPA) by Xerox.ARPA; 14 MAY 84 19:06:59 PDT
Date: Mon, 14 May 84 18:58 PDT
To: 1100Support.pasa, Masinter.PA
Cc: kipps@RAND-UNIX.ARPA, Henry@RAND-UNIX.ARPA, marti@RAND-UNIX.ARPA, guyton@RAND-UNIX.ARPA, obrien@RAND-UNIX.ARPA, vanMelle.pa, 1100Support.pa
Subject: Re: Bug report, HPRINT
In-reply-to: Your message of 14 May 84 17:36 PDT.
From: kipps@RAND-UNIX.ARPA

Yes, this is something I would really like to have working. The circular data structure is Rosie's parse table, and it is very large. To see what we're trying to do, look at ELLIE/MAKEFILE in the ELLIE parser-generator code. In ELLIE/MAKEFILE, the two circular data structures which form the parse table are ELLIE/NTS and ELLIE/ENTRIES.

ELLIE/MAKEFILE allows us to create a Rosie parser, store it to disk, and load it fresh when building a new Rosie sysout; loading from disk is much faster than creating a new parser each time we build a new Rosie sysout. ELLIE/MAKEFILE works fine on the VAX. On the Dolphin we've been punting until now (creating a new parser each time a sysout is built), but I'd like to see it work the way it's supposed to.

Jim K.
-----

From: kaplan.pa
Date: 14-May-84 22:27:46 PDT
Subject: Re: can we make hash tables that have more than 32K elements?
In-reply-to: masinter's message of 14-May-84 21:13:13 PDT
To: masinter
cc: jellinek, kaplan, lispsupport

We can immediately go to 64K if we make arrayblock lengths be counted in quad-words instead of cells. This move is simple and desirable for a number of reasons: it makes for larger arrays of other types, and it eliminates special checks for quad-word alignment.

If we went to linear chains in buckets, we could have arbitrarily sized hash arrays. But that might increase the page faulting/working set a bit.

There must be some theory somewhere about the relative advantages of reprobing vs. chaining in buckets. Anybody know anything about this?

--Ron

-----

Date: 15 May 84 00:02 PST
From: JonL.pa
Subject: Re: can we make hash tables that have more than 32K elements?
In-reply-to: kaplan.pa's message of 14-May-84 22:27:46 PDT
To: kaplan.pa
cc: masinter.pa, jellinek.pa, JonL.pa, lispsupport.pa

I think Knuth argues in his volume on sorting and searching that a reasonable reprobe method (i.e., stepping by the remainder modulo a second prime) is always a better use of the overall memory than chaining. These "flat" hash arrays can profitably be filled up to 85% without impacting the average probe time, whereas in general 50% of your memory will be used up in linking the chains. On the other hand, he doesn't count the effect of cdr-coding, which reduces the amount of memory needed to do "chaining".

"Second prime": the first probe slot is found by remaindering by a "first prime". The remaindering, in each case, is applied to a condensation of the key into some reasonable interval of integers -- e.g., on the PDP-10 this would be the non-negative integers up to 2^35-1; I'd suggest "condensing" to the interval 0 to 2^16-1 for Interlisp-D.

-- JonL --
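To make the scheme JonL describes concrete, here is a minimal sketch in C (not Interlisp-D code): the key is first condensed into the interval 0..2^16-1, the first probe slot comes from remaindering by a first prime, and the reprobe step from remaindering by a second prime. The tiny primes (101 and 97) and the condense() folding are illustrative choices only; a real table would use primes near the desired size.

    #include <stdint.h>

    #define P1 101    /* first prime: the table size (illustrative) */
    #define P2  97    /* second prime: drives the reprobe step      */

    typedef struct { uint32_t key; void *value; int used; } Entry;
    static Entry table[P1];

    /* "Condense" an arbitrary key into the interval 0..2^16-1. */
    static uint32_t condense(uint32_t key) {
        return (key ^ (key >> 16)) & 0xFFFFu;
    }

    /* First probe at h mod P1; on collision, step by 1 + (h mod P2).
       P1 is prime and the step lies in 1..P2 < P1, so the step is
       relatively prime to the table size and every slot gets visited.
       (Assumes the table is never completely full.) */
    static Entry *probe(uint32_t key) {
        uint32_t h = condense(key);
        uint32_t i = h % P1;
        uint32_t step = 1 + (h % P2);
        while (table[i].used && table[i].key != key)
            i = (i + step) % P1;
        return &table[i];
    }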
-----

From: masinter.pa
Date: 15-May-84 9:50:22 PDT
Subject: Re: can we make hash tables that have more than 32K elements?
In-reply-to: kaplan's message of 14-May-84 22:27:46 PDT
To: kaplan
cc: masinter, jellinek, lispsupport

It was my belief that hashing is good for small tables, and that something like a B-tree is best for big data structures.

-----

From: masinter.pa
Date: 15-May-84 9:54:38 PDT
Subject: Re: can we make hash tables that have more than 32K elements?
In-reply-to: JonL's message of 15 May 84 00:02 PST
To: JonL
cc: kaplan, masinter, jellinek, lispsupport

Knuth didn't consider the working-set behavior of systems with demand-paged virtual memories -- his analysis is valid only for tables that are entirely swapped in. That's probably reasonable for tables with up to 1K elements, but beyond that we probably need to think more about memory usage and less about the constant (or small-constant) order of the algorithm.

In Interlisp-10, I implemented hashing with a secondary reprobe that was one of 3, 5, 7, 11, 13, 17, 19; the table size was chosen to be relatively prime to those. This kept the reprobe interval "small", so that even for big tables the amount you had to page in stayed small.

Workaround:
Test Case:
Edit-By:
Edit-Date:
Attn: jellinek, kaplan
Assigned To:
In/By:
Disposition:
System: Language Support
Subsystem: Storage Formats/Mgt
Machine:
Disk:
Microcode Version:
Memory Size:
File Server:
Server Software Version:
Difficulty: Moderate
Frequency: Everytime
Impact: Moderate
Priority: Perhaps
Status: Open
Problem Type: Performance
Source Files:
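For comparison, a minimal sketch in C of the "buckets of entries rather than reprobe" organization proposed at the top of this report: each slot heads a linked chain, so the table has no fixed capacity, at the cost of the link storage (and the extra paging) discussed above. The names and layout are hypothetical, not the actual Interlisp-D hash-array implementation.

    #include <stdlib.h>
    #include <stdint.h>

    /* One entry in a bucket's chain. */
    typedef struct Node {
        uint32_t key;
        void *value;
        struct Node *next;
    } Node;

    typedef struct {
        Node **buckets;     /* array of chain heads */
        size_t nbuckets;    /* entries may outnumber buckets: no 32K limit */
    } HashTable;

    /* Look the key up in its chain; insert a fresh node at the
       head of the chain if it is absent. */
    void *lookup_or_insert(HashTable *t, uint32_t key, void *value) {
        size_t i = key % t->nbuckets;
        for (Node *n = t->buckets[i]; n != NULL; n = n->next)
            if (n->key == key)
                return n->value;       /* found: return existing value */
        Node *n = malloc(sizeof *n);   /* absent: chain a new entry */
        if (n == NULL)
            return NULL;               /* out of memory */
        n->key = key;
        n->value = value;
        n->next = t->buckets[i];
        t->buckets[i] = n;
        return value;
    }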