CSLNotebookEntryToCedarusersDateNovember10,1983FromEdTaftLocationPaloAltoSubjectBTreepackageOrganizationPARC/CSLXEROXReleaseas[Indigo]Documentation>BTreeDoc.tiogaDraft[Indigo]Documentation>BTreeDoc.tiogaLasteditedbyTaft,November10,19831:56pmAbstractThisCedarpackagemaintainsanorderedcollectionofobjectsasaBTree.Theobjectsmaybeofdifferentsizes,andtheremaybealargenumberofthem(tensorhundredsofthousands).TheamountofvirtualmemoryrequireddoesnotdependonthesizeoftheBTree,andthecostoffinding,inserting,anddeletingobjectsincreasesonlyveryslowlyastheBTreegetslarger.Thepackagemakesveryfewassumptionsabouttherepresentationoftheobjectsbeingstoredoraboutthepropertiesofthestorageitself.ThismemogivesanoverviewoftheoperationoftheBTreepackageandpresentstheresultsofsomeperformancemeasurements.p^ ?q [ r1[ Iq([ r3C[ :; Vs>;!T$bq ;s;$P&<+Y 13?79:?AF :   } "#'c,.t257_=?X F 8f# 1#j%*,/P236-:=o?BDn 6 '"%)-]/"158=?EV 5  R/ &~(3*/(278<? EzG/ 3pG%'*]0259d>AG/ 1;  TVm$2IntroductionABTreeisadatastructureformaintaininganorderedcollectionofobjects(orentries)storedasatreeoffixed-sizepages.Theentriesareeachsmallerthanapageandarestoredmanyperpage;thus,eachinteriorpageofthetreehasmanybranches,soevenaverylargetreeisquiteshallow(adepthof3or4istypical).Thismeansthatfindinganarbitraryentryisrelativelyinexpensiveintermsofthenumberofpagesthatmustbeaccessed.Thecostofanupdateisalsoreasonableandhasanupperboundwhichislinearinthedepthofthetree.Finally,thetreehasverygoodlocality,soreferencestoconsecutiveentriesusuallyaccessthesamepage.ForfurtherinformationonthepropertiesofBTreesingeneral,pleasereadKnuth,TheArtofComputerProgramming,vol.3,section6.2.4.ThisBTreepackageisaCedartransliterationofonewritteninBCPLbyEdMcCreightandusedformaintainingthefiledirectoryinIFS.TheimportantpropertiesthatdistinguishitfromexistingMesaBTreepackagesare:GeneralitytheBTreepackagemakesnoassumptionsabouttherepresentationofentries;itsimplystoresuninterpretedblocksofwords.Itneedknownothingaboutwhatportionofanentrycontainsthekey''orthevalue''.Allknowledgeabouttherepresentationofentriesisvestedinclient-providedprocedures.ThepackagealsomakesnoassumptionsaboutthemeansusedtostoretheBTreepages,butinsteadcallsclient-providedprocedurestoaccessthem.Capacitythereisnopracticallimittothenumberofentriesthatcanbestored.Inparticular,thereisnorequirementthatallofthepagesoftheBTreeoccupyvirtualmemoryatthesametime.Theultimatelimittothesizeofthetreeis65,535pages,whichshouldbesufficienttostoreatleastamillionentries(evenmoreifpages''largerthan256wordsareused).Storageefficiencyoverheadconsistsofonlyonewordperentryandtwowordsperpage;andthepackagemaintainspagesbetween60and80percentfullontheaverage.Thepackagealsoattemptstokeeplargerentriesintheleafpagesandsmalleronesintheinteriorpagessoastominimizethetree'sdepth.Provenreliabilitysincethelast''bugwasfixedintheBCPLversionofthepackage,therehavebeenmanytensofthousandsofIFS-hoursofoperationwithnosuspicionofaBTree-relatedproblem.NochangesweremadetotheBTreemaintenancealgorithmsduringthetransliterationtoCedar.Itisintendedthatthispackagebesuitableforuseinmaintaininganyordereddatastructurethatmustbestoredexternally,i.e.,iseitherpermanentoris(potentially)toolargetokeepinvirtualmemory.Themostobviousapplicationisafiledirectory.Forsmallerapplicationsinvolvingtemporarydataonly,aninternaldatastructuresuchasaRedBlackTreeismoreefficienttoaccessandmaintain.uHDg/p ^ r [@h   '). 56;qv=[>O[rA[B[FHe Y   V"%*. /G25f7;?B`F1 XN "]G"(*-.25z8A9=AB;CG V A %zi$&,0i1 8 ?AkEVG/ TcsQ"$')5+/03 :`=?fADE4 SX $q|S!L&5(t+-i0W38:L @B: Q\| O.j )c ').k0N5:=TvBIO.C0O.EAG M1 r0M MD! K ] <) *-2U48F:R<CFS I] "  l7"%, 25I <3=@E Gw E3 rE3AE3$6&: -147 =,>CE C lh "C%)^.t2U5:9AG/ A  -!%(k 13~8 9>(@ @= AI  ;"8 )-0 4Z792<>BG >5 "<#'w <r<:<"$&+-247]9F=? F :l  F#?%'',05p; <?`C F 8H,'0!N%)-2)4 :;?@CD 7\; '#M%),Cw 4 rg44#%u(+<.1;47:4>A@DwG/ 2?k #%*-6/N17%:?bB?G 1L I@u n#1(+?,/d4Y8/9;=rCE /w -"h r-"a-"["O%&'+w-8/38:<BFH +{ V-i/$&,/189: C ) N{: ( /36T?@ 'Q9k! w"|'),.C 58=@F % F)? e!(z*0+ 3C58:=?}C $ < M!F 'n*. 6sFCGR XM UY!># *j.jv0U1XU 9 r?U@bUB}vDUEjU T" rT"YT"$:o'*+9/ 68:@DF R{ i#&b),06!8;= EG P M '%*? NQ  d#&|+-03u :c;?AE L&|g%(K .369;=?CeE* K Dw!#b*W+.a137p:?;A$F I[v!R$)035: CkG; G 0 qcG 7Gr"G"Gq$G%G(r.'G.G1546; BC E0M "#%),c169\=1>CEl C oL !$N)+,0F2Y69M B A qAArJAA ,#%* / 6>7l:<?TBG @:DMv $ ,m 57:? > =$&^*),148x <]"%)-? 7]<{?%C :h Hhn (*-605Z ;? 8,1j "% /179?@G 75 #&(u.026;=?PE G/ 5r9lp! (?+2Y5# <@OC 3 ?|8 K#| 1Hzh"$' 2J781: @Cp / W=] '+/2529 B -u !&*, /R37;? F ,R )bp $'*n-H27L9 @hE ('38q('('r('(' %*+T,2l 8;>AG & RRJ"%).K/3x57=CEG $ v$7$r$v$w$r!$"t$%x'1* ,\ 8TVm$4PerformanceTheperformanceoftheBTreepackagedependsonmanyvariables:treesize,pagesize,accesspatterns,cachingbehaviorofthestorageprimitives,andothers.Afewgeneralstatementsaboutperformanceareofferedhere,followedbysomemeasuredresults.Itiscrucialthatthestorageimplementationincludeasubstantialamountofcaching,managedinanLRUfashion.Inmanysituations,theBTreepackagereferencesagivenpagemultipletimesinthecourseofasingleupdate.Evenignoringthis,inmostapplicationsitiscommonforclientstomakereferencestothesameorclosely-relatedentriesoverashortintervaloftime;theresultinglocalityofreferencehasaperformancebenefitonlywhenredundantstoragereadsareavoidedbyuseofacache.ThesizeofthecacherelativetothesizeoftheBTreecanhaveasubstantialeffectonperformance,astheresultsbelowshow.Additionally,itshouldbenotedthatforalargeBTreeitisdesirabletohavealargepagesize;givenareasonablylargeamountofspaceforacache,itisbettertohaveamodestnumberofrelativelylargepagesthanalargernumberofsmallpages.ThisisbecausethemaincostofperforminganoperationonalargeBTreestoredonadiskisdiskaccesstime(asopposedtotransfertime);andlargerpagesmeanmoreentriesperpage,ashallowertree,andfeweraccessesperoperation.WhenupdatingaBTree,atradeoffmustbemadebetweenthedesireforgoodperformanceandthedesiretomaintainaconsistentpermanentstateofthetree.Ifitisdesiredthatthepermanentstatebeconsistentbetweeneveryupdatethenmorewritesarerequired,thusreducingperformance.TheclientprogramcontrolsthistradeoffbyuseofthemaintainRecomputableStateargumentofBTree.OpenandtheBTree.SetUpdateInProgressprocedure.Ofcourse,ifconsistencyisbeingmaintainedatalowerlevel,e.g.,bykeepingtheBTreeinanAlpinefileunderatransaction,thennoprovisionsneedbemadeattheBTreelevelformaintainingconsistency.ThemeasurementsbelowwereobtainedusingtheBTreeTesttool,whichisavailableasBTreeTest.bcdfromBTree.df.(ThesenumbersareforCedar4.4,usingPilotfilefacilities;nocorrespondingmeasurementshaveyetbeenmadeforCedar5.0.)ThisprogramusesBTreeVMtomanageaBTreestoredinatemporaryfile(whichisdeletedwhenthetoolisdestroyed).Itobtainsparametersfromtheuser,randomlyperformsvariousoperationsonthetree,anddisplaysanumberofperformancestatistics.RunningBTreeTestfromtheUserExecopensaviewercontainingacommandmenu,severaluser-adjustableparameters,andatableofresults.Theparametersare:DiskPages/BTreePagethenumberoffilepages(256words)perlogicalBTreepage.Thelegalrangeis[1..16].CacheSizethenumberoflogicalBTreepageskeptinBTreeVM'scache[8..255].MaxTreeEntriesthemaximumnumberofentriesthetestprogramwilleverpermitthetreetocontain[100..65535].(Notethattheupperlimitisarestrictionofthetestprogram,notoneimposedbytheBTreepackageitself.)LongUpdatethisbooleanswitchcontrolswhetherornottheBTreepackageistokeepthepermanentstateconsistentbetweenupdates.LongUpdate:yes''meansthatwritesareperformedonlywhennecessarytoobtainafreecacheentry.ValidateEveryUpdateturningonthisswitchcausestheentiretreetobecheckedaftereachupdate;thisisusefulonlyduringbasicdebuggingoftheBTreepackage,andisverycostly.AtestisstartedbyclickingStart,andisendedbyclickingStop.Theresultsareupdatedeveryfewsecondswhilethetestisrunning.ClickingInitTreeresetsthetreetoemptybeforethebeginningofatest.(ItisnecessarytodothisafterchangingDiskPages/BTreePageorMaxTreeEntries.)ResetStatsresetsthestatisticstozerobutdoesnotchangethetree.Theresultsare:Treesize,Levels,Entriesthesedescribethecurrentstateofthetree.ThetreesizeisinunitsofuHDg/p ^ r [ { &L+..2. 8;>BHE YL^y :%O ,u/4k6Y9\> E XX D "#$'- U{ 't,I-h 4-9.:@*EG T5 nE d $) 0d1|58T=AUBE? R P!#& ./1U7 9Q=?RC P  o0/ #$(U-R/25:?iAG O?  %5)-/579v;/DFH Iu . HW !H$)+/13!7]8:O>_@%CD G" Q%!U"&~+-v15J8n9?AxDG F' 38v K$(*,/037:=3BD Dr1$5&*H+r147b;,@dB Bv""N%'+D0269$< DG/ @]  +$!'c)-+.0Z13=8:=fD\G > UD!}%#)8+14:u BE =!#hv%=&=r6=7==+> F ;g D6 $&*, 3548\ ?|A B/E 9 .G"O$() 1406- <@AEG/ 8T ' 5b 5K$(^*1489K>@l 3 ' \!%'n*.0C 57 @ 2N jw<">%+V.557D"%v*+12 : ,r '&*,q1+ 8:9@oD *] s"#(e+= 2Dv (br(b`(b#T%A'+/!36;?jC<FI & "v $@r$@$$@i!%),.y59v ! rC!!1 &9(,/I27:=BdDG  p#''() 0B24d6<?ZB G v Knv  rP &+.,/1X5U:r;=B@oB S #i &)-0]476b<?C{   v 1r1q1J!J#(,L.25G68>"AXD   +#%',14U58 :0vr !$@%)+v01r34h7?;=C:F gSrv$g%<gr*g*g-0[3149A=@FHe  !0v'9(Mr5:5v7*8 r@AvBCr   z),"S&)Z,t/L3v r +  &L(-025@8s;e>AABDTG &TVm$5logicalBTreepages.Operations,ms/opthesearethenumberofeachtypeofBTreeoperation(lookup,enumerate,insert,delete,replace)thathavebeenperformedandtheaveragenumberofmillisecondstakenbyeach.Inthecaseofenumerate,whatiscountedisthetotalnumberofentriesenumeratedandtheaveragetimeperentry.Aninsertisanupdatethatinsertsanentrynotpreviouslypresent,whileareplaceisanupdatethatreplacesanexistingentrywithanewonehavingthesamekey.Insertedandreplacemententrieshavesizesselectedatrandomfrom[2..33].CacheRefs,Hit%thetotalnumberofcallstoBTreeVM.ReferencePage,andthepercentageofthosecallsthataccessedapagealreadypresentinthecache.StorageReads,WritesthenumberoftimesBTreeVMtransferredaBTreepagebetweenthecacheandpermanentstorage.Writes/updatetheaveragenumberofstoragewritesperupdateoperation(insert,delete,orreplace).Totalelapsedtimethetotalnumberofsecondsthetesthasrun,excludingtestoverheadsuchasupdatingtheresultsintheviewer,butincludinganyconcurrentactivityelsewhereinCedar.%R+Wtime,ms/(RorW)thepercentageofthetotalelapsedtimespentwaitingforstoragereadsandwrites,andtheaveragenumberofmillisecondstakenbyeachone.Herearesomeresults,obtainedonaDorado(Cedar4.2,runningonPilot):Testnumber1234567DiskPgs/BTreePg1111114CacheSize200200202020020050MaxTreeEntries2500250025002500600006000060000LongUpdateYesNoYesNoYesNoNoTreeSize190190190190436143601037Levels3333443Entries1961194919441940469134702747048ms/opLookup1.901.9138.826.272.855.040.1ms/opEnumerate0.080.081.821.682.752.990.84ms/opInsert4.7248.848.075.786.411971.9ms/opDelete6.6360.869.510112015485.9ms/opReplace3.9630.144.652.879.885.562.4Hit%1001008281818185Writes/update01.741.281.781.351.801.16%R+Wtime0879193949491ms/(RorW)21.717.018.727.428.328.8Afewthingsareworthnotingabouttheseresults.Test1usesacachelargeenoughtoholdtheentiretreeandneverwritesanypagestopermanentstorage(exceptattheendofthetest,whereitisn'tmeasured).ThisgivesagoodapproximationoftheCPUcostofoperationsperformedona3-levelBTree.Asisevidentfromcomparingtest1'sresultswiththeothers,normallymostofthetimerequiredforaBTreeoperationisspentwaitingforthedisk.Test2isthesameastest1exceptthatthepermanentstateiswrittentostorageattheendofeveryupdate.Tests3and4aresimilarto1and2exceptthatthecacheismuchsmallerthanthetree,somanymorereadsandwritesoccur.Interestingly,thechoiceofwhetherornottowritestorageaftereachupdatehasonlyamodesteffecton%R+Wtimeintests3and4.Thisispresumablybecausemostdirtypagesgetwrittenanywayasaresultofbeingdisplacedduringlaterreferencestootherpartsofthetree.(Thedecreaseinlookuptimefromtest3totest4isagoodindicationthatthisiswhatishappening.)Tests5through7areforamuchlargerBTree.Tests6and7showtheeffectofvaryingthepageuHDg/r _/Fhv \ r\(\u" 'q)c,026_<Br [  ha '*g,168 @WDF Ye OW* a^$&B(+1B37 ?EBD W z !$e(*.@0 7]<@CApF0G V}!$& )&+0|269>A To\ $v Q rQ.Qu"$7'.(8;;= DOE PL Cm#[% 'iv MrM8M"$n(. 5z6:=ChE L) v I rIoI"$>(,/39>BYDv G2b*rG2G2$%*-n025;>DG Ecp-!'*c 1I6<]> v CiLr'CC ')\+.36:V?ALE Ah  @! )-/2 >!t `"f#(-,/46v rm*>!-*>(*>0*>8n*>@/*>G*>v (rm(!-(((0(8n(@/(G(v &rm&!-&(&0&8n&@/&G&v %H rm%H!-%H(%H0%H8n%H@/%HG%Hv #brm#!-#(#0#8n#@/#G#v !2rm!!-!(!0!8n!@/!G! }ZR K$L',/134L59=BCG/  M!#+*.357}:3;>HA*EAF . >R $&k(,.0n 7=?@E   jN!j%(+V/59!:=P@FHe 8N! _" dI T!$&-1 2t7$8=~?ApD%E Em? #r$(+-136;>AD E  j #&*0+1 256:>AD n b}q>E #%s(),q-12 9?BqE  8K"(%,/ 6E7;?@C!F=v  rE  '!$!%V&'+^ 147:8<=n Dr!%*\-/613>69B=>CFH QTVm$6size,givenafixed-sizecache.Itisevidentthatthelargerpagesizeyieldsbetterperformanceinallrespects.Thevariationsintimeperreadorwriteacrossallthetestsarepartiallyexplainableasbeingrelatedtothesizeofthetree(alargertreemeansincreasedtimespentseekingbetweendifferentpagesofit).Variationsamongtestswithequal-sizetreesareprobablyPilotartifacts.Finally,itshouldbenotedthatthetestprogramprobestheBTreeatrandom.Realapplicationsmaybeexpectedtoexhibitmorelocalityintreereferences,andconsequentlyahigherhitrateandlesstimespentreadingandwritingstorage.uHDg/r _/ ]  A%P(N*.2o5J9e= EG ] [  e !%C)L+7-028J ?A7D Ye } M$*-16<AEGh W x) !$'$,0< UC[<"$*.046p;? F S R U"$ +.e 67PACFi Q BTVm$ TIMESROMAN TIMESROMAN HELVETICALOGOLAUREL TIMESROMAN TIMESROMAN TIMESROMANY G # /1j/42ںBTreeDoc.tioga10-Nov-83 14:09:27