Camelot Notes
November 28, 1988
Bob Hagmann, Xerox PARC
The following is based on the "Guide to the Camelot Distributed Transaction Facility" Release 1 Edition 0.98 (51) [aleph] dated May 3, 1988. We also have the release.
Note: our standard character set does not have an underscore. That character code is a back arrow ("_"). Please read "_" as underscore.
It's been years since I've looked at C code. It's real hard for me to read much of the code. The fifth executable statement in CamLib/Lock.c is a call to LockLookup. I decided to check it out. After grepping and poking around for an hour, I finally found it (thanks to a comment). I was not encouraged. C is worse than I remembered!
Let me describe the operating environment. We are currently running Cedar on SUNs under SUN OS. To use Camelot, we have also ported to Mach (all but threads so far). We are using the Cedar language as the implementation language for our data server. We can call C routines from Cedar. Many of the clients of the data server would not be running Mach, let alone Camelot. They could be almost anything (e.g., a Symbolics workstation). We want to run Cedar on other hardware besides SUNs.
The project is to build a large scale hypertext database system. The project has lots more features and goals, but this one pushes much of the design. Large scale means that a typical server might have about 100 megabytes of memory, 20 gigabytes of magnetic disk, and a terabyte of jukebox optical WORM disks. It also should scale and adapt to available technology (a long discussion could follow here).
Questions and/or concerns (mostly a reply of mail):
1) Recovery
I have a need to insert into the log useful information that does not correspond all that well to recoverable storage. Can I log clean pages and get them inserted into virtual memory during recovery with them looking clean (without the disk write)? Can I get data added to the checkpoints or be informed just after a checkpoint happens?
I also would like to do some non-write-ahead-log recovery. Alpine has a trick for new files. It writes the file directly to disk before commit. This means that the data is only written once. Write-ahead logging demands that the data be written twice: once to the log and eventually to its home location on disk. For large objects (transfer time comparable with seek + latency), this is a win. Can I trick Camelot into letting me do this? I need to force pages to backing store and know when they have made it.
I would like help with both of these eventually, not right away.
2) Begin/End from different threads
Page 128 says that the "end call for a transaction must come from the same process (identified by taPort) as the earlier begin call." First of all, I am confused by the use of the term process. Second, it should be obvious, I guess, but which port is the taPort (Dean told me)? It was omitted from the description on page 125 of TA_BEGIN. Can the end call come from a different data server?
One thing I want to do is build a simple transaction data server. Its interface just has start, end, and abort transaction calls. It returns the tid on start, and needs the tid for the other calls. The tid can then be passed to any other data server to link them together as a transaction. The simple transaction data server can be used by any program that can RPC to it, not just programs running under Camelot.
A second thing is that I want to be able to take the tid and pass it around to different machines. The machines can issue concurrent requests to various data servers. Clearly the tid may become invalid, and other such errors may occur. Is there a problem with this?
3) Applications get to set the commit protocol
Why? I don't want to participate in a possibly blocking commit protocol since it can block my server.
4) Can commit hang?
Suppose a misbehaving server (not mine!) gets deadlocked or caught in a loop in response to a prepare message. Does this hang the commit protocol?
5) Is there unmapping of memory?
Section 10.4 talks about mapping. Can you unmap? Virtual memory is not big enough.
6) How many and how big are segments?
Is the one segment restriction only for the Camelot Library? Are segments 48 bits or 32 bits of address (see pages 111 and 202)? I want gigabytes of recoverable storage, but I don't care about the size or number of segments. It should be changeable.
7) What are the optr's in Chapter 10?
In general, the guide implicitly assumes that you can recognize Mach calls and data types.
8) Locking
It should be clearly stated that locking is a function implemented by the library.
9) Commit protocol
Is it legal to issue DS_PinObject, DS_LogNewValue, and DS_LogOldValueNewValue calls after a ST_Vote call has been received for a transaction? I see that it is OK to call DS_Prepare (although the comment about spooling lock tables in DS_Prepare makes me wonder even more about #1). I want to delay updating recoverable storage until commit starts - I want intentions.
10) Old value logging
Is it possible to issue the DS_LogOldValueNewValue with a null new value, repin the object, and then later call DS_LogNewValue with the new value? Otherwise, to modify some data involves: (1) allocating some storage for the old value, (2) copying the value into the storage, (3) applying the change, (4) logging, and (5) deallocating the storage. I can avoid the allocation, copy, and deallocation if I can call DS_LogOldValueNewValue with a null new value. The Camelot logger does a copy, but it has to do it anyway. I'm assuming the allocation, copy, and deallocation are more expensive than a pair of context switches. Maybe Mach allows me to allocate some copy on write virtual memory so I can avoid the copy, but then Mach does the copy (our Mach tape and documentation arrived today so I am a bit hazy about how to really use Mach).
11) Mach fears
Mach has restrictions on the number of threads, no adequate debugger, and primitive threads (C threads).
Much of the C library, apparently, will not work with concurrent access.
Why I am not using the Camelot Library
In my travels, I think I have discovered four reasons why the Camelot Library will not suit my purposes. Stuff like START_SERVER and the like seem useful, but little else. Let me explain why to make sure I've not missed something and for your own information. Comments solicited.
1) I have too much recoverable data.
It looks like I can just ignore the library's support for a single recoverable segment and build my own buffer manager. No big deal. I know that Camelot now only supports a single recoverable segment per server, but it is not architecturally limited to one. The Library is limited to one. This does not prevent me from using the Library.
2) BEGIN_TRANSACTION and END_TRANSACTION
These macros (and all that) seem to imply that the library wants to control all concurrent transactions. (3) below is one instance where this is violated. However, the real issue here is that a transaction abort forces a longjmp to the END_TRANSACTION (right?). This is a real problem. For threads that are executing on behalf of an RPC call to the server where the BEGIN_TRANSACTION thread is suspended or where a TS_Join was done, where does it make sense to longjmp to? Even worse, the longjmp does not do a stack unwind (right?). This will blow out all of the Cedar language monitors held whenever the longjmp happens. (The data structure I am most worried about is thread_data_block fields current and topLevel in camlib_internals.h). I am definitely not willing to give up the language advantages of Cedar to use the library.
3) The lock manager does not let me adapt it.
I need to put in triggers. I need different flavors of locks.
Details
1) Page 32 - Initialize_Server "called once when the server is first brought up" -- make it really, really clear that this is a once in a lifetime event.
2) Page 38 - 4.2.3 - make clear the consequences of having BEGIN_TRANSACTION and END_TRANSACTION delimit a "single code block." Talk about longjmp.
3) Page 98 - 8.4.2.1 - CamLib uses netname. Which is the right thing to do?
4) Page 98 - 8.4.2.3 - So what are valid and/or interesting names?
5) Page 111 - 10.2 - second paragraph. Is it me, or does everybody have lots of trouble with this paragraph?
6) Page 121 - 11.1 - first bullet -- locks do not exist to the transaction manager. They are a fiction of CamLib. This bullet is misleading (it reinforced my idea that locking was provided by Camelot exclusive of CamLib), or it is just plain wrong.
7) Page 125 - 11.2.2 - you have to define "hybrid atomic" if you are going to use it.
8) Page 128 - 11.2.2 - ER_COMMIT_FAILED should be AC_COMMIT_FAILED?
9) Page 133-134 - 11.4.4 - Again, there are no locks at this level of abstraction.
10) Page 135 - 11.5.1 - RT_Abort should be ST_Abort?