WallabyDoc.tioga
Jack Kent
Jack Kent, May 13, 1987 4:08:22 pm PDT
Willie-Sue, May 26, 1987 5:07:24 pm PDT
Wallaby
CEDAR 7.0 — FOR INTERNAL XEROX USE ONLY
Wallaby
Jack Kent and Doug Terry
© Copyright 1987 Xerox Corporation. All rights reserved.
Abstract: Wallaby is a tool for browsing Walnut databases. It enables the Walnut user to pose relatively sophisticated queries about a Walnut database simply by filling in a form. (Sophisticated, at least, when compared to the two available Walnut queries: "List all messages in message-set Foo sorted by date" and "List all message sets that own a given message".)
Created by: Jack Kent and Doug Terry
Maintained by: <WallabyImplementors^.pa>
Keywords: database, servers, queries, browsing, LoganBerry
XEROX  Xerox Corporation
   Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, California 94304

For Internal Xerox Use Only
1. Introduction
A. History
Cedar users have long desired the means to fully exploit the database management system underlying Walnut. Wallaby was developed for this reason.
Once upon a time, posing queries to a Walnut database was neither easy nor efficient. The existing package (WalnutQuery), while supporting a very rich query language, was simply too complex for the casual Walnut user. Moreover, the Walnut schema was defined in a way that obscured much of the semantics behind the client's mail model. For example, many properties common to every mail message (e.g. subject, sender) were not declared as attributes in the Walnut message relation. As a result, Walnut queries that might have been resolved almost instantaneously (on the order of seconds) were doomed to wait minutes (or even hours) for a full database text search.
To rectify these problems, we developed Wallaby: a tool that enables the Walnut user to pose queries about his Walnut mail database easily and efficiently. It is built around four components: a generalized user interface for browsing single-table databases (refer to LoganBerryToolsDoc for details about this browser), a revised Walnut schema, an interface that maps (single-table) LoganBerry information requests into multiple-table Walnut (or Cypress) information requests, and an enhanced version of Cypress.
B. Usage
The user invokes Wallaby by buttoning `Query' in his Walnut command window. This creates a Walnut instantiation of the LoganBerryBrowser (called a Wallaby browser) and produces a viewer that looks something like the one below:
[Artwork node; type 'ArtworkInterpress on' to command tool]
A Wallaby Browser
As stated in LoganBerryToolsDoc, the browser has several areas or subviewers.
The top viewer is the caption-menu which simply indicates the name of the database being browsed.
Below the caption is the command-menu. It contains Tioga buttons that allow the user to initiate a query (the browse buttons), stop a query (the stop button), reconstruct a prior query (the history button), or remove non-essential information from the command-menu (the details button). The history and details buttons are explained below.
Toggling the details button removes (and replaces) the filter buttons; it's intended for the fastidious Wallaby user who can't stomach non-essential information on the screen. Toggling the history button creates (and removes) a history window below the input form and details subviewer (it is not shown in the browser above). This area contains a list of the operations that have been executed in the past; these operations are given in a format that could be typed into a CommandTool to get identical results (except the database name is omitted). Clicking on one of these "history commands" fills in the input form with the values that previously existed and also sets the previous details. Thus, a query can be re-executed by selecting a line in the history window and clicking "Browse" again. The user is also free to change any of the values once they are refreshed.
Below the menu is the input form. Each line of the input form allows the user to type in a pattern for a given attribute. Preceding each attribute name on the input form is a choice button that allows the user to select which pattern-matching filter is to be applied to the attribute value. Clicking on this choice button cycles through the various options (see section 2).
Below the input form is another choice button that allows users to select an access path (see section 3).
The user starts the query process by first filling in a form (shown below in an "unfilled-in" state).
[Artwork node; type 'ArtworkInterpress on' to command tool]
The Wallaby Form
Notice that there is a slot in the form for each of the six attributes of a Walnut mail message: the message set that it belongs to (one of possibly many), the sender of the message, the recipient(s) of the message, the date that the message was sent, the subject of the message, and the actual text of the message. The user constrains the value of an attribute by filling in the slot corresponding to that attribute. After the user initiates the query (refer to section 4), any mail message that "matches" the form (refer to section 2) is returned to the user, who can then perform further operations on the output (refer to section 4).
2. Matching
(Clients should refer to LoganBerryToolsDoc for a more complete explanation.) Exactly how the filled-in text alongside an attribute button is "matched" against a mail attribute is determined by a user-selected filter. This filter is chosen by clicking the button to the left of the mail attribute. At present, Wallaby provides ten filters (listed in the order they would be encountered starting with a left click on DWIM). We describe their behavior below, using the notation that Pattern is the user-specified fill-in and Candidate is the information against which Pattern is matched.
1. Exact Filter: Candidate passes filter if it matches Pattern exactly.
2. Prefix Filter: Candidate passes filter if Pattern is a (not necessarily proper) prefix of Candidate.
3. Wildcard Filter: Pattern includes an arbitrary mix of text and the Kleene star (*). The criteria for Candidate passing filter are what you would expect.
4. Regular Expression: Pattern includes an arbitrary mix of text and the regular expression operators (refer to RegularExpressionDoc.tioga for a complete description of these operators). Again, the criteria for Candidate passing filter are what you would expect.
5. Soundex: Refer to LoganBerryToolsDoc.tioga
6. Subrange: Refer to LoganBerryToolsDoc.tioga
7. Numrange: Refer to LoganBerryToolsDoc.tioga
8. Daterange: Pattern includes some Tempus-comprehensible date (lower bound), followed by a hyphen, followed by another Tempus-comprehensible date (upper bound). Candidate passes filter if (a) it can be interpreted as a date AND (b) the date falls within the designated range.
9. Date: Pattern is some Tempus-comprehensible date. Candidate passes filter if (a) it can be interpreted as a date AND (b) that date matches Pattern.
10. Synonym: Pattern is a word. Candidate passes filter if it is a synonym of Pattern. (not implemented...???)
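As a rough illustration of three of the filters above, here is a minimal Python sketch. It is not Wallaby's implementation: the function names are hypothetical, matching is assumed to be case-sensitive, the date range is assumed to be inclusive at both ends, and dates are assumed to be already parsed (Tempus is not modelled).

```python
import re
from datetime import date
from typing import Optional


def exact_filter(pattern: str, candidate: str) -> bool:
    """Exact filter: Candidate must equal Pattern (case-sensitive by assumption)."""
    return candidate == pattern


def wildcard_filter(pattern: str, candidate: str) -> bool:
    """Wildcard filter: each * in Pattern stands for any run of characters."""
    # Escape the literal pieces and join them with ".*" so only * is special.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, candidate, flags=re.DOTALL) is not None


def daterange_filter(lower: date, upper: date, candidate: Optional[date]) -> bool:
    """Daterange filter: Candidate must be a date lying within [lower, upper]."""
    # None stands for "Candidate could not be interpreted as a date".
    return candidate is not None and lower <= candidate <= upper
```

For instance, under these assumptions wildcard_filter("*@princeton*", "kent@princeton.edu") passes, while wildcard_filter("Wall*", "Walnut") does not.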
Very often, the type of filter desired by the user can be inferred from the Pattern alone. To this end, the DWIM filter has been provided. DWIM means simply, "try to do what I mean". More specifically, DWIM looks for special characters reserved by the respective filters within the pattern and attempts to infer the filter type from this information.
[Artwork node; type 'ArtworkInterpress on' to command tool]
Example 1: A form to find all mail in my database, sent between February 1, 1986 and today, from someone at Princeton (DWIM becomes a date-range).
[Artwork node; type 'ArtworkInterpress on' to command tool]
Example 2: A form to find all mail in my database, sent to the distribution list RiverRats.
[Artwork node; type 'ArtworkInterpress on' to command tool]
Example 3: A form to find all mail in my database, sent from someone whose name sounds like `Tearee', on the subject of Wallaby, with the word "suggestions" occurring somewhere in the message text.
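The DWIM inference described in this section can be pictured with the following Python sketch. Everything in it is an assumption made for illustration: this document does not list the special characters DWIM actually looks for, the order of the checks and the fallback to the exact filter are guesses, and parse_date is a stand-in for Tempus that understands only dates written as M/D/YYYY.

```python
from datetime import date, datetime
from typing import Optional


def parse_date(text: str) -> Optional[date]:
    """Stand-in for Tempus (assumption): accepts only M/D/YYYY."""
    try:
        return datetime.strptime(text.strip(), "%m/%d/%Y").date()
    except ValueError:
        return None


def infer_filter(pattern: str) -> str:
    """Guess a filter name from the special characters found in Pattern."""
    if "-" in pattern:
        lower, _, upper = pattern.partition("-")
        # A hyphen separating two parseable dates suggests a date range.
        if parse_date(lower) and parse_date(upper):
            return "daterange"
    if parse_date(pattern):
        return "date"
    if "*" in pattern:
        return "wildcard"
    # Hypothetical fallback when nothing special is found.
    return "exact"
```

With this sketch, a pattern such as "2/1/1986 - 2/1/1987" is classified as a date range and "*princeton*" as a wildcard, which is the spirit of Example 1.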
3. Selecting An Access Path
An access path describes the method a DBMS uses to retrieve a given set of records from a database. Choosing the best access path for processing a query, as Wallaby does, is one facet of database query optimization. In general, query optimization can be quite complicated, especially when there are multiple tables per database and tables are distributed across multiple servers. (Fortunately, neither of these scenarios applies to Wallaby.)
Wallaby chooses among six access paths when processing a query. The first five correspond to the five indexes available on the related mail attribute fields. The last access path is an unordered traversal of the entire database.
The client can override Wallaby's default access path selection by buttoning the desired index. The default (ANY) indicates that Wallaby will try to choose the best access path.
[Artwork node; type 'ArtworkInterpress on' to command tool]
An access path selection set to "Let Wallaby Optimize"
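The selection logic of this section can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the attribute names, the guess that the message text is the one form slot without an index, and the preference order among indexes are all hypothetical. The document does not say how Wallaby actually ranks its access paths.

```python
# Assumed names for the five indexed attributes, in an assumed preference order.
INDEXED_ATTRIBUTES = ("date", "sender", "recipient", "subject", "msgset")
FULL_SCAN = "scan"  # the sixth path: unordered traversal of the whole database


def choose_access_path(form: dict, override: str = "ANY") -> str:
    """Return the access path for a filled-in form.

    form maps attribute names to patterns; empty or missing slots are
    unconstrained. override is the client's choice, with ANY meaning
    "let Wallaby optimize".
    """
    if override != "ANY":
        return override
    for attribute in INDEXED_ATTRIBUTES:
        if form.get(attribute):
            # Use the first constrained attribute that has an index.
            return attribute
    return FULL_SCAN
```

A form that constrains only the message text falls through to the full scan, which is why such queries would be expected to be the slowest.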
4. Initiating a Query and Viewing The Results
Initiating a query is done simply by clicking one of the browse buttons (Browse or BrowseToMsgSet). The operations differ in how they present the results of a Wallaby query.
The first presentation is reminiscent of Wallaby's mail query predecessor (WalnutQuery). When the Wallaby user clicks "BrowseToMsgSet", the results of a query (in the form of messages) are added to a user-specified message set, called the presentation receptacle. (Selecting multiple message sets will NOT provide multiple query receptacles.) An example is given below:
(1) Select Message Set for receptacle (in this case PrinceQuery)
[Artwork node; type 'ArtworkInterpress on' to command tool]
(2) Fill in form and access path (if necessary) and button "BrowseToMsgSet"
[Artwork node; type 'ArtworkInterpress on' to command tool]
(3) Open Query Message Set to view results.
[Artwork node; type 'ArtworkInterpress on' to command tool]
The second presentation mode offers a more lightweight mechanism for viewing the results of a query, one which allows users to circumvent the (sometimes sizeable) overhead of Walnut message set operations. To initiate a query in this mode, the Wallaby user buttons "Browse" (instead of "BrowseToMsgSet"). The results of his query then appear as Tioga buttons in the viewer below the Wallaby form window. Each Tioga button appears as five attributes (we call this a message clump); collectively these attributes uniquely identify a message. Buttoning a clump opens a Walnut message window and displays the corresponding message text. We include an example of this browsing method below.
(1) Fill in Form and button Browse
[Artwork node; type 'ArtworkInterpress on' to command tool]
(2) Overview messages satisfying query by looking at clumps
[Artwork node; type 'ArtworkInterpress on' to command tool]
(3) Button Clump to open message
[Artwork node; type 'ArtworkInterpress on' to command tool]
The first presentation mode ("BrowseToMsgSet Receptacle") offers permanence and consistency at the expense of performance. Message set receptacles persist until explicitly deleted (using Walnut's "Destroy MsgSet") and always present a consistent Walnut state. Viewer receptacles will rarely persist across a rollback and may display inconsistent Walnut information (if followed by a Walnut Expunge). Wallaby makes no attempt to invalidate message clumps after an expunge. Caveat emptor.
Certain of the message set operations (like R.O.T.) were designed specifically to be used with the results of a Wallaby query. An example follows:
Example 1: Suppose I wish to archive all mail from date D and earlier. First, I pose a query that directs all mail satisfying this query to message set M. Second, I R.O.T. message set M (which, for every mail message in M, removes all the links to other message sets in the mail database). Finally, I archive M and then delete it.
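The archiving example can be modelled with a toy Python sketch in which a mail database is a dictionary from message-set names to sets of message identifiers. This is purely illustrative: the function names are hypothetical, the real operations are Walnut and Wallaby buttons rather than function calls, and "archive" here simply returns the message identifiers.

```python
from datetime import date


def browse_to_msg_set(db: dict, dates: dict, cutoff: date, target: str) -> None:
    """Step 1: add every message dated on or before cutoff to the target set."""
    matches = {msg for msg, sent in dates.items() if sent <= cutoff}
    db.setdefault(target, set()).update(matches)


def rot(db: dict, target: str) -> None:
    """Step 2: for every message in target, remove its links to other sets."""
    for name, members in db.items():
        if name != target:
            members -= db[target]  # in-place set difference


def archive_and_delete(db: dict, target: str) -> list:
    """Step 3: hand back the messages for archiving and delete the set."""
    return sorted(db.pop(target))
```

After the three steps, the archived messages no longer appear in any remaining message set, which is the point of running R.O.T. before the archive.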
5. Public Databases
You may be unaware that Walnut can be used for browsing public databases (or even that public mail databases exist). Currently five such databases exist for your perusal: the ARPA-Digest-BBoard database (2400 msgs as of Feb 1, 1987), the Cedar-BBoard database (2000 msgs), the CSL-BBoard database (2400 msgs), the Services-BBoard database (4000 msgs) and the CSL-Notebook database (? msgs). The first four public databases are "Bulletin Board databases"; they accrue mail sent to pre-selected distribution lists relevant to their name (e.g. the Services-BBoard is on the distribution lists Activities.pa, Astronomy.pa, Bikers, etc.). The CSL-Notebook database has a slightly different flavor from the Bulletin Board databases, as the information contained within is not widely disseminated. CSL-Notebook entries contain ideas, knowledge, designs, and papers (drafts or finished) submitted by lab members to be read (primarily) by lab members.
To open a public database, first close your personal database (by buttoning `destroy' in the Walnut command menu) and then type "Walnut <name of root file corresponding to public database>" to the command tool. The root file names are given below, along with a brief description of their contents.
Root File                                         Contents
[Ebbetts.Alpine]<ArpaDigest-BBoard>Walnut.root    mail coming in to Arpa-BBoard
[Luther.Alpine]<Cedar-BBoard>Walnut.root          mail about Cedar
[Luther.Alpine]<CSL-BBoard>Walnut.root            items of general interest to CSL
[Luther.Alpine]<Services-BBoard>Walnut.root       useful services in the PA area
[Luther.Alpine]<CSL-Notebook>Walnut.root          CSL notebook entries
We expect Walnut users will discover that, while Wallaby is helpful for browsing personal databases, it's indispensable for browsing public databases. Here are three reasons:
1. Size of Public Database Message Sets
Many of the message sets in the public databases are large and thus time-consuming to enumerate.
2. Organization of Public Database Message Sets
Organizing the public databases is a time-consuming, thankless task and for this reason, one that's often forgotten.
3. Meaning of a Message-Set
Often the meaning of a public database message-set category is too general, too specific or, even worse, incomprehensible to the naive user.
As an example, consider the services bulletin-board database (below): it was last organized in January 1986, and the active-set (as of 2/1/87) now contains about 2200 messages. Even the boldest of Walnut users will cringe at the thought of enumerating this message set. And, what's a xenia?
[Artwork node; type 'ArtworkInterpress on' to command tool]
The Services Bulletin Board Database
The CSL-Notebook whiteboards provide an easy access path to both the CSL-Notebook entry database and the Bulletin Board databases. Issue the command WBCSL-Notebook to open the CSL-Notebook whiteboards. For more information, see [Indigo]<CSL-Notebook>Documentation>CSL-NotebookDoc.tioga.
6. Limitations (At least #3 and #4 will be fixed)
(1) Wallaby provides no mechanism for filtering out semantic cruft (like "Re:" or "the"...librarians call these stopwords) from the beginning of a subject field. For now, clients interested in a particular subject X should first pose a query on X and then, assuming there is mail satisfying the query, pose a query on RE: X.
(2) Walnut does not canonicalize the sender/recipient address format. So, for example, an address can be of the form "person@site.arpa" or "person%site.csnet@relay.cs.net" or "person%site.bitnet@wiscvm.wisc.edu" or (most unfortunately, since the person is not at the beginning) "...!site1!site2!person@Host3.arpa".
(3) There is no cross-validation between Walnut and Wallaby of what they show on the screen. One consequence of this omission is that Wallaby clumps may still be selected even when they correspond to already deleted messages.
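To make limitations (1) and (2) concrete, the Python sketch below shows the kind of normalization that neither Wallaby nor Walnut performs. It is only an illustration of the missing step, not a planned fix: both helper names are hypothetical, only the "Re:" prefix is handled, and only the address forms quoted in (2) are considered.

```python
import re

# One or more leading "Re:" markers, in any letter case.
_REPLY_PREFIX = re.compile(r"^\s*(?:re:\s*)+", re.IGNORECASE)


def strip_reply_prefix(subject: str) -> str:
    """Drop leading "Re:" markers so replies compare equal to the original."""
    return _REPLY_PREFIX.sub("", subject)


def mailbox_name(address: str) -> str:
    """Extract the person from the address forms listed in limitation (2)."""
    local = address.split("@", 1)[0]   # drop the host after "@"
    local = local.split("%", 1)[0]     # drop a "%site" relay suffix
    return local.rsplit("!", 1)[-1]    # keep the last hop of a "!" path
```

With such helpers a single query on subject X would also cover "RE: X", and the four address forms above would all reduce to "person".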