The IDL project--CSL expectations
Inter-Office Memorandum
To: R. Taylor, B. Wegbreit, W. Teitelman
Date: January 19, 1977
From: Ron Kaplan
Location: Palo Alto
Subject: The IDL project: CSL objectives and expectations
Organization: PARC-CSL
XEROX
Filed on: <IDL>CSL-IDL.bravo
This memo concerns the proposed project to implement the IDL data analysis system in INTERLISP on MAXC. It outlines CSL’s objectives in supporting this project and describes CSL’s expectations as to the costs and benefits of the project, the milestones to be reached in the implementation effort, and the capabilities of the resulting system. This memo is an augmented summary of our January 6, 1977 meeting.
A. CSL objectives
The current facilities at PARC for the organization and analysis of quantitative data are either woefully inadequate or non-existent. This has added tremendously to the cost of conducting experimental studies of computational and psychological issues: in some cases (e.g. my own psycholinguistic work) the perceived data analysis costs have been prohibitive; in other instances (e.g. Card and Moran’s user studies), inordinate amounts of time and energy have gone into hodge-podge schemes for statistical processing. Empirical investigations of user-behavior will become increasingly important to future CSL research (particularly as it relates to OIS), so the primary objective of CSL support for IDL is to obtain facilities at PARC that will encourage experimental user studies by making data analyses simple and efficient to perform.
There are several secondary objectives for the proposed project:
(1) A sophisticated computing environment for the day-to-day research activities of future PARC social scientists should give them a direct, intuitive understanding of important design and implementation issues upon which psychological and sociological studies might have a bearing. We would thus enlarge the pool of social scientists able to conduct OIS-relevant investigations, and presumably magnify the (currently very small) impact that behavioral research has on other CSL projects.
(2) IDL embodies the strong claim that applications software should be organized around a notion of functional decomposition in the user’s conceptual domain--that the user interface should be a set of basis operators together with natural rules of combination. This contrasts with the menu approach found in most current PARC applications software (Bravo, Markup, Draw, etc.). The implementation and use of IDL should provide information for comparing and evaluating these differing user-interface philosophies.
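The "basis operators plus rules of combination" idea in objective (2) can be illustrated with a minimal sketch. This is not IDL syntax or the IDL implementation: the operator names (`column`, `where`, `compose`) and the sample data are hypothetical, chosen only to show a few small operators combined into a larger analysis rather than selected from a menu.

```python
from functools import reduce
from statistics import mean


def compose(*operators):
    """Rule of combination: compose(f, g)(x) == f(g(x))."""
    return lambda x: reduce(lambda acc, op: op(acc), reversed(operators), x)


def column(name):
    """Basis operator (hypothetical): pull one named column out of a list of rows."""
    return lambda rows: [row[name] for row in rows]


def where(predicate):
    """Basis operator (hypothetical): keep only the rows satisfying the predicate."""
    return lambda rows: [row for row in rows if predicate(row)]


# Made-up observations, for illustration only.
trials = [
    {"subject": "s1", "condition": "A", "seconds": 12.0},
    {"subject": "s2", "condition": "A", "seconds": 18.0},
    {"subject": "s3", "condition": "B", "seconds": 30.0},
]

# A new analysis is built by combining existing operators.
mean_seconds_in_a = compose(
    mean,
    column("seconds"),
    where(lambda row: row["condition"] == "A"),
)

result = mean_seconds_in_a(trials)
print(result)
```

Under this style, adding a capability means adding one operator that combines with all the others, whereas a menu-driven design would need a new menu entry for each combination.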
B. Performance specifications
The capabilities and performance characteristics of the INTERLISP IDL system depend on the resources we invest in the project. The minimal investment will provide a system with the same capabilities as the current PPL system--it will be a transcription from PPL into INTERLISP, with minor design changes and simplifications to take advantage of Lisp’s more powerful programming environment. For a standard mix of computational tasks (loading a particular data set and performing regression and variance analyses), the INTERLISP system should be between 10 and 100 times faster than the PPL implementation. Consequently, it should be reasonable for a researcher to perform these computations in interactive mode (i.e. 2-3 minutes real time) on a moderately loaded MAXC (load average below 4). The minimal system will thus provide a useful data analysis facility at PARC.
For a very slight investment increment, we will load W. Teitelman’s display facilities into the system. As we understand Teitelman’s package, it will enable IDL commands to be constructed by pointing at the screen, and will permit scrolling through large data-objects printed by the standard IDL print routines. The following important qualification is attached to the display capabilities: While these facilities will be demonstrable, there is no guarantee that they will be efficient enough (both in time and space) to be included in the ordinary users’ version of the system without considerable additional investment. However, given anticipated improvements in MAXC 1 INTERLISP, the imminent availability of MAXC 2, and the eventual appearance of D1’s, we are optimistic that the graphics capabilities will be widely exercised.
For a more sizable increase in resources, we envision more major enhancements of the system. The language might be revised, fundamental design and implementation decisions might be reworked, and certain new capabilities might be added through function extension (e.g. primitives for multivariate analyses). One important direction for future development is to maximize the utility of the display interface, perhaps modifying the underlying organization of the system to take full advantage of graphics capabilities.
Note, however, that we are not proposing such major enhancements as part of our initial build-a-data-analysis-system-for-PARC effort: We expect the system to be a valuable research tool even without these augmentations. Major extensions and revisions (including more elaborate graphics) will emerge if and when we decide to convert the IDL project from a straightforward implementation effort to an area of substantive research.
C. Milestones and cut-points
The implementation effort will proceed through a series of milestones; failure to meet some of these milestones will be taken as evidence against continued investment in the project.
Within 1 month: Complete major design and implementation decisions; organize and construct the INTERLISP environment for IDL programming; establish basic coding conventions; begin implementation of data manipulation primitives. Cut or postpone the project if our programmer does not live up to expectations.
Within 3 months: Complete the implementation of data manipulation primitives (array storage and access functions). Gather preliminary data on speed and space efficiency for typical computations--cut if we seriously over-estimated improvements over PPL.
Within 5 months: Complete transcription of the major data-processing routines (the compression and analysis routines defined in the IDL manual). Make the system available for experimental use by friendly users.
Within 6 months: Finish debugging the system to attain capabilities and performance requirements outlined above. The system should be available to more hostile users.
Within 7 months: Modify the PPL documentation to reflect any discrepancies in the INTERLISP implementation. Provide documentation to help future system maintenance. The system should be available to ordinary users.
At this point the minimal system configuration will be complete. Another month might be needed for smoothing the user-interface and adding certain utility programs to help users in defining and managing their data-bases. This final month should also see the addition of display facilities, at least for demonstration and probably for ordinary use. Display work will be deferred to this late date to avoid confounding the separable issues of implementing an already designed system and exploring new avenues for user interaction.
Other major enhancements to the system will be carried out over a much longer period, according to the needs of users and the research interests of Sheil and Kaplan, and perhaps Card and Moran. The use of displays in this domain will be a primary research focus for me, once the minimal system is built. This research is not properly a part of the IDL project proposed here, since it will not require temporary programming assistance, and since its relevance spans many prototype systems besides IDL (KRL, GSP, etc.).
D. Estimated Costs
For the minimal system:
Temporary programmer (Jan Dericksen): 8 months.
Sheil: 1/2 time for one month, 1/5 time for 7 months.
Kaplan: 1/2 time for one month, 1/5 time for 7 months.
Secretary (for documentation): 1 month.
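As a rough cross-check, the estimates above can be totalled in person-months. The sketch below assumes the temporary programmer and the secretary are full time for their stated periods, which the memo does not say explicitly; the dictionary layout and function name are illustrative only.

```python
from fractions import Fraction

# (fraction of time, months) pairs taken from the estimates above.
# Assumption: programmer and secretary are full time (fraction 1).
effort = {
    "Temporary programmer": [(Fraction(1), 8)],
    "Sheil": [(Fraction(1, 2), 1), (Fraction(1, 5), 7)],
    "Kaplan": [(Fraction(1, 2), 1), (Fraction(1, 5), 7)],
    "Secretary": [(Fraction(1), 1)],
}


def person_months(allocations):
    """Sum fraction-of-time times duration over all allocations for one person."""
    return sum(fraction * months for fraction, months in allocations)


totals = {role: person_months(allocs) for role, allocs in effort.items()}
grand_total = sum(totals.values())

for role, total in totals.items():
    print(f"{role}: {float(total):.1f} person-months")
print(f"Total: {float(grand_total):.1f} person-months")
```

Under these assumptions Sheil and Kaplan each contribute 1.9 person-months, and the minimal system comes to 12.8 person-months in all.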
Future research into IDL and IDL-like user interfaces will be conducted after the minimal system is finished. There are no time estimates for the effort (by Kaplan, Sheil, Card, Moran...) that will go into this research. However, the temporary programmer will not be necessary.
c: B. Sheil, W. Sutherland