DEVELOPING AN EXPERIMENT MANAGEMENT SYSTEM
Miron Livny (*),
Yannis Ioannidis (*),
Gary G. Borisy (**),
John M. Norman (***)
(*) Department of Computer Sciences
(**) Department of Zoology
(***) Department of Soil Science
University of Wisconsin - Madison
CONTACT INFORMATION
Miron Livny
Computer Sciences Department
1210 W. Dayton St.
University of Wisconsin
Madison, WI 53706
Phone: (608) 262-0856
Fax: (608) 262-9777
Email: miron@cs.wisc.edu
WWW PAGE
http://www.cs.wisc.edu/ZOO
PROGRAM AREA
Virtual Environments
KEYWORDS
Experiment management, complex object visualization, visual queries, visual
abstractions, visual translation specifications
PROJECT SUMMARY
This project has focused on developing the ZOO Desktop Experiment
Management Environment (DEME)
that will provide scientists with support for the management of long-term
experimental studies.
Although our goal is for the system to be as generic as possible,
special emphasis has been given to plant-growth simulation experiments in Soil
Sciences and spectroscopy experiments in Biochemistry.
The focus of our research has been on the following areas:
prototyping pieces of the system, user interfaces, data translation between
objects in object-oriented databases and flat files, and communication with
experimentation environments.
Below, we briefly describe the technical results in each area and
provide references to the relevant papers that we have co-authored.
Development
A prototype system has been partially developed on top of the Informix
relational database system.
It is based on the Moose object-oriented data model and the Fox
declarative query language that we have designed.
The architecture of ZOO has been largely finalized and
significant progress has been made in its implementation [1].
All modules of the system are operational, although each currently offers
only part of the functionality ultimately desired.
Details on several of these modules
are given in the corresponding subsections below.
The more advanced modules
are being used by three domain scientists on the University of Wisconsin campus
to design their experimental studies and the corresponding databases.
Specifically, a large Soil Sciences experiment and a large NMR spectroscopy experiment
are being captured as Moose schemas, with the ultimate goal of connecting the
entire experimentation environments with ZOO.
All users of the ZOO-related tools already developed have been
enthusiastic about the prospects of having the complete ZOO system operational
in their labs [2,3,4,5].
User Interfaces
One of the main research emphases of this project has been on
user interfaces, and especially schema management.
One of the most advanced and critical modules of ZOO is Opossum, a flexible,
customizable, and extensible schema management system that approaches
user-schema interaction in an innovative way [6].
It is based on a visualization formalism that we introduced earlier, which
separates the visual domain from the data domain and establishes a mapping
(metaphor) between the two [7].
Opossum employs several novel techniques to offer the
following capabilities: enhancement of schema visualizations with
user-specific annotations, organizational information, and aesthetic
preferences; exploration of schemas through choice of visual representations;
and creation of new visual representation styles when existing ones prove
unsatisfactory.
Opossum is operational and is probably the module of ZOO most heavily used by
our domain scientist collaborators.
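The separation of data domain and visual domain described above can be illustrated with a minimal sketch. It is not Opossum's actual formalism or code: the names SchemaClass, VisualElement, compact_metaphor, detailed_metaphor, and visualize are hypothetical, and a metaphor is assumed here to be nothing more than a function from schema classes to visual elements. The point is only that the same schema can be shown in different visual representations by swapping the mapping.

```python
from collections import namedtuple

# Data domain: a schema class with a name and attribute names (hypothetical).
SchemaClass = namedtuple("SchemaClass", ["name", "attributes"])
# Visual domain: a drawable element with a shape and a label (hypothetical).
VisualElement = namedtuple("VisualElement", ["shape", "label"])


def compact_metaphor(schema_class):
    """Map a class to a plain box that shows only its name."""
    return VisualElement(shape="box", label=schema_class.name)


def detailed_metaphor(schema_class):
    """Map a class to a rounded box that also lists its attributes."""
    label = schema_class.name + "(" + ", ".join(schema_class.attributes) + ")"
    return VisualElement(shape="rounded_box", label=label)


def visualize(schema, metaphor):
    """Apply one metaphor to every class of the schema."""
    return [metaphor(schema_class) for schema_class in schema]


# Illustrative schema; the class and attribute names are made up.
schema = [
    SchemaClass("Experiment", ("date", "operator")),
    SchemaClass("Spectrum", ("frequency",)),
]
compact_view = visualize(schema, compact_metaphor)
detailed_view = visualize(schema, detailed_metaphor)
```

Because the schema itself never refers to shapes or labels, a new representation style can be added by writing another metaphor function, without touching the data.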
One of the most important characteristics of Opossum is its support for the
layout of large visual schemas.
Motivated by the needs of our collaborators, we have developed techniques that
strike a balance between user
specification and automatic generation of layouts, work at multiple
granularities, and are generally applicable.
In particular, we have introduced a general framework and layout algorithm that
(a) deals with arbitrary types of visual objects, (b) allows objects to be viewed
in any one of several different visual representations (at different levels of
detail), and (c) uses a small number of user-specified layouts to guide heuristic
decisions for automatically deriving many other layouts in a manner that attempts
to be consistent with the user's preferences [8].
The algorithm has been implemented within Opossum
and has been rather effective in capturing the intuition of the scientists using
the system.
Translation Between Object-Oriented Data and Flat Files
Another issue in ZOO that we have emphasized is the
interaction of the system with external experimentation
environments, in particular, simulators [9].
These simulators read flat files as input and write flat files as output:
the input files are generated from the corresponding Moose objects inside ZOO,
and the output files are used to generate such objects.
This problem of translating database objects into a flat format to be written
out in a flat ASCII file or, conversely, translating the contents of a file into
a complex database object arises in several applications.
In the context of ZOO, we have introduced Frog, a visual tool that can be
used to specify translations
between database objects and flat files, requiring no programming by the user.
The tool can deal with objects of arbitrary complexity, without the object
complexity being directly reflected in the complexity of the corresponding
visual interaction.
Based on the visual actions of the user, the tool stores enough information in
a map-file, whose contents are used at run-time by another tool,
Turtle,
to translate any chosen database object into the appropriate file layout.
These tools have been partially developed within ZOO and have been tried
successfully, in a limited way, by our collaborators.
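The idea of driving both directions of a translation from one stored mapping can be sketched as follows. This is an illustration under stated assumptions, not the Frog map-file format or the Turtle implementation, neither of which is described here. FIELD_MAP, object_to_flat, and flat_to_object are hypothetical names, the flat file is assumed to hold whitespace-separated fields with no embedded spaces, and the sample object is made up.

```python
import copy

# Hypothetical map: each entry binds a path inside a nested object to a
# (line, field) position in the flat file, plus the type used when reading.
FIELD_MAP = [
    (("site", "name"), 0, 0, str),
    (("site", "latitude"), 0, 1, float),
    (("soil", "layers", 0, "depth_cm"), 1, 0, int),
    (("soil", "layers", 1, "depth_cm"), 1, 1, int),
]


def get_path(obj, path):
    """Follow a sequence of keys and indexes into a nested object."""
    for key in path:
        obj = obj[key]
    return obj


def set_path(obj, path, value):
    """Assign a value at the end of a sequence of keys and indexes."""
    for key in path[:-1]:
        obj = obj[key]
    obj[path[-1]] = value


def object_to_flat(obj, field_map):
    """Write the mapped values of a nested object as flat text lines."""
    rows = {}
    for path, line_no, field_no, _ in field_map:
        rows.setdefault(line_no, {})[field_no] = str(get_path(obj, path))
    lines = []
    for line_no in sorted(rows):
        fields = rows[line_no]
        lines.append(" ".join(fields[i] for i in sorted(fields)))
    return "\n".join(lines)


def flat_to_object(text, field_map, template):
    """Fill a copy of the template object from flat text lines."""
    obj = copy.deepcopy(template)
    lines = [line.split() for line in text.splitlines()]
    for path, line_no, field_no, convert in field_map:
        set_path(obj, path, convert(lines[line_no][field_no]))
    return obj


# Made-up sample object standing in for a complex database object.
experiment = {
    "site": {"name": "Arlington", "latitude": 43.3},
    "soil": {"layers": [{"depth_cm": 10}, {"depth_cm": 30}]},
}
flat_text = object_to_flat(experiment, FIELD_MAP)
restored = flat_to_object(flat_text, FIELD_MAP, experiment)
```

Keeping the mapping as data rather than code mirrors the division of labor described above: one tool records the mapping, and another interprets it at run time for any chosen object.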
PROJECT REFERENCES
[1]
Y. Ioannidis, M. Livny, S. Gupta, and N. Ponnekanti,
"ZOO: A Desktop Experiment Management Environment",
Proc. 22nd International VLDB Conference,
Bombay, India, September 1996.
[2]
J. L. Markley, J. B. Olson, Jr., R. Chylla, W. M. Westler, E. L. Ulrich,
Y. Ioannidis, and M. Livny, "Approaches to Automating the Assignment of
NMR Spectra of Proteins" (abstract), 37th Experimental NMR Conference,
Asilomar, CA, March 1996.
[3]
J. L. Markley, E. L. Ulrich, R. A. Chylla, M. Livny, Y. E. Ioannidis, J.
B. Olson, D. Argentar, and W. M. Westler, "NMR Laboratory Process Control
and Data Management" (abstract), 34th Annual Eastern Analytical
Symposium, Somerset, NJ, November 1995.
[4]
E. L. Ulrich, D. Argentar, M. Livny, Y. E. Ioannidis, J. L. Markley,
"The NMR Dictionary" (abstract), mmCIF Workshop, American Crystallographic
Association Meeting, Montreal, Canada, July 1995.
[5]
E. L. Ulrich, M. Livny, Y. Ioannidis, C. Mortezai-Zanjani, A. Klimowicz,
and J. L. Markley, "BioMagResBank" (abstract), Keystone Symposia, Frontiers
of NMR in Molecular Biology - IV, Keystone, CO, April 1995.
[6]
E. Haber, Y. Ioannidis, and M. Livny,
"OPOSSUM: Desk-Top Schema Management through Customizable Visualization",
Proc. 21st International VLDB
Conference, Zurich, Switzerland, September 1995, pp. 527-538.
[7]
E. Haber, Y. Ioannidis, and M. Livny, "Formalizing Visual Metaphors in Any
Dimension" (position paper), Proc. 2nd FADIVA Workshop, Glasgow,
Scotland, July 1995.
[8]
Y. Ioannidis, M. Livny, J. Bao, and E. Haber,
"User-Oriented Visual Layout at Multiple Granularities",
Proc. 3rd International Workshop on Advanced
Visual Interfaces, Gubbio, Italy, May 1996, pp. 184-193.
[9]
V. Anjur, Y. Ioannidis, and M. Livny,
"Frog and Turtle: Visual Bridges Between Files and Object-Oriented Data",
Proc. 8th International Conference on Scientific and Statistical Database
Management, Stockholm, Sweden, June 1996.
AREA BACKGROUND
In the past few years, several scientific communities have initiated
very ambitious and broad-ranged projects in their disciplines.
The NASA Eos effort and the NIH Human Genome project
are two examples of such national and international scientific endeavors.
A major part of these projects is the collection
of huge amounts of data (sometimes measured in petabytes) on complex
phenomena.
Managing this surge of scientific data poses
significant challenges, many of which cannot be effectively
addressed by existing database technology.
This has resulted in much research activity in the area of
Scientific Database Systems.
Nevertheless, little attention has yet been devoted to the needs of small teams
of scientists who perform individual experimental studies in their laboratories.
In particular, a major problem that many experimental scientists are facing is
that there are no adequate experiment management tools that are
powerful enough to capture the complexity of the experiments and
at the same time are natural and intuitive to the non-expert.
A small laboratory that can easily generate and store several megabytes of
data per day is still dependent on the good old paper notebook when it comes
to keeping track of the data.
The experience of several researchers working in the area as well as our own
experience from installed software (pieces of ZOO) as tested and evaluated
in real-life settings has shown that one of the biggest challenges of experiment
management environments like ZOO lies with their user interface.
In this respect, such systems are no different from general database systems,
whose user interfaces have time and again been identified as among the most
important areas in need of research attention.
The main problems faced by such systems are related to schema visualization,
experiment flow modeling, visual query languages, visual queries,
incremental queries, and others.
AREA REFERENCES
A. Bonner, A. Shrufi, and S. Rozen,
"Database requirements for workflow management in a high-throughput
genome laboratory",
in Proc. NSF Workshop on Workflow and Process Automation in
Information Systems, Athens, GA, May 1996.
J. G. Carbonell and R. D. Brown,
"Anaphora resolution: A multi-strategy approach",
in Proc. of the International Conference of Computational
Linguistics, Budapest, Hungary, 1988.
I-M. Chen and V. Markowitz,
"An overview of the Object Protocol Model (OPM) and the OPM data
management tools",
Information Systems, 20(5):393--418, July 1995.
I. Cruz,
"DOODLE: A visual language for object-oriented databases",
in Proc. 1992 ACM-SIGMOD Conference on the Management of Data,
pages 71--80, San Diego, CA, June 1992.
I. Cruz,
"User-defined visual languages for querying data",
Technical Report CS-93-58, Brown University, December 1993.
I. Cruz,
"Solving the expressiveness clash in declarative graph drawing:
Results and open problems",
in Proc. Int. Workshop on Constraints for Graphics and
Visualization, Cassis, France, September 1995.
J. Cushing et al.,
"Object-oriented database support for computational chemistry",
in H. Hinterberger and J.C. French, editors, Proc. 6th
International Working Conference on Statistical and Scientific Database
Management, Zurich, Switzerland, September 1992.
J. C. French, A. K. Jones, and J. L. Pfaltz,
"Summary of the final report of the NSF workshop on scientific
database management",
ACM-SIGMOD Record, 19(4):32--40, December 1990.
E. Haber,
"Visual Schema Management for Database Systems",
PhD thesis, University of Wisconsin--Madison, August 1995.
M. Hsu,
"Special issue on workflow systems",
IEEE Data Engineering, 18(1), March 1995.
Y. Ioannidis,
"Visual user interfaces for database systems",
in Proc. ACM Workshop on Strategic Directions in Computing
Research, Cambridge, MA, June 1996.
Available through http://www.cs.brown.edu/people/ifc/hci.html.
J. D. Mackinlay,
"Automatic design of graphical presentations",
Technical Report STAN-CS-86-1138, Department of Computer Science,
Stanford University, 1986.
V. M. Markowitz and W. Fang,
"SDT - a database schema design and translation tool",
Technical Report LBL-27843, Lawrence Berkeley Laboratory, Berkeley,
CA, May 1991.
J. Paredaens et al.,
"An overview of GOOD",
ACM-SIGMOD Record, 20(1):25--31, March 1992.
B. Shneiderman,
"Drawing queries for visual information seeking",
IEEE Software, 11(6):70--77, November 1994.
A. Silberschatz, M. Stonebraker, and J. Ullman,
"Database research: Achievements and opportunities into the 21st
century",
ACM-SIGMOD Record, 25(1):52--63, March 1996.
E. Szeto and V. M. Markowitz,
"ERDRAW - a graphical schema specification tool",
Technical Report LBL-PUB-3084, Lawrence Berkeley Laboratory,
Berkeley, CA, May 1991.
J. Wainer, M. Weske, G. Vossen, and C. Medeiros,
"Scientific workflow systems",
in Proc. NSF Workshop on Workflow and Process Automation in
Information Systems, Athens, GA, May 1996.
RELATED PROGRAM AREAS
Adaptive Human Interfaces; Speech and Natural Language Understanding.
POTENTIAL RELATED PROJECTS
None currently.