
SYNTHESIZING CONVERSATION BETWEEN HUMAN-LIKE COOPERATIVE AGENTS

Norman I. Badler
Mark Steedman

Center for Human Modeling and Simulation
Computer and Information Science Department
University of Pennsylvania

CONTACT INFORMATION

Computer and Information Science Department
200 South 33rd Street
University of Pennsylvania
Philadelphia, PA 19104-6389
Phone: (215) 898-5862 (Badler) / (215) 898-2012 (Steedman)
Fax: (215) 573-7453 (Badler) / (215) 898-0587 (Steedman)
Email: badler@central.cis.upenn.edu / steedman@linc.cis.upenn.edu

WWW PAGE

Badler
Steedman

PROGRAM AREA

Other Communication Modalities (See the explanation page)

KEYWORDS

Communicating agents, virtual humans, gesture, situated behavior, visual attention, facial animation, dialog planning, intonational synthesis.

PROJECT SUMMARY

The last few years have seen great maturation in the computation speed and control methods needed to portray 3D virtual humans suitable for real interactive applications. This project focuses on various aspects of real-time virtual humans that communicate with users or each other (Badler 1997). These aspects include interactive control, autonomous action, gesture, situated behavior (Trias 1996), visual attention, facial animation, dialog planning, and intonation generation. The underlying architecture consists of a sense-control-act structure that permits reactive behaviors to be locally adaptive to the environment and objects in it (Levison 1996), and a Parallel Transition Network (parallel finite-state machine controller) that can be used to drive virtual humans through complex communicative and manual tasks. This project studies the deep connections between language and animation. Among the testbeds being developed are two systems. Jack Presenter (Noma 1997) is constructed on top of the UPenn Jack software by Visiting Professor Tsukasa Noma; Jack Presenter narrates and gestures appropriately to both audience and a 2D ``whiteboard'' image. JackMOO (Smith 1997) is an extension to lambdaMOO which uses a language-based interface to command real-time virtual human avatars to interact with each other. These efforts are leading to an ongoing definition of a Parameterized Action Representation for mediating between language instructions and animated, situated actions.
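As a rough illustration of the Parallel Transition Network idea described above, the following Python sketch steps several finite-state machines in parallel against one sensed snapshot of the environment. It is not the Jack implementation: the class names (Machine, ParallelTransitionNetwork), the states, the guard conditions, and the speech/gesture example are all hypothetical, chosen only to show how a sense-control-act cycle might drive parallel state machines.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; this is an illustrative sketch,
# not the Jack software's actual Parallel Transition Network.

Env = Dict[str, object]
# A transition: (guard over the sensed environment, next state, action name)
Transition = Tuple[Callable[[Env], bool], str, str]


@dataclass
class Machine:
    """One finite-state machine; several of these run side by side."""
    name: str
    state: str
    table: Dict[str, List[Transition]] = field(default_factory=dict)

    def step(self, sensed: Env) -> List[str]:
        """Take the first enabled transition, if any, and return its action."""
        for guard, next_state, action in self.table.get(self.state, []):
            if guard(sensed):
                self.state = next_state
                return [f"{self.name}:{action}"]
        return []  # no guard fired; stay in the current state


class ParallelTransitionNetwork:
    """Steps every machine once per tick against the same sensed snapshot."""

    def __init__(self, machines: List[Machine]) -> None:
        self.machines = machines

    def tick(self, env: Env) -> List[str]:
        sensed = dict(env)           # sense: snapshot the environment
        actions: List[str] = []
        for m in self.machines:      # control: each machine picks a transition
            actions.extend(m.step(sensed))
        return actions               # act: the caller applies these actions


def build_presenter() -> ParallelTransitionNetwork:
    """Assumed example: a presenter that talks and points independently."""
    speech = Machine("speech", "idle", {
        "idle": [(lambda e: bool(e.get("has_text")), "talking", "start_utterance")],
        "talking": [(lambda e: not e.get("has_text"), "idle", "stop_utterance")],
    })
    gesture = Machine("gesture", "rest", {
        "rest": [(lambda e: bool(e.get("point_target")), "pointing", "raise_arm")],
        "pointing": [(lambda e: not e.get("point_target"), "rest", "lower_arm")],
    })
    return ParallelTransitionNetwork([speech, gesture])


if __name__ == "__main__":
    net = build_presenter()
    print(net.tick({"has_text": True, "point_target": "whiteboard"}))
    print(net.tick({"has_text": True}))
    print(net.tick({}))
```

In this sketch the speech and gesture machines advance independently on each tick, so the agent can keep talking while its arm lowers. A real controller would presumably also need synchronization between machines, which is omitted here.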

On the conversational side of this project, CCG grammar theory has been further developed as infrastructure for the Information Based Intonational Synthesis (IBIS) component of the synthesis, and the semantics for the intonational module of CCG has been further developed as a basis for the same component (Steedman 1996a, Steedman 1996b, Steedman 1997).

The DIALUP Modal Logic Programming Language has been developed for the conversational planning component underlying the animation (Stone 1997a, 1997b, 1997c). The related temporal event representation that will underpin the action plan, Dynamic Event Calculus, has been developed in Steedman 1997b.

PROJECT REFERENCES

Badler, N. ``Virtual Humans for Animation, Ergonomics, and Simulation,'' IEEE Workshop on Non-Rigid and Articulated Motion. Puerto Rico, June 1997.

Levison, L. Connecting Planning and Acting Via Object-Specific Reasoning. PhD thesis, CIS, University of Pennsylvania, 1996.

Noma, T. and Badler, N. ``A Virtual Human Presenter,'' Animated Interface Agents Workshop, IJCAI-97, Japan.

Smith, T.J., Shi, J., Granieri, J., and Badler, N. ``JackMOO: A Prototype System for Natural Language Avatar Control,'' to appear, Pacific Graphics '97.

Steedman, M. Surface Structure and Interpretation, Linguistic Inquiry Monograph 30, MIT Press, 1996a.

Steedman, M. ``Representing Discourse Information for Spoken Dialogue Generation,'' Proceedings of International Symposium on Spoken Dialogue, International Conference on Spoken Language Processing (held in conjunction with ICSLP-96), Philadelphia Sept. 1996, 89-92, 1996c.

Steedman, M. ``Information Structure and the Syntax-Phonology Interface,'' submitted 1997a.

Steedman, M. ``What We Talk About When We Talk About Time,'' Tutorial Notes for International Conference on Temporal Logic, Manchester, July, 1997b.

Stone, M. ``Reasoning in Natural Language Generation through Fast Modal Logic Programming,'' in Proceedings of AAAI Fall Symposium on Communicative Action, Cambridge, MA. 1997a.

Stone, M. ``Representing Scope in Intuitionistic Deductions,'' Theoretical Computer Science, 1997b, to appear.

Stone, M. ``Efficient Constraints on Possible Worlds for Reasoning about Necessity,'' Tech. Report IRCS 97-7 and CIS 97-10, University of Pennsylvania. 1997c.

Trias, T., Chopra, S., Reich, B., Moore, M., Badler, N., Webber, B., and Geib, C. ``Decision Networks for Integrating the Behaviors of Virtual Agents and Avatars,'' IEEE VRAIS, 1996.

AREA BACKGROUND

Only fifty years ago, computers were barely able to compute useful mathematical functions. Twenty-five years ago, enthusiastic computer researchers were predicting that all sorts of human tasks, from game-playing to automatic robots that travel and communicate with us, would be in our future. Today's truth lies somewhere in between. We have balanced our expectations of complete machine autonomy with a more rational view that machines should assist people to accomplish meaningful, difficult, and often enormously complex tasks. When those tasks involve human interaction with the physical world, computational representations of the human body can be used to escape the constraints of presence, safety, and even physicality.

Virtual humans are computer models of people that can be used

Recent improvements in computation speed and control methods have allowed the portrayal of 3D humans suitable for interactive and real-time applications. There are many reasons to design specialized human models that individually optimize character, performance, intelligence, and so on. Many research and development efforts concentrate on one or two of these criteria.

In a very general way, we can characterize the state of virtual human modeling along at least five dimensions:

  1. Appearance: Cartoon shape -----+--> Physiologically accurate model
  2. Function: Cartoon actions -----+-> Human limitations
  3. Time: Off-line generation -----+-> Real-time production
  4. Autonomy: Direct animation -----+--> Intelligent
  5. Individuality: Specific person ---+----> Varying personalities

The arrows and hash marks are qualitative indicators of where we think usable technology exists today. Although the arrows can extend an undetermined distance to the right, the point is that we (and others) have proceeded rather far beyond the individual rendering of still frames as realized by traditional hand animation or even computer-assisted cartoon animation. Where needed, increasingly accurate physiologically and biomechanically grounded human models can be obtained. We can create virtual humans with functional limitations that go beyond cartoons into instantiations of known human factors data. Animated virtual humans can be created in human time scales through motion capture or computer synthesis. Virtual humans are also beginning to exhibit the early stages of autonomy and intelligence as they communicate, react, and make decisions in novel, changing environments rather than being forced into fixed movements. Investigations are underway to create characters with individuality and personality who react to and interact with other real or virtual people.

AREA REFERENCES

Badler, N. ``Virtual Humans for Animation, Ergonomics, and Simulation,'' SIGGRAPH '97 Course Notes, August 1997.

Cassell, J., Pelachaud, C., Badler, N., Steedman, M., Achorn, B., Becket, W., Douville, B., Prevost, S., and Stone, M. ``Animated Conversation: Rule-Based Generation of Facial Expression, Gesture and Spoken Intonation for Multiple Conversational Agents,'' Computer Graphics, pp. 413-420, July 1994.

Pelachaud, C., Badler, N., and Steedman, M. ``Generating Facial Expressions for Speech,'' Cognitive Science 20(1), pp. 1-46, 1996.

RELATED PROGRAM AREAS

1. Virtual Environments
2. Speech and Natural Language Understanding
4. Adaptive Human Interfaces
6. Intelligent Interactive Systems for Persons with Disabilities

POTENTIAL RELATED PROJECTS

Virtual humans may be applicable to any number of projects under this IRI program. Indeed, the underlying Jack software is now commercially available from Transom Technologies, Inc.