Alan W. Biermann
Department of Computer Science
Duke University
Box 90129
Durham, NC 27708-0129
Phone: (919) 660-6500
Fax: (919) 660-6519
Email: awb@cs.duke.edu
http://www.cs.duke.edu/cgi-bin/facinfo?awb
Speech and Natural Language Understanding.
Voice interactive systems, human-machine collaboration, multimedia systems, user modeling, dialogue theory, human factors
When humans collaborate with each other, they undertake a variety of behaviors that enable fast and efficient convergence to the goal. Each participant is continuously involved in mental problem solving, and when one or the other sees a solution, he or she will announce it. Frequently, however, there will be obstacles to the solution, and attention will focus on these. What are the critical paths to success, and what must be done to overcome the obstacles? The participants will open dialogue on these difficulties and bring resources to bear on resolving them.
This project seeks to embed in the machine the facilities that enable it to cooperate with a human in the same way. Specifically, knowledge for problem solving is coded in the machine with Prolog-style rules. The rules are used to attempt to prove that the goal is solved. If the proof is successful, then very little dialogue will follow. But if the proof is not successful, then the system will look for key "missing axioms" in the proof and initiate dialogue to attempt to find the needed information. The result is an aggressive interaction that addresses one problem and then another in sequence as the roadblocks to success are discovered. The dialogue jumps from one issue to the next, occasionally giving up on one, returning to a previous topic, opening a different one, and so forth until a set of steps is found that achieves success. Within this context, a series of issues arises and some of them are listed here:
Our project has implemented several speech-interactive dialogue systems to test ideas and to gain experience with them. Two examples are our Circuit-Fixit-Shoppe and Programming Tutor systems. The Circuit-Fixit-Shoppe was completed in 1991 and tested as described in the references listed below. This system demonstrated many of the characteristics described above and was successfully used by human subjects in 141 problem solving sessions to find bugs in and repair electric circuits. The Programming Tutor is currently operative, has a much cleaner and simpler design, and has full graphics and typed text communication as well as speech for a full multimedia capability.
Both systems have been tested extensively with human subjects, with high success rates. Success in problem solving was in the 80 percent-plus range, speaking rates were as high as several sentences per minute, sentence recognition rates were in the 80 percent range, and users' subjective responses were very positive.
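The missing-axiom mechanism described earlier can be illustrated with a minimal sketch. It is written in Python rather than Prolog, and it is not the code of either system: the rule set RULES, the functions prove and scripted_user, and the toy circuit domain are all hypothetical names chosen for illustration. The sketch assumes an acyclic rule set and treats any subgoal that is neither a known fact nor the head of a rule as a missing axiom, which triggers a question to the user.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy rules (not from Circuit-Fixit-Shoppe): each goal maps to
# a list of alternative bodies, and each body is a list of subgoals.
RULES: Dict[str, List[List[str]]] = {
    "circuit_works": [["power_on", "led_lit"]],
    "led_lit": [["wire_connected", "led_ok"]],
}


def prove(goal: str, facts: Set[str], rules: Dict[str, List[List[str]]],
          ask: Callable[[str], bool], denied: Set[str]) -> bool:
    """Try to prove `goal` by backward chaining over an acyclic rule set.

    A subgoal that is not a known fact and has no rule is treated as a
    missing axiom, and the user is asked about it through `ask`.
    """
    if goal in facts:
        return True
    if goal in denied:
        return False  # already asked and refused; do not ask again
    if goal in rules:
        for body in rules[goal]:
            # all() stops at the first subgoal that cannot be established.
            if all(prove(sub, facts, rules, ask, denied) for sub in body):
                facts.add(goal)
                return True
        return False
    # Missing axiom: open a dialogue with the user to obtain it.
    if ask(goal):
        facts.add(goal)
        return True
    denied.add(goal)
    return False


def scripted_user(answers: Dict[str, bool]) -> Tuple[List[str], Callable[[str], bool]]:
    """Return a question log and an `ask` function that replays canned answers."""
    asked: List[str] = []

    def ask(question: str) -> bool:
        asked.append(question)
        return answers.get(question, False)

    return asked, ask


if __name__ == "__main__":
    asked, ask = scripted_user(
        {"power_on": True, "wire_connected": True, "led_ok": True})
    solved = prove("circuit_works", set(), RULES, ask, set())
    print("solved:", solved)
    print("questions asked:", asked)
```

In this sketch the dialogue is reduced to yes/no questions, and the order of questions follows the order of subgoals in the rules. A real system would also need to decide which missing axiom to pursue first and when to abandon a topic, which the sketch does not attempt.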
Ronnie W. Smith and D. Richard Hipp, Spoken Natural Language Dialog Systems, Oxford University Press, New York, 1994.
Ronnie W. Smith, D. Richard Hipp, and Alan W. Biermann, "An Architecture for Voice Dialogue Systems Based on Prolog-Style Theorem Proving," Computational Linguistics, Vol. 21, No. 3, September, 1995.
Curry I. Guinn, "Mechanisms for Mixed Initiative Human-Computer Collaborative Discourse," 34th Annual Meeting of the ACL, Santa Cruz, June 24-27, 1996.
Curry I. Guinn, "Dialogue Mechanisms for Conflict Resolution in Natural Language Discourse," 1996 Symposium on Human Interaction with Complex Systems, Dayton, Ohio, August 26-28, 1996.
Alan W. Biermann and Philip M. Long, "The Composition of Messages in Speech-Graphics Interactive Systems," International Symposium on Spoken Dialogue, Philadelphia, Penn., October 2-3, 1996.
Alan W. Biermann, Curry I. Guinn, Michael S. Fulkerson, Gregory Keim, Zheng Liang, Douglas M. Melamed, Krishnan Rajagopalan, "Goal-Oriented Multimedia Dialogue with Variable Initiative," to be presented at the International Symposium on Methodologies for Intelligent Systems-1997, Charlotte, North Carolina, October 15-18, 1997.
This general area is extremely broad and includes fields related to every stage of processing: speech recognition, parsing theory, semantics theory, representation of knowledge, collaborative theory, dialogue theory, natural language generation, speech generation, multimedia communication, user modeling, and much more.
Computational Linguistics, the journal.
James Allen, Natural Language Understanding, Second Edition, Benjamin/Cummings Publishing Company, Inc., 1994.
Virtual Environments
Other Communicative Modalities
Adaptive Human Interfaces
Usability and User-Centered Design
Intelligent Interactive Systems for Persons with Disabilities
There is a long list of related projects, including the following: the human factors of voice interactive problem-solving systems, learning for optimization of dialogue performance, strategies for tutoring and their automation, and studies of multimedia systems.