Robust, Incremental Parsing and Disambiguation for a Dialog Agent
Lenhart K. Schubert
Department of Computer Science
University of Rochester
Rochester, NY 14627-0226
CONTACT INFORMATION
Email: schubert@cs.rochester.edu
Phone: (716) 275-8845
Fax: (716) 461-2018
WWW PAGE
http://www.cs.rochester.edu/u/schubert/
PROGRAM AREA
Speech and Natural Language Understanding.
KEYWORDS
dialog parsing, robust parsing, incremental parsing, integrated disambiguation
PROJECT SUMMARY
The goal of this project is to develop general techniques for robust
parsing and disambiguation of task-oriented spoken dialogs. The
primary test bed is the TRAINS transportation planning assistant at
U. Rochester. This system helps a user solve transportation
planning problems of varying complexity through spoken interaction
and a map display that supports pointing, menus, etc.
In the current system the user controls turn-taking with a button,
but our eventual goal is to have the system participate as a full
partner in a free-flowing dialog. A primary challenge in
such dialogs lies in the disruption of ordinary grammatical structure
by repairs, interjected acknowledgements and corrections, etc.
In human-machine dialogs, the problem is compounded by speech
recognition errors.
The techniques we are developing for parsing such dialogs include
the use of fragmentary parses (in addition to complete utterance
parses where possible), domain-specific rules to hypothesize speech
acts based on surface constituents, the use of cue words and prosody,
and a novel dialog chart parser with one track for each
speaker and with metarules for forming constituents that straddle
acknowledgements, repairs, etc., and even "jump" from track to track.
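As a rough illustration of the two-track idea only, the following Python
sketch keeps one token list per speaker, merges the lists by time, and
collects one speaker's words while stepping over the other speaker's
acknowledgements and the speaker's own editing terms. It is not the TRAINS
parser: the Token type, the SKIPPABLE word list, the timings and the sample
utterance are all assumptions made for this sketch, and real metarules
would operate on chart constituents rather than filter a flat word list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    word: str
    speaker: str   # track label, e.g. "user" or "system" (hypothetical)
    start: float   # start time in seconds (hypothetical timing)


# Hypothetical list of acknowledgements and editing terms.
SKIPPABLE = {"okay", "uh", "um", "right", "mm-hm"}


def merge_tracks(*tracks: List[Token]) -> List[Token]:
    """Interleave all speaker tracks into one time-ordered sequence."""
    return sorted((t for track in tracks for t in track), key=lambda t: t.start)


def span_across(timeline: List[Token], speaker: str) -> List[str]:
    """Collect one speaker's words, skipping interruptions on either track."""
    words = []
    for tok in timeline:
        if tok.word in SKIPPABLE:
            continue   # acknowledgement or editing term: hidden
        if tok.speaker != speaker:
            continue   # material on the other speaker's track
        words.append(tok.word)
    return words


# Made-up utterance: the system says "okay" while the user is speaking.
user = [
    Token("send", "user", 0.0), Token("the", "user", 0.4),
    Token("train", "user", 0.6), Token("uh", "user", 1.0),
    Token("to", "user", 1.8), Token("Avon", "user", 2.0),
]
system = [Token("okay", "system", 1.3)]

timeline = merge_tracks(user, system)
print(span_across(timeline, "user"))
```

Keeping the tracks separate until they are merged preserves who said what,
which a single flat word stream would lose.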
An essential part of our project is the development of disambiguation
methods for resolving both structural and word sense ambiguities.
These methods must be usable in an incremental fashion, both to keep
in check the growth in alternative analyses and interpretations, and
(eventually) to allow mid-sentence acknowledgements, corrections, etc.,
by the system while the user is speaking. Our current approach involves
a notion of head patterns associated with phrase structure rules.
The patterns are based on word-senses of head words of syntactic
constituents, or classes of these word-senses. The frequencies of
these patterns will be extracted from various text and dialog
corpora, and used in a probabilistic parser. This approach is
expected to combine the benefits of word n-gram based disambiguation,
probabilistic grammars, and semantic preference-based approaches.
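To make the notion of a head pattern concrete, here is a minimal Python
sketch under stated assumptions. It pairs each phrase structure rule in a
toy parse tree with the head words of the rule's children, and compares
two sentences by shared head patterns and by shared word trigrams. The
tree encoding, the category labels and the choice of heads are
hypothetical, and plain words stand in for the word senses and sense
classes that the actual approach would use.

```python
from collections import Counter

# Toy tree encoding (an assumption for this sketch):
#   leaf     = (tag, word)
#   internal = (label, head_child_index, [children])


def is_leaf(node):
    return len(node) == 2


def head_word(node):
    """Follow head children down to a word."""
    if is_leaf(node):
        return node[1]
    _, head_idx, children = node
    return head_word(children[head_idx])


def head_patterns(node):
    """Return one (rule, child_heads) pair per internal node."""
    if is_leaf(node):
        return []
    label, _, children = node
    rule = (label, tuple(child[0] for child in children))
    heads = tuple(head_word(child) for child in children)
    patterns = [(rule, heads)]
    for child in children:
        patterns.extend(head_patterns(child))
    return patterns


def leaves(node):
    """Return the words of a tree in left-to-right order."""
    if is_leaf(node):
        return [node[1]]
    return [w for child in node[2] for w in leaves(child)]


def trigrams(words):
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


# "The train has departed"
t1 = ("S", 1, [
    ("NP", 1, [("DET", "the"), ("N", "train")]),
    ("VP", 1, [("AUX", "has"), ("V", "departed")]),
])
# "The second train has not yet departed"
t2 = ("S", 1, [
    ("NP", 2, [("DET", "the"), ("ADJ", "second"), ("N", "train")]),
    ("VP", 3, [("AUX", "has"), ("NEG", "not"), ("ADV", "yet"), ("V", "departed")]),
])

# Pattern frequencies of the kind a probabilistic parser could draw on.
pattern_counts = Counter(head_patterns(t1) + head_patterns(t2))

shared_patterns = set(head_patterns(t1)) & set(head_patterns(t2))
shared_trigrams = trigrams(leaves(t1)) & trigrams(leaves(t2))
print("shared head patterns:", shared_patterns)
print("shared trigrams:", shared_trigrams)
```

In this toy setting the two sentences share the S pattern with heads
"train" and "departed" but have no word trigram in common, because the
extra modifiers change every three-word window.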
To further the goal of integrating syntactic analysis and discourse
analysis, we are also participating in a major effort to develop
a general dialog annotation scheme that will capture the pragmatically
most important properties and relationships of dialog segments, and
to produce annotated dialog corpora based on this scheme.
The following points summarize the progress to date on the various facets
of our research.
- A robust parser for speech input to the TRAINS system
has been under continual development for several years, while
remaining operational at all times. It handles dialogs with minimally
trained users in near real time, achieving robustness through use of
postprocessing to correct speech recognition errors, domain-based
rules for mapping syntactic fragments to speech act hypotheses,
and dialog strategies to keep the dialog flowing efficiently despite
recognition and interpretation errors. Overall system evaluation
has shown that speech input provides better performance than
keyboard input, despite the greater word-error rate for speech
(Allen et al. 1996).
- A preliminary implementation of the dialog-chart-based parser
was able to form constituents across acknowledgements, editing terms,
repairs, etc., by "hiding" these interruptions at the terminal
nodes of parse trees (Core & Schubert 1996). But difficulties
with overlapped speech and certain complex repairs have led to the
formulation of a new approach based on metarules to specify
how phrase structure may be restarted, interrupted or overlapped
(Core & Schubert 1997).
- We formulated a method of resolving prepositional phrase
attachment ambiguities based on using statistical information
about multiple word patterns and word class patterns in the vicinity
of the ambiguity. The method improved on previous approaches and
showed good transfer properties in training on one corpus and
testing on a quite different corpus (Core 1996). We expect that
the new technique based on head patterns (described above) will
be more accurate and more general. It avoids the inherent "myopia"
of word n-grams, e.g., the inability to treat "The train has
departed" and "The second train has not yet departed" as
instantiating some of the same word patterns.
- The Discourse Research Initiative (DRI) is a multi-site effort
to develop standardized coding schemas for corpora.
DRI includes members from MIT, CMU, AT&T Bell Labs, Lotus
Research Group, DFKI, and many other sites.
In the context of this initiative, we have
developed an utterance annotation scheme called DAMSL (Dialog Act
Markup in Several Layers) for use as a working standard for the
members of the DRI group (Allen & Core 1997).
- Several related projects have been partially motivated by the
work on robust dialog parsing and disambiguation. We have developed
a word parser that robustly analyzes the morphological and semantic
structure of any given word, even in the absence of relevant lexical
information. In addition we have worked on issues in discourse
structure and semantics, including the formal discourse structure
underlying extended "generic passages" (Carlson & Spejewski 1997),
the significance of thematic roles in discourse about events
(Carlson 1997), the representation of referential connections
in complex sentences and extended discourses (Schubert 1996),
and the modelling of the beliefs and thought processes of agents
described in a discourse (Kaplan & Schubert 1997).
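The prepositional phrase attachment item above combines word patterns
with word class patterns. One simple way to combine the two, sketched
here in Python, is to consult counts for the specific words first and
fall back to counts over word classes when the words are unseen. This is
an illustrative back-off scheme and not the method of (Core 1996): the
WORD_CLASS table, the training tuples, the "V"/"N" labels and the default
attachment are all assumptions invented for the example.

```python
from collections import Counter

# Hypothetical word classes; unknown words act as their own class.
WORD_CLASS = {
    "send": "MOVE", "move": "MOVE",
    "train": "VEHICLE", "engine": "VEHICLE", "boxcar": "VEHICLE",
}


def to_classes(verb, noun, prep):
    return (WORD_CLASS.get(verb, verb), WORD_CLASS.get(noun, noun), prep)


class Attacher:
    """Choose verb ("V") or noun ("N") attachment for a PP."""

    def __init__(self):
        self.word_counts = Counter()    # keys: (verb, noun, prep, label)
        self.class_counts = Counter()   # keys: (vclass, nclass, prep, label)

    def train(self, examples):
        for verb, noun, prep, label in examples:
            self.word_counts[(verb, noun, prep, label)] += 1
            self.class_counts[to_classes(verb, noun, prep) + (label,)] += 1

    def predict(self, verb, noun, prep, default="N"):
        # Try the specific word pattern first, then the class pattern.
        levels = (
            (self.word_counts, (verb, noun, prep)),
            (self.class_counts, to_classes(verb, noun, prep)),
        )
        for counts, key in levels:
            v = counts[key + ("V",)]
            n = counts[key + ("N",)]
            if v + n > 0:
                return "V" if v >= n else "N"
        return default   # nothing seen at either level (arbitrary default)


# Made-up training tuples: (verb, object noun, preposition, attachment).
attacher = Attacher()
attacher.train([
    ("send", "train", "to", "V"),
    ("send", "train", "to", "V"),
    ("move", "engine", "at", "N"),
])

print(attacher.predict("send", "train", "to"))    # decided by word counts
print(attacher.predict("move", "boxcar", "to"))   # decided by class counts
print(attacher.predict("load", "oranges", "in"))  # falls through to default
```

The class level is what lets evidence about "send the train to" carry
over to the unseen "move the boxcar to".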
PROJECT REFERENCES
J.F. Allen and M.G. Core, "Draft of DAMSL: Dialog Act Markup
in Several Layers", draft manual, Dept. of Computer Science, Univ. of
Rochester, Rochester, NY, March 1997.
J.F. Allen, B. Miller, E.K. Ringger, and T. Sikorski,
"A robust system for natural spoken dialogue", Proc. of the 34th Ann.
Meet. of the Assoc. for Computational Linguistics (ACL'96), pp. 62-70,
1996.
G.N. Carlson, "Thematic roles and the individuation of events",
to appear in S. Rothstein (ed.), Events and Grammar.
G.N. Carlson and B. Spejewski, "Generic passages", Natural
Language Semantics 5, 1997, pp. 1-65 (to appear).
M.G. Core, "Using parsed corpora for structural disambiguation in
the TRAINS domain", in Proc. of the 34th Ann. Meet. of the Assoc. for
Computational Linguistics (ACL'96), pp. 345-7, 1996. Expanded version
available as Tech. Rep. 608, Dept. of Computer Science, Univ. of Rochester,
Rochester NY 14627-0226.
M.G. Core and L.K. Schubert, "Handling speech repairs and other
disruptions through parser metarules", in Working Notes, AAAI Spring
Symposium on Computational Models for Mixed Initiative Interaction,
Stanford, CA, March 24-26, 1997.
M.G. Core and L.K. Schubert, "Dialog parsing in the TRAINS
system", Tech. Rep. 612, Dept. of Computer Science, Univ. of Rochester,
Rochester, NY 14627-0226, March 1996.
A.N. Kaplan and L.K. Schubert, "Simulative inference in a
computational model of belief", in Second International Workshop on
Computational Semantics, pp. 107-121, Tilburg, The Netherlands,
January 8-10, 1997.
AREA BACKGROUND
Our project falls into the area of "conversational problem-solving
assistants". This relatively new area still lacks standard surveys
or reference works, beyond basic background on NL processing,
such as (Allen 1994, Pereira & Grosz 1994). But traditional NLP
techniques cannot deal with intertwined discourse, with the
problem-solving and plan reasoning aspects, or with multiple
modalities of communication. So the area naturally remains eclectic,
drawing not only on the NLP literature but also on the literature
concerning the intentions and plans that drive communication (e.g.,
Cohen et al. 1990), knowledge representation, reasoning and
planning (e.g., Genesereth & Nilsson 1987, Allen et al. 1990),
and miscellaneous work on speech processing, psycholinguistics (e.g.,
Clifton et al. 1994), formal semantics, discourse processing,
multimodal communication, hybrid reasoning, and architectures for
intelligent systems.
AREA REFERENCES
J.F. Allen, L.K. Schubert, G. Ferguson, P. Heeman, C.H. Hwang, T. Kato,
M. Light, N. Martin, B. Miller, M. Poesio, and D. Traum, "The TRAINS
project: a case study in building a conversational planning agent",
J. of Expt. and Theor. Artif. Intell. 7, 1995, pp. 7-48.
J.F. Allen, Natural Language Understanding (2nd ed.),
Benjamin/Cummings, Menlo Park, CA, 1994.
J.F. Allen, J. Hendler, and A. Tate, (eds.), Readings in Planning,
Morgan Kaufmann, San Mateo, CA, 1990.
C. Clifton, L. Frazier, and K. Rayner, (eds.), Perspectives on
Sentence Processing, Erlbaum, Hillsdale, NJ, 1994.
P. Cohen, J. Morgan, and M. Pollack, (eds.), Intentions in
Communication, MIT Press, Cambridge, MA, 1990.
M. Genesereth and N. Nilsson, Logical Foundations of Artificial
Intelligence, Morgan Kaufmann, San Mateo, CA, 1987.
F. Pereira and B. Grosz, (eds.), Natural Language Processing,
MIT Press, Cambridge, MA, 1994.
RELATED PROGRAM AREAS
Other Communication Modalities, Adaptive Human Interfaces.