
6.2 Discourse Modeling

Donia Scott & Hans Kamp
University of Brighton, UK
University of Stuttgart, Germany

6.2.1 Overview: Discourse and Dialogue

A central problem in the development of dialogue systems is one inherited directly from contemporary linguistics, which is still struggling to achieve a genuine integration of semantics and pragmatics. A satisfactory analysis of dialogue generally requires both semantic representation, i.e., representation of the content of what the different participants are saying, and pragmatic information: what kinds of speech acts they are performing (are they asking a question, answering a question that has just been asked, asking for clarification of what was just said, making a proposal, etc.?); what information is available to each participant and what information each wants; and, more generally, what purpose lies behind their various utterances or even behind their entering into the dialogue in the first place. Determining the semantic representation of an utterance and its pragmatic features must in general proceed in tandem: to determine the pragmatic properties of an utterance it is often necessary to have a representation of its content; conversely, it is often hardly possible to identify content without an independent assessment of the pragmatic role the utterance is meant to play, especially for the highly elliptical utterances that are common in spoken dialogue. A dialogue system that identifies the relevant semantic and pragmatic information will thus have to be based on a theory in which semantics and pragmatics are (i) both developed with the formal precision that is a prerequisite for implementation and (ii) suitably attuned to each other and intertwined.

Current approaches to discourse and dialogue in the fields of artificial intelligence and computational linguistics are based on four predominant theories of discourse, which emerged in the mid to late eighties:

[Hob85]:
A theory of discourse coherence based on a small, limited set of coherence relations, applied recursively to discourse segments. This is part of a larger, still-developing theory of the relations between text interpretation and belief systems.
[GS86]:
A tripartite organization of discourse structure according to the focus of attention of the speaker (the attentional state), the structure of the speaker's purposes (the intentional structure) and the structure of sequences of utterances (the linguistic structure); each of these three constituents deals with a different aspect of the discourse.
[MT87]:
A hierarchical organization of text spans, where each span is either the nucleus (central) or satellite (support) of one of a set of discourse relations. This approach is commonly known as Rhetorical Structure Theory (RST).
[McK85]:
A hierarchical organization of discourse around fixed schemata which guarantee coherence and which drive content selection in generation.
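As an illustration of the hierarchical organization described for [MT87], the following is a minimal sketch, not an implementation of RST itself. The class and function names (Span, Relation, nuclear_text, depth) and the example sentences are hypothetical; "evidence" and "elaboration" are used only as sample relation labels. The sketch covers only nucleus-satellite relations and leaves out everything else a full treatment would need.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Span:
    """A leaf: one text span (illustrative; not a definition from the source)."""
    text: str


@dataclass
class Relation:
    """An internal node: a named relation joining a nucleus and a satellite."""
    name: str
    nucleus: Union["Span", "Relation"]    # the central part
    satellite: Union["Span", "Relation"]  # the supporting part


Node = Union[Span, Relation]


def nuclear_text(node: Node) -> str:
    """Follow nucleus links down to a leaf, giving the most central span."""
    while isinstance(node, Relation):
        node = node.nucleus
    return node.text


def depth(node: Node) -> int:
    """Number of relation levels above the deepest leaf."""
    if isinstance(node, Span):
        return 0
    return 1 + max(depth(node.nucleus), depth(node.satellite))


# Invented three-span example: a claim supported by evidence,
# where the evidence is itself elaborated.
tree = Relation(
    name="evidence",
    nucleus=Span("The program works."),
    satellite=Relation(
        name="elaboration",
        nucleus=Span("It passed every test."),
        satellite=Span("The tests cover all modules."),
    ),
)

print(nuclear_text(tree))
print(depth(tree))
```

Because relations apply recursively to spans, a satellite can itself be a structured span, which is what makes the organization hierarchical rather than flat.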

No theory is complete, and some (or aspects of some) lend themselves more readily to implementation than others. In addition, no single theory is suitable for use on both sides of the natural language processing coin: the approaches advocated by Grosz and Sidner, and by Hobbs, are geared towards natural language understanding, whereas those of Mann and Thompson, and of McKeown, are more appropriate for natural language generation. With the burgeoning of research on natural language generation since the late eighties, the emphasis of computational approaches to discourse has expanded towards discourse production and, concomitantly, dialogue.

One important aspect of a dialogue is that the successive utterances which make it up are often interconnected by cross references of various sorts. For instance, one utterance will use a pronoun (or a deictic temporal phrase such as the day after, etc.) to refer to something mentioned in the utterance preceding it. The semantic theory underlying sophisticated dialogue systems must therefore be in a position to compute and represent such cross references. Traditional theories and frameworks of formal semantics are sentence-based and therefore not suited to discourse semantics without considerable extensions.

6.2.2 Discourse Representation Theory

Discourse Representation Theory (DRT) (cf. [Kam81,KR93]), a semantic theory developed for the express purpose of representing and computing trans-sentential anaphora and other forms of text cohesion, thus offers itself as a natural semantic framework for the design of sophisticated dialogue systems. DRT has already been used in the design of a number of question-answering systems, some of them of considerable sophistication.
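To make the idea of representing trans-sentential anaphora concrete, here is a minimal sketch of a discourse representation structure as a set of discourse referents plus conditions over them. The DRS class, its method names, and the two-sentence example ("A man walks. He whistles.") are hypothetical illustrations. The pronoun step uses a deliberately naive assumption, linking to the most recently introduced referent; this is a stand-in heuristic, not the resolution procedure of DRT.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DRS:
    """Illustrative discourse representation: referents plus conditions."""
    referents: List[str] = field(default_factory=list)
    conditions: List[Tuple[str, ...]] = field(default_factory=list)

    def new_referent(self) -> str:
        """Introduce a fresh discourse referent x1, x2, ..."""
        ref = f"x{len(self.referents) + 1}"
        self.referents.append(ref)
        return ref

    def add_indefinite(self, noun: str) -> str:
        """'A <noun>': introduce a referent and a condition noun(ref)."""
        ref = self.new_referent()
        self.conditions.append((noun, ref))
        return ref

    def add_predicate(self, verb: str, ref: str) -> None:
        """Record a one-place condition verb(ref)."""
        self.conditions.append((verb, ref))

    def add_pronoun(self) -> Optional[str]:
        """Naive assumption: equate the pronoun with the latest referent."""
        if not self.referents:
            return None  # no antecedent available in the discourse so far
        antecedent = self.referents[-1]
        ref = self.new_referent()
        self.conditions.append(("=", ref, antecedent))
        return ref


# "A man walks. He whistles." processed sentence by sentence
# into one growing representation.
drs = DRS()
x = drs.add_indefinite("man")
drs.add_predicate("walk", x)
y = drs.add_pronoun()
drs.add_predicate("whistle", y)

print(drs.referents)
print(drs.conditions)
```

The point of the sketch is that the second sentence is interpreted against the representation built from the first, so the pronoun's link to its antecedent is stored as an explicit condition rather than being lost at the sentence boundary.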

Currently, DRT is being used as the semantic representation formalism in VERBMOBIL [Wah93], a project to develop a machine translation system for face-to-face spoken dialogue funded by the German Department of Science and Technology. Here the aim is to integrate DRT-like semantics with the various kinds of pragmatic information that are needed for translation purposes.

6.2.3 Future Directions

Among the key outstanding issues for computational theories of discourse are:

Nature of Discourse Relations:
Relations are variously viewed as textual, rhetorical, intentional, or informational. Although each type of relation can be expected to have a different impact on a text, current discourse theories generally fail to distinguish between them.
Number of Discourse Relations:
Depending on the chosen theoretical approach, the number of relations can range from two to about twenty-five. Altogether, there are over 350 relations available for use (see [Hov90]).
Level of Abstraction at which Discourse is Described:
In general, approaches advocating fewer discourse relations tend to address higher levels of abstraction.
Nature of Discourse Segments:
A key question here is whether discourse segments have psychological reality or whether they are abstract linguistic units akin to phonemes. Recently, there have been attempts to identify the boundary features of discourse segments [HG92,LP93].
Rôle of Intentions in Discourse:
It is well-recognized that intentions play an important rôle in discourse. However, of the four predominant computational theories, only that of Grosz and Sidner provides an explicit treatment of intentionality.
Mechanisms for Handling Key Linguistic Phenomena:
Of the predominant theories, only RST fails to address the issues of discourse focus, reference resolution and cue phrases. Existing treatments of focus, however, suffer from terminological confusion between notions of focus, theme and topic, also rife in the text linguistics literature.
Mechanisms for Reasoning about Discourse:
Cue phrases and certain syntactic forms are useful signals of prevailing discourse functions (e.g., discourse relations, discourse focus and topic) but do not occur with predictable regularity in texts. Reasoning mechanisms for retrieving and/or generating these discourse functions are thus required.

Recent advances have come not from the development of new theories but from the extension and integration of existing ones. Notable among them are:

There are many implemented systems for discourse understanding and generation. Most involve hybrid approaches, selectively exploiting the power of existing theories. Available systems for handling dialogue tend either to have sophisticated discourse generation coupled with a crude discourse understanding system, or vice versa; attempts at full dialogue systems are only now beginning to appear.


