
9.2 Representations of Space and Time

Gérard Ligozat
LIMSI-CNRS, Orsay, France

Asserting that most human activities requiring intelligence are grounded in space and time is a commonplace remark. In the context of multimodal environments, spatial and temporal information has to be represented, exchanged, and processed between different components using various modes.

In particular, spatial and temporal knowledge may have to be extracted from natural language and further processed, or general knowledge involving spatial and temporal information may have to be expressed using natural language.

Processing spatial and temporal knowledge in natural language therefore draws upon two main domains:

  1. Knowledge representation and reasoning about spatial and temporal knowledge, a branch of Artificial Intelligence, including such aspects as qualitative physics.
  2. Theoretical and computational linguistics.

In the first domain, the main goal is to devise formalisms for representing and reasoning about spatial and temporal information in general.

In the second domain, which is of course closely related to the first, and often considered as a subdomain of application, the main focus is upon spatial and temporal information in natural language, understanding its content and meaning, and processing it.

Both aspects interact in many applications having to do at the same time with real world data and linguistic data: story understanding, scene description, route description.

Despite the obvious interest and ultimate necessity of considering spatial and temporal aspects jointly, the branches of AI, computational linguistics, and formal philosophy that deal with time are much more advanced than the corresponding branches that deal with space.

9.2.1 Time and Space in Natural Language

Understanding temporal and spatial information in natural language, or generating natural language to express temporal or spatial meanings, implies: (1) identifying the linguistic elements conveying this information (markers), (2) elucidating its semantic (and pragmatic) content, (3) devising suitable systems of representation to process this content, (4) implementing and using those representations.

Nature and Contents of the Markers for Time

The basic linguistic markers for temporal notions in many languages are verb tenses (e.g., preterite, pluperfect in English) and temporal adverbials ("yesterday", "tomorrow", "two weeks from now").

However, tenses also express aspectual notions, which are important for understanding the implications of a given utterance: compare "He was crossing the street" and "He crossed the street". Only the second sentence allows the inference "He reached the other side of the street".

A basic property of temporal information in natural language is its deictic nature: an event reported in a past tense is understood to have happened before the time of speech. Reichenbach introduced the time of speech S, the time of reference R, and the time of event E to explain the differences in meaning between the basic tenses of English.
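As an illustration of the S, R, E scheme, the following minimal sketch maps an ordering of event, reference, and speech times to an English tense. The table covers only four tenses whose analysis is uncontroversial; the function names and the use of numbers as time stamps are assumptions made for the example, not part of Reichenbach's account.

```python
# Hypothetical helper: compare two time stamps and return "<", "=" or ">".
def order(a, b):
    return "<" if a < b else ("=" if a == b else ">")

# Keys are (order of E relative to R, order of R relative to S).
TENSES = {
    ("=", "<"): "simple past",      # E,R before S:   "He crossed the street"
    ("<", "="): "present perfect",  # E before R,S:   "He has crossed the street"
    ("<", "<"): "past perfect",     # E before R before S: "He had crossed the street"
    ("=", "="): "simple present",   # E, R and S coincide
}

def classify(e, r, s):
    """Return the tense matching the ordering of E, R and S, if it is in the table."""
    return TENSES.get((order(e, r), order(r, s)), "not in this table")

if __name__ == "__main__":
    print(classify(1, 1, 2))  # event and reference coincide, both before speech
    print(classify(0, 1, 2))  # event before reference before speech
```

The deictic character of tense shows up in the fact that S always enters the classification: the same pair (E, R) yields different tenses depending on when the sentence is uttered.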

Another important component is the fact that verbs behave in various ways according to their semantic class (Aktionsart). Variants or elaborations of Vendler's classification (states, activities, accomplishments and achievements) have been in common use.

Systems of Representation

The definition and study by Prior of modal tense logics has resulted in an important body of work in formal logic, with applications in philosophy (the original motivation of Prior's work), the semantics of natural language, and computer science. The simplest versions of tense logic use the modal operators F, G, P, and H. For instance, Fp means that p will be true at some future time, and Gp that p will be true at all future times; Pp and Hp have the corresponding interpretations in the past.

Hence tense logic, in analogy to natural language, uses time as an implicit parameter: a formula has to get its temporal reference from the outside.
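The role of time as an implicit parameter can be made concrete with a small evaluator. The sketch below assumes the conventional operator names F (some future time), G (all future times), P (some past time), and H (all past times), and a deliberately simple model: a finite list whose entry at index t is the set of atomic propositions true at time t. Both the model and the function signatures are illustrative assumptions.

```python
# A model is a list of sets: model[t] holds the atoms true at time t.
# Every operator needs the evaluation time t supplied from outside,
# which is exactly the "implicit parameter" of tense logic.

def F(p, model, t):
    """p is true at some time strictly after t."""
    return any(p in model[u] for u in range(t + 1, len(model)))

def G(p, model, t):
    """p is true at all times strictly after t (vacuously true at the last time)."""
    return all(p in model[u] for u in range(t + 1, len(model)))

def P(p, model, t):
    """p was true at some time strictly before t."""
    return any(p in model[u] for u in range(0, t))

def H(p, model, t):
    """p was true at all times strictly before t (vacuously true at time 0)."""
    return all(p in model[u] for u in range(0, t))

if __name__ == "__main__":
    model = [{"p"}, set(), {"p"}, {"p"}]
    print(F("p", model, 0), G("p", model, 0))  # p holds later, but not at every later time
    print(G("p", model, 1), H("p", model, 1))
```

A finite model makes G and H vacuously true at the end points; richer treatments use unbounded or dense time, which this sketch does not attempt.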

In Artificial Intelligence, both [McD82] and [All83] introduced reified logics for dealing with temporal information. Reification consists in incorporating part of the meta-language into the language: a formula in the object language becomes a propositional term in the new language. For example, p being true during time t might be written with a predicate such as HOLDS(p, t). This makes it possible to distinguish different ways of being true, and to quantify over propositional terms. A recent survey of temporal reasoning in AI is [Vil94].

Processing Temporal Information

Typically, recent work on temporal information in natural language uses some or all of the preceding tools. A great deal of work is concerned with determining the temporal value of a given sentence. Good examples are [Web88] and [MS88].

Nature and Contents of the Markers for Space

Primary linguistic markers of space are spatial prepositions (in, on, under, below, behind, above) and verbs of motion (arrive, cross). The seminal work by [Her86] showed that prepositions cannot be analyzed in purely geometric terms, and that their interpretation depends heavily on contextual factors. Following initial work by [Van86], Vieu, Aurnague and Briffault developed general theories of spatial reference and interpretations of prepositions in French (see references in [COS93]). It appears that spatial information also involves:

Developing systems for dealing with spatial information is best understood in the larger context of spatial reasoning in AI.

9.2.2 Implementation Issues

A general computational framework for expressing temporal information is the binary constraint network: temporal information is represented by a network whose nodes are temporal entities (e.g., intervals), and information about binary relations between entities is represented by labels on the arcs.

In Allen's approach, such a qualitative network will represent a finite set of intervals, and labels on the arcs will be disjunctions of the thirteen basic relations (representing incomplete knowledge). Basic computational problems will be:

  1. Determining whether a given network is coherent, i.e., describes at least one feasible scenario.
  2. Finding all scenarios compatible with a given network.
  3. For a given network, answering the previous questions in case new intervals or constraints are added.
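To make the notions of label and scenario concrete, the sketch below computes which of the thirteen basic relations holds between two intervals given by numeric end points, and checks whether a concrete assignment of intervals satisfies a network whose arcs carry disjunctive labels. The relation names follow common usage; the dictionary encoding of the network and the function names are assumptions made for the example.

```python
def allen_relation(a, b):
    """Return the basic relation holding between intervals a=(s1,e1) and b=(s2,e2)."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only proper overlap remains at this point.
    return "overlaps" if s1 < s2 else "overlapped-by"

def satisfies(network, scenario):
    """network: {(i, j): set of allowed basic relations}; scenario: {i: (start, end)}."""
    return all(allen_relation(scenario[i], scenario[j]) in label
               for (i, j), label in network.items())

if __name__ == "__main__":
    # Incomplete knowledge: A is before or meets B; B is during C.
    network = {("A", "B"): {"before", "meets"}, ("B", "C"): {"during"}}
    scenario = {"A": (0, 2), "B": (2, 4), "C": (1, 6)}
    print(satisfies(network, scenario))
```

Checking one given scenario is easy; the hard problems listed above concern deciding whether any such scenario exists, or enumerating all of them.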

The first two problems are NP-hard in the full algebra of intervals, whereas they are polynomial in the case of time points. Recent results of [NB93] identify a maximal tractable subset.

Most algorithms in this framework are variants of the constraint propagation method first introduced in this context by Allen.
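The propagation idea can be sketched on the simpler case of time points, where the composition table has only nine entries instead of the 169 needed for intervals. The code below is a naive fixpoint loop rather than Allen's queue-based formulation, and all names in it are illustrative assumptions. Each label is a set of basic relations among "<", "=", ">", and each label on an arc (i, j) is repeatedly intersected with the composition of the labels on (i, k) and (k, j).

```python
from itertools import product

ALL = frozenset("<=>")
# Composition of basic point relations: if x R1 y and y R2 z, what can hold between x and z?
COMP = {
    ("<", "<"): "<", ("<", "="): "<", ("<", ">"): "<=>",
    ("=", "<"): "<", ("=", "="): "=", ("=", ">"): ">",
    (">", "<"): "<=>", (">", "="): ">", (">", ">"): ">",
}
INV = {"<": ">", "=": "=", ">": "<"}

def compose(r1, r2):
    out = set()
    for a, b in product(r1, r2):
        out |= set(COMP[a, b])
    return frozenset(out)

def converse(r):
    return frozenset(INV[a] for a in r)

def propagate(n, constraints):
    """constraints: {(i, j): string of allowed basic relations}, nodes are 0..n-1.
    Returns the tightened network, or None if some label becomes empty."""
    net = {(i, j): (frozenset("=") if i == j else ALL)
           for i in range(n) for j in range(n)}
    for (i, j), r in constraints.items():
        net[i, j] &= frozenset(r)
        net[j, i] &= converse(frozenset(r))
    if any(not label for label in net.values()):
        return None
    changed = True
    while changed:
        changed = False
        for i, k, j in product(range(n), repeat=3):
            tightened = net[i, j] & compose(net[i, k], net[k, j])
            if tightened != net[i, j]:
                if not tightened:
                    return None  # an empty label proves the network incoherent
                net[i, j] = tightened
                changed = True
    return net

if __name__ == "__main__":
    net = propagate(3, {(0, 1): "<=", (1, 2): "<"})
    print(sorted(net[0, 2]))  # relation inferred between points 0 and 2
```

The same loop applies to interval networks once `COMP` and `INV` are replaced by the tables for the thirteen basic interval relations.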

Binary constraint networks are also suitable for representing quantitative constraints between time points. A case in point is the time maps used by Dean and McDermott (see [Vil94]).
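One common way to handle quantitative constraints of this kind is to bound the distance between pairs of time points and tighten the bounds with an all-pairs shortest-path computation. The sketch below follows that scheme; it is not a reconstruction of Dean and McDermott's time map system, and the function name and input encoding are assumptions made for the example.

```python
import math

def tighten(n, bounds):
    """bounds: {(i, j): (lo, hi)} meaning lo <= t_j - t_i <= hi, points are 0..n-1.
    Returns a matrix d with t_j - t_i <= d[i][j], or None if the bounds conflict."""
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (i, j), (lo, hi) in bounds.items():
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    # Floyd-Warshall: combine constraints along every path.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative cycle means the constraints cannot all hold.
    if any(d[i][i] < 0 for i in range(n)):
        return None
    return d

if __name__ == "__main__":
    # Point 1 is 10 to 20 units after point 0; point 2 is 30 to 40 units after point 1.
    d = tighten(3, {(0, 1): (10, 20), (1, 2): (30, 40)})
    print(-d[2][0], d[0][2])  # derived lower and upper bound on t_2 - t_0
```

As in the qualitative case, the arcs carry the constraints and propagation derives bounds that were not stated explicitly.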

9.2.3 Future Directions

A promising direction of research in the domain of temporal information in natural language is concerned with the integration of the textual, or discourse level. Two basic aspects are:

  1. Temporal anaphora: in a given sentence, some of the reference indices (S, R, E) are determined by other sentences in the text.
  2. Temporal structure: this has to do with determining general principles for the temporal structure of discourse. [Nak88] and [LA93] are recent examples.

In the spatial domain, two directions of interest are:

  1. The development of an interdisciplinary, cognitively motivated field of research on spatial relations ([MF91,FCF92,COS93]). This combines results from cognitive psychology on the perception and processing of spatial information, research in Artificial Intelligence on the representation of spatial information, as well as research on geographic information systems, which has to deal with spatial information both as numeric information (pixels) and symbolic information, e.g., maps, diagrams.

    Typical applications include the generation of route descriptions in natural language, the maintenance and querying of spatial databases, the interpretation and generation of maps describing spatio-temporal phenomena.

  2. A trend towards the development of general formalisms for spatial reasoning, including:


