Natural Language Generation for a Speech Prosthesis

Ivan Sag, Herb Clark, Ann Copestake, Dan Flickinger

Center for the Study of Language and Information (CSLI)
Stanford University

CONTACT INFORMATION

Ann Copestake
CSLI, Ventura Hall
Stanford University
Stanford, CA 94305-4115
Phone: (415) 725-2312
Fax: (415) 725-2166
Email: aac@csli.stanford.edu

WWW PAGE

http://www-csli.stanford.edu/~aac/nsfproj.html

PROGRAM AREA

Intelligent Interactive Systems for Persons with Disabilities

KEYWORDS

natural language generation, cogeneration, augmentative and alternative communication, AAC, statistical NLP, HPSG, unification grammar

PROJECT SUMMARY

This project is developing a novel approach to natural language generation and applying it to computer-aided text and speech generation for people with physical disabilities. Many people who cannot speak because of physical disability use text-to-speech generators as prosthetic devices. However, users of speech prostheses often have a more general loss of motor control, and despite aids such as word prediction, text entry is slow and difficult. For typical users, current speech prostheses have output rates less than a tenth of the speed of normal speech. This prevents natural social conversation, since it completely disrupts the usual processes of turn-taking, and it can negatively affect the listener's attitude to the prosthesis user. The main focus of this research is the investigation of techniques which can improve rates sufficiently for more natural conversation to be possible, without sacrificing flexibility of content. The new approach employs a combination of a wide-coverage grammar, corpus-based word frequency data, and conversational templates. Applied to speech prostheses, it enables the production of full sentences from minimal user input in a context-sensitive way. The approach can also be applied more generally to the efficient production of formulaic text, such as the structured reports used widely in business and government. It also has utility in computer-aided language learning, both for people who are not fully literate and for those for whom English is not their first language.
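As a rough illustration of how conversational templates and word frequency data might together turn minimal input into a full sentence, consider the following minimal sketch. It is not the project's generation algorithm: the templates, word lists, counts, and the function name expand are all hypothetical, and a simple slot-filling scheme stands in for the wide-coverage grammar.

```python
# Hypothetical toy data: none of these templates, words, or counts
# come from the project itself.
TEMPLATES = {
    "request": "Could you {verb} the {noun}, please?",
    "statement": "I would like to {verb} the {noun}.",
}
WORD_FREQ = {"open": 120, "close": 95, "window": 80, "door": 150}
VERBS = {"open", "close"}
NOUNS = {"window", "door"}


def expand(template_name, keywords):
    """Build a full sentence from a template and a few user keywords.

    Returns None when the keywords cannot fill every slot.
    """
    verbs = [w for w in keywords if w in VERBS]
    nouns = [w for w in keywords if w in NOUNS]
    if not verbs or not nouns:
        return None
    # When several keywords compete for a slot, prefer the more frequent word.
    verb = max(verbs, key=lambda w: WORD_FREQ.get(w, 0))
    noun = max(nouns, key=lambda w: WORD_FREQ.get(w, 0))
    return TEMPLATES[template_name].format(verb=verb, noun=noun)


if __name__ == "__main__":
    print(expand("request", ["door", "open"]))
    print(expand("statement", ["close", "window"]))
```

In this sketch the user supplies only two content words and a template choice, and the remaining words of the sentence come from the template.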

The project started in March 1997. It combines the research interests of two existing groups at CSLI: the Archimedes project (http://csli-www.stanford.edu/arch/arch.html) and the English Resource Grammar Online project (http://hpsg.stanford.edu/hpsg/erg.html). In the first phase we are concentrating on four aspects:

  1. Expanding the existing English Resource Grammar, especially the lexicon. This includes enhancing the existing representation of lexical semantics.
  2. Implementing an initial version of the generation algorithm.
  3. Determining an initial set of conversational templates, in cooperation with two people with ALS (Lou Gehrig's disease).
  4. Beginning work on combining statistical information and the symbolic grammar by making use of lexical semantic categories to enhance prediction.
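The idea in item 4, using lexical semantic categories to enhance statistical prediction, can be sketched as follows. This is only an illustration under stated assumptions: the word counts, the category labels, the EXPECTS table, and the boost factor are all invented for the example, and the project's actual method of combining the grammar with statistical information may differ.

```python
from collections import Counter

# Hypothetical toy data, invented for illustration only.
UNIGRAM = Counter({"coffee": 50, "coat": 40, "computer": 60, "cold": 30})
CATEGORY = {
    "coffee": "drink",
    "coat": "clothing",
    "computer": "device",
    "cold": "property",
}
# Semantic categories that a preceding verb is assumed to select for.
EXPECTS = {"drink": {"drink"}, "wear": {"clothing"}}


def predict(prefix, previous_word=None, k=3, boost=5.0):
    """Rank completions of `prefix` by frequency, boosted by category fit."""
    expected = EXPECTS.get(previous_word, set())
    scored = []
    for word, count in UNIGRAM.items():
        if not word.startswith(prefix):
            continue
        # Symbolic knowledge (the category) rescales the statistical score.
        factor = boost if CATEGORY.get(word) in expected else 1.0
        scored.append((count * factor, word))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:k]]


if __name__ == "__main__":
    print(predict("co"))
    print(predict("co", previous_word="drink"))
```

With frequency alone, "computer" is ranked first for the prefix "co"; once the previous word is "drink", the category match moves "coffee" to the top of the list.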

PROJECT REFERENCES

Ann Copestake. Applying Natural Language Processing Techniques to Speech Prostheses. In Working Notes of the 1996 AAAI Fall Symposium on Developing Assistive Technology for People with Disabilities. http://www-csli.stanford.edu/~aac/papers/disai.ps.gz

Ann Copestake, Dan Flickinger and Ivan Sag. Minimal Recursion Semantics: An Introduction. Ms., CSLI, 1997. ftp://ftp-csli.stanford.edu/linguistics/sag/mrs.ps.gz

Ann Copestake. Augmented and alternative NLP techniques for augmentative and alternative communication. To appear in the Proceedings of the ACL workshop on Natural Language Processing for Communication Aids, Madrid, 1997. http://www-csli.stanford.edu/~aac/papers/aac.ps.gz

Ann Copestake and Alex Lascarides. Integrating symbolic and statistical representations: the lexicon-pragmatics interface. To appear in the Proceedings of the ACL, Madrid, 1997. The relationship of this paper to the project is slightly indirect, but the first part of it illustrates the sort of combination of statistical and symbolic techniques which we are developing. http://www-csli.stanford.edu/~aac/papers/compounds.ps.gz

AREA BACKGROUND

Augmentative and alternative communication (AAC) is concerned with technology to help people who have difficulty with communication. It is necessarily a broad area, because of the diverse needs it aims to address. People may have communication difficulties for purely physical reasons, or because they have some cognitive or linguistic impairment. Some simple AAC devices store a fixed number of prerecorded messages, but others synthesize speech from arbitrary text (or alternative symbols) input by the user. Letter and word prediction to support text input to AAC devices was among the first practical applications of natural language processing (NLP) techniques. In spite of this, work on AAC by NLP researchers has been somewhat limited (although see Alm et al. (1992) and Demasco and McCoy (1992)). Currently there are attempts to encourage more research in this area via two workshops: NLP&AAC'96 (http://alpha.mic.dundee.ac.uk/~slanger/workshop.html) at the University of Dundee, and NLP for communication aids (http://www-csli.stanford.edu/~aac/clworkshop.html) in conjunction with ACL/EACL'97 in Madrid. A special issue of Natural Language Engineering (edited by Stefan Langer) is also planned.

Work on AAC also has close connections with work on HCI for people with disabilities. AAC devices are often special-purpose computers or, in some cases, general-purpose laptops used with special-purpose software. Furthermore, the increasing availability of computer technology and, in particular, of Internet access has considerable potential to empower people with disabilities. For example, email can be used by people who are deaf or who do not have intelligible speech. Ideally, technology for AAC should be integrated with computer applications, so that, for example, the same techniques which enable the user to input text to a speech synthesizer also allow them to write email, send messages to bulletin boards and so on.

AREA REFERENCES

Norman Alm, John L. Arnott and Alan F. Newell. Prediction and conversational momentum in an augmentative communication system. Communications of the ACM, 35(5), 47-57, 1992.

John J. Darragh and Ian H. Witten. The reactive keyboard. Cambridge University Press, 1992.

Patrick W. Demasco and Kathleen F. McCoy. Generating text from compressed input: an intelligent interface for people with severe motor impairments. Communications of the ACM, 35(5), 68-78, 1992.

Alistair D. N. Edwards (editor). Extra-Ordinary Human-Computer Interaction. Cambridge University Press, 1995.

John Perry, Elisabeth Macken, Neil Scott and Jan McKinley. Disability, Inability, and Cyberspace. To appear in: Batya Friedman (editor), Human Values and the Design of Computer Technology. Cambridge University Press/CSLI, 1997. http://www-csli.stanford.edu/~john/disabilities/batya.html

RELATED PROGRAM AREAS

Speech and Natural Language Understanding
Adaptive Human Interfaces
Usability and User-Centered Design