Natural Language Generation for a Speech Prosthesis
Ivan Sag, Herb Clark, Ann Copestake, Dan Flickinger
Center for the Study of Language and Information (CSLI)
Stanford University
CONTACT INFORMATION
Ann Copestake
CSLI, Ventura Hall
Stanford University
Stanford, CA 94305-4115
Phone: (415) 725-2312
Fax: (415) 725-2166
Email: aac@csli.stanford.edu
WWW PAGE
http://www-csli.stanford.edu/~aac/nsfproj.html
PROGRAM AREA
Intelligent Interactive Systems for Persons with Disabilities
KEYWORDS
natural language generation, cogeneration, augmentative and alternative
communication, AAC, statistical NLP, HPSG, unification grammar
PROJECT SUMMARY
This project is developing a novel approach to natural language generation
and applying it to computer-aided text and speech generation for people with
physical disabilities. Many people who cannot speak because of physical
disability use text-to-speech generators as prosthetic devices. However,
users of speech prostheses often have a more general loss of motor control,
and despite aids such as word prediction, text entry is slow and difficult.
For typical users, current speech prostheses have output rates of less than
a tenth of the speed of normal speech. This prevents natural social
conversation, since it completely disrupts the usual processes of
turn-taking, and it can negatively affect the listener's attitude to the
prosthesis user. The main focus of this research is the investigation of
techniques which can improve rates sufficiently for more natural
conversation to be possible, without sacrificing flexibility of content.
This new approach employs a combination of a wide-coverage grammar,
corpus-based word frequency data, and conversational templates. Applied to
speech prostheses, it enables the production of full sentences from minimal
user input in a context-sensitive way. The approach can also be applied more
generally for efficient production of formulaic text, such as the structured
reports used widely in business and government. It also has utility in
computer-aided language learning, both for people who are not fully literate
and for those whose first language is not English.
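As a rough illustration of how a conversational template and word frequency
data might combine to turn minimal input into a full sentence, consider the
following minimal sketch. It is not the project's generation algorithm: the
template strings, the context labels, the frequency counts and the function
name expand are all hypothetical, and a real system would use the grammar
rather than string substitution.

```python
# Hypothetical conversational templates, keyed by a dialogue context label.
TEMPLATES = {
    "request": "Could you please {verb} the {noun}?",
    "offer": "Would you like me to {verb} the {noun}?",
}

# Invented verb counts per noun, standing in for corpus-based frequency data.
VERB_GIVEN_NOUN = {
    "door": {"open": 40, "close": 35, "paint": 2},
    "window": {"open": 30, "close": 28},
}


def expand(context, noun, verb=None):
    """Build a full sentence from a context label and one or two keywords.

    If the user supplies no verb, fall back to the verb most frequently
    seen with the noun in the (toy) frequency table.
    """
    if verb is None:
        counts = VERB_GIVEN_NOUN.get(noun, {})
        if not counts:
            raise ValueError(f"no verb known for {noun!r}")
        verb = max(counts, key=counts.get)
    return TEMPLATES[context].format(verb=verb, noun=noun)


if __name__ == "__main__":
    print(expand("request", "door"))
    print(expand("offer", "window", verb="close"))
```

In this sketch the user supplies a single noun and the context selects the
template; supplying a verb explicitly overrides the frequency-based default,
which is one simple way to keep content flexible.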
The project started in March 1997. It combines the research interests of
two existing groups at CSLI: the Archimedes project
(http://csli-www.stanford.edu/arch/arch.html) and the English Resource
Grammar Online project (http://hpsg.stanford.edu/hpsg/erg.html).
In the first phase we are concentrating on four aspects:
- Expanding the existing English Resource Grammar, especially the lexicon.
  This includes enhancing the existing representation of lexical semantics.
- Implementing an initial version of the generation algorithm.
- Determining an initial set of conversational templates, in cooperation
  with two people with ALS (Lou Gehrig's disease).
- Beginning work on combining statistical information and the symbolic
  grammar by making use of lexical semantic categories to enhance
  prediction.
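One way lexical semantic categories could enhance statistical prediction is
sketched below, purely as an assumption-laden illustration. The category
table, the 0.5 weighting and the function names train and predict are
hypothetical and are not taken from the project; the idea shown is only that
a word never seen after a given context can still be suggested if other
words of the same category have been seen there.

```python
from collections import Counter, defaultdict

# Hypothetical lexical semantic categories for a handful of words.
CATEGORY = {
    "tea": "drink", "coffee": "drink", "juice": "drink",
    "bread": "food", "soup": "food",
}


def train(sentences):
    """Count next words and next-word categories after each previous word."""
    word_next = defaultdict(Counter)  # previous word -> next word counts
    cat_next = defaultdict(Counter)   # previous word -> next category counts
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            word_next[prev][nxt] += 1
            if nxt in CATEGORY:
                cat_next[prev][CATEGORY[nxt]] += 1
    return word_next, cat_next


def predict(prev, word_next, cat_next, k=3):
    """Rank candidate next words by bigram count plus a category bonus."""
    scores = Counter()
    for word, count in word_next.get(prev, {}).items():
        scores[word] += count
    seen_categories = cat_next.get(prev, {})
    for word, cat in CATEGORY.items():
        if cat in seen_categories:
            # Assumed weighting: half credit for sharing a seen category.
            scores[word] += 0.5 * seen_categories[cat]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


if __name__ == "__main__":
    model = train(["I drink tea", "I drink tea", "I drink coffee"])
    print(predict("drink", *model))
```

With this toy training data, "juice" never follows "drink" but is still
offered as a candidate because it shares a category with "tea" and
"coffee".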
PROJECT REFERENCES
Ann Copestake.
Applying Natural Language Processing Techniques to Speech Prostheses
In Working Notes of the 1996 AAAI Fall Symposium on Developing
Assistive Technology for People with Disabilities
http://www-csli.stanford.edu/~aac/papers/disai.ps.gz
Ann Copestake, Dan Flickinger and Ivan Sag.
Minimal Recursion Semantics: An Introduction
ms. CSLI, 1997
ftp://ftp-csli.stanford.edu/linguistics/sag/mrs.ps.gz
Ann Copestake.
Augmented and alternative NLP techniques for augmentative and alternative
communication
To appear in the Proceedings of the ACL workshop on Natural Language
Processing for Communication Aids, Madrid, 1997
http://www-csli.stanford.edu/~aac/papers/aac.ps.gz
Ann Copestake and Alex Lascarides.
Integrating symbolic and statistical representations: the
lexicon-pragmatics interface
To appear in the Proceedings of the ACL, Madrid, 1997.
The relationship of this paper to the project is slightly indirect, but
the first part of it illustrates the sort of combination of statistical
and symbolic techniques which we are developing.
http://www-csli.stanford.edu/~aac/papers/compounds.ps.gz
AREA BACKGROUND
Augmentative and alternative communication (AAC) is concerned with
technology to help people who have difficulty with communication. It is
necessarily a broad area, because of the diverse needs it aims to address.
People may have communication difficulties for purely physical reasons, or
because they have some cognitive or linguistic impairment. Some simple AAC
devices store a fixed number of prerecorded messages, but others synthesize
speech from arbitrary text (or alternative symbols) input by the user.
Letter and word prediction to support text input to AAC devices was among
the first practical applications of natural language processing (NLP)
techniques.
In spite of this, work on AAC by NLP researchers has been somewhat limited
(although see Alm et al. (1992) and Demasco and McCoy (1992)). Currently
there are attempts to encourage more research in this area, via two
workshops: NLP&AAC'96 (http://alpha.mic.dundee.ac.uk/~slanger/workshop.html)
at the University of Dundee, and NLP for communication aids
(http://www-csli.stanford.edu/~aac/clworkshop.html) in conjunction with
ACL/EACL'97 in Madrid. A special issue of Natural Language Engineering
(edited by Stefan Langer) is also planned.
Work on AAC also has close connections with work on HCI for people with
disabilities. AAC devices are often special-purpose computers or, in some
cases, general-purpose laptops running special-purpose software.
Furthermore, the increasing availability of computer technology and, in
particular, of Internet access has considerable potential to empower people
with disabilities. For example, email can be used by people who are deaf or
who do not have intelligible speech. Ideally, technology for AAC should be
integrated with computer applications, so that, for example, the same
techniques which enable the user to input text to a speech synthesizer also
allow them to write email, send messages to bulletin boards and so on.
AREA REFERENCES
Norman Alm, John L. Arnott and Alan F. Newell.
Prediction and conversational momentum in an augmentative
communication system
Communications of the ACM, 35(5), 47--57, 1992
John J. Darragh and Ian H. Witten.
The reactive keyboard, Cambridge University Press, 1992.
Patrick W. Demasco and Kathleen F. McCoy.
Generating text from compressed input: an intelligent interface for
people with severe motor impairments
Communications of the ACM, 35(5), 68--78, 1992
Alistair D. N. Edwards (editor).
Extra-Ordinary Human-Computer Interaction,
Cambridge University Press, 1995
John Perry, Elisabeth Macken, Neil Scott, Jan McKinley.
Disability, Inability, and Cyberspace
To appear in: Batya Friedman (editor), Human Values and the Design of
Computer Technology, Cambridge University Press/CSLI, 1997.
http://www-csli.stanford.edu/~john/disabilities/batya.html
RELATED PROGRAM AREAS
Speech and Natural Language Understanding
Adaptive Human Interfaces
Usability and User-Centered Design