...
The current version of this Survey of the State of the Art in Human Language Technology will change in late Spring/early Summer, 1996, to reflect the copyediting by the publisher.

...phoneme
Linguistic symbols presented between slashes, e.g., /p/, /t/, /k/, refer to phonemes: the minimal sound units that, when changed, change the meaning of a word. The acoustic realizations of phonemes in speech are referred to as allophones, phones, or phonetic segments, and are presented in brackets, e.g., [p], [t], [k].

...,
Here and in the following, the notation 46#46 stands for the sequence 47#47.

...spelling.
For example, in the LM we treat the present tense and the past participle of the verb read (I read vs. I have read) as the same word, while the acoustic model will have different models corresponding to the different pronunciations.

...way
Instead of having a single partition of the space of histories, one can use the exponential family to define a set of features that are used for computing the probability of an event. See the discussion on Maximum Entropy in [LRR93,DR72,BDPDP94] for more details.
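As a rough illustration of the exponential-family form referred to here, a conditional maximum entropy model is commonly written as below. The symbols are assumptions introduced for this sketch and are not taken from the survey: h is a history, w a predicted word, f_i a feature function, lambda_i its weight, and Z(h) a normalizing constant.

```latex
P(w \mid h) = \frac{1}{Z(h)} \exp\Big( \sum_{i} \lambda_i f_i(h, w) \Big),
\qquad
Z(h) = \sum_{w'} \exp\Big( \sum_{i} \lambda_i f_i(h, w') \Big)
```

Each feature can fire on an arbitrary property of the history, so several overlapping views of the history contribute to one probability instead of a single partition deciding it.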

...tokenizer
Tokenizing English is fairly straightforward, since white space separates words and simple rules can handle most punctuation. Special care has to be taken with abbreviations. For languages such as Japanese and Chinese, word segmentation is a more complicated problem, since spaces are not used between words.
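The rules described above might be sketched as follows. This is a minimal illustration, not the tokenizer used in any system discussed in the survey: the function name `tokenize` and the small `ABBREVIATIONS` set are hypothetical, and a practical tokenizer would need a much larger abbreviation lexicon and further rules (numbers, hyphens, clitics).

```python
# Hypothetical, deliberately tiny abbreviation list (an assumption for this sketch).
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "e.g.", "i.e.", "etc.", "vs."}


def tokenize(text):
    """Split English text on white space, then separate punctuation marks."""
    tokens = []
    for chunk in text.split():
        # Peel leading punctuation (e.g. an opening parenthesis) into its own tokens.
        while chunk and not chunk[0].isalnum():
            tokens.append(chunk[0])
            chunk = chunk[1:]
        # Peel trailing punctuation, but stop as soon as what remains is a
        # known abbreviation, so that its final period stays attached.
        trailing = []
        while (chunk and not chunk[-1].isalnum()
               and chunk.lower() not in ABBREVIATIONS):
            trailing.append(chunk[-1])
            chunk = chunk[:-1]
        if chunk:
            tokens.append(chunk)
        # Trailing marks were collected right to left; restore their order.
        tokens.extend(reversed(trailing))
    return tokens


print(tokenize("Dr. Smith arrived, e.g., at noon."))
```

The abbreviation check is what keeps "Dr." and "e.g." intact while the sentence-final period after "noon" is split off. Nothing comparable works for text written without spaces, which is why segmentation for Japanese or Chinese needs a dictionary or a statistical model.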

...Understanding
I am grateful to Victor Zue for many very helpful suggestions.

Stephen Pulman....
This survey draws in part on material prepared for the European Commission LRE Project 62-051, FraCaS: A Framework for Computational Semantics. I am grateful to the other members of the project for their comments and contributions.


Maintained by Mike Noel and Wei Wei