Chapter 9 References

ACL83
Association for Computational Linguistics. Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, Cambridge, Massachusetts, 1983.

ACL90
Association for Computational Linguistics. Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, Pittsburgh, Pennsylvania, 1990.

ACL92
Association for Computational Linguistics. Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, University of Delaware, 1992.

ACL93
Association for Computational Linguistics. Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Ohio State University, 1993.

AGvB93
F. D. Anger, H. W. Güsgen, and J. van Benthem, editors. Proceedings of the IJCAI-93 Workshop on Spatial and Temporal Reasoning (W17), Chambéry, France, August 1993.

AL91
C. Abry and M. T. Lallouache. Audibility and stability of articulatory movements: Deciphering two experiments on anticipatory rounding in French. In Proceedings of the 12th International Congress of Phonetic Sciences, volume 1, pages 220--225, Aix-en-Provence, France, 1991.

All83
James F. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832--843, 1983.

ANL94
ACL. Proceedings of the Fourth Conference on Applied Natural Language Processing, Stuttgart, Germany, 1994. Morgan Kaufmann.

ARP93
Advanced Research Projects Agency. Proceedings of the 1993 ARPA Human Language Technology Workshop, Princeton, New Jersey, March 1993. Morgan Kaufmann.

Asi94
IEEE. Proceedings of the 28th Asilomar Conference on Signals, Systems and Computers, October 1994.

AV93
Michel Aurnague and Laure Vieu. A logical framework for reasoning about space. In Anger et al. [AGvB93], pages 123--158.

BAT92
Robert C. Berwick, Steven P. Abney, and Carol Tenny, editors. Principle-Based Parsing: Computation and Psycholinguistics. Kluwer, Dordrecht, The Netherlands, 1992.

BB92
G. E. Blonder and R. A. Boie. Capacitive moments sensing for electronic paper. U.S. Patent 5 113 041, May 1992.

BC94
G. Burdea and P. Coiffet. Virtual Reality Technology. John Wiley, New York, 1994.

Bel95
Y. Bellik. Interfaces Multimodales: Concepts, Modèles et Architectures. PhD thesis, Université d'Orsay, Paris, 1995.

Ber93
N. O. Bernsen. Modality theory: Supporting multimodal interface design. In ERCIM, editor, Proceedings of the Workshop ERCIM on Human-Computer Interaction, Nancy, November 1993.

BF90
D. A. Berkley and J. L. Flanagan. HuMaNet: An experimental human/machine communication network based on ISDN. AT&T Technical Journal, 69:87--98, September 1990.

BHMW93
Christoph Bregler, Hermann Hild, Stefan Manke, and Alex Waibel. Improving connected letter recognition by lipreading. In Proceedings of the 1993 International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 557--560. IEEE, April 1993.

BJKZ85
R. Bajcsy, A. Joshi, E. Krotkov, and A. Zwarico. LandScan: A natural language and computer vision system for analyzing aerial images. In Proceedings of the 9th International Joint Conference on Artificial Intelligence, pages 919--921, Los Angeles, 1985.

BL85
P. Bergeron and P. Lachapelle. Controlling facial expressions and body movements in the computer generated animated short `Tony de Peltrie'. In SIGGRAPH '85 Tutorial Notes, 1985.

BLMA92
C. Benoît, M. T. Lallouache, T. Mohamadi, and C. Abry. A set of French visemes for visual speech synthesis. In G. Bailly and C. Benoît, editors, Talking Machines: Theories, Models, and Designs, pages 485--504. Elsevier Science, 1992.

BMK94
C. Benoît, T. Mohamadi, and S. Kandel. Effects of phonetic context on audio-visual intelligibility of French. Journal of Speech and Hearing Research, 37:1195--1203, 1994.

BOK94
Christoph Bregler, Stephen Omohundro, and Yochai Konig. A hybrid approach to bimodal speech recognition. In Asilomar [Asi94].

Bol80
R. A. Bolt. Put-that-there: Voice and gesture at the graphic interface. Computer Graphics, 14(3):262--270, August 1980.

BOYBJ90
F. Brooks, M. Ouh-Young, J. Batter, and P. Jerome. Project GROPE: Haptic displays for scientific visualization. Computer Graphics, 24(4):177--185, 1990.

BPW93
N. I. Badler, C. B. Phillips, and B. L. Webber. Simulating Humans: Computer Graphics Animation and Control. Oxford University Press, New York, 1993.

Bre82
J. Bresnan, editor. The Mental Representation of Grammatical Relations. MIT Press, Cambridge, Massachusetts, 1982.

Bro90
N. Michael Brooke. Visible speech signals: Investigating their analysis, synthesis and perception. In M. M. Taylor, F. Néel, and D. G. Bouwhuis, editors, The Structure of Multimodal Dialogue. Elsevier Science, Amsterdam, 1990.

BZ91
G. Burdea and J. Zhuang. Dextrous telerobotics with force feedback. Robotica, 9(1 & 2):171--178; 291--298, 1991.

Che57
C. Cherry. On Human Communication. Wiley, New York, 1957.

CM90
M. M. Cohen and D. W. Massaro. Synthesis of visible speech. Behavior Research Methods, Instruments, & Computers, 22(2):260--263, 1990.

CM93
M. M. Cohen and D. W. Massaro. Modeling coarticulation in synthetic visual speech. In N. M. Thalmann and D. Thalmann, editors, Models and Techniques in Computer Animation, pages 139--156. Springer-Verlag, Tokyo, 1993.

CNS93
J. Coutaz, L. Nigay, and D. Salber. The MSM framework: A design space for multi-sensori-motor systems. In L. Bass, J. Gornostaev, and C. Unger, editors, Lecture Notes in Computer Science, Selected Papers, EWCHI'93, East-West Human Computer Interaction, pages 231--241. Springer-Verlag, Moscow, August 1993.

CO71
W. Condon and W. Ogston. Speech and body motion synchrony of the speaker-hearer. In D. Horton and J. Jenkins, editors, The Perception of Language, pages 150--184. Academic Press, 1971.

Coh93
Anthony Cohn. Modal and non-modal qualitative spatial logics. In Anger et al. [AGvB93], pages 87--92.

COL94
Proceedings of the 15th International Conference on Computational Linguistics, Kyoto, Japan, 1994.

COS93
Proceedings of the European Conference on Spatial Information Theory (COSIT'93), volume 716 of Lecture Notes in Computer Science. Springer-Verlag, September 1993.

DAR91
Defense Advanced Research Projects Agency. Proceedings of the Fourth DARPA Speech and Natural Language Workshop, Pacific Grove, California, February 1991. Morgan Kaufmann.

DPH93
John R. Deller, Jr., John G. Proakis, and John H. Hansen. Discrete-Time Processing of Speech Signals. Macmillan, 1993.

EAC93
European Chapter of the Association for Computational Linguistics. Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics, Utrecht University, The Netherlands, 1993.

EHSH93
Paul Ekman, Thomas Huang, Terrence Sejnowski, and Joseph Hager. Final report to NSF of the planning workshop on facial expression understanding (July 30 to August 1, 1992). Technical report, University of California, San Francisco, March 1993.

EL90
L. Erman and V. Lesser. The Hearsay-II speech understanding system: A tutorial. In Readings in Speech Recognition, pages 235--245. Morgan Kaufmann, 1990.

Erb75
N. P. Erber. Auditory-visual perception of speech. Journal of Speech and Hearing Disorders, 40:481--492, 1975.

ESC94
European Speech Communication Association. Proceedings of the Second ESCA/IEEE Workshop on Speech Synthesis, New Paltz, New York, September 1994.

FCF92
A. U. Frank, I. Campari, and U. Formentini, editors. Proceedings of the International Conference GIS---From Space to Territory: Theories and Methods of Spatio-Temporal Reasoning, number 639 in Springer Lecture Notes in Computer Science, Pisa, Italy, September 1992. Springer-Verlag.

Fin86
Kathleen Finn. An Investigation of Visible Lip Information to be Used in Automatic Speech Recognition. PhD thesis, Georgetown University, 1986.

Fis68
C. G. Fisher. Confusions among visually perceived consonants. Journal of Speech and Hearing Research, 11:796--804, 1968.

FKK92
A. G. Fraser, C. R. Kalmanek, A. E. Kaplan, W. T. Marshall, and R. C. Restrick. XUNET 2: A nationwide testbed in high-speed networking. In INFOCOM 92, Florence, Italy, May 1992.

Fla92
J. L. Flanagan. Technologies for multimedia information systems. Transactions, Institute of Electronics, Information and Communication Engineers, 75(2):164--178, February 1992.

Fla94
J. L. Flanagan. Technologies for multimedia communications. Proceedings of the IEEE, 82(4):590--603, April 1994.

FM93
S. K. Feiner and K. R. McKeown. Automating the generation of coordinated multimedia explanations. In Maybury [May93], pages 117--138.

FSJ93
J. L. Flanagan, A. C. Surendran, and E. E. Jan. Spatially selective sound capture for speech and audio processing. Speech Communication, 13:207--222, 1993.

Fur89
Sadaoki Furui. Digital Speech Processing, Synthesis, and Recognition. Marcel Dekker, New York, 1989.

GGMCB94
B. Le Goff, T. Guiard-Marigny, M. Cohen, and C. Benoît. Real-time analysis-synthesis and intelligibility of talking faces. In ESCA [ESC94], pages 53--56.

GGP92
Oscar Garcia, Alan Goldschen, and Eric Petajan. Feature extraction for optical automatic speech recognition or automatic lipreading. Technical Report GWU-IIST-92-32, The George Washington University, Department of Electrical Engineering and Computer Science, November 1992.

GGP94
Alan Goldschen, Oscar Garcia, and Eric Petajan. Continuous optical automatic speech recognition. In Asilomar [Asi94].

GMAB94
T. Guiard-Marigny, A. Adjoudani, and C. Benoît. A 3-D model of the lips for visual speech synthesis. In ESCA [ESC94], pages 49--52.

Gol93
Alan Goldschen. Continuous Automatic Speech Recognition by Lipreading. PhD thesis, The George Washington University, Washington, DC, 1993.

Her86
A. Herskovits. Language and Spatial Cognition. Cambridge University Press, New York, 1986.

HL94
C. Henton and P. Litwinowicz. Saying and seeing it with feeling: Techniques for synthesizing visible, emotional speech. In ESCA [ESC94], pages 73--76.

HM93
A. G. Hauptmann and P. McAvinney. Gestures with speech for graphic manipulation. International Journal of Man-Machine Studies, 38(2):231--249, February 1993.

HPS94
Marcus Hennecke, K. Prasad, and David Stork. Using deformable templates to infer visual speech dynamics. In Asilomar [Asi94].

ICP93
Bulletin de la communication parlée, 2, January 1993. Université Stendhal, Grenoble, France.

Kar90
H. Karlgren, editor. Proceedings of the 13th International Conference on Computational Linguistics, Helsinki, 1990. ACL.

Kei68
W. D. Keidel. Information processing by sensory modalities in man. In Cybernetic Problems in Bionics, pages 277--300. Gordon and Breach, 1968.

LA93
A. Lascarides and N. Asher. Temporal interpretation, discourse relations and commonsense entailment. Linguistics and Philosophy, 16(5):437--493, 1993.

Lig93
Gérard Ligozat. Models for qualitative spatial reasoning. In Anger et al. [AGvB93], pages 35--45.

MA94
M. W. Mak and W. G. Allen. Lip-motion analysis for speech segmentation in noise. Speech Communication, 14:279--296, 1994.

Mas87
D. W. Massaro. Speech Perception by Ear and Eye: A Paradigm for Psychological Inquiry. Lawrence Erlbaum, Hillsdale, New Jersey, 1987.

May93
M. T. Maybury, editor. Intelligent Multimedia Interfaces. AAAI Press, Menlo Park, California, 1993.

McD82
D. McDermott. A temporal logic for reasoning about processes and plans. Cognitive Science, 6:101--155, 1982.

McK94
P. McKevitt. The integration of natural language and vision processing. Artificial Intelligence Review Journal, 8:1--3, 1994. Special volume.

MF91
D. M. Mark and A. U. Frank, editors. Cognitive and Linguistic Aspects of Geographic Space, Dordrecht, 1991. NATO Advanced Studies Institute, Kluwer.

MM76
Harry McGurk and John MacDonald. Hearing lips and seeing voices. Nature, 264:746--748, December 23/30 1976.

MP91
Kenji Mase and Alex Pentland. Automatic lipreading by optical flow analysis. Systems and Computers in Japan, 22(6):67--76, 1991.

MS88
M. Moens and M. J. Steedman. Temporal ontology and temporal reference. Computational Linguistics, 14(2):15--28, 1988.

MTPT88
N. Magnenat-Thalmann, E. Primeau, and D. Thalmann. Abstract muscle action procedures for human face animation. Visual Computer, 3:290--297, 1988.

MTS92
Joseph Mariani, D. Teil, and O. Da Silva. Gesture recognition. Technical Report LIMSI Report, Centre National de la Recherche Scientifique, Orsay, France, December 1992.

Nak88
A. Nakhimovsky. Aspect, aspectual class, and the temporal structure of narrative. Computational Linguistics, 14(2):29--43, 1988.

NB93
B. Nebel and H-J. Bürckert. Reasoning about temporal relations: A maximal tractable subclass of Allen's interval algebra. Technical Report RR-93-11, DFKI, Saarbrücken, Germany, 1993.

Neu89
Bernd Neumann. Natural language description of time-varying scenes. In D. Waltz, editor, Semantic Structures, pages 167--207. Lawrence Erlbaum, Hillsdale, New Jersey, 1989.

NH88
A. Netravali and B. Haskell. Digital Pictures. Plenum Press, New York, 1988.

Nis86
Nishida. Speech recognition enhancement by lip information. ACM SIGCHI Bulletin, 17(4):198--204, April 1986.

Par74
F. I. Parke. A Parametric Model for Human Faces. PhD thesis, University of Utah, Department of Computer Sciences, 1974.

PB81
S. M. Platt and N. I. Badler. Animating facial expressions. Computer Graphics, 15(3):245--252, 1981.

PBBB88
Eric Petajan, Bradford Bischoff, David Bodoff, and N. Michael Brooke. An improved automatic lipreading system to enhance speech recognition. CHI 88, pages 19--25, 1988.

PBV94
Catherine Pelachaud, Norman Badler, and Marie-Luce Viaud. Final report to NSF of the standards for facial animation workshop. Technical report, University of Pennsylvania, Philadelphia, October 1994.

Pet84
Eric Petajan. Automatic Lipreading to Enhance Speech Recognition. PhD thesis, University of Illinois at Urbana-Champaign, 1984.

PF91
Christine Podilchuk and Nariman Farvardin. Perceptually based low bit rate video coding. In Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing, volume 4, pages 2837--2840, Toronto, May 1991. Institute of Electrical and Electronics Engineers.

Pie61
J. R. Pierce. Symbols, Signals and Noise. Harper and Row, New York, 1961.

PJN90
C. Podilchuk, N. S. Jayant, and P. Noll. Sparse codebooks for the quantization of non-dominant sub-bands in image coding. In Proceedings of the 1990 International Conference on Acoustics, Speech, and Signal Processing, pages 2101--2104, Albuquerque, New Mexico, April 1990. Institute of Electrical and Electronics Engineers.

PM89
Alex Pentland and Kenji Mase. Lip reading: Automatic visual recognition of spoken words. Technical Report MIT Media Lab Vision Science Technical Report 117, Massachusetts Institute of Technology, January 1989.

Rab89
L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257--286, February 1989.

RM94
Ram Rao and Russell Mersereau. Lip modeling for visual speech recognition. In Asilomar [Asi94].

RMS92
David B. Roe, Pedro J. Moreno, Richard W. Sproat, Fernando C. N. Pereira, Michael D. Riley, and Alejandro Macarron. A spoken language translator for restricted-domain context-free languages. Speech Communication, 11:311--319, 1992. System demonstrated by AT&T Bell Labs and Telefónica de España, VEST, World's Fair Exposition, Barcelona, Spain.

RS78
Lawrence R. Rabiner and Ronald W. Schafer. Digital Processing of Speech Signals. Signal Processing. Prentice-Hall, Englewood Cliffs, New Jersey, 1978.

Sal90
M. W. Salisbury. Talk and draw: Bundling speech and graphics. IEEE Computer, pages 59--65, August 1990.

Sil93
Peter Silsbee. Computer Lipreading for Improved Accuracy in Automatic Speech Recognition. PhD thesis, The University of Texas at Austin, 1993.

Sil94
Peter Silsbee. Sensory integration in audiovisual automatic speech recognition. In Asilomar [Asi94].

Smi89
Steve Smith. Computer lip reading to augment automatic speech recognition. Speech Tech, pages 175--181, 1989.

SP54
W. H. Sumby and I. Pollack. Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America, 26:212--215, 1954.

SS93
J. Schirra and E. Stopp. ANTLIMA---a listener model with mental images. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 175--180, Chambéry, France, August 1993.

STHN90
M. Saintourens, M. H. Tramus, H. Huitric, and M. Nahas. Creation of a synthetic face speaking in real time with a synthetic voice. In Gérard Bailly and Christian Benoît, editors, Proceedings of the First ESCA Workshop on Speech Synthesis, pages 249--252, Autrans, France, 1990. European Speech Communication Association.

Sum79
Q. Summerfield. Use of visual information for phonetic perception. Phonetica, 36:314--331, 1979.

Sum87
Quentin Summerfield. Some preliminaries to a comprehensive account of audio-visual speech perception. In Barbara Dodd and Ruth Campbell, editors, Hearing by Eye: The Psychology of Lipreading, pages 3--51. Lawrence Erlbaum, Hillsdale, New Jersey, 1987.

SWL92
David Stork, Greg Wolff, and Earl Levine. Neural network lipreading system for improved speech recognition. In Proceedings of the 1992 International Joint Conference on Neural Networks, Baltimore, Maryland, 1992.

Van86
C. Vandeloise. L'espace en français: sémantique des prépositions spatiales. Seuil, Paris, 1986.

Vil94
L. Vila. A survey on temporal reasoning in artificial intelligence. AICOM, 7(1):4--28, 1994.

VY92
M. L. Viaud and H. Yahia. Facial animation with wrinkles. In Proceedings of the 3rd Workshop on Animation, Eurographics '92, Cambridge, England, 1992.

WAF93
Wolfgang Wahlster, Elisabeth André, W. Finkler, H.-J. Profitlich, and Thomas Rist. Plan-based integration of natural language and graphics generation. Artificial Intelligence, pages 387--427, 1993.

Wah89
W. Wahlster. One word says more than a thousand pictures. On the automatic verbalization of the results of image sequence analysis systems. Computers and Artificial Intelligence, 8:479--492, 1989.

Wai93
A. Waibel. Multimodal human-computer interaction. In Eurospeech '93, Proceedings of the Third European Conference on Speech Communication and Technology, volume Plenary, page 39, Berlin, September 1993. European Speech Communication Association.

Wat87
K. Waters. A muscle model for animating three-dimensional facial expression. In Proceedings of Computer Graphics, volume 21, pages 17--24, 1987.

Web88
B. L. Webber. Tense as discourse anaphor. Computational Linguistics, 14(2):61--73, 1988.

WMJB83
W. Wahlster, H. Marburger, A. Jameson, and S. Busemann. Over-answering yes-no questions: Extended responses in a NL interface to a vision system. In Proceedings of the 8th International Joint Conference on Artificial Intelligence, pages 643--646, Karlsruhe, 1983.

WRLG90
J. Wilpon, L. Rabiner, C. Lee, and E. Goldman. Automatic recognition of key words in unconstrained speech using hidden Markov models. IEEE Transactions on Acoustics, Speech and Signal Processing, 38(11):1870--1878, 1990.

YGS89
Ben Yuhas, Moise Goldstein, and Terrence Sejnowski. Integration of acoustic and visual speech signals using neural networks. IEEE Communications Magazine, pages 65--71, 1989.

YYI92
A. Yamada, T. Yamamoto, H. Ikeda, T. Nishida, and S. Doshita. Reconstructing spatial images from natural language texts. In Proceedings of the 14th International Conference on Computational Linguistics, pages 1279--1283, Nantes, France, August 1992. ACL.