
5.5 References

ACL86
Association for Computational Linguistics. Proceedings of the 24th Annual Meeting of the Association for Computational Linguistics, Columbia University, New York, June 1986.

AHK87
Jonathan Allen, M. Sharon Hunnicutt, and Dennis Klatt. From text to speech: the MITalk system. MIT Press, Cambridge, Massachusetts, 1987.

ANL92
Proceedings of the Third Conference on Applied Natural Language Processing, Trento, Italy, March 1992.

ANSK90
M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara. Voice conversion through vector quantization. Journal of the Acoustical Society of Japan, E-11:71--76, 1990.

AR82
Bishnu S. Atal and J. R. Remde. A new model of LPC excitation for producing natural-sounding speech at low bit rates. In Proceedings of the 1982 International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 614--617. Institute of Electrical and Electronics Engineers, May 1982.

BB90
Gérard Bailly and Christian Benoît, editors. Proceedings of the First ESCA Workshop on Speech Synthesis, Autrans, France, 1990. European Speech Communication Association.

BB92
G. Bailly and C. Benoît, editors. Talking Machines: Theories, Models, and Designs. Elsevier Science, 1992.

BF90
Joan Bachenko and Eileen Fitzpatrick. A computational grammar of discourse-neutral prosodic phrasing in English. Computational Linguistics, 16:155--170, 1990.

BFOS84
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. Classification and Regression Trees. Wadsworth & Brooks, Pacific Grove, California, 1984.

BS87
K. Bartkova and C. Sorin. A model of segmental duration for speech synthesis in French. Speech Communication, 6:245--260, 1987.

But75
B. Butterworth. Hesitation and semantic planning in speech. Journal of Psycholinguistic Research, 4:75--87, 1975.

Cam92
W. N. Campbell. Syllable-based segmental duration. In Bailly and Benoît [BB92], pages 211--224.

Car83
Sandra Carberry. Tracking user goals in an information-seeking environment. In Proceedings of the Third National Conference on Artificial Intelligence, pages 59--63, Washington, DC, August 1983.

Caw93
Alison Cawsey. Explanation and Interaction: The Computer Generation of Explanatory Dialogues. MIT Press, 1993.

CCC94
Jennifer Chu-Carroll and Sandra Carberry. A plan-based model for response generation in collaborative task-oriented dialogues. In Proceedings of the National Conference on Artificial Intelligence, pages 799--805, Menlo Park, California, 1994. AAAI Press.

CCL90
Cecil Coker, Kenneth Church, and Mark Liberman. Morphology and rhyming: Two powerful alternatives to letter-to-sound rules for speech synthesis. In Bailly and Benoît [BB90], pages 83--86.

CCZ92
Jyun-Shen Chang, Shun-De Chen, Ying Zheng, Xian-Zhong Liu, and Shu-Jin Ke. Large-corpus-based methods for Chinese personal name recognition. Journal of Chinese Information Processing, 6(3):7--15, 1992.

CG86
R. Carlson and B. Granström. A search for durational rules in a real-speech data base. Phonetica, 43:140--154, 1986.

Chu85
Kenneth Church. Stress assignment in letter to sound rules for speech synthesis. In Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, pages 246--253, University of Chicago, July 1985. Association for Computational Linguistics.

Chu88
Kenneth Church. A stochastic parts program and noun phrase parser for unrestricted text. In Proceedings of the Second Conference on Applied Natural Language Processing, pages 136--143, Austin, Texas, 1988. ACL.

CI91
W. N. Campbell and S. D. Isard. Segment durations in a syllable frame. Journal of Phonetics, 19:37--47, 1991.

CJS89
Robin Cohen, Marlene Jones, Amar Sanmugasunderam, Bruce Spencer, and Lisa Dent. Providing responses specific to a user's goals and background. International Journal of Expert Systems, 2(2):135--162, 1989.

CL92
Keh-Jiann Chen and Shing-Huan Liu. Word identification for Mandarin Chinese sentences. In COLING [COL92], pages 101--107.

COL88
Proceedings of the 12th International Conference on Computational Linguistics, Budapest, 1988.

COL92
ACL. Proceedings of the 14th International Conference on Computational Linguistics, Nantes, France, August 1992.

Dal89
Robert Dale. Cooking up referring expressions. In Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, pages 68--75, Vancouver, British Columbia, June 1989. Association for Computational Linguistics.

DH88
J. R. Davis and J. Hirschberg. Assigning intonational features in synthesized spoken directions. In Proceedings of the 26th Annual Meeting of the Association for Computational Linguistics, pages 187--193, SUNY, Buffalo, New York, June 1988. Association for Computational Linguistics.

DHRS92
R. Dale, E. H. Hovy, D. Rösner, and O. Stock, editors. Aspects of Automated Natural Language Generation. Number 587 in Lecture Notes in AI. Springer-Verlag, Heidelberg, 1992.

DL93
T. Dutoit and H. Leich. MBR-PSOLA: Text-to-speech synthesis based on an MBE re-synthesis of the segments database. Speech Communication, 13:432--440, 1993.

dM95
C. d'Alessandro and P. Mertens. Automatic pitch contour stylization using a model of tonal perception. Computer Speech and Language, 9:257--288, 1995.

DMZ90
Robert Dale, Chris S. Mellish, and Michael Zock, editors. Current Research in Natural Language Generation. Academic Press, London, 1990.

DN91
Michael Dedina and Howard Nusbaum. PRONOUNCE: a program for pronunciation by analogy. Computer Speech and Language, 5:55--64, 1991.

DS90
K. J. M. J. De Smedt. IPF: an incremental parallel formulator. In Robert Dale, Chris S. Mellish, and Michael Zock, editors, Current Research in Natural Language Generation. Academic Press, London, 1990.

Elh92
M. Elhadad. Using Argumentation to Control Lexical Choice: A Functional Unification-Based Approach. PhD thesis, Computer Science Department, Columbia University, 1992.

ESC94
European Speech Communication Association. Proceedings of the Second ESCA/IEEE Workshop on Speech Synthesis, New Paltz, New York, September 1994.

FK88
T. Fujisaki and H. Kawai. Realization of linguistic information in the voice fundamental frequency contour of the spoken Japanese. In Proceedings of the 1988 International Conference on Acoustics, Speech, and Signal Processing, pages 663--666, New York, 1988.

FLL85
G. Fant, J. Liljencrants, and Q. Lin. A four parameter model of glottal flow. Speech Transmission Laboratory Quarterly Progress and Status Report, 1985(4):1--13, 1985.

FR73
J. L. Flanagan and L. R. Rabiner, editors. Speech Synthesis. Dowden, Hutchinson & Ross, 1973.

Fud84
Eric Fudge. English Word-Stress. Allen and Unwin, London, 1984.

Fuj92
H. Fujisaki. Modeling the process of fundamental frequency contour generation. In Speech perception, production and linguistic structure, pages 313--326. Ohmsha IOS Press, 1992.

GGG93
Peter C. Gordon, Barbara J. Grosz, and Laura A. Gilliom. Pronouns, names and the centering of attention in discourse. Cognitive Science, 17(3):311--348, 1993.

GN92
B. Granström and L. Nord. Neglected dimensions in speech synthesis. Speech Communication, 11:459--462, 1992.

Gol91
Andrew Golding. Pronouncing Names by a Combination of Case-Based and Rule-Based Reasoning. PhD thesis, Stanford University, 1991.

Gra84
Robert Granville. Controlling lexical substitution in computer text generation. In Proceedings of the 10th International Conference on Computational Linguistics, pages 381--384, Stanford University, California, July 1984. ACL.

Gro77
Barbara J. Grosz. The representation and use of focus in dialogue understanding. Technical Report 151, SRI International, Menlo Park, California, 1977.

GS86
Barbara J. Grosz and Candace L. Sidner. Attention, intention, and the structure of discourse. Computational Linguistics, 12(3):175--204, 1986.

Hal94
Susan M. Haller. Recognizing digressive questions using a model for interactive generation. In Proceedings of the 7th International Workshop on Natural Language Generation, pages 181--188, Kennebunkport, Maine, June 1994.

HCC90
J. 't Hart, R. Collier, and A. Cohen. A perceptual study of intonation. Cambridge University Press, Cambridge, England, 1990.

HF94
Merle Horne and Marcus Filipsson. Computational extraction of lexico-grammatical information for generation of Swedish intonation. In ESCA [ESC94], pages 220--223.

Hin83
Donald Hindle. User manual for Fidditch, a deterministic parser. Technical Memorandum 7590-142, Naval Research Laboratory, 1983.

Hir93
Julia Hirschberg. Pitch accent in context: Predicting intonational prominence from text. Artificial Intelligence, 63:305--340, 1993.

HL93
Julia Hirschberg and Diane Litman. Empirical studies on the disambiguation of cue phrases. Computational Linguistics, 19(3):501--530, September 1993.

Hob93
Jerry R. Hobbs. Intention, information, and structure in discourse. In Proceedings of the NATO Advanced Research Workshop on Burning Issues in Discourse, pages 41--66, Maratea, Italy, 1993.

HP86
Julia Hirschberg and Janet Pierrehumbert. The intonational structuring of discourse. In ACL [ACL86], pages 136--144.

HS80
K. Hakoda and H. Sato. Prosodic rules in connected speech synthesis. Trans. IECE, pages 715--722, 1980.

HY90
Jill House and Nick Youd. Contextually appropriate intonation in speech synthesis. In Bailly and Benoît [BB90], pages 185--188.

ICS92
Proceedings of the 1992 International Conference on Spoken Language Processing, Banff, Alberta, Canada, October 1992. University of Alberta.

IS93
N. Iwahashi and Y. Sagisaka. Duration modeling with multiple split regression. In Eurospeech '93, Proceedings of the Third European Conference on Speech Communication and Technology, volume 1, pages 329--332, Berlin, September 1993. European Speech Communication Association.

Kar90
H. Karlgren, editor. Proceedings of the 13th International Conference on Computational Linguistics, Helsinki, 1990. ACL.

Kem87
Gerard Kempen, editor. Natural Language Generation: Recent Advances in Artificial Intelligence, Psychology, and Linguistics. Kluwer Academic, Boston, Dordrecht, 1987.

KK90
D. H. Klatt and L. C. Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers. Journal of the Acoustical Society of America, 87:820--857, 1990.

KKZ92
Lauri Karttunen, Ronald M. Kaplan, and Annie Zaenen. Two-level morphology with composition. In COLING [COL92], pages 141--148.

Kla80
Dennis H. Klatt. Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America, 67:971--995, 1980.

Kla87
Dennis H. Klatt. Review of text-to-speech conversion for English. Journal of the Acoustical Society of America, 82(3):737--793, 1987.

Kos83
Kimmo Koskenniemi. Two-Level Morphology: a General Computational Model for Word-Form Recognition and Production. PhD thesis, University of Helsinki, 1983. Publications of the Department of General Linguistics, University of Helsinki, No. 11. Helsinki.

KP95
V. Kraft and T. Portele. Quality evaluation of five German speech synthesis systems. Acta Acustica, 3:351--365, 1995.

KTS92a
N. Kaiki, K. Takeda, and Y. Sagisaka. Linguistic properties in the control of segmental duration for speech synthesis. In Bailly and Benoît [BB92], pages 255--264.

KTS92b
Hiroaki Kato, Minoru Tsuzaki, and Yoshinori Sagisaka. Acceptability and discrimination threshold for distortion of duration in Japanese words. In ICSLP [ICS92], pages 507--510.

Lad84
D. Robert Ladd. English compound stress. In Dafydd Gibbon and Helmut Richter, editors, Intonation, Accent and Rhythm, pages 253--266. W. de Gruyter, Berlin, 1984.

LCS93
Ming-Yu Lin, Tung-Hui Chiang, and Keh-Yi Su. A preliminary study on unknown word problem in Chinese word segmentation. In ROCLING 6, pages 119--141. ROCLING, 1993.

Lev89
W. Levelt. Speaking: from intention to articulation. MIT Press, Cambridge, Massachusetts, 1989.

LP77
Mark Liberman and Alan Prince. On stress and linguistic rhythm. Linguistic Inquiry, 8:249--336, 1977.

LP84
M. Liberman and J. B. Pierrehumbert. Intonational invariance under changes in pitch range and length. In Language Sound Structure, pages 157--233. MIT Press, 1984.

LS92
Mark Liberman and Richard Sproat. The stress and structure of modified noun phrases in English. In Anna Szabolcsi and Ivan Sag, editors, Lexical Matters. CSLI (University of Chicago Press), 1992.

LSM93
J. Laroche, Y. Stylianou, and E. Moulines. HNS: Speech modification based on a harmonic + noise model. In Proceedings of the 1993 International Conference on Acoustics, Speech, and Signal Processing, pages 550--553, 1993.

Mae76
S. Maeda. A characterization of American English intonation. PhD thesis, MIT, 1976.

May92
Mark T. Maybury. Communicative acts for explanation generation. International Journal of Man-Machine Studies, 37(2):135--172, 1992.

MC90a
Kathleen F. McCoy and Jeannette Cheng. Focus of attention: Constraining what can be said next. In Cécile L. Paris, William R. Swartout, and William C. Mann, editors, Natural Language Generation in Artificial Intelligence and Computational Linguistics, pages 103--124. Kluwer Academic, Boston, 1990.

MC90b
E. Moulines and F. Charpentier. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication, 9:453--468, 1990.

McC86
Kathleen F. McCoy. The ROMPER system: Responding to object-related misconceptions using perspective. In ACL [ACL86].

McD80
David D. McDonald. Natural Language Production as a Process of Decision Making Under Constraint. PhD thesis, Department of Computer Science and Electrical Engineering, Massachusetts Institute of Technology, 1980.

McD83
David D. McDonald. Description directed control: its implications for natural language generation. In Barbara J. Grosz, Karen Sparck Jones, and B. L. Webber, editors, Readings in Natural Language Processing. Morgan Kaufmann Publishers, Inc., 1983.

McK85
Kathleen R. McKeown. Text Generation: Using Discourse Strategies and Focus Constraints to Generate Natural Language Text. Studies in Natural Language Processing. Cambridge University Press, 1985.

McK88
Kathleen R. McKeown. Generating goal-oriented explanations. International Journal of Expert Systems, 1(4):377--395, 1988.

McR95
Susan McRoy. The repair of speech act misunderstandings by abductive inference. Computational Linguistics, 1995. In press.

MM95a
Megan Moser and Johanna D. Moore. Investigating cue selection and placement in tutorial discourse. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, MIT, 1995. Association for Computational Linguistics.

MM95b
Megan Moser and Johanna D. Moore. Using discourse analysis and automatic text generation to study discourse cue usage. In Proceedings of the AAAI Spring Symposium on Empirical Methods in Discourse Interpretation and Generation, 1995.

MMI94
H. Matsumoto, Y. Maruyama, and H. Inoue. Voice quality conversion based on supervised spectral mapping. Journal of the Acoustical Society of Japan, E, 1994. In press.

Mon90
Alex Monaghan. Rhythm and stress in speech synthesis. Computer Speech and Language, 4:71--78, 1990.

Moo89
Johanna D. Moore. Responding to "Huh?": Answering vaguely articulated follow-up questions. In Proceedings of the Conference on Human Factors in Computing Systems, pages 91--96, Austin, Texas, 1989.

Moo95
Johanna D. Moore. Participating in Explanatory Dialogues: Interpreting and Responding to Questions in Context. MIT Press, 1995.

MP92
Johanna D. Moore and Martha E. Pollack. A problem for RST: The need for multi-level discourse analysis. Computational Linguistics, 18(4):537--544, December 1992.

MP93
Johanna D. Moore and Cécile L. Paris. Planning texts for advisory dialogues: Capturing intentional and rhetorical information. Computational Linguistics, 19(4):651--694, December 1993.

MRT93
K. R. McKeown, J. Robin, and M. Tanenblatt. Tailoring lexical choice to the user's vocabulary in multimedia explanation generation. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 226--234, Ohio State University, 1993. Association for Computational Linguistics.

MS95
E. Moulines and Y. Sagisaka. Voice conversion: State of the art and perspectives. Speech Communication, 16(2), 1995. Guest editors.

NH88
S. Nakajima and H. Hamada. Automatic generation of synthesis units based on context oriented clustering. In Proceedings of the 1988 International Conference on Acoustics, Speech, and Signal Processing, pages 659--662, New York, April 1988. Institute of Electrical and Electronics Engineers.

OGC93
J. P. Olive, A. Greenwood, and J. Coleman. Acoustics of American English Speech, A Dynamic Approach. Springer-Verlag, 1993.

O'S89
D. O'Shaughnessy. Parsing with a small dictionary for applications such as text to speech. Computational Linguistics, 15:97--108, 1989.

Par88
Cécile L. Paris. Tailoring object descriptions to the user's level of expertise. Computational Linguistics, 14(3):64--78, September 1988.

PBH94
J. Pitrelli, M. E. Beckman, and J. Hirschberg. Evaluation of prosodic transcription labelling in the ToBI framework. In Proceedings of the 1994 International Conference on Spoken Language Processing, pages 123--126, Yokohama, Japan, September 1994.

PC92
S. Parthasarathy and C. H. Coker. Automatic estimation of articulatory parameters. Computer Speech and Language, 6:37--75, 1992.

PH90
Janet Pierrehumbert and Julia Hirschberg. The meaning of intonational contours in interpretation of discourse. In Philip R. Cohen, Jerry Morgan, and Martha E. Pollack, editors, Intentions in Communication, pages 271--311. MIT Press, Cambridge, Massachusetts, 1990.

Pie81
J. Pierrehumbert. Synthesizing intonation. Journal of the Acoustical Society of America, 70:985--995, 1981.

Pre95
S. A. Prevost. Intonation, Context and Contrastiveness in Spoken Language Generation. PhD thesis, University of Pennsylvania, Philadelphia, Pa., expected 1995.

PS94
S. A. Prevost and M. J. Steedman. Specifying intonation from context for speech synthesis. Speech Communication, 15(1-2), 1994.

PSM91
C. L. Paris, W. R. Swartout, and William C. Mann, editors. Natural Language Generation in Artificial Intelligence and Computational Linguistics. Kluwer Academic, July 1991.

Rd94
G. Richard and C. d'Alessandro. Time-domain analysis-synthesis of the aperiodic component of speech signals. In Proceedings of the ESCA Workshop on Speech Synthesis, pages 5--8, 1994.

Ril89
Michael Riley. Some applications of tree-based modelling to speech and language. In Proceedings of the Second DARPA Speech and Natural Language Workshop, Cape Cod, Massachusetts, October 1989. Defense Advanced Research Projects Agency.

Ril92
M. D. Riley. Tree-based modeling of segmental durations. In Bailly and Benoît [BB92], pages 265--273.

SB91
K. N. Stevens and C. A. Bickley. Constraints among parameters simplify control of Klatt formant synthesizer. Journal of Phonetics, 19:161--174, 1991.

SHY92
Richard Sproat, Julia Hirschberg, and David Yarowsky. A corpus-based synthesizer. In ICSLP [ICS92], pages 563--566.

Sib92
Penelope Sibun. Generating text without trees. Computational Intelligence, 8(1):102--122, 1992.

Sid79
Candace L. Sidner. Toward a Computational Theory of Definite Anaphora Comprehension in English Discourse. PhD thesis, Massachusetts Institute of Technology, Cambridge, Mass., 1979.

Sil87
Kim Silverman. The structure and processing of fundamental frequency contours. PhD thesis, Cambridge University, Cambridge, England, 1987.

SKIM92
Yoshinori Sagisaka, Nobuyoshi Kaiki, Naoto Iwahashi, and Katsuhiko Mimura. ATR ν-Talk speech synthesis system. In ICSLP [ICS92], pages 483--486.

Spr92
Richard Sproat. Morphology and Computation. MIT Press, Cambridge, Massachusetts, 1992.

Spr94
Richard Sproat. English noun-phrase accent prediction for text-to-speech. Computer Speech and Language, 8:79--94, 1994.

SR87
Terrence Sejnowski and Charles Rosenberg. Parallel networks that learn to pronounce English text. Complex Systems, 1, 1987.

SSGC94
Richard Sproat, Chilin Shih, William Gale, and Nancy Chang. A stochastic finite-state word-segmentation algorithm for Chinese. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, pages 66--73, Las Cruces, New Mexico, 1994. Association for Computational Linguistics.

Str94
T. Strzalkowski. Reversible Grammar in Natural Language Processing. Kluwer Academic Publishers, 1994.

Sun87
J. Sundberg. The science of the singing voice. Northern Illinois University Press, Dekalb, Illinois, 1987.

TAS92
K. Takeda, K. Abe, and Y. Sagisaka. On the basic scheme and algorithms in non-uniform unit speech synthesis. In Bailly and Benoît [BB92], pages 93--105.

TL90
Evelyne Tzoukermann and Mark Y. Liberman. A finite-state morphological processor for Spanish. In Karlgren [Kar90], pages 277--286.

Tra92
C. Traber. F0 generation with a database of natural F0 pattern and with a neural network. In Bailly and Benoît [BB92], pages 287--304.

Ume75
N. Umeda. Vowel duration in American English. Journal of the Acoustical Society of America, 58(2):434--445, 1975.

VBP90
R. Van Bezooijen and L. C. W. Pols. Evaluation of text-to-speech systems: some methodological aspects. Speech Communication, 9:263--270, 1990.

Vit91
Tony Vitale. An algorithm for high accuracy name pronunciation by parametric speech synthesizer. Computational Linguistics, 17:257--276, 1991.

VMT92
H. Valbret, E. Moulines, and J. P. Tubach. Voice transformation using PSOLA. Speech Communication, 11:175--187, 1992.

VS92
J. P. H. Van Santen. Contextual effects on vowel duration. Speech Communication, 11:513--546, 1992.

VS93
J. P. H. Van Santen. Perceptual experiment for diagnostic testing of text-to-speech systems. Computer Speech and Language, 7:49--100, 1993.

VS94
J. P. H. Van Santen. Assignment of segmental duration in text-to-speech synthesis. Computer Speech and Language, 8:95--128, 1994.

VSSOH95
J. Van Santen, R. Sproat, J. Olive, and J. Hirschberg, editors. Progress in Speech Synthesis. Springer-Verlag, New York, 1995.

WH92
Michelle Q. Wang and Julia Hirschberg. Automatic classification of intonational phrase boundaries. Computer Speech and Language, 6:175--196, 1992.

WLC92
Liang-Jyh Wang, Wei-Chuan Li, and Chao-Huang Chang. Recognizing unregistered names for Mandarin word identification. In COLING [COL92], pages 1239--1243.

WM77
L. Witten and P. Madams. The telephone inquiry service: a man-machine system using synthetic speech. International Journal of Man-Machine Studies, 9:449--464, 1977.

WT93
Zimin Wu and Gwyneth Tseng. Chinese text segmentation for text retrieval: Achievements and problems. Journal of the American Society for Information Science, 44(9):532--542, 1993.

Yar94
David Yarowsky. Homograph disambiguation in speech synthesis. In ESCA [ESC94], pages 244--247.

YF79
S. J. Young and F. Fallside. Speech synthesis from concept: a method for speech output from information systems. Journal of the Acoustical Society of America, 66(3):685--695, September 1979.

YTA93
Y. Yamashita, M. Tanaka, Y. Amako, Y. Nomura, Y. Ohta, A. Kitoh, O. Kakusho, and R. Mizoguchi. Tree-based approaches to automatic generation of speech synthesis rules for prosodic parameters. Trans. IEICE, E76-A(11):1934--1941, 1993.