Creating and using natural speech prompts.
Drag and arrange states onto the canvas so that you have the following setup.
RAD can substitute any recorded voice for the computer's synthetic speech.
The recorded speech can be aligned with the animated agent so that lip, face
and body movements are in sync. The process of creating a natural speech
prompt involves four steps:
1. Record yourself speaking the prompt.
2. Transcribe your prompt in the text box.
3. Align the audio and text transcription.
4. Save the resulting sound file (the file suffix is .sob, for sound object).
Double-click the "natural_speech" state to open its configuration dialogue.
Select the "Recorded" tab to bring the recorded speech settings to the front.
Select "Edit" to edit a new sound object. This will open BaldiSync. BaldiSync
allows you to record your voice, enter your transcription, align and save the file.
Now you must record yourself saying "Can I borrow your towel? My car just hit a water
buffalo". In order to record, select the record button, speak, then select the stop
button. If your first recording is not satisfactory, you can simply keep re-recording
until you are pleased with the quality of the utterance. You should try to keep from
having too much trailing silence or noise in the recording (see tips below). Once
you've recorded your speech, the display should resemble this:
Some common problems encountered in recording speech for alignment are:

1. Speech is too loud.
Speaking too loudly or too close to the mic will produce a scratchy-sounding recording.
Note that the peaks in the sound energy display are clipped and appear to extend past
the boundaries of the display window. Move the microphone further from your mouth
and speak more softly.

2. Too much silence.
You should begin speaking immediately after selecting the record button, and stop
recording directly after you've stopped speaking. The record button can be selected a
second time to stop recording.
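
Both problems above can also be checked numerically outside of BaldiSync. The following is a minimal Python sketch, not part of RAD or BaldiSync. It assumes you have a copy of the recording as a 16-bit mono PCM WAV file (it does not read .sob files). The function names, the 99% clipping margin and the silence threshold of 500 are illustrative assumptions rather than values taken from the toolkit.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Read a 16-bit mono PCM WAV file; return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return list(samples), rate


def clipped_fraction(samples, full_scale=32767, margin=0.99):
    """Fraction of samples at or near full scale (a sign of clipping)."""
    if not samples:
        return 0.0
    threshold = full_scale * margin  # assumed margin, tune as needed
    return sum(1 for s in samples if abs(s) >= threshold) / len(samples)


def edge_silence(samples, rate, threshold=500):
    """Return (leading, trailing) silence in seconds.

    A sample counts as silent when its magnitude is at or below
    `threshold` (an assumed value). If nothing exceeds the threshold,
    the whole recording is reported as leading silence.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return len(samples) / rate, 0.0
    leading = loud[0] / rate
    trailing = (len(samples) - 1 - loud[-1]) / rate
    return leading, trailing
```

For example, after `samples, rate = read_pcm16("prompt.wav")` (a hypothetical file name), a `clipped_fraction(samples)` well above zero suggests moving the microphone back, and large values from `edge_silence(samples, rate)` suggest re-recording or trimming.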
An ideal recording will have no clipping and very little leading and trailing silence.
It is possible to delete leading and trailing silence from a recording; consult the
BaldiSync documentation for details. Below is an example of an ideal recording:

Once you are satisfied with the quality of your recording, select the "Align" button
to align your speech with the text that you entered. The computer will generate some
phonetic and word labels that are displayed below the wave. Select "Animate" for a
preview of how your speech will be synchronized with the animated agent. The alignment
boundaries are adjustable; details can be found in the BaldiSync documentation.
Note that the name of the sob file now appears in the "File" field of the "Recorded"
tab. Make sure that "recorded" is selected for the prompt type, then select "Ok" to close
the dialogue.