Object Tabs
Each object (also referred to below as a state) has a set of properties that control its behavior. Unlike the "Preferences" dialog, which controls the entire dialogue, these properties apply only to that object and override the global settings for it. An object's properties are divided into a series of tabs, and the tabs available depend on the type of object. The tabs explained below are all of the tabs available across all object types.

TTS Tab

Prompt
Enter the text for speech synthesis. You may include variables within the prompt window; substitution occurs at run time, and the speech is synthesized from the text at run time. Alternatively, you may select "-> Rec" to generate a sound object from the TTS prompt. This processes the text immediately and reduces processing delays at run time. In addition, creating a sound object allows you to manually edit the visual alignment using BaldiSync.
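The run-time substitution described above can be pictured with a small Python sketch. This is only an analogy, not RAD's own mechanism: the `$name` placeholder syntax, the function `expand_prompt`, and the variable names are all assumptions made for illustration.

```python
from string import Template


def expand_prompt(prompt_text, variables):
    """Replace $name placeholders in a prompt with run-time values.

    Hypothetical helper: models the idea of prompt variable
    substitution, not the toolkit's actual implementation.
    """
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(prompt_text).safe_substitute(variables)


if __name__ == "__main__":
    text = "Hello $user_name, you have $count new messages."
    print(expand_prompt(text, {"user_name": "Maria", "count": 3}))
```

Because the text is only expanded when the prompt is reached, the same prompt can speak different values on each pass through the dialogue.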

"Save" and "Load" allow you to save and load text into the prompt box from a file.

TTS Parameters
Name/Language/Dialect/Gender: Allows the user to select among the available speech synthesizers based on these criteria. Leave the fields blank to use the default synthetic voice. The * option indicates no preference; if the * options are used, all available speech synthesizer names are displayed in the Names combo box. To change the language, select a new language from the menu. If available, you may also specify the dialect and gender for that language.

If a specific voice is selected in this tab, then the sliders can be used to modify the pitch, range and speed of the TTS speech.

Captioning
Show Captioning: Turns the captioning window on / off during run time. The captioning window displays closed captioning of the text to speech in a small top level window during run time.

Save Geometry: Using your mouse, configure the captioning window for location and size. When the dialog reaches this state at run time, the captioning window will assume this geometry. Unless otherwise specified, this new geometry will be maintained for downstream objects.


Recorded Tab

Prompt
A recorded prompt can be used to make your dialog run faster. It also lets you use your own voice in lieu of synthetic speech. Recorded prompts utilize a sound object, which contains the audio, text and visual alignment information necessary to produce animated speech. The prompt window displays the text transcription of the sound object. If you wish to use TTS for a prompt that you have already configured as recorded, select "<- TTS" to change the prompt type and copy the text to the TTS tab. As with the TTS tab, you can "Save" and "Load" text into the prompt box using the respective buttons.

File
The file that contains a recorded sound object (sob). If a file is selected, the path to that file is displayed in this entry, e.g. "c:/applications/prompts/hello.sob". To edit an existing sob, select "Edit". Use "Select" to browse for a sob to associate with this state. "Play" plays the selected sob, and "Clear" empties the entry field.

It is possible to place a variable name in the file field, as long as that variable will contain the path to a sob file at run time. In this case, the text associated with the prompt will not be visible in the prompt area.

Captioning
Show Captioning: Turns the captioning window on / off during run time. The captioning window displays closed captioning of the text to speech in a small top level window during run time.

Save Geometry: Using your mouse, configure the captioning window for location and size. When the dialog reaches this state at run time, the captioning window will assume this geometry. Unless otherwise specified, this new geometry will be maintained for downstream objects.


Recognition Tab

Recognizer
Name/Language/Dialect/Sample Rate/Description
Allows the user to select among the available speech recognizers based on the above criteria. Use the * option to indicate no preference. Note: All recognizer names are displayed in the Names window if the * options are used. The recognizer settings below are NOT calibrated between different recognizers. You may need to adjust the settings when changing between recognizers to achieve the desired performance.

Enable Remote Review
Allows a RAD user within the Domain to review the dialog in real time, including audio output. The reviewer can override the recognition using the recognition results window.

Out of Vocabulary Rejection Median
Determines the recognition confidence required to reject an utterance as being "out of the recognition vocabulary." A lower number rejects more and a higher number rejects less. This makes a high number more forgiving of incorrect pronunciations.

Recommend 9 for 16 kHz adult recognizer
Recommend 22 for 8 kHz adult recognizer

Determines the recognizer's sensitivity when spotting recognition vocabulary within an utterance. A low number spots less and a high number spots more.

Recommend 9 for 16 kHz adult recognizer
Recommend 22 for 8 kHz adult recognizer

Grammar Garbage Threshold
Rejection setting for Grammar Type recognizers.

Repair
The speech recognizer can only choose between words available in the recognition vocabulary. The dialogue will branch to the recognition port that contains the closest matching word or phrase. With repair turned off, the recognizer is forced to decide between the available vocabulary regardless of confidence score.

However, with repair engaged, the recognizer is allowed to reject all available vocabulary when it is not confident about matching a word or phrase. This is called "out of vocabulary rejection." Selecting repair provides an automatic connection to a pre-determined subdialogue when the user says something that is "out of vocabulary." This is similar to adding *any to a recognition port, except that branching to the repair subdialogue happens automatically.

Editing Default Repair
The default repair subdialogue first says "sorry." If invoked a second time, it says, "I still don't understand." The third time it says, "I give up," and terminates the application. You can edit the default repair subdialogue just like any other RAD application: from the main menu, select View -> Repair Default.

Selecting an alternative repair subdialogue
Alternatively you can specify any subdialogue in your application as "Repair" using the combo box. The specified subdialogue will be invoked when "out of vocabulary rejection" occurs. The application will repeat the calling object after exiting the repair.
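The escalation performed by the default repair subdialogue can be sketched as follows. This is a minimal Python model, not RAD code; the function name `repair_step` and the idea of passing in an invocation count are assumptions, and the source does not say how or where that count is kept or reset.

```python
# Prompts spoken by the default repair subdialogue, in order of invocation.
REPAIR_PROMPTS = ["sorry", "I still don't understand", "I give up"]


def repair_step(times_invoked):
    """Return (prompt, terminate) for the Nth invocation of repair (1-based).

    Hypothetical helper that models the three-step escalation only.
    """
    if times_invoked < 1:
        raise ValueError("times_invoked must be 1 or greater")
    # Clamp to the last prompt so later invocations still "give up".
    index = min(times_invoked, len(REPAIR_PROMPTS)) - 1
    terminate = times_invoked >= len(REPAIR_PROMPTS)
    return REPAIR_PROMPTS[index], terminate


if __name__ == "__main__":
    for attempt in (1, 2, 3):
        print(attempt, repair_step(attempt))
```

On the first two invocations the dialogue returns to the calling object; only the third ends the application.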


DTMF Tab

DTMF
DTMF (Dual Tone Multi-Frequency) tones are the tones generated by a touch-tone telephone. This tab specifies the DTMF parameters for the current state and applies only to telephony applications.

Mode
Local and typeahead are not currently documented.

Output Variable
Specifies the name of the variable within the "User" environment that will contain the DTMF response.

Terminating Conditions
Specifies a DTMF selection that will end the DTMF recognition for that state.

Maximum Number of Tones
Specifies a number of DTMF selections that will end DTMF recognition for a state.

Timeout (msec)
Specifies the number of milliseconds before ending the DTMF recognition for that state.
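The three ways DTMF recognition can end for a state (a terminating tone, the maximum number of tones, or the timeout) can be modelled with a short Python sketch. It is illustrative only: the function `collect_dtmf`, its event format, and its default values are assumptions, as are the choices to measure the timeout from the start of collection and to leave the terminating tone out of the result.

```python
def collect_dtmf(events, terminators=("#",), max_tones=4, timeout_ms=5000):
    """Collect tones until one of the three end conditions is met.

    events: iterable of (time_ms, tone) pairs in ascending time order,
            with time measured from the start of collection (assumption).
    Returns (digits, reason), where reason names the condition that fired.
    """
    digits = []
    for time_ms, tone in events:
        if time_ms >= timeout_ms:
            # The tone arrived too late; the timeout has already ended collection.
            return "".join(digits), "timeout"
        if tone in terminators:
            # Assumption: the terminating tone itself is not stored.
            return "".join(digits), "terminator"
        digits.append(tone)
        if len(digits) >= max_tones:
            return "".join(digits), "max_tones"
    # No more key presses: a live system would wait until the timeout expires.
    return "".join(digits), "timeout"


if __name__ == "__main__":
    print(collect_dtmf([(100, "1"), (300, "2"), (600, "#")]))
    print(collect_dtmf([(100, "1"), (6000, "2")]))
```

Whichever condition is met first ends recognition, so a short timeout can pre-empt a terminating tone that the caller has not yet pressed.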

Vocabulary Mappings
Not currently documented.

Grammar Mappings
Not currently documented.


Misc Tab

Audio Parameters
Maximum Record Duration
Specifies the maximum length of time the speech recognizer will record the user's utterance.

Leading Silence Duration
Specifies the maximum length of time the speech recognizer will continue to record if it is detecting only silence.

Trailing Silence Duration
Specifies the maximum length of time the recognizer will continue to record after the user stops speaking. Once sound has been detected and is followed by silence, the recognizer continues to record for the duration of this setting. The default value may need to be increased when the user is expected to say something that contains natural pauses, such as a telephone number (1 503, pause, 246, pause, 1342); otherwise the speaker may be cut off prematurely.

Record Backoff
Specifies the length of time between the beep and start of recording.

Voice Detection Threshold
Set this value using the microphone calibration option. The VDT specifies the minimum sound threshold above which speech will be detected.
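How the audio parameters above interact can be sketched with a simple endpointing model in Python. This is not the recognizer's actual algorithm: the function `endpoint`, the fixed 10 ms frame size, the per-frame sound levels, and all default values are assumptions chosen to show the effect of each setting.

```python
def endpoint(levels, vdt, frame_ms=10, leading_ms=3000,
             trailing_ms=500, max_record_ms=10000):
    """Decide when recording stops for a sequence of per-frame sound levels.

    A frame counts as speech when its level is above the voice
    detection threshold (vdt). Returns (stop_time_ms, reason).
    """
    speech_started = False
    silence_ms = 0   # length of the current run of silent frames
    elapsed_ms = 0
    for level in levels:
        elapsed_ms += frame_ms
        if level > vdt:
            speech_started = True
            silence_ms = 0
        else:
            silence_ms += frame_ms
        if not speech_started and silence_ms >= leading_ms:
            return elapsed_ms, "leading_silence"
        if speech_started and silence_ms >= trailing_ms:
            return elapsed_ms, "trailing_silence"
        if elapsed_ms >= max_record_ms:
            return elapsed_ms, "max_duration"
    return elapsed_ms, "end_of_input"


if __name__ == "__main__":
    # 500 ms of speech, a 600 ms pause, 500 ms of speech, then 1 s of silence.
    levels = [30] * 50 + [2] * 60 + [30] * 50 + [2] * 100
    print(endpoint(levels, vdt=10, trailing_ms=500))  # stops inside the pause
    print(endpoint(levels, vdt=10, trailing_ms=800))  # waits for the real end
```

With the shorter trailing silence the 600 ms pause ends the recording before the second group of digits; raising the setting above the longest expected pause lets the whole utterance through.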

Playback
From object
Plays back a wave object as the response to this state instead of accepting live input. This way, a prompt that was recorded in the same dialogue can be re-used. The name of the wave object should be placed in the entry box.

From file
Plays back a wave file from disk as the response to this state instead of accepting live input. The "Browse" button allows the user to browse for a wave file.


On Enter / On Exit Tabs

These tabs contain Tcl/Tk code to be executed before entering and after exiting the state. This code is evaluated in the "User" Tcl interpreter. All of the objects in RAD execute their code in this interpreter, so their variables are accessible to one another.

For more information on Tcl/Tk, consult our Tcl/Tk links page or the reference of your choice.


Tucker-Maxon Tab

The Tucker-Maxon package includes various media and education related objects, and a data capture facility.

Dynamic Recognition Adjustment
Changes the "out of vocabulary rejection median" setting during a dialog based on recognition performance. This is used primarily for applications that teach speech production. Selecting Dynamic Rejection engages this feature so that mis-recognitions make the recognizer more forgiving and successful recognitions make the recognizer more discriminating.

Movement
Indicates the amount by which the "out of vocabulary rejection median" will change once the trigger level is reached.

Trigger
Indicates the number of mis-recognitions or recognitions required to change the "out of vocabulary rejection median" (OVRM) setting by the amount set with the "movement" slider. For example, if the "trigger" is set to 5 and the "movement" is set to 3, the OVRM will decrease by 3 points (more discriminating) when 5 recognitions are made. Conversely, the OVRM will increase by 3 points (more forgiving) when 5 mis-recognitions are made in a row.

A mis-recognition is an instance where the recognizer rejects all available vocabulary in favor of "garbage". This occurs when the recognizer is not confident that what the user said matches the available words.
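The interplay of "trigger" and "movement" can be sketched in Python as follows. This is a hedged model rather than the package's implementation: the class `DynamicRejection` and its method names are hypothetical, both counters are assumed to count consecutive results (the text above says "in a row" only for mis-recognitions), and no upper or lower limit is applied to the OVRM because none is documented here.

```python
class DynamicRejection:
    """Model of adjusting the OVRM from recent recognition results."""

    def __init__(self, ovrm, trigger=5, movement=3):
        self.ovrm = ovrm
        self.trigger = trigger
        self.movement = movement
        self._hits = 0    # consecutive successful recognitions
        self._misses = 0  # consecutive mis-recognitions

    def record(self, recognized):
        """Record one result and return the (possibly updated) OVRM."""
        if recognized:
            self._hits += 1
            self._misses = 0
            if self._hits >= self.trigger:
                self.ovrm -= self.movement  # lower value: more discriminating
                self._hits = 0
        else:
            self._misses += 1
            self._hits = 0
            if self._misses >= self.trigger:
                self.ovrm += self.movement  # higher value: more forgiving
                self._misses = 0
        return self.ovrm


if __name__ == "__main__":
    adjuster = DynamicRejection(ovrm=9, trigger=5, movement=3)
    for _ in range(5):
        adjuster.record(False)
    print(adjuster.ovrm)
```

Starting from an OVRM of 9 with a trigger of 5 and a movement of 3, five mis-recognitions in a row raise the value by 3, and five successful recognitions afterwards bring it back down.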