NAME

hmminit.tcl - Model initialization using vector quantization and Viterbi state alignment.


AVAILABILITY

script/hmm_1.0


SYNOPSIS

  hmminit.tcl basename [options]

PARAMETERS

basename
Base name of the HMM model (e.g. digit/timit).

OPTIONS

-modellist modellist [Default = all models]
Specify a subset of the models. Only the parameters for these models will then be updated.
-pickfile filename [Default = segment.pick]
File listing the segments selected for model initialization (see pickdata.tcl).
-numiter int [Default = 10]
Number of iterations for VQ/Viterbi initialization.
-config configfile
Read command line options (configuration info) from this file. The configuration file is in essence a Tcl script that sets the required internal variables, for example:
 set config(init,pickfile) segment.pick
 set config(init,numiter) 10

Values set in the configuration file override the preset defaults. Options that follow -config on the command line in turn override the values set by the configuration script. Command line parameters are set through the param variable:

 set param(init,basename) foo

Because command line parameters are normally mandatory, pass a single "-" character in place of a parameter so that the value defined in the configuration script takes effect.

The configuration file is also used to specify the feature post-processing script:

 set config(feature,script) user.tcl

If this variable is not defined in the configuration file, then only the base features as saved in the feature cache are used during training.

-help
Provides a short description of the command line options.
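
A configuration file for the -config option might look like the sketch below. It uses only the variables named on this page. The file name digit.cfg, the basename value, and the invocation shown in the comment are assumptions made for illustration and are not taken from the toolkit.

```tcl
# digit.cfg: hypothetical configuration file for hmminit.tcl.
# Assumed invocation (the "-" stands in for basename, so the
# param value set below is used instead):
#
#   hmminit.tcl - -config digit.cfg

# Command line parameter.
set param(init,basename) digit

# Options; these replace the preset defaults.
set config(init,pickfile) segment.pick
set config(init,numiter)  10

# Optional feature post-processing script. Leave this out to train
# on the base features saved in the feature cache.
set config(feature,script) user.tcl
```

An option given after -config on the command line, for example -numiter 5, would then take precedence over the value set in the file.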

DESCRIPTION

hmminit.tcl creates initial HMM model parameter estimates. This requires labeled and segmented data. The TIMIT database is typically a good source of segment/label data for building phonetically based models.

Model initialization is a three-step process. First, all data (segments) for a particular class are loaded into memory, based on the information provided by the pickfile (see pickdata.tcl). Each segment is then cut into equal-sized sub-segments, according to the number of states in the model.
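
The equal split in the first step can be sketched as follows. This is a minimal illustration, not the toolkit's code: the procedure name equalSplit is hypothetical, and it assumes one sub-segment per state with frames indexed from 0.

```tcl
# Hypothetical helper: split numFrames frames evenly over numStates states.
# Returns one {first last} pair of 0-based frame indices per state.
proc equalSplit {numFrames numStates} {
    set bounds {}
    for {set s 0} {$s < $numStates} {incr s} {
        # Integer division spreads any remainder over the states.
        set first [expr {($s * $numFrames) / $numStates}]
        set last  [expr {(($s + 1) * $numFrames) / $numStates - 1}]
        lappend bounds [list $first $last]
    }
    return $bounds
}

# A 10-frame segment and a 3-state model.
puts [equalSplit 10 3]   ;# prints: {0 2} {3 5} {6 9}
```

If a segment has fewer frames than the model has states, some pairs come out with first greater than last, meaning that state receives no data from that segment.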

Second, all data allocated to a particular HMM state are combined and the initial mixture mean vectors are estimated using vector quantization. The initial mixture variances are set to the pooled variance of the data. During this step only the mixture means and covariances are estimated. The mixture weights and state transition probability matrices are not estimated.
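
The second step can be sketched as below. The page does not say which vector quantization algorithm is used, so plain k-means seeded with the first k vectors stands in for it here. "Pooled variance" is read as the per-dimension variance of all data for the state, shared by every mixture. Both choices, and the procedure names sqDist, meanVec, pooledVar and vqMeans, are assumptions.

```tcl
# Squared Euclidean distance between two equal-length vectors.
proc sqDist {a b} {
    set d 0.0
    foreach x $a y $b {
        set d [expr {$d + ($x - $y) * ($x - $y)}]
    }
    return $d
}

# Per-dimension mean of a non-empty list of vectors.
proc meanVec {vectors} {
    set n [llength $vectors]
    set sum [lrepeat [llength [lindex $vectors 0]] 0.0]
    foreach v $vectors {
        set next {}
        foreach s $sum x $v {
            lappend next [expr {$s + $x}]
        }
        set sum $next
    }
    set mean {}
    foreach s $sum {
        lappend mean [expr {$s / double($n)}]
    }
    return $mean
}

# Per-dimension variance of all vectors around their common mean
# (assumed meaning of "pooled variance").
proc pooledVar {vectors} {
    set mean [meanVec $vectors]
    set sq {}
    foreach v $vectors {
        set d {}
        foreach x $v m $mean {
            lappend d [expr {($x - $m) * ($x - $m)}]
        }
        lappend sq $d
    }
    return [meanVec $sq]
}

# k-means stand-in for the VQ step: seed with the first k vectors,
# then alternate nearest-mean assignment and mean re-estimation.
proc vqMeans {vectors k {iters 10}} {
    set means [lrange $vectors 0 [expr {$k - 1}]]
    for {set it 0} {$it < $iters} {incr it} {
        set clusters [lrepeat $k {}]
        foreach v $vectors {
            set best 0
            set bestDist [sqDist $v [lindex $means 0]]
            for {set j 1} {$j < $k} {incr j} {
                set d [sqDist $v [lindex $means $j]]
                if {$d < $bestDist} {
                    set best $j
                    set bestDist $d
                }
            }
            lset clusters $best [linsert [lindex $clusters $best] end $v]
        }
        # Keep the old mean when a cluster received no vectors.
        set newMeans {}
        for {set j 0} {$j < $k} {incr j} {
            set members [lindex $clusters $j]
            if {[llength $members] > 0} {
                lappend newMeans [meanVec $members]
            } else {
                lappend newMeans [lindex $means $j]
            }
        }
        set means $newMeans
    }
    return $means
}

# Four 2-dimensional feature vectors allocated to one state, two mixtures.
set data {{0.0 0.0} {10.0 10.0} {0.0 2.0} {10.0 12.0}}
puts [vqMeans $data 2]   ;# prints: {0.0 1.0} {10.0 11.0}
puts [pooledVar $data]   ;# prints: 25.0 26.0
```

Mixture weights and transition probabilities are deliberately left out, matching the description above.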

Finally, the parameter estimates are refined by performing a Viterbi state realignment. Here the allocation of data to a particular state is determined by the Viterbi state alignment rather than by the equal-sized sub-segments used during the previous step.
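
A Viterbi state alignment of one segment can be sketched as below. The sketch assumes a left-to-right model in which each frame either stays in the current state or advances by one, a path that starts in the first state and ends in the last, and per-frame state log-likelihoods that have already been computed. Transition scores are ignored. The procedure name viterbiAlign and the input layout are hypothetical and do not come from the toolkit.

```tcl
# Align T frames to N left-to-right states (stay or advance by one).
# logLik is a list of T rows; row t holds N per-state log-likelihoods.
# Returns the state index chosen for every frame. Requires T >= N.
proc viterbiAlign {logLik} {
    set T [llength $logLik]
    set N [llength [lindex $logLik 0]]
    set NEG -1.0e30

    # score(t,s): best log score of any path ending in state s at frame t.
    # Only state 0 is a legal starting state.
    for {set s 0} {$s < $N} {incr s} {
        set score(0,$s) $NEG
    }
    set score(0,0) [lindex $logLik 0 0]

    for {set t 1} {$t < $T} {incr t} {
        set p [expr {$t - 1}]
        for {set s 0} {$s < $N} {incr s} {
            # Option 1: stay in the same state.
            set best $score($p,$s)
            set back($t,$s) $s
            # Option 2: advance from the preceding state.
            if {$s > 0} {
                set q [expr {$s - 1}]
                if {$score($p,$q) > $best} {
                    set best $score($p,$q)
                    set back($t,$s) $q
                }
            }
            set score($t,$s) [expr {$best + [lindex $logLik $t $s]}]
        }
    }

    # Trace back from the last state at the last frame.
    set s [expr {$N - 1}]
    set path [list $s]
    for {set t [expr {$T - 1}]} {$t > 0} {incr t -1} {
        set s $back($t,$s)
        set path [linsert $path 0 $s]
    }
    return $path
}

# Five frames, two states: early frames fit state 0, later ones state 1.
set ll {{-1.0 -9.0} {-1.0 -9.0} {-8.0 -2.0} {-9.0 -1.0} {-9.0 -1.0}}
puts [viterbiAlign $ll]   ;# prints: 0 0 1 1 1
```

The returned path replaces the equal split as the frame-to-state allocation, after which the state parameters can be re-estimated from the newly allocated data. Repeating this for the number of iterations given by -numiter is a plausible reading of that option, though the page does not spell it out.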


SEE ALSO

genfeature.tcl, hmmtrain.tcl, hmmembed.tcl


AUTHOR

Johan Schalkwyk
Center for Spoken Language Understanding
Oregon Graduate Institute of Science & Technology


Last modified on Wed Mar 11 11:10:54 PST 1998.