As part of the dual UKF algorithm, we implemented the UKF for weight estimation. This represents a new parameter estimation technique that can be applied to problems such as training feedforward neural networks for either regression or classification.
Recall that in this case we write a
state-space representation for the unknown weight parameters
as given in Equation 5. Note that in this case both the
UKF and EKF are order $L^2$ ($L$ is the number of weights). The
advantage of the UKF over the EKF in this case is also not as obvious, as
the state-transition function is linear. However, as pointed out
earlier, the observation is nonlinear. Effectively, the EKF builds up
an approximation to the expected Hessian by taking outer products of
the gradient. The UKF, however, may provide a more accurate estimate
through direct approximation of the expectation of the
Hessian. Another distinct advantage of the UKF arises when either
the architecture or the error metric is such that differentiation with
respect to the parameters, as required by the EKF, is not easily
derived. The UKF effectively approximates both the Jacobian and the Hessian through its sigma point propagation, without the need to perform any analytic differentiation.
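As a rough illustration of the derivative-free update described above, the following minimal sketch treats the weights as the state of a random-walk model, $w_{k+1} = w_k + r_k$, observed through $d_k = G(x_k, w_k) + e_k$. This is the usual form of such a representation and is assumed here rather than copied from Equation 5. The names `sigma_weights`, `ukf_weight_step`, `mlp`, and `demo`, the 1-4-1 network, the `sin` target, the noise covariances `Rr` and `Re`, and the scaling parameters (`alpha=1`, `beta=2`, `kappa=0`) are all illustrative choices and not the settings used in the experiments. The sketch costs $O(L^3)$ per step because of the Cholesky factorization; it makes no attempt at an efficient implementation.

```python
import numpy as np

def sigma_weights(L, alpha=1.0, beta=2.0, kappa=0.0):
    """Scaled unscented-transform weights for an L-dimensional state."""
    lam = alpha**2 * (L + kappa) - L
    wm = np.full(2 * L + 1, 1.0 / (2.0 * (L + lam)))  # mean weights
    wc = wm.copy()                                    # covariance weights
    wm[0] = lam / (L + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    return lam, wm, wc

def ukf_weight_step(w, P, x, d, G, Rr, Re, alpha=1.0, beta=2.0, kappa=0.0):
    """One UKF update of the weight estimate w (L,) and covariance P (L, L).

    G(x, w) returns the model output as a 1-D array of length m.
    Rr (L, L) and Re (m, m) are assumed process and measurement noise covariances.
    """
    L = w.size
    lam, wm, wc = sigma_weights(L, alpha, beta, kappa)
    # Time update: the state transition is the identity, so only P grows.
    P_pred = P + Rr
    # Sigma points: the mean plus/minus the columns of a scaled matrix square root.
    S = np.linalg.cholesky((L + lam) * P_pred)
    W = np.vstack([w, w + S.T, w - S.T])              # shape (2L+1, L)
    # Propagate every sigma point through the nonlinear observation.
    D = np.array([np.atleast_1d(G(x, wi)) for wi in W])
    d_hat = wm @ D                                    # predicted output
    dD, dW = D - d_hat, W - w
    P_dd = (wc[:, None] * dD).T @ dD + Re             # innovation covariance
    P_wd = (wc[:, None] * dW).T @ dD                  # weight/output cross-covariance
    K = np.linalg.solve(P_dd, P_wd.T).T               # gain P_wd P_dd^{-1} (P_dd symmetric)
    w_new = w + K @ (np.atleast_1d(d) - d_hat)
    P_new = P_pred - K @ P_dd @ K.T
    return w_new, 0.5 * (P_new + P_new.T)             # re-symmetrize against round-off

def mlp(x, w, n_in=1, n_hid=4, n_out=1):
    """Tiny tanh MLP with all parameters packed into the flat vector w."""
    i = 0
    W1 = w[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    b1 = w[i:i + n_hid]; i += n_hid
    W2 = w[i:i + n_out * n_hid].reshape(n_out, n_hid); i += n_out * n_hid
    b2 = w[i:i + n_out]
    return W2 @ np.tanh(W1 @ np.atleast_1d(x) + b1) + b2

def demo(n_samples=300, seed=0):
    """Train the toy MLP on sin(x); returns (MSE before, MSE after)."""
    rng = np.random.default_rng(seed)
    L = 4 * 1 + 4 + 1 * 4 + 1                         # 13 weights for a 1-4-1 net
    w, P = 0.1 * rng.standard_normal(L), np.eye(L)
    Rr, Re = 1e-5 * np.eye(L), np.array([[1e-2]])     # illustrative noise levels
    xs = rng.uniform(-2.0, 2.0, n_samples)
    ds = np.sin(xs)                                   # toy target, not a benchmark
    mse = lambda v: float(np.mean([(mlp(x, v)[0] - d) ** 2 for x, d in zip(xs, ds)]))
    before = mse(w)
    for x, d in zip(xs, ds):
        w, P = ukf_weight_step(w, P, x, d, mlp, Rr, Re)
    return before, mse(w)

if __name__ == "__main__":
    print(demo())
```

Note that `ukf_weight_step` never differentiates `G`: swapping in a different architecture or output function only requires passing another callable. When `G` is linear in the weights, this update reduces to the ordinary Kalman filter, which gives a convenient sanity check.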
We have performed a number of experiments applied to training neural networks on standard benchmark data. Figure 4 illustrates the differences in learning curves (averaged over 100 experiments with different initial weights) for the Mackay-Robot-Arm dataset and the Ikeda chaotic time series. Note the slightly faster convergence and lower final MSE of the UKF weight training. While these results are clearly encouraging, further study is still necessary to fully contrast differences between UKF and EKF weight training.
Figure 4: Comparison of learning curves for the EKF and UKF training. a) Mackay-Robot-Arm, 2-12-2 MLP, b) Ikeda time series, 10-7-1 MLP.