Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks
Russell D. Reed and Robert J. Marks II
Copyright © 1999 Massachusetts Institute of Technology
 

Appendix C: Jitter Calculations

The following calculations are used in chapter 17.

C.1 Jitter: Small-Perturbation Approximation

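For small noise amplitudes, the network output $y(\mathbf{x} + \mathbf{n})$ can be approximated by the second-order Taylor expansion

$$y(\mathbf{x} + \mathbf{n}) \approx y(\mathbf{x}) + \left(\frac{\partial y}{\partial \mathbf{x}}\right)^T \mathbf{n} + \frac{1}{2}\,\mathbf{n}^T H\,\mathbf{n} \tag{C.1}$$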

where $H$ is the Hessian matrix with elements $h_{ij} = \partial^2 y / (\partial x_i\,\partial x_j)$. Assuming an even noise distribution so that $\langle n^k \rangle = 0$ for $k$ odd, one can write

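$$\left\langle \left(t - y(\mathbf{x} + \mathbf{n})\right)^2 \right\rangle \approx (t - y)^2 - \sigma^2 (t - y)\,\mathrm{Tr}(H) + \sigma^2 \left\| \frac{\partial y}{\partial \mathbf{x}} \right\|^2 + \frac{m_4}{4} \sum_i h_{ii}^2 + \frac{\sigma^4}{4} \sum_{i \neq j} \left( h_{ii} h_{jj} + 2 h_{ij}^2 \right)$$

where $y$ and its derivatives are evaluated at $\mathbf{x}$, $t$ is the target output, $\sigma^2 = \langle n^2 \rangle$ is the variance of each noise component (the components are taken to be independent and identically distributed), and $m_4$ is the fourth moment $\langle n^4 \rangle$. Since $m_4$ is itself of fourth order in $\sigma$, dropping all terms higher than second order in $\sigma$ gives

$$\left\langle \left(t - y(\mathbf{x} + \mathbf{n})\right)^2 \right\rangle \approx (t - y)^2 - \sigma^2 (t - y)\,\mathrm{Tr}(H) + \sigma^2 \left\| \frac{\partial y}{\partial \mathbf{x}} \right\|^2 \tag{C.2}$$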

and when $H$ is assumed to be zero, this reduces to (17.15). The Laplacian term, $\mathrm{Tr}(H) = \nabla^2 y$, omitted in (17.15), can be described as an approximate measure of the difference between the average surrounding values and the precise value of the field at a point [100]. The third term in (C.2) is the first-order regularization term in (17.15).
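The two approximations can be checked numerically. The following is a minimal sketch, not taken from the book: it assumes that (C.1) is the usual second-order Taylor expansion of $y(\mathbf{x} + \mathbf{n})$ and that (C.2) has the form $(t - y)^2 - \sigma^2 (t - y)\,\mathrm{Tr}(H) + \sigma^2 \|\partial y / \partial \mathbf{x}\|^2$. The one-unit tanh "network", its weights `W` and `B`, the Gaussian jitter, and all function names are hypothetical choices made only for illustration.

```python
import math
import random

# Hypothetical toy "network": one tanh unit, y(x) = tanh(w.x + b).
W = (1.5, -0.7)
B = 0.2

def y(x):
    return math.tanh(sum(w * xi for w, xi in zip(W, x)) + B)

def gradient_and_hessian(x):
    """Analytic dy/dx and Hessian H of the tanh unit with respect to x."""
    yv = y(x)
    d1 = 1.0 - yv * yv        # tanh'(a)  = 1 - tanh(a)^2
    d2 = -2.0 * yv * d1       # tanh''(a) = -2 tanh(a) tanh'(a)
    grad = [d1 * w for w in W]
    hess = [[d2 * wi * wj for wj in W] for wi in W]
    return grad, hess

def taylor2(x, n):
    """Second-order expansion of y(x + n) about x (assumed form of C.1)."""
    grad, hess = gradient_and_hessian(x)
    linear = sum(g * ni for g, ni in zip(grad, n))
    quad = sum(n[i] * hess[i][j] * n[j]
               for i in range(len(n)) for j in range(len(n)))
    return y(x) + linear + 0.5 * quad

def c2_estimate(x, t, sigma):
    """Assumed form of C.2: squared error, Laplacian term, gradient term."""
    grad, hess = gradient_and_hessian(x)
    err = t - y(x)
    trace_h = sum(hess[i][i] for i in range(len(x)))
    grad_sq = sum(g * g for g in grad)
    return err * err - sigma**2 * err * trace_h + sigma**2 * grad_sq

def jittered_error(x, t, sigma, samples=200_000, seed=0):
    """Monte Carlo average of (t - y(x + n))^2 over Gaussian jitter n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        xn = [xi + rng.gauss(0.0, sigma) for xi in x]
        total += (t - y(xn)) ** 2
    return total / samples

x, t, sigma = (0.3, -0.4), 0.0, 0.05
n = (0.01, -0.02)
x_plus_n = [xi + ni for xi, ni in zip(x, n)]

taylor_error = abs(y(x_plus_n) - taylor2(x, n))
mc = jittered_error(x, t, sigma)
approx = c2_estimate(x, t, sigma)
plain = (t - y(x)) ** 2

print(f"(C.1) residual for one small perturbation: {taylor_error:.2e}")
print(f"jittered error: {mc:.5f}  (C.2): {approx:.5f}  no jitter: {plain:.5f}")
```

For small `sigma` the Monte Carlo average should sit much closer to the (C.2) estimate than to the unjittered squared error. The remaining gap comes from the dropped fourth-order terms and from sampling noise.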

Training with nonjittered data simply minimizes the error at the training points and puts no constraints on the function at other points. In contrast, training with jitter minimizes the error while also forcing the approximating function to have small derivatives and a local average that approaches the target in the vicinity of each training point.
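The contrast can be made concrete with a case where the expansion is exact. The sketch below is an illustration under stated assumptions, not the book's experiment: it uses a hypothetical one-parameter linear model $y = wx$, for which $H = 0$ and the expected jittered error is exactly $(t - wx)^2 + \sigma^2 w^2$. Its minimizer is $w = tx / (x^2 + \sigma^2)$, a smaller slope than the unjittered fit $w = t/x$. The Gaussian noise, the single training point, and the helper names are assumptions.

```python
import random

def jittered_samples(x, t, sigma, count, rng):
    """Copies of one training pair with fresh noise on the input only."""
    return [(x + rng.gauss(0.0, sigma), t) for _ in range(count)]

def least_squares_slope(samples):
    """Slope w minimizing the sum of (t - w * x)^2 for the model y = w * x."""
    num = sum(t * x for x, t in samples)
    den = sum(x * x for x, _ in samples)
    return num / den

rng = random.Random(0)
x, t, sigma = 2.0, 1.0, 1.0

# Without jitter the model fits the training point exactly: w = t / x.
w_plain = least_squares_slope([(x, t)])

# With jitter the fitted slope (the model's derivative) shrinks.
w_jitter = least_squares_slope(jittered_samples(x, t, sigma, 100_000, rng))

# Minimizer of the expected jittered error (t - w x)^2 + sigma^2 w^2.
w_theory = t * x / (x * x + sigma * sigma)

print(f"no jitter: {w_plain:.3f}  jitter: {w_jitter:.3f}  theory: {w_theory:.3f}")
```

The jittered fit gives up an exact match at the training point in exchange for a smaller derivative, which is the regularizing effect described above.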