This gives the update rule

    w_(k+1) = w_k + η(w* − w_k)

(where the subscripts index time rather than vector elements). At each step, w moves a fraction η of the distance from w_k toward w*. Because the error surface is quadratic, the solution can be obtained in a single step by setting η = 1. For nonlinear optimization tasks such as most neural network problems, however, the linear approximation is only locally valid, so one-step convergence is not possible; smaller step sizes must be used iteratively to avoid straying too far from the region where the approximation holds. In the linear case the eventual solution is nevertheless the same: w_∞ = w*.
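The behavior of the update rule can be checked numerically. The sketch below (the target w* and starting point are arbitrary illustrative values, not from the text) shows that η = 1 reaches w* in one step, while a smaller η shrinks the error by a factor (1 − η) per iteration:

```python
import numpy as np

def update(w, w_star, eta):
    # One step of the rule w_(k+1) = w_k + eta * (w* - w_k)
    return w + eta * (w_star - w)

w_star = np.array([1.0, -2.0])   # hypothetical optimum w*
w0 = np.array([0.0, 0.0])        # arbitrary starting point

# Quadratic (linear) case with eta = 1: convergence in a single step.
w1 = update(w0, w_star, eta=1.0)
print(np.allclose(w1, w_star))   # True

# With eta = 0.1 the error |w_k - w*| contracts by (1 - eta) each step,
# so many iterations are needed, but the limit is still w*.
w = w0
for k in range(100):
    w = update(w, w_star, eta=0.1)
print(np.allclose(w, w_star, atol=1e-3))   # True
```

In a real nonlinear problem w* is not known in advance; the gradient step only points toward the minimum of the local quadratic approximation, which is why the small-η regime is the practically relevant one.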