Books24x7 Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks

9.5 SuperSAB

SuperSAB [372] is another adaptive learning rate method that uses the delta-bar-delta heuristics. It derives from an earlier method called SAB, which stands for "self-adapting back propagation." Like the method of Vogl et al. (section 9.4), SuperSAB both increases and decreases the learning rate multiplicatively.

Parameters include the initial learning rate η_start, an increase factor η⁺ > 1, and a decrease factor 0 < η⁻ < 1. Each weight has its own learning rate η_ij(t), which changes with time t. The algorithm is:

1. Initialize all learning rates η_ij(0) = η_start.
2. Do a back-propagation step with momentum.
3. For each weight w_ij:
   - if the sign of its derivative is unchanged, increase the learning rate, η_ij(t + 1) = η⁺ · η_ij(t);
   - otherwise (the sign changed), retract the step, w_ij(t + 1) = w_ij(t) - Δw_ij(t), decrease the learning rate, η_ij(t + 1) = η⁻ · η_ij(t), and set Δw_ij(t + 1) = 0 so that momentum has no effect in the next cycle.
4. Go to step 2.
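
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation from [372]: the function name `supersab`, its arguments, the momentum value of 0.5, and the quadratic test function are all hypothetical. The sign test is done at the start of the next iteration by comparing the new derivative with the stored one, and, as a further assumption, a weight whose step was just retracted (or that has no previous derivative yet) takes a plain step without any rate change.

```python
def supersab(grad, w, eta_start=0.01, inc=1.2, dec=0.5, momentum=0.5, steps=200):
    """Sketch of SuperSAB-style training with one learning rate per weight.

    grad: function returning the list of derivatives dE/dw at w (assumed interface).
    inc and dec play the roles of the increase and decrease factors.
    """
    w = list(w)
    n = len(w)
    eta = [eta_start] * n   # every weight starts at the initial learning rate
    delta = [0.0] * n       # most recent weight change, used by momentum
    g_prev = [0.0] * n      # derivative seen before the most recent step

    for _ in range(steps):
        g = grad(w)
        for i in range(n):
            s = g[i] * g_prev[i]
            if s < 0:
                # Sign changed: retract the last step, shrink the rate,
                # and zero the stored change so momentum has no effect next cycle.
                w[i] -= delta[i]
                eta[i] *= dec
                delta[i] = 0.0
                g_prev[i] = 0.0   # assumption: skip rate adaptation on the next pass
                continue
            if s > 0:
                # Sign unchanged: grow the rate multiplicatively.
                eta[i] *= inc
            # Gradient step with momentum.
            delta[i] = -eta[i] * g[i] + momentum * delta[i]
            w[i] += delta[i]
            g_prev[i] = g[i]
    return w, eta


# Hypothetical test problem: E(w) = 0.5 * (w0^2 + 10 * w1^2).
def quad_loss(w):
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def quad_grad(w):
    return [w[0], 10.0 * w[1]]


w_start = [2.0, -1.5]
w_final, eta_final = supersab(quad_grad, w_start)
```

On this toy problem the two weights see different curvatures, so their learning rates are free to settle at different scales, which is the point of keeping a separate η_ij per weight. A real network would supply `grad` from a back-propagation pass instead.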

Typical suggested values are η⁺ = 1.2 and η⁻ = 0.5. (There appear to be typographical errors in [372]. The description given here follows the explanation accompanying the formula.)

Reported results have been inconsistent: in some cases SuperSAB is among the fastest methods [9], while others have reported it to be very unstable [8]. The original paper notes the possibility of instability, especially when momentum is high. The instability shows itself as a sudden large increase in the error. Sometimes the error corrects itself in subsequent steps; otherwise a restart may be necessary. Because η increases multiplicatively and can become large quickly, it is reasonable to set limits on both η and the maximum allowed weight magnitude. Because of the instability problems, and because SuperSAB does not appear to have major speed advantages, other methods may be preferable in general.
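
To illustrate how quickly a multiplicative increase compounds and what a limit might look like, here is a minimal sketch. The helper names `clamp` and `limit_rates_and_weights` and the bounds `eta_max = 1.0` and `w_max = 10.0` are hypothetical choices made for illustration; the text does not prescribe specific limits.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def limit_rates_and_weights(eta, w, eta_max=1.0, w_max=10.0):
    """Cap each learning rate at eta_max and each weight magnitude at w_max.

    Both bounds are illustrative assumptions, not recommended values.
    """
    eta_limited = [min(e, eta_max) for e in eta]
    w_limited = [clamp(x, -w_max, w_max) for x in w]
    return eta_limited, w_limited


# With an increase factor of 1.2, fifty consecutive increases multiply
# the learning rate by 1.2 ** 50, a factor in the thousands.
growth = 1.2 ** 50

# A rate that started at 0.01 and grew unchecked is capped; small values pass through.
eta_limited, w_limited = limit_rates_and_weights(
    [0.01 * growth, 0.05], [25.0, -0.3]
)
```

Applying such a cap after each adaptation pass keeps a long run of same-sign derivatives from producing a single very large step, at the cost of two more parameters to choose.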
