View Table of ContentsCreate a BookmarkAdd Book to My BookshelfPurchase This Book Online
Skip to Book Content
Book cover image

Chapter 9 - Faster Variations of Back-Propagation

Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks
Russell D. Reed and Robert J. Marks II
Copyright © 1999 Massachusetts Institute of Technology

Previous Section Next Section

9.5 SuperSAB

SuperSAB [372] is another adaptive learning rate method based on the delta-bar-delta heuristics. It is based on an earlier method called SAB, which stands for "self-adapting back propagation." Like the method of Vogl et al. (section 9.4), the learning rate is both increased and decreased multiplicatively.

Parameters include the initial learning rate η start, an increase factor η+ > 1, and a decrease factor 0 < η- < 1. Each weight has its own learning rate η ij(t) which changes with time t. The algorithm is:

  1. Initialize all learning rates ηij(0) =η start.

  2. Do a back-propagation step with momentum.

  3. For each weight wij

    • if the sign of its derivative is unchanged then increase the learning rate, ηij( t + 1 )=η+ij(t);

    • otherwise (the sign changed), retract the step wij (t + 1) = wij (t) - Δwij (t), decrease the learning rate ηij(t + 1) =η- ηij (t), and set Δwij (t + 1) = 0 so momentum has no effect in the next cycle.

  4. Go to 2.

Typical suggested values are η+ = 1.2 and η- = 0.5. (There appear to be typographical errors in [372]. This is based on the explanation accompanying the formula.)

Reported results have been inconsistent. In some cases SuperSAB is among the fastest methods [9]; others have reported it to be very unstable [8]. The possibility of instability, especially when momentum is high, is noted in the original paper. This shows itself as a sudden large increase in the error. Sometimes the error will correct itself in subsequent steps; otherwise a restart may be necessary. Because η increases multiplicatively and can become large quickly, it is reasonable to set limiting values on both η and the maximum allowed weight magnitude. Because of the instability problems and because it does not appear to have major speed advantages, other methods may be preferable in general.


Top of current section
Previous Section Next Section
Books24x7.com, Inc. © 1999-2001  –  Feedback