

Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks
Russell D. Reed and Robert J. Marks II
Copyright © 1999 Massachusetts Institute of Technology
 

Chapter 7: Weight-Initialization Techniques

Overview

The following sections summarize some techniques for initializing weights in sigmoidal networks. The basic motivation is to speed up learning by choosing better initial solutions. A survey and empirical comparison of a number of techniques is given by Thimm and Fiesler [368].

There are two clusters of methods. The first consists of methods for choosing the parameters controlling the distribution of random initial weights. The motivation here is to avoid the sigmoid saturation problems that cause slow training. Most of these methods do not use domain-specific information.

The second cluster consists of techniques for initializing the network from an approximate solution found by another modeling system; common choices include rule-based systems, decision trees, and nearest-neighbor classifiers. The motivation here is to reduce training time and the probability of convergence to poor local minima by starting the system near a good solution. Besides faster training, an advantage of these methods is that they provide a way to embed domain-dependent information in a network.
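To make the first cluster concrete, the sketch below (not from the text) draws each layer's weights uniformly from [-a, a] with the range a scaled by the unit's fan-in, one common rule of thumb for keeping net inputs in the sigmoid's near-linear region. The function name and the specific 1/sqrt(fan-in) scale are illustrative assumptions, not a prescription from this chapter.

```python
import numpy as np

def init_sigmoid_layer(fan_in, fan_out, rng=None):
    """Draw uniform random weights with a fan-in-scaled range.

    The range a = 1/sqrt(fan_in) keeps a unit's typical net input
    near zero, where the sigmoid is far from saturation and its
    derivative is largest.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = 1.0 / np.sqrt(fan_in)
    W = rng.uniform(-a, a, size=(fan_out, fan_in))
    b = np.zeros(fan_out)  # biases commonly start at zero
    return W, b

# Example: a layer with 10 inputs and 5 sigmoidal hidden units
W, b = init_sigmoid_layer(fan_in=10, fan_out=5)
```

With inputs of roughly unit magnitude, this scaling keeps the sum of fan_in weighted inputs on the order of one, so most units start in the region where the sigmoid's gradient is usable rather than saturated.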