
Chapter 16 - Heuristics for Improving Generalization

Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks
Russell D. Reed and Robert J. Marks II
Copyright © 1999 Massachusetts Institute of Technology
 

16.3 Pruning Methods

Pruning algorithms are surveyed in chapter 13. The following paragraphs outline a few main points. Because the target function is unknown, it is difficult to predict ahead of time what size network will learn the data without overtraining. Without knowing the optimum network configuration, one can train many networks and choose the smallest or least complex one that learns the data. Although simple, this approach can be inefficient if many networks must be trained before an acceptable one is found. Even if the optimum size is known, the smallest networks that are just complex enough to fit the data may (depending on the learning algorithm) be sensitive to initial conditions and learning parameters. It may be hard to tell whether the network is too small to learn the data, is simply learning very slowly, or is stuck in a local minimum because of an unfortunate set of initial conditions or parameters. Thus, even if one finds a small network that reliably learns the data, there might be a still smaller network that would work but is very difficult to train.

The pruning approach is to train a network that is somewhat larger than necessary and then remove unnecessary elements. The large initial size allows the network to learn reasonably quickly, with less sensitivity to initial conditions and local minima, while the reduced complexity of the trimmed system favors improved generalization. In several studies, e.g., [345], [344], pruning techniques produced small networks that generalized well but could not be obtained reliably by training a network of the reduced size from random initial weights.
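The "remove unnecessary elements" step can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses weight magnitude as the measure of how unnecessary a weight is, which is one simple criterion among many and is not named in the text above. The function name prune_by_magnitude, the fraction parameter, and the weight values are all hypothetical.

```python
import numpy as np

def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude.

    Returns the pruned copy and a boolean mask of the weights that survive.
    Magnitude is an assumed saliency measure, chosen here for simplicity.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    flat = np.abs(weights).ravel()
    n_remove = int(fraction * flat.size)
    mask = np.ones(flat.size, dtype=bool)
    if n_remove > 0:
        # argsort orders indices from smallest to largest magnitude
        remove_idx = np.argsort(flat, kind="stable")[:n_remove]
        mask[remove_idx] = False
    mask = mask.reshape(weights.shape)
    return weights * mask, mask

# Hypothetical weights of an already trained, oversized layer.
W = np.array([[0.9, -0.02, 0.4],
              [0.01, -1.3, 0.05]])
W_pruned, keep = prune_by_magnitude(W, fraction=0.5)
```

In practice the surviving weights would normally be retrained with the mask held fixed, so that the removed connections stay at zero while the remaining ones compensate.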

Although pruning techniques provide a means to simplify a network, they must be guided by other criteria to decide how simple the network should be. That is, there is still a need for external information and theoretical criteria to decide when to stop pruning.
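One common form of external information is error on held-out data. The sketch below is a hedged illustration, not a method prescribed by the text: it prunes one weight at a time and stops when the validation error rises by more than a chosen tolerance. To keep the example short and deterministic, a linear model refit by least squares stands in for a trained network; the synthetic data, the tolerance value, and the names fit, val_mse, and prune_until_worse are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): only the first 3 of 8 inputs matter.
true_w = np.array([2.0, -3.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
X_train = rng.normal(size=(200, 8))
X_val = rng.normal(size=(100, 8))
y_train = X_train @ true_w + 0.1 * rng.normal(size=200)
y_val = X_val @ true_w + 0.1 * rng.normal(size=100)

def fit(X, y, active):
    """Least-squares fit restricted to the active inputs; others stay at 0."""
    w = np.zeros(X.shape[1])
    w[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
    return w

def val_mse(w):
    """Mean squared error on the held-out validation set."""
    return float(np.mean((X_val @ w - y_val) ** 2))

def prune_until_worse(tolerance=0.25):
    """Drop the smallest-magnitude weight repeatedly, refitting each time,
    until validation error exceeds (1 + tolerance) times the best seen."""
    active = list(range(X_train.shape[1]))
    w = fit(X_train, y_train, active)
    best = val_mse(w)
    while len(active) > 1:
        # Candidate: remove the active weight with the smallest magnitude.
        drop = min(active, key=lambda i: abs(w[i]))
        trial_active = [i for i in active if i != drop]
        trial_w = fit(X_train, y_train, trial_active)
        err = val_mse(trial_w)
        if err > (1.0 + tolerance) * best:
            break  # held-out error rose too much: stop pruning
        active, w, best = trial_active, trial_w, min(best, err)
    return active, w

kept, w_final = prune_until_worse()
```

The tolerance plays the role of the external criterion: a looser value yields a simpler model at some cost in held-out error, and the pruning procedure itself gives no guidance on how to set it.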