Chapter 14 - Factors Influencing Generalization

Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks
Russell D. Reed and Robert J. Marks II
Copyright © 1999 Massachusetts Institute of Technology
 

14.2 The Need for Additional Information

A fundamental reason for less-than-perfect generalization is that the problem is ill-posed: the samples alone do not uniquely determine an interpolating function (see figure 14.1). Any of an infinite number of functions passing through the sample points is equally valid according to the sample set error, so other criteria are needed to choose among them. If nothing more is known about the target function, there is no basis for selecting one solution over another.
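The ill-posedness can be illustrated with a small numerical sketch. The sample values below are hypothetical, and the construction used (adding a multiple of a polynomial that vanishes at every sample point) is just one convenient way to generate alternative interpolants; it is not taken from the text. Every choice of the constant c gives a function with essentially zero sample set error, yet the functions disagree between the samples.

```python
import numpy as np

# Hypothetical sample set (values chosen only for illustration).
x_samples = np.array([0.0, 1.0, 2.0, 3.0])
y_samples = np.array([1.0, 2.0, 0.5, 3.0])

# One interpolant: the cubic polynomial through the four points.
coeffs = np.polyfit(x_samples, y_samples, deg=len(x_samples) - 1)

def vanishing(x):
    """Product of (x - x_i): zero at every sample point, nonzero elsewhere."""
    return np.prod([x - xi for xi in x_samples], axis=0)

def interpolant(x, c=0.0):
    """Cubic interpolant plus c times a term that vanishes on the samples."""
    return np.polyval(coeffs, x) + c * vanishing(x)

def sample_error(c):
    """Sum of squared errors measured on the sample set only."""
    return float(np.sum((interpolant(x_samples, c) - y_samples) ** 2))

c_values = (0.0, 1.0, -5.0)
errors = [sample_error(c) for c in c_values]
off_sample = [float(interpolant(1.5, c)) for c in c_values]

print(errors)      # all numerically zero
print(off_sample)  # three different predictions at x = 1.5
```

Since the sample set error cannot distinguish among these functions, any preference for one of them has to come from information outside the samples.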

Additional information must be provided. This is often done by biasing the training procedure to favor certain types of solutions, thereby placing constraints on the set of solutions considered. Different approximation methods can be viewed from this perspective in terms of the biases imposed and how they are implemented. Selection of a neural network rather than a polynomial approximation, decision tree, or some other fitting system already limits the set of solutions that will be considered; further selection of a particular structure, parameters, and training algorithm provides additional constraints. These choices may reflect unrecognized assumptions about the solution.
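As a minimal sketch of how a bias narrows the set of solutions, consider a model with more free parameters than samples, so that the samples alone leave the parameters underdetermined. The data, the polynomial model, and the squared-weight penalty below are all assumptions chosen for illustration; the penalty is one common way to express a preference for small weights, not a method prescribed by the text.

```python
import numpy as np

# Hypothetical data: six samples of an assumed target, sin(pi * x).
x = np.linspace(-1.0, 1.0, 6)
y = np.sin(np.pi * x)

# Degree-9 polynomial features: 10 weights for 6 samples, so the samples
# alone do not determine the weights.
Phi = np.vander(x, N=10, increasing=True)

def penalized_fit(lam):
    """Minimize ||Phi w - y||^2 + lam * ||w||^2 (closed-form solution)."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

lams = [1e-6, 1e-2, 1.0]  # assumed penalty strengths
weights = [penalized_fit(lam) for lam in lams]
weight_norms = [float(np.linalg.norm(w)) for w in weights]
train_errors = [float(np.sum((Phi @ w - y) ** 2)) for w in weights]

for lam, norm, err in zip(lams, weight_norms, train_errors):
    print(f"lam={lam:g}  ||w||={norm:.4f}  sample error={err:.6f}")
```

Each penalty strength selects a different solution: a stronger penalty yields smaller weights at the cost of a larger sample set error. Whether that trade improves generalization depends on whether small weights are actually appropriate for the target function.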

However they are implemented, the ability of the constraints to lead to good generalization depends on how well they reflect actual properties of the (unknown) target function. It is relatively independent of the optimization technique used to find a solution, assuming the techniques are equally successful in satisfying the constraints.
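The independence from the optimization technique can be sketched for a simple case. The example below assumes a cubic polynomial model with a squared-weight penalty (both hypothetical choices) and minimizes the same objective in two ways: by solving the stationarity condition directly and by plain gradient descent. Because this objective is convex, both techniques reach the same weights. For neural networks the error surface is generally not convex, so the proviso about equal success matters.

```python
import numpy as np

# Hypothetical data and constraint: a cubic model with a weight penalty.
x = np.linspace(-1.0, 1.0, 6)
y = np.sin(np.pi * x)
Phi = np.vander(x, N=4, increasing=True)
lam = 0.1  # assumed penalty strength

# Hessian of the objective ||Phi w - y||^2 + lam * ||w||^2.
H = 2.0 * (Phi.T @ Phi + lam * np.eye(Phi.shape[1]))
b = 2.0 * Phi.T @ y

# Technique 1: solve the stationarity condition H w = b directly.
w_direct = np.linalg.solve(H, b)

# Technique 2: gradient descent with step 1 / (largest eigenvalue of H).
step = 1.0 / np.linalg.eigvalsh(H).max()
w_gd = np.zeros(Phi.shape[1])
for _ in range(10000):
    grad = H @ w_gd - b  # gradient of the penalized objective
    w_gd = w_gd - step * grad

max_diff = float(np.max(np.abs(w_direct - w_gd)))
print(max_diff)  # difference between the two solutions
```

The two solutions agree to within numerical precision, so any difference in generalization would come from the model and penalty chosen, not from how the minimum was found.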

Figure 14.1: Samples alone do not provide enough information to uniquely determine an interpolating function. An infinite number of functions can be fit through the sample points; all are equally valid according to the sample set error, so other criteria are needed to choose among them.