Chapter 8: The Error Surface

Overview

Because the network output is a function of its weights, the error is a function of w. In general, E(w) is a multidimensional function and impossible to visualize. If it could be plotted as a function of w, however, E might look like a landscape with hills and valleys, high where E is high and low where E is low. Back-propagation, as an approximation to gradient descent, could then be viewed as placing a marble at some random point on the landscape and letting it roll downhill. If the surface were shaped like a smooth bowl, the marble (the weight state) would always roll to the lowest point; back-propagation would always find the best solution and local minima would never be a problem. Usually, of course, the surface is not so simple. Because the shape of the error surface has a fundamental effect on the learning process, it is useful to examine some of its properties. Many of the figures that follow are adapted from Hush, Horne, and Salas [183], [181].
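The marble picture can be made concrete with a small sketch. The code below is an illustration only: it runs plain gradient descent (not back-propagation on a network) on two hand-picked error functions that do not come from the text. One is a smooth bowl. The other is a one-dimensional surface with two valleys of different depth. The function names, the learning rates, and the surfaces themselves are all assumptions chosen for the example.

```python
import random

def gradient_descent(grad, w, lr, steps=500):
    """Repeatedly step against the gradient, starting from weight vector w."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

# Assumed surface 1: a smooth bowl, E(w) = w1^2 + 2*w2^2, minimum at (0, 0).
def bowl_grad(w):
    return [2.0 * w[0], 4.0 * w[1]]

# Assumed surface 2: two valleys, E(w) = (w^2 - 1)^2 + 0.3*w.
# The left valley (near w = -1) is deeper than the right one (near w = +1).
def two_valley_error(w):
    return (w[0] ** 2 - 1.0) ** 2 + 0.3 * w[0]

def two_valley_grad(w):
    return [4.0 * w[0] * (w[0] ** 2 - 1.0) + 0.3]

random.seed(0)  # fixed seed so the run is repeatable

# On the bowl, every random starting point rolls to the same lowest point.
bowl_results = []
for _ in range(5):
    start = [random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)]
    bowl_results.append(gradient_descent(bowl_grad, start, lr=0.1))

# On the two-valley surface, the outcome depends on where the marble starts.
w_left = gradient_descent(two_valley_grad, [-1.5], lr=0.05)
w_right = gradient_descent(two_valley_grad, [1.5], lr=0.05)

print("bowl end points:", bowl_results)
print("left start  -> w =", w_left, " E =", two_valley_error(w_left))
print("right start -> w =", w_right, " E =", two_valley_error(w_right))
```

In this sketch the runs on the bowl all end at essentially the same point, while the two runs on the second surface settle in different valleys, and the one started on the right stops at the shallower minimum with the higher error. That is the sense in which the shape of the surface, and not only the descent rule, determines what is found.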
