Chapter 12 - Constructive Methods

Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks
Russell D. Reed and Robert J. Marks II
Copyright © 1999 Massachusetts Institute of Technology
 

12.8 Construction from a Voronoi Diagram

A constructive method based on Voronoi tessellation of the training data is described by Bose and others [55], [53], [54]. A similar method is described by Murphy [279].

Figure 12.7: Constructive methods can be based on a Voronoi tessellation of the training points. The Voronoi diagram of a set of "base" points partitions the space into cells depending on which base point is closest. Each cell is a convex region bounded by hyperplanes. A layered network can be constructed which forms the same partition and generates the required outputs in each cell.

The Voronoi tessellation is related to the familiar nearest-neighbor-classifier partition. Figure 12.7 illustrates a two-dimensional example, but the principle applies in higher dimensions as well. Given a set of base points, the surrounding space is partitioned into regions, or cells, according to which base point is closest. With a Euclidean metric, the resulting cells are convex regions bounded by hyperplanes. (Tessellations using other metrics and other criteria are possible; the Euclidean version is the most common.) There are efficient algorithms for obtaining the partition in high dimensions, most of them based on its dual, the Delaunay tessellation.
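To make the partition concrete, the sketch below uses SciPy's Qhull-based routine (scipy.spatial.Voronoi) on a handful of invented two-dimensional base points; the coordinates are made up purely for illustration. Its ridge_points attribute lists the pairs of base points whose cells share a bounding hyperplane, which are exactly the neighbor pairs of the dual Delaunay tessellation, and the nearest-neighbor character of the partition can be checked directly.

    import numpy as np
    from scipy.spatial import Voronoi

    # Invented 2-D base points, for illustration only.
    points = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1], [1.2, 1.0], [2.0, 0.5]])

    vor = Voronoi(points)   # Qhull computes the diagram via the Delaunay dual

    # Each ridge of the diagram is a piece of the bisecting hyperplane between
    # two base points; ridge_points lists those neighbor pairs.
    print(vor.ridge_points)

    # The cell containing a query point is that of the nearest base point,
    # i.e., the tessellation realizes the nearest-neighbor partition.
    x = np.array([0.9, 0.6])
    nearest = np.argmin(np.linalg.norm(points - x, axis=1))
    print("x lies in the cell of base point", nearest)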

Given a Voronoi diagram of the training points, a layered network can be generated to form the same partition and produce the required outputs in each cell. As noted in section 4.1, a network with two hidden layers would be sufficient. Nodes in the first hidden layer would implement the partition hyperplanes, nodes in the second hidden layer would combine these into the convex cell partitions, and the output node would combine these to generate the required output for each cell. It isn't necessary to implement every partition hyperplane, however, because many don't separate points with different labels.
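The two-hidden-layer construction can be sketched directly with hard-threshold units. The code below is an illustrative outline under simplifying assumptions, not the algorithm of Bose and colleagues: the first layer realizes the perpendicular-bisector hyperplane for every pair of Voronoi neighbors, the second layer ANDs the relevant half-space outputs into one unit per cell, and the output unit ORs the cells whose base point is labeled 1. The points and labels are invented for the example.

    import numpy as np
    from scipy.spatial import Delaunay

    # Invented 2-D base points (the training set) with binary class labels.
    points = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1], [1.2, 1.0], [2.0, 0.5]])
    labels = np.array([0, 1, 0, 1, 1])

    def step(z):
        """Hard-limiting activation: 1 where the net input is positive, else 0."""
        return (np.asarray(z) > 0).astype(float)

    # Voronoi neighbors are the pairs joined by an edge of the dual Delaunay
    # tessellation; each pair contributes one bisecting hyperplane.
    tri = Delaunay(points)
    pairs = sorted({(a, b) for s in tri.simplices for a in s for b in s if a < b})

    # First hidden layer: unit k fires when x lies on points[i]'s side of the
    # perpendicular bisector between points[i] and points[j].
    W1 = np.array([points[i] - points[j] for i, j in pairs])
    b1 = np.array([-W1[k].dot((points[i] + points[j]) / 2)
                   for k, (i, j) in enumerate(pairs)])

    # Second hidden layer: one AND unit per Voronoi cell.  Cell i requires
    # output 1 from every bisector (i, j) and output 0 from every bisector (j, i).
    n = len(points)
    W2 = np.zeros((n, len(pairs)))
    b2 = np.zeros(n)
    for k, (i, j) in enumerate(pairs):
        W2[i, k], W2[j, k] = 1.0, -1.0   # cell j wants the complement of unit k
        b2[j] += 1.0                     # compensate for the negative weight
    for i in range(n):
        conditions = sum(1 for pair in pairs if i in pair)
        b2[i] += 0.5 - conditions        # fire only when every condition holds

    # Output layer: OR over the cells whose base point carries label 1.
    w3, b3 = labels.astype(float), -0.5

    def network(x):
        h1 = step(W1 @ x + b1)           # which side of each bisector
        h2 = step(W2 @ h1 + b2)          # which Voronoi cell (one-hot)
        return int(step(w3 @ h2 + b3))   # label of that cell

    # Points slightly perturbed from the base points get the base points' labels.
    print([network(p + 0.01) for p in points], "target:", labels.tolist())

Pruning the first layer to only those hyperplanes that separate differently labeled cells, as the text suggests, would shrink the network further, but the remaining half-spaces must then be regrouped into decision regions rather than exact Voronoi cells.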

The algorithm described by Bose and colleagues [55], [53], [54] automatically constructs a network to fit a given set of training data. It chooses the number of layers and the number of nodes in each layer, and it sets appropriate weights to realize the mapping. In general, the layers are only partially connected. Because the process is completely automatic, it avoids repetitive trial-and-error experiments with different structures and training parameters, as well as long training times and uncertainty about convergence to local minima. The algorithm is rather involved, however, and will not be described here; details can be found in [55], [53], [54].

Remarks This method designs networks for classification problems with binary or discrete-class targets. Generalization issues are not addressed; with consistent data, the resulting network will classify every training point correctly.

Because the design is based on a nearest-neighbor partition, the resulting network behaves much like a nearest-neighbor classifier: accuracy can be good when training data are abundant but may be poor when data are sparse, and class boundaries may be rather jagged in some cases. Its advantages over a naive nearest-neighbor classifier are economy (it does not store every training point) and evaluation speed (it does not have to search through every training point to classify a new input).

In general, only hyperplanes that separate differently labeled cells need to be realized by hidden nodes. In some cases a single hidden node can stand in for several hyperplanes of the Voronoi diagram, so the resulting network can be relatively small. The network may not be minimal, though, because the algorithm does not always recognize when a single hidden node could do the job of several partition hyperplanes. In figure 12.7, for example, there are 17 planes separating the 0s and 1s, but as few as three or four hidden nodes would probably suffice.
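To make the first point concrete, the necessary hyperplanes can be read off the diagram by comparing the labels of the two cells each ridge separates. The short sketch below counts them for the same invented points used earlier; it illustrates the pruning criterion only, not the further merging of nearly aligned hyperplanes into a single hidden node.

    import numpy as np
    from scipy.spatial import Voronoi

    # Same invented base points and labels as in the earlier sketches.
    points = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1], [1.2, 1.0], [2.0, 0.5]])
    labels = np.array([0, 1, 0, 1, 1])

    vor = Voronoi(points)
    ridges = vor.ridge_points                      # neighbor pairs (i, j)
    needed = [(i, j) for i, j in ridges if labels[i] != labels[j]]

    print(len(ridges), "bounding hyperplanes in the full diagram")
    print(len(needed), "separate differently labeled cells and must be realized")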