## Abstract

We present an inverse technique to determine particle-size distributions by training a layered perceptron neural network with optical backscattering measurements at three wavelengths. An advantage of this approach is that, even though the training may take a long time, once the neural network is trained the inverse problem of obtaining size distributions can be solved speedily and efficiently.

© 1990 Optical Society of America

Previously, methods such as the smoothing, statistical, and Backus–Gilbert inversion techniques were used to find profiles of particle distributions.[1]–[4] The smoothing technique requires a judicious choice of two parameters that control the smoothness of the solution. Statistical inversion techniques require prior knowledge of the statistical properties of the unknown function and the measurement errors. The Backus–Gilbert technique requires a good compromise between the spread and the variance.

In this Letter we present an alternative method based on an artificial neural network technique. The neural network technique offers a different approach in that the network memorizes the experience gained by the training; even though the training may take a long time, once the training is done the inversion can be performed instantly. As an example, we use a simple inverse problem to find particle distribution by using single scattering. We use this example to demonstrate that this technique has a potential for more general multiple scattering problems.

We consider the inverse problem of finding the particle-size distribution from measurements of backscattered light on an optically thin medium containing particles. The first-order scattering approximation is used. The measured quantity is the backscattered intensity *β*(*λ _{i}*) at three different wavelengths *λ _{i}* (*i* = 1, 2, 3), and it is related to the size-distribution function *n*(*r*) by a Fredholm integral equation of the first kind,

*β*(*λ _{i}*) = ∫ *K*(*λ _{i}*, *m*, *r*) *n*(*r*) d*r*,   (1)

where *m* is the particle refractive index, *r* is the radius of the particle, and *K*(*λ _{i}*, *m*, *r*) is the backscattering cross section.[5] We assume that the particles are spherical so that the backscattering cross section can be computed by the Mie solution. The inversion problem is to find the distribution *n*(*r*) from the *β*(*λ _{i}*) measurements. In real-life remote-sensing applications, a large amount of data is collected continually, and it is important to develop a speedy inversion algorithm. The neural network technique presented in this Letter can perform speedy inversion once the neural network is trained.
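A minimal numerical sketch of this forward model may help. The Fredholm integral is discretized on a radius grid with trapezoidal quadrature; the kernel `toy_kernel` below is a hypothetical stand-in for the Mie backscattering cross section *K*(*λ _{i}*, *m*, *r*), which the paper computes exactly.

```python
import numpy as np

# Hypothetical stand-in for the Mie backscattering cross section K(lambda, m, r);
# the actual kernel in the paper is computed from the Mie solution.
def toy_kernel(lam, r):
    x = 2.0 * np.pi * r / lam              # size parameter
    return r**2 * x**4 / (1.0 + x**4)      # crude Rayleigh-to-geometric rollover

def forward_beta(lams, r, n_r):
    """Trapezoidal quadrature of the Fredholm integral at each wavelength."""
    out = []
    for lam in lams:
        f = toy_kernel(lam, r) * n_r
        out.append(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(r)))
    return np.array(out)

lams = np.array([0.53, 1.06, 2.12])        # wavelengths in micrometers
r = np.logspace(-2, np.log10(40.0), 31)    # 31 radii from 0.01 to 40 um
n_r = np.exp(-0.5 * (np.log(r) / np.log(2.0))**2)  # sample distribution, r_m = 1
beta = forward_beta(lams, r, n_r)
print(beta.shape)  # (3,)
```

The three quadrature sums play the role of the measured *β*(*λ _{i}*) triplet used as the network input.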

Here we utilize a layered perceptron neural network to determine particle-size distributions.[6]
The size-distribution function is assumed to be a log-normal distribution function, so that it is characterized by the mean radius *r _{m}* and the standard deviation *σ*. The inverse problem is to obtain *r _{m}* and *σ* for a given *β*(*λ _{i}*). Although *β*(*λ _{i}*) is linearly related to the distribution *n*(*r*), the relationship between the input *β*(*λ _{i}*) and the outputs *r _{m}* and *σ* is nonlinear. Earlier techniques such as the Backus–Gilbert technique can handle only linear inversion; it is also our purpose here to demonstrate that the neural network can solve a nonlinear inversion problem. The inverse problem treated by Kitamura,[6] by contrast, is to obtain *n*(*r*) at 31 points, so the relationship between input and output is linear. We also found that the *β*(*λ _{i}*)'s can become close together for some *r _{m}* and *σ*, resulting in a nonunique inverse solution. An algorithm is presented to find ranges for *r _{m}* and *σ* such that unique solutions can be obtained. Finally, we show that increasing the number of iterations in training the neural network causes the inverse solutions to converge toward the true values.

An artificial neural network can be defined as a highly connected array of elementary processors called neurons. Here we consider the multilayer perceptron (MLP) type of artificial neural network.[7]–[11]

As shown in Fig. 1, the MLP-type neural network consists of one input layer, one or more hidden layers, and one output layer. Each layer employs several neurons, and each neuron in a given layer is connected to the neurons in the adjacent layers with different weights. We use three input neurons [*β*(*λ*_{1}), *β*(*λ*_{2}), *β*(*λ*_{3})] and two output neurons (*r _{m}*, *σ*).
Signals pass from the input layer, through the hidden layers, to the
output layer. Except in the input layer, each neuron receives a signal
that is a linearly weighted sum of all the outputs from the neurons of
the former layer. The neuron then produces its output signal by passing
the summed signal through the sigmoid function 1/(1 + *e*^{−*x*}).

The backpropagation learning algorithm is employed for training the neural network. Basically this algorithm uses gradient descent to obtain the best estimates of the interconnection weights, and the weights are adjusted after every iteration. The iteration process stops when the gradient descent search reaches a minimum of the difference between the desired and actual outputs.[10],[11]
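The training loop described above can be sketched as follows. This is a minimal illustration of a sigmoid MLP trained by backpropagation with batch gradient descent, not the authors' code; the toy data, hidden-layer size, and learning rate are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# toy training data: 3 inputs (standing in for beta(lambda_i)) in [0, 1],
# 2 targets (standing in for normalized r_m, sigma) in [0, 1]
X = rng.random((40, 3))
T = np.column_stack([X.sum(axis=1) / 3.0, X[:, 0]])  # arbitrary smooth targets

W1 = rng.normal(0, 0.5, (3, 8)); b1 = np.zeros(8)    # input -> hidden weights
W2 = rng.normal(0, 0.5, (8, 2)); b2 = np.zeros(2)    # hidden -> output weights
lr = 0.5

for _ in range(2000):
    H = sigmoid(X @ W1 + b1)          # hidden activations
    Y = sigmoid(H @ W2 + b2)          # network outputs
    dY = (Y - T) * Y * (1 - Y)        # output-layer delta (sigmoid derivative)
    dH = (dY @ W2.T) * H * (1 - H)    # hidden-layer delta, backpropagated
    W2 -= lr * H.T @ dY / len(X); b2 -= lr * dY.mean(axis=0)
    W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(axis=0)

mse = np.mean((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - T) ** 2)
print(round(float(mse), 4))
```

The squared error decreases with the number of iterations, mirroring the convergence behavior reported for Fig. 3.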

We consider the backscattering of light from a volume distribution of spherical particles with 31 radii ranging from 0.01 to 40 *μ*m. We assume that the size-distribution function *n*(*r*) is governed by the log-normal function, so that it is characterized by two quantities: the mean radius *r _{m}* and the standard deviation *σ*.
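Assuming the standard log-normal form characterized by *r _{m}* and *σ* (the normalization constant *N* = 1 is a hypothetical choice, not specified above), the distribution can be evaluated as:

```python
import numpy as np

# Standard log-normal size distribution characterized by r_m and sigma;
# N = 1 is an assumed normalization.
def lognormal_n(r, r_m, sigma, N=1.0):
    return (N / (np.sqrt(2.0 * np.pi) * r * np.log(sigma))
            * np.exp(-(np.log(r) - np.log(r_m))**2 / (2.0 * np.log(sigma)**2)))

r = np.logspace(-2, np.log10(40.0), 2001)   # 0.01 to 40 um, as in the text
n = lognormal_n(r, r_m=1.0, sigma=2.0)

# the density integrates to ~N over the radius range (trapezoidal check)
area = np.sum(0.5 * (n[1:] + n[:-1]) * np.diff(r))
print(abs(area - 1.0) < 0.01)  # True
```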

First we conduct a study of the forward problem of finding *β*(*λ _{i}*) for various *r _{m}* and *σ*. Since the radius of the particles varies from 0.01 to 40 *μ*m, i.e., −2 ≤ log(*r*) ≤ 1.66, the ranges for *r _{m}* and *σ* are chosen so that −1 ≤ log(*r _{m}*) ≤ 0.44 and 0.03 ≤ log(*σ*) ≤ 1. Thus the actual particle sizes, ranging from *r _{m}*/*σ* to *r _{m}σ*, will lie within the range for *r*. The inverse problem of obtaining *r _{m}* and *σ* for a given *β*(*λ _{i}*) is generally nonunique if *r _{m}* and *σ* are allowed to vary over the above ranges. In what follows, we restrict the ranges for *r _{m}* and *σ* so that unique solutions can be obtained. The algorithm for finding such ranges is also discussed.

Both log(*r _{m}*) and log(*σ*) are divided into 10 intervals for generating the training and testing data. We chose the refractive index of the particle to be *m* = 1.53 − *j*0.008 and the wavelengths to be *λ*_{1} = 0.53 *μ*m, *λ*_{2} = 1.06 *μ*m, and *λ*_{3} = 2.12 *μ*m. The study of *β*(*λ _{i}*) reveals that for some values of log(*r _{m}*) and log(*σ*) the inputs are close to one another, resulting in nonunique solutions for *r _{m}* and *σ* with such *β*(*λ _{i}*). For a unique solution of *r _{m}* and *σ* for a given *β*(*λ _{i}*), the change in *β*(*λ _{i}*) for a given change of *r _{m}* and *σ* must be sufficiently large. Therefore we divide log(*r _{m}*) and log(*σ*) into a number of intervals and define the distance *D* as a measure of the separation of the *β*(*λ _{i}*)'s between adjacent grid points.
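One plausible separation measure between two *β*(*λ _{i}*) triplets on the (log *r _{m}*, log *σ*) grid is a Euclidean distance; this is an assumption for illustration, and the paper's exact definition of *D* may differ.

```python
import numpy as np

# Assumed separation measure between two beta(lambda_1..3) measurement vectors
# at adjacent grid points (Euclidean distance; an illustrative choice only).
def separation_D(beta_a, beta_b):
    return float(np.linalg.norm(np.asarray(beta_a) - np.asarray(beta_b)))

# two hypothetical neighboring triplets on the grid
d = separation_D([1.0e-8, 3.0e-8, 2.0e-8], [1.5e-8, 3.1e-8, 2.6e-8])
print(d > 0.0)  # True
```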

In order to ensure that the *β*(*λ _{i}*)'s are sufficiently separated, we require that *D* exceed a minimum distance *D _{m}*. To find *D _{m}*, we first notice that there is a large difference in magnitude between *β*(*λ _{i}*, *σ _{j}*, *r _{ml}*) and *β*(*λ _{i}*, *σ _{k}*, *r _{ml}*) for *k* > *j*. For instance, we have *β*(*λ _{i}*, *σ*_{1}, *r _{m}*_{1}) ∼ 10^{−15} and *β*(*λ _{i}*, *σ _{M}*, *r _{m}*_{1}) ∼ 10^{−6}. Thus *D _{m}* cannot be fixed for all *σ _{j}* but should vary according to *σ _{j}*. In addition, for the same *σ _{j}*, the value of *β*(*λ _{i}*, *σ _{j}*, *r _{ml}*) increases from *l* = 1 to *l* = *N*; the lowest value occurs when *l* = 1. Hence the minimum distance *D _{m}* is chosen proportional to the *β*(*λ _{i}*) obtained from the first mean radius *r _{m}*_{1}. Specifically,

*D _{m}* = *D*_{1}*β*(*λ _{i}*, *σ _{j}*, *r _{m}*_{1}),   (4)

where *D*_{1} is a constant. Thus *D _{m}* is a fixed quantity when *D*_{1} and *σ _{j}* are fixed. Therefore we can determine the allowable range of log(*r _{m}*) for that particular log(*σ _{j}*), i.e., the lower and upper bounds of log(*r _{m}*), by enforcing the requirement that *D* ≥ *D _{m}*. Similarly, for each log(*σ _{j}*), *j* = 1, 2, …, we computed the corresponding allowable range of log(*r _{m}*). From the diagram of all the allowable ranges for log(*r _{m}*), we can estimate the desired region for log(*r _{m}*) and log(*σ*).
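The range-finding procedure can be sketched as follows: for a fixed *σ*, scan the *r _{m}* grid and keep only those points whose *β* triplet is separated from its neighbor by at least a minimum distance scaled to the first mean radius, in the spirit of Eq. (4). The forward model `toy_beta`, the Euclidean distance, and the parameter values below are illustrative assumptions, not the paper's Mie computation.

```python
import numpy as np

def toy_beta(r_m, sigma, lams=(0.53, 1.06, 2.12)):
    # hypothetical smooth forward model standing in for the Mie computation
    return np.array([r_m**2 * np.exp(-(np.log(lam) - np.log(r_m))**2
                                     / (2.0 * np.log(sigma)**2)) for lam in lams])

def allowable_rm(rm_grid, sigma, D1):
    betas = [toy_beta(rm, sigma) for rm in rm_grid]
    D_m = D1 * np.linalg.norm(betas[0])      # minimum distance scaled to r_m1
    keep = [rm_grid[0]]
    for k in range(1, len(rm_grid)):
        # keep a grid point only if it is well separated from its neighbor
        if np.linalg.norm(betas[k] - betas[k - 1]) >= D_m:
            keep.append(rm_grid[k])
    return keep

rm_grid = np.logspace(-1, 0.44, 10)          # log(r_m) from -1 to 0.44, as in the text
kept = allowable_rm(rm_grid, sigma=2.0, D1=0.5)
print(len(kept) > 1)  # True
```

Repeating the scan for each *σ _{j}* traces out the allowable region in the (log *r _{m}*, log *σ*) plane.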

The constant *D*_{1} in Eq. (4) controls the size of the allowable region for log(*r _{m}*) and log(*σ*). A large value of *D*_{1} will generally create a small allowable region, but the values of *β*(*λ _{i}*) are well separated, and therefore unique sets of *β*(*λ _{i}*) can be obtained. On the other hand, a small value of *D*_{1} will create a large allowable region, but the sets of *β*(*λ _{i}*) are close to one another; unique sets of *β*(*λ _{i}*) are then difficult to obtain, resulting in a large percentage error in obtaining the unknown size distribution. Values of *D*_{1} ranging from 0.1 to 50 were tested, and *D*_{1} = 10 was found to be a good compromise between the percentage error and the size of the allowable region for log(*r _{m}*) and log(*σ*). With this value, the allowable region is found to be −0.328 ≤ log(*r _{m}*) ≤ 0.44 and 0.03 ≤ log(*σ*) ≤ 0.5, as shown in Fig. 2.

Based on the allowable region discussed above, a group of 480 sets of data was generated from Eq. (1). In order to maximize the computing accuracy of the neural network, we first normalize all the data from zero to unity. We use 462 data sets to train the neural network. The remaining 18 sets are used to test the system. Finally, as shown in Fig. 3, the results are converted back to the original values.
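The normalization step described above can be sketched as a per-channel min–max scaling with its inverse applied to the network outputs; this is a common choice, and the exact scaling used is not specified in the text.

```python
import numpy as np

# Per-channel min-max normalization to [0, 1] and its inverse (assumed scaling).
def fit_minmax(data):
    return data.min(axis=0), data.max(axis=0)

def normalize(data, lo, hi):
    return (data - lo) / (hi - lo)

def denormalize(scaled, lo, hi):
    return scaled * (hi - lo) + lo

# three hypothetical beta(lambda_i) measurement sets spanning many decades
X = np.array([[1e-15, 2e-12, 5e-10],
              [3e-14, 8e-12, 1e-9],
              [9e-13, 4e-11, 7e-9]])
lo, hi = fit_minmax(X)
Xn = normalize(X, lo, hi)
print(Xn.min(), Xn.max())            # 0.0 1.0
Xr = denormalize(Xn, lo, hi)
print(np.allclose(Xr, X))            # True
```

The inverse mapping corresponds to the final step of converting the network outputs back to physical values of *r _{m}* and *σ*.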

Figure 3 shows the performance of the neural network in terms of the absolute percentage error for log(*r _{m}*) and log(*σ*). It is clear from the figure that increasing the number of iterations causes the outputs to converge toward the true values and hence lowers the absolute percentage error. Except when the desired outputs log(*r _{m}*) and log(*σ*) are small, the neural network yields good results for most of the testing data, with an absolute percentage error of less than 10%.

In summary, we have presented an inverse technique for finding particle-size distributions by using optical sensing and a neural network. The size-distribution function is assumed to be a log-normal function, so that it is characterized by a mean radius *r _{m}* and a standard deviation *σ*. It was shown that the neural network yields good results for the testing data, with an absolute percentage error of less than 10% for most of the testing inputs *β*(*λ _{i}*). A major advantage of this technique is that, once the neural network is trained, the inverse problem of obtaining the size distributions can be solved speedily and efficiently (in a small fraction of a second) on a microcomputer workstation.

This research is partially supported by a grant-in-aid for scientific research from the Ministry of Education, Science, and Culture of Japan (Monbusho), the National Science Foundation, and NASA.

## References

**1. **D. L. Phillips, J. Assoc. Comput. Mach. **9**, 84 (1962). [CrossRef]

**2. **P. Edenhofer, J. N. Franklin, and C. H. Papas, IEEE Trans. Antennas Propag. **AP-21** (1973).

**3. **G. Backus and F. Gilbert, Phil. Trans. R. Soc. London Ser. A **266** (1970).

**4. **E. R. Westwater and A. Cohen, Appl. Opt. **12**, 1340 (1973). [CrossRef] [PubMed]

**5. **A. Ishimaru, *Wave Propagation and Scattering in Random Media* (Academic, New York, 1978), Vol. 2.

**6. **S.
Kitamura and P. Qing, “Neural network application to solve Fredholm
integral equations of the first kind,” presented at the International
Joint Conference on Neural Networks, Washington D.C., June 1989.

**7. **M.
El-Sharkawi, R. Marks II, M. E. Aggoune, D. C. Park, M. J. Damborg,
and L. Atlas, “Dynamic security assessment of power systems using back
error propagation artificial neural networks,” presented at the 2nd
Annual Symposium on Expert System Application to Power Systems, Seattle,
Wash., July, 1989.

**8. **L.
Atlas, R. Cole, Y. Muthusamy, A. Lippman, G. Connor, D. Park, M.
El-Sharkawi, and R. Marks II, “A performance comparison of trained
multilayer perceptrons and trained classification trees,” Proc. IEEE (to
be published).

**9. **D.
Park, M. El-Sharkawi, R. Marks II, L. Atlas, and M. Damborg, “Electric
load forecasting using artificial neural networks,” IEEE Trans. Power
Syst. (to be published).

**10. **D. E. Rumelhart and J. L. McClelland, eds., *Parallel Distributed Processing* (MIT Press, Cambridge, Mass., 1986).

**11. **R. P. Lippmann, IEEE Trans. Acoust. Speech Signal Process. **ASSP-4**, 4 (1987).