many kinds of information faster. Neural networks are hot, but they're far from perfected. At this juncture several commercial products show promise, but even the best of these will no doubt change (for the better) as we learn more about our brains and more about computers. But enough general introduction. Let's create an NNT, beginning with a few definitions for the back-propagation model, the architecture we'll use.

The Back-Propagation Model

Although a back-propagation neural network model is generally considered a good one, it has no "standard" definition. Neural network experts agree mostly on the algorithms they use to describe network training and operation, but don't agree on how these algorithms should be implemented. We'll try to simplify our implementation by describing the back-propagation (BP) model (as we understand it) in some detail. We'll examine each of its elements and describe how these elements combine to form the BP topology.

Figure 1 (remember you looked before) shows a simple, three-layer, BP model.

• Each circle represents a processing element (PE).
• Each arrow represents an interconnection (synapse) and its associated weight.
• The PEs with the letter "b" inside are called bias nodes.

In this article (and our book), subscripted lower case letters represent the attributes of individual PEs (nodes) and of individual connections:

• the letter "i" represents an input,
• "o" represents an output,
• "w" represents a connection weight,
• "n" represents the number of nodes in a layer.
• The subscripts "i,j,l" refer to the input, hidden, and output layers, respectively. (If you use more than one hidden layer, the subscript "k" represents it.)

For example, "ii" is the input to an input layer PE, "oj" is the output of a hidden layer PE, and "nl" is the number of PEs in the output layer.

Bold lower case letters represent vectors. For example, "ii" represents the input vector to the input layer, made up of all the individual inputs.
"ol" represents the output vector of the output layer.

MICRO CORNUCOPIA, #51, Jan-Feb 1990

We often work with a combination of an input vector and its associated output vector. This combination of an input and its associated output comprises a "pattern vector," represented by a "p." We list the input part first, then the output. Typically we divide all our patterns into two categories or sets: a training set and a testing set. The subscripts "r,s" are associated with training and testing, respectively. So, for example, "pr" is a training pattern and "ps" is a testing pattern. In both cases, in the representation of the vector (for example, in the pattern files for the NNT), the output components follow the input components.

Connection weights require two subscripts that represent the sending and receiving layers. For example, the weight of the connection from an input PE to a hidden PE is "wji." Note that the receiving PE layer is the first subscript, and the sending PE layer is the second. While this may seem somewhat counter-intuitive to you, it's the generally accepted way to represent weights, and corresponds to the matrix notation which sometimes represents weights. We represent matrices by bold capital letters. For example, "Wji" represents the matrix of connection weights to the hidden layer (from the input layer). Don't despair (if you're despairing); we'll use vector and matrix notation sparingly.

We'll also use three coefficients later:

• the learning coefficient, eta (η),
• the momentum factor, alpha (α),
• and the error term, delta (δ).

Later, we'll describe each of the network elements. In Part 2 we'll describe the operation and training of the BP network of Figure 1, step by step. Let's examine now how we present input to the network.
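To make the notation concrete, here's a minimal sketch of how the three-layer topology of Figure 1 might be declared in C. The names and layer sizes are our own illustration, not a listing from the NNT itself; the point is that the subscript convention carries straight over to array indexing, with the receiving layer as the first index.

```c
#define N_INPUT   4   /* n sub i: PEs in the input layer  (example size) */
#define N_HIDDEN  5   /* n sub j: PEs in the hidden layer (example size) */
#define N_OUTPUT  2   /* n sub l: PEs in the output layer (example size) */

/* Weight subscripts follow the article's convention: receiving layer
   first, sending layer second.  So wji[j][i] is the weight of the
   connection from input PE i to hidden PE j (the matrix Wji). */
typedef struct {
    float ii[N_INPUT];              /* input vector  ii               */
    float oj[N_HIDDEN];             /* hidden-layer output vector oj  */
    float ol[N_OUTPUT];             /* output vector ol               */
    float wji[N_HIDDEN][N_INPUT];   /* input  -> hidden weights, Wji  */
    float wlj[N_OUTPUT][N_HIDDEN];  /* hidden -> output weights, Wlj  */
    float bj[N_HIDDEN];             /* bias-node weights, hidden layer*/
    float bl[N_OUTPUT];             /* bias-node weights, output layer*/
    float eta;                      /* learning coefficient, eta      */
    float alpha;                    /* momentum factor, alpha         */
} BpNet;
```

Keeping the bias weights in their own arrays mirrors the "b" nodes in Figure 1, which always output a constant and so need only a weight per receiving PE.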
Network Input

On the left of Figure 1, note the inputs entering the network via the input layer (a layer of processing nodes). These inputs can be a set of raw data, a set of parameters, or whatever we've chosen to represent as a pattern. For our BP NNT, the value of each input can take on any value between 0 and 1. That is, the input values are analog and are normalized between the values 0 and 1. The fact that we can use analog inputs adds significant flexibility to our NNT.

Does the normalization between 0 and 1 constrain us in any way? Probably not, at least not usually. Whenever we deal with a real-life computer system that's receiving input, we're always limited to some extent by the size of the number we can put in. As long as the resolution of our input data doesn't get lost in the normalization process, we're all right.

In our implementation of the BP NNT, we use standard floating point variables, called "float" in C. Floats are 32 bits long, with 24 bits of significance (one of them implied) and 8 bits for the exponent. We therefore have a resolution of about 1 part in 16 million, or, stated another way, resolution to about 7 decimal places. If your data has 7 significant figures or less, you'll be okay. We haven't found this at all limiting. Even input data from a 16-bit analog to digital (A/D) converter requires a little less than 5 digits of resolution. Most of the applications we've seen require 3 to 5 digits of resolution.

Normalizing our input patterns can provide us with a tool for preprocessing our data in various ways. You can normalize the data by considering all the "n" inputs together, normalize each channel separately, or normalize groups of channels in some way that makes sense. In some cases, the way you choose to normalize the inputs can affect the performance of the NNT, so this is one place you can really experiment.