many kinds of information faster.
Neural networks are hot, but they're
far from perfected. At this juncture
several commercial products show promise, but even the best of these will no
doubt change (for the better) as we learn
more about our brains and more about
computers.
But enough general introduction. Let's
create an NNT, beginning with a few
definitions for the back-propagation
model, the architecture we'll use.
The Back-Propagation Model
Although a back-propagation neural
network model is generally considered a
good one, it has no "standard" definition. Neural network experts agree
mostly on the algorithms they use to describe network training and operation,
but don't agree on how these algorithms
should be implemented.
We'll try to simplify our implementation by describing the back-propagation
(BP) model (as we understand it) in some
detail. We'll examine each of its elements
and describe how these elements combine to form the BP topology.
Figure 1 (remember, you looked at it
before) shows a simple, three-layer BP
model.
• Each circle represents a processing element (PE).
• Each arrow represents an interconnection (synapse) and its associated weight.
• The PEs with the letter "b" inside
are called bias nodes.
In this article (and our book), subscripted lower case letters represent the
attributes of individual PEs (nodes) and
of individual connections.
• the letter "i" represents an input,
• "o" represents an output,
• "w" represents a connection
weight,
• "n" represents the number of
nodes in a layer.
• The subscripts "i,j,l" refer to the
input, hidden, and output layers,
respectively.
(If you use more than one hidden
layer, the subscript "k" represents it.)
For example, "ii" is the input to an
input layer PE, "oj" is the output of a
hidden layer PE, and "nl" is the number
of PEs in the output layer.
Bold lower case letters represent vectors. For example, "ii" represents the
input vector to the input layer, made up
of all the individual inputs. "ol" represents the output vector of the output
layer.

MICRO CORNUCOPIA, #51, Jan-Feb 1990
We often work with a combination of
an input vector and its associated output
vector. This combination of an input and
its associated output comprises a "pattern vector," represented by a "p."
We list the input part first, then the
output. Typically we divide all our patterns into two categories or sets: a training set and a testing set. The subscripts
"r,s" are associated with training and
testing, respectively.
So, for example, "pr" is a training pattern and "ps" is a testing pattern. In both
cases, in the representation of the vector
(for example, in the pattern files for the
NNT), the output components follow the
input components.
Connection weights require two subscripts that represent the sending and receiving layers. For example, the weight of
the connection from an input PE to a hidden PE is "wji." Note that the receiving
PE layer is the first subscript, and the
sending PE layer is the second. While
this may seem somewhat counter-intuitive to you, it's the generally accepted
way to represent weights, and corresponds to the matrix notation that is
sometimes used to represent weights.
We represent matrices by bold capital
letters. For example, "Wji" represents the
matrix of connection weights to the hidden layer (from the input layer).
Don't despair (if you're despairing);
we'll use vectors and matrix notation
sparingly.
We'll also use three coefficients later:
• the learning coefficient, eta (η),
• the momentum factor, alpha (α),
• and the error term, delta (δ).
Later, we'll describe each of the network elements. In Part 2 we'll describe
the operation and training of the BP network of Figure 1, step by step. Let's examine now how we present input to the
network.
Network Input
On the left of Figure 1, note the inputs
entering the network via the input layer
(a layer of processing nodes). These inputs can be a set of raw data, a set of
parameters, or whatever we've chosen to
represent as a pattern.
For our BP NNT, the value of each
input can take on any value between 0
and 1. That is, the input values are
analog and are normalized between the
values 0 and 1. The fact that we can use
analog inputs adds significant flexibility
to our NNT.
Does the normalization between 0 and
1 constrain us in any way? Probably not,
at least not usually. Whenever we deal
with a real-life computer system that's receiving input, we're always limited to
some extent by the size of the number we
can put in. As long as the resolution of
our input data doesn't get lost in the normalization process, we're all right.
In our implementation of the BP
NNT, we use standard floating point
variables, called "float" in C. Floats are
32 bits long: 1 sign bit, 8 exponent bits,
and 23 mantissa bits (24 significant bits,
counting the implicit leading bit).
We therefore have a resolution of
about 1 part in 16 million, or, stated
another way, resolution to 7 decimal
places. If your data has 7 significant
figures or less, you'll be okay. We
haven't found this at all limiting. Even
input data from a 16-bit analog to digital
(A/D) converter requires a little less than
5 digits of resolution. Most of the applications we've seen require 3 to 5 digits of
resolution.
Normalizing our input patterns can
provide us with a tool for preprocessing
our data in various ways. You can normalize all the "n" inputs together, normalize each
channel separately, or normalize groups
of channels in some way that makes
sense. In some cases, the way you choose
to normalize the inputs can affect the
performance of the NNT, so this is one
place you can really experiment.