Object-oriented implementation of the backpropagation algorithm.The Error-Backpropagation (or simply, backpropagation back·prop·a·ga·tion n. A common method of training a neural net in which the initial system output is compared to the desired output, and the system is adjusted until the difference between the two is minimized. ) algorithm is the most important algorithm for the supervised training of multilayer feed-forward Artificial Neural Networks (artificial intelligence) artificial neural network - (ANN, commonly just "neural network" or "neural net") A network of many very simple processors ("units" or "neurons"), each possibly having a (small amount of) local memory. (ANN). It derives its name from the fact that error signals are propagated backward through the network on a layer-by-layer basis. In this article we use an object-oriented approach for implementing the backpropagation algorithm. ********** The backpropagation algorithm is based on the selection of a suitable error function or cost function, whose values are determined by the actual and desired outputs of the network, and which is also dependent on the network parameters such as the weights and the thresholds (Jacek, 1995). The basic idea is that the cost function has a particular surface over the weigh space and therefore an iterative it·er·a·tive adj. 1. Characterized by or involving repetition, recurrence, reiteration, or repetitiousness. 2. Grammar Frequentative. Noun 1. minimization process such as the gradient descent Gradient descent is an optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or the approximate gradient) of the function at the current point. method can be used for its minimization. The method of gradient descent is based on the fact that, the gradient of a function always points in the direction of maximum increase of the function then, moving to the direction of the negative gradient induces a maximal max·i·mal adj. 1. Of, relating to, or consisting of a maximum. 2. Being the greatest or highest possible. "downhill" movement that will eventually reach the minimum of the function surface over its parameter space In generative art people talk about parameter space as the set of possible parameters for a generative system. In statistics one can study the distribution of a random variable. Several models exist, the most common one being the normal distribution (or Gaussian distribution). . This is a rigorous and well-established technique for minimization of functions and has probably been the main factor behind the success of back-propagation. A typical multi-layer feed-forward ANN is shown in Figure 1. This type of network is also known as a Multilayer Perceptron multilayer perceptron - A network composed of more than one layer of neurons, with some or all of the outputs of each layer connected to one or more of the inputs of another layer. (MLP (Meridian Lossless Packing) The compression technique used in DVD-Audio that provides the highest audio quality. It delivers two channels at 192 kHz with 24-bit samples or six channels at 96 kHz. ). The units (or nodes) of the network are nonlinear A system in which the output is not a uniform relationship to the input. nonlinear - (Scientific computation) A property of a system whose output is not proportional to its input. threshold units described by equations (1) and (2) and their activation function is given by equation (3): u = [summation summation n. the final argument of an attorney at the close of a trial in which he/she attempts to convince the judge and/or jury of the virtues of the client's case. (See: closing argument) over (N/i=0)] [w.sub.i][x.sub.i] (1) y = f([summation over (N/i=0)] [w.sub.i][x.sub.i]) (2) f(u) = 1/1 + exp exp abbr. 1. exponent 2. exponential (-[alpha]u) (3) Where [x.sub.1],[x.sub.2],...,[x.sub.n] are the input signals, [w.sub.1],[w.sub.2],...,[w.sub.n] are the synaptic synaptic /syn·ap·tic/ (si-nap´tik) 1. pertaining to or affecting a synapse. 2. pertaining to synapsis. syn·ap·tic adj. Of or relating to synapsis or a synapse. weights, u is the activation potential of the neuron neuron, specialized cell in animals that, as a unit of the nervous system, carries information by receiving and transmitting electrical impulses. neuron or nerve cell Any of the cells of the nervous system. and [alpha] is the slope parameter of (3). The units are arranged in layers and each unit in a layer has all its inputs connected to the units of a preceding layer (or to the inputs from the external world in the case of the units in the first layer), but it does not have any connections to units of the same layer to which it belongs. The layers are arrayed one succeeding the other so that there is an input layer, multiple intermediate layers and finally an output layer. Intermediate layers, that is those that have no inputs or outputs to the external world, are called hidden layers. Figure 1 shows a MLP with only one hidden layer. Backpropagation neural networks neural network or neural computing, computer architecture modeled upon the human brain's interconnected system of neurons. Neural networks imitate the brain's ability to sort out patterns and learn from trial and error, discerning and extracting are usually fully connected. This means that each unit is connected to every output from the preceding layer (or to every input from the external world if the unit is in the first layer). Generally, the input layer is considered as just a distributor of the signals from the external world and is not therefore counted as a layer (Lefteri & Robert, 1997). The backpropagation training consists of two passes of computation: a forward pass and a backward pass (Jacek, 1995). In the forward pass an input pattern vector is applied to the sensory nodes of the network that is, to the units in the input layer. The signals from the input layer propagate prop·a·gate v. 1. To cause an organism to multiply or breed. 2. To breed offspring. 3. To transmit characteristics from one generation to another. 4. to the units in the first layer and each unit produces an output according to according to prep. 1. As stated or indicated by; on the authority of: according to historians. 2. In keeping with: according to instructions. 3. equation (2). The outputs of these units are propagated to units in subsequent layers and this process continues until, finally, the signals reach the output layer where the actual response of the network to the input vector is obtained. During the forward pass the synaptic weights of the network are fixed. During the backward pass, on the other hand, the synaptic weights are all adjusted in accordance with an error signal which is propagated backward through the network against the direction of synaptic connections. THE MATHEMATICAL ANALYSIS Analysis has its beginnings in the rigorous formulation of calculus. It is the branch of mathematics most explicitly concerned with the notion of a limit, whether the limit of a sequence or the limit of a function. OF THE BACKPROPAGATION ALGORITHM In the forward pass of computation, given an input pattern vector [y.sup.(p)], each hidden node j receives a net input: [x.sup.(p).sub.j] = [summation over (k)] [w.sub.jk] [y.sup.(p).sub.k] (4) Where [w.sub.jk] represents the weight between hidden node j and input node k, thus node j produces an output: [y.sup.(p).sub.j] = f([x.sup.p.sub.j]) = f([summation over (k)] [w.sub.jk] [y.sup.(p).sub.k]) (5) Each output node i thus receives: [x.sup.(p).sub.i] = [summation over (j)] [w.sub.ij] [y.sup.(p).sub.j] = [summation over (j)] [w.sub.ij] f([summation over (k)] [w.sub.jk] [y.sup.(p).sub.k]) (6) Where wij represents the weight between output node j and hidden node j. Hence it produces for the final output: [y.sup.(p).sub.i] = f([x.sup.(p).sub.i]) = f([summation over (j)] [w.sub.ij] [y.sup.(p).sub.j]) = f([summation over (j)] [w.sub.ij] f([summation over (k)] [w.sub.jk] [y.sup.(p).sub.k])) (7) The backpropagation algorithm can be implemented in two different modes: on-line mode and batch mode. In the on-line mode the error function is calculated after the presentation of each input pattern and the error signal is propagated back through the network modifying the weights before the presentation of the next pattern. This error function is usually the Mean Square Error (MSE MSE Mouse (computer) MSE Materials Science & Engineering MSE Mean Squared Error MSE Mean Square Error MSE Master of Science in Engineering MSE Manufacturing Systems Engineering MSE Mechanically Stabilized Earth ) of the difference between the desired and the actual responses of the network over all the output units. Then the new weights remain fixed and a new pattern is presented to the network and this process continuous until all the patterns have been presented to the network. The presentation of all the patterns is usually called one epoch or one iteration One repetition of a sequence of instructions or events. For example, in a program loop, one iteration is once through the instructions in the loop. See iterative development. (programming) iteration - Repetition of a sequence of instructions. . In practice many epochs are needed before the error becomes acceptably small. In the batch mode the error signal is calculated for each input pattern and the weights are modified ever time the input pattern has been presented. Then the error function is calculated as the sum of the individual MSE errors for each pattern and the weights are accordingly modified (all in a single step for all the patterns) before the next iteration. Thus, in the batch mode, the error or cost function calculated as the MSE over all output units i and over all patterns p is given by: [MATHEMATICAL EXPRESSION A group of characters or symbols representing a quantity or an operation. See arithmetic expression. NOT REPRODUCIBLE IN ASCII ASCII or American Standard Code for Information Interchange, a set of codes used to represent letters, numbers, a few symbols, and control characters. Originally designed for teletype operations, it has found wide application in computers. ] (8) Clearly E is a differentiable dif·fer·en·tia·ble adj. 1. That can be differentiated: differentiable species. 2. Mathematics Possessing a derivative. function of all the weights and therefore we can apply the method of gradient descent. For the hidden-to-output connections the gradient descent rule gives: [DELTA][w.sub.ij] = -[eta] [partial]E/[partial][w.sub.ij] (9) Where h is a constant that determines the rate of learning; it is called the learning rate of the backpropagation algorithm. Using the chain rule we have: [partial]E/[partial][w.sub.ij] = [partial]E/[partial][y.sup.(p).sub.i] * [partial][y.sup.(p).sub.i]/[partial][w.sub.ij] [partial][y.sup.(p).sub.i]/[partial][w.sub.ij] = [partial][y.sup.(p).sub.i]/[partial][x.sup.(p).sub.i] * [partial][x.sup.(p).sub.i]/[partial][w.sub.ij] (10) Giving: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (11) Thus: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (12) Where: [[delta].sup.(p).sub.i] = ([d.sup.(p).sub.i] - [y.sup.(p).sub.i])f'([x.sup.(p).sub.i]) (13) For the input-to-hidden connections the gradient descent rule gives: [DELTA][w.sub.jk] = -[eta][partial]E/[partial][w.sub.jk] (14) Using the chain rule we obtain: [partial]E/[partial][w.sub.jk] = [partial]E/[partial][y.sup.(p).sub.j]*[partial][y.sup.(p).sub.j]/[par tial][w.sub.jk] [partial][y.sup.(p).sub.j]/[partial][w.sub.jk] = [partial][y.sup.(p).sub.j]/[partial][x.sup.(p).sub.j]*[partial][x.sup .(p).sub.j]/[partial][w.sub.jk] (15) Giving: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (16) Now for [partial]E/[partial][y.sub.j] we have: [partial]E/[partial][y.sup.(p).sub.j] = -[summation over (p)] [summation over (i)]([d.sup.(p).sub.i] - [y.sup.(p).sub.i]) [partial]f([x.sup.p.sub.i])/[partial][y.sup.(p).sub.j] = -[summation over (p)] [summation over (i)] ([d.sup.(p).sub.i] - [y.sup.(p).sub.i]) [partial]f([x.sup.p.sub.i])/[partial][x.sup.(p).sub.i]*[partial][x.su p.(p).sub.i]/[partial][y.sup.(p).sub.j] = -[summation over (p)] [summation over (i)] ([d.sup.(p).sub.i] - [y.sup.(p).sub.i])f'([x.sup.(p).sub.i])[w.sub.ij] (17) Thus: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (18) Which gives: [DELTA][w.sub.jk] = [eta] [summation over (p)] [summation over (i)] ([d.sup.(p).sub.i] - [y.sup.(p).sub.i])f'([x.sup.(p).sub.i])[w.sub.ij]f'([x.sup.(p).sub.j] )[y.sup.(p).sub.k] = [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (19) With: [[delta].sup.(p).sub.j] = f'([x.sup.(p).sub.j]) [summation over (i)] [[delta].sup.(p).sub.i] [w.sub.ij] (20) From equation (11) and (18) we can see that if the activation function was not differentiable then we would be unable to implement the gradient-descent rule, as it would be impossible to calculate the partial derivatives partial derivative In differential calculus, the derivative of a function of several variables with respect to change in just one of its variables. Partial derivatives are useful in analyzing surfaces for maximum and minimum points and give rise to partial differential of E with respect to the weights. It is for this reason that differentiability of the activation function is so important in backpropagation learning. Note also that the derivative of the sigmoid function “S-curve” redirects here. For the recording company, see S-Curve Records. A sigmoid function is a mathematical function that produces a sigmoid curve — a curve having an "S" shape. is very easy to compute since: f'(u) = f (u) [1-f (u)] (21) This means that it is not necessary to compute f' (u) separately once we have found f(u). This is one of the advantages of the sigmoid function as computation time In computational complexity theory, computation time is a measure of how many steps are used by some abstract machine in a particular computation. For any given model of abstract machine, the computation time used by that abstract machine is a computational resource which can be can reduce significantly. Although we have written the update rules for the batch mode of training it is clear that the weight updates in the online case would be given again by equations (12) and (19) without of course the summation over the training patterns. Note that equation (19) has the same form as equation (12) but with a different definition of the [delta]'s. In general, with an arbitrary number of layers, the backpropagation update rule always has the form: [DELTA][w.sub.lm] = [eta] * [[delta].sub.l] * [y.sub.m] (23) Where [DELTA][w.sub.lm] is the weight correction, [eta] is the learning rate, is the local gradient, and [y.sub.m] is the input signal of node. This general rule for the adaptation of the weights is also known as generalized delta rule The delta rule is a gradient descent learning rule for updating the weights of the artificial neurons in a single-layer perceptron. For a neuron with activation function .
IMPLEMENTATION OF THE BACKPROPAGATION ALGORITHM We have used C++ as object-oriented programming language object-oriented programming language - object-oriented programming for implementing the backpropagation algorithm. Object-oriented programming object-oriented programming, a modular approach to computer program (software) design. Each module, or object, combines data and procedures (sequences of instructions) that act on the data; in traditional, or procedural, programming the data are separated from the is a programming paradigm A programming paradigm is a fundamental style of programming regarding how solutions to problems are to be formulated in a programming language. (Compare with a methodology, which is a style of solving specific software engineering problems). where a program is made up of a collection of independent "objects" which perform the actions of a program by interacting with each other (Lafore, 1999). An object is typically a collection of related data and routines. Our implementation makes extensive use of two important principles of OOP See object-oriented programming. OOP - object-oriented programming : encapsulation (1) In object technology, the creation of self-contained modules that contain both the data and the processing. See object-oriented programming. (2) The transmission of one network protocol within another. and abstraction (Mohamed & Laitinen, 1998). Encapsulation describes the notion that all of the inner working of an object is hidden from other objects. OOP allows for information hiding Keeping details of a software routine (function or object) private. Programmers only know what input is required and what outputs are expected. See encapsulation and abstraction. , which means that the inner working of an object can literally be protected from other objects. An object can be thought of as a "black box," meaning that other objects do not know and don't need to know how the duty of the object is being performed. By providing an "interface" to the outside world, an object describes how it is to be communicated with. This separation of the inner working from the interface also allows for abstraction. Abstraction is a programming technique where the essential behavior of an object is separated from its specific purpose. This allows for efficiency when dealing with similar objects because common behavior can be shared through "inheritance." Another benefit of OOP is that, once created, objects are easy to reuse reuse - Using code developed for one application program in another application. Traditionally achieved using program libraries. Object-oriented programming offers reusability of code via its techniques of inheritance and genericity. for other applications. Our implementation of the back-propagation algorithm has the following design objectives: * allow the user to specify the number and size of all layers; * allow the use of one or more hidden layers; * be able to save and restore the state of the network; * display key information at the end of the simulation; and * drawing the network as specified by the user. C++ Classes and Class Hierarchy (programming) class hierarchy - A set of classes and their interrelationships. One class may be a specialisation (a "subclass" or "derived class") of another which is one of its "superclasses" or "base classes". This section describes, in brief, the C++ classes of our system that implement the backpropagation algorithm. In this system we use a class hierarchy with the inheritance feature. Also, we use polymorphism polymorphism, of minerals, property of crystallizing in two or more distinct forms. Calcium carbonate is dimorphous (two forms), crystallizing as calcite or aragonite. Titanium dioxide is trimorphous; its three forms are brookite, anatase (or octahedrite), and rutile. with dynamic binding and function overloading In programming, using the same name for two or more functions. The compiler determines which function to use based on the type of function, arguments passed to it and type of values returned. with static binding. First let us look at the class hierarchy used for this program (Figure 2 shows the major classes using UML (Unified Modeling Language) An object-oriented analysis and design language from the Object Management Group (OMG). Many design methodologies for describing object-oriented systems were developed in the late 1980s. notation) (Grady, James, & Ivar, 1999). An abstract class is a class that is never meant to be instantiated as an object, but serves as a base class from which others can inherit To receive property according to the state laws of intestate succession from a decedent who has failed to execute a valid will, or, where the term is applied in a more general sense, to receive the property of a decedent by will. inherit v. functionality and interface definitions. The CBaseLayer class is such a class. One of its functions is set = zero (see appendix A), which indicates that this class is an abstract base class. From the CBaseLayer class are two branches. One is the CInputLayer class, and the other is the COutputClass. The middle layer class CMiddleLayer is very much like output layer in function and so inherits from the COutputClass class. The CNNtwork class is used to set up communication channels between layers and to feed and remove data from the network. The CNNtwork class performs the interconnection of layers by setting the pointer of an input array of a given layer to output array of previous layer. Another connection that the CNNtwork class is responsible for is setting the pointer of an output_error array to the back_error array of the next layer (remember, errors flow in reverse, and the back_error array in the output error of the layer reflected at its inputs). The CNNtwork class stores an array of pointers to layers and an array of layer sizes for all the layers defined. These layer objects and arrays are dynamically allocated on the heap with the New and Delete functions in C++. Details in Implementation of the Backpropagation Algorithm Function overloading can be seen in the definition of the calc_error() function in the CBaseLayer class (see appendix A). It is used in the CInputLayer with no parameters, while it is used in the COutputClass (which the CInputLayer inherits from) with one parameter. Using the same function name is not a problem, and this is referred to as overloading In programming, the ability to use the same name for more than one variable or procedure, requiring the compiler to differentiate them based on context. (language) overloading - (Or "Operator overloading"). . Besides function overloading, you may also have operator overloading In programming, the ability to use the same operator to perform different operations. For example, arithmetic operators such as +, -, * and / could be defined to perform differently on certain kinds of data. operator overloading - overloading , which is using an operator that performs some familiar function like + for addition, for another function, say, vector addition (Geom.) that kind of addition of two lines, or vectors, AB and BC, by which their sum is regarded as the line, or vector, AC. See also: Addition . When you have overloading with the same parameters and keyword virtual, then you have the potential for dynamic binding, which means that you determine which overloading function to execute at run time and not at compile time The time it takes to translate a program from source language into machine language. Linker time may also be included in compile time. See compile and linker. (programming) compile time . Compile time binding is referred to as static binding. If you put a bunch of C++ objects in an array of pointers to the base class, and then go through a loop that indexes each pointer and executes an overloading virtual function that pointer is pointing to, then you will be using dynamic binding. This is exactly the case in the function calc_out(), which is declared with the virtual keyword in the CBaseLayer base class. Each descendant of CBaseLayer can provide a version of calc_ont(), which differs in functionality from the base class, and the correct function will be selected at run time based on the object's identity. In this case caic_out(), which is a function to calculate the outputs for each layer, is different for in input layer than for the other two types of layers. Note the following definitions in the CBaseLayer base class: int num_inputs; int num_outputs; [float.sup.*] output; // pointer to array of outputs [float.sup.*] inputs;//pointer to array of inputs, which //are outputs of some other layer friend CNNtwork; There are also two pointers to arrays of floats in this class. They are the pointers to the outputs in a given layer and the inputs to a given layer. To get a better idea of what a layer encompasses, Figure 3 shows you a small feed forward backpropagation network, with a dotted line that shows you the three layers of the network. A layer contains neurons Neurons Nerve cells in the brain, brain stem, and spinal cord that connect the nervous system and the muscles. Mentioned in: Speech Disorders and weights. The layer is responsible for calculating its output (calc out()), stored in the float [outputs.sup.*] array, and errors (calc_error()) for each of its respective neurons. The errors are stored in another array called [float.sup.*] output_errors defined in the COutputLayer class. Note that the CInputLayer class does not have any weights associated with it and therefore is a special case. It does not need to provide any data members or function members related to errors or backpropagation. The only purpose of the input layer is to store data to be forward propagated to the next layer. With the output layer, there are a few more arrays present. First, for storing backpropagation errors, there is an array called [float.sup.*] back_errors. There is a weights array called [float.sup.*] weights, and finally, for storing the expected values Expected value The weighted average of a probability distribution. Also known as the mean value. that initiate the error calculation process, there is an array called [flaot.sup.*] expected_values. Note that the middle layer needs almost all of these arrays and inherits them by being a derived class (programming) derived class - (Or "subclass") In object-oriented programming, a class that is derived from a base class by inheritance. The derived class contains all the features of the base class, but may have new features added or redefine existing features. of the COutputLayer class. SYSTEM MODES There are two modes of operation in our system: Training Mode and Testing Mode. Training Mode: In this mode the user provides the system with the following information: * The training file in the current directory is called training.txt. This file contains exemplar ex·em·plar n. 1. One that is worthy of imitation; a model. See Synonyms at ideal. 2. One that is typical or representative; an example. 3. An ideal that serves as a pattern; an archetype. 4. pairs or patterns. Each pattern has a set of inputs followed by a set of outputs. Each value is separated by one or more spaces. Figure 4 shows an example of a training.txt file See ASCII file. . * the values for the error tolerance and the learning rate; * the maximum number of cycles, or passes through the training data; * the number of layers (the training mode layout is shown in Figure 5); and * the size for each layer, from the input to the output. After preparing previous information the system begins training and reports the current cycle number and the error for each cycle. Also it may watch the error to see that it is on the whole, decreasing with time, if it is not, then the system gives a chance to restart To resume computer operation after a planned or unplanned termination. See boot, warm boot and checkpoint/restart. the training because this will start with a brand new set of random weights and another, possibly better, solution. Once the training is done the system gives information about the number of cycles and patterns used and the total error. The system can draw the network under training in two cases, without training data and with training data, as shown in Figures 6 and 7 respectively. Another file that is used in training is the weights file. Once the system reaches the error tolerance that is specified by the user, or the maximum number of iterations, the system saves the state of the network, by saving all of its weight in a file called weight.txt. This file can then be used subsequently in another run of the system in testing mode. In addition, the output generated by the network is provided in an output file called output.txt. Testing Mode In this mode, the user provides test data to the system in a file called test.txt. This file contains only input patterns. When this file is applied to an already trained network, an output.txt file is generated, which contains the output from the network for all of the input patterns. The network goes through one cycle of operation in this mode, covering all the patterns in the test data file. To start up the network, the weight file, weight.txt is read to initialize To start anew, which typically involves clearing all or some part of memory or disk. the state of the network. The user must provide the same network size parameters used to train the network. CASE STUDY: CHARACTER RECOGNITION The problem presented in this section deals with recognizing or categorizing alphabetic characters. We will input various alphabetic characters to our system and train the network to recognize these as separate categories. Representing Characters A 5x7 grid of pixels represents each character. To represent the character A, for example, we use the pattern shown in figure 8. Here the blackened black·en v. black·ened, black·en·ing, black·ens v.tr. 1. To make black. 2. To sully or defame: a scandal that blackened the mayor's name. 3. boxes represent value 1, while blank boxes represent zeros. We can represent all characters this way, with a binary map of 35 pixel values. We wish to distinguish alphabetic characters by assigning them to different bins. We would apply the inputs and train the network with anticipated responses. Here is the input file that we used for distinguishing five different characters, A, X, H, B, and I as shown in Table 1. Each line has a 5x7-dot representation for each character. Now we need to name each of the output categories. We can assign a simple 3-bits representation as follows: A 000 x 010 H 100 B 101 I 111 Let's train the network to recognize these characters. The training.dat file (DATa file) A file that uses the .DAT extension. It is widely used for a variety of data content. See extension. looks as in Figure 9: Now we can start training the network with parameters (learning rate 0.1, tolerance 0.001 , and repetition count 1000) and with three layers of sizes 35 (input), 5 (middle), and 3 (output), we get after total cycles of 3322 a final error of 0.00164419. We note also that the tolerance specified is nearly met. The match for the last pattern is: For input vector: 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 1.000000 0.000000 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 0.000000 0.000000 0.000000 1.000000 1.000000 0.000000 0.000000 0.000000 1.000000 Output vector is: 0.979606 0.985861 0.989526 Expected output vector is: 1.00000 1.00000 1.00000 To see the outputs of all the patterns, we need to copy the training.txt file to the test.txt file without the expected output field and rerun re·run n. The act or an instance of rebroadcasting a recorded movie or a recorded television performance. tr.v. re·ran , re·run, re·run·ning, re·runs To present a rerun of. the system in test mode. Running the system in test mode shows the following result in output.txt file as shown in Figure 10. We noted that the training patterns are learned very well. If a smaller tolerance is used, it would be possible to complete the learning in fewer cycles. CONCLUSION In this article we have used an object-oriented approach for implementing one of the most powerful neural network algorithms called backpropagation algorithms. The algorithm uses the so-called generalized delta rule and trains the network with exemplar pairs of patterns. The main motivation for using object-oriented features as encapsulation and abstraction that we can have well packaged, reusable re·use tr.v. re·used, re·us·ing, re·us·es To use again, especially after salvaging or special treatment or processing. re·us , extensible, and reliable programs and program segments.
APPENDIX A
The header file for the CBaseLayer class hierarchy and the
network class CNNtwork:
#define MAX_LAYERS 5
class CNNtwork;
class CBaseLayer
{
protected:
int num_inputs;
int num_outputs;
[float.sup.*] outputs;//pointer to array of outputs
[float.sup.*] inputs;//pointer to array of inputs, which
//are outputs of some other layer
friend CNNtwork;
public:
virtual void calc_out()=0;
};
class ClnputLayer: public CBaseLayer
{
public:
ClnputLayer (int,int);
~ClnputLayer ();
virtual calc_out();
};
class CMiddleLayer;
class COutputLayer: public CBaseLayer
{
protected:
[float.sup.*] weights;
[float.sup.*] output_errors;//array of errors at output;
[float.sup.*] back_errors;//array of errors back-propagated
to inputs
[float.sup.*] expected_values;//to inputs
friend CNNtwork;
public:
COutputLayer (int,int);
~COutputLayer ();
virtual void calc_out();
void calc_error(float &);
void randomize_weights();
void update_weights(const float);
void list_weights();
void write_weights(int, [FILE.sup.*]);
void read_weights(int, [FILE.sup.*]);
void list_errors();
void list-outputs();
};
class CMiddleLayer: public COutputLayer
{
private:
CMiddleLayer (int,int);
~ CMiddleLayer ();
void calc_error();
};
class CNNtwork
{
private:
[CBaseLayer.sup.*] ptr[MAX_LAYERS];
int num_of_layers;
int layer_size[MAX_LAYERS];
[float.sup.*] buffer,
unsigned training;
public:
CNNtwork ();
~ CNNtwork ();
void set_training_value(const unsigned &);
unsigned get_training_value();
void get_layer_info();
void set_up_network();
void randomize_weights();
void update_weights(const [float.sup.*]);
void write_weights([FILE.sup.*]);
void read_weights([FILE.sup.*]);
void list_weights();
void write_outputs([FILE.sup.*]);
void list_outputs();
void list_errors();
void forward_prop();
void backward_prop(float &);
int fill_IObuffer([FILE.sup.*]);
void set_up_pattem(int);
};
Table 1
Characters Representation
Letter Representation
A 00100010101000110001111111000110001000
X 10001010100010000100001000101010001010
H 10001100011000111111100011000111111100
B 11111100011000111111100011000111111101
I 00100001000010000100001000010000100111
Acknowledgements I wish to express my sincere thanks, appreciation, and gratitude to Professor Samia Siad Al-Azab and Professor Farouk Aeob for their invaluable suggestions on this work. References Jacek, Z.M. (1995). Introduction to artificial neural systemsy. Boston: PWS See personal Web server. Publishing. Lefteri, T.H., & Robert, U.E. (1997). Fuzzy fuzz·y adj. fuzz·i·er, fuzz·i·est 1. Covered with fuzz. 2. Of or resembling fuzz. 3. Not clear; indistinct: a fuzzy recollection of past events. 4. and neural approaches in engineering. New York New York, state, United States New York, Middle Atlantic state of the United States. It is bordered by Vermont, Massachusetts, Connecticut, and the Atlantic Ocean (E), New Jersey and Pennsylvania (S), Lakes Erie and Ontario and the Canadian province of : Wiley. LaFore, R. (1999). Object-oriented programming in C++. Indianapolis, IN: Sams. Mohamed, R., & Laitinen, M. (1998). Transition to OO software development. New York: Wiley Grady, B., James, R., & Ivar, J. (1999). The unified modeling language See UML. (language) Unified Modeling Language - (UML) A non-proprietary, third generation modelling language. The Unified Modeling Language is an open method used to specify, visualise, construct and document the artifacts of an object-oriented software-intensive system user guide. Addison-Wesley. |
|
||||||||||||||||

with activation function
Printer friendly
Cite/link
Email
Feedback
Reader Opinion