Herself’s Artificial Intelligence

Humans, meet your replacements.

Neural Networks

Neural nets are good at what computers traditionally do not do well: pattern recognition. They are good for sorting data, classifying information, speech recognition, diagnosis, and predicting non-linear phenomena. Neural nets are not programmed; they learn from examples, either with or without supervised feedback.

Modeled after the human brain, they give more weight to connections that are used frequently and reduce the weight of connections that are not. Some neural nets must be supervised while learning: they are given data to sort and feedback as to whether the data was sorted correctly. Feedforward backpropagation networks are the best understood and most successful of these. Others, such as self-organizing networks, figure things out for themselves.

If a neural net is too large it will memorize rather than learn. Neural nets are usually composed of three layers: input, hidden, and output. More layers can be added, but usually little is gained from doing so. The connections vary by network type. Some nets have connections from each node in one layer to the next, some have backward connections to the previous layer, and some have connections within the same layer.

McCulloch and Pitts, in 1943, proved that networks composed of neurodes could represent any finite logical expression. In 1949 Hebb defined a method for updating the weights in neural networks. Kolmogorov’s Theorem was published in the 1950s. It states that any mapping between two sets of numbers can be done exactly with a three layer neural network. He did not refer to neural networks in his paper; the theorem was applied to them later. His paper also describes how the neural network is to be constructed. The input layer has one neurode for every input. These neurodes have a connection to each neurode in the hidden layer. The hidden layer has (2*n + 1) neurodes, where n is the number of inputs. The hidden layer sums a set of continuous, real, monotonically increasing functions, like the sigmoid function. The output layer has one neurode for every output.
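The layer layout described above can be sketched in a few lines of Python. This is only an illustration of the shape of the network (n inputs, 2*n + 1 hidden neurodes, one output neurode per output) with random weights and a sigmoid in both layers. It is not Kolmogorov's actual construction, and the helper names build_network and forward are made up for this example.

```python
import math
import random

def sigmoid(x):
    """Continuous, monotonically increasing squashing function."""
    return 1.0 / (1.0 + math.exp(-x))

def build_network(n_inputs, n_outputs, seed=0):
    """Random weights for an n -> (2n + 1) -> m network (hypothetical helper)."""
    rng = random.Random(seed)
    n_hidden = 2 * n_inputs + 1
    hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
              for _ in range(n_hidden)]
    output = [[rng.uniform(-1, 1) for _ in range(n_hidden)]
              for _ in range(n_outputs)]
    return hidden, output

def forward(network, inputs):
    """Feed the inputs through the hidden layer, then the output layer."""
    hidden_w, output_w = network
    hidden_out = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
                  for row in hidden_w]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden_out)))
            for row in output_w]

net = build_network(n_inputs=3, n_outputs=2)
outputs = forward(net, [0.5, -1.0, 0.25])
```

With three inputs the hidden layer ends up with seven neurodes, and every output lands between 0 and 1 because of the sigmoid. Bias terms are left out to keep the sketch short.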

Rosenblatt in 1961 developed the Perceptron ANN (artificial neural network). In the 1960s Cooley and Tukey devised the Fast Fourier Transform algorithm, which made signal processing with neural networks feasible. Widrow and Hoff developed Adaline. 1969 was the year neural networks almost died. A paper published by Minsky and Papert showed that the XOR function could not be done with single-layer networks such as the Perceptron and Adaline. 1972 brought new interest, with Kohonen and Anderson independently publishing papers about networks that learned without supervision, SOM (self organizing maps). Grossberg and Carpenter developed ART (adaptive resonance theory), which learns without supervision, in the late 1960s. The 1970s brought the Neocognitron, for visual pattern recognition. Hopfield’s work in the early 1980s renewed interest in the field, and Rumelhart and McClelland published PDP (“Parallel Distributed Processing”) in 1986. These books described neural networks in a way that was easy to understand.
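The XOR problem is easy to see for yourself. Here is a rough sketch, assuming a single threshold neurode trained with the classic perceptron learning rule (the function names and the choice of 100 epochs are mine, not from any of the papers above). It learns AND without trouble but can never get all four XOR cases right, because no single straight line separates them.

```python
def predict(weights, bias, x1, x2):
    """Single threshold neurode: fires (1) when the weighted sum is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(samples, epochs=100, rate=1):
    """Perceptron learning rule on two inputs; returns (weights, bias)."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += rate * error * x1
            weights[1] += rate * error * x2
            bias += rate * error
    return weights, bias

def accuracy(samples, weights, bias):
    """Fraction of samples the trained neurode classifies correctly."""
    hits = sum(1 for (x1, x2), target in samples
               if predict(weights, bias, x1, x2) == target)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(AND, *train_perceptron(AND))
xor_accuracy = accuracy(XOR, *train_perceptron(XOR))
```

Adding a hidden layer is what gets around this limit, which is part of why the later multi-layer networks brought the field back to life.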

Neural networks map sets of inputs to sets of outputs. Learning is what shapes the neural network’s surface. Supervised learning algorithms take inputs and match them to outputs, correcting the network if the output does not match the desired output. Unsupervised learning algorithms do not correct the output given by the neural net. The net is provided with inputs, but not with outputs.

Training data for a neural net should be fairly representative of the actual data that will be used. All possibilities should be covered, and the proportion of data in each area should match the proportion in the real data.

Ways of training neural nets:
Hard coded weights, determined by experience or mathematical formulas, can serve in place of a training algorithm.
Supervised training uses input and matching output patterns to let the net set the weights.
Graded training uses only input patterns, but the neural net then receives feedback on how accurate its answer is.
Unsupervised training uses only input patterns; the neural net’s output is taken as the correct answer.

Autonomous learning in neural nets is different from other unsupervised learning systems in that the neural net can learn selectively. It doesn’t learn every input pattern, only those that are ‘important’. An autonomous learning neural net has the following capabilities: it organizes information into categories without outside input and will reorganize them if it makes sense to do so; it retrieves information from less than perfect input; it is configured to work in parallel to keep speed reasonable; it is always selectively learning; it can change the priorities given to input patterns; it can generalize; it has more memory space than it needs; and it can expand and add to its knowledge rather than overwriting previously learned knowledge. Of course something this wonderful should also make your coffee and sort your email for you too.

The delta rule is used for error correction in backpropagation networks. It is also known as the least mean squares (LMS) rule:

NewWeight = OldWeight + LearningConstant * NeurodeOutput * (desiredOutput - actualOutput)

The delta rule uses local information for error correction. It searches for a minimum of the error, and in doing so it may settle into a local minimum rather than the global minimum. Picture trying to find the deepest hole in your yard: if you measure small sections at a time you may locate a hole, but it may not be the deepest one in the yard. The generalized delta rule extends this correction to the hidden layers of a multi-layer net by propagating the error backward from the output layer. It still follows the gradient, so local minima remain a risk.
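As a minimal sketch of the delta rule, here is a single linear neurode learning a small mapping. The update adds LearningConstant * input * (desired - actual) to the old weight. I am assuming that NeurodeOutput means the value arriving on that connection (the output of the sending neurode), and the sample data and function name are made up for the example.

```python
def delta_rule_step(weights, inputs, desired, learning_constant=0.1):
    """One delta-rule (LMS) update for a single linear neurode."""
    actual = sum(w * x for w, x in zip(weights, inputs))
    error = desired - actual
    # new weight = old weight + learning constant * input * error
    return [w + learning_constant * x * error
            for w, x in zip(weights, inputs)]

# Hypothetical training set generated by: desired = 2*x1 - 1*x2
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]

weights = [0.0, 0.0]
for _ in range(200):
    for inputs, desired in samples:
        weights = delta_rule_step(weights, inputs, desired)
```

After training, the weights settle very close to 2 and -1. This error surface has only one hole in the yard, so there is no local minimum to get stuck in. Nets with hidden layers are not so lucky.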

Simulated annealing is a statistical way to solve optimization problems, like setting a schedule or wiring a network. Boltzmann networks use this algorithm to learn. A random solution is chosen and compared to the current best solution found. The better of the two is kept, and then, depending on the problem, some random changes are made. The amount of randomness in each loop is decreased over time, allowing the net to slowly settle into a solution. The randomness helps to keep the net from settling into a local minimum rather than the global minimum.
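Here is a small sketch of simulated annealing on a bumpy one-dimensional function. It goes one step beyond the description above: in the usual textbook form, a worse solution is sometimes accepted, with a probability that shrinks as the temperature drops, and that is what lets the search climb out of a shallow hole. The energy function, step size, and cooling schedule are all arbitrary choices for illustration.

```python
import math
import random

def energy(x):
    """Bumpy test function with more than one minimum (arbitrary choice)."""
    return x * x + 10.0 * math.sin(x)

def simulated_annealing(start, steps=5000, start_temp=10.0,
                        cooling=0.999, seed=1):
    """Return the best x found while the temperature slowly decreases."""
    rng = random.Random(seed)
    current = best = start
    temp = start_temp
    for _ in range(steps):
        candidate = current + rng.gauss(0.0, 0.5)  # small random change
        delta = energy(candidate) - energy(current)
        # Always keep improvements; sometimes keep a worse candidate
        # while the temperature is still high.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        if energy(current) < energy(best):
            best = current
        temp *= cooling  # less randomness on every loop
    return best

best_x = simulated_annealing(start=4.0)
```

The start point of 4.0 sits next to a local minimum of this function. The early high-temperature loops give the search a chance to wander over the hump toward the deeper minimum on the other side.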

Self organization is a form of unsupervised learning. It sets weights with a ‘winner take all’ algorithm. Each neurode learns a classification. Input vectors are classed into the group to which they are closest.
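A bare-bones version of ‘winner take all’ learning looks something like the sketch below. For each input vector, only the closest neurode gets its weights nudged toward that input. The data, starting weights, and learning rate are invented for the example, and a full self organizing map would also update the winner's neighbors, which is left out here.

```python
def closest(neurodes, vector):
    """Index of the neurode whose weight vector is nearest to the input."""
    def squared_distance(weights):
        return sum((w - v) ** 2 for w, v in zip(weights, vector))
    return min(range(len(neurodes)),
               key=lambda i: squared_distance(neurodes[i]))

def train(neurodes, data, rate=0.2, epochs=20):
    """Winner take all: only the winning neurode moves toward each input."""
    for _ in range(epochs):
        for vector in data:
            win = closest(neurodes, vector)
            neurodes[win] = [w + rate * (v - w)
                             for w, v in zip(neurodes[win], vector)]
    return neurodes

# Two made-up clusters: one near (0, 0) and one near (1, 1)
data = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
        (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]

neurodes = train([[0.4, 0.4], [0.6, 0.6]], data)
labels = [closest(neurodes, vector) for vector in data]
```

No one tells the net which cluster is which. Each neurode simply drifts toward the inputs it wins, and the labels fall out of that.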

The Lyapunov function, also known as the energy function, is used to test for convergence of the neural network. The function decreases as the network changes and assures stability.
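To see an energy function at work, here is a sketch using a small Hopfield-style network. That choice of network is my assumption, since it is the standard example where the energy E = -1/2 * sum(w[i][j] * s[i] * s[j]) is known to never increase under one-at-a-time updates with symmetric weights. The stored pattern and the noisy starting state are made up.

```python
def hebbian_weights(patterns):
    """Symmetric weights with a zero diagonal, built from +1/-1 patterns."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def energy(weights, state):
    """Energy (Lyapunov) function of the network for a given state."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))

def settle(weights, state, sweeps=5):
    """Update one neurode at a time, recording the energy after each update."""
    state = list(state)
    energies = [energy(weights, state)]
    for _ in range(sweeps):
        for i in range(len(state)):
            net = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if net >= 0 else -1
            energies.append(energy(weights, state))
    return state, energies

stored = [1, -1, 1, -1, 1, -1]
weights = hebbian_weights([stored])
noisy = [1, 1, 1, -1, -1, -1]  # the stored pattern with two values flipped
final_state, energies = settle(weights, noisy)
```

The recorded energies only ever go down or stay level, and the network comes to rest on the stored pattern. That steady downhill slide is what the convergence test is checking for.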

Neural net building tool (Java)

More information:
Birth of a Learning Law, Steve Grossberg

Introduction to Neural Networks

See also:
Neural networks in financial modeling
Song of the neurons

Tags: neural networks · source code · topics in artificial intelligence
