To understand how my chosen strategy for finding new weights, stochastic gradient descent (SGD), actually operates, I did a bit of analysis.
The graph below shows the flow of a single input through a trivial, “linear” neural network with only one neuron in each of its three layers (input, hidden, output).
The expected output value in this case is 1, illustrated by the dashed purple line.
The cyan line shows how the output of the network fairly quickly converges towards the expected value.
The other lines show how the remaining components of the network (hidden layer output, hidden weight, output weight, and error) change during the training iterations.
After only 100 iterations, the output is very close to the expected value (1), and the error has shrunk correspondingly to almost 0.
The hidden weight (hw) decreases moderately, while the output weight (ow) increases substantially.
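The dynamics above can be reproduced in a few lines of Python. This is a minimal sketch rather than my actual training code: the input value, initial weights, and learning rate are all illustrative, and with these particular starting weights both weights happen to grow (which direction each weight moves depends on the initialization).

```python
# A 1-1-1 "linear" network (no activation functions), trained by
# plain gradient descent on a single input. All values illustrative.
x = 0.5            # the single input
target = 1.0       # expected output (the dashed line in the graph)
hw, ow = 0.3, 0.2  # hidden and output weights
lr = 0.5           # learning rate

for i in range(100):
    hidden = hw * x        # hidden layer output
    out = ow * hidden      # network output
    err = out - target     # error
    # Gradients of the squared error 0.5 * err**2 w.r.t. each weight,
    # computed before either weight is updated:
    grad_ow = err * hidden
    grad_hw = err * ow * x
    ow -= lr * grad_ow
    hw -= lr * grad_hw

print(out, err)  # output converges towards 1, error towards 0
```

Near convergence the error shrinks by roughly half per iteration here, so 100 iterations are plenty, matching the quick convergence visible in the graph.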
A classic from Behavioral Economics:
“A bat and a ball together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost?”
Behavioral economists have found that most people get this wrong: the intuitive answer, 10 cents, is incorrect.
For the mathematically inclined: a simple system of equations with two variables (x = bat price, y = ball price) solves the problem:
x + y = 1.10
x - y = 1.00
which gives x = 1.05 and y = 0.05: the ball costs 5 cents.
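The system can also be solved mechanically with NumPy (the variable names here are mine):

```python
import numpy as np

# Coefficient matrix and right-hand side of:
#   x + y = 1.10
#   x - y = 1.00
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([1.10, 1.00])

bat, ball = np.linalg.solve(A, b)
print(bat, ball)  # bat = 1.05, ball = 0.05 -- not the intuitive 0.10
```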
It’s been a while since I played with Machine Learning and AI. Last time I used C++; this time I used Python & NumPy.
A world of difference: although I can no longer find the C++ hack, I know for sure that it had many more lines of code and was in general more complex to write than the Python/NumPy version. In particular, NumPy’s array broadcasting makes manipulating huge matrices very convenient, compared to writing the corresponding low-level code. Of course, the C++ version ran much faster, but for non-industrial use Python is much more convenient.
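As a toy illustration of the broadcasting I’m praising (not code from the project): multiplying a matrix by a 1-D array applies the operation to every row, with no explicit loops.

```python
import numpy as np

m = np.arange(12.0).reshape(3, 4)             # a 3x4 matrix: 0..11
scale = np.array([1.0, 10.0, 100.0, 1000.0])  # shape (4,)

# 'scale' is broadcast across all three rows of 'm'. The equivalent
# C++ would be a pair of nested loops (or a linear algebra library).
result = m * scale
print(result[1])  # row [4, 5, 6, 7] scaled column by column
```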
Anyway: the target detection capability of this neural network is impressive. It was trained for 500 iterations on 200 samples, half of them non-targets, i.e. noise, and the other half targets, i.e. non-random signals (think of a sonar trying to determine whether a sound is a submarine or something else).
The graph below shows the target detection capability of this neural network on a set of 10000 data points (think ping returns), after the network has been properly trained (“Machine Learning”). Half of the data points it will attempt to classify (the first 5000) are noise; the rest are signal, i.e. real targets (“submarines”).
The network achieves a perfect separation of targets and non-targets, despite the fairly limited number of training samples and training iterations.
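To make the setup concrete, here is a minimal sketch of the same kind of experiment. Everything in it is illustrative rather than my original code: each “ping” is 32 measurements, the signal is a sinusoid buried in Gaussian noise, and the network is a single hidden layer of 8 sigmoid units trained by full-batch gradient descent for 500 iterations on 200 samples, as in the post. How cleanly it separates the classes will depend on these choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_samples(n, signal):
    # Each "ping" is 32 measurements: pure noise, or a tone in noise.
    t = np.linspace(0, 2 * np.pi, 32)
    x = rng.normal(0.0, 1.0, size=(n, 32))
    if signal:
        x += np.sin(4 * t)  # broadcast the same tone into every row
    return x

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Training set: 200 samples, half noise (label 0), half targets (label 1).
X = np.vstack([make_samples(100, False), make_samples(100, True)])
y = np.concatenate([np.zeros(100), np.ones(100)]).reshape(-1, 1)

# One hidden layer of 8 units, trained by full-batch gradient descent.
W1 = rng.normal(0.0, 0.1, size=(32, 8))
W2 = rng.normal(0.0, 0.1, size=(8, 1))
lr = 0.5
for _ in range(500):
    H = sigmoid(X @ W1)    # hidden activations
    out = sigmoid(H @ W2)  # predicted probability of "target"
    # Cross-entropy gradients, backpropagated layer by layer:
    d_out = out - y
    d_H = (d_out @ W2.T) * H * (1 - H)
    W2 -= lr * (H.T @ d_out) / len(X)
    W1 -= lr * (X.T @ d_H) / len(X)

# Evaluate on fresh data: 5000 noise pings, then 5000 target pings.
X_test = np.vstack([make_samples(5000, False), make_samples(5000, True)])
y_test = np.concatenate([np.zeros(5000), np.ones(5000)])
pred = (sigmoid(sigmoid(X_test @ W1) @ W2).ravel() > 0.5)
acc = np.mean(pred == (y_test > 0.5))
print(f"accuracy: {acc:.3f}")
```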
Rest assured that Machine Learning & AI will result in HUGE changes for society!
“Open the door, HAL!”…