Machine Learning – normalizing your inputs

Often it pays off to pre-process/normalize your input data before feeding it further down the line of your Neural Network, e.g. to bring the input values into a range more suitable for your network.

I'm experimenting with images, where each pixel is represented by 3 values (RGB), each in the range 0..255. The input to my network consists of such 3-dimensional pixels, where each pixel is represented by the sum of its 3 channel values. That means each input value has a range of 0-765.

That's a pretty large range for the network's weights to deal with. So, I needed to scale down the range.

The image below illustrates three different ways to do such a scaling:

  • Subtracting the mean of the input data
  • Dividing the input data by the input variance
  • Doing both of the above

Ideally, I'd want my inputs to be centered around zero, since the Sigmoid function, responsible for calculating the nodes of the network, has its active range centered around zero.

By normalizing my inputs by both subtracting the mean and dividing by the variance, my data has both desired properties: smaller absolute range and centered around zero.
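As a minimal sketch of the three scalings in NumPy (the pixel values here are made up for illustration):

```python
import numpy as np

# Hypothetical batch of summed-RGB pixel values, each in the range 0..765.
pixels = np.array([120.0, 600.0, 765.0, 0.0, 340.0])

centered = pixels - pixels.mean()               # subtract the mean
scaled = pixels / pixels.var()                  # divide by the variance
both = (pixels - pixels.mean()) / pixels.var()  # both: centered and scaled
```

`centered` keeps the original spread but sits around zero; `both` additionally shrinks the absolute range, which is what the network's sigmoid prefers.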


Posted in AI, development, Machine Learning, Neural networks

Improving Python performance by factor of 100

I'm doing a bit of Machine Learning in my spare time: a neural network that analyzes images.

Anyway, since I don't have zillions of computing power, any decent amount of learning data takes hours or days to process through my (very modest, 101 x 10 x 1) neural network.

So today I spent a few hours looking for ways to improve performance. In my code, there was an obvious first candidate: the function that calculates new weights for the nodes at the various levels of the network. With 101 input nodes, 10 hidden nodes, and 1 output node, there are 101 x 10 = 1010 hidden weights, all of which must be updated on every learning iteration.

My initial attempt did the calculations in a nested for-loop. For a fairly limited set of learning data, and an equally limited number of learning iterations, the program took on the order of 10 hours of computing.

Now I changed the implementation of the function computing the new weights to skip iteration altogether, instead using numpy's powerful matrix manipulation. It took a while to figure out how to do it, but boy, the difference in execution time is striking: from 10 hours to 6 minutes, i.e. a performance boost by a factor of 100!
To get a feel for what factor 100 means: apply that factor – either way – to your salary, monthly bills or daily commute time… 🙂

Code below.

def new_weights(input, hidden, output, hw, ow, error, mu):
    # Looped version: update the hidden-layer weights one (hidden, input) pair at a time.
    for h, hval in np.ndenumerate(hidden):
        for i, ival in np.ndenumerate(input):
            slope_o = output * (1 - output)
            slope_h = hidden[h] * (1 - hidden[h])
            dx3dw = input[i] * slope_h * ow[0][h] * slope_o
            hw[h, i] += dx3dw * error * mu

    # Update the output-layer weights.
    for h, hval in np.ndenumerate(hidden):
        slope_o = output * (1. - output)
        dx3dw = hidden[h] * slope_o
        ow[0][h] += dx3dw * error * mu
    return hw, ow

def new_weights2(input, hidden, output, hw, ow, error, mu):
    # Vectorized version: the same updates, expressed as numpy matrix operations.
    slope_o = output * (1 - output)
    slope_h = np.array(hidden * (1 - hidden))
    dx3dw = np.outer(input, slope_h) * ow * slope_o
    dx3dw = dx3dw.transpose()
    hw += dx3dw * error * mu
    dx3dw0 = np.outer(hidden, slope_o)
    dx3dw0 = dx3dw0.transpose()
    ow += dx3dw0 * error * mu
    return hw, ow
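As a quick sanity check that the two versions really compute the same thing, here is a standalone comparison. The shapes match the 101 x 10 x 1 network; all values are made up, and the updates are inlined copies of the two functions above so the snippet runs on its own:

```python
import numpy as np

# Illustrative random inputs with the 101 x 10 x 1 network's shapes.
rng = np.random.default_rng(0)
inp = rng.random(101)       # input layer
hidden = rng.random(10)     # hidden layer activations
output, error, mu = 0.7, 0.3, 0.05
hw = rng.random((10, 101))  # hidden weights
ow = rng.random((1, 10))    # output weights

# Looped update, as in new_weights:
hw1, ow1 = hw.copy(), ow.copy()
for h in range(10):
    for i in range(101):
        slope_o = output * (1 - output)
        slope_h = hidden[h] * (1 - hidden[h])
        hw1[h, i] += inp[i] * slope_h * ow[0][h] * slope_o * error * mu
for h in range(10):
    ow1[0][h] += hidden[h] * output * (1 - output) * error * mu

# Vectorized update, as in new_weights2:
hw2, ow2 = hw.copy(), ow.copy()
slope_o = output * (1 - output)
slope_h = hidden * (1 - hidden)
hw2 += (np.outer(inp, slope_h) * ow * slope_o).transpose() * error * mu
ow2 += np.outer(hidden, slope_o).transpose() * error * mu
```

The key trick is broadcasting: `np.outer(inp, slope_h)` builds the full (101, 10) grid of products in one call, and multiplying by the (1, 10) array `ow` broadcasts it across all 101 rows, replacing the 1010-step inner loop.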


Posted in AI, Complex Systems, development, Machine Learning, Neural networks, performance, software

The power of Machine Learning

Machine Learning has an amazing ability to detect patterns where mere mortals fail to do so.

As always, any technology can be used for good, as well as bad.

Posted in AI, Machine Learning, Neural networks

Intuition leading you astray…

Posted in development

Demographics simulation

There's some interesting reading about demographics and global birth rates: as you might have learned from the news, birth rates in many advanced (mostly western) countries are declining, in some places to such low levels that, should the trend continue, those populations may become extinct within a handful of generations.

Meanwhile, in other parts of the world, fundamentally the non-developed world, birth rates are, and have been for a long time, alarmingly high.

It appears that most experts on demographics agree that in order for any population to sustain itself, the birth rate, that is, the number of kids per woman, should on average be 2.1.

In many western countries, birth rates are now well below 2, which means that fewer children are born than people die in any given year. Which in turn results in a diminishing population.

Anyway: I wrote a demographic simulation that allows me to vary a number of parameters, most fundamentally the birth rate, but also other parameters such as initial population size, number of generations to simulate, expected lifetime and its distribution, etc.

Below some images from some simple simulation runs.

It is indeed very clear that a birth rate of about 2.1 results in a balanced demography: anything less than that and there is a clear risk of extinction; anything above 2.1 results in a population explosion. The larger the birth rate, the faster the explosion.

In all the images below, the initial population is 10,000, and the simulation runs for 10 generations, which clearly is not enough to fully see the dramatic change near the inflection point of birth rate 2.1, but you should at least be able to observe how very sensitive population growth is to birth rates.

Looking at the first graph, where the birth rate is 3.5, which is not an uncommon birth rate in non-developed societies, you will notice that in 10 generations, the population grows from 10,000 to 4.5M! Now, 10 generations (say that corresponds to some 250 years) might be too long a time to cause any immediate concern, so it might be interesting to note that with the same parameters, the population grows by a factor of almost 6 in just 2 generations, and by a factor of almost 17 in just 4 generations!

Similarly, at the other end of the birth rate spectrum, populations with a birth rate lower than 2 have reason to worry: if such a low birth rate persists for a handful of generations, the entire population might collapse.
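The original simulation isn't shown here, but a toy version of the core mechanism (my own sketch, with made-up names and defaults, ignoring lifetimes and their distribution) could look like:

```python
def simulate(initial_pop=10_000, birth_rate=2.1, generations=10):
    # Toy generational model: each generation, half the population are
    # women, each having `birth_rate` children on average; the parent
    # generation is assumed to die off entirely.
    pop = float(initial_pop)
    history = [round(pop)]
    for _ in range(generations):
        pop = (pop / 2) * birth_rate
        history.append(round(pop))
    return history
```

In this simplified model the balance point is exactly 2.0 children per woman; the extra 0.1 in the real-world replacement rate accounts for children who die before reaching reproductive age.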


NOTE! The last image is on a lin-log scale.

Posted in Culture, Organization, Politik, Simulation, Society

Neural Networks – training the network using an SGD

In order to understand how my chosen strategy for finding new weights, namely SGD (stochastic gradient descent), actually operates, I did a bit of analysis.

The graph below shows the flow of one single input, in a trivial, “linear” neural network, with only one neuron in each of its three layers (input, hidden, output).

The expected output value in this case is 1, illustrated by the dashed purple (?) line.

The cyan line shows how the output of the network fairly quickly converges towards the expected value.

The other lines show how the other components of the network (hidden layer, hidden weights, output weights and error) change during the training iterations.

After only 100 iterations, the output is very close to the expected value (1), and the error has shrunk correspondingly to almost 0.

The hidden weight (hw) decreases moderately, while the output weight (ow) increases substantially.
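A minimal sketch of such a "linear" 1-1-1 sigmoid network trained with SGD; the input value, target, learning rate and starting weights here are assumptions for illustration, not the original run:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x, target, mu = 0.5, 1.0, 0.5   # single input, expected output, learning rate
hw, ow = 0.1, 0.1               # hidden and output weights (assumed start values)

outputs = []
for step in range(100):
    hidden = sigmoid(x * hw)           # forward pass through the one hidden node
    output = sigmoid(hidden * ow)
    error = target - output
    slope_o = output * (1 - output)    # sigmoid derivative: f * (1 - f)
    slope_h = hidden * (1 - hidden)
    ow += hidden * slope_o * error * mu            # update output weight
    hw += x * slope_h * ow * slope_o * error * mu  # update hidden weight
    outputs.append(output)
```

With these made-up starting values both weights grow, so the network output climbs steadily towards the target; how close it gets in 100 iterations depends on the learning rate and initial weights.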

Posted in AI, development, Machine Learning, Neural networks

Artificial Superintelligence..

soon here to kill you…?

Posted in AI, development, Machine Learning

The bat and the ball…

A classic from Behavioral Economics:
"A bat and a ball together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost?"

The BE-guys have found that most people get this wrong.

For the mathematically inclined: a simple system of two equations solves the problem (x = price of the bat, y = price of the ball):

x + y = 1.10

x - y = 1.00
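Adding the two equations gives 2x = 2.10, so the bat costs $1.05 and the ball $0.05, not the intuitive 10 cents. As a quick check:

```python
# x = price of the bat, y = price of the ball.
# x + y = 1.10 and x - y = 1.00; adding the equations gives 2x = 2.10.
x = (1.10 + 1.00) / 2
y = 1.10 - x
print(round(x, 2), round(y, 2))  # 1.05 0.05
```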

Posted in Behavioral Economics, Math

Machine Learning

It’s been a while since I played with Machine Learning and AI. Last time I was using C++, this time I used Python & Numpy.
A world of difference: although I can no longer find the C++ hack, I know for sure that it had very many more lines of code, and was in general more complex to write than doing it in Python with Numpy. Particularly Numpy's array broadcasting makes manipulation of huge matrices very convenient, compared to writing the corresponding low-level code. Of course, the C++ version ran much faster, but for non-industrial use it's much more convenient to use Python.

Anyway: the target detection capability of this Neural Network is impressive, after training with 500 iterations on 200 samples, where half the samples are non-targets, i.e. noise, and the other half are targets, i.e. non-random signals (think sonars trying to determine whether a sound is a submarine or something else).

The graph below shows the target detection capability of this neural network on a set of 10000 data points (think ping returns), after the network has been properly trained ("Machine Learning"). Half of the data points it will attempt to classify (the first 5000) are noise; the rest are signal, i.e. real targets ("submarines").

The network is capable of making a perfect separation of targets vs non-targets, despite a fairly limited number of training samples and training iterations.

Rest assured that Machine Learning & AI will result in HUGE changes for society!

“Open the door, HAL!”…..
#MachineLearning #AI

Posted in AI, development, Machine Learning, Neural networks

Signal vs Noise – target recognition using Neural Networks

A Python hack implementing a 3-layer Neural Network. After 500 iterations of training, the network is capable of fully separating targets (non-random numeric patterns) from noise, i.e. random numeric sequences.
Posted in AI, development, Neural networks