In my previous post, I described some experiments with neural networks and image recognition that I entertained myself with during the Christmas holidays. Basically, I wanted to see if I could create a neural network process that, given tiny (10×10, that is, only 100 pixels) images, would be capable of doing ‘target recognition’, that is, identifying whether an image belongs to category ‘A’ or category ‘B’.
What the respective categories are – that is, what the problem given to the neural network is all about – varies with the application domain. Typical target recognition tasks include identifying cancer cells in human bodies, number plate recognition, image recognition (as I’m doing in my experiments), speech recognition, and military target recognition, e.g. determining whether an echo from a sonar system is a submarine or a biological structure. More generally, neural networks fit the multitude of problem domains where a traditional, analytic, deterministic algorithmic approach is very difficult or even impossible, but where a stochastic, non-deterministic, evolutionary approach like a neural network can be applied.
In my previous post, I showed that it is indeed possible for my neural network to determine whether a tiny 10×10 image (scaled down from a 100×100 image) contains enough information and specific category patterns to provide target recognition capability, that is, to determine whether an image represents a man (category ‘A’) or a woman (category ‘B’).
To me, despite having spent a number of hours designing, implementing and testing my neural network process, it’s still absolutely amazing that the program can identify classifying patterns in images as small as 10×10 pixels. Below is such a 10×10 image:
No, there isn’t a broken image link in this post – the above speck of pixels really is the actual image, or more correctly, one of the few hundred images, that the neural network program analyzes…!
The human eye can barely see the image at all, let alone detect any patterns regarding gender traits in such a tiny grid of pixels.
Therefore, to give the reader a better sense of what the neural network program has to work with – and what it’s capable of doing! – in its attempt to categorize the image as either male or female, below I’ve blown up the 10×10 image by 3000%:
So, can you, based on the blown-up image, see any gender-specific patterns that decide whether it’s a man or a woman…? At least I can’t. But it turns out that in 76% of the cases – that is, 3 times out of 4 – my neural network is able to correctly categorize previously ‘unseen’ images, i.e. images it has not been trained with, despite severe restrictions such as those below:
- the images are very small, only 10×10 pixels, providing very little room for any detectable patterns.
- the training set, i.e. the set of images used for teaching the network what constitutes a man and what constitutes a woman, contains fewer than 100 images – hardly enough exposure to what distinguishes men from women.
For illustration, below is the original, 100×100 image of the same photo:
Now, at this resolution, which is 100 times larger than what my neural network program has to work with, the human eye and brain have no problem telling whether the portrait shows a man or a woman.
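The 100×100 → 10×10 reduction (and the 3000% blow-up shown earlier) can be sketched in plain NumPy. This is an illustrative stand-in, not my actual program – the helper names and the synthetic ‘portrait’ are assumptions:

```python
import numpy as np

def downscale(img, factor):
    # Shrink a square grayscale image by averaging each
    # factor x factor block, e.g. 100x100 -> 10x10 with factor=10.
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def blow_up(img, factor):
    # Enlarge by pixel repetition (nearest neighbour) - the same
    # effect as zooming the 10x10 image to 3000% for viewing.
    return np.kron(img, np.ones((factor, factor)))

# A synthetic 100x100 grayscale 'portrait' standing in for a real photo.
portrait = np.random.default_rng(1).integers(0, 256, (100, 100)).astype(float)

tiny = downscale(portrait, 10)  # 10x10 - the 100 pixels the network sees
big = blow_up(tiny, 30)         # 300x300 - the 3000% view

print(tiny.shape, big.shape)  # (10, 10) (300, 300)
```

Block-averaging throws away 99% of the pixels, which is exactly why it’s surprising that any gender signal survives the reduction.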
With enough iterations run on the set of ‘training images’, the program is capable of a 100% accurate separation of the training images into men and women.
But what happens when I let the program analyze new portrait images, i.e. images that were not part of the training set? It turns out that the separation drops from 100% to 76% – the neural network now makes mistakes, because it is exposed to images it has not encountered during its training.
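This gap between seen and unseen images is the classic train-versus-test evaluation. Below is a minimal sketch of that pattern, using synthetic data and a single logistic ‘neuron’ as a stand-in for my actual network – everything here (the data, the training loop, the names) is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: 200 flattened 10x10 images
# (100 pixels each). Category 'B' images are made slightly brighter
# in their first 50 pixels so that there is a learnable pattern.
y = rng.integers(0, 2, 200)
X = rng.random((200, 100))
X[y == 1, :50] += 0.3

# Mirror the post's split: ~100 training images, ~100 unseen images.
X_train, y_train = X[:100], y[:100]
X_test, y_test = X[100:], y[100:]

# A single logistic 'neuron' trained by gradient descent - the
# simplest possible stand-in for a real neural network.
w, b, lr = np.zeros(100), 0.0, 0.5
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X_train @ w + b)))
    w -= lr * (X_train.T @ (p - y_train)) / len(y_train)
    b -= lr * (p - y_train).mean()

def accuracy(X, y):
    return float((((X @ w + b) > 0).astype(int) == y).mean())

# Accuracy on seen vs unseen images; with real, noisier portraits
# the unseen-image figure drops, as in my run's 100% -> 76%.
print("train:", accuracy(X_train, y_train))
print("test :", accuracy(X_test, y_test))
```

The key discipline is that the test images never appear in the training loop – only then does the test accuracy say anything about real-world performance.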
The graph above shows the results from a run where the roughly 100 images of the training set are analyzed together with some 100 new images, i.e. images that were not part of the machine learning process.
As you can see, the separation with these new images included is not perfect – there are a few mix-ups in the spread of men vs women along the x-axis – but for the vast majority of the images, the neural network process has been capable of clearly separating men from women. For this particular run, the success rate was about 76%, that is, in 76% of the cases, or 3 out of 4 images, the program was able to correctly classify the image as male or female.
So, the question then is whether a result of 76% correct calls is good enough, or too low, for real-world applications of neural networks – after all, we would ideally always want to be 100% certain when making a call or decision, wouldn’t we…?
Well, unfortunately – or fortunately, if you are like me! – the world at large in which we live is far from deterministic, absolute, linear and predictable, regardless of how much our MBAs and other empty suits pretend it is. (By the way, what a boring place such a world would be – no surprises whatsoever, a perfect clockwork executing predetermined mechanical processes indefinitely, a place that only a full-blown grey bureaucrat or MBA empty suit would be able to love…)
Instead, our world is a highly complex, dynamic, non-linear and non-deterministic system, where outcomes can vary enormously with tiny changes in inputs. In a world like ours, the only certain things are death and taxes, and even for those, you cannot be certain of the details.
The problem of image recognition is very difficult, virtually impossible, for computers to solve with a traditional deterministic approach, i.e. with algorithms that are 100% capable of providing a clear ‘yes/no’ answer to the question ‘does this image represent a male or a female?’. Thus, lacking deterministic algorithms for the task at hand, we have three options: give up altogether, declaring the problem unsolvable; process the images manually – after all, as opposed to computers, human brains are excellent image processors; or rely upon luck, that is, randomly place each image in one of the two categories. With the latter approach, the probability of success in image classification should be 50%, that is, we’d be correct in half of our calls.
Thus, unless we want to process the images manually, any approach that gives us better than 50% correct calls is helpful and worthwhile.
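The 50% random baseline is easy to sanity-check with a quick simulation (a toy sketch, not my actual program):

```python
import random

random.seed(42)

# 'Rely upon luck': assign each of 10,000 hypothetical images to
# category A (0) or B (1) at random, against random true labels.
trials = 10_000
correct = sum(random.randint(0, 1) == random.randint(0, 1) for _ in range(trials))

print(correct / trials)  # hovers around 0.5, never much better
```

However many trials you run, coin-flipping settles at about half the calls correct – which is the bar any useful classifier has to clear.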
Thus, an application like mine, with a 26% ‘house advantage’ – that is, one that performs 26 percentage points better than random selection – will be very useful in many domains where there are no deterministic models.
To understand this, think about various domains where probability and randomness rule, e.g. the roulette table in a casino, where the house advantage is around 3–5% (depending on the flavor of roulette), or Blackjack, where the house advantage is just 0.28%. Still, the casinos are making a fortune with these tiny house advantages. Therefore, a ‘house advantage’ of 26% is simply enormous in any stochastic system, such as my image recognition application.
Even with the very limited training set of some 100 images, and with only 100 pixels per image to analyze, my neural network is capable of making the right call in 3 out of 4 cases.
Had I more processing power available than just my lowly laptop, I could radically increase the hit rate of my program by adding more images to the training set, increasing the image size by a couple of orders of magnitude, and running more training iterations.
Neural networks as well as other non-deterministic approaches are here to stay, particularly in the age of ‘Big Data’.