It often pays off to pre-process/normalize your input data before passing it further down the line to your neural network, e.g. to bring the range of input values into a scale more suitable for the network.
I'm experimenting with images, where each pixel is represented by 3 values (RGB), each in the range 0..255. The input to my network consists of such pixels, with each pixel represented by the sum of its 3 channel values. That means each input value has a range of 0..765.
That's a pretty large range for the network's weights to deal with, so I needed to scale it down.
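As a quick sketch of what that input looks like, the snippet below (using made-up random image data, not the post's actual dataset) sums the RGB channels of each pixel and confirms the resulting range:

```python
import numpy as np

# Hypothetical 4x4 RGB image with channel values in 0..255
# (random data purely for illustration).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3))

# Collapse each pixel's three channels into a single value by summing them.
pixel_sums = image.sum(axis=-1)

# Each summed pixel now lies somewhere in 0..765 (3 * 255).
print(pixel_sums.min(), pixel_sums.max())
```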
The image below illustrates three different ways to do such scaling:
- Subtracting the mean of the input data
- Dividing the input data by the input variance
- Doing both of the above
Ideally, I'd want my inputs to be centered around zero, since the Sigmoid function, responsible for calculating the nodes of the network, has its active range centered around zero.
By normalizing my inputs, both subtracting the mean and dividing by the variance, my data gains both desired properties: a smaller absolute range and values centered around zero.
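The three scaling options above can be sketched in a few lines of NumPy. The input here is hypothetical (random values standing in for the summed pixels); the post divides by the variance, so that is what is shown, though dividing by the standard deviation is another common convention:

```python
import numpy as np

# Hypothetical batch of summed-pixel inputs in 0..765 (illustrative data).
rng = np.random.default_rng(0)
x = rng.integers(0, 766, size=1000).astype(float)

mean = x.mean()
var = x.var()

centered = x - mean             # option 1: subtract the mean
scaled = x / var                # option 2: divide by the variance
normalized = (x - mean) / var   # option 3: both

# After option 3 the data is centered around zero with a much
# smaller absolute range than the original 0..765.
print(normalized.mean(), normalized.min(), normalized.max())
```

Option 3 yields data with mean zero and a compressed range, which matches the Sigmoid's active region far better than raw values in the hundreds.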