## Fourier analysis for web visits – finding the signal in hidden data

Continuing my exploration of Python, I decided to apply what I learned last night about the matplotlib, SciPy and NumPy libraries, specifically FFTs, to a dataset familiar to me: the number of visits to one of my blogs over the past month. The data comes from my sailing blog, where I write in Swedish only, which should eliminate most of the ‘noise’ generated by time zone differences. Furthermore, at this time of year there’s not much traffic on that blog, so the dataset should be moderate in size.

I took the hourly visit counts for the past month and fed this dataset to the FFT as ‘the signal’. The graph above has two parts; the first shows the histogram of blog visitors (blue), with the filtered ‘signal’ given by the FFT in green.
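For readers who want to try the same thing, here is a minimal sketch of the transform-and-filter step. It is not the script behind the graph: the visit counts are synthetic stand-ins, and the cutoff of one cycle per 12 hours is an assumption picked for illustration only.

```python
import numpy as np

# Hypothetical stand-in for the real data: 30 days of hourly visit counts
# with a daily rhythm plus random noise. The actual blog data is not shown.
rng = np.random.default_rng(0)
hours = np.arange(30 * 24)
visits = 5 + 3 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1.5, hours.size)

# FFT of a real-valued signal. d=1.0 means one sample per hour,
# so the frequencies come out in cycles per hour.
spectrum = np.fft.rfft(visits)
freqs = np.fft.rfftfreq(visits.size, d=1.0)

# Crude low-pass filter: zero every component above the cutoff frequency.
cutoff = 1 / 12  # assumed cutoff, in cycles per hour
filtered_spectrum = np.where(freqs <= cutoff, spectrum, 0)

# Back to the time domain: this is the smoothed 'signal'.
filtered = np.fft.irfft(filtered_spectrum, n=visits.size)
```

Plotting `visits` and `filtered` against `hours` gives a picture of the same kind as the top part of the graph: the raw counts with a smoother curve on top.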

From the blue signal, it appears that there’s some periodic activity going on, and in the bottom part of the graph we can see a spike at a frequency slightly above 0.04 Hz. (I write Hz for convenience; since the samples are hourly, the unit is really cycles per hour.)

What does this mean…? Well, at least to me a frequency of 0.04 Hz on this type of data doesn’t say much, but for the fun of it, let’s convert that number to the corresponding period:

With a few more decimals, the peak sits at about 0.0417 cycles per hour, which corresponds to a period of 1/0.0417 ≈ 24 h! That is, there is some periodic activity occurring on my blog every 24 hours! Perhaps not so surprising, since the vast majority of the readers of that blog reside in the same or nearby time zones, and most of them probably check in after a hard day’s work.
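The conversion itself is just a reciprocal, which this small sketch spells out. The function names are my own, made up for illustration. It also shows why the extra decimals matter, and what the peak would be in true Hz (cycles per second).

```python
def period_hours(freq_cycles_per_hour):
    """Period in hours for a frequency given in cycles per hour."""
    return 1.0 / freq_cycles_per_hour


def to_hertz(freq_cycles_per_hour):
    """Convert cycles per hour to true Hz (cycles per second)."""
    return freq_cycles_per_hour / 3600.0


peak = 1 / 24        # the daily peak, about 0.0417 cycles per hour
rounded_peak = 0.04  # the same peak read off the axis with two decimals

print(period_hours(peak))          # the 24 h period
print(period_hours(rounded_peak))  # rounding the frequency shifts this to 25 h
print(to_hertz(peak))              # roughly 1.16e-05 Hz in SI units
```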

To see this more clearly, let’s plot the power from the lower part of the graph above over the periods:

Now we can see that there is a clear spike in the ‘power’ (the activity) of the dataset at a period of 24 h; that is, every 24 hours there is more intense activity on this blog than at other times. This is an interesting observation, not very visible in the original histogram, but it really jumps out at you when the time series is transformed by the FFT to the frequency domain.
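Here is a rough sketch of how such a power-versus-period view can be computed and the peak picked out automatically. Again the data is a synthetic stand-in for the real visit counts (an assumption on my part), so only the method carries over, not the numbers.

```python
import numpy as np

# Hypothetical hourly visit counts: 30 days with a daily cycle plus noise.
rng = np.random.default_rng(1)
hours = np.arange(30 * 24)
visits = 5 + 3 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1.5, hours.size)

# Subtract the mean so the zero-frequency term does not dominate the spectrum.
spectrum = np.fft.rfft(visits - visits.mean())
freqs = np.fft.rfftfreq(visits.size, d=1.0)  # cycles per hour

# Power is the squared magnitude of each FFT coefficient.
# Skip index 0, because frequency 0 has no finite period.
power = np.abs(spectrum[1:]) ** 2
periods = 1.0 / freqs[1:]  # hours

# The period carrying the most power.
peak_period = periods[np.argmax(power)]
print(peak_period)
```

Passing `periods` and `power` to matplotlib’s `plot` then gives the kind of graph described above, with the spike sitting at the daily period.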

[Inspiration for this post mainly from Science Blogs]