## Wind Speed Distribution – is it Normal…?

As a sailor, I’ve become obsessed with wind. Trying to understand its behavior has kept me busy for decades, and I still can’t claim I fully understand it.

Anyway, one of the things I’ve wondered is whether wind speed follows the normal (Gaussian) distribution. It doesn’t, as can be seen from the graphs above, which use data graciously provided by the Swedish Meteorological Service, SMHI.

Graphs 1 & 3 show time series of wind direction and speed from two stations on the Swedish east coast, the lighthouses ‘Svenska Högarna’ (about 600 data points) and ‘Söderarm’ (about 300,000 data points). For Svenska Högarna the time series is short, just June 2017 to October 2017, while the time series for Söderarm runs from 1951 to 2017.

As an aside: if you look carefully at the smoothed plots for Söderarm, you can see that something seems to happen to the data in 1995… I got in touch with the good folks at SMHI, and they confirmed that in November 1995 the measurements went from manual, 3 times a day, to fully automatic, once per hour, which explains the larger fluctuations in the data after 1995.

These two graphs also display the data in smoothed form, where I’ve used a Hanning window and convolution to bring out the overall trend in wind speed (TWS) and wind direction (TWD).
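For the curious, here is a minimal sketch of what Hanning-window smoothing by convolution can look like with NumPy. It is not the exact code behind the graphs: the function name `hann_smooth`, the window length of 25 samples and the synthetic wind speeds are all assumptions made for illustration.

```python
import numpy as np


def hann_smooth(values, window_len=25):
    """Smooth a 1-D series by convolving it with a normalised Hanning window."""
    values = np.asarray(values, dtype=float)
    if window_len < 3 or window_len > values.size:
        raise ValueError("window_len must be between 3 and len(values)")
    window = np.hanning(window_len)
    window /= window.sum()  # weights sum to 1, so the overall level is preserved
    # mode="valid" keeps only positions where the window fully overlaps the data,
    # so the result is window_len - 1 samples shorter than the input.
    return np.convolve(values, window, mode="valid")


# Synthetic stand-in for a TWS series in m/s (NOT real SMHI data).
rng = np.random.default_rng(0)
tws = 7.0 + rng.normal(0.0, 2.0, size=500)
smoothed = hann_smooth(tws, window_len=25)
```

A longer window gives a smoother trend line at the cost of losing more points at the ends of the series. Note that this sketch only makes sense for speed; direction wraps around at 360°, so it would need separate handling.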

Graphs 2 & 4 show the relative distribution of wind speed (TWS), measured in m/s. It’s obvious from these histograms that wind speed does not follow a normal distribution. Instead, wind is characterised by a long tail, typical for phenomena governed by different types of power laws. Apparently the wind pros use the Weibull distribution to model wind speed behavior, at least according to this reference.

On top of the wind speed distributions in graphs 2 & 4, I’ve overlaid a normal probability density function, in orange, as well as a random sample drawn with the same parameters as the wind data, shown as a red dashed line. Furthermore, the mean is marked on the graphs by a red dashed vertical line, and the 1st, 2nd & 3rd standard deviations by dashed orange lines.
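The comparison can be sketched roughly as follows. Since the SMHI data isn’t reproduced here, the sketch uses synthetic Weibull-distributed speeds as a stand-in; the shape (2.0) and scale (8 m/s) are arbitrary assumptions, not values fitted to the real stations, and `normal_pdf` is just a helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for TWS in m/s: Weibull with assumed shape 2.0 and scale 8.0.
tws = 8.0 * rng.weibull(2.0, size=100_000)

mean = tws.mean()
std = tws.std()


def normal_pdf(x, mu, sigma):
    """Probability density of a normal distribution with mean mu and std sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# Relative distribution of the data: a histogram normalised to unit area.
density, edges = np.histogram(tws, bins=50, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])

# Normal curve with the same mean and standard deviation, evaluated at bin centres.
fitted = normal_pdf(centres, mean, std)

# Random normal sample with the same parameters as the data.
normal_sample = rng.normal(mean, std, size=tws.size)

# Positions for the 1st, 2nd and 3rd standard-deviation marker lines.
sigma_lines = [mean + k * std for k in (1, 2, 3)]

# Sample skewness: close to 0 for symmetric data, positive for a long right tail.
skewness = np.mean(((tws - mean) / std) ** 3)
```

The arrays `centres`, `density` and `fitted` can be handed straight to Matplotlib for plotting. In this synthetic case the skewness comes out clearly positive, and the normal sample contains negative speeds, which real wind can’t have; both are signs that the bell curve is a poor fit.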

The raw data from SMHI came as .csv files, and I used Python Pandas to read and manipulate the data. Pandas is great for processing large amounts of more or less structured data: it takes only about 60 lines of code to get from the raw csv files to the Matplotlib graphs.
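To give a flavour of the Pandas side, here is a small sketch of reading and reshaping such a file. The column names (`date`, `time`, `tws`, `twd`), the semicolon separator and the six sample rows are all made up for this example; the real SMHI files may well be laid out differently.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real SMHI file.
raw = io.StringIO(
    "date;time;tws;twd\n"
    "2017-06-01;00:00:00;5.2;210\n"
    "2017-06-01;08:00:00;6.1;220\n"
    "2017-06-01;16:00:00;7.8;240\n"
    "2017-06-02;00:00:00;3.4;190\n"
    "2017-06-02;08:00:00;9.5;200\n"
    "2017-06-02;16:00:00;4.0;180\n"
)

df = pd.read_csv(raw, sep=";")

# Combine the date and time columns into a single timestamp index.
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])
df = df.set_index("timestamp").drop(columns=["date", "time"])

# Daily mean wind speed.
daily = df["tws"].resample("D").mean()

# Relative distribution of wind speed in 2 m/s bins, in bin order.
bins = [0, 2, 4, 6, 8, 10]
rel_freq = pd.cut(df["tws"], bins=bins, right=False).value_counts(
    normalize=True, sort=False
)
```

With a real file, `io.StringIO(...)` would be replaced by the path to the csv, and the separator and column names adjusted to match.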