Since I’ve been playing with Gephi for a while, I decided to see if I could use Gephi’s GeoLayout algorithm for anything meaningful.
On the net, I found webpages listing the position (latitude/longitude) of a set of capital cities around the world. The websites I found gave the positions in the traditional format, i.e. degrees/minutes/seconds. Unfortunately, Gephi’s ‘GeoLayout’ expects positions in decimal degrees, as floats.
The first task was thus to convert positions in the form of 59° 18′ 30” N to a decimal representation. That’s for the latitude; for the longitude it’s pretty much the same, except that the North/South indicator changes to East/West.
In the decimal representation, southerly latitudes and westerly longitudes are negative. Thus, I needed not only to transform the lat & long strings to decimal numbers, but also to give them a sign, depending on whether the position was north or south, or east or west.
So, the first task was to parse strings. While I wouldn’t have had any problems doing that in C or C++, or even Ada (after all, I’ve spent a fair number of years programming in those languages), I decided that this problem might be a good exercise for learning the basics of a “new”, popular language, Python. That seems to be the language of choice today for many applications…
So I decided to spend a few hours with Python, eventually getting absolutely flabbergasted and increasingly frustrated by the language: a language wildly mixing any and all programming paradigms (procedural/functional/declarative/object-oriented etc. etc.) and also offering a huge number of external “modules” that can be loaded… How the f*ck to know what modules there are, or what they can do…?!
And perhaps the most frustrating thing of all: Python is sensitive to indentation…! That is, in order to define a block, e.g. a loop or an if-statement, you have to indent the code…!
OK, thanks to Emacs’ Python mode, the indentation problem wasn’t too bad, but still, I felt like I was back at Fortran, where column positions matter… wasn’t that a cause of some rocket failure in the 50s…?
Anyways, with some browsing of the Python tutorials on the net, I figured out how to transform the Lat/long positions to decimal form.
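A conversion along these lines could be sketched as follows. This is only a minimal illustration, not the script I actually wrote: the function name `dms_to_decimal` and the regular expression are assumptions made up for this example, and the pattern assumes positions that look like 59° 18′ 30” N.

```python
import re

# Hypothetical pattern: degrees, minutes, seconds, then a hemisphere letter.
# It accepts both typographic and plain ASCII minute/second marks.
DMS_PATTERN = re.compile(
    r"""(\d+)\s*°\s*(\d+)\s*[′’']\s*(\d+(?:\.\d+)?)\s*[”″"]\s*([NSEW])"""
)


def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.search(text)
    if match is None:
        raise ValueError(f"not a degrees/minutes/seconds position: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    # One degree is 60 minutes, one minute is 60 seconds.
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative in the decimal representation.
    return -value if hemisphere in "SW" else value
```

Calling `dms_to_decimal("59° 18′ 30” N")` gives roughly 59.3083, and the same position with an S instead of the N would come out negative.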
The next problem was to “map” the ‘city’/position data I had managed to extract from the first website to the ‘city’/population data I found on a different website.
Basically, I had the city/position data as one data structure in my Python program, and the city/population data as another, parsed from a different file, and I needed to join the two by city name.
Turns out Python’s built-in “dictionary” data type did the job wonderfully: with a dictionary, it was a piece of cake to make the mapping, and thereby provide Gephi with a file with a few hundred entries in the form:
City Latitude Longitude Population
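The dictionary-based mapping can be sketched roughly like this. Again, this is an illustration under assumptions rather than my actual program: the names `join_cities` and `to_lines`, the tab separator, and the sample cities and numbers are all made up for the example.

```python
# Made-up sample values, for illustration only (not the data from the websites).
positions = {"Alpha": (59.31, 18.06), "Beta": (-34.6, -58.38)}
populations = {"Alpha": 1000, "Gamma": 500, "Beta": 2500}


def join_cities(positions, populations):
    """Combine position and population for cities present in both dicts."""
    rows = []
    for city, (lat, lon) in positions.items():
        # The dictionary lookup by city name does the actual "mapping".
        if city in populations:
            rows.append((city, lat, lon, populations[city]))
    return rows


def to_lines(rows, sep="\t"):
    """Format rows as text lines; the tab separator is an assumption."""
    header = sep.join(["City", "Latitude", "Longitude", "Population"])
    return [header] + [sep.join(str(value) for value in row) for row in rows]


for line in to_lines(join_cities(positions, populations)):
    print(line)
```

Cities that appear in only one of the two sources (like the invented “Gamma” here) are simply skipped, which is handy when the two websites don’t list exactly the same cities.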
Even though Python feels a bit overwhelming for a beginner, I can see its power: despite having less than 3 hours of Python programming experience, I managed to hack together a program that would have taken me at least the same amount of time in the languages I know. Despite the confusing mix of programming paradigms, Python seems to be very powerful.
The image above shows the result of the exercise: the “warmer” and larger the text, the higher the population of the city. And the cities are laid out according to their position, thanks to Gephi’s GeoLayout.