Archive for December, 2010

Ngrams: when Statistics overtook Mathematics

20/12/2010

The Internet is abuzz (1 2 3) with Google’s Ngrams Viewer, which lets you track how often words and short phrases are used in “lots of books” from 1500 to 2008, in six languages.

It would be easy to spend hours playing with these data. Here is a quick look at when Statistics (rightfully) overtook Mathematics, in English and in French books:

Statistics overtook Mathematics half a century earlier in English than it did in French. I wonder what caused the bump in French around 1960.

Data on shared bicycles in Lyon

14/12/2010

Pablo Jensen et al. recently posted on arXiv a short analysis of a fantastic data set: 11 million trips made with the shared bicycles Vélo’v in Lyon. For every trip, the start station, the final station, and the time, duration and distance of the trip are available.

Among other things, they show that the average Vélo’v is faster than the average car at peak time, that cyclists are faster on Wednesdays and slower during weekends, and that winter speeds are higher (presumably because casual, hence slower, cyclists only cycle during the summer). My guess would be that cyclists who ride their own bikes, rather than use Vélo’v, are even faster, since they are probably more used to cycling and definitely have better and lighter bikes.
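For the curious, here is a minimal sketch of how a speed-by-weekday comparison could be computed from trip records. I have not seen the authors' code, so everything here is an assumption: the function name `mean_speed_by_weekday`, the record layout (start time, duration in seconds, distance in metres), and the choice to average per-trip speeds rather than divide total distance by total time. The sample trips are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Monday is 0 in datetime.weekday(), so this list maps the index to a name.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_speed_by_weekday(trips):
    """Average per-trip speed (km/h) for each weekday present in `trips`.

    `trips` is an iterable of (start_time, duration_s, distance_m) tuples.
    This layout is hypothetical, not the format of the real data set.
    """
    totals = defaultdict(lambda: [0.0, 0])  # weekday -> [sum of speeds, count]
    for start_time, duration_s, distance_m in trips:
        if duration_s <= 0:
            continue  # skip records with no usable duration
        speed_kmh = (distance_m / 1000.0) / (duration_s / 3600.0)
        entry = totals[WEEKDAYS[start_time.weekday()]]
        entry[0] += speed_kmh
        entry[1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


# Made-up trips: one on a Wednesday morning, one on a Saturday afternoon.
sample_trips = [
    (datetime(2010, 12, 15, 8, 30), 600, 2500),
    (datetime(2010, 12, 18, 15, 0), 900, 2500),
]

for day, speed in mean_speed_by_weekday(sample_trips).items():
    print(f"{day}: {speed:.1f} km/h")
```

Averaging per-trip speeds gives every trip the same weight, whatever its length. Dividing total distance by total time would weight long trips more heavily, and I do not know which definition the paper uses.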

Given the start and end points of a trip, they can also calculate the length of the shortest path which obeys all one-way streets, and they show that a whopping 61% of cyclists take a shortcut, which must involve cycling the wrong way or on pavements. (It goes without saying that Parisian cyclists would never do such a thing.)
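To make the idea concrete, here is a rough sketch of how such a shortcut test might look. It is not the authors' method: I am assuming the street network is a directed graph (a one-way street is a single directed edge) and using a plain Dijkstra search for the shortest legal path. The names `shortest_path_length` and `took_shortcut`, the `tolerance` parameter and the toy network are all hypothetical.

```python
import heapq


def shortest_path_length(graph, source, target):
    """Length of the shortest directed path from source to target (Dijkstra).

    `graph` maps each node to a list of (neighbour, length_in_metres) pairs.
    Returns float("inf") if the target cannot be reached legally.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")


def took_shortcut(graph, source, target, recorded_metres, tolerance=0.0):
    """True if the recorded distance is shorter than any legal route."""
    legal = shortest_path_length(graph, source, target)
    return recorded_metres < legal - tolerance


# Toy network (made up): A -> B is a one-way street of 100 m, so the only
# legal way from B back to A is the detour B -> C -> A (350 m).
toy = {
    "A": [("B", 100.0)],
    "B": [("C", 150.0)],
    "C": [("A", 200.0)],
}

print(shortest_path_length(toy, "B", "A"))  # legal detour length
print(took_shortcut(toy, "B", "A", 100.0))  # rode the one-way street backwards
```

On real data a non-zero `tolerance` would probably be needed, since measured distances are noisy and stations do not sit exactly on graph nodes.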

This is a fantastic data set, and I cannot wait to see more analyses, or some visualizations à la Pedro M. Cruz.