Archive for the ‘Uncategorized’ Category

Post-doctoral position in Paris: Statistical modelling for Historical Linguistics

08/08/2017

A postdoc position is open, to come work with me and several Linguists at École Normale Supérieure, on questions related to Statistical modelling for the history of human languages and for monkey communication systems.

See the detailed announcement.

Deadline for application is 23 August.

Lecturer position in Statistics at Dauphine

21/10/2016

An associate professor (“Maître de conférences”) position in Applied or Computational Statistics is expected to be open at Université Paris-Dauphine. The recruitment process will mostly take place during the spring, for an appointment date of 1 September 2017.

However, candidates must first go through the national “qualification”. This process should not be problematic, but is held much earlier in the year: you need to sign up by 25 October (next week!), then send some documents by December. Unfortunately, the committee cannot consider applications from candidates who do not hold the “qualification”.

If you need help with the process, feel free to contact me.

David Cox is the inaugural recipient of the International Prize in Statistics

20/10/2016

David Cox was announced today as the inaugural recipient of the International Prize in Statistics.

My first foray into Statistics was an analysis of Cox models I did for my undergraduate thesis at ENS in 2005. I had no idea back then that David Cox was still alive and active; in my mind, he was a historical figure, on par with other great mathematicians who gave their names to objects of study: Euler, Galois, Lebesgue…

When I arrived at Oxford a few months later, I was amazed to meet him, and to see that he was still very active, both as a researcher and as the organizer of events for doctoral students.

David Cox is the perfect choice as the first person to receive this prize. I hope that the inauguration of this prize will help show the public that Statistics requires complex and innovative methods, developed by some exceptional minds, and should not be seen as a “sub-science” compared to other more “noble” sciences.

MCMSki 4

06/01/2014

I am attending the MCMSki 4 conference for the next 3 days; I guess I’ll see many of you there!

I am organizing a session on Wednesday morning on Advances in Monte Carlo motivated by applications; I’m looking forward to hearing the talks of Alexis Muir-Watt, Simon Barthelmé, Lawrence Murray and Rémi Bardenet during that session, as well as the rest of the very strong programme.

I’ll also be part of the jury for the best poster prize; there are many promising abstracts.

Expect some blog posts in French

24/06/2013

This is just a warning that from now on, a small proportion of my blog posts will be in French. I’ll use French for posts which I think will appeal primarily to French speakers: either posts for students of courses that I give in French, or posts on the French higher education system which would be of little interest to people outside of France. I guess this is similar to what Arthur Charpentier does.

In particular, I’ll keep posting in English for anything related to my research topics.

PhD acknowledgement

18/04/2013

Acknowledgement from an anonymous doctoral dissertation in the University Microfilms International database:

If I had a dime for every time my wife threatened to divorce me during the past three years, I would be wealthy and not have to take a postdoctoral position which will only make me a little less poor and will keep me away from home and in the lab even more than graduate school and all because my committee read this manuscript and said that the only alternative to signing the approval to this dissertation was to give me a job mowing the grass on campus but the Physical Plant would not hire me on account of they said I was over-educated and needed to improve my dexterity skills like picking my nose while driving a tractor-mower over poor defenseless squirrels that were eating the nuts they stole from the medical students’ lunches on Tuesday afternoon following the Biochemistry quiz which they all did not pass and blamed on me because they said a tutor was supposed to come with a 30-day money-back guarantee and I am supposed to thank someone for all this?!!

(From a UMI press release, quoted in The Whole Library Handbook 2, 1995)

Source: Futility Closet via Arthur Charpentier.

Unicode in LaTeX

16/04/2013

The way I type LaTeX has changed significantly in the past couple of months. I now type most of my math formulae in unicode, which makes the source code much more readable.

A few months ago, I might have written

$\lambda/\mu=\kappa/\nu \Rightarrow \exists \Theta,\forall i, \sum_{j\in\mathbb{N}} E[D_{i,j}]=\Theta$

to display

\lambda/\mu=\kappa/\nu \Rightarrow \exists \Theta,\forall i, \sum_{j\in\mathbb{N}} E[D_{i,j}]=\Theta.

Now, to type the same equation, my LaTeX source code looks like this:

$λ/μ=κ/ν ⇒ ∃Θ,∀i, ∑_{j∈ℕ} E[D_{i,j}]=Θ$

which produces exactly the same output. The source code is much easier to read; it is also slightly easier to type. Here is how the magic works:

  • In the preamble, add
    \usepackage[utf8x]{inputenc}
    \usepackage{amssymb}
  • A number of special characters (including all Greek letters) were already easily available to me because I use a bépo keyboard (if you are a French speaker, you should try it out); otherwise, all characters are available using any keyboard to users of a Unix-like OS thanks to this great .XCompose file. For example, to get ℕ, use the keys Compose+|+N (pretty intuitive, and faster than typing \mathbb{N}). To get ∃, use Compose+E+E; to get ∈, use Compose+i+n, and so on.
  • There are two issues with this solution: first, the unicode symbol α maps to \textalpha instead of \alpha; second, the blackboard letters map to \mathbbm instead of \mathbb. This can lead to errors, but I wrote this file which solves the issue by including in the preamble:
    \input{greektex.tex}
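For readers who want a self-contained starting point, here is a minimal sketch of the same idea. It is not the contents of greektex.tex, which may work differently: it assumes the standard utf8 option of inputenc (rather than utf8x) and maps each Unicode character used in the example equation to its math-mode macro by hand with \DeclareUnicodeCharacter. Only the characters needed for that one equation are declared.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % assumption: standard utf8, not the utf8x option used above
\usepackage{amssymb}        % provides \mathbb

% Map each Unicode character (hexadecimal code point) to a math-mode macro.
% These mappings are illustrative; extend the list as needed.
\DeclareUnicodeCharacter{03BB}{\lambda}
\DeclareUnicodeCharacter{03BC}{\mu}
\DeclareUnicodeCharacter{03BA}{\kappa}
\DeclareUnicodeCharacter{03BD}{\nu}
\DeclareUnicodeCharacter{0398}{\Theta}
\DeclareUnicodeCharacter{21D2}{\Rightarrow}
\DeclareUnicodeCharacter{2203}{\exists}
\DeclareUnicodeCharacter{2200}{\forall}
\DeclareUnicodeCharacter{2211}{\sum}
\DeclareUnicodeCharacter{2208}{\in}
\DeclareUnicodeCharacter{2115}{\mathbb{N}}

\begin{document}
% Same equation as above, typed directly in Unicode.
$λ/μ=κ/ν ⇒ ∃Θ,∀i, ∑_{j∈ℕ} E[D_{i,j}]=Θ$
\end{document}
```

With these declarations the characters are only meaningful in math mode, since macros such as \lambda raise an error in running text.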

This is useful for LaTeX, but also for all other places where you might want to type math: thanks to this .XCompose file, typing math in a blog post, tweet or e-mail becomes easy (for example, this is the last blog post where I will use WordPress’ $latex syntax). And if there ever is a LaTeX formula that you cannot access from your keyboard, you can use a website such as unicodeit.net which converts LaTeX source code to unicode.

I originally heard about this on Christopher Olah’s blog.

 

Beeminder will help you keep your resolutions

23/01/2013

I make many resolutions, and keep hardly any in the long term, which I think is very common. For about two years, I have been using Beeminder, a tool developed by the recently mentioned Dan Reeves, which is fantastic for number-inclined people (which includes most readers of this blog, presumably).

For example, I have a stack of academic papers to read which keeps growing. I have set the (modest) goal to read at least 2 of those papers per week on average, i.e. that I read 104 papers in a year; others might want to lose 10 kg by July, or get their number of unread e-mails back to 0 by next month. Usually, such resolutions have the following effect: during the first couple of weeks, I will indeed read 2 papers/week; in week 3, I’ll read only 1, but think that I’ll catch up the next week, which I never do; by week 4, I’ve got so much catching up to do that I might as well give up on the resolution.

Beeminder helps check that I stay on track constantly. On any given day, I must be on the “yellow brick road” which leads to my goal, and which allows some leeway based on the variance of data entered up until now. A big goal in the far future is thus transformed into a small goal in the near future. In the graph below, the number at the top left is the number of days I have before I must read another paper to avoid derailing.

If you need an extra incentive, you can pledge money, which is only charged if you fail at your goal (you are allowed to fail your goal once for free; if you fail and want to start again, you need to promise to pay should you fail a second time).

This is all explained in further detail on Beeminder’s blog. Apart from academic papers to read, I am tracking how much I swim, boring admin tasks, and several other goals I would not keep otherwise, and I have found Beeminder to be a great way to achieve these goals. Give it a try!

PLoS scientific paper also gets published on Wikipedia

21/05/2012

Wikipedia is awesome, and we all rely on it on a day-to-day basis. For basic statistical topics, I usually find it very reliable. For more advanced topics, however, coverage is often sketchy or even non-existent. The article on ABC is exceedingly short; the one on the Wang-Landau algorithm is rather odd; the one on quantitative comparative linguistics is essentially a long list.

I often get annoyed when people complain about Wikipedia, for two reasons: 1. Although imperfect, it is pretty good, and often better than other more widely accepted general knowledge encyclopedias; 2. If you find a problem, you should fix it rather than complain.

Although I am far from being the most active contributor on Wikipedia, I edit it quite a lot, and am an administrator of Wikimedia Commons. But I have made hardly any contributions in Statistics, my area of expertise. The reason is simple: when I do Statistics, it is for my job, so I either do research or prepare lectures. Editing Wikipedia is a hobby, so I want it to be separate from my job.

But this is sub-optimal: Wikipedia could be improved if more specialists were to participate. Also, and I should try not to be too depressed by this, anything I write on Wikipedia will probably be read thousands of times more than the scientific papers I write. One could thus argue that it would be an efficient use of my time for me and other statisticians to edit Wikipedia on Statistics topics. One could even go further and state that it should be one of our duties.

There are several obstacles here. The first obstacle is that Wikipedia’s model is that it can be edited by anyone; other encyclopedias which attempt to have articles written by specialists are nowhere near as successful (e.g. Citizendium). Some kind of assurance that an expert’s edits are given more importance than a layman’s could incentivize more experts to participate. I actually believe that in the large majority of cases, an expert’s contribution would be recognized as such and that no one would try to deconstruct it (most exceptions would be in topics subject to hot debate either in the scientific community or in the media). A second obstacle is that editing Wikipedia is not recognized in the career path of scientists: I might want scientists in general to participate in Wikipedia; but I won’t do it, because I really need to finish these three papers and grant proposals (a kind of NIMBY issue). A third obstacle is that scientists simply cannot be bothered to edit Wikipedia.

PLoS Computational Biology has started an initiative to address these obstacles. In a nutshell: scientists write a review paper. It gets published in the journal, and the scientists get a publication on their CV. At the same time, the paper gets copied over to Wikipedia. The initial version is marked as being written by specialists, but anyone can edit it (e.g. improve the wording, add relevant links, etc.). Everyone wins. The first such article is on Circular permutation in proteins.

Obviously, scientists should also edit Wikipedia even when they don’t get a publication out of it. But such initiatives will help get scientists involved. Explaining scientific topics to the general public is one of the duties of scientists, and Wikipedia is one of the best tools to do this.

Greek Stochastics γ

08/06/2011

The Greek Stochastics γ conference was held last week in Crete and focused this year on MCMC methods (last year’s was on statistics for biology). The schedule allowed little time for presentations from participants, but there were several “short courses”. In particular, Gareth Roberts gave a rather illuminating explanation of the convergence of Gibbs samplers, in which each step of the sampler is viewed as a projection in a functional space.

In a Gibbs sampler with two steps (1. update X given \theta; 2. update \theta given X), step 1 is a projection (recall that P is a projection iff P^2=P, and it is clear that performing step 1 twice in a row is the same as performing it only once); the same is true of step 2. The spaces we are projecting on are ugly, but if you view them as lines, it is easy to see that the successive iterations converge to the intersection of the two lines, which corresponds to a fixed point, as shown in this very ugly figure. The convergence is faster if the two lines are close to orthogonal, which corresponds to correlation close to 0 in the Gibbs sampler. Apparently, this line of thought was initiated by Yael Amit. I find it very helpful.
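The link between correlation and speed can be checked by hand on a toy case. The following is a sketch under an assumption that is not taken from the talk: the target is a standard bivariate normal with correlation \rho between X and \theta. Under that assumption, one full sweep of the sampler shrinks the conditional mean of \theta by a factor \rho^2, so the dependence on the starting point vanishes geometrically, and does so faster when \rho is close to 0.

```latex
% Requires amsmath (for pmatrix).
% Assumed toy target: standard bivariate normal with correlation \rho.
\[
(X,\theta) \sim \mathcal{N}\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
\right)
\]
% Step 1 (update X given \theta) and step 2 (update \theta given X):
\[
X_{t+1} \mid \theta_t \sim \mathcal{N}(\rho\,\theta_t,\; 1-\rho^2),
\qquad
\theta_{t+1} \mid X_{t+1} \sim \mathcal{N}(\rho\,X_{t+1},\; 1-\rho^2)
\]
% One full sweep multiplies the conditional mean by \rho^2:
\[
E[\theta_{t+1} \mid \theta_t]
  = \rho\, E[X_{t+1} \mid \theta_t]
  = \rho^2\, \theta_t
\quad\Longrightarrow\quad
E[\theta_t \mid \theta_0] = \rho^{2t}\, \theta_0
\]
```

For instance, with \rho = 0.9 the factor per sweep is 0.81, whereas with \rho = 0.1 it is 0.01.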

Another fascinating talk was given by David Spiegelhalter (the “general interest” talk), on communicating in Statistics. David studies the question “Why do people find probability unintuitive and difficult?”; he suggests that the answer is “because probability is unintuitive and difficult”. He gave a startling example: in a phone survey, if you ask which is the greatest risk among 1/100, 1/10 and 1/1000, about 25-30% of people give the wrong answer. Media reports often use such notation, with the numerator fixed to 1. The message would be better understood if probabilities were given with a fixed denominator (1/100, 10/100, 0.1/100). David had many other suggestions on explaining risk and uncertainty to non-statisticians, all very useful.