Thursday, March 3, 2016

Which are the fastest (and slowest) running countries in the world?

As running becomes more and more popular around the world, there is clearly a need to rank countries by the speed of their runners. Why should I care, you ask? It's a good point, I'd retort, but we seem to like rankings of all sorts of things, so why not? We rank countries by feelings of love, cities by bike-friendliness, and even professions by right swipes.

(Note: Just scroll down to see the maps and lists of fastest countries without going through the explanations.)  

To get some data on running speeds I decided to look at the New York City Marathon, one of the largest and most international marathons in the world. I collected results from 2011, 2013 and 2014 -- about 150,000 runners altogether. A simple way to find each country's speed is to take the average speed of its runners and compare across countries. The problem with this approach is that it is somewhat naive. Different countries may be represented by different compositions of runners. A country with more female runners may appear slower than a country with more male runners. In a poorer country, it may be only the highly motivated, faster runners who choose to make a pricey trip to New York.

To construct an adjusted ranking of countries' speed I follow the insights of the gravity model in Economics -- a model originally developed to analyze trade flows between countries and regions. Analogous to the way gravity is determined by objects' mass and the distance between them, this model predicts that the strength of trade between two countries is affected by their distance from each other and the size of their economies. The closer the countries and the bigger their economies, the stronger the predicted trade.
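For readers who want to see the model written down, here is the usual textbook form of the gravity equation. The notation is the standard one and is an assumption on my part, not something taken from the analysis in this post: F is the trade flow between countries i and j, M is the size of each economy (e.g. GDP), D is the distance between them, and G is a constant.

```latex
% Textbook gravity equation of trade (notation assumed for illustration)
F_{ij} = G \, \frac{M_i^{\beta_1} \, M_j^{\beta_2}}{D_{ij}^{\beta_3}}

% Taking logs gives the linear form that is typically estimated by regression
\ln F_{ij} = \ln G + \beta_1 \ln M_i + \beta_2 \ln M_j - \beta_3 \ln D_{ij} + \varepsilon_{ij}
```

Bigger economies raise the predicted flow and greater distance lowers it, which is the same logic applied to runners in what follows.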

For a runner to decide to compete in the NYC marathon she must weigh how much she wishes to participate in the run (and enjoy a visit to NYC) against the price of the trip and the time it takes. If the runner lives in NYC then the price is quite low and the time is negligible. Even a runner who is not highly invested in marathon running, and is slower than average, may still decide to participate. On the other hand, for a runner who lives in Delhi, the cost is substantial and the trip is likely to take a number of days at least. A runner incurring the cost of the trip is possibly more motivated, and hence faster, than the average runner back home. As a result, we would observe that Americans run the NYC marathon slower than Indians, and could incorrectly conclude that the average speed of all American runners is slower than that of all Indian runners.

By controlling for age, gender, distance from NYC, and GDP per capita, I compute average adjusted marathon 'pseudo'-times for countries with more than 20 participants. You can see below two world maps colored according to countries' NYC marathon running times (in number of hours to complete the marathon). The upper map shows the adjusted times and the lower one the original times.
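To make "controlling for" a bit more concrete, here is a minimal sketch of one common way to build such adjusted times: regress finishing time on the controls, then report the overall mean plus each country's average residual. This is an illustration under my own assumptions, not necessarily the exact procedure behind the maps. The function names, the field names and the six toy runners are all hypothetical, and the toy example controls only for age and gender (a fuller version would add columns such as log distance from NYC and log GDP per capita).

```python
from collections import defaultdict

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def adjusted_times(runners, controls):
    """Overall mean time plus each country's mean regression residual."""
    X = [[1.0] + [float(r[c]) for c in controls] for r in runners]
    y = [r["hours"] for r in runners]
    beta = ols(X, y)
    mean_y = sum(y) / len(y)
    resid = defaultdict(list)
    for r, row, t in zip(runners, X, y):
        fitted = sum(b * x for b, x in zip(beta, row))
        resid[r["country"]].append(t - fitted)
    return {c: mean_y + sum(v) / len(v) for c, v in resid.items()}

def raw_times(runners):
    """Plain average finishing time per country, with no adjustment."""
    by_country = defaultdict(list)
    for r in runners:
        by_country[r["country"]].append(r["hours"])
    return {c: sum(v) / len(v) for c, v in by_country.items()}

# Hypothetical toy data: times depend only on age and gender, so countries
# "A" and "B" are equally fast once their composition is accounted for.
toy = [
    {"country": "A", "age": 30, "female": 0, "hours": 3.6},
    {"country": "A", "age": 40, "female": 0, "hours": 3.8},
    {"country": "A", "age": 50, "female": 1, "hours": 4.5},
    {"country": "B", "age": 30, "female": 1, "hours": 4.1},
    {"country": "B", "age": 40, "female": 1, "hours": 4.3},
    {"country": "B", "age": 50, "female": 0, "hours": 4.0},
]

raw = raw_times(toy)
adj = adjusted_times(toy, ["age", "female"])
print("raw:", raw)
print("adjusted:", adj)
```

In this toy case country B looks slower in the raw averages only because it has more female runners. After the adjustment the two countries come out essentially equal.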

So which countries are the fastest? Below is the list of countries ranked by their adjusted average running time. You can also see their ranking according to the original, non-adjusted, average running times on the right.

Rank Country Adjusted time Original time Original rank
1 Kenya 3:39:10 2:37:52 1
2 Slovenia 3:40:24 3:51:21 2
3 Norway 3:46:58 4:18:29 22
4 Luxembourg 3:47:56 4:16:29 18
5 Switzerland 3:48:53 4:17:39 20
6 Portugal 3:49:57 3:54:50 3
7 Denmark 3:52:20 4:13:49 14
8 Austria 3:53:59 4:14:20 15
9 Belgium 3:55:11 4:12:56 12
10 Sweden 3:56:07 4:19:45 25
11 Czech Republic 3:57:04 4:00:22 5
12 Iceland 3:58:08 4:19:32 23
13 Spain 3:58:52 4:07:12 8
14 Latvia 3:59:24 4:02:29 6
15 Bermuda 4:00:10 4:12:11 11
16 Slovakia 4:00:25 3:55:31 4
17 France 4:01:01 4:19:50 26
18 United States of America 4:03:09 4:35:30 46
19 Poland 4:04:55 4:03:45 7
20 Netherlands 4:05:31 4:25:46 33
21 Germany 4:07:12 4:29:22 38
22 Croatia 4:07:41 4:07:48 9
23 Canada 4:08:53 4:24:53 31
24 Finland 4:10:22 4:35:16 45
25 United Kingdom 4:10:42 4:28:04 36
26 Costa Rica 4:12:53 4:14:36 16
27 Italy 4:13:39 4:27:34 34
28 Greece 4:14:37 4:18:24 21
29 Russia 4:16:37 4:13:16 13
30 Chile 4:18:08 4:17:31 19
31 Colombia 4:18:16 4:16:22 17
32 Israel 4:20:36 4:09:54 10
33 Brazil 4:25:20 4:25:20 32
34 Hungary 4:25:51 4:19:40 24
35 Panama 4:27:32 4:33:45 43
36 South Korea 4:27:33 4:48:01 52
37 Mexico 4:29:08 4:31:26 41
38 Dominican Republic 4:30:47 4:27:41 35
39 Argentina 4:30:54 4:29:49 39
40 Peru 4:31:01 4:24:34 29
41 Ecuador 4:32:54 4:24:43 30
42 Australia 4:32:58 4:22:49 27
43 Japan 4:33:48 4:53:25 57
44 Uruguay 4:34:27 4:33:10 42
45 Ireland 4:35:58 4:34:37 44
46 Venezuela 4:38:40 4:37:01 47
47 Hong Kong 4:41:29 4:24:13 28
48 Guatemala 4:42:06 4:28:41 37
49 United Arab Emirates 4:42:39 4:50:53 54
50 Singapore 4:47:53 4:30:55 40
51 Turkey 4:48:21 4:43:36 49
52 The Bahamas 4:51:05 4:53:01 56
53 Paraguay 4:53:43 4:42:32 48
54 New Zealand 4:54:53 4:52:28 55
55 China 5:00:03 4:47:01 50
56 South Africa 5:20:13 4:47:56 51
57 India 5:42:24 4:48:47 53
58 Philippines 5:54:54 5:08:41 58
59 Indonesia 5:57:59 5:26:36 59

It is easy to see how factors like average age and gender composition affect the country rankings in the list by checking some of the highlighted countries in the scatter plot below.

Countries by average age and fraction of female runners

Kenya, the fastest country in the world by far, is represented by much younger runners than any other country. For that reason, the adjusted rankings, while still showing Kenya as the fastest country in the world, reduce the gap between it and the second fastest country considerably. Slovakia, represented by somewhat younger runners and by a very low fraction of female runners, is relegated from 4th fastest in the original running times to 16th fastest in the adjusted times.

On the other hand, Canada is represented by a high fraction of women and South Korea is represented by much older runners than any other country. For those reasons both these countries' rankings are higher using the adjusted running times.

Tuesday, February 9, 2016

I, We, or no one: Does gender matter?

How do writers refer to their own single-author work when writing about it? Do they use singular pronouns (e.g. 'I predict the existence of a 9th planet'), more participative plural pronouns (e.g. 'Our model fits the observed data'), or avoid pronouns altogether (e.g. 'The paper examines tool usage by parrots')? Is there a difference between male and female authors?

In a previous post I looked at 60,000 Economics publications over 30 years to check what single authors actually choose to do. The data shows that authors of more recent publications tend to use the passive form less frequently, and that more experienced authors tend to use singular pronouns more often.

Should gender matter for the choice of pronouns? Recent research into gender and preferences has shown that gender can have a strong effect on behavior and choices. Women tend to seek less competition, take less risk, and avoid over-confident behavior relative to men. In our case, if using 'I' and 'My' is a sign of (over)confidence, it is possible that male authors would tend to use them more often. If there is a perceived risk in accentuating your own contribution by using 'I', then female authors might be more likely to use 'We' or a passive tone. In practice, the opposite occurs:

The graph above shows the fraction of authors using singular pronouns over time and by gender. Female single authors are consistently more likely to use 'I' and 'My', and less likely to use 'We' and 'Our', than male authors. This result is highly statistically significant, even when controlling for other factors such as experience, citations, and journal ranking.

All in all, female economists are about 20% more likely to use singular pronouns than male economists. What can explain this large difference in choice due to gender? The floor is open for suggestions!

Wednesday, January 13, 2016

Framing matters

Would you rather contribute to a charity with the goal of saving children's lives or to a charity that aims at preventing the deaths of children?  

Both charities' goals are identical, but many would prefer to contribute to the second charity rather than to the first one. The first charity's goal is framed as making a gain while the second is framed as preventing a loss. Since people tend to focus on losses more than on gains, the second charity's cause feels more urgent.

What is healthier? A burger which is 75% lean or one that is 25% fat? Again, both burgers are identical but people rank the 75% lean burger as far better.

This phenomenon, in which decisions can be strongly affected by the description of a situation, is called framing. As with charities, framing can have a strong effect on individual actions in many social situations. For example, a 'community meeting' may reach more amicable results than a 'stakeholder meeting'. A few days ago, I was strongly reminded of that in a class experiment I conducted in my Game Theory course.

In this simple experiment I let students taking the course play a dictator game. In this game students are put in groups of two, where only one of them (the 'dictator') decides how to split a hypothetical sum of 10 Euros. This experiment mirrors some aspects of real life decisions facing people, such as charitable giving or helping a stranger. I randomly presented students with two versions of this game -- one named 'the dictator game' and the other 'the sharing game'.

The graph above presents the fraction of students choosing to keep different amounts of money. The black line shows the distribution of students' choices in the dictator version and the blue line the distribution of students' choices in the sharing version. The difference in the students' choice of how much of the sum to keep is clearly visible (it is also statistically significant). In the dictator version almost all students demonstrated selfish (or rational) behavior, deciding to take the full sum of 10 Euros for themselves. In the sharing game, on the other hand, almost half the students decided to share the sum equally with the other student in their group. Even being familiar with the framing bias, I was surprised to find such a strong effect of this simple change in the game's description.
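For the curious, here is a minimal sketch of how one could check whether such a difference between two framings is statistically significant in a small class, using Fisher's exact test on the number of students who kept the full sum. The choice of test is my assumption for illustration, and the counts below are hypothetical placeholders, not the actual class results.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two framings; columns are 'kept everything' vs. 'shared'.
    Returns the probability, with all margins held fixed, of a table at
    least as unlikely as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x 'kept everything' in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts (NOT the real class data):
# 'dictator' framing: 18 kept all 10 Euros, 2 shared
# 'sharing' framing:   8 kept all 10 Euros, 12 shared
p_value = fisher_exact_two_sided(18, 2, 8, 12)
print(f"two-sided p-value: {p_value:.4f}")
```

An exact test is a reasonable fit here because class sizes are small, where large-sample approximations can be unreliable.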

Since framing can have large effects even in the simplest of environments, it shouldn't be a surprise that framing is used all around us -- supermarkets place their low-priced 'own brand' next to the highest-priced premium brand, internet providers offer special discounts on their (much pricier) high-speed services, and pollsters adjust their questions to reach a desired answer.

Sidenote: While framing is a well established phenomenon, the results of the class experiment should be taken with a few grains of salt. Dreber et al. find no effect of social framing on behavior in a dictator game, using a much larger number of participants than in my class experiment.