It’s that time of the year again – time to crowdsource a
democracy index.
First, a bit of context. Last year, I had the idea of using the
Allourideas pairwise comparison software
to crowdsource a ranking of countries by their degree of “democracy” in 2010. I
asked students in my Dictatorships and Revolutions class to set the ball
rolling, and then posted the link to the widget here, allowing anybody to
vote. Surprisingly, in just a couple of months of voting the
results were interestingly close to standard indexes of freedom or democracy:
the crowdsourced ranking had a correlation of 0.84 with Freedom House’s
widely-used ratings
of political and civil liberties for 2010, and the basic crowdsourced
ranking was generally plausible (see this
post for further analysis of these results). In fact, by now the correlation has increased
to 0.86 (8556 votes total), which is about as good as the correlation between Freedom House and Polity IV (0.87).
I am interested in seeing if this kind of crowdsourced
measure can be used as a sort of quick and dirty index of democracy. To be
sure, crowdsourcing the construction of an index of democracy in this way is usually
not a good way of generating reliable social science data. For one thing, the
exercise does not impose any restrictions on how the concept of “democracy”
should be understood, which means that it implicitly aggregates all kinds of
different ideas about democracy, weighting them by the degree to which larger
numbers of people consider them important. (It takes a “democratic” approach to
concept formation, you might say). But it does have the virtue of being cheap (total
cost: about $0, compared to over US$500,000 annually for the Freedom House “Freedom
in the World” report, and $120,000 annually for the Polity IV project), aggregating
the dispersed information of large numbers of people from all over the world,
and making it possible to generate various measures of “uncertainty” around the
crowdsourced estimates. So I want to repeat the exercise, and produce a
democracy ranking for 2011:
(Click here if you can’t see
the widget above. Vote as many times as you'd like, and don't worry if you have to use the "I can't decide" button).
In theory, the more votes, and the more diverse the voting
population – the more people from Africa, Latin America, Asia, and the rest of
the world – the more informative the results should be. So please vote early
and often, and share!
I am also interested in which dimensions of the
complex concept of democracy people tend to weigh more when making these
sorts of comparisons. Do people put more weight on the presence or absence of
elections, for example, than on economic equality? You can help me to figure this out by using the
widget below to rank various dimensions or components of democracy in terms
of their importance to you (or adding your own):
(Click here if you can’t
see the widget above. Vote as many times as you'd like, and don't worry if you have to use the "I can't decide" button).
The “seed” dimensions of democracy for these comparisons are
taken from a recent piece by Michael Coppedge, John Gerring, et al. (“Conceptualizing
and Measuring Democracy”, Perspectives
on Politics 9(2): pp. 247-267; ungated
here) that is very much worth reading if you are interested in the issue of
how to measure democracy. But I make no claim that these are the only dimensions of democracy that
matter; if people have other ideas, you can add them in (I will need to approve
any suggestions, though). You are also
welcome to discuss in comments the kinds of considerations that you used to make
distinctions between countries, or any other considerations that might improve the usefulness of this sort of exercise.
(Part 1 of possibly several, depending on time and mood)
A couple of months ago, I set up a democracy ranking website using the Allourideas software as part of a class project to crowdsource a democracy index (which has now been completed; more on that project in an upcoming post). The site works by presenting the user with a random comparison between two countries, and asking them to vote on which of these countries was more democratic in 2010 (click here if you can't see the widget below):
The 100 or so students in my class started the ball rolling, and their responses generated an initial democracy index that had a correlation of about 0.62 with the Freedom in the World index produced by Freedom House: respectable but not great. The post describing the initial results got some links from Mark Belinsky, the Allourideas blog, and Jonathan Bernstein, which increased the number of votes substantially. In fact, as of this writing, the website has registered 4402 (valid) votes, from about 203 different IP addresses, mostly in the USA, New Zealand, and Australia:
4,402 valid votes means at most 4,402 distinct comparisons out of 36,672 possible comparisons of 192 countries (most comparisons have appeared only once, but a few have appeared a couple of times), or about 12% of all possible comparisons. How has the increase in the number of voters changed the generated index? And how does it compare to the current Freedom House index for 2010? As we shall see, the extra votes appear to have improved the crowdsourced index considerably.
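As a quick check on the arithmetic above, here is a minimal Python sketch. It assumes the 36,672 figure counts ordered pairs of 192 countries (192 × 191); counting each matchup only once would give half as many pairs and roughly double the coverage. The variable names are mine, purely for illustration.

```python
from math import comb

N_COUNTRIES = 192
VALID_VOTES = 4402

# Ordered pairs (A vs. B counted separately from B vs. A): 192 * 191.
ordered_pairs = N_COUNTRIES * (N_COUNTRIES - 1)
# Unordered pairs (each matchup counted once): C(192, 2).
unordered_pairs = comb(N_COUNTRIES, 2)

# Upper bound on coverage: assumes every valid vote was a distinct comparison.
coverage_ordered = VALID_VOTES / ordered_pairs
coverage_unordered = VALID_VOTES / unordered_pairs

print(ordered_pairs, f"{coverage_ordered:.1%}")
print(unordered_pairs, f"{coverage_unordered:.1%}")
```

Either way, the votes cover only a small fraction of the possible matchups, which is worth keeping in mind when reading the scores below.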
Here is a map of the scores generated by the "crowd" - i.e., voters in the exercise (darker is more democratic, all data here):
And here's a scatterplot comparing the generated scores to Freedom House's scores for 2010 (click here for a proper large interactive version):
The Y axis represents the score generated by the Allourideas software: basically, the probability that the country would prevail in a comparison with a randomly selected country. For example, the Allourideas software predicts that Denmark (the highest ranked country) has a 96% chance, given previous votes, of prevailing in a “more democratic” comparison with another randomly selected country for 2010, whereas North Korea (the lowest ranked country) only has a 5% chance of prevailing in this comparison. The X axis represents the sum of the Freedom House Political Rights and Civil Liberties scores for last year (from the “Freedom in the World 2011” report), reversed and shifted so that 0 is least democratic and 12 is most democratic (i.e., 14 - (PR + CL)). The correlation between Freedom House and the crowdsourced index is a fairly high 0.84 (which is about as high as the correlation between the combined Freedom House score and the Polity2 score for 2008: 0.87). But how good is this, really? What do these scores really represent?
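To make the two axes concrete, here is a small Python sketch. The first function is the Freedom House rescaling described above (PR and CL each run from 1, most free, to 7, least free). The second is only a naive stand-in for the Y axis: a smoothed share of comparisons won. It is my own simplification and not the estimator Allourideas actually uses; the function names and the tally in the last line are hypothetical.

```python
def fh_combined(pr: int, cl: int) -> int:
    """Reverse and shift Freedom House ratings: 0 = least, 12 = most democratic."""
    for name, value in (("PR", pr), ("CL", cl)):
        if not 1 <= value <= 7:
            raise ValueError(f"{name} must be between 1 and 7, got {value}")
    return 14 - (pr + cl)


def naive_win_rate(wins: int, losses: int) -> float:
    """Share of comparisons a country won, with add-one smoothing.

    A rough proxy for the crowdsourced score (assumption), NOT the
    Allourideas model itself.
    """
    return (wins + 1) / (wins + losses + 2)


print(fh_combined(1, 1))  # best possible ratings -> 12
print(fh_combined(7, 7))  # worst possible ratings -> 0
print(round(naive_win_rate(24, 1), 2))  # hypothetical tally: 24 wins, 1 loss
```

The parentheses in 14 - (PR + CL) matter: without them the scale would not run from 0 to 12.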
At the extremes, judgments of democracy appear to be “easy”: Freedom House and the crowd converge. For example, among countries that Freedom House classifies as “Free,” only five countries (Benin, Israel, Mongolia, Sao Tome and Principe, and Suriname) receive a score of 40 or below from the “crowd,” which is the highest score that any country Freedom House classifies as “Not Free” receives (Russia). But in the middle there is a fair amount of overlap (just as with expert-coded indexes, whose high levels of correlation are driven by the “extreme” cases – clear democracies or clear dictatorships). Some of these disagreements could further be attributed to the relative obscurity of some of the countries involved, given the location of the voters in this exercise (few people know much about Benin, and anyway the index got no votes from Africa), but some of the disagreements seem to have more to do with the average conceptual model used by the crowd (e.g., the case of Israel). The crowd would seem to weigh the treatment of Palestinians more heavily than Freedom House in its (implicit) judgment of Israel’s democracy. This is unsurprising, since the website does not ask participants to stick to a particular “model” of democracy; the average model or concept of democracy to which the crowd appears to be converging seems to be slightly different than the model used by Freedom House.
We can try to figure out where the crowd differs the most from Freedom House by running a simple regression of Freedom House’s score on the score produced by the crowd, and looking at the residuals from the model as a measure of “lack of fit.” This extremely simple model can account for about 70% of the variance in the crowdsourced scores on the basis of the Freedom House score (all data available here); we can improve the fit (to 72%) by adding a measure of “uncertainty” as a control (the number of times a country appeared in an “I don’t know” event, divided by the total number of times it appeared in any comparison). What (I think) we’re doing here is basically trying to predict Freedom House’s index on the basis of the crowdsourced judgment plus a measure of the subjective uncertainty of the participants. The results are of some interest: for example, participants in the exercise appear to think Venezuela, Honduras, and Papua New Guinea have higher levels of democracy than Freedom House thinks, and they also appear to think that Sierra Leone, Lithuania, Israel, Mongolia, Kuwait, Kiribati, Benin, and Mauritius have lower levels of democracy than Freedom House thinks.
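For readers who want to try the residual approach themselves, here is a minimal pure-Python sketch of the one-predictor version. It assumes the crowd score is the outcome and the Freedom House score is the predictor, so a positive residual means the crowd rates a country higher than Freedom House would predict. The five data points are made up for illustration and are NOT the actual data from this exercise; the "uncertainty" control is left out to keep things short. Note that with a single predictor, R² is just the squared correlation.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope, r_squared) for the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Made-up illustrative values (assumption), not the real scores.
countries = ["A", "B", "C", "D", "E"]
fh_score = [12, 10, 6, 3, 0]         # 0-12 Freedom House scale
crowd_score = [90, 60, 55, 30, 10]   # 0-100 crowdsourced score

a, b, r2 = ols_fit(fh_score, crowd_score)
residuals = {
    c: y - (a + b * x) for c, x, y in zip(countries, fh_score, crowd_score)
}

# Sort from "crowd rates lower than predicted" to "crowd rates higher".
for country, res in sorted(residuals.items(), key=lambda kv: kv[1]):
    print(f"{country}: {res:+.1f}")
print(f"R^2 = {r2:.2f}")
```

The countries at either end of the sorted list are the ones where the crowd and Freedom House disagree most, which is how a list like the one above can be read off.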
A more interesting test, however, would be to do what Pemstein, Meserve, and Melton do here with existing measures of democracy. Their work takes existing indexes of democracy as (noisy) measurements of the true level of democracy and attempts to estimate their error bounds by aggregating their information in a specific way. I might try to do this later (I need to learn to use their software, and might only have time in a few weeks), though it is worth noting that a simple correlation of the crowdsourced score for 2010 with the “Unified Democracy Scores” Pemstein et al. produce for 2008 by aggregating the information from all available indexes is an amazing 0.87, and a simple regression of one on the other has an R² of 0.76. So the crowdsourced index seems to be doing something much like what the Unified Democracy Scores are doing: averaging different models of democracy and different "perspectives" on each country.
This all assumes, however, that there is something to be measured – a true level of democracy, which is only loosely captured by existing models. On this view, existing indexes of democracy reflect different interpretations of the concept of democracy, plus some noise due to imperfect information and the vagaries of judgment; they each involve a “fixed” bias due to potential misinterpretation of the concept, plus the uncertainty involved in trying to apply the concept to a messy reality whose features are not always easy to discern (try figuring out the level of civil rights violations in the Central African Republic compared with Peru in 2010, quick!). The crowdsourced index actually goes further and averages the different interpretations of democracy of every participant, just as the Unified Democracy Scores aggregate the different “models” of democracy used by different existing indexes. To the extent that the crowd’s models converge to the true model of democracy, the crowdsourced index should also eliminate that “bias” due to misinterpretation. But it is not clear that there is a true model, or that the crowd would converge to it even if one existed: the crowdsourced index may have a higher bias (total amount of misinterpretation of the concept) than the indexes created by professional organizations. (And this conceptual bias might shift if more people from other countries voted; I’d really love to get more votes from Africa and Asia).
Even if there is no true model of democracy, it would be interesting to “reverse-engineer” the crowd’s implicit model by trying to figure out its components. (What do people weigh most, when thinking about democracy? Violations of civil liberties? Elections? Opportunities for participation? Economic opportunities?). One could do this, I suppose, by trying to predict the crowdsourced scores from linear combinations of independently gathered measures of elections, civil liberties, etc.; some form of factor analysis might help here? My feeling is that the crowd weighs economic “outcomes” more than experts do (so that crowdsourced assessments of democracy will be correlated with perceptions of how well a country is doing, like GDP growth), but I haven’t tried to investigate that possibility.
It would also be interesting to repeat the exercise by asking people to stick to a particular model of democracy (e.g., Freedom House’s checklist, or the checklist developed by my students – more on that later). It would also be great if the Allourideas software had an option that allowed a voter to indicate that two countries are equal in their level of democracy (I think one could do this, but then I would have to modify the client; right now, the only way of signalling this is to click on the “I don’t know” button). Perhaps next year I will try some of these possibilities. All in all, it seems that crowdsourcing a democracy index produces reasonable results, and might produce even better results if the crowdsourcing is done with slightly more controls. (E.g., one could imagine using Amazon's "Mechanical Turk" and a specific model of democracy for generating data on particular years). I would nevertheless be interested in thoughts/further analysis from my more statistically sophisticated readers.
In an upcoming post I will explain how my students produced an index of democracy for 2010, 1995, and 1980, and how that crowdsourced effort compares with other existing indexes. (Short version: pretty well).
[Update 8:40pm: Made some minor changes in wording, added a couple of links]