Friday, August 28, 2015

The Mismeasure of Growth

About six months ago, on the occasion of Lee Kuan Yew’s death, Tom Pepinsky wrote a post arguing, with the help of a graph, that Lee Kuan Yew’s claim to have taken Singapore “from Third World to First” was a bit overstated. (Yes, I’m posting about this six months later - but I have never claimed that this blog offers hot takes on the news!) Using Kristian Gleditsch’s expanded GDP data, he noted that, in percentile terms, Singapore was already quite wealthy by the time it became independent, especially when compared to its neighbours:

By this measure, Singapore was as wealthy as the UK (per capita) by the mid-1970s, not because it had grown especially fast, but because it had started from a relatively high base. On this view, the most we could say is that Singapore escaped the “middle income trap,” not so much the “third world.”

The post got a fair bit of attention, though also, as I recall, a bit of pushback on Twitter and in the comments about both the data source used (Gleditsch rather than the Penn World Table or the Maddison dataset) and the decision to look at the percentile rank of income rather than the actual per capita income. Indeed, the figure above looks different if we use the Penn World Table’s latest measure of “expenditure side real GDP, at chained PPPs” (recommended by the Penn World Table investigators for “comparison of living standards across countries and over time”):

(There’s no data for Myanmar in the PWT 8.1).

Now Singapore’s starting income rank is much closer to Malaysia’s (they were, after all, part of the same country until 1965), solidly in the middle, and does not reach the UK’s income rank until the 1990s, instead of the 1970s. The difference between the two graphs is even starker if, instead of percentile ranks, we simply look at the actual income per capita numbers in PWT8.1 vs the Gleditsch data:

Using the recommended PWT 8.1 measure, Singapore at independence in 1965 had a per capita income of around $3,000, only a bit higher than Malaysia’s, and only one-sixth of US income; using the Gleditsch data, by contrast, Singapore starts out at nearly double the income level of Malaysia (more than $6,000 compared with around $3,500), about a third of US income (and about half of UK income). It’s a big head-start, and it does make Lee’s achievement look a bit less impressive: the average growth rate for the period 1965-1990, when Lee was Prime Minister, is 4.8% per year on the Gleditsch data, rather than the 6.9% per year of the PWT8.1 measure. At the time, I thought that the difference between the two estimates of Singaporean GDP was simply a matter of different data sources. But when you dig deeper, it turns out that the source of Gleditsch’s numbers for Singapore was … the Penn World Table (version 8.0)!

What is going on here? In this particular case, the discrepancy is due, first, to adjustments in the 2005 PPPs used between versions 8.0 and 8.1 of the PWT that increased the base price level in many countries and years, and hence lowered their measured GDP, and second, to the fact that the Gleditsch data reports, not the “expenditure side” measure of GDP (basically real GDP adjusted for changes in the terms of trade), but the measure for “output side real GDP at chained PPPs” (which is not adjusted for terms of trade). The latter measure, according to the PWT’s handy guide, is the one that should be used “to compare relative productive capacity across countries and over time,” rather than living standards (which may be affected by favourable terms of trade - e.g., unusually low import prices or unusually high export prices).[1] The combined effect of these two differences makes Singapore’s economic performance look less impressive on the Gleditsch measure (PWT 8.0) than on the PWT8.1’s “expenditure side” measure (or even the PWT8.1’s “output side” measure):

Indeed, the estimated growth rates for the period of Lee’s premiership of independent Singapore (1965-1990),[2] according to all the different datasets available (Penn World Table 8.0, Penn World Table 8.1, World Development Indicators, Gleditsch, Maddison) do vary a fair amount:

(I include a measure from PWT8.1 for “real consumption of households and government, current PPPs,” which is also used to compare growth in living standards, according to this PWT document. Error bars can be understood as a measure of volatility in the GDP measure - larger bars indicate more ups and downs in the series). To be sure, by whatever measure, Singapore under Lee Kuan Yew grew very fast compared to the rest of the world (certainly in the top 10% of all countries for the period 1965-1990, sometimes appearing as the top performer overall), though it was not among the ranks of the ultra-poor when it started (the low-end estimate of around $3,000 per capita in 1965 may not be rich, but it’s three times the estimated per capita GDP of China in 1965 for the same measure). But purely by accident, the Gleditsch data shows Lee in the worst possible light:

| Measure | Growth rate | Percentile | Rank |
|---|---|---|---|
| PWT 8.1: Output side, chained PPPs | 7.25% | 100 | 1 out of 57 |
| PWT 8.1: Output side, current PPPs, 2005$ | 7.21% | 100 | 1 out of 57 |
| PWT 8.1: Expenditure side, current PPPs, 2005$ | 7.03% | 100 | 1 out of 57 |
| PWT 8.1: Expenditure side, chained PPPs | 6.89% | 100 | 1 out of 57 |
| WDI: GDP per capita, constant 2005$ | 6.63% | 100 | 1 out of 42 |
| Maddison 2013: Real GDP per capita, 1990$ | 6.38% | 99 | 2 out of 80 |
| PWT 8.0: Expenditure side, current PPPs, 2005$ | 7.01% | 98 | 2 out of 57 |
| PWT 8.0: Expenditure side, chained PPPs | 6.88% | 98 | 2 out of 57 |
| PWT 8.0: Output side, current PPPs, 2005$ | 6.86% | 98 | 2 out of 57 |
| PWT 8.1: National-accounts growth rates, 2005$ | 6.65% | 98 | 2 out of 57 |
| PWT 8.0: National-accounts growth rates, 2005$ | 6.65% | 98 | 2 out of 57 |
| PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 5.00% | 93 | 5 out of 57 |
| PWT 8.0: Output side, chained PPPs | 4.83% | 91 | 6 out of 57 |
| Gleditsch | 4.83% | 91 | 8 out of 83 |

There are perfectly good reasons for this variation in growth estimates. Current PPP measures of GDP per capita should not, in general, be identical to chained PPP measures, since the PPP conversion factors will vary over time in the latter and not in the former; I assume that this divergence may be magnified when an economy is undergoing genuine structural transformation. Expenditure-side and output-side measures will also vary depending on whether a country is facing better or worse terms of trade, something that will apply especially to trade-dependent economies like Singapore’s.

More generally, the Maddison project, the World Bank, and the Penn World Table project make different adjustments to the numbers produced by national statistical offices, based on different views about how to compare various prices across countries and time and different assumptions about the structure of particular economies. And though in the Singaporean case this is not really a problem, ultimately most estimates of the productive capacity of an economy, or the living standards of a country, depend on the reliability of national statistical agencies, which are subject to different constraints, including lack of resources to gather data and political manipulation. Morten Jerven, for example, argues that in some African countries, the numbers measuring GDP are basically guesstimates of limited value, given the lack of reliable price surveys, the low capacity of some national statistical offices, and the impossibility of measuring certain economic sectors; and Jeremy Wallace has written on the political incentives for manipulating GDP statistics in China, especially at the subnational level, which bias Chinese growth rates upwards. (Estimates of Chinese GDP in particular are currently controversial. Though the main PWT data reports estimates of the Chinese economy based on official national accounts data, the PWT researchers also provide an additional table reporting “adjusted” national accounts data based on the research of Harry Wu. The Maddison project reports the Wu-adjusted data instead, which results in generally lower rates of growth before 1990 than the official data).

How much does it matter, however, which measure we use to evaluate the economic performance of particular regimes and political leaders? Which leaders and regimes have the most “disputed” economic performance, depending on the measure used? Using the beta version of the Archigos dataset, I estimated the growth rates of all available measures of GDP per capita for all political leaders in the post-1945 period (up to 2014) who were in office for at least eight years. Eight years may not seem long, but in fact only about 15% of all leaders survive that long in power, so this is a pretty select group of “political survivors.” Moreover, eight years is two American presidential terms (so the data includes some American leaders), and seems long enough for leaders to actually make a difference, or at least successfully ride out a crisis or two. The economic stars of this select group of about 350 politically over-achieving leaders presided over estimated growth rates greater than those of 90% of all other countries with data for the period in which they were in office (averaging all growth rate estimates from the different datasets):

The variation at the top is enormous, depending on what measure we use. For example, Obasanjo is ranked as the top performing leader from 1999-2007 on many of the PWT8.1 measures, but only in the 84th percentile according to Maddison, and the estimated growth rates for the period range all the way from 6.7% per year (Maddison) to 28% per year (PWT 8.1, growth in consumption). If we believe the PWT, Obasanjo presided over a seven-fold increase in Nigeria’s living standards; if we believe Maddison (or the WDI), Nigerian living standards merely increased by about 1.7 times during his time in office. The economic performance of other leaders varies even more dramatically: if we believe version 8.1 of the PWT, the real consumption of households and government in Equatorial Guinea under Teodoro Obiang Nguema Mbasogo increased about 6 times from 1979-2014; if we believe the GDP per capita measures on the expenditure side in both versions of the PWT, living standards increased about 45 times; and if we believe the output-side measure from the PWT version 8.0, the productive capacity of the economy of Equatorial Guinea increased about 125 times, more than under any other leader in this dataset. A real benefactor! (Right). In this context, it is reassuring that almost all measures agree that Singapore’s productive capacity and measured living standards increased by around five times during Lee’s time in office.
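The jump from an annual growth rate to an “n-fold increase” is just compounding. Here is a minimal sketch, assuming annual compounding and treating 1999-2007 as 8 years and 1965-1990 as 25 years (both assumptions of mine; the function name `cumulative_factor` is hypothetical). The rates are the ones quoted in the paragraph above.

```python
def cumulative_factor(annual_rate: float, years: int) -> float:
    """Total multiplication of income after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


# (label, annual growth rate, years in office) taken from the figures quoted in the text;
# the year counts are an assumption about how the spans are measured.
examples = [
    ("Obasanjo, PWT 8.1 consumption", 0.28, 8),
    ("Obasanjo, Maddison", 0.067, 8),
    ("Lee Kuan Yew, PWT 8.1 expenditure side", 0.069, 25),
]

for label, rate, years in examples:
    print(f"{label}: income multiplied by about {cumulative_factor(rate, years):.1f}")
```

Under these assumptions the three cases come out at roughly seven-fold, 1.7 times, and five-fold, in line with the figures given above. Small differences in the annual rate compound into very large differences in the implied change in living standards.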

The same variability is also evident among the very worst performers:

Depending on which measure you use, Nigeria’s economic output and living standards under the military government of Babangida either contracted at a rate of around 17% per year (PWT8.1, expenditure-side measures), or merely remained stagnant (Maddison, World Development Indicators). Jabir as-Sabah of Kuwait presided over one of the most severe depressions in modern history (-15% per year for 12 years, output-side measure in PWT 8.0) or merely over an extended recession caused by falling oil prices (-1.3% per year, real consumption measure from PWT 8.1). In the case of Syria under Hafiz al-Assad, the different datasets do not even agree as to whether the economy was growing a bit or shrinking horribly during his time in power.

The problem is not that some datasets always produce higher or lower estimates, but that for particular kinds of leaders and countries, they seem to disagree for opaque reasons. The biggest divergences in estimates seem to occur for leaders who presided over states whose statistical capacity is at best dubious, or that were undergoing some severe trade shock (wild swings in the price of oil, or severe conflict or civil war), but it’s hard to tell without more detailed analysis. (By contrast, estimates of growth rates in the “advanced” economies of Europe and the USA typically agree across all measures). Here, for example, are the leaders whose growth estimates differ the most (90th percentile and above) when measured in more than two different ways by two or more different datasets, as well as the sources of the high and low estimates:

| Leader | Lowest | Highest | Difference (percentage points) | Source low | Source high | Measures |
|---|---|---|---|---|---|---|
| Obasanjo, Nigeria, 1999-2007 | 6.8% | 28.2% | 21.43 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 15 |
| Babangida, Nigeria, 1985-1993 | -18.0% | 0.9% | 18.84 | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | Maddison 2013: Real GDP per capita, 1990$ | 14 |
| Emile Lahoud, Lebanon, 1998-2007 | 0.0% | 14.5% | 14.45 | WDI: GDP per capita, constant 2005$ | PWT 8.1: Output side, chained PPPs | 15 |
| Jabir As-Sabah, Kuwait, 1978-1990 | -14.6% | -1.3% | 13.28 | PWT 8.0: Output side, chained PPPs | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 13 |
| Hamad Al Thani, Qatar, 1995-2007 | 2.8% | 15.8% | 12.96 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Expenditure side, current PPPs, 2005$ | 13 |
| Bashar al-Assad, Syria, 2000-2011 | 1.4% | 13.3% | 11.87 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 13 |
| Bagabandi, Mongolia, 1997-2005 | -0.6% | 9.9% | 10.49 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Expenditure side, current PPPs, 2005$ | 15 |
| Hun Sen, Cambodia (Kampuchea), 1985-1993 | -4.3% | 5.4% | 9.67 | Gleditsch, from Maddison, PWT8.0 | PWT 8.0: Output side, current PPPs, 2005$ | 13 |
| Nguema Mbasogo, Equatorial Guinea, 1979-2014 | 5.3% | 14.8% | 9.52 | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | PWT 8.0: Output side, current PPPs, 2005$ | 12 |
| Saddam Hussein, Iraq, 1979-2003 | -8.6% | 0.9% | 9.45 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 14 |
| H. Aliyev, Azerbaijan, 1993-2003 | -5.2% | 3.9% | 9.04 | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | WDI: GDP per capita, PPP, constant 2005$ | 15 |
| Hun Sen, Cambodia (Kampuchea), 1997-2014 | -0.8% | 7.9% | 8.64 | Gleditsch, from Maddison, PWT8.0 | PWT 8.0: Expenditure side, current PPPs, 2005$ | 14 |
| Elias Hrawi, Lebanon, 1989-1998 | -1.5% | 6.8% | 8.28 | PWT 8.1: Output side, chained PPPs | WDI: GDP per capita, constant 2005$ | 14 |
| Menem, Argentina, 1988-1999 | 2.8% | 10.9% | 8.13 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Expenditure side, chained PPPs | 14 |
| Khatami, Iran (Persia), 1997-2005 | 3.5% | 11.4% | 7.99 | WDI: GDP per capita, constant 2005$ | PWT 8.1: Expenditure side, current PPPs, 2005$ | 15 |
| Akayev, Kyrgyz Republic, 1991-2005 | -8.1% | -0.2% | 7.92 | PWT 8.1: Expenditure side, chained PPPs | Maddison 2013: Real GDP per capita, 1990$ | 15 |
| Yeltsin, Russia (Soviet Union), 1991-1999 | -13.2% | -5.3% | 7.91 | PWT 8.1: Output side, current PPPs, 2005$ | WDI: GDP per capita, PPP, constant 2005$ | 15 |
| Ngouabi, Congo, 1969-1977 | -3.6% | 4.3% | 7.85 | PWT 8.0: Output side, chained PPPs | PWT 8.1: Output side, current PPPs, 2005$ | 14 |
| Al-Assad H., Syria, 1971-2000 | -6.0% | 1.6% | 7.55 | Gleditsch, from Maddison, PWT8.0 | WDI: GDP per capita, constant 2005$ | 14 |
| Jabir As-Sabah, Kuwait, 1991-2006 | 1.5% | 8.8% | 7.30 | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | PWT 8.0: Output side, current PPPs, 2005$ | 13 |
| Nguesso, Congo, 1997-2014 | 0.3% | 7.5% | 7.17 | PWT 8.0: Output side, chained PPPs | PWT 8.1: Output side, chained PPPs | 14 |
| Kabbah, Sierra Leone, 1998-2007 | -1.2% | 6.0% | 7.13 | PWT 8.0: Output side, chained PPPs | Maddison 2013: Real GDP per capita, 1990$ | 15 |
| Hu Jintao, China, 2003-2012 | 2.9% | 10.0% | 7.09 | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | PWT 8.1: National-accounts growth rates, 2005$ | 15 |
| Mwinyi, Tanzania/Tanganyika, 1985-1995 | -5.6% | 1.2% | 6.79 | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | PWT 8.0: National-accounts growth rates, 2005$ | 13 |
| Berdymukhammedov, Turkmenistan, 2006-2014 | 5.5% | 12.2% | 6.76 | PWT 8.0: Expenditure side, current PPPs, 2005$ | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 14 |
| Ilham Aliyev, Azerbaijan, 2003-2014 | 9.6% | 16.3% | 6.75 | WDI: GDP per capita, constant 2005$ | PWT 8.1: Output side, chained PPPs | 14 |
| Johnson Sirleaf, Liberia, 2006-2014 | 1.0% | 7.6% | 6.65 | PWT 8.0: Output side, chained PPPs | WDI: GDP per capita, PPP, constant 2005$ | 14 |
| Manning, Trinidad and Tobago, 2001-2010 | 5.6% | 12.2% | 6.55 | WDI: GDP per capita, PPP, constant 2005$ | PWT 8.1: Output side, chained PPPs | 15 |
| Doe, Liberia, 1980-1990 | -8.3% | -1.9% | 6.45 | WDI: GDP per capita, constant 2005$ | Maddison 2013: Real GDP per capita, 1990$ | 14 |
| Hamad Isa Ibn Al-Khalifah, Bahrain, 1999-2014 | -1.1% | 5.1% | 6.27 | PWT 8.1: National-accounts growth rates, 2005$ | PWT 8.1: Output side, chained PPPs | 14 |
| Khalifa Al Nahayan, United Arab Emirates, 2004-2014 | -7.1% | -0.8% | 6.26 | WDI: GDP per capita, PPP, constant 2005$ | Gleditsch, from Maddison, PWT5.6, Imputed based on first/last available | 3 |
| Macias Nguema, Equatorial Guinea, 1968-1979 | 1.6% | 7.6% | 5.97 | Maddison 2013: Real GDP per capita, 1990$ | PWT 8.1: Real consumption of households and government, current PPPs, 2005$ | 13 |
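A table like this one can be built by taking, for each leader, the lowest and highest growth estimate across measures and keeping the leaders whose spread is in the top decile. The following is only a sketch of that idea, not the code behind the table: the `estimates` dictionary, its leader and measure names, and its values are hypothetical placeholders, and the cutoff uses the standard library's inclusive percentile, which may not match the original calculation.

```python
from statistics import quantiles

# Hypothetical input: leader -> {measure name: estimated growth rate per year}.
# These values are illustrative placeholders, not taken from any dataset.
estimates = {
    "Leader A": {"measure_1": 0.068, "measure_2": 0.282, "measure_3": 0.100},
    "Leader B": {"measure_1": 0.020, "measure_2": 0.030, "measure_3": 0.025},
    "Leader C": {"measure_1": 0.031, "measure_2": 0.035, "measure_3": 0.050},
    "Leader D": {"measure_1": -0.010, "measure_2": 0.020, "measure_3": 0.000},
    "Leader E": {"measure_1": 0.040, "measure_2": 0.090, "measure_3": 0.060},
    "Leader F": {"measure_1": 0.010, "measure_2": 0.020},  # too few measures, dropped
}


def spread(by_measure):
    """Lowest and highest estimate for one leader, with the measures they come from."""
    source_low = min(by_measure, key=by_measure.get)
    source_high = max(by_measure, key=by_measure.get)
    return {
        "lowest": by_measure[source_low],
        "highest": by_measure[source_high],
        "difference": by_measure[source_high] - by_measure[source_low],
        "source_low": source_low,
        "source_high": source_high,
        "measures": len(by_measure),
    }


def most_disputed(estimates, min_measures=3, percentile=90):
    """Leaders whose spread is at or above the given percentile, largest spread first."""
    rows = {leader: spread(m) for leader, m in estimates.items() if len(m) >= min_measures}
    differences = [row["difference"] for row in rows.values()]
    # quantiles(..., n=100) returns 99 cut points; index 89 is the 90th percentile.
    cutoff = quantiles(differences, n=100, method="inclusive")[percentile - 1]
    selected = [(leader, row) for leader, row in rows.items() if row["difference"] >= cutoff]
    return sorted(selected, key=lambda item: item[1]["difference"], reverse=True)


for leader, row in most_disputed(estimates):
    print(leader, row)
```

Keeping the name of the measure that produced each extreme, as the table does, is what makes it possible to see that no single dataset is always the high or the low one.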

Some of these numbers have an air of fantasy about them. It is not, I think, possible to know with any degree of certainty the GDP per capita of Equatorial Guinea under Macias Nguema (last one in the table above), much less to estimate its growth rate, since government bureaucracies pretty much ceased to operate, the country was more or less off-limits to foreigners, cocoa production collapsed, and perhaps a third of the population fled or was killed during his time in power. (Perhaps “per capita” GDP increased because the population was declining at the time, despite the apparently complete economic disaster, but it’s hard to say: under these circumstances, all GDP numbers must be suspect). Even when the numbers are not utterly fantastic, however, the divergences in growth rates sometimes seem inexplicable without a deep understanding of how the underlying GDP numbers were generated. Should we think that the average growth in living standards under Hu Jintao was around 2.9% per year, or closer to 10% per year? Or was it more like 7%, as the latest expenditure-side measure of GDP per capita from the PWT 8.1 says?

Or take a more detailed look at Nigeria, which has both the worst (Babangida) and the best (Obasanjo) performers in terms of growth, and also the most widely divergent estimates of such growth:[3]

Datasets do not agree on how high Nigeria’s GDP was at the beginning of Babangida’s time in power, in the mid-1980s: it could have been as high as $1,158 per capita (PWT8.0, output side) or as low as $568 (WDI, constant 2005 dollars). By 1994, when he left power, it could have been as low as $229 (PWT8.1) or as high as $2,817 (WDI, PPP adjusted), a more than tenfold difference! The datasets also do not agree on how low GDP was by the end of Abacha’s reign and the return to elected governments (was it $1,034, according to Maddison? Or $228, according to PWT?), or how high GDP was by the end of Obasanjo’s second stint in office (was it $881, in constant 2005 dollars according to the WDI? Or as high as $4,527, also according to the World Bank, when adjusting for PPP in the particular way the World Bank happens to do so here? Or merely around $2,400, according to the expenditure side measure, chained PPPs, of PWT8.1?). Some of these estimates consistently differ by about a factor of five; perhaps country specialists can explain them (adjustments by the statistical office to the national accounts? Different adjustments by dataset providers in response to changing prices of oil?), but the average user seems unlikely to know. Perhaps it’s impossible to tell exactly: based on available data, all we can tell is that average living standards (probably) declined under the military government of Babangida, and (probably) increased under the elected government of Obasanjo, at least for a hypothetical “average person,” but it’s pointless to try to figure out by how much. (And that’s before we even get into philosophical questions about whether GDP per capita really measures anything of any importance).

The country’s political regime does seem to matter a bit for whether or not a country’s growth estimates agree; in general, estimates for more “democratic” regimes tend to agree more, perhaps because they tend to be calculated under more transparent conditions. Using Geddes, Wright, and Frantz’s dataset of authoritarian regimes, we can calculate the average growth rates and growth percentiles of all regimes in place for at least three years (so there’s enough data to calculate some sensible growth rates) since 1950 (n = 239). (As above, the growth percentiles are relative to the dates of the regime; so, for example, a regime that grew at 5% per year from 1950-1980 may be in the 95th percentile for that period, while a regime that grew at 7% per year in the 1970-1980 period may be only in the 90th percentile for that period, if other countries grew even faster in that time. This is a rough way of adjusting for common factors in the world economy that affect all regimes in a particular period of time; instead of looking at the growth rate of a regime by itself, we can look at how that growth rate compares to the growth rate of all other countries during the regime’s lifetime). Here’s what their growth rates and growth percentiles look like when plotted against their basic regime type (colored dots represent means of growth rates or growth percentiles from one dataset and one measure):

The graph indicates three things. First, for the periods in which there is data, democracies in the sample seem to have grown faster than authoritarian regimes, when averaging over the entire lifetime of each regime, as some of the best research on this topic suggests. Their median “growth percentile” seems to have been higher than that of non-democracies for the periods in which they were in existence. But depending on which measure we use, we could get the opposite result: on the PPP WDI measure, autocracies seem to grow faster than democracies. (A situation ripe for p-hacking!). Second, economic performance in democracies seems to have been more stable than economic performance in non-democracies, as Rodrik and others have shown in more detail elsewhere, though growth rates vary widely across both democracies and non-democracies, and the extent of the variation depends in part on which measure of economic growth we choose to focus on. But third, and most importantly for our purposes here, estimates of economic growth seem to vary more across datasets in non-democracies than in democracies. Especially in countries going through periods of “no authority” (civil wars, warlord regimes, etc.), estimates of growth are basically all over the place, as we should perhaps expect when statistical offices cease to operate and economic activity goes underground.
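The period-relative growth percentiles used in these comparisons can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the figures: `panel` is a hypothetical mapping from country to a {year: GDP per capita} series with data for the whole period, the percentile is taken to be the share of countries growing at or below the target's rate (the exact definition is not spelled out in the post), and the constant-growth series are synthetic placeholders.

```python
import math


def trend_growth(series, start, end):
    """Slope of an OLS fit of log(GDP per capita) on year, restricted to [start, end]."""
    points = [(year, math.log(level)) for year, level in series.items() if start <= year <= end]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def growth_percentile(panel, country, start, end):
    """Share (in %) of countries whose growth over [start, end] is at or below the target's."""
    rates = {name: trend_growth(series, start, end) for name, series in panel.items()}
    target = rates[country]
    at_or_below = sum(1 for rate in rates.values() if rate <= target)
    return 100.0 * at_or_below / len(rates)


def constant_growth(rate, first_year=1950, last_year=1990, base=1000.0):
    """Synthetic series growing at a constant annual rate (placeholder data only)."""
    return {year: base * (1.0 + rate) ** (year - first_year)
            for year in range(first_year, last_year + 1)}


# Hypothetical four-country world.
panel = {
    "A": constant_growth(0.07),
    "B": constant_growth(0.05),
    "C": constant_growth(0.02),
    "D": constant_growth(0.00),
}

for name in panel:
    print(name, growth_percentile(panel, name, 1970, 1980))
```

Because the comparison set is recomputed for each regime's own dates, the same 5% growth rate can land in different percentiles depending on how the rest of the world was doing at the time.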

We can look at the same picture at a finer level of detail:

In some places (e.g., “warlord” regimes - no central authority, like Afghanistan in the early 2000s), the error bars around the mean growth rates are huge, and estimates from different datasets are basically all over the place. Interestingly, estimates of growth percentiles across different datasets also differ quite a bit for the (mostly Middle Eastern) monarchies, and many party or party/military regimes. In comparison, estimates for average growth rates in democracies seem to agree pretty closely across all datasets. Indeed, the standard deviation of the different estimates of the log of the level (not the growth) of GDP, in any given year, within each regime, is higher in non-democracies than in democracies; in other words, estimates of “how wealthy the country is” in any given year differ more within non-democracies than within democracies, and the biggest outliers (the countries where different datasets disagree the most) are all non-democratic:

Moreover, the divergence in estimates is not just due to the poverty of most authoritarian countries; non-democracies have more diverging estimates of GDP at all levels of GDP in any given year. Though poorer democracies and hybrid regimes do tend to have more variable estimates of their level of GDP than richer democracies and hybrid regimes, as we might expect (perhaps poorer countries have more difficulty gathering reliable data), the same does not appear to be true for non-democratic regimes; estimates of the actual level of GDP of richer authoritarian regimes across datasets diverge as much as the estimates of the level of GDP of poorer authoritarian regimes:

Moral of the story: it’s difficult to measure incomes. It’s even harder to construct estimates of income that are comparable across widely different economies and societies, or to interpret these measures appropriately. (Income and political datasets should have more metadata!). But it seems hardest to do that for regimes that can lie with greater impunity.

All code for this post is available here.


  1. The choice to use “output side” (rather than “expenditure side”) measures of GDP makes good sense for the Gleditsch data, which is designed for use in international relations research where measuring the productive capacity of an economy is more important than measuring living standards. But Gleditsch’s data for some countries sometimes mixes numbers from Maddison, the World Bank, and PWT that appear to have been calculated in different ways and for different purposes.
  2. The estimated growth rates are the coefficient of the simple linear model log(per capita) ~ year, for each measure of GDP per capita. Technically, these are trend growth rates (the slope of the trend line of the log of per capita GDP), rather than the geometric mean of each year’s growth rate (another usual way of averaging growth rates over time), but the differences remain whichever way one calculates average growth rates, and for most countries the estimated growth rates are pretty similar using either approach (even though trend growth rates may not be appropriate if the time series has a structural break).
  3. See my post on histories of instability for more on these kinds of “deep history” figures.
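To make the distinction in note 2 concrete, here is a minimal sketch of the two ways of averaging growth: the trend growth rate (the slope of a regression of log GDP per capita on year) and the geometric mean of annual growth. The function names and both series are hypothetical illustrations; the post's own estimates come from a linear model of the form log(per capita) ~ year, and this is only an approximation of that idea in plain Python.

```python
import math


def trend_growth(levels):
    """OLS slope of log(level) on time index; levels are consecutive annual observations."""
    logs = [math.log(v) for v in levels]
    n = len(logs)
    mean_t = (n - 1) / 2
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(logs))
    return sxy / sxx  # in log points per year; exp(slope) - 1 gives a percentage rate


def geometric_mean_growth(levels):
    """Constant annual rate that takes the first level to the last one."""
    return (levels[-1] / levels[0]) ** (1 / (len(levels) - 1)) - 1


# Hypothetical series: steady 5% growth for 25 years.
steady = [1000 * 1.05 ** t for t in range(26)]

# Hypothetical series with a structural break (a collapse in the fifth year).
broken = [100, 110, 121, 133.1, 60, 66, 72.6]

for name, series in [("steady", steady), ("broken", broken)]:
    slope = trend_growth(series)
    print(name,
          f"trend: {math.exp(slope) - 1:.3f}",
          f"geometric mean: {geometric_mean_growth(series):.3f}")
```

For the steady series the two rates coincide once the log slope is converted back to a percentage; for the series with a break they diverge, which is the caveat about structural breaks mentioned in note 2.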