tl;dr: Multifactor productivity data from a famous economic dataset, used often to proxy for technological progress and innovation, might be significantly biased by poor estimates of the social returns to education, the capital share of income, and real GDP. Such estimates should be treated with caution.

Introduction

Total factor productivity is an economic concept that is used to quantify how efficiently a country can make use of its economic resources. It's a rather nebulous concept in general because it's not directly measurable but instead corresponds to latent variables in growth models that account for "unexplained variation" in output.

If we stick to the abstract realm of growth models, there is often a clear definition: for instance, we might model a country's real GDP Y by a function such as

$$Y = A L^{1-\alpha} K^{\alpha}$$

where L and K denote the country's total labor force and capital stock respectively, 0<α<1 is a parameter, and A is total factor productivity, hereafter abbreviated as TFP. If we have two countries with the same capital stock and labor force but one of them has a higher economic output, we say that country has a higher TFP.
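To make the comparison concrete, here is a minimal sketch that backs out A from the Cobb-Douglas form above. The function name, the value of α, and the input numbers for the two countries are all made up for illustration; they do not come from any dataset.

```python
def cobb_douglas_tfp(output, labor, capital, alpha):
    """Back out A from Y = A * L**(1 - alpha) * K**alpha."""
    return output / (labor ** (1 - alpha) * capital ** alpha)


# Two hypothetical countries with identical inputs but different output.
alpha = 0.3  # assumed capital elasticity, purely illustrative
tfp_a = cobb_douglas_tfp(output=100.0, labor=50.0, capital=200.0, alpha=alpha)
tfp_b = cobb_douglas_tfp(output=120.0, labor=50.0, capital=200.0, alpha=alpha)

# With identical inputs, the TFP ratio is just the output ratio.
tfp_ratio = tfp_b / tfp_a
```

Because both countries share the same L and K, the ratio of their TFPs equals the ratio of their outputs regardless of the α chosen.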

Of course, the same principle works in general, not just with this specific functional form and for labor and capital inputs. If we have some economically relevant inputs f1,f2,…,fn (sometimes called factors of production), we can imagine a model where economic output in a country is given by

$$Y = A\,H(f_1, f_2, \ldots, f_n)$$

for some function H. In this case, the TFP A is just a factor that's present to "close the model": changes in output not accounted for by changes in the inputs fi are automatically accounted for by changes in A. For a specific H, we can see A=Y/H as a measure of economic productivity which controls for "obvious factors" such as changes in the labor or capital stocks. Making the right choice of H is very important, of course: we don't want to use a random H, but one that is actually informed by the data. I'll talk more about how we do this later.

For instance, if a country has increasing TFP over time, that suggests the country's output is rising because of things that are outside our model. If our model only had labor force and capital stock, perhaps the quality of the workers in the country has gone up: the country has more human capital even with the same number of people, because each person is on average more productive. It could also be that the country's institutions have improved, enabling it to make more efficient use of its existing resources. Anything that is outside the model goes into TFP, which is why it's sometimes called a "trash can statistic".

Importantly, TFP is a concept that only makes sense relative to a model, in particular relative to a collection of factors of production that we choose to consider in our model. There is no such thing as a country's "absolute TFP" because TFP is a latent variable and its value depends on the model specification. Still, TFP is often used as a proxy for technological progress, because we intuitively think that better technology should involve something other than raw accumulations of labor, capital, and human talent. This approach is used by Bloom et al. (2020), the famous "are ideas getting harder to find?" paper, for the United States specifically. Guzey et al. (2021) criticize them on this point, among others.

The puzzle of declining total factor productivity

Now, we come to the question in the title of the post. Check out the plot below:

According to this plot, China had a higher TFP in 1956 (when it was poorer per capita than most African countries) than in 2019! What's going on here?

Saying that China regressed technologically doesn't make sense, no matter how we interpret that claim. China today has both better physical technology and social technology (coordination mechanisms et cetera) than it did in 1956. However, the graph claims the opposite, so there appears to be some discrepancy to explain here. Certainly, we would not want to plug this series into some kind of model of technological progress without understanding how it was constructed.

Moreover, the puzzles are not exclusive to China. TFP also seems to have declined in Italy and has been flat in Canada since the early 1970s. What's really going on here?

How is this data constructed?

The Federal Reserve Bank of St. Louis sources this data from the Penn World Table, a data collection project that tabulates economic data across time and across different countries. The Excel spreadsheet version of their latest release is 6.4 MB in size, so there is a lot of data in there. In addition to tabulating data that's easier to measure (such as GDP and population), they also report estimates of TFP in their recent releases.

First, let's talk about the factors of production they use. They have measures of capital stock across countries, which they use to define K, and they have several measures of labor: population size, what fraction of the population was employed, and how many hours per year employed people worked on average. On top of these, they also have a measure of "human capital" which is supposed to account for changes in worker productivity. They construct it by looking at the educational attainment of people in the country and combining this with estimates from the relevant literature of how much each year of education raises productivity. This is added to the data as a multiplier for L, so workers with more education count for more in their weighted labor input measure.

Now, let's talk about what they do to deal with H being unknown. Their exact approach is a little more complicated than the simple version I'll explain here, but the two are similar in spirit and give very similar results when applied to longitudinal data, so I'll stick to my version.

If we know that Y=AH(f1,f2,…,fn) holds, which it does by definition, then simply using the chain rule gives us

$$\frac{1}{Y}\frac{dY}{dt} = \frac{1}{A}\frac{dA}{dt} + \frac{1}{Y}\sum_{k=1}^{n}\frac{\partial Y}{\partial f_k}\frac{df_k}{dt}$$

We can express this quantity in a more convenient form using the functions

$$\alpha_k = \frac{\partial \log Y}{\partial \log f_k} = \frac{f_k}{Y}\frac{\partial Y}{\partial f_k}$$

which are called factor elasticities. The intuition behind these numbers is that locally, a one percent change in fk should produce an αk percent change in Y. Substituting these into the expression above gives

$$\frac{1}{Y}\frac{dY}{dt} = \frac{1}{A}\frac{dA}{dt} + \sum_{k=1}^{n}\alpha_k\frac{1}{f_k}\frac{df_k}{dt}$$

The important fact here is that we observe both economic output Y and the factor stocks fk. Therefore, if only we had a way to get a handle on the numbers αk (which are not constant in general), we could use this equation to compute the growth rate of A, then integrate those growth rates over time to compute A(t2)/A(t1) for any two times t1,t2 - a longitudinal index for A.
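In practice the data is annual rather than continuous, so the integration step becomes a sum of log differences. The sketch below is one way to discretize the identity, not the exact PWT procedure: the function name `tfp_index` is hypothetical, and averaging each elasticity over adjacent years is my own assumption about how to handle elasticities that change over time. The synthetic Cobb-Douglas data at the bottom only checks that the procedure recovers a known TFP path.

```python
import math


def tfp_index(output, factors, elasticities):
    """Discrete-time growth accounting.

    output:       list of Y_t
    factors:      dict mapping factor name -> list of f_{k,t}
    elasticities: dict mapping factor name -> list of alpha_{k,t}
    Returns the list of A_t / A_0.
    """
    index = [1.0]
    for t in range(1, len(output)):
        # Start from output growth in log points.
        residual = math.log(output[t] / output[t - 1])
        for name, series in factors.items():
            # Assumption: average the elasticity over the two adjacent years.
            alpha = 0.5 * (elasticities[name][t] + elasticities[name][t - 1])
            # Subtract the part of growth explained by this factor.
            residual -= alpha * math.log(series[t] / series[t - 1])
        index.append(index[-1] * math.exp(residual))
    return index


# Synthetic check: build output from a known TFP path, then recover it.
true_tfp = [1.0, 1.1, 1.21]
labor = [10.0, 11.0, 12.0]
capital = [20.0, 25.0, 30.0]
alpha_l, alpha_k = 0.7, 0.3  # illustrative constant elasticities
output = [a * l ** alpha_l * k ** alpha_k
          for a, l, k in zip(true_tfp, labor, capital)]

index = tfp_index(
    output,
    {"L": labor, "K": capital},
    {"L": [alpha_l] * 3, "K": [alpha_k] * 3},
)
```

With constant elasticities the log-difference form is exact, so `index` matches `true_tfp` up to floating-point error. With time-varying elasticities it is only an approximation.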

Here is the part where we need some economic theory: in a perfectly competitive market, we expect αk to be equal to the share of national income being paid to factor k. Markets are in general not perfectly competitive, and it's possible to consider other points in these calculations, but those introduce even more assumptions and make the estimates less robust. As a first pass, it should be a decent approximation to just assume αk at any given time is equal to the income paid out to owners of factor k divided by the total economic output of the country.

The Penn World Table (PWT) dataset has estimates of how much was paid to all workers in a country in wages and other forms of compensation, and they divide this by their GDP estimate to get an estimate for αL, the labor share of national income. They then assume the production function is homogeneous of degree 1 in labor and capital jointly, i.e. that for all r>0 and for all L,K, we have

$$H(rL, rK) = r\,H(L, K)$$

which implies αK=1−αL upon differentiating with respect to r on both sides. This gives them enough information to compute a longitudinal index for A, as mentioned above.
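For readers who want the intermediate step spelled out, here is a sketch of the differentiation. It uses only the homogeneity assumption and the definition of the elasticities given earlier, with the derivative evaluated at r = 1.

$$
\begin{aligned}
\frac{d}{dr}H(rL, rK) &= L\,\frac{\partial H}{\partial L}(rL, rK) + K\,\frac{\partial H}{\partial K}(rL, rK) = H(L, K) \\
\text{at } r = 1:\quad & L\,\frac{\partial H}{\partial L} + K\,\frac{\partial H}{\partial K} = H \\
\text{multiplying by } \frac{A}{Y} = \frac{1}{H}:\quad & \frac{L}{Y}\frac{\partial Y}{\partial L} + \frac{K}{Y}\frac{\partial Y}{\partial K} = 1 \\
\text{i.e.}\quad & \alpha_L + \alpha_K = 1
\end{aligned}
$$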

What could go wrong?

When doing forensic analysis to identify the culprit behind the counterintuitive results of this calculation, I've pinned down three places that look suspicious:

1. The human capital adjustment made to the labor force.

2. The assumption that αK=1−αL.

3. The estimates of real GDP that the calculation is based on.

Edit (dated GMT 3:10 AM, 19 August 2023): I've noticed that (3) could be a potential problem only after publishing this post: there appears to be some doubt about to what extent the real GDP data for China from projects involving the University of Groningen (such as PWT and the Maddison Project) can be trusted. I say more about this in the conclusion section. I'm not yet convinced that the data from PWT is bad, but the lack of agreement with other sources is concerning.

Focusing on the other points for the moment, both (1) and (2) bias TFP growth to be lower in most countries than it should be. Changing these parts of the calculation gives the following results:

Here, "hc" denotes the human capital adjustment, while the inclusion of "csh_correction" determines whether we estimate the capital share as 1−αL directly or whether we apply another correction on top that I'll discuss shortly.

"hc + labsh + avh" should reproduce the results from the Penn World Table dataset, and we can see this is successful because the two curves coincide exactly. The cumulative impact of addressing points (1) and (2) by removing "hc" (human capital adjustment) and adding "csh_correction" (capital share correction) is to shift TFP estimates to the curve "csh_correction + labsh + avh". As you can see, if we take this curve as our TFP estimate, then Chinese TFP appears to have increased by a factor of 2.36 since 1970 instead of remaining flat!

Now, let's talk about both (1) and (2) in more detail.

The human capital adjustment

The Penn World Table paper doesn't give a detailed description of exactly how they compute the human capital multiplier in their dataset. They say that they take data from Barro and Lee (2013) on primary, secondary, and tertiary school completion across different age groups, and combine these with some estimate of the returns to education (e.g. each year raises income by 10%) in all of these different schooling periods. However, I was unable to locate the exact returns to years of education schedule they use for the calculation.

Still, we can get some rough estimates. The Barro and Lee (2013) dataset says that the average years of education of the Chinese population aged 25 to 64 increased by around 7 years from 1956 to 2015. Combining this with the roughly 2.22-fold increase in the human capital index reported in the PWT over the same period suggests they assumed social returns to education of around

$$\frac{\log(2.22)}{7} \approx 11.4\%/\text{year}$$

This is extremely large! In fact, it probably exceeds the median private correlational returns to education worldwide. In other words, if you took the population of the median country and ran a regression

$$\log(\text{annual income}) = \alpha + \beta \times \text{years of education} + \varepsilon$$

on working-age adults, you would probably estimate β≈0.1/year - see Montenegro and Patrinos (2014) for more information on the global distribution of β. Therefore, the Penn World Table implicitly assumes that the social returns to education are equal to the private correlational returns, and neglects effects we have substantial reasons to suspect are important:

Ability bias: People who have more years of education just tend to be more capable people on average, so they are able to produce more and earn more income later in life.

Signaling effects: Even controlling for ability bias, there are good reasons to think that half or more of the returns to education are from signaling effects, i.e. people earning more because degrees provide them with legible-to-employers signals of ability, conscientiousness, conformity, et cetera.

I think the correct estimate of the social returns to education should be 20% or less of the estimate used by the Penn World Table. I could make this correction explicitly, but because a return that small is close to zero, the result wouldn't be very far from just dropping the human capital adjustment from the estimation of TFP altogether, which is what I do in the plots above.
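The arithmetic behind these two claims can be checked in a few lines. The 2.22 and 7 are the figures quoted above; the 20% rescaling is the upper bound proposed in this section, and applying it as a simple multiplier on the log return is an assumption of this sketch rather than something taken from the PWT methodology.

```python
import math

hc_ratio = 2.22      # rise in the PWT human capital index, from the text
extra_years = 7.0    # rise in average years of schooling, from the text

# Implied return in log points per extra year of schooling.
implied_return = math.log(hc_ratio) / extra_years

# Assumption: keep only 20% of the implied return as the "social" return.
social_share = 0.2
adjusted_hc_ratio = math.exp(social_share * implied_return * extra_years)
```

Under this rescaling the human capital multiplier over the whole period comes out at roughly 1.17 instead of 2.22, which is why dropping the adjustment entirely is a reasonable approximation.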

The capital share assumption

Assuming αK=1−αL might look reasonable, but in fact, it's quite likely to overestimate αK. There could be other factors of production, the most important of which is probably land in agrarian societies, suggesting this method would systematically overestimate capital share in poor countries whose economies are dominated by agriculture and where land remains an important factor of production. While PWT corrects for natural resource rents in this calculation, they do not correct for rents accruing to land.

Correcting this bias should lead to us estimating a higher rate of TFP growth. Indeed, if we bring back the key identity

$$\frac{1}{Y}\frac{dY}{dt} = \frac{1}{A}\frac{dA}{dt} + \alpha_L\frac{1}{L}\frac{dL}{dt} + \alpha_K\frac{1}{K}\frac{dK}{dt}$$

we can see that the larger the elasticity αK, the more of output growth is explained by capital growth and the less is left over for the TFP growth term. As a result, any adjustment to the procedure which lowers αK will increase our estimate for the growth rate of TFP.

I perform a crude correction by assuming that land elasticity of output is equal to 0.5 in the agricultural sector and land plays no other role in production. I tried to get a source for a particular elasticity to use here but came up short - I would appreciate it if readers could direct me to an appropriate source.

As around 40% of China's GDP in the 1970s was agriculture, this assumption gives an initially large role to land as a factor of production that vanishes later, a broad pattern that I think is qualitatively accurate. Quantitatively, this correction has a similar impact to the human capital correction, and stacking them gives the final result that TFP has more than doubled in China since 1970.
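One way to read the crude correction is sketched below: land's share of national income is taken to be the land elasticity times agriculture's share of GDP, and that amount is subtracted from 1−αL to get the capital share. The function name and the labor share value are hypothetical, and the linked notebook may implement the correction differently.

```python
def corrected_capital_share(labor_share, agri_share, land_elasticity=0.5):
    """Capital share after carving out an assumed land share.

    land_elasticity applies only within agriculture, so land's share of
    total output is land_elasticity * agri_share.
    """
    land_share = land_elasticity * agri_share
    capital_share = 1.0 - labor_share - land_share
    if capital_share < 0:
        raise ValueError("labor and land shares exceed total income")
    return capital_share, land_share


labor_share = 0.55  # hypothetical value, not taken from PWT
agri_share = 0.40   # roughly China's agricultural share of GDP in the 1970s
capital_share, land_share = corrected_capital_share(labor_share, agri_share)
```

With these inputs the land share is 0.20, so the capital share falls from 0.45 to 0.25. As agriculture shrinks as a fraction of GDP, the land share shrinks with it and the correction fades out.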

Conclusion

Putting all of this information together, it seems quite likely that TFP in China increased by a factor of 2 or more since 1956. Because I expect αK=1−αL to be a more legitimate assumption for wealthier countries, the capital share bias should be small in these cases, but the human capital bias can still be substantial because of the aggressive returns to education assumed by the PWT.

In general, TFP for labor and capital most likely grew by more than PWT estimates suggest, a result that I find more consistent with basic intuitions than the results PWT reports for some countries.

Edit (dated GMT 3:10 AM, 19 August 2023): Different datasets disagree on how much real GDP grew in China over the relevant period, leading to different estimates of mean TFP growth. For instance, PWT and the Maddison Project, both projects of the University of Groningen, estimate a growth rate of 6% per year or so from 1978 to 2004, while chaining together growth estimates from the World Bank over the same period gives a mean growth rate of 9.2% per year. If we accept the World Bank estimate instead, this further strengthens the case that TFP didn't decline, since it adds roughly 3% per year of growth that is not accounted for by capital or labor.
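To get a feel for how much this gap matters once compounded, here is a rough calculation using the two growth rates quoted above. Treating both figures as constant annual compound rates is an assumption of this sketch, as is the idea that the whole gap would pass through to TFP.

```python
pwt_growth = 0.06    # approximate PWT / Maddison mean growth rate, from the text
wb_growth = 0.092    # World Bank mean growth rate, from the text
years = 2004 - 1978  # length of the comparison window

# Annual gap in growth rates between the two sources.
annual_gap = wb_growth - pwt_growth

# Ratio of end-of-period GDP levels implied by the two growth paths.
cumulative_ratio = ((1 + wb_growth) / (1 + pwt_growth)) ** years
```

Under these assumptions the World Bank path ends the period at roughly twice the level implied by the PWT path, so the choice of GDP series alone can move the cumulative TFP estimate by a similar factor.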

Code

The code to reproduce the results in this post is available in this Colab notebook.
