Friday, July 12, 2019

The Phillips Curve: An Overview

Noah Smith has an article in Bloomberg today about the Phillips curve — the relationship between employment and inflation, where "employment" and "inflation" can mean a couple of different things. Phillips' original paper talked about wage inflation (wage growth) and unemployment, but sometimes these can refer to inflation of the price level in general (e.g. CPI inflation) or even expected inflation (as in the New Keynesian Phillips curves [NKPC] in DSGE models). I realized I don't have a good one-stop post for discussion of the Phillips curve, so this is going to be that post.

Noah's frame is the recent congressional hearings with Fed Chair Powell, and in particular the pointed questioning from Alexandria Ocasio-Cortez about whether the Phillips curve is "no longer describing what is happening in today’s economy." He goes on to discuss the research finding a 'fading Phillips curve' and mentions Adam Ozimek's claim that the Phillips curve is alive and well — all things I have discussed on this blog in the context of the Dynamic Information Equilibrium Model [DIEM]. Let's begin!

1. There is a direct relationship between wage growth and the unemployment rate

The structure of wage growth and the unemployment rate over the past few decades shows a remarkable similarity (as always, click to enlarge):

The wage growth model has continued to forecast well for a year and a half so far, while the unemployment rate model has not only done well for over two years now (I started it earlier) but has outperformed forecasts from the Fed as well as Ray Fair's model. Regardless of whether the models are correct (but seriously, that forecasting performance should weigh pretty heavily), they are still excellent fits to the prior data and describe the structure of the time series accurately. There's actually another series whose shock pattern ('economic seismogram') matches exactly — JOLTS hires. The hires measure confirms the 2014 mini-boom appearing in wage growth and unemployment, so we're not just matching negative recession shocks, but positive booms as well. We can put the models together on the same graph to highlight the similarity ... and we can basically transform them to fall on top of each other by simply scaling and lagging:

We find that shocks to JOLTS hires lead shocks to unemployment by about 5 months, and shocks to wages by 11 months — with the first two leading NBER recessions and the last one happening after a recession is over. We can be pretty confident that changes in hires cause changes in unemployment, which in turn cause changes in wage growth. Between shocks, the normal pattern is that unemployment falls and wage growth rises (accelerates). The rate of the latter is slow, but consistent (and forecast correctly by the DIEM):
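The lag-matching transformation is easy to sketch numerically. Here's an illustrative example with synthetic monthly series (the series names, cycle, and noise level are my own inventions, not fit to the actual data): it recovers the lag at which a leading series best matches a follower by maximizing correlation over the overlap.

```python
import numpy as np

def best_lag(leader, follower, max_lag=12):
    """Return the lag (in samples) at which `leader` best matches
    `follower`, by maximizing correlation of the overlapping segments."""
    best, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        a = leader[:-lag] if lag else leader
        b = follower[lag:]
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best, best_corr = lag, corr
    return best, best_corr

# Synthetic monthly series where "hires" leads "unemployment" by 5 months
rng = np.random.default_rng(0)
t = np.arange(240)
signal = np.sin(2 * np.pi * t / 24)  # stand-in cyclical component
hires = signal + 0.02 * rng.standard_normal(240)
unemp = np.roll(signal, 5) + 0.02 * rng.standard_normal(240)

lag, corr = best_lag(hires, unemp)
print(lag)  # 5
```

The same scan run in the other direction (or against a wage series) would produce the 5- and 11-month lead estimates quoted above.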

2. Adam Ozimek's graph is more like a Beveridge curve and isn't quite as clean as presented

I used the wage growth model above and a similar model of prime age employment to reproduce a version of Ozimek's graph in an earlier post. Ozimek uses the Employment Cost Index (ECI), but I use the Atlanta Fed wage growth tracker data because it is monthly and goes back a bit farther in time [3]. However, this produces a pretty much identical graph to Ozimek's when we plot the same time period:

The DIEMs for wage growth and prime age employment population ratio [EPOP] also have some similar structure — however the 2014 mini-boom is not as obvious in EPOP if it appears at all ...

This indicates that these two series might have a more complex relationship than unemployment and wage growth. In fact, if you plot them on Ozimek's axes — highlighting the temporal path through the data points in green, adding some earlier data in yellow, and highlighting the recent data in black — you see how the nice straight line above is somewhat spurious and the real slope is actually a bit lower:

The green dashed line shows where the data is headed (in the absence of a recession), and the light gray lines show the "dynamic equilibria" — the periods between shocks when wage growth and employment steadily grow. When a recession shock hits, we move from one "equilibrium" to another, much like the Beveridge curve (as I discuss in this blog post and in my paper).
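For reference, the dynamic equilibrium picture can be sketched in a few lines: the log of the series follows a constant-slope path, and each non-equilibrium shock adds a logistic step. The parameter values below are illustrative, not fit to any of the data in this post.

```python
import numpy as np

def diem_log_path(t, alpha, shocks):
    """Dynamic information equilibrium sketch: log X(t) has constant slope
    `alpha` between shocks; each shock (a, w, t0) adds a logistic step of
    amplitude a and width w centered at t0."""
    log_x = alpha * t
    for a, w, t0 in shocks:
        log_x = log_x + a / (1.0 + np.exp(-(t - t0) / w))
    return log_x

t = np.linspace(0.0, 20.0, 501)
# One positive "recession" shock to a falling (unemployment-like) series
log_u = diem_log_path(t, alpha=-0.05, shocks=[(0.8, 0.5, 10.0)])

# Far from the shock, the slope returns to the dynamic equilibrium rate
slope_late = np.polyfit(t[-50:], log_u[-50:], 1)[0]
print(round(slope_late, 3))  # -0.05
```

The "dynamic equilibria" in the graph are exactly the straight segments of this log path; a shock moves the series from one segment to another.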

3. The macro-relevant Phillips curve has faded away

The Phillips curves above talk about wage inflation, but in macro models the relationship is between unemployment and the price level (e.g. CPI or PCE inflation) — the NKPC. Now it's true that wages are a "price" and a lot of macro models don't distinguish between the price of labor and the price of goods. But it appears empirically we cannot just ignore this distinction because there does not appear to be any signal in price level data today ... but there used to be!

Much like in the first part of this post, we can look at DIEMs for (in this case) core PCE inflation and unemployment, and note that they really do seem to be related in the 60s through the 80s:

We see spikes of inflation cut off by spikes in unemployment, a pattern which fades out in the 90s. This is where a visualization of these "shocks" I've called "economic seismograms" is helpful — the following is a chart from a presentation last year (this time it's the GDP deflator):

Spikes in inflation are "cut-off" by recessions during the 60s and 70s, but that effect begins to fade out over time. What's interesting is that the period of a "strong Phillips curve" pretty much matches up with the long demographic shift of women entering the workforce in the 60s, 70s, and 80s. The Phillips curve vanishes when women's labor force participation becomes highly correlated with men's (i.e. only really showing signs of recession shocks). This is among several things that seem to change after the 1990s.

Why does this happen? I have some speculation (a metaphor I use is that mass labor force entry is like a "gravity wave" for macro) that I most concisely wrote up in a comment about my new book:
My thinking behind it is that high rates of labor force expansion (high compared to population growth) are more susceptible to the business cycle. Unlike adding people at the population growth rate, adding people at an accelerated rate because of something else happening — women entering the workforce — is more easily affected by macro conditions. Population grows and people have to find jobs, but women don't have to go against existing social norms and enter the workforce in a downturn — they are more likely to do so during an upturn (i.e. breaking social norms gets easier if it pays better than if it doesn't). 
This would cause the business cycle to pro-cyclically amplify and modulate the rate of women entering the workforce, which gives rise to bigger cyclical fluctuations and also the Phillips curve. 
As a side note: I think a similar mechanism played out during industrialization, when people were being drawn from rural agriculture into urban industry. And also a similar mechanism plays out when soldiers return from war (post-war inflation and recession cycles).
That new book's first chapter is largely about how this effect is generally behind the "Great Inflation" — and that it has nothing to do with monetary policy. Which brings us back to the beginning of this post: the Fed can't produce inflation because it never really could [1].

Update 13 July 2019: I wanted to add that this relationship between inflation and unemployment and the fading of it isn't about "expected" inflation (the expectations augmented Phillips curve), but observed inflation. It remains entirely possible that the "Lucas critique" is behind the fading — that agents learned how the Fed exploits the Phillips curve and so the relationship began to break down. Of course, the direct consequence is that apparently the Fed became a master of shaping expectations ... only to result in sub-target inflation after the Great Recession. It would also mean that the apparent match between rising labor force participation and the magnitude of the Phillips curve is purely a coincidence. I personally would go with Occam's razor here [2] — generally expectations-based theories verge on the unfalsifiable.


So 1. yes, wage growth and unemployment appear to be directly causally related; 2. wage growth and EPOP are not as closely or causally related; and 3. yes, the Phillips curve relationship between unemployment and the macro-relevant price level inflation has faded away as the surge of women entering the workforce ended.



[1] This is not to say a central bank can never create inflation — it could easily create hyperinflation, which is more a political problem than a macroeconomic mechanism. The cut-off between the "hyperinflation" effective theory and the "monetary policy is irrelevant" effective theory seems to be on the order of sustained 10% inflation. (In a side note mentioned at that link, that might also be where MMT — or really any one-dimensional theory of how an economy works — is a good effective theory. Your economy simplifies to a single dimension when money printing, inflation, and government spending all far outpace population and labor force growth.)

[2] Is granting the Fed and monetary policy control of inflation so important that we must come up with whatever theory allows it no matter how contrived?

[3] Update 14 July 2019: Here's the ECI version alongside the Atlanta Fed wage growth tracker data — graph originally from here. ECI's a bit too uncertain to see the positive shock in the 2014 mini-boom.

Thursday, July 11, 2019

Wage growth, inflation, interest rates, and employment

With the Fed hearings in Congress and some new data releases this week, I thought it'd be good to get a dynamic information equilibrium model (DIEM) snapshot just before the end of the month and what many people think is going to be the first Fed rate cut since the Great Recession. The Atlanta Fed's wage growth tracker was updated today, and the latest results are in line with the DIEM forecast from a year and a half ago:

We're pretty much at the point where wage growth has reached the NGDP growth dynamic equilibrium, which I've speculated is the point where a recession is triggered (by e.g. wages eating into profits, resulting in falling investment). Of course, the NGDP series is noisy, but this is what the "limits to wage growth" picture looks like with an average-sized shock (in the wage growth time series):

Inflation (CPI all items, seasonally adjusted) came in below the 2.5% dynamic equilibrium this month, but well within the error bands. The graphs show year-over-year as well as continuously compounded annual rates of change (i.e. the log derivative):
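The two measures differ slightly even for perfectly steady inflation. Here's a quick illustration with a toy price index growing at a constant continuously compounded 2.5% annual rate (the index values are made up for the example):

```python
import numpy as np

# Toy monthly price index with a constant continuously compounded
# 2.5% annual growth rate
months = np.arange(37)
index = 100.0 * np.exp(0.025 * months / 12.0)

# Year-over-year inflation: percent change versus 12 months earlier
yoy = 100.0 * (index[12:] / index[:-12] - 1.0)

# Continuously compounded annual rate of change (log derivative)
cc = 1200.0 * np.diff(np.log(index))

print(round(yoy[0], 2))  # 2.53 (slightly above 2.5 due to compounding)
print(round(cc[0], 2))   # 2.5
```

For inflation near 2%, the gap between the two conventions is only a few basis points, which is why the graphs look nearly identical.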

But inflation doesn't give us much of a sign of a recession (it can react after the fact, but isn't a leading indicator).

A metric many people look at is the yield curve — I've been tracking the median of a collection of rate spreads (which basically matches the principal component). This is only loosely based on dynamic information equilibrium (i.e. there's a long-term tendency for interest rates to decline), but is really more a linear model of the interest rate data before the last three recessions (so caveat emptor) coupled with an AR process forecast:

That linear model gives us an estimate of when the yield curve should invert as an indication of a recession. One thing to note is that with the Fed potentially lowering interest rates at the end of the month, the path of the interest rate spread will likely "turn around" and start climbing — it has done so in the past three recessions. That turnaround point has come between one and five quarters before the recession onset, and it has usually come at about -50 bp — both indicated with the gray box on the next graph:

As a side note: when people say AR processes outperform DSGE models, this is an example of one of those AR processes.
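For readers unfamiliar with AR processes, here's a minimal sketch of the kind of AR(1) fit-and-forecast being used, run on simulated detrended data. The coefficient, sample size, and noise level are invented for illustration, not estimated from the spread data.

```python
import numpy as np

def fit_ar1(x):
    """Least-squares estimate of the AR(1) coefficient and innovation std."""
    phi = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
    resid = x[1:] - phi * x[:-1]
    return phi, resid.std(ddof=1)

def forecast_ar1(last, phi, sigma, horizon):
    """Mean path and standard error of the h-step-ahead AR(1) forecast."""
    h = np.arange(1, horizon + 1)
    mean = last * phi ** h
    var = sigma ** 2 * np.cumsum(phi ** (2 * (h - 1)))
    return mean, np.sqrt(var)

# Simulate a detrended series with known AR(1) coefficient 0.9
rng = np.random.default_rng(1)
x = np.zeros(400)
for i in range(1, 400):
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()

phi_hat, sigma_hat = fit_ar1(x)
mean, se = forecast_ar1(x[-1], phi_hat, sigma_hat, horizon=12)
print(phi_hat)  # close to 0.9 for this simulated series
```

The widening `se` band is the "fan" you see on the spread forecast graph: each additional step ahead accumulates another innovation's worth of variance.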

If the Fed lowers rates this month, then the turnaround will be 20-30 bp higher than in the past three recessions — is this an indication of looser policy than in the past? Political pressure? This is not necessarily to say the Fed's rate decisions will have an impact. It's just a representation of how the Fed changes policy in the face of economic weakness — much like a person who sees they're about to get in a car accident might tense up, even though tensing up does nothing to mitigate or prevent the accident.

Earlier this week, JOLTS data came out. I've speculated that these measures are leading indicators: shocks to JOLTS hires appear around 5 months before shocks to the unemployment rate and around 11 months before shocks to wage growth (the model above) — the latter coming after the recession has begun. In any case, JOLTS quits appears to be showing a flattening that indicates a turnaround:

I talked about this on Twitter a bit. In the last recession, hires led the pack but that might have been a result of the housing bubble where construction hires started falling nearly 2 years before the recession onset. If that was a one-off, then quits and openings look like the better indicators. Here's openings:

As a side note, I talk about that atypical early lead for hires in my book as an indication that potentially the big xenophobic outbreak around the 2006 election might have had an impact on the housing bubble (an earlier draft version appears here as a blog post).

Again, a lot of this is speculative — I'm trying to put out clear tests of the usefulness of the dynamic information equilibrium model for forecasting and understanding data. But the series that seem to lag recessions (wage growth, inflation) are right in line with the DIEMs, while the series that seem to lead recessions (JOLTS) are showing the signs of deviations.


Update 4:30pm PDT

Here's the 10-year rate forecast from 2015, still doing much better than the BCEI forecast of roughly the same vintage ...

Friday, July 5, 2019

Labor market update: external validity edition

There were several threads on twitter (e.g. here, here, here) the past couple days that tie up under the theme of "external validity" versus "internal validity". It's a distinction that appears to mean something different in macroeconomics than it does in other sciences, but I can't quite put my finger on it. Operationally, its definition appears to imply you can derive some kind of "understanding" from a model that doesn't fit out-of-sample data.

Let's say I observe some humans running around, jumping over things at a track and field event. I go back to my computer and code up an idealized model of a human reproducing the appearance of a representative human and giving it some of the behaviors I saw. Now I want to use this model to derive some understanding when I experiment with some policy changes ... say, watching the interaction between the human and angry mushroom people ...

A lot of macro models are basically like this — they have neither internal validity nor external validity. It's just kind of a simulacrum — sure, Mario looks a bit like a person, and people can move around. But no one can jump that high or change the direction of their jump 180° in mid-air. A more precise analogy would be the invented virtual economies of video games like Civilization or Eve Online, but those still aren't real because there is no connection with macro data.

In science, a conclusion about e.g. the effects of some treatment on mice may be internally valid (i.e. it was done correctly and shows a real and reproducible effect — per a snarky twitter account, in mice), but not externally valid (i.e. the effect will not occur in humans). There's even a joke version of the linked "in mice" twitter account for DSGE models, but that's really not even remotely the same thing at all. DSGE models do not have internal validity in the scientific sense — they are not valid representations of even the subset of data they are estimated for. Or, a better way to put it: we don't know if they are valid representations of the data they are estimated for.

We can know if the test on mice is internally valid — someone else can reproduce the results, or you can continue to run the experiment with more mice. Usually something like this is done in the paper itself. There's been a crisis in psychology recently due to failing to meet this standard, but it's knowable through doing the experiments again. 

We cannot know if a macro model is internally valid in this sense. Why? Because macro models are estimated using time series for individual countries. If I estimate a regression over a set of data from 1980-1990 for the US, there is no way to get more data from 1980-1990 for the US in the same way we can get more mice — I effectively used all the mice already. Having someone else estimate the model or run the codes isn't testing internal validity because it's basically just re-doing some math (though some models fail even this).

The macro model might be an incredibly precise representation of the US economy between 1980 and 1990 in the same way old quantum theory and the Bohr model were an incredibly precise representation of the energy levels of hydrogen. But old quantum theory was wrong.

Macro models are sometimes described as having "internal consistency" which is sometimes confused for "internal validity" [1]. Super Mario Brothers is internally consistent, but it's not internally valid.

So if internal validity is "unknowable" for a macro model, we can look at external validity — out-of-sample data from other countries or other times, i.e. forecasting. It is through external validity that macro models gain internal validity — we can only know if a macro model is a valid description of the data it was tested on (instead of being a simulacrum) if it works for other data.

Which brings me to today's data release from BLS — and an unemployment rate forecast I've been tracking for over two years (click to enlarge):

Note that the model works not only for other countries (like Australia), but also for different time series, such as the prime age labor force participation rate, also released today:

That is to say the dynamic information equilibrium model (DIEM) has demonstrated some degree of external validity. This basically obviates any talk about whether DSGE models, ABMs, or other macro models can be useful for "understanding" if they do not accurately forecast. There are models that accurately forecast — that is now the standard. If the model does not accurately forecast, then it lacks external validity which means it cannot have internal validity — we can ignore it [1].

That said, the DSGE model from FRB NY has been doing fairly well with inflation for a bit over a year ... so even the discussion of whether a DSGE model has to forecast accurately is obviated, even if you are only considering DSGE models. They have to now — for at least a year. This one has.



[1] Often people bring up microfoundations as a kind of logical consistency. A DSGE model has microfoundations, so even if it doesn't forecast exactly right, the fact that we can fit macro data with a microfounded DSGE model provides some kind of understanding — or so the argument goes.

The reasoning is that we're extrapolating from the micro scale (agents, microfoundations) to the macro scale. It's similar to "external validity" except instead of moving to a different time (i.e. forecasting) or a different space (i.e. other countries), we are moving to a different scale. In physics, there's an excellent example of doing this correctly — in fact, it's related to my thesis. The quark model (QCD) is kind of like a set of microfoundations for nuclear physics. It's especially weird because we cannot really test the model very well at the micro scale (though recent lattice calculations have been getting better and better). The original tests of QCD came from extrapolating across different energy scales (in the diagram below) using evolution equations. QCD was quite excellent at describing the data (click to enlarge):

Measurement of the structure function of a nucleon at one scale allows QCD to tell us what it looks like at another scale. We didn't prove QCD to be a valid description of reality at the scale it was formulated at in terms of quarks and gluons ("microfoundations"), but rather we extrapolated to different scales — external validity. Other experiments confirmed various properties of the quark microfoundations, but this experiment was one that confirmed the whole structure of QCD.

But we can in fact measure various aspects of the microfoundations of economics — humans, unlike quarks, are readily accessible without building huge accelerators. Those microfoundations often turn out to be wrong. But more importantly, the DSGE models extrapolated from these microfoundations do not have external validity — they don't forecast well, and economists don't use them to predict things at other scales (AFAICT) like, say, predicting state-by-state GDP.

What's weird is that the inability to forecast is downplayed, and the macro models are instead seen as providing some kind of "understanding" because they incorporate microfoundations — when in actuality the proper interpretation of the evidence and the DSGE construction is that either the microfoundations or the aggregation process is wrong. The only wisdom you should gain is that you should try something else.

Saturday, June 29, 2019

Median sales price of new houses

Data for the median sales price (MSP) of new houses was released this past week on FRED, and the data is showing a distinct correlated negative deviation which is generally evidence that a non-equilibrium shock is underway in the dynamic information equilibrium model (DIEM).

I added a counterfactual shock (in gray). This early on, there is a tendency for the parameter fit to underestimate the size of the shock (for an explicit example, see this version for the unemployment rate in the Great Recession). The model overall shows the housing bubble alongside the two shocks (one negative and one positive) to the level paralleling the ones seen in the Case Shiller index and housing starts.

This seems like a good time to look at the interest rate model and the yield curve / interest rate spreads. First, the interest rate model is doing extraordinarily well for having started in 2015:

I show the Blue Chip Economic Indicators forecast from 2015 as well as a recent forecast from the Wall Street Journal (click to embiggen):

And here's the median (~ principal component) interest rate spread we've been tracking for the past year (almost exactly — June 25, 2018):

If -28 bp was the lowest point (at the beginning of June), it's higher than the previous three lowest points (-40 to -70 bp). Also, if it is in fact the lowest point, note that in the previous three cycles the lowest point came between 1 and 5 quarters before the NBER recession onset.

Friday, June 28, 2019

PCE inflation

The DIEM for PCE inflation continues to perform fairly well ... though it's not the most interesting model in the current regime (the lowflation period has ended).

Here's the same chart with other forecasts on it:

The new gray dot with a black outline shows the estimated annual PCE inflation for 2019 assuming the previous data is a good sample (this is not the best assumption, but it gives an idea where inflation might end up given what we know today). The purple dots with the error bars are Fed projections, and the other purple dotted line is the forecast from Jan Hatzius of Goldman Sachs.

Mostly just to troll the DSGE haters, here's the FRB NY DSGE model forecast compared to the latest data — it's doing great!

But then the DIEM is right on as well with smaller error bands ...

Monday, June 24, 2019

A Workers' History of the United States 1948-2020

After seven years of economic research and developing forecasting models that have outperformed the experts, author, blogger, and physicist Dr. Jason Smith offers his controversial insights, derived from the data, about the major driving factors behind the economy — and it's not economics, it's social changes. These social changes are behind the questions of who gets to work, how those workers organize, and how workers identify politically — and it is through labor markets that these social changes manifest in economic effects. What would otherwise be a disjoint and nonsensical postwar economic history of the United States is made into a cohesive workers' history driven by women entering the workforce and the backlash to the Civil Rights movement — plainly: sexism and racism. This new understanding of historical economic data offers lessons for understanding the political economy of today and insights for policies that might actually work.
Dr. Smith is a physicist who began with quarks and nuclei before moving into research and development in signal processing and machine learning in the aerospace industry. During a government fellowship from 2011 to 2012 — and in the aftermath of the global financial crisis — he learned about the potential use of prediction markets in the intelligence community and began to assess their validity using information theoretic approaches. From this spark, Dr. Smith developed the more general information equilibrium approach to economics, which has been shown to have broader applications to neuroscience and online search trends. He wrote A Random Physicist Takes on Economics in 2017, documenting this intellectual journey and the change in perspective towards economic theory and macroeconomics that comes with this framework. That change in perspective came with new interpretations of economic data over time that finally came together in this book.
The book I've been working on for the past year and a half — A Workers' History of the United States 1948-2020 — is now available on Amazon as a Kindle e-book or a paperback. Get your copy today! Head over to the book website for an open thread for your first impressions and comments. And pick up a copy of A Random Physicist Takes on Economics if you haven't already ...

Update 7am PDT 24 June 2019

The paperback edition still says "publishing" on KDP, but it should be ready in the next 24-48 hours. However, I did manage to catch what is probably a fleeting moment where the book is #1 in Macroeconomics:

Update 2pm PDT 24 June 2019

Paperback is live!

Sunday, June 23, 2019

Sometimes I feel like I don't see the data

Sometimes I feel like my only friend.

I've seen links to this nymag article floating around the interwebs that purports to examine labor market data for evidence that the Fed rate hike of 2015 was some sort of ominous thing:
But refrain they did not. 
Instead, the Federal Reserve began raising interest rates in 2015 ...
Scott Lemieux (a poli sci lecturer at the local university) puts it this way:
But the 2015 Fed Rate hike was based on false premises and had disastrous consequences, not only because of the direct infliction of unnecessary misery on many Americans, but because it may well have been responsible for both President Trump and the Republican takeover of the Senate, with a large amount of resultant damage that will be difficult or impossible to reverse. 
Are we looking at the same data? Literally nothing happened in major labor market measures in December of 2015 (here: prime age labor force participation, JOLTS hires, unemployment rate, wage growth from ATL Fed):

There were literally no consequences from the Fed rate hike in terms of labor markets. All of these time series continued along their merry log-linear equilibrium paths. It didn't even end the 2014 mini-boom (possibly triggered by Obamacare going into effect) which was already ending.
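The "nothing happened" claim is the kind of thing you can check mechanically: fit log-linear paths before and after the candidate break date and compare slopes. Here's a toy version on synthetic data following a single unbroken log-linear path (all numbers invented, not the actual BLS series):

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(120.0)                  # 120 months; candidate break at t = 60
log_series = 4.0 + 0.001 * t + 0.0005 * rng.standard_normal(120)

# Separate log-linear fits before and after the candidate break date
pre_slope = np.polyfit(t[:60], log_series[:60], 1)[0]
post_slope = np.polyfit(t[60:], log_series[60:], 1)[0]

# For an unbroken path, the slopes agree to well within the noise
print(abs(pre_slope - post_slope) < 1e-4)  # True
```

Run against the labor market series above with December 2015 as the candidate break, this kind of comparison shows no detectable change in trend.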

But it's a good opportunity to plug my book which says that the Fed is largely irrelevant (although it can make a recession worse). The current political situation is about changing alliances and identity politics amid the backdrop of institutions that under-weight urban voters.


Update + 30 minutes

Before someone mentions something about the way the BLS and CPS count unemployment, let me add that nothing happened in long term unemployment either:

The mini-boom was already fading. Long term unemployment has changed, but the change (like the changes in many measures) came in the 90s.

Thursday, June 20, 2019

Resolving the Cambridge capital controversy with logic

So I wrote a somewhat tongue-in-cheek blog post a few years ago titled "Resolving the Cambridge capital controversy with abstract algebra" [RCCC I] that called the Cambridge Capital Controversy [CCC] for Cambridge, UK in terms of the original debate they were having — summarized by Joan Robinson's claim that you can't really add apples and oranges (or in this case printing presses and drill presses) to form a sensible definition of capital. I used a bit of group theory and the information equilibrium framework to show that you can't simply add up factors of production. I mentioned at the bottom of that post that there are really easy ways around it — including a partition function approach in my paper — but Cambridge, MA (Solow and Samuelson) never made those arguments.

On the Cambridge, MA side no one seemed to care because the theory seemed to "work" (debatable). A few years passed and eventually Samuelson conceded Robinson and Sraffa were in fact right about their re-switching arguments. A short summary is available in an NBER paper from Baqaee and Farhi, but what interested me about that paper was that the particular way they illustrated it made it clear to me that the partition function approach also gets around the re-switching arguments. So I wrote that up in a blog post with another snarky title, "Resolving the Cambridge capital controversy with MaxEnt" [RCCC II] (a partition function is a maximum entropy — MaxEnt — distribution).

This of course opened a can of worms on Twitter when I tweeted out the link to my post. The first volley was several people saying Cobb-Douglas functions were just a consequence of accounting identities, or that they fit any data — a lot of which was based on papers by Anwar Shaikh (in particular the "humbug" production function). I added an update to my post saying these arguments were disingenuous — and in my view academic fraud, because they rely on a visual misrepresentation of data as well as an elision of the direction of mathematical implication. Solow pointed out the former in his 1974 response to Shaikh's "humbug" paper (as well as the fact that Shaikh's data shows labor output is independent of capital, which would render the entire discussion moot if true), but Shaikh has continued to misrepresent "humbug" until at least 2017 in an INET interview on YouTube.

The funny thing is that I never really cared about the CCC — my interest on this blog is research into economic theory based on information theory. RCCC I and RCCC II were both primarily about how you would go about addressing the underlying questions in the information equilibrium framework. However, the subsequent volleys have brought up even more illogical or plainly false arguments against aggregate production functions that seem to have sprouted in the Post-Keynesian walled garden. I believe it's because "mainstream" academic econ has long since abandoned arguing about it, and like my neglected back yard a large number of weeds have grown up. This post is going to do a bit of weeding.

Constant factor shares!

Several comments brought up that Cobb-Douglas production functions can fit any data assuming (empirically observed) constant factor shares. However, this is just a claim that the gradient 

$$\nabla \log Y = \left( \frac{\partial \log Y}{\partial \log L} , \frac{\partial \log Y}{\partial \log K} \right)$$

is constant, which a fortiori implies a Cobb-Douglas production function

$$\log Y = a \log L + b \log K + c$$

A backtrack is that it's only constant factor shares in the neighborhood of observed values, but that just means Cobb-Douglas functions are a local approximation (i.e. the tangent plane in log-linear space) to the observed region. Either way, saying "with constant factor shares, Cobb-Douglas can fit any data" is saying, vacuously, "data that fits a Cobb-Douglas function can be fit with a Cobb-Douglas function". Leontief production functions also have constant factor shares locally, but in fact have two tangent planes, which just retreats to the local description (data that is locally Cobb-Douglas can be fit with a local Cobb-Douglas function).
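The vacuity is easy to demonstrate numerically: generate data with constant factor shares (i.e. Cobb-Douglas by construction) and a log-log regression recovers the exponents exactly. The exponent values below are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, c = 0.7, 0.3, 1.0                    # invented "factor shares"
L = rng.uniform(1.0, 10.0, 200)
K = rng.uniform(1.0, 10.0, 200)
log_Y = a * np.log(L) + b * np.log(K) + c  # constant shares by construction

# Fit log Y = a log L + b log K + c: an exact fit, since constant shares
# in log-log space *is* the Cobb-Douglas form
X = np.column_stack([np.log(L), np.log(K), np.ones_like(L)])
coef, *_ = np.linalg.lstsq(X, log_Y, rcond=None)
print(coef)  # recovers a = 0.7, b = 0.3, c = 1.0
```

The regression is exact here not because Cobb-Douglas is flexible, but because the constant-shares assumption already baked the functional form into the data.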

Aggregate production functions don't exist!

The denial that the functions even exist is by far the most interesting argument, but it's still not logically sound. At least it's not disingenuous — it could just use a bit of interdisciplinary insight. Jo Michell linked me to a paper by Jonathan Temple with the nonthreatening title "Aggregate production functions and growth economics" (although the filename is "Aggreg Prod Functions Dont Exist.Temple.pdf" and the first line of the abstract is "Rigorous approaches to aggregation indicate that aggregate production functions do not exist except in unlikely special cases.")

However, not too far in (Section 2, second paragraph) it makes a logical error of extrapolating from $N = 2$ to $N \gg 1$:
It is easy to show that if the two sectors each have Cobb-Douglas production technologies, and if the exponents on inputs differ across sectors, there cannot be a Cobb-Douglas aggregate production function.
It's explained how the argument proceeds in a footnote:
The way to see this is to write down the aggregate labour share as a weighted average of labour shares in the two sectors. If the structure of output changes, the weights and the aggregate labour share will also change, and hence there cannot be an aggregate Cobb-Douglas production function (which would imply a constant labour share at the aggregate level).
This is true for $N = 2$, because the change of one "labor share state" (specified by $\alpha_{i}$ for an individual sector $y_{i} \sim k^{\alpha_{i}}$) implies an overall change in the ensemble average labor share state $\langle \alpha \rangle$. However, this is a bit like saying that in a two-atom ideal gas the kinetic energy of one atom can change, so the average kinetic energy of the gas isn't well defined, and therefore (rigorously!) there is no such thing as temperature (i.e. a well-defined kinetic energy $\sim k T$) for an ideal gas in general with more than two atoms ($N \gg 1$) except in unlikely special cases.

I was quite surprised that econ has disproved the existence of thermodynamics!

Joking aside, if you have more than two sectors, it is possible you could have an empirically stable distribution over labor share states $\alpha_{i}$ and a partition function (details of the approach appear in my paper):

$$
Z(\kappa) = \sum_{i} e^{- \kappa \alpha_{i}}
$$

take $\kappa \equiv \log (1+ (k-k_{0})/k_{0})$ which means

$$
\langle y \rangle \sim k^{\langle \alpha \rangle}
$$

where the ensemble average is

$$
\langle X \rangle \equiv \frac{1}{Z} \sum_{i} X_{i} e^{- \kappa \alpha_{i}}
$$

There are likely more ways than this partition function approach based on information equilibrium to get around the $N = 2$ case, but we only need to construct one example to disprove nonexistence. Basically this means that unless the output structure of a single firm affects the whole economy, it is entirely possible that the output structure of an ensemble of firms could have a stable distribution of labor share states. You cannot logically rule it out.
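A minimal numerical sketch of the partition function construction above (the beta distribution of $\alpha$ states here is a hypothetical stand-in, not an empirical estimate):

```python
import numpy as np

rng = np.random.default_rng(42)
# hypothetical stable distribution of labor share states alpha_i
alphas = rng.beta(2.0, 2.0, size=1000)

def ensemble_average_alpha(kappa):
    """<alpha> = (1/Z) sum_i alpha_i exp(-kappa alpha_i), Z = sum_i exp(-kappa alpha_i)."""
    weights = np.exp(-kappa * alphas)
    return np.sum(alphas * weights) / np.sum(weights)

k0 = 1.0
for k in [1.0, 2.0, 5.0]:
    kappa = np.log(1.0 + (k - k0) / k0)  # kappa = log(1 + (k - k0)/k0)
    print(k, ensemble_average_alpha(kappa))
```

At $\kappa = 0$ (i.e. $k = k_{0}$) the ensemble average reduces to the simple mean of the $\alpha$ states; away from $k_{0}$ the states are re-weighted by the partition function.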

What's interesting to me is that in a whole host of situations, the distributions of these economic states appear to be stable (and in some cases in an unfortunate pun, stable distributions). For some specific examples, we can look at profit rate states and stock growth rate states.

Now you might not believe these empirical results. Regardless, the logical argument is not valid unless your model of the economy is unrealistically simplistic (like modeling a gas with a single atom — not too unlike the unrealistic representative agent picture). There is of course the possibility that this doesn't work empirically (much as equilibrium thermodynamics doesn't work for a whole host of non-equilibrium processes). But Jonathan Temple's paper is a bunch of wordy prose with the odd equation — it does not address the empirical question. In fact, Temple reiterates one of the defenses of the aggregate production function approaches that has vexed these theoretical attempts to knock them down (section 4, first paragraph):
One of the traditional defenses of aggregate production functions is a pragmatic one: they may not exist, but empirically they ‘seem to work’.
They of course would seem to work if economies are made up of more than two firms (or sectors) and have relatively stable distributions of labor share states.

To put it yet another way, Temple's argument relies on a host of unrealistic assumptions about an economy — that we know the distribution isn't stable, that there are only a few sectors, and that the output structure of those few sectors changes regularly enough to require a new estimate of the exponent $\alpha$ but not regularly enough that the changes create a temporal distribution of states.

Fisher! Aggregate production functions are highly constrained!

There are a lot of references that trace all the way back to Fisher (1969) "The existence of aggregate production functions", and several people mentioned Fisher or work derived from his papers. The paper is itself a survey of restrictions believed to constrain aggregate production functions, but it seems to have been written from the perspective that an economy is a highly mathematical construct that can either only be described by $C^{2}$ functions or not at all. In a later section (Sec. 6), discussing whether aggregate production functions can at least be good approximations, Fisher says:
approximations could only result if [the approximation] ... exhibited very large rates of change ... In less technical language, the derivatives would have to wiggle violently up and down all the time.
Heaven forbid were that the case!

He cites in a footnote the rather ridiculous example of $\lambda \sin (x/\lambda)$ (locally $C^{2}$!) — I get the feeling he was completely unaware of stochastic calculus or quantum mechanics and therefore could not imagine a smooth macroeconomy made up of noisy components, only a few pathological examples from his real analysis course in college. Again, a nice case for some interdisciplinary exchange! I wrote a post some years ago about the $C^{2}$ view economists seem to take versus a far more realistic noisy approach in the context of the Ramsey-Cass-Koopmans model. In any case, why exactly should we expect firm level production functions to be $C^{2}$ functions that add to a $C^{2}$ function?

One of the constraints Fisher notes is that individual firm production functions (for the $i^{th}$ firm) must take a specific additive form:

$$
f_{i}(K_{i}, L_{i}) = \phi_{i}(K_{i}) + \psi_{i}(L_{i})
$$

This is probably true if you think of an economy as one large $C^{2}$ function that has to factor (mathematically, like, say, a polynomial) into individual firms. But like Temple's argument, it denies the possibility that there can be stable distributions of states $(\alpha_{i}, \beta_{i})$ for individual firm production functions (which might even change over time!) such that

$$
Y_{i} = f_{i}(K_{i}, L_{i}) = K_{i}^{\alpha_{i}} L_{i}^{\beta_{i}}
$$

aggregates to

$$
\langle Y \rangle \sim K^{\langle \alpha \rangle} L^{\langle \beta \rangle}
$$

The left/first picture is a bunch of random production functions with beta-distributed exponents. The right/second picture is an average of 10 of them. In the limit of an infinite number of firms, constant returns to scale holds at the macro level (i.e. $\langle \alpha \rangle + \langle \beta \rangle \simeq 0.35 + 0.65 = 1$), but individual firms aren't required to have constant returns to scale (many don't in this example). In fact, none of the individual firms has to have any of the properties of the aggregate production function. (You don't really have to impose that constraint at either scale — and in fact the whole Solow model works much better empirically in terms of nominal quantities and without constant returns to scale.) Since these are simple functions, they don't have that many properties, but we could include things like constant factor shares or constant returns to scale.
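A hedged sketch of that construction in Python (the beta-distribution parameters are chosen purely for illustration so the means come out near 0.35 and 0.65):

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms = 10000

# firm-level exponents drawn from beta distributions with means 0.35 and 0.65;
# individual firms need not have constant returns (alpha_i + beta_i != 1 in general)
alphas = rng.beta(3.5, 6.5, size=n_firms)
betas = rng.beta(6.5, 3.5, size=n_firms)

# at the macro scale, constant returns emerge from the ensemble averages
print(alphas.mean() + betas.mean())  # approximately 1

# firm outputs at a common (K, L) point, and the ensemble average
K, L = 2.0, 3.0
firm_outputs = K ** alphas * L ** betas
print(firm_outputs.mean(), K ** alphas.mean() * L ** betas.mean())  # close, not identical
```

No individual firm is required to satisfy $\alpha_{i} + \beta_{i} = 1$; constant returns only emerge in the ensemble average.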

The information-theoretic partition function approach actually has a remarkable self-similarity between macro (i.e. aggregate level) and micro (i.e. individual firm level) — this self-similarity is the reason why Cobb-Douglas or diagrammatic ("crossing curve") models at the macro scale aren't obviously implausible.

Both the arguments of Temple and Fisher seem to rest on strong assumptions about economies constructed from clean, noiseless, abstract functions — and either a paucity or surfeit of imagination (I'm not sure which). It's a kind of love-hate relationship with neoclassical economics — working within its confines to try to show that it's flawed. A lot of these results are cases of what I personally would call mathiness. Paul Romer might well think they're fine, but to me they sound like an all-too-earnest undergraduate math major fresh out of real analysis trying to tell us what's what. Sure, man, individual firms' production functions are continuous and differentiable additive functions. So what exactly have you been smoking?

These constraints on production functions from Fisher and Temple actually remind me a lot of Steve Keen's definition of an equilibrium that isn't attainable — it's mathematically forbidden! It's probably not a good definition of equilibrium if you can't even come up with a theoretical case that satisfies it. Fisher and Temple can't really come up with a theoretical production function that meets all their constraints besides the trivial "all firms are the same" function. It's funny that Fisher actually touches on that in one of his footnotes (#31):
Honesty requires me to state that I have no clear idea what technical differences actually look like. Capital augmentation seems unduly restrictive, however. If it held, all firms would produce the same market basket of outputs and hire the same relative collection of labors.
But the bottom line is that these claims to have exhausted all possibilities are just not true! I get the feeling that people have already made up their minds which side of the CCC they stand on, and it doesn't take much to confirm their biases, so they don't ask questions after e.g. Temple's two-sector economy. That settles it then! Well, no ... as there might be more than two sectors. Maybe even three!

Monday, June 17, 2019

Resolving the Cambridge capital controversy with MaxEnt

I came across this 2018 NBER working paper from Baqaee and Farhi again today (on Twitter) after seeing it around the time it came out. The abstract spells it out:
Aggregate production functions are reduced-form relationships that emerge endogenously from input-output interactions between heterogeneous producers and factors in general equilibrium. We provide a general methodology for analyzing such aggregate production functions by deriving their first- and second-order properties. Our aggregation formulas provide non-parametric characterizations of the macro elasticities of substitution between factors and of the macro bias of technical change in terms of micro sufficient statistics. They allow us to generalize existing aggregation theorems and to derive new ones. We relate our results to the famous Cambridge-Cambridge controversy.
One thing that they do in their paper is reference Samuelson's (version of Robinson's and Sraffa's) re-switching arguments. I'll quote liberally from the paper (this is actually the introduction and Section 5) because it sets up the problem we're going to look at:
Eventually, the English Cambridge prevailed against the American Cambridge, decisively showing that aggregate production functions with an aggregate capital stock do not always exist. They did this through a series of ingenious, though perhaps exotic looking, “re-switching” examples. These examples demonstrated that at the macro level, “fundamental laws” such as diminishing returns may not hold for the aggregate capital stock, even if, at the micro level, there are diminishing returns for every capital good. This means that a neoclassical aggregate production function could not be used to study the distribution of income in such economies. 
... In his famous “Summing Up” QJE paper (Samuelson, 1966), Samuelson, speaking for the Cambridge US camp, finally conceded to the Cambridge UK camp and admitted that indeed, capital could not be aggregated. He produced an example of an economy with “re-switching”: an economy where, as the interest rate decreases, the economy switches from one technique to the other and then back to the original technique. This results in a non-monotonic relationship between the capital-labor ratio as a function of the rate of interest r. 
... [In] the post-Keynesian reswitching example in Samuelson (1966). ...  [o]utput is used for consumption, labor can be used to produce output using two different production functions (called “techniques”). ... the economy features reswitching: as the interest rate is increased, it switches from the second to the first technique and then switches back to the second technique.
I wrote a blog post four years ago titled "Resolving the Cambridge capital controversy with abstract algebra" which was in part tongue-in-cheek, but also showed how Cambridge, UK (Robinson and Sraffa) had the more reasonable argument. With Samuelson's surrender summarized above, it's sort of a closed case. I'd like to re-open it, and show how the resolution in my blog post renders the post-Keynesian re-switching arguments as describing pathological cases unlikely to be realized in a real system — and therefore decides the argument in favor of the existence of aggregate production functions, and of Solow and Samuelson.

To some extent, this whole controversy is due to economists seeing economics as a logical discipline — more akin to mathematics — instead of an empirical one — more akin to the natural sciences. The pathological case of re-switching does in fact invalidate a general rigorous mathematical proof of the existence of aggregate production functions in all cases. But it is just that — a pathological case. It's the kind of situation where you should require some sort of empirical evidence that such cases actually arise before you take the impasse they present to mathematical existence seriously.

If you follow through the NBER paper, they show a basic example of re-switching from Samuelson's 1966 paper. As the interest rate increases, one of the "techniques" becomes optimal over the other and we get a shift in the capital-to-output and capital-to-labor ratios:

Effectively, this is a shift in $\alpha$ in a production function

$$
Y \sim K^{\alpha} L^{1-\alpha}
$$

or more simply in terms of the neoclassical model in per-labor terms ($x \equiv X/L$)

$$
y \sim k^{\alpha}
$$

That is to say in one case we have $y \sim k^{\alpha_{1}}$ and $y \sim k^{\alpha_{2}}$ in the other. As the authors of the paper put it:
The question we now ask is whether we could represent the disaggregated post-Keynesian example as a version of the simple neoclassical model with an aggregate capital stock given by the sum of the values of the heterogeneous capital stocks in the disaggregated post-Keynesian example. The non-monotonicity of the capital-labor and capital-output ratios as a function of the interest rate shows that this is not possible. The simple neoclassical model could match the investment share, the capital share, the value of capital, and the value of the capital-output and capital-labor ratios of the original steady state of the disaggregated model, but not across steady states associated with different values of the interest rate. In other words, aggregation via financial valuation fails.
But we must stress that this is essentially one (i.e. representative) firm with this structure, and that across a real economy, individual firms would have multiple "techniques" that change in a myriad ways — and there would be many firms.

The ensemble approach to information equilibrium (where we have a large number of production functions $y_{i} \sim k^{\alpha_{i}}$) recovers the traditional aggregate production function (see my paper here), but with ensemble average variables (angle brackets) evaluated with a partition function:

$$
\langle y \rangle \sim k^{\langle \alpha \rangle}
$$

(see the paper for the details). This formulation does not depend on any given firm staying in a particular "production state" $\alpha_{i}$, and it is free to change from any one state to another in a different time period or at a different interest rate. The key point is that we do not know which set of $\alpha_{i}$ states describes every firm at every interest rate. With constant returns to scale, we are restricted to $\alpha$ states between zero and one, but we have no other knowledge available without a detailed examination of every firm in the economy. We'd be left with a uniform distribution over [0, 1] if that is all we had, but we could (in principle) average the $\alpha$'s we observe and constrain our distribution so that $\langle \alpha \rangle$ is some (unknown) real value in [0, 1]. That defines a beta distribution:

Getting back to the Samuelson example, I've reproduced the capital to labor ratio:

Of course, our model has no compunctions against drawing a new $\alpha$ from a beta distribution for any value of the interest rate ...

That's a lot of re-switching. If we have a large number of firms, we'll have a large number of re-switching (micro) production functions — Samuelson's post-Keynesian example is but one of many paths:

The ensemble average (over that beta-distribution above) produces the bolder blue line:

This returns a function with respect to the interest rate that approximates a constant $\alpha$ as a function of the interest rate — and which only gets better as more firms are added and more re-switching is allowed:

This represents an emergent aggregate production function that is smooth in the interest rate even though each individual production function is non-monotonic. The aggregate production function of the Solow model is in fact well-defined and does not suffer from the issues of re-switching unless the draw from the distribution is pathological (for example, all firms being the same, or, equivalently, a representative firm assumption).
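A minimal sketch of this averaging (illustrative parameters; each "firm" re-draws its $\alpha$ state at every interest rate as a maximum-entropy stand-in for arbitrary re-switching):

```python
import numpy as np

rng = np.random.default_rng(7)
interest_rates = np.linspace(0.01, 0.5, 50)
n_firms = 200

# each firm draws a new alpha state from a beta distribution at every interest rate;
# each row is one firm's (wildly re-switching) path over the interest rate
alpha_paths = rng.beta(2.0, 2.0, size=(n_firms, interest_rates.size))

# single-firm paths are non-monotonic, but the ensemble average is nearly flat in r
ensemble_alpha = alpha_paths.mean(axis=0)
print(alpha_paths[0].std(), ensemble_alpha.std())  # the individual path varies far more
```

Adding more firms only flattens the ensemble average further (its variation falls like $1/\sqrt{N}$), which is the sense in which the aggregate production function gets better as the economy gets more complex.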

This puts the onus on the Cambridge, UK side to show that empirically such cases exist and are common enough to survive aggregation. However, if we do not know about the production structure of a sizable fraction of firms with respect to a broad swath of interest rates, we must plead ignorance and go with maximum entropy. As the complexity of an economy increases, we become less and less likely to see a scenario that cannot be aggregated.

Again, I mentioned this back four years ago in my blog post. The ensemble approach offers a simple workaround to the inability to simply add apples and oranges (or more accurately printing presses and drill presses). However, the re-switching example is a good one to show how a real economy — with heterogeneous firms and heterogeneous techniques — can aggregate into a sensible macroeconomic production function.


Update 18 June 2019

I am well aware of the Cobb-Douglas derangement syndrome associated with the Cambridge capital controversy that exists on Econ Twitter and the econoblogosphere (which is in part why I put that gif with the muppet in front of a conflagration on the tweets about this blog post ... three times). People — in particular post-Keynesian acolytes — hate Cobb-Douglas production functions. One of the weirder strains of thought out there is that a Cobb-Douglas function can fit any data arbitrarily well. This is plainly false, as

$$
a \log X + b \log Y + c
$$

is but a small subset of all possible functions $f(X, Y)$. Basically, this strain of thought is equivalent to saying a line $y = m x + b$ can fit any data.

A subset of this mindset appears to be a case of a logical error based on accounting identities. There have been a couple of papers out there (not linking) that suggest that Cobb-Douglas functions are just accounting identities. The source of this might be that you can approximate any accounting identity by a Cobb-Douglas form. If we define $X \equiv \delta X + X_{0}$, then

$$
X_{0} \left( \log (\delta X + X_{0}) + 1\right) + Y_{0} \left( \log (\delta Y + Y_{0}) + 1\right) + C
$$

is approximately equal to $X + Y$ for $\delta X / X_{0} \ll 1$ and $\delta Y / Y_{0} \ll 1$ if

$$
C \equiv - X_{0} \log X_{0} - Y_{0} \log Y_{0}
$$

That is to say you can locally approximate an accounting identity by taking into account that log linear is approximately linear for small deviations.
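A quick numerical check of this local approximation (the values of $X_{0}$ and $Y_{0}$ are arbitrary):

```python
import numpy as np

X0, Y0 = 10.0, 20.0
C = -X0 * np.log(X0) - Y0 * np.log(Y0)

def log_linear_form(X, Y):
    """X0 (log X + 1) + Y0 (log Y + 1) + C, the local Cobb-Douglas-like form."""
    return X0 * (np.log(X) + 1.0) + Y0 * (np.log(Y) + 1.0) + C

# small deviations: the log-linear form tracks the accounting identity X + Y
print(log_linear_form(10.5, 20.5), 10.5 + 20.5)   # nearly equal

# large deviations: the approximation breaks down
print(log_linear_form(20.0, 40.0), 20.0 + 40.0)   # noticeably different
```

The approximation only holds in the neighborhood of $(X_{0}, Y_{0})$, which is exactly the point about the direction of implication: a particular local form approximates the identity, not the other way around.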

It appears that some people have taken this $p \rightarrow q$ to mean $q \rightarrow p$ — that any Cobb-Douglas form $f(X, Y)$ can be represented as an accounting identity $X+Y$. That is false in general: only the specific form above, under the small-deviation conditions above, can be so transformed. If you have a different Cobb-Douglas function it cannot be.

Another version of this thinking (from Anwar Shaikh) was brought up on Twitter. Shaikh has a well-known paper where he created the "Humbug" production function. I've reproduced it here:

I was originally going to write about something else here, but in working through the paper and reproducing the result for the production function ...

... I found out this paper is a fraud. Because of the way the values were chosen, the resulting production function has no dependence on the variation in $q$ aside from an overall scale factor. Here's what happens if you set $q$ to be a constant (0.8) — first "HUMBUG" turns into a line:

And the resulting production function? It lies almost exactly on top of the original:

It's not too hard to pick a set of $q$ and $k$ data that gives a production function that looks nothing like a Cobb-Douglas function by just adding some noise:

The reason can be seen in the table and relies mostly on Shaikh's choice of the variance in the $k$ values (click to enlarge):

But also, if we just plot the $k$-values and the $q$-values versus time, we have log-linear functions:

Is it any surprise that a Cobb-Douglas production function fits this data? Sure, it seems weird if we look at the "HUMBUG" parametric graph of $q$ versus $k$, but $k(t)$ and $q(t)$ are lines. The production function is smooth because the variance in $A(t)$ depends almost entirely on the variance in $q(t)$ so that taking $q(t)/A(t)$ leaves approximately a constant. The bit of variation left is the integrated $\dot{k}/k$, which is derived from a log-linear function — so it's going to have a great log-linear fit. It's log-linear!

Basically, Shaikh misrepresented the "HUMBUG" data as having a lot of variation — obviously nonsense by inspection, right?! But it's really just two lines with a bit of noise.
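The point that two noisy log-linear time series trivially admit an excellent Cobb-Douglas-style fit can be sketched with made-up data (the slopes and noise levels below are purely illustrative, not Shaikh's actual table values):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(20.0)

# two series that are log-linear in time with a bit of noise,
# standing in for the k(t) and q(t) series underlying "HUMBUG"
log_k = 1.0 + 0.03 * t + rng.normal(0.0, 0.005, t.size)
log_q = 0.5 + 0.01 * t + rng.normal(0.0, 0.005, t.size)

# OLS fit of log q = alpha log k + const, a per-worker Cobb-Douglas form
A = np.column_stack([log_k, np.ones_like(log_k)])
coef, *_ = np.linalg.lstsq(A, log_q, rcond=None)
alpha = coef[0]
pred = A @ coef
r_squared = 1.0 - np.sum((log_q - pred) ** 2) / np.sum((log_q - log_q.mean()) ** 2)
print(alpha, r_squared)  # alpha near the slope ratio 0.01/0.03, with a very good fit
```

A parametric plot of `log_q` against `log_k` could be made to spell out anything you like; the fit quality comes from the two series being lines in time, not from the shape of the parametric curve.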


Update + 2 hours

I was unable to see the article earlier, but apparently this is exactly what Solow said. Solow was actually much nicer (click to enlarge):

The cute HUMBUG numerical example tends to bowl you over at first, but when you think about it for a minute it turns out to be quite straightforward in terms of what I have just said. The made-up data tell a story, clearer in the table than in the diagram. Output per worker is essentially constant in time. There are some fluctuations but they are relatively small, with a coefficient of variation about 1/7. The fact that the fluctuations are made to spell HUMBUG is either distraction or humbug. The series for capital per worker is essentially a linear function of time. The wage share has small fluctuations which appear not to be related to capital per worker. If you ask any systematic method or educated mind to interpret those data using a production function and the marginal productivity relations, the answer will be that they are exactly what would be produced by technical regress with a production function that must be very close to Cobb-Douglas.
Emphasis in the original. That's exactly what the graph above (and reproduced below) shows. Shaikh not only does not address this comment in his follow up — he quotes only the last sentence of this paragraph and then doubles down on eliding the HUMBUG data as representative of "any data":
Yet confronted with the humbug data, Solow says: “If you ask any systematic method or any educated mind to interpret those data using a production function and the marginal productivity relations, the answer will be that they are exactly what would be produced by technical regress with a production function that must be very close to Cobb-Douglas” (Solow, 1957 [sic], p. 121). What kind of “systematic method” or “educated mind” is it that can interpret almost any data, even the humbug data, as arising from a neoclassical production function?
This is further evidence that Shaikh is not practicing academic integrity. Even after Solow points out that "Output per worker is essentially constant in time ... The series for capital per worker is essentially a linear function of time", Shaikh continues to suggest that "even the humbug data" is somehow representative of the universe of "any data" when it is in fact a line.

The fact that Shaikh chose to graph "HUMBUG" rather than this time series is obfuscation and in my view academic fraud. As of 2017, he continues to misrepresent this paper in an Institute for New Economic Thinking (INET) video on YouTube saying "... this is essentially an accounting identity and I illustrated that putting the word humbug and putting points on the word humbug and showing that I could fit a perfect Cobb-Douglas production function to that ..."


Update 20 June 2019

I did want to add a bit about how the claims about the relationship between Cobb-Douglas production functions and accounting identities elide the direction of implication. Cobb-Douglas implies an accounting identity holds, but the logical content of the accounting identity on its own is pretty much vacuous without something like Cobb-Douglas. In his 2005 paper, Shaikh elides the point (and also re-asserts his disingenuous claim about the humbug production function above).