Sunday, February 23, 2020

A future response from Random Critical Analysis?

I'll be addressing the sole constructive, objective argument you made in that blog post within the next few days.  I'll try to be respectful and fair, even though you seem to be unable or unwilling to reciprocate.
That's regarding my previous post. There were quite a lot of right wing, conservative, and libertarian Twitter accounts (amazingly, groups that don't like government health care) who seemed to think that I was being a jerk, disrespectful, or making ad hominem arguments — and that somehow reflected on what I was saying. I pause to note that saying I'm being a jerk and using that to somehow dismiss what I was saying is itself argumentum ad hominem. That aside, no one is owed someone else's respect. 

I was pointing out that we should not assume RCA is competent — hard to imagine that ever coming across as respectful to RCA! And it's true that is ad hominem! But expertise in stats as well as using software to run regressions was very much at issue. He is making arguments (and mistakes!) that are unsupportable for reasons that have to do with the details of how regressions work.

There are also additional reasons to believe RCA is entirely biased (he called me a socialist, lol), and his first response was that I was attacking his analysis because he disagreed with my politics. Overall, since he didn't disclose who he is or his politics (and since we can gather from his statements that his politics agree with the finding that health care spending in the US is perfectly normal), he really needs to do a lot more "leaning over backwards" because of that bias and failure to disclose.

I'm sorry to say models and statistics can be manipulated in multiple ways. And people have a way of finding out how to present results that agree with their preconceived ideas. People have a tendency to believe that regressions don't lie, but just like photographs and film editing there are lots of things you can do to present something that is not completely honest (even if it's superficially correct) — and that's especially effective on people who aren't well versed in stats (just like people who don't work with Photoshop aren't as good at spotting alterations in pictures).

If we find some paper out there touting the benefits of free market economics from the Cato Institute or Mercatus Center, we all have reason to be suspect. I'm not saying it's something they can't overcome — I've actually cited a Mercatus fellow in my book. But it requires some extra leaning over backward (e.g. a Mercatus fellow writing in their paper "I know you're reading this and thinking of course someone at Mercatus found this, but ..." and going on to explain). We don't even know who RCA is which obscures our ability to do this kind of due diligence.

With that out of the way, what did he think the "sole constructive, objective argument" was?

The presumably substantive part of his argument is little more than a quibble over choice of model.  Even if he were objectively correct in this, which he isn’t, it leaves the vast majority of my arguments intact.  He just makes himself look like an asshole.
Someone preferring a different model than you is not a priori evidence of bad faith.  You're coming into this w/ the presumption that you know the one true way to do this, ignoring that this approach is quite common, and without having the perspective of the rest of the evidence.
Apparently, he's a double space after a period person. Ugh. Unfortunately, this is incorrect in multiple ways.

First, if I were to write a basic math book but started it off with a whole table of incorrect addition facts, like 2 + 2 = 5, you would have no reason to continue. This doesn't leave the rest of the book "intact". There is just no reason to continue past the first point because it's fundamentally flawed, and the rest can be presumed to be as flawed.

Second, while "this approach is quite common", it is not used to extrapolate a nonlinear coefficient that far outside the range of the data (see the 2/22 update, part II of my blog post for more details). The difference of logs ceases to be a percentage for more than ~ 10% differences, and while you might extrapolate an elasticity in economics you don't do it for a factor of 2 difference from the data you used to estimate it.

Third, I do not think the linear model is a better model and never said that [1] — I said it was indistinguishable from the nonlinear one he uses over the non-US data and results in a different conclusion in terms of the US being an outlier. If I have two models that are equally valid and one implies one conclusion and one implies another, it's really a toss-up. You don't draw either conclusion. At least if  you're being scientific. Leaning over backwards requires you support both conclusions or neither.

It's selecting one or the other without saying both are equally good that's evidence of bad faith. The reason RCA claims the nonlinear fit is better is because it makes it such that the US is not an outlier, which is the conclusion he is trying to find. That is to say he has to assume the US is not an outlier in order to select his nonlinear specification. I think this point is lost on a lot of people rising to RCA's defense [2].

But I discovered something else about why he chose the nonlinear model that goes to RCA's incompetence here. He appears to have selected that nonlinear model because of the better R² relative to the R² of the linear model. This is hilariously wrong. RCA was apparently complaining about being blocked by a statistics professor who complained about the exact same thing I am complaining about [3]:


The tweets RCA was blocked for? These:


R² is not really a valid metric for nonlinear models aside from polynomials. This is a great simple explanation from a statistics consultant and author. I'm not sure if RCA is using a polynomial here or a nonlinear model (such as a log regression, as in the graph under discussion in my previous post). He's used polynomials in the past, and R² makes sense for those. But the thing is that R² is a monotonically non-decreasing function of the number of variables, so when comparing a linear model and a second order polynomial, the latter will always win (or at least tie). That's what the stats professor ("Statistical Ideas") is saying, and it's understandable why he'd block someone who obviously doesn't understand what he's talking about.

But here RCA is also comparing R² for a linear model and either a nonlinear model or a polynomial model, which is meaningless either because a) R² for a linear fit and a 2+ order polynomial are not commensurate, or b) R² for a nonlinear model is not valid.
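To make the nesting point concrete, here is a minimal sketch using made-up data that is truly linear plus noise. The helper names (`polyfit_normal_eq`, `r_squared`) and the data are hypothetical, and the fit uses plain normal equations rather than any particular stats package. It only illustrates that R² cannot go down when a higher-order term is added, so a higher R² for the quadratic is not by itself evidence for the quadratic.

```python
import random

def polyfit_normal_eq(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    n = degree + 1
    # Build A^T A and A^T y for a design matrix with columns x**0 ... x**degree.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [row + [b] for row, b in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - known) / m[i][i]
    return coef

def r_squared(xs, ys, coef):
    """Coefficient of determination for a fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    pred = [sum(c * x ** k for k, c in enumerate(coef)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, pred))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

random.seed(0)
xs = [0.5 * i for i in range(1, 21)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 1.0) for x in xs]  # made-up, truly linear data

r2_by_degree = {d: r_squared(xs, ys, polyfit_normal_eq(xs, ys, d)) for d in (1, 2, 3)}
for degree, r2 in r2_by_degree.items():
    print(f"degree {degree}: R^2 = {r2:.4f}")
```

Even though the underlying data here is a straight line, the degree 2 and degree 3 fits report an R² at least as large as the linear fit, which is why the comparison says nothing about which model is warranted.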

I'm not optimistic about RCA actually responding to my criticisms. I imagine he'll do something like the graphs above (comparing a nonlinear R² to a linear one), which will just demonstrate his incompetence further. Such is the problem with the Dunning-Kruger effect.

...

Footnotes:

[1] I did say it was better in terms of absolute error, but that's just a single metric. Otherwise I said "Over the non-US data, the linear fit (brownish dashed line) is basically as good as the nonlinear fit ... And over the entire range, the nonlinear fit falls inside the 90% confidence limits of the linear model" (emphasis added)

[2] Another weird defense was that finding any proportional relationship, linear or not, proves that health care expenses rise with income. The thing is the data set includes a lot of developing countries. I could see health care expenses rising with income for Mexico or Latvia.

And if that was the case, why stand up for RCA's specific proportional relationship? Not one person who made the claim that any proportional relationship proved the point also said that RCA's analysis was garbage. It was more of a rhetorical tack than a logical one. Were Karl Smith and Lyman Stone agreeing that RCA's analysis was garbage?

"So what if this analysis is garbage, anyone who says it's proportional proves the underlying point of the garbage analysis!" 

If I had a study that said that minimum wages reduced employment and you came along with a critique that said my analysis made major errors and should have resulted in a much smaller effect if the math was right, I don't come back and say "See, there's still an effect!"

[3] RCA has a habit of basically ignoring other people's expertise when it disagrees with him.

Friday, February 21, 2020

Leaning over backwards: health care edition

It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty—a kind of leaning over backwards.  For example, if you’re doing an experiment, you should report everything that you think might make it invalid—not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked—to make sure the other fellow can tell they have been eliminated. 
I need to start off with what should become an obligatory statement noting that while Feynman was pretty good about describing what it means to participate in the process of science, he was also a sexist jerk and a prime example of toxic masculinity.

Anyway, I can't remember how I came across it a year ago (I think via Steve Roth), but the anonymous "random critical analysis" [RCA] is back in the econoblogosphere (it's still around, I swear!) — this time amplified by Alex Tabarrok at Marginal Revolution (not going to link). He's still peddling his wares. He claims:
Health spending is overwhelmingly determined by the average real income enjoyed by nations’ residents in the long run.
Emphasis in the original. I think it's a good example of how relying on an internet rando [0] to do quantitative analysis on major policy issues can lead us astray. It's also a good case study in how to identify the sometimes subtle choices that end up ensuring the conclusions as well as the sometimes even subtler hints that people are overselling their competence.

I have no idea why he emphasized "residents" because the rest of the sentence isn't exactly that precise. You could dig into the PPP adjusted AIC et cetera, but I'd first like to focus on the "determined". The next graph in the extended blog post is a log-linear regression that actually has no causal interpretation — health care spending is "determining" income as much as income is "determining" health care spending:

Just because you chose the x-axis to be income and the y-axis to be health care spending and performed a regression does not mean you've found a causal relationship that "determines" things. Certainly, there is likely some relationship! Per capita income of a country seems like a plausible variable in a model of health care expenses. In fact, it's what you'd expect if health care was becoming more and more unaffordable as costs outstrip the ability to pay for them (a mechanism that seems to be at play for housing prices in the US). RCA's big innovation is saying that instead of this being a problem, this is what people want.

RCA makes a big point about using log-log graphs in the opening paragraphs, but one of the problems here is that he shows only the log-log version of this graph because — as we'll see below — the linear version looks pretty silly. We'll also see that lines on log-log graphs help conceal the choice (and it is a deliberate choice) of a nonlinear function. However, there's also this: 
In case you’re not already aware, these slopes [on log-log plots] can be readily interpreted in percentage terms. ... For example, the slope in the health expenditure-income plot below implies that a 1% increase in underlying value on the x-axis (income) predicts a 1.8% increase in value on the y-axis (health spending).
In reading this, I'm pretty sure that RCA doesn't actually understand that the 1.8 in that slope of the log-log graph means the growth rate of health spending is 1.8 times the growth rate in income. Sure, if incomes rise 1%, then health care will rise 1.8%. But if incomes rise 5%, then health care will rise 9%. Taking the difference in logs, you can substitute a (small) percentage in for x and get a percentage out as y, but you can't interpret the slope as a percentage unless that percentage is 180%. It's subtle, but it's the kind of thing you look for when teaching students because it helps you see if they understand the material or are just mechanically reproducing results.
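A quick numerical sketch of that distinction, assuming the fitted relationship is a pure power law with the quoted exponent of 1.8 (the function name `spending_change` is just for illustration). It compares the exact change implied by the power law with the "slope times percent" rule of thumb:

```python
slope = 1.8  # log-log slope quoted in the post

def spending_change(income_change):
    """Exact fractional change in y when x changes by income_change, for y proportional to x**slope."""
    return (1.0 + income_change) ** slope - 1.0

for pct in (0.01, 0.05, 0.10, 1.00):
    exact = spending_change(pct)
    rule_of_thumb = slope * pct  # ratio of growth rates applied as if it were linear
    print(f"income +{pct:.0%}: exact {exact:+.1%}, rule of thumb {rule_of_thumb:+.1%}")
```

For small changes the two agree closely; for a doubling of income the rule of thumb says +180% while the power law itself implies roughly +248%. The slope is a ratio of growth rates, and it only reads as a percentage for small changes.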

I mean, in case you're not already aware [1] ...

Ok, next I have a bit of a nit — but it's kind of a theme of general sloppiness with RCA [2]. At that footnote we can learn a bit about orthogonal polynomials, but here we can learn a bit about significant figures. I was able to reproduce the graph above, but the equation in the annotation (like in the case in footnote [2]) gives an entirely different result (click to enlarge):


Blue is RCA's curve, gray dashed lines are my nonlinear model lines (either fit myself, or as above, using the equation RCA wrote down or rounding), and red is the data except for the US which is in green matching the original graph.

RCA rounded the two coefficients accurately (albeit to different orders), but it's really obvious we're on logarithmic axes when rounding from 9.88 to 10 moves you entirely off the data. RCA makes the claim that the fit is robust to leaving out the US, and if you look at the equations (especially if rounded by RCA's method) they are pretty close:

y = − 9.88 + 1.77 x          with the US

y = − 9.68 + 1.75 x          without the US

And that new equation (dashed gray) even looks pretty close on that log plot:


As an aside, this also tells us that RCA decided to do his analysis of the US not being an outlier in healthcare spending by including the US in the fits. That's just not how that works. Anyway, transforming back to linear space, we've gone down by 10% at US levels of income:


At 17% of GDP these days, a 10% reduction in health care spending in the US (~ 2% of GDP) would be a significant improvement! However, the US still isn't an outlier in this view — only about as much as Ireland. That's because this view is still effectively "a quadratic fit for no reason" that RCA was on about a year ago. If

y = a x² 

then

log(y) = log(a) + 2 log(x)

The fits above show ~ 1.8 instead of 2 so we have x^1.8 instead of x^2, but we're still fitting a nonlinear function to the data. Why? As far as I can tell the only reason is that it tells the story the author wants.
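The equivalence between a straight line on log-log axes and a power law can be checked directly. This is a minimal sketch with made-up, noise-free data (the prefactor 0.5 and the x grid are arbitrary assumptions); a plain least-squares line through (log x, log y) recovers the exponent as the slope and log(a) as the intercept.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

a_true, k_true = 0.5, 1.8                      # made-up power law y = a * x**k
xs = [1.0 + 0.5 * i for i in range(20)]
ys = [a_true * x ** k_true for x in xs]

slope, intercept = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
print(f"slope = {slope:.3f} (exponent), exp(intercept) = {math.exp(intercept):.3f} (prefactor)")
```

So choosing to draw a line on a log-log plot is the same as choosing to fit y = a x^k in linear space; the line just hides the nonlinearity.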

This is where we get back to Feynman. Leaning over backwards with honesty would force us to ask why we should throw out a simple linear fit. It's not accuracy over range of the data — unless we're already assuming the US is not an outlier and including it in the fits.


Over the non-US data, the linear fit (brownish dashed line) is basically as good as the nonlinear fit — the brown dashed curve and the gray dashed curve fall on top of each other until we get out to the US. And over the entire range, the nonlinear fit falls inside the 90% confidence limits of the linear model [3]. In fact, the linear model is actually much better than the nonlinear model in terms of absolute error [4]:


RCA's nonlinear model (fit correctly) actually predicts US health care spending will be somewhere between $7000 per capita and $13000 (90% CL), and $8000, solidly within one sigma, is right in line with the linear model. That means that unless we include the US we do not have any reason to select the nonlinear model here. Or put another way: assuming the US is not an outlier, the US is not an outlier.
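The procedure being argued for (fit without the candidate point, then ask whether it lands inside the single prediction band) can be sketched as follows. Everything here is an assumption for illustration: the data is made up, the function names are hypothetical, the model is a simple straight line, and the band uses a normal quantile as a rough stand-in for the Student's t quantile a stats package would use.

```python
import math
from statistics import NormalDist

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus what the band needs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    sigma = math.sqrt(rss / (n - 2))  # residual standard error
    return {"slope": slope, "intercept": intercept, "sigma": sigma,
            "mean_x": mean_x, "sxx": sxx, "n": n}

def single_prediction_interval(fit, x_new, level=0.90):
    """Interval where one new observation at x_new is expected to fall."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal approximation to Student's t
    spread = math.sqrt(1.0 + 1.0 / fit["n"] + (x_new - fit["mean_x"]) ** 2 / fit["sxx"])
    center = fit["intercept"] + fit["slope"] * x_new
    half = z * fit["sigma"] * spread
    return center - half, center + half

def is_outlier(xs, ys, candidate, level=0.90):
    """Fit WITHOUT the candidate point, then test it against the band."""
    x_c, y_c = candidate
    low, high = single_prediction_interval(fit_line(xs, ys), x_c, level)
    return not (low <= y_c <= high)

# Made-up data; the candidate sits at twice the largest x used in the fit.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
print(is_outlier(xs, ys, (12.0, 24.0)))  # close to the extrapolated line
print(is_outlier(xs, ys, (12.0, 40.0)))  # far above it
```

The key design choice is that the candidate never enters `fit_line`. Including it in the fit pulls the curve toward it and shrinks its residual, which is the circularity described above.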

You may ask why I haven't gone in depth on the rest of the avalanche of graphs that follow. Unlike a novel where an unreliable narrator can be interesting, in science it's anathema. You have to go for more than just the superficial honesty of not deliberately lying, but rather leaning over backwards to show that your biases aren't driving the conclusions. In short, this is an example of cargo cult science — and it tells us more about the person doing it than it does about the world.

But screw it. Let's forget about science. Let's listen to an internet rando. Let's say RCA's claim is true that the US is not an outlier — and that globally health care spending rises twice as fast as income. We've apparently traded a US-specific problem for an enormous global problem where health care rises faster than income and becomes more and more unaffordable for everyone on Earth. The US's medical bankruptcies are just the canaries in the coal mines of a growing global problem [5]. His laser focus on showing the US is not an outlier at all costs makes him miss the forest for the trees — I'm sure RCA didn't want to imply that global health care spending is on an unsustainable path. He seems to think spending more and more money on health care (despite "diminishing returns" [6]) is the epitome of civilization:
The typical American household is much better fed today than in prior generations despite spending a much smaller share of their income on groceries and working fewer hours.  I submit this is primarily a direct result of productivity.  We can produce food so much more efficiently that we don’t need to prioritize it as we once did.  The food productivity dividend, as it were, has been and is being spent on higher-order wants and needs like cutting edge healthcare, higher amenity education, and leisure activities.
I don't know about you, but I love going to the doctor [7].

...

Update + 5 hours

In case you might think I'm being unnecessarily harsh on RCA, please note a) that isn't the first time I've encountered him and b) this from the "discussion" of this blog post this evening (click to enlarge):


...

Update 2/22/2020

I am wondering if this might be a clearer demonstration of what I'm getting at here. I fit another nonlinear function to RCA's data (leaving the US out because we want to see if it's an outlier). It's a logistic function. Using this function here has some basis in economic theory — e.g. satiation points. At some point consuming health care is more of a hassle than a benefit, right? Only so many angioplasties you can have in a year. Well, at least that's a plausible model. If we drop RCA's data in, we get the brown-ish curve below:


This says the US (green dot) is overspending by about double at a point where it should be reaching satiation. I could write up a whole long blog about this, and since it matches RCA's curve (blue) for every country except the US most of the rest of his analysis would go through. The other countries in the world are on the growing part of the curve — it'd just change the conclusions about the US.
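For readers who have not met the logistic function, here is a small sketch of the shape being described. The parameter values are purely hypothetical (they are not the fitted values behind the graph) and the income units are arbitrary; the point is only that the curve grows quickly at low x and flattens toward a ceiling at high x.

```python
import math

def logistic(x, ceiling, rate, midpoint):
    """Logistic curve: rises roughly exponentially at first, then saturates at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (x - midpoint)))

# Hypothetical parameters for illustration only.
ceiling, rate, midpoint = 5000.0, 0.15, 30.0

values = {income: logistic(income, ceiling, rate, midpoint) for income in (10, 20, 30, 40, 60)}
for income, spending in values.items():
    print(f"income {income}: spending {spending:.0f}")

growth_low = values[20] / values[10]   # growth on the rising part of the curve
growth_high = values[60] / values[40]  # growth near the satiation level
print(f"growth factor 10->20: {growth_low:.2f}, 40->60: {growth_high:.2f}")
```

On the rising part a logistic curve and a power law can look alike, which is why data confined to that region cannot tell you which one to extrapolate.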

But I can't do this. At least, not in good faith — and definitely not leaning over backwards. I actually believe this picture is almost certainly more accurate. My uncertainty in the saturation level is about where the single prediction bands put it. However, I would be a charlatan if I tried to push this fit to the data and let it be used by others in policy discussions.

Sure, I might put up a blog post and say, hmm, interesting — let's see how the data looks in the future!

But I can't draw a conclusion — US health care spending is an outlier — like how RCA has done with his. That's what I mean by the plot being in bad faith.

...

Update 2/22/2020 part II

These graphs were made somewhat tongue-in-cheek, but they illustrate a bit of the problem with extrapolating the nonlinear fits as far out as the US — where does it stop? At what point do we stop saying that because a point falls in the gray band, we have to conclude it's not an outlier? (Click to enlarge)


A good summary of my argument is that we can't go as far as that green dot representing the US and claim we're leaning over backwards in being honest.

Additionally, the slopes determined in the nonlinear fit are akin to elasticities in economics: the change in e.g. price vs quantity in elasticities of supply and demand (one of the earliest things I looked at with information equilibrium on this blog). I'd say it's a stretch to actually say slopes in this example are estimates of elasticities (we have aggregate macro data here, not micro data), but let's go with it. The thing is that 1) they are elasticities only where the difference in logs is approximately a percentage, and 2) estimating elasticities and applying that human behavioral result well beyond the data you measured is not scientifically supportable.

Let's look at x compared to a reference value of x₀ = 5. The blue line below is 100 × (log x − log x₀), aka the difference of logs, while the yellow line is 100 × (x − x₀)/x₀, aka the percent difference. We can see how this approximation breaks down as you move away from the region near x₀:


Note that the US is about twice as far out on the graph as the highest point in the data — so in terms of the slope on the log graph representing an elasticity, we're well out of scope of the approximation.
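The same comparison can be tabulated in a few lines, using the reference value x₀ = 5 from the text and natural logs (the function names are just for illustration):

```python
import math

x0 = 5.0  # reference value used in the text

def log_difference(x):
    """100 x (log x - log x0): the 'difference of logs'."""
    return 100.0 * (math.log(x) - math.log(x0))

def percent_difference(x):
    """100 x (x - x0) / x0: the actual percent difference."""
    return 100.0 * (x - x0) / x0

for x in (5.05, 5.5, 6.0, 7.5, 10.0):
    print(f"x = {x:5.2f}: log difference {log_difference(x):6.2f}, "
          f"percent difference {percent_difference(x):6.2f}")
```

Near x₀ the two columns agree to within a small fraction of a point. At x = 10, twice the reference value, the percent difference is 100 while the difference of logs is only about 69.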

And even if the approximation was still in scope, extrapolating the behavioral meaning of that elasticity all the way to twice the highest point in the data is even more problematic.

In a more down to earth example, we know that gas prices do not heavily impact consumption in the short run when they fluctuate at the 10-30% level. RCA's analysis is like extrapolating that finding to increases of 100% — if gas prices doubled, consumption would remain constant. That's iffy on its own. However, he takes it a bit further — if some data then showed the US didn't reduce its consumption when gas prices doubled (i.e. it was in line with that extrapolation), RCA's analysis would be claiming that gas consumption is actually perfectly inelastic (people everywhere don't care about the price of gas at all) instead of possible structural reasons the US didn't reduce supply (e.g. the US built roads and housing that locked in commuting and therefore gas consumption). The former is basically a conclusion derived from a single data point — like RCA's claim the US isn't an outlier.

...

Update 2/22/2020 part III

Per commenter rob below, here is that disallowed region (above the blue dashed line) and where RCA's curve intersects it:


I mean, if we're allowed to extrapolate to the green point, why can't we extrapolate all the way out to that intersection?

n.b. This is a scope condition (a limit of the region of validity of the model).

...

Footnotes:

[0] Sure, I'm also an internet rando (a random physicist you could say), but I give my real name and you can peruse my grad school papers and thesis here if you'd like. In the interest of leaning over backward, I can also say that I am quite biased towards the left of the political spectrum. However, I don't have really strong feelings about health care policy — I do think it should be free because of basic morality, but that doesn't necessarily mean I think it should be a smaller component of GDP, but maybe a plausible future is one where most of us work in health care instead of retail (the transition appears to be already happening). Other people have much better thoughts on health care policy than I do. I wrote a short book on my views of the political economy of the US that doesn't even mention health care except for the possible stimulus effect of the ACA, focusing instead on racism, sexism, and other social forces as drivers of the economy.

[1] "In case you're not already aware ..." is also the kind of language Trump uses when he just heard about something for the first time. [Edit: added + 30 mins.]

[2] In that Twitter thread from a year ago, I found out the equation RCA printed on the graph did not give the line presented in that graph. RCA said (a week later) it was about the plotting function label being unable to handle orthogonal polynomials:
I used a 3rd order polynomial with an orthogonal transformation -- poly() function in R.  The labeling package isn't smart enough to transform the coefficients.  No big deal.
Although this information did let me figure out what happened on RCA's graph, it's also the kind of word salad you get when a student is trying to confidently answer a question that they don't really understand. I imagine it took him that week to figure it out. Basically, RCA confused R's poly() coefficients with R's polynomial() coefficients. I'll use Hermite polynomials (not 100% sure how R chooses the orthogonal set) to show the difference.

A normal ("raw") regression fits (to third order)

p(x) = a x³ + b x² + c x + d

with the fit returning (a, b, c, d) while an orthogonal polynomial regression fits (using Hermite polynomials which is probably not what R is doing, but still illustrative)

p(x) = a' (x³ − 3 x) + b' (x² − 1) + c' x + d' · 1

with the fit returning (a', b', c', d'), where a = a', b = b', c = c' − 3 a', and d = d' − b'. It's quite valuable to do the latter, because it can reduce the covariance at each order. For example, Hermite polynomials of different orders are designed to have a zero overlap integral, so adding each order doesn't affect the previous orders like it would for adding monomial terms at each order (x² looks a bit like x near x = 1, while x² − 1 doesn't as much near x = 1). But RCA isn't really doing an analysis where he shows increasing or decreasing orders, which is where this process is most valuable: fitting a linear function, then fitting a quadratic and comparing the size of the new coefficients to see if adding the quadratic was warranted. If he had done that (as well as properly testing for the US as an outlier), he would have found that adding a quadratic term was not warranted unless the US was added.
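The coefficient bookkeeping above is easy to verify numerically. This sketch uses the same Hermite basis as the illustration in the text, which (as noted) is probably not the orthogonal basis R actually constructs; the coefficient values are made up and the function names are hypothetical.

```python
def orthogonal_to_raw(a_o, b_o, c_o, d_o):
    """Convert coefficients of (x^3 - 3x), (x^2 - 1), x, 1 into raw (a, b, c, d)."""
    return a_o, b_o, c_o - 3.0 * a_o, d_o - b_o

def eval_orthogonal(x, a_o, b_o, c_o, d_o):
    return a_o * (x ** 3 - 3.0 * x) + b_o * (x ** 2 - 1.0) + c_o * x + d_o

def eval_raw(x, a, b, c, d):
    return a * x ** 3 + b * x ** 2 + c * x + d

ortho = (0.5, -1.2, 2.0, 3.0)      # made-up orthogonal-basis coefficients
raw = orthogonal_to_raw(*ortho)

for x in (-2.0, 0.0, 1.5):
    correct = eval_orthogonal(x, *ortho)
    converted = eval_raw(x, *raw)      # same polynomial, raw coefficients
    mistaken = eval_raw(x, *ortho)     # orthogonal coefficients plugged in as raw ones
    print(f"x = {x:4.1f}: correct {correct:7.3f}, converted {converted:7.3f}, mistaken {mistaken:7.3f}")
```

The converted coefficients reproduce the curve exactly, while printing the orthogonal coefficients as if they were raw ones gives a different polynomial. That is the kind of mismatch between a plotted line and its printed equation described in this footnote.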

However, I'm pretty certain RCA did not understand what R was doing until I called him out — at which point he went back to the documentation and tried to figure it out ... but still didn't understand it. If he had, he would have written something more like:
I used 3rd order orthogonal polynomials  -- poly() function in R.  I accidentally input the orthogonal poly coefficients in as raw poly coefficients.  No big deal.
It's true that it's a simple mistake, but it also sheds light on who RCA is. Note that the common plotting package is in fact capable of handling using poly() in the linear model  and printing the correct polynomial. But sure, it's that the package isn't smart enough, not that he made a mistake or didn't understand what he was doing.

[3] The error bands RCA is providing seem to be either less than one sigma or (more likely) are mean prediction bands (effectively where the new regression line will shift given a new data point) rather than single prediction bands (where an individual new data point might fall), which I tend to give and which are what a typical person tends to think of when they see error bands.
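The difference between the two kinds of band comes down to one extra term under the square root. Here is a minimal sketch for a straight-line fit, with made-up data, a hypothetical function name, and a normal quantile standing in for the Student's t quantile:

```python
import math
from statistics import NormalDist

def band_half_widths(xs, ys, x_new, level=0.90):
    """Return (mean prediction, single prediction) half-widths at x_new for an OLS line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    sigma = math.sqrt(rss / (n - 2))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal approximation to Student's t
    leverage = 1.0 / n + (x_new - mean_x) ** 2 / sxx
    mean_band = z * sigma * math.sqrt(leverage)          # uncertainty in the fitted line
    single_band = z * sigma * math.sqrt(1.0 + leverage)  # where one new point may fall
    return mean_band, single_band

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]          # made-up data
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
mean_band, single_band = band_half_widths(xs, ys, x_new=3.5)
print(f"mean prediction half-width: {mean_band:.3f}")
print(f"single prediction half-width: {single_band:.3f}")
```

The single prediction band is always the wider of the two, so a point can sit well outside a mean prediction band and still be unremarkable, or look comfortably "inside the gray" when the band shown is not the one relevant to an outlier test.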

[4] Wanted to keep the axes above consistent, but the single prediction error is pretty broad and can only be appreciated if you zoom out a bit. Click to enlarge.



[5] This may be true in a different way than RCA believes — rising US health care costs might be driving up health care costs around the world as we consume all the health care resources (Twitter thread here):


[6] To wit:
America’s mediocre health outcomes can be explained by rapidly diminishing returns to [health care] spending ...
[7] I love this quote:
Conversely, when we look at indicators where America skews high, these are precisely the sorts of [procedures] high-income, high-spending countries like the United States do relatively more of. 
You know, things rich people like to do! Like coronary artery bypasses, hip replacements, knee replacements, and coronary angioplasties. Those are a much more fun use of your disposable income than a trip to Spain!

Sunday, February 9, 2020

How should we approach how people think?

A common objection to the information equilibrium approach I've run into over the years is that economics at the micro level is about incentives or at the macro level how people react to policy changes in the economy. My snarky response to exactly that objection earlier today on Twitter was this:
The approach can be thought of as assuming people are too (algorithmically) complex for us to know how they respond to incentives, as opposed to [the] typical [economics] approach where you not only assume you know how individuals think but write down simplistic equations for it.
Let me expand on what this means in a (slightly) less snarky way. We'll set up a simple scenario where people are given 7 choices (the "opportunity set") and must select one.

The typical approach in economics

In economics, one would set up a utility function u(x) that represents your best guess at what a typical person thinks — how much "utility" (worth or value) they derive from each option. Let's say it looks like this:


You've thought about what people think (maybe using your own experience or various thought experiments) and you assign a value u(x) for each choice x. While I've made the above function overly simplistic, it's still assigning a value to each choice.

You would then set up an optimization problem over the choices, derive the first order conditions (which are basically the derivatives in the various dimensions you are considering), and find the maximum (i.e. the location of the zero of the derivatives, or a point on the boundary of your opportunity set).


That's the utility maximizing point and you find out your (often sole "representative") agent selects choice 5. Often, 100% of agents in your model (or 100% of a single representative agent) will select that choice. Everyone is the same. Sometimes you can have heterogeneous agents, and while each type of agent will make different selections each agent of each type will make the same selection.

Of course we can allow error, and there are random utility discrete choice models [pdf] that effectively allow random choices among the various utility options such that in the end we have e.g. most people choosing 5 with a few 4's or 6's (for 1000 agents):


But basically the approach assumes that when confronted with a choice you are able to construct a really good model — a u(x) — of how a person will respond.

This of course sets up a problem called the "Lucas critique": if you make changes to policy to try and exploit something you learned this way, people can adapt to make your original model (your original utility function) moot. For example, if you make option 5 illegal, the model as is says people will start choosing 4 or 6 in roughly equal numbers. But maybe agents will adapt and choose 2 instead?
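The utility-based setup described above can be sketched in a few lines. The utility function here is made up (a simple peak at option 5), and the random-utility step uses the standard logit form as one common choice; neither is taken from a specific model in the literature.

```python
import math
import random

choices = [1, 2, 3, 4, 5, 6, 7]

def utility(x):
    """Made-up utility function, peaked at option 5."""
    return -(x - 5) ** 2

# Deterministic optimization: every identical agent picks the argmax.
best = max(choices, key=utility)

def logit_probabilities(options, scale=1.0):
    """Random-utility (logit) choice probabilities over the given options."""
    weights = [math.exp(utility(x) / scale) for x in options]
    total = sum(weights)
    return [w / total for w in weights]

random.seed(1)
probs = logit_probabilities(choices)
picks = random.choices(choices, weights=probs, k=1000)  # 1000 agents
counts = {x: picks.count(x) for x in choices}

# "Policy change": make option 5 illegal and keep the same utility function.
allowed = [x for x in choices if x != 5]
banned_probs = dict(zip(allowed, logit_probabilities(allowed)))

print("utility-maximizing choice:", best)
print("counts with noise:", counts)
print("probabilities after banning 5:", {x: round(p, 3) for x, p in banned_probs.items()})
```

With the utility function held fixed, banning option 5 just splits agents evenly between 4 and 6. The Lucas critique is that nothing guarantees `utility` stays fixed once the policy changes.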

The response to the Lucas critique is generally to get ever deeper inside people's heads — to understand not just their utility functions but how their utility functions will change in response to policy, to get at the so-called deep parameters also known as microfoundations.

The approach in information equilibrium

In the information equilibrium approach, when asked what a person will choose out of 7 options, you furrow your brow, look up to the sky, and then give one of these:

‾\_(ツ)_/‾

One agent will choose option 4 (with ex post probability 1):


Another will choose option 1:


If you ask that agent again, maybe they'll go with option 2 now:


Why? We don't know. Maybe they had medical bills between the choices. Maybe that first agent really loves Bernie Sanders and Bernie said to choose option 4. Again:

‾\_(ツ)_/‾

If we have millions of people in an economy (here, only 1000), then you're going to get a distribution over the choices. And if you have no prior information about that choice (i.e. ‾\_(ツ)_/‾), then you're going to get a uniform distribution (with probability ~ 1/7 for 7 choices — about 14%):


In this case, economics becomes more about the space of choices, the opportunity set — not about what individual people are thinking about. And that size of the opportunity set can be measured with information theory, hence information equilibrium (where we equate different spaces of choices). It turns out there is a direct formal mathematical relationship to the utility approach above, except instead of utility being about what individuals value it's about the size of that space of options.
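As a rough illustration of measuring the opportunity set with information theory, this sketch has 1000 agents pick uniformly at random among 7 options and computes the Shannon entropy of the resulting distribution. Using Shannon entropy in bits as the size measure is an assumption made for this example, not a statement of the full information equilibrium formalism.

```python
import math
import random
from collections import Counter

random.seed(2)
n_choices, n_agents = 7, 1000

# Each agent picks with no prior information about the choice: uniformly at random.
picks = [random.randint(1, n_choices) for _ in range(n_agents)]
counts = Counter(picks)
freqs = [counts[c] / n_agents for c in range(1, n_choices + 1)]

entropy_bits = -sum(p * math.log2(p) for p in freqs if p > 0)
max_entropy_bits = math.log2(n_choices)  # entropy of an exactly uniform distribution

print("observed frequencies:", [round(p, 3) for p in freqs])
print(f"entropy: {entropy_bits:.3f} bits (maximum {max_entropy_bits:.3f} bits)")
```

The observed entropy sits just under log2(7), which depends only on the number of available options and not on anything about what the individual agents were thinking.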

In the information equilibrium approach, we depend on two assumptions that set up the basis of equilibrium:

  1. The distribution (and it doesn't have to be uniform) is stable except for a sparse subset of times.
  2. Agents fully map (i.e. select) the available choices (again, except for a sparse subset of times).

The "sparse subset" is just the statement that we aren't in disequilibrium all the time. If we are, we never see the macroeconomic state associated with that uniform distribution and we can't get measurements about it. We have to see the uniform distribution for long enough to identify it. Agents also have to select the available choices, otherwise we'll miss information about the size of the opportunity set.

But information equilibrium also allows for non-equilibrium. Since we aren't making assumptions about how people think, they could suddenly all make the same choice, or be forced into the same choice. These "sparse non-equilibrium information events", or more simply "economic shocks", cause observables to deviate from their equilibrium values. The dynamic information equilibrium model (DIEM) makes some assumptions about what these sparse shocks look like (e.g. they have a finite duration and amplitude), and it gives us a pretty good model of the unemployment rate [1]:


Those 7 choices above are translated into this toy model as jobs in various sectors (with one "sector" being unemployment).

This approach also gives us supply and demand (this is the connection to Gary Becker's 1962 paper Irrational Behavior in Economic Theory [pdf], see also here). We don't have 7 discrete choices here, but rather a 2-dimensional continuum between two different goods (say, blueberries and raspberries) bounded by a budget constraint (black line). The average is given by the black dot. As the price of one good goes up, on average people consume less of it.
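A rough sketch of that supply and demand result, under a loud assumption: agents are spread uniformly over the triangle underneath the budget line, which is my simplification for illustration and not necessarily how the figure was made. The function name, prices, and budget are all hypothetical. For a uniform spread, the average consumption of good 1 is the centroid of the triangle, budget / (3 × price), so it falls as the price rises:

```python
import random


def mean_consumption(price_1, price_2, budget, n_agents=20000, seed=0):
    """Average bundle of agents scattered uniformly under the budget line.

    Uses rejection sampling: draw points in the bounding rectangle and keep
    only those that satisfy price_1 * x1 + price_2 * x2 <= budget.
    """
    rng = random.Random(seed)
    total_1 = total_2 = 0.0
    accepted = 0
    while accepted < n_agents:
        x1 = rng.uniform(0, budget / price_1)
        x2 = rng.uniform(0, budget / price_2)
        if price_1 * x1 + price_2 * x2 <= budget:
            total_1 += x1
            total_2 += x2
            accepted += 1
    return total_1 / n_agents, total_2 / n_agents


# Raise the price of good 1 (say, blueberries) and watch its average fall.
for price in (1.0, 2.0, 4.0):
    avg_1, avg_2 = mean_consumption(price, 1.0, budget=10.0)
    print(f"price {price:.1f}: average good 1 = {avg_1:.2f}, "
          f"centroid value = {10.0 / (3 * price):.2f}")
```

No individual agent is responding to the price here; the downward-sloping average comes entirely from the shape of the opportunity set.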


And again, people might "bunch up" (i.e. make similar choices and not fully map the opportunity set) in that opportunity set and that gives us non-equilibrium cases where supply and demand fails:


In both of these "failures" of equilibrium (recessions, bunching up in the opportunity set), I am under the impression that sociology and psychology will be more important drivers than what we traditionally think of as economics [2].

But what about that "algorithmically" in parentheses in my original tweet? That's a reference to algorithmic complexity. The agents in the utility approach are not very algorithmically complex — they choose 5 either all the time or at least almost all the time:

{5, 5, 5, 5, 5, 5, 5, 4, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5}

This could be approximated by a computer program that outputs 5 all the time. The agents in the information equilibrium approach are far more complex:

{4, 7, 3, 4, 1, 6, 6, 5, 4, 2, 3, 1, 1, 3, 6, 3, 4, 2, 2, 6}

As you make this string of numbers longer and longer, the only way a computer program can reproduce it is to effectively incorporate the string of numbers itself. That's the peak of algorithmic complexity — algorithmic randomness. That's what I mean when I say I treat humans as so complex they're random. No computer program could capture a set of choices a real human made except a list of all those choices.
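Algorithmic complexity itself can't be computed, but a general-purpose compressor gives a crude upper bound on it. Here is a minimal sketch using zlib as a stand-in, which is my substitution for illustration. The sequence length, the seed, and the function name are assumptions:

```python
import random
import zlib


def compressed_size(choices):
    """Bytes needed by zlib to store a sequence of choices (values 1 to 7)."""
    return len(zlib.compress(bytes(choices), 9))


N_CHOICES = 10000
rng = random.Random(0)  # fixed seed, only for reproducibility

utility_like = [5] * N_CHOICES                              # always picks 5
random_like = [rng.randint(1, 7) for _ in range(N_CHOICES)]  # no pattern

print("always 5:", compressed_size(utility_like), "bytes")
print("random:  ", compressed_size(random_like), "bytes")
```

The "always 5" agent compresses down to a few dozen bytes no matter how long the sequence gets, while the random agent's sequence stays in the thousands of bytes, since a compressor can't beat roughly log2(7) bits per choice.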

In a sense, you can think of the two approaches, utility and information equilibrium, as starting from two limits of human complexity — so simple you can capture what they think with a function versus so complex that you can't do it at all. I imagine the truth is somewhere in between, but given the empirical failure of macroeconomics (I call getting beaten by an AR process a failure) it's probably closer to the complex side than the simple side.

And that approach turns economics on its head — instead of being about figuring out what choices people will make, it's about measuring the set of choices made by people.

...

Footnotes:

[1] That's been pretty accurate at forecasting the unemployment rate for three years now (black is post-forecast data):


[2] In fact, I wrote a book about how the post-war evolution of the US economy seems to be more about social changes than monetary and fiscal policy. Maybe it's not correct, but it at least gives some perspective of how different macroeconomic analysis can be from the way it is conducted today.

Thursday, January 2, 2020

"It takes a model to beat a model"


I came across this old gem on Twitter (here), and Jo Michell sums it up pretty well in the thread:
It takes a model to beat a model has to be one of the stupider things, in a pretty crowded field, to come out of economics. ... I don’t get it. If a model is demonstrably wrong, that should surely be sufficient for rejection. I’m thinking of bridge engineers: ‘look I know they keep falling down but I’m gonna keep building em like this until you come up with a better way, OK?’
There are so many failure modes of the maxim "it takes a model to beat a model":

Formally rejecting a model with data. Enough said.

Premature declaration of "a model". It seems that various bits of math in econ are declared "models" before they have been shown to be empirically accurate more often than is optimal. Now empirical accuracy doesn't necessarily mean getting variables right within 2% (although it can) — it can mean 10% or even just getting the qualitative form of the data correct. I have two extended discussions on the failure to do this here (DSGE) and here (Keen). The failure mode here is that something (e.g. DSGE) is declared a model using a lower bar than is applied to, say, cursory inspection of data or linear fits.

Rejecting a model as useless even without formal rejection. I wrote about this more extensively here, but the basic idea is that a model a) can be way too complex for the data it's trying to explain (this inherently makes a model hard to reject because, as a good heuristic, you need ~ 20 or so data points per parameter to make a definitive call, so you can always add parameters and say "we'll wait for more data"), or b) can give the same results as another model that is entirely different (either use Occam's razor, or just give one of these ¯\_(ツ)_/¯ to both models). The latter case can be seen as a tie that goes to no one. Essentially: heuristic rejection.

Rejecting a model with functional fits. Another one I've written more extensively about elsewhere, but if you have a complicated model that has more parameters than a functional fit that more accurately represents the data, you can likely reject that more complicated model. One of the great uses of functional fits is to reduce the relevant complexity (relevant dimension) of your data set. Without any foreknowledge, the dimension d of a data set is on the order of the number n of data points (d ~ n); the worst case is that you describe every data point with a parameter. However, if you can fit that data (within some error) to a function with k parameters with k < d, then you can (informally) reject any model that describes the same data set (within the same error) with p parameters where k < p < d as likely too complex. That functional fit doesn't even have to come from anywhere! (Note, this is effectively how Quantum Mechanics got its first leg up from Planck: lots of people were fitting the blackbody spectrum with fewer and fewer parameters until Planck gave us his one-parameter fit with Planck's constant.)
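The two parameter-counting heuristics from the last two failure modes can be written down as a couple of one-liners. This is just a sketch: the function names and the example numbers are made up, and I'm taking the condition to be k < p < d (the model uses more parameters than the fit but fewer than there are data points):

```python
def enough_data_to_judge(n_points, n_params, points_per_param=20):
    """Rule of thumb: about 20 data points per parameter for a definitive call."""
    return n_points >= points_per_param * n_params


def informally_too_complex(k_fit, p_model, d_data):
    """True when a k-parameter fit matches the data as well as a p-parameter
    model does, with k < p < d (d being on the order of the number of points)."""
    return k_fit < p_model < d_data


# Hypothetical case: 80 data points, a 3-parameter functional fit,
# and a 12-parameter model describing the data within the same error.
print(enough_data_to_judge(n_points=80, n_params=12))            # False
print(enough_data_to_judge(n_points=80, n_params=3))             # True
print(informally_too_complex(k_fit=3, p_model=12, d_data=80))    # True
```

In that made-up case the 12-parameter model can hide behind "we'll wait for more data" (it would want about 240 points), but the 3-parameter fit already gives you grounds to set it aside.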

Failing to accept a model as rejected. One of the most maddening ways the "it takes a model to beat a model" maxim is deployed is by people who just don't accept that a model has been rejected or that another model outperforms it. This is more a failure mode of "enlightenment rationality" which assumes good faith argument from knowledgeable participants [1].

I make no particular argument that these represent an orthogonal spanning set (in fact, the 4th has non-zero projection along the 3rd). However, it's pretty clear that the maxim is generally false. In fact, it's pretty much the converse [2] of a true statement — if you have a better model, then you can reject a model — and as we all learned in logic the converse is not always true.

...

Update 14 January 2020

Somewhat related, there is also the idea that "there's always a least bad model" — to use Michell's analogy, there's always a least bad bridge. But there isn't. Sometimes there's just a shallow bit to ford.

Paul Pfleiderer takes on the compulsion to have something that gets called a "model" in his presentation here:


Making a model that isn't empirically accurate using unrealistic assumptions to make a theoretical argument is basically the same thing as making up data to make an empirical one.

My impression is that this compulsion is deeply related to "male answer syndrome" in the male-dominated field of economics.


...

Footnotes:

[1] Note that this is not necessarily a failure mode of science, which is a social process, but rather the application of that macro-scale social process to individual agents. Science does not require any agent to change their mind, only that on average at the aggregate level more accurate descriptions of reality survive over less accurate ones (e.g. Planck's maxim — people holding onto older ideas die and a younger generation grows up accepting the new ideas). The "enlightenment rationality" interpretation of this is that individuals change their minds when confronted with rational argument and evidence, but there is little evidence this occurs in practice (sure, it sometimes does).

[2] In logical if-then form, "it takes a model to beat a model" is: if you reject a model, then you have a better model.