Friday, May 1, 2020

What's in a name?


That which we call a model by any other name would describe as well ... or not
Shakespeare, I think.

I'm in the process of trying to distract myself from obsessively modeling the COVID-19 outbreak, so I thought I'd write a bit about language in technical fields.

David Andolfatto didn't think this twitter thread was very illuminating, but at its heart is something that's a problem in economics in general — and not just macroeconomics. It's certainly a problem in economics communication, but I also believe it's a kind of a professional economics version of "grade inflation" where "hypotheses" are inflated into "theorems" and "ideas" [1] are inflated into "models".

Now every economist I've ever met or interacted with is super smart, so I don't mean "grade inflation" in the sense that economists aren't actually good enough. I mean it in the sense that I think economics as a field feels that it's made up of smart people so it should have a few "theorems" and "models" in the bag instead of only "hypotheses" and "ideas" — like how students who got into Harvard feel like they deserve A's because they got into Harvard. Economics has been around for centuries, so shouldn't there be some hard won truths worthy of the term "theorem"?

This was triggered by his claim that Ricardian equivalence is a theorem (made again here). And I guess it is — in economics. He actually asked what definitions were being used for "model" and "theorem" at one point, and I responded (in the manner of an undergrad starting a philosophy essay [2]):
the·o·rem 
a general proposition not self-evident but proved by a chain of reasoning; a truth established by means of accepted truths 
mod·el
a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs
I emphasized those last clauses with asterisks in the original tweet (bolded them here) because they are important aspects that economics seems to either leave off or claim very loosely. No other field (as far as I know) uses "model" and "theorem" as loosely as economics does.

The Pythagorean theorem is established from Euclid's axioms (including the parallels axiom, which is why it's only valid in Euclidean space) that include things like "all right angles are equal to each other". Ricardian equivalence (per e.g. Barro) is instead based on axioms (assumptions) like "people will save in anticipation of a hypothetical future tax increase". This is not an accepted truth, therefore Ricardian equivalence so proven is not a theorem. It's a hypothesis.

You might argue that Ricardian equivalence as shown by Barro (1974) is a logical mathematical deduction from a series of axioms — just like the Pythagorean theorem — making it also a theorem. And I might be able to meet you halfway on that if Barro had just written e.g.:

$$
A_{1}^{y} + A_{0}^{o} = c_{1}^{o} + (1 - r) A_{1}^{o}
$$

and proceeded to make a bunch of mathematical manipulations and definitions — calling it "an algebraic theorem". But he didn't. He also wrote:
Using the letter $c$ to denote consumption, and assuming that consumption and receipt of interest income both occur at the start of the period, the budget equation for a member of generation 1, who is currently old, is [the equation above]. The total resources available are the assets held while young, $A_{1}^{y}$, plus the bequest from the previous generation, $A_{0}^{o}$. The total expenditure is consumption while old, $c_{1}^{o}$, plus the bequest provision, $A_{1}^{o}$, which goes to a member of generation 2, less interest earnings at rate $r$ on this asset holding.
It is this mapping from these real world concepts to the variable names that makes this a Ricardian Equivalence hypothesis, not a theorem, even if that equation was an accepted truth (it is not).

In the Pythagorean theorem, $a$, $b$, and $c$ aren't just nonspecific variables, but are lengths of the sides of a triangle in Euclidean space. I can't just call them apples, bananas, and cantaloupes and say I've derived a relationship between fruit such that apples² + bananas² = cantaloupes² called the Smith-Pythagoras Fruit Euclidean Metric Theorem.

There are real theorems that exist in the real world in the sense I am making — the CPT theorem comes to mind as well as the noisy channel coding theorem. That's what I mean by economists engaging in a little "grade inflation". I seriously doubt any theorems exist in social sciences at all.

The last clause is also important for the definition of "model" — a model describes the real world in some way. The Hodgkin-Huxley model of a neuron firing is an ideal example here. It's not perfect, but it's a) based on a system of postulates (in this case, an approximate electrical circuit equivalent), and b) presented as a mathematical description of a real entity.

Reproduced from Hodgkin and Huxley (1952)
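For concreteness, here's a minimal sketch of part a) in code: the circuit-equivalent membrane equation integrated numerically, using textbook parameter values (modern voltage conventions rather than the units in the original 1952 paper).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hodgkin-Huxley circuit equivalent: C dV/dt = I_ext - I_Na - I_K - I_leak,
# with voltage-dependent gating variables m, h, n.
C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3      # membrane capacitance and max conductances
ENa, EK, EL = 50.0, -77.0, -54.4            # reversal potentials (mV)

def rates(V):
    a_m = 0.1 * (V + 40) / (1 - np.exp(-(V + 40) / 10))
    b_m = 4.0 * np.exp(-(V + 65) / 18)
    a_h = 0.07 * np.exp(-(V + 65) / 20)
    b_h = 1.0 / (1 + np.exp(-(V + 35) / 10))
    a_n = 0.01 * (V + 55) / (1 - np.exp(-(V + 55) / 10))
    b_n = 0.125 * np.exp(-(V + 65) / 80)
    return a_m, b_m, a_h, b_h, a_n, b_n

def hh(t, y, I_ext=10.0):
    V, m, h, n = y
    a_m, b_m, a_h, b_h, a_n, b_n = rates(V)
    dV = (I_ext - gNa * m**3 * h * (V - ENa) - gK * n**4 * (V - EK) - gL * (V - EL)) / C
    return [dV, a_m * (1 - m) - b_m * m, a_h * (1 - h) - b_h * h, a_n * (1 - n) - b_n * n]

# A constant drive produces a repeating spike train in V (the neuron "firing")
sol = solve_ivp(hh, (0.0, 50.0), [-65.0, 0.05, 0.6, 0.32], max_step=0.05)
```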
The easiest way to do part b) is to compare with data but you can also compare with pseudo-data [3] or moments (while its performance is lackluster, a DSGE model meets this low bar of being a real "model" as I talk about here and here). *Ahem* — there's also this.

Moment matching itself gets the benefit of "grade inflation" in macro terminology. I'm not saying it's necessarily wrong or problematic — I'm saying a model that matches a few moments is too often inflated to being called "empirically accurate" when it really just means the model has "qualitatively similar statistics".

One of the problems with a lack of concern with describing a real state of affairs is that you can end up with what Paul Pfleiderer called chameleon models — models that are proffered for use in policy, but when someone questions the reality of the assumptions the proponent changes the representation (like a chameleon) to being more of a hypothesis or plausibility argument. You may think using a so-called "model" that isn't ready for prime time can be useful when policy makers need to make decisions, but Pfleiderer put it well in a chart:



But what about toy models? Don't we need those? Sure! But I'm going to say something you're probably going to disagree with — toy models should come after empirically successful theory. I am not referring to a model that matches data to 10-50% accuracy or even just gets the direction of effects right as a toy model — that's a qualitative model. A toy model is something different.

I didn't realize it until writing this, but apparently "toy model" on Wikipedia is a physics-only term. The first line is pretty good:
In the modeling of physics, a toy model is a deliberately simplistic model with many details removed so that it can be used to explain a mechanism concisely.
In grad school, the first discussion of renormalization in my quantum field theory class used a scalar (spin-0) field. At the time, there were no empirically known "fundamental" scalar fields (the Higgs boson was still theoretical) and the only empirically successful uses of renormalization were QED and QCD — both theories with spin-1 gauge bosons (photons or gluons) and spin-½ fermions (electrons or quarks). Those details complicate renormalization (e.g. you need a whole different quantization process to handle non-Abelian QCD). The scalar field theory was a toy model of renormalization of QED — used in a class to teach renormalization to students about to learn QED, a theory that had already been shown to be empirically accurate to around ten decimal places.

The scalar field theory would be horribly inaccurate if you tried to use it to describe the interactions of electrons and photons.

The problem is not that many economic "toy models" are horribly inaccurate, but rather that they don't derive from even qualitatively accurate non-toy models. Often it seems no one even bothers to compare the models (toy or not) to data. It's like that amazing car your friend has been working on for years but never seems to drive — does it run? Does he even know how to fix it?

At this stage, I'm often subjected to all kinds of defenses — economics is social science, economics is too complex, there's too much uncertainty. The first and last of those would be arguments against using mathematical models or deriving theorems at all, which a fortiori makes my point that the words "model" and "theorem" are inflated from their common definition in most technical fields.

David's defense is (as many economists have said) that models and theorems "organize [his] thinking". In the past, my snarky comment on this has been that economists must have really disorganized minds if they need to be organizing their thinking all the time with models. Zing!

But the thing is we have a word for organized thought — idea [4]:
i·de·a 
a formulated thought or opinion
But what's in a name? Does it matter if economists call Ricardian equivalence a theorem, a hypothesis, or an idea? Yes — because most humans' exposure to a "theorem" (if any) is the Pythagorean Theorem. People will think that the same import applies to Ricardian Equivalence, but that is a false equivalence.

Ricardian Equivalence is nowhere near as useful as the Pythagorean Theorem, to say nothing about how true it is. Ricardian Equivalence may be true in Barro's model — one that has never been compared to actual data or shown to represent any entity or state of affairs. In contrast, you could right now with a ruler, paper, and pencil draw a right triangle with sides of length 3, 4, and 5 inches [5].

I hear the final defense now: But fields should be allowed their own jargon — and not policed by other fields! Who are you fooling? 

Well, it turns out economists are fooling people — scientists who take the pronouncements of economics at face value. I write about this in my book (using two examples of E. coli and capuchin monkeys):


We have trusting scientists going along with rational agent descriptions put out there by economists when these rational agent descriptions have little to no empirical evidence in their favor — and even fewer accurate descriptions of a genuine state of affairs. In fact, economics might do well to borrow the evolutionary idea of an ecosystem being the emergent result of agents randomly exploring the state space.

...

PS

My "to be fair" items so that I'm not just "calling out economics" are "information" in information theory and "theory" in physics. The former is really unhelpful — I know it's information entropy, but people who know that often shorten it to just "information", and people who don't know that think information is like knowledge — despite the fact that information entropy is maximized for e.g. random strings.

In physics, any quantum field theory Lagrangian is called a "theory" even if it doesn't describe anything in the real world. It is true that the completely made up ones don't get names like quantum electrodynamics but rather "φ⁴  theory". If it were economics, that scalar field φ would get a name like "savings" or "consumption".

...

Footnotes:

[1] I had a hard time coming up with the word here — my first choice was actually "scratch work". Also "concepts" or "musings".

[2] ... at 2am in a 24 hour coffee shop on the Drag in Austin.

[3] "Lattice data" (for QCD) or data generated with VAR models (in the case of DSGE) are examples of pseudo-data.

[4] Per [1], this is also why I thought "concept" would work here:
con·cept

something conceived in the mind
[5] This is actually how ancient Egyptians used to measure right angles — by creating 3-4-5 unit triangles [pdf].

Friday, April 24, 2020

Seven years later ...

On the 7th anniversary of this blog, we are finding ourselves in the midst of a deadly pandemic and the biggest macroeconomic shock since possibly the Great Depression. I hope everyone out there is staying healthy, practicing good mitigation, and still has a job.

The next seven years on the blog are going to be different — gone will be the days of tracing the path of the macroeconomic equilibrium, replaced with following the first non-equilibrium shock since the information equilibrium framework was formalized. Will we see a sharp rise in unemployment followed by the typical decline we've seen over the past century in US data? Will there be a step response? I hope the economy recovers from this shock faster than it has in the past, but I am not optimistic.


...

PS The post title is a MST3K reference to "The Final Sacrifice". Here's to wondering if there is beer on the sun.


Sunday, April 12, 2020

What does this physicist think of economists?**

I have had fringe contact with more macroeconomics than usual as of late, for obvious reasons (e.g. I have been producing macroeconomic models that outperform mainstream models by orders of magnitude), and I do understand this is only one corner of the discipline. I don’t mean this as a complaint dump, because most of physics suffers from similar problems due to being a similarly male-dominated field, but here are a few limitations I see in the mainstream economic models put before us:

1. They do not sufficiently grasp that social forces and unpredictable human nature are more powerful than economic forces and “rational agents”. In the short run you try economic stimulus, but in the long run you learn that not giving Republicans cover to dismantle democracy through “public choice” protects you the most. Or you move from doing “unemployment insurance” to “paying companies to keep people on the payroll” once you get that job search and matching is driven more by social relationships than economic theory. In this regard the economic models end up being too pessimistic about human brains (reduced to a 1-dimensional utility function!), and it seems that “the econophysics complaints about the economists” (yes there is such a thing) are largely correct on this count. On this question econophysics models (e.g.) really do better, though not the models of everybody.

2. They do not sufficiently incorporate people's humanity. An economic stimulus plan, for instance, may be freakishly amoral, which leads to adjustments along the way, and very often those adjustments are stupid policy moves suggested by impatient billionaires. This is not built into the economic models I am seeing, even though there is a large independent branch of sociology research. It is hard for them to understand, I guess? Still, it means that economic models will be too alien, rather than too human. Economists might protest that it is not the purpose of their science or models to incorporate social change and morality, but these factors are relevant for prediction, and if you try to wash your hands of them (no Easter pun intended) you will be wrong a lot.

3. They do not sufficiently grasp the concept of scope — specifically the part that tells us that effective theories of the same system at different scales may have little relationship to each other at leading order — so much so that they may have incommensurate domains of validity. Economists seem super-unaware of this — certainly much less aware of it than physicists are these days — though it seems to be more of a “la-la-la-I-can't-hear-you” pursuit of tractable macro models aggregating “rational agents” than earnestly trying to understand the complex system they are purportedly researching. That is really hard, either in physics or economics. Still, on the predictive front without a good understanding of scope and scale a lot will go askew, as indeed it does in economics.

The economic models also do not seem to incorporate Richard Feynman-like bias offset techniques. Don't fool yourself, and you're the easiest person to fool! But economists still feel like opining about subjects well outside their domain of expertise without considering that their political priors may strongly influence their ideas. Some of their “ideas” are shown to be horribly misguided through the subsequent scrutiny. Economists might claim these factors already are incorporated in the variables they are modeling, since they claim to incorporate human behavior. Ideally you may wish to incorporate the past work of the modeler themselves (i.e. the past light cone of the observer's causal wavefunction) in the model's Bayesian prior probability, so that they do not see everything as a nail when all they have is a hammer. I have not yet seen a Dunning-Kruger-aware dimension in economic models, though you might argue many economists are “Dunning-Kruger” in their public rhetoric, blurting out what they think is good for us rather than actually learning about it first. The institutional modesty of physicists (whole theories are predicated on the principle that “we are not special in the universe”) is slightly subtler.

4. Selection bias from the failures coming first. The early macroeconomic models were calibrated from the Great Depression, because what else could they do? Then came the Great Recession, which was also a mess. It is the messes which are visible first, at least on average. So some of the models may have been too pessimistic at first. These days we have Japan, South Korea, and a bunch of Nordic states that haven't quite “blown up” with the several million initial unemployment claims and literal Depression-era food lines we see here. If the early models had access to all of that data, presumably they would be more predictive of the entire situation today. But it is no accident that the failures (like Richard Epstein) will be more visible in the media early on.

And note that right now some of the very worst countries (United States, possibly the United Kingdom?) are not far enough along on the data side to yield useful inputs into the models. So currently those macro models might be picking up too many semi-positive data points of functioning governments and not enough from failed states or “train wrecks,” and thus they are too optimistic.

On this list, I think my #1 comes closest to being an actual criticism, the other points are more like observations about doing science in a messy, imperfect world. In any case, when economic models are brandished, keep these limitations in mind. But the more important point may be for when critics of economic models raise the limitations of those models. Very often the cited criticisms are chosen selectively, to support some particular agenda, when in fact the biases in the economic models almost certainly run in one direction — towards the interests of billionaires (see a. below).

Which is how a lot of macro men think it should be.

Now, to close, I have a few rude questions directed at economists that nobody seems willing to publicly acknowledge, but actually we all already know the answers to them:

a. As a class of scientists, how much are economists paid by vested interests (e.g. GMU/Mercatus, Hoover Institution, Cato)? Is being wrong or right better for their salaries?

b. How smart are they? What are their average GRE scores?

c. Are they hired into thick, liquid academic and institutional markets? Or does it take five years to publish a paper? And how meritocratic are those markets? Is it just people from five schools who are allowed to get jobs or publish?

d. What is their overall track record on predictions, whether before or during this crisis?

e. On average, what is the political orientation of economists? And compared to other academics?  Do they use the market social welfare function when they make non-trivial recommendations?

f. We know, from physics, that if you are a French physicist, being a Frenchman predicts your space-time location better than does being a physicist (there is an old PRL paper on this somewhere). Is there a comparable phenomenon in economics?

g. How well do they understand how to model any system, relative to say what an undergrad physics major would know?

h. Are there “defunct economists” in the manner that John Maynard Keynes charged there are “defunct economists”? If so, what do you have to do to earn that designation? And are the defunct sometimes right, or right on some issues? How meta-rational are those who allege defunct-ism? Are they meta-meta-rational? How about meta-meta-meta-rational?

i. How many of them have studied Douglas Hofstadter’s now 40 year old meta-work on emergence and meta-fiction? Meta.

Just to be clear, as ITE readers will know, I have not been criticizing the mainstream macroeconomic recommendations of stimulus. But still those seem to be questions worth asking.

...

** PS This is a mix of parody (because it's risible) and critique (because economics doesn't really work that well compared to even epidemiology) of this.

PPS #NotAllEconomists

PPPS Made a couple edits and slight changes (references to public choice theory, Japan).

PPPPS Update: Cowen is now saying the "debate" is becoming "emotional". That a) is exactly one point I am making here — his preferred approach to economics lacks empathy, morality, and humanity, and b) is what purportedly "rational" men often say about women, which is another.

People are literally lining up in Depression-era food lines, and Tyler wants to debate whether or not epidemiology journals should be colonized by economists.

PPPPPS I do want to emphasize that this is a parody — a physicist adopting the same self-regard and sneering tone Cowen shows towards epidemiology (but with the additional layer of irony being that physicists have produced a lot more empirically accurate theories than macroeconomists have). I think a lot of economists do good work. Unfortunately, a lot of economists (especially those more right & libertarian leaning) need to learn to, in the words of Kendrick Lamar, "be humble / sit down".

Tuesday, April 7, 2020

JOLTS data — and the twig crack that caused the avalanche?

Back from a long hiatus — things were crazy at the real job trying to get set up to work from home for a month or longer. Happy to report my family and I are doing well, and I hope everyone out there is staying healthy.

The drop in the JOLTS job openings rate I noted in the previous post (from February) has continued and it appears we're showing a definite deviation:


While you may be thinking "Yes, the COVID-19 shock", I should point out that this data is from February 2020 — and the deviation starts with data from December 2019. As I put it in a tweet about last month's data: What if there was a recession brewing and COVID-19 just triggered it, like the old trope of a tree branch breaking causing an avalanche?

I saw that in the 2008 recession the JOLTS measures were some of the earliest indicators in the labor market, with job openings leading the shock to the unemployment rate by 4-6 months. That was based on a single shock, but the hires data averages about a 5 month lead using multiple shocks (in both directions) from the 1990s recession to today.

And last month's unemployment rate showed the first signs of a non-equilibrium shock with March 2020 data by either the Sahm rule or my "recession detection algorithm" threshold:


December 2019 to March 2020 is 4 months — right in line with the previous recession.
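For those following along, here is a minimal sketch of the Sahm rule trigger, assuming a monthly unemployment rate series like FRED's UNRATE (my own "recession detection algorithm" threshold is a separate thing and isn't reproduced here):

```python
import pandas as pd

def sahm_trigger(unrate: pd.Series, threshold: float = 0.50) -> pd.Series:
    """Sahm rule: flag months where the 3-month average unemployment rate is at
    least `threshold` percentage points above its low over the prior 12 months.
    (Exact window conventions differ slightly across implementations.)"""
    three_mo = unrate.rolling(3).mean()
    prior_low = three_mo.shift(1).rolling(12).min()
    return (three_mo - prior_low) >= threshold

# Usage sketch: with `unrate` a monthly U-3 series (e.g. FRED's UNRATE) indexed by
# date, sahm_trigger(unrate) marks the months where the 0.5 pp threshold is crossed.
```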

Now I understand it seems odd — how could JOLTS data predict a pandemic? Or as I put it in my twitter thread referenced above — how could the yield curve predict a pandemic? Even the "limits to wage growth" [1] hypothesis predicts a recession!

But in this view, the pandemic was just a coordinating signal. Often, these coordinating signals come from the Fed — an interest rate hike, lack of a cut, or even letting a financial institution fail — and coordination causes recessions (we all cut back on spending, we all sell our stocks, etc). Because the pandemic signal was so sudden and so unambiguous, we got a much sharper signal in the unemployment rate than usual and a bit of a compressed period between JOLTS and unemployment. For example, total separations is only barely registering a signal (it's there) while hires shows nothing yet (click to enlarge):


COVID-19 was the twig crack that caused an avalanche that was already building.

I've seen that some people think the recovery will be rapid. I doubt this because we are seeing a shock to the labor market — for example, initial claims spiked into the millions. A typical "surprise information shock" that evaporates has a distinct pattern:


It would look something like the red dashed line in this graph of S&P 500 data (I also show a non-equilibrium shock the size of the 2008 recession as a counterfactual recession path for reference):


However, unemployment is already rising and it falls at basically the same rate over the entire history of the data. This "remarkable recovery regularity" became the basis for the dynamic information equilibrium model (first here, then here). This implies that we are unlikely to see a sudden shift back to low unemployment but rather something more like this:


I added a step response (i.e. "ringing artifacts" or overshooting) to this qualitative non-equilibrium shock because the shock seems pretty sharp; however, it is possible it won't happen, as the step response has been gradually disappearing over time in US data. It's possible it won't be this big — though some people like James Bullard are saying 30% is possible, so it might be even bigger. But even the rise to 4.4% already in the data will take 3 years to get back to 3.5% along the dynamic equilibrium path.
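For reference, the functional form behind these DIEM graphs is just a constant slope in log space plus logistic shocks. Here's a minimal sketch with purely illustrative (not fitted) parameters:

```python
import numpy as np

def diem_unemployment(t, alpha, u_ref, t_ref, shocks):
    """Dynamic information equilibrium form: log u(t) is a straight line with
    slope alpha (the dynamic equilibrium) plus logistic 'shock' steps."""
    log_u = alpha * (t - t_ref) + np.log(u_ref)
    for a, b, t0 in shocks:            # amplitude (log points), width (years), center (year)
        log_u += a / (1.0 + np.exp(-(t - t0) / b))
    return np.exp(log_u)

# Illustrative, not fitted: ~3.5% unemployment entering 2020 with the usual slow
# equilibrium decline, plus one sharp positive COVID-like shock. The step response
# ("ringing") shown in the graph above is left out of this sketch.
t = np.linspace(2015.0, 2025.0, 1000)
u = diem_unemployment(t, alpha=-0.09, u_ref=3.5, t_ref=2020.0,
                      shocks=[(1.2, 0.05, 2020.3)])
```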

It's going to be a long slog.

...

Footnotes:

[1] In the past several decades, when wage growth exceeds the nominal GDP growth trend, there has generally been a recession.

Saturday, February 29, 2020

Market updates for a bad week

Now I don't really look at the information equilibrium models for markets as particularly informative (looking at the error band spreads should tell you all you need to know about that) — so this should all be taken with a grain of salt. And always remember: I'm a crackpot physicist, not a financial adviser.

The stock market and recession shocks

With that out of the way, here's what the recent drop in the markets looks like on the S&P 500 model:


Curiously, we seem to be back in the post-Tariff equilibrium after the past few months of out-performing that expectation. We are at the edge of the 90% band estimated on post-recession data (blue), and entering into the 90% band estimated on the entire range of data since 1950 (lighter green).

I should also note that in the past, the recession process has never really been a single drop straight down. It's a series of drops over the course of months with some moments of recovery:


That does not mean the recent drop is not the start of such a series, just that it's entirely possible this could turn around.

Is this a prelude to a recession? Maybe, maybe not. For one thing, it will be different from the past two "asset bubble era" recessions (dot-com in 2001 and housing in 2008) given there's no discernible asset bubble in recent GDP data:


It would be an example of a non-Minsky recession! (At least if this isn't some kind of shadow economy crisis — the housing boom didn't show up in the stock market, but did show up in GDP, while the dot-com boom showed up in both.)

We are basically reaching my "limits to growth" hypothesis — that recessions happen when wage growth exceeds GDP growth, thus eating into profits. (In fact, that interacts with the asset bubbles as the asset bubbles boost GDP allowing wage growth to go higher than it would have without the bubble.) This graph shows the DIEM trend of GDP (blue line), the wage growth DIEM (green line) as well as projected paths (dashed) and a recession counterfactual (dotted green).


But there's another aspect that I've been carrying along since this blog started — that spontaneous drops in "entropy" (i.e. agents bunching up in the state space by all doing the same thing, like panicking) are behind recessions. These spontaneous falls, by the way, make economics entirely different from thermodynamics, where they're disallowed by the second law (atoms don't get scared and cower in the corner). This human social behavior would ostensibly be triggered by news events that serve to coordinate behavior — unexpected Fed announcements, bad employment reports, yield curve inversion, or in this case a possible global pandemic. Or all of the above? I imagine if the Fed comes out of its next meeting in March with no interest rate cut, it might make a recession inevitable.

Interest rates and inversion

The 10-year rate is back at the bottom of the range of expected values from this 2015 (!) forecast:


And the interest rate spreads are trending back down and inverting again:


One thing to note is that while the median daily spread data (red) dipped into the range of turnaround points seen before a recession in the past three recessions several months ago, the monthly (average) spread data (orange) did not go that low (n.b. the monthly average is what I used to derive the metrics). We also didn't see interest rates rise into the range seen before a recession (which tends to be caused by the Fed lowering interest rates in the face of bad economic news). An inversion or near miss followed by another inversion is not exactly unknown in the time series data, either.

JOLTS openings and the market

The latest JOLTS job openings data released earlier this month (data for December 2019) showed a dramatic drop even compared to the 2019 re-estimate of the dynamic equilibrium rate:


This appears to be right on schedule with the market correlation I noticed a few months ago:


The recent drop (December 2019) matches up with the drop at the beginning of that same year (Jan 2019) in the aftermath of the December 2018 rate hike. If this correlation holds up, the job openings rate will rise back up and then fall again in January 2021 (about 11 months after Feb 2020). But the other possibility is that this is the first sign of a recession — the JOLTS measures all appear to lead the unemployment rate (via e.g. Sahm Rule) as an indicator. However, JOLTS is noisier (which requires a higher threshold parameter) and later than the unemployment rate. JOLTS comes out 2 months after the data it's representing (the data for December 2019 came out mid-February 2020, while the unemployment rate for December 2019 came out the first week of January 2020), so whether it's a better indicator than the unemployment rate remains to be seen — there's a complex interplay of noise, data availability, and revisions (!) that makes me think we should just stick to Sahm's rule.

I'll be looking forward to the (likely revised!) JOLTS data coming with the Fed's March meeting.

...

Update 8 March 2020

I updated the graphs with a few more days of data, including a Fed rate cut (visible in the spread data). Click to enlarge:





Friday, February 28, 2020

Dynamic equilibrium: health care CPI

Since I've been looking at a lot of health care data recently, I thought I'd run the US medical care CPI component through the dynamic information equilibrium model (DIEM). It turns out to have roughly the same structure as CPI overall (click to enlarge):


A big difference is that the dynamic equilibrium growth rate is α = 0.035/y, basically a full percentage point above the rate for all items, α = 0.025/y. Since that gap has been at play since the 1950s (at least), medical prices have risen by roughly a factor of two relative to prices as a whole over that period.
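Roughly, compounding that one percentage point gap over the ~70 years since the early 1950s gives the factor of two:

$$
e^{(0.035 - 0.025)/\text{y} \,\times\, 70\ \text{y}} = e^{0.7} \approx 2
$$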

I was curious — was the US an outlier (lol)? I ran the model over a bunch of HICP health data from Eurostat (whose UI is at best silly, at worst pathological) for several countries (Sweden, France, the Netherlands, Switzerland, Germany, the UK, Estonia, Italy, Turkey, Denmark, and Spain). This is definitely a graph you have to click to enlarge:


They're all remarkably similar to each other except France, which came out with an equilibrium rate of α = 0. That could be wrong due to the recent data being in the middle of a non-equilibrium shock — time will tell.

I also compared the dynamic equilibrium to the DIEM model of the CPI (HICP) for all items for each country which produces an interesting plot:


It looks like the US is not much of an outlier on that graph — but that's a bit misleading since the possible inflation rates can't really deviate too much above the diagonal y = x line, otherwise headline inflation (i.e. all the components) would rapidly be overtaken by health care price inflation (one of those components). In fact, nearly every country came out with a health care rate basically equal to the headline rate. You can see it if we plot the difference versus the headline CPI rate:



Most of the countries are clustered right around zero, with outliers being the US and France. France is an outlier because its health care price inflation has been basically zero for the past decade meaning the difference graphed above is essentially the negative of the inflation rate of about 2%. The US is an outlier in the other direction — by 4 standard deviations if we leave out France and the US in the estimate of the distribution.

If this is correct, US health care prices rise nearly a percentage point faster than prices overall, meaning they are nearly 30% higher today from growth over the period 1996-2020 (the Eurostat data range) than they would be if they had grown like any other country's. And these are prices — not income, profits, or consumption.
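That ~30% is the same compounding arithmetic over the 24-year Eurostat window:

$$
e^{0.01/\text{y} \,\times\, 24\ \text{y}} = e^{0.24} \approx 1.27
$$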

...

Appendix

Here are all the individual graphs. The dashed lines are guides at the dynamic equilibrium rate, but in some of the graphs they appear well away from the curves. That's because in some cases the different levels add the shocks in different ways (e.g. the 2nd shock is positive but the 3rd shock is negative), so they add or subtract differently for each country. Couple that with the fact that I determine the sign of a shock in the parameter estimation from the sign of the width, not the amplitude — but set the sign by hand for the dashed line guides — and, well, you see the result: random dashed lines appearing across the graphs (albeit with the right slope). Anyway, click to enlarge.












Sunday, February 23, 2020

A future response from Random Critical Analysis?

I'll be addressing the sole constructive, objective argument you made in that blog post within the next few days.  I'll try to be respectful and fair, even though you seem to be unable or unwilling to reciprocate.
That's regarding my previous post. There were quite a lot of right wing, conservative, and libertarian Twitter accounts (amazingly, groups that don't like government health care) who seemed to think that I was being a jerk, disrespectful, or making ad hominem arguments — and that somehow reflected on what I was saying. I pause to note that saying I'm being a jerk and using that to somehow dismiss what I was saying is itself argumentum ad hominem. That aside, no one is owed someone else's respect. 

I was pointing out that we should not assume RCA is competent — hard to imagine that ever coming across as respectful to RCA! And it's true that is ad hominem! But expertise in stats as well as using software to run regressions was very much at issue. He is making arguments (and mistakes!) that are unsupportable for reasons that have to do with the details of how regressions work.

There are also additional reasons to believe RCA is entirely biased (he called me a socialist, lol), and his first response was that I was attacking his analysis because he disagreed with my politics. Overall, since he didn't disclose who he is or his politics — and since we can gather from his statements that his politics agree with the finding that health care spending in the US is perfectly normal — he really needs to do a lot more "leaning over backwards" because of that bias and failure to disclose.

I'm sorry to say models and statistics can be manipulated in multiple ways. And people have a way of finding out how to present results that agree with their preconceived ideas. People have a tendency to believe that regressions don't lie, but just like photographs and film editing there are lots of things you can do to present something that is not completely honest (even if it's superficially correct) — and that's especially effective on people who aren't well versed in stats (just like people who don't work with Photoshop aren't as good at spotting alterations in pictures).

If we find some paper out there touting the benefits of free market economics from the Cato Institute or Mercatus Center, we all have reason to be suspect. I'm not saying it's something they can't overcome — I've actually cited a Mercatus fellow in my book. But it requires some extra leaning over backward (e.g. a Mercatus fellow writing in their paper "I know you're reading this and thinking of course someone at Mercatus found this, but ..." and going on to explain). We don't even know who RCA is which obscures our ability to do this kind of due diligence.

With that out of the way, what did he think the "sole constructive, objective argument" was?

The presumably substantive part of his argument is little more than a quibble over choice of model.  Even if he were objectively correct in this, which he isn’t, it leaves the vast majority of my arguments intact.  He just makes himself look like an asshole.
Someone preferring a different model than you is not a priori evidence of bad faith.  You're coming into this w/ the presumption that you know the one true way to do this, ignoring that this approach is quite common, and without having the perspective of the rest of the evidence.
Apparently, he's a double space after a period person. Ugh. Unfortunately, this is incorrect in multiple ways.

First, if I were to write a basic math book but started it off with a whole table of incorrect addition facts, like 2 + 2 = 5, you would have no reason to continue. This doesn't leave the rest of the book "intact". There is just no reason to continue past the first point because it's fundamentally flawed, and the rest can be presumed to be as flawed.

Second, while "this approach is quite common", it is not used to extrapolate a nonlinear coefficient that far outside the range of the data (see the 2/22 update, part II of my blog post for more details). The difference of logs ceases to be a percentage for more than ~ 10% differences, and while you might extrapolate an elasticity in economics you don't do it for a factor of 2 difference from the data you used to estimate it.

Third, I do not think the linear model is a better model and never said that [1] — I said it was indistinguishable from the nonlinear one he uses over the non-US data and results in a different conclusion in terms of the US being an outlier. If I have two models that are equally valid and one implies one conclusion and one implies another, it's really a toss-up. You don't draw either conclusion. At least if you're being scientific. Leaning over backwards requires you to support both conclusions or neither.

It's selecting one or the other without saying both are equally good that's evidence of bad faith. The reason RCA claims the nonlinear fit is better is because it makes it such that the US is not an outlier — which is the conclusion he is trying to find. That is to say he has to assume the US is not an outlier in order to select his nonlinear specification. I think this point is lost on a lot of people rising to RCA's defense [2].

But I discovered something else about why he chose the nonlinear model that goes to RCA's incompetence here. He appears to have selected that nonlinear model because its R² is better than that of the linear model. This is hilariously wrong. RCA was apparently complaining about being blocked by a statistics professor who complained about the exact same thing I am complaining about [3]:


The tweets RCA was blocked for? These:


R² is not really a valid metric for nonlinear models aside from polynomials. This is a great simple explanation from a statistics consultant and author. I'm not sure if RCA is using a polynomial here or a nonlinear model (such as a log regression as in the graph under discussion in my previous post). He's used polynomials in the past, and R² makes sense for those. But the thing is that R² is a monotonically increasing function of the number of variables, so comparing a linear model and a second order polynomial, the latter will always win. That's what the stats professor ("Statistical Ideas") is saying — it's understandable why he'd block someone who obviously doesn't understand what he's talking about.

But here RCA is also comparing R² for a linear model against either a nonlinear model or a polynomial model, which is meaningless either because a) a linear model and a 2+ order polynomial are not commensurate, or b) because R² for a nonlinear model is not valid.

I'm not optimistic about RCA actually responding to my criticisms. I imagine he'll do something like the graphs above — comparing a nonlinear R² to a linear one — which will just demonstrate his incompetence further. Such is the problem with the Dunning-Kruger effect.

...

Update 29 February 2020

Here's an illustration of the fact that R² increases monotonically with additional orders in a polynomial fit, alongside another thing I've mentioned in passing that I'll discuss first — the measure of AIC (actual individual consumption) RCA uses for the x-axis includes health care. That is to say he is graphing y versus x = y + z. If you do that with random data, you can easily get what looks like a linear correlation:


This is why Lyman Stone's and Karl Smith's belief (discussed in [2] below) that any proportional relationship validates the claims is completely wrong. It can arise purely because health care is a component of AIC.
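Both points are easy to reproduce with synthetic data. A minimal sketch (nothing here depends on the actual health care numbers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Point 1: regress y against x = y + z with y and z completely independent --
# you still get a strong apparent correlation, purely because y is a component of x.
y = rng.normal(size=200)
z = rng.normal(size=200)
x = y + z
print("corr(y, y + z) =", np.corrcoef(x, y)[0, 1])   # ~0.7 for equal variances

# Point 2: R^2 never decreases as you add polynomial orders, whether or not
# the higher orders mean anything.
def r_squared(x, y, order):
    resid = y - np.polyval(np.polyfit(x, y, order), x)
    return 1.0 - resid.var() / y.var()

for order in range(1, 6):
    print(f"order {order}: R^2 = {r_squared(x, y, order):.4f}")
```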

And if we fit progressively higher order polynomials to this data, we get the (well known) result that R² increases monotonically with the order of the polynomial:


...

Footnotes:

[1] I did say it was better in terms of absolute error, but that's just a single metric. Otherwise I said "Over the non-US data, the linear fit (brownish dashed line) is basically as good as the nonlinear fit ... And over the entire range, the nonlinear fit falls inside the 90% confidence limits of the linear model" (emphasis added)

[2] Another weird defense was that finding any proportional relationship, linear or not, proves that health care expenses rise with income. The thing is the data set includes a lot of developing countries. I could see health care expenses rising with income for Mexico or Latvia.

And if that were the case, why stand up for RCA's specific proportional relationship? Not one person who made the claim that any proportional relationship proved the point also said that RCA's analysis was garbage. It was more of a rhetorical tack than a logical one. Were Karl Smith and Lyman Stone agreeing that RCA's analysis was garbage?

"So what if this analysis is garbage, anyone who says it's proportional proves the underlying point of the garbage analysis!" 

If I had a study that said that minimum wages reduced employment and you came along with a critique that said my analysis made major errors and should have resulted in a much smaller effect if the math was right, I don't get to come back and say "See, there's still an effect!"

[3] RCA has a habit of basically ignoring other people's expertise when it disagrees with him.

Friday, February 21, 2020

Leaning over backwards: health care edition

It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty—a kind of leaning over backwards.  For example, if you’re doing an experiment, you should report everything that you think might make it invalid—not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked—to make sure the other fellow can tell they have been eliminated. 
I need to start off with what should become an obligatory statement noting that while Feynman was pretty good about describing what it means to participate in the process of science, he was also a sexist jerk and a prime example of toxic masculinity.

Anyway, I can't remember how I came across it a year ago (I think via Steve Roth), but the anonymous "random critical analysis" [RCA] is back in the econoblogosphere (it's still around, I swear!) — this time amplified by Alex Tabarrok at Marginal Revolution (not going to link). He's still peddling his wares. He claims:
Health spending is overwhelmingly determined by the average real income enjoyed by nations’ residents in the long run.
Emphasis in the original. I think it's a good example of how relying on an internet rando [0] to do quantitative analysis on major policy issues can lead us astray. It's also a good case study in how to identify the sometimes subtle choices that end up ensuring the conclusions as well as the sometimes even subtler hints that people are overselling their competence.

I have no idea why he emphasized "residents" because the rest of the sentence isn't exactly that precise. You could dig into the PPP adjusted AIC et cetera, but I'd first like to focus on the "determined". The next graph in the extended blog post is a log-linear regression that actually has no causal interpretation — health care spending is "determining" income as much as income is "determining" health care spending:

Just because you chose the x-axis to be income and the y-axis to be health care spending and performed a regression does not mean you've found a causal relationship that "determines" things. Certainly, there is likely some relationship! Per capita income of a country seems like a plausible variable in a model of health care expenses. In fact, it's what you'd expect if health care was becoming more and more unaffordable as costs outstrip the ability to pay for them (a mechanism that seems to be at play for housing prices in the US). RCA's big innovation is saying that instead of this being a problem, this is what people want.

RCA makes a big point about using log-log graphs in the opening paragraphs, but one of the problems here is that he shows only the log-log version of this graph because — as we'll see below — the linear version looks pretty silly. We'll also see that lines on log-log graphs help conceal the choice (and it is a deliberate choice) of a nonlinear function. However, there's also this: 
In case you’re not already aware, these slopes [on log-log plots] can be readily interpreted in percentage terms. ... For example, the slope in the health expenditure-income plot below implies that a 1% increase in underlying value on the x-axis (income) predicts a 1.8% increase in value on the y-axis (health spending).
In reading this, I'm pretty sure that RCA doesn't actually understand that the 1.8 in that slope of the log-log graph means the growth rate of health spending is 1.8 times the growth rate in income. Sure, if incomes rise 1%, then health care will rise 1.8%. But if incomes rise 5%, then health care will rise 9%. Taking the difference in logs, you can substitute a (small) percentage in for x and get a percentage out as y, but you can't interpret the slope as a percentage unless that percentage is 180%. It's subtle, but it's the kind of thing you look for when teaching students because it helps you see if they understand the material or are just mechanically reproducing results.

I mean, in case you're not already aware [1] ...
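Spelled out with the fitted slope:

$$
\log y = b + 1.8 \log x \quad \Rightarrow \quad \frac{y_{1}}{y_{0}} = \left(\frac{x_{1}}{x_{0}}\right)^{1.8}
$$

A 1% rise in income gives $1.01^{1.8} \approx 1.018$ (about 1.8%), a 5% rise gives $1.05^{1.8} \approx 1.09$ (about 9%), but a 50% rise gives $1.5^{1.8} \approx 2.07$, i.e. about 107% rather than 90%. Reading the slope as a percentage only works for small changes.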

Ok, next I have a bit of a nit — but it's kind of a theme of general sloppiness with RCA [2]. At that footnote we can learn a bit about orthogonal polynomials, but here we can learn a bit about significant figures. I was able to reproduce the graph above, but the equation in the annotation (like in the case in footnote [2]) gives an entirely different result (click to enlarge):


Blue is RCA's curve, gray dashed lines are my nonlinear model lines (either fit myself, or as above, using the equation RCA wrote down or rounding), and red is the data except for the US which is in green matching the original graph.

RCA rounded the two coefficients accurately (albeit to different orders), but it's really obvious we're on logarithmic axes when rounding from 9.88 to 10 moves you entirely off the data. RCA makes the claim that the fit is robust to leaving out the US, and if you look at the equations (especially if rounded by RCA's method) they are pretty close:

y = − 9.88 + 1.77 x          with the US

y = − 9.68 + 1.75 x          without the US

And that new equation (dashed gray) even looks pretty close on that log plot:


As an aside, this also tells us that RCA decided to do his analysis of the US not being an outlier in healthcare spending by including the US in the fits. That's just not how that works. Anyway, transforming back to linear space, we've gone down by 10% at US levels of income:


At 17% of GDP these days, a 10% reduction in health care spending in the US (~ 2% of GDP) would be a significant improvement! However, the US still isn't an outlier in this view — only about as much as Ireland. That's because this view is still effectively "a quadratic fit for no reason" that RCA was on about a year ago. If

y = a x² 

then

log(y) = log(a) + 2 log(x)

The fits above show ~ 1.8 instead of 2 so we have x^1.8 instead of x^2, but we're still fitting a nonlinear function to the data. Why? As far as I can tell the only reason is that it tells the story the author wants.

This is where we get back to Feynman. Leaning over backwards with honesty would force us to ask why we should throw out a simple linear fit. It's not accuracy over the range of the data — unless we're already assuming the US is not an outlier and including it in the fits.


Over the non-US data, the linear fit (brownish dashed line) is basically as good as the nonlinear fit — the brown dashed curve and the gray dashed curve fall on top of each other until we get out to the US. And over the entire range, the nonlinear fit falls inside the 90% confidence limits of the linear model [3]. In fact, the linear model is actually much better than the nonlinear model in terms of absolute error [4]:


RCA's nonlinear model (fit correctly) actually predicts US health care spending will be somewhere between $7000 per capita and $13000 (90% CL) — and $8000, solidly within one sigma, is right in line with the linear model. That means that unless we include the US we do not have any reason to select the nonlinear model here. Or put another way — assuming the US is not an outlier, the US is not an outlier.

You may ask why I haven't gone in depth on the rest of the avalanche of graphs that follow. Unlike a novel where an unreliable narrator can be interesting, in science it's anathema. You have to go for more than just the superficial honesty of not deliberately lying, but rather leaning over backwards to show that your biases aren't driving the conclusions. In short, this is an example of cargo cult science — and it tells us more about the person doing it than it does about the world.

But screw it. Let's forget about science. Let's listen to an internet rando. Let's say RCA's claim is true that the US is not an outlier — and that globally health care spending rises twice as fast as income. We've apparently traded a US-specific problem for an enormous global problem where health care rises faster than income and becomes more and more unaffordable for everyone on Earth. The US's medical bankruptcies are just the canaries in the coal mines of a growing global problem [5]. His laser focus on showing the US is not an outlier at all costs makes him miss the forest for the trees — I'm sure RCA didn't want to imply that global health care spending is on an unsustainable path. He seems to think spending more and more money on health care (despite "diminishing returns" [6]) is the epitome of civilization:
The typical American household is much better fed today than in prior generations despite spending a much smaller share of their income on groceries and working fewer hours.  I submit this is primarily a direct result of productivity.  We can produce food so much more efficiently that we don’t need to prioritize it as we once did.  The food productivity dividend, as it were, has been and is being spent on higher-order wants and needs like cutting edge healthcare, higher amenity education, and leisure activities.
I don't know about you, but I love going to the doctor [7].

...

Update + 5 hours

In case you might think I'm being unnecessarily harsh on RCA, please note a) that isn't the first time I've encountered him and b) this from the "discussion" of this blog post this evening (click to enlarge):


...

Update 2/22/2020

I am wondering if this might be a clearer demonstration of what I'm getting at here. I fit another nonlinear function to RCA's data (leaving the US out because we want to see if it's an outlier). It's a logistic function. Using this function here has some basis in economic theory — e.g. satiation points. At some point consuming health care is more of a hassle than a benefit, right? Only so many angioplasties you can have in a year. Well, at least that's a plausible model. If we drop RCA's data in, we get the brown-ish curve below:


This says the US (green dot) is overspending by about double at a point where it should be reaching satiation. I could write up a whole long blog about this, and since it matches RCA's curve (blue) for every country except the US most of the rest of his analysis would go through. The other countries in the world are on the growing part of the curve — it'd just change the conclusions about the US.

But I can't do this. At least, not in good faith — and definitely not leaning over backwards. I actually believe this picture is almost certainly more accurate. My uncertainty in the saturation level is about where the single prediction bands put it. However, I would be a charlatan if I tried to push this fit to the data and let it be used by others in policy discussions.

Sure, I might put up a blog post and say, hmm, interesting — let's see how the data looks in the future!

But I can't draw a conclusion — US health care spending is an outlier — like how RCA has done with his. That's what I mean by the plot being in bad faith.
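For reference, the saturating form I fit above is just a logistic. A minimal sketch of that kind of fit (a three-parameter version), run on synthetic placeholder data rather than RCA's actual data set:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, L, k, x0):
    # Rises roughly exponentially at low x and flattens toward the satiation level L at high x
    return L / (1.0 + np.exp(-k * (x - x0)))

# Synthetic placeholder standing in for (log income, log health spending) pairs with
# the US excluded -- the real exercise uses RCA's data set minus the US point.
rng = np.random.default_rng(1)
x = np.linspace(7.5, 10.5, 40)
y = logistic(x, 8.5, 1.2, 9.0) + rng.normal(scale=0.15, size=x.size)

(L, k, x0), _ = curve_fit(logistic, x, y, p0=[8.0, 1.0, 9.0])
print("fitted satiation level:", L)
print("prediction at a US-like point well past the data:", logistic(11.5, L, k, x0))
```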

...

Update 2/22/2020 part II

These graphs were made somewhat tongue-in-cheek, but they illustrate a bit of the problem with extrapolating the nonlinear fits as far out as the US — where does it stop? At what point do we stop saying that because a point falls in the gray band, we have to conclude it's not an outlier? (Click to enlarge)


A good summary of my argument is that we can't go as far as that green dot representing the US and claim we're leaning over backwards in being honest.

Additionally, the slopes determined in the nonlinear fit are akin to elasticities in economics — the change in e.g. price vs quantity in elasticities of supply and demand (one of the earliest things I looked at with information equilibrium on this blog). I'd say it's a stretch to actually say slopes in this example are estimates of elasticities (we have aggregate macro data here, not micro data), but let's go with it. The thing is that 1) they are elasticities only where the difference in logs is approximately a percentage, and 2) estimating elasticities and applying that human behavioral result well beyond the data you measured is not scientifically supportable.

Let's look at x compared to a reference value of x₀ = 5. The blue line below is 100 × (log x − log x₀), aka the difference of logs, while the yellow line is 100 × (x − x₀)/x₀, aka the percent difference. We can see how the approximation breaks down as you move away from the reference value:


Note that the US is about twice as far out on the graph as the highest point in the data — so in terms of the slope on the log graph representing an elasticity, we're well out of scope of the approximation.
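A minimal sketch of the comparison in that plot:

```python
import numpy as np

x0 = 5.0
x = np.linspace(1.0, 15.0, 300)

log_diff = 100 * (np.log(x) - np.log(x0))   # "difference of logs", read as a percentage
pct_diff = 100 * (x - x0) / x0              # the actual percent difference

# The two agree near x = x0 and break down as you move away: at x = 2 * x0
# the log difference is ~69 while the percent difference is 100.
i = np.argmin(np.abs(x - 2 * x0))
print(log_diff[i], pct_diff[i])
```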

And even if the approximation was still in scope, extrapolating the behavioral meaning of that elasticity all the way to twice the highest point in the data is even more problematic.

In a more down to earth example, we know that gas prices do not heavily impact consumption in the short run when they fluctuate at the 10-30% level. RCA's analysis is like extrapolating that finding to increases of 100% — if gas prices doubled, consumption would remain constant. That's iffy on its own. However, he takes it a bit further — if some data then showed the US didn't reduce its consumption when gas prices doubled (i.e. it was in line with that extrapolation), RCA's analysis would be claiming that gas consumption is actually perfectly inelastic (people everywhere don't care about the price of gas at all) instead of considering possible structural reasons the US didn't reduce consumption (e.g. the US built roads and housing that locked in commuting and therefore gas consumption). The former is basically a conclusion derived from a single data point — like RCA's claim the US isn't an outlier.

...

Update 2/22/2020 part III

Per commenter rob below, here is that disallowed region (above the blue dashed line) and where RCA's curve intersects it:


I mean, if we're allowed to extrapolate to the green point, why can't we extrapolate all the way out to that intersection?

n.b. This is a scope condition (a limit of the region of validity of the model).

...

Footnotes:

[0] Sure, I'm also an internet rando (a random physicist, you could say), but I give my real name and you can peruse my grad school papers and thesis here if you'd like. In the interest of leaning over backward, I can also say that I am quite biased towards the left of the political spectrum. However, I don't have really strong feelings about health care policy — I do think it should be free because of basic morality, but that doesn't necessarily mean I think it should be a smaller component of GDP; maybe a plausible future is one where most of us work in health care instead of retail (the transition appears to be already happening). Other people have much better thoughts on health care policy than I do. I wrote a short book on my views of the political economy of the US that doesn't even mention health care except for the possible stimulus effect of the ACA, focusing instead on racism, sexism, and other social forces as drivers of the economy.

[1] "In case you're not already aware ..." is also the kind of language Trump uses when he just heard about something for the first time. [Edit: added + 30 mins.]

[2] In that Twitter thread from a year ago, I found out the equation RCA printed on the graph did not give the line presented in that graph. RCA said (a week later) it was about the plotting function label being unable to handle orthogonal polynomials:
I used a 3rd order polynomial with an orthogonal transformation -- poly() function in R.  The labeling package isn't smart enough to transform the coefficients.  No big deal.
Although this information did let me figure out what happened on RCA's graph, it's also the kind of word salad you get when a student is trying to confidently answer a question that they don't really understand. I imagine it took him that week to figure it out. Basically, RCA confused R's orthogonal poly() coefficients with raw polynomial coefficients. I'll use Hermite polynomials (not 100% sure how R chooses the orthogonal set) to show the difference.

A normal ("raw") regression fits (to third order)

p(x) = a x³ + b x² + c x + d

with the fit returning (a, b, c, d) while an orthogonal polynomial regression fits (using Hermite polynomials which is probably not what R is doing, but still illustrative)

p(x) = a' (x³ − 3 x) + b' (x² − 1) + c' x + d' · 1

with the fit returning (a', b', c', d'), where a = a', b = b', c = c' − 3 a', and d = d' − b'. It's quite valuable to do the latter, because it can reduce the covariance at each order — for example, Hermite polynomials of different orders are designed to have a zero overlap integral, so adding each order doesn't affect the previous orders like it would for adding monomial terms at each order (x² looks a bit like x near x = 1, while x² − 1 doesn't as much near x = 1). But RCA isn't really doing an analysis where he shows increasing or decreasing orders, where this process is most valuable — fitting a linear function, then fitting a quadratic and comparing the size of the new coefficients to see if adding the quadratic was warranted. If he had done that (as well as properly testing for the US as an outlier), he would have found that adding a quadratic term was not warranted unless the US was added.
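A minimal sketch of that bookkeeping using numpy's probabilists' Hermite (HermiteE) basis, which may or may not be the orthogonal set R's poly() uses, but it shows the coefficient conversion:

```python
from numpy.polynomial import hermite_e as He

# Orthogonal-basis coefficients (d', c', b', a') of d'*He0 + c'*He1 + b'*He2 + a'*He3,
# where He2 = x^2 - 1 and He3 = x^3 - 3x.
dp, cp, bp, ap = 1.0, 2.0, 3.0, 4.0

# Convert to raw monomial coefficients (d, c, b, a) of d + c x + b x^2 + a x^3:
# expect d = d' - b', c = c' - 3a', b = b', a = a'.
print(He.herme2poly([dp, cp, bp, ap]))   # d, c, b, a = -2, -10, 3, 4
```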

However, I'm pretty certain RCA did not understand what R was doing until I called him out — at which point he went back to the documentation and tried to figure it out ... but still didn't understand it. If he had, he would have written something more like:
I used 3rd order orthogonal polynomials  -- poly() function in R.  I accidentally input the orthogonal poly coefficients in as raw poly coefficients.  No big deal.
It's true that it's a simple mistake, but it also sheds light on who RCA is. Note that the common plotting package is in fact capable of handling poly() in the linear model and printing the correct polynomial. But sure, it's that the package isn't smart enough, not that he made a mistake or didn't understand what he was doing.

[3] The error bands RCA is providing seem to be either less than one sigma or (more likely) are mean prediction bands (effectively where the new regression line will shift given a new data point) rather than single prediction bands (where an individual new data point might fall), which are what I tend to give and what a typical person tends to think of when they see error bands.

[4] Wanted to keep the axes above consistent, but the single prediction error is pretty broad and can only be appreciated if you zoom out a bit. Click to enlarge.



[5] This may be true in a different way than RCA believes — rising US health care costs might be driving up health care costs around the world as we consume all the health care resources (Twitter thread here):


[6] To wit:
America’s mediocre health outcomes can be explained by rapidly diminishing returns to [health care] spending ...
[7] I love this quote:
Conversely, when we look at indicators where America skews high, these are precisely the sorts of [procedures] high-income, high-spending countries like the United States do relatively more of. 
You know, things rich people like to do! Like coronary artery bypasses, hip replacements, knee replacements, and coronary angioplasties. Those are a much more fun use of your disposable income than a trip to Spain!