Thursday, December 10, 2020

Initial claims and other COVID-19 shocks

Back in June of 2020, I posted an estimate of the future path of initial claims [1] on Twitter:

While the rate of improvement was overestimated, it captured the qualitative behavior quite well:

Being able to predict the qualitative behavior of a time series in the future is pretty good confirming evidence for a hypothesis, not least because there's no way you could have had access to future data without travelling through time. The underlying concept was that the rate of improvement after the initial spike would gradually fall back to the long-run equilibrium logarithmic rate of about −0.1/y (which shows up as the line that is almost at zero):

The hypothesis is that while the initial part of the non-equilibrium shock was a sharp spike, there is an underlying component that is a more typical, more gradual, shock. One way to visualize it is in the unemployment rate via "core" unemployment (per Jed Kolko):

Here's a cartoon version. In the current recession, we're seeing something that hasn't been that apparent (or at least as rapid) in the data [2]. There's the normal recession (solid line) as well as a sharp spike (dashed):

Instead of the usual derivative that's a single (approximately Gaussian) shock (solid line), we have a more complex structure with a smoothly falling return to the usual dynamic equilibrium (here exaggerated to −0.2/y so it looks different from zero):

Zooming in on the box in the previous graph, we get the cartoon version of the data above (dashed curve) that eventually asymptotes to the long run dynamic equilibrium rate:
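To make the cartoon concrete, here is a minimal sketch of a logarithmic rate that starts out strongly negative right after the spike and relaxes back to the long-run rate. The exponential form of the relaxation and the values of B and TAU are assumptions chosen for illustration, not the fitted model behind the graphs; only the long-run rate of about −0.1/y comes from the text above.

```python
import math

ALPHA = 0.1  # long-run logarithmic rate of decline per year (from the post)
B = 3.0      # hypothetical extra rate of improvement right after the spike
TAU = 0.5    # hypothetical relaxation time in years


def log_rate(t):
    """Logarithmic rate d/dt log(claims), t years after the peak of the spike."""
    return -ALPHA - B * math.exp(-t / TAU)


def claims(t, peak=1.0):
    """Claims level obtained by integrating log_rate from 0 to t."""
    return peak * math.exp(-ALPHA * t - B * TAU * (1.0 - math.exp(-t / TAU)))


for years in (0.0, 0.25, 0.5, 1.0, 2.0, 5.0):
    print(f"t = {years:4.2f} y   rate = {log_rate(years):7.3f}/y   level = {claims(years):.3f}")
```

The rate column approaches −0.1/y as t grows, which is the asymptote in the zoomed-in cartoon. A different relaxation shape would change how fast it gets there, but not where it ends up.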

Since we haven't had a shock of this type before in the available data with mass temporary layoffs, it's at least not entirely problematic to suggest an ad hoc model like this one. The underlying "evaporation" of the temporary shock information is based on the entropic shocks that appear in the stock market (including for this exact same COVID-19 event as well as the December 2018 Fed rate hike): 



[1] This is not what would be the technically correct model in terms of dynamic equilibrium, but over this short time scale the civilian labor force has been roughly constant since June. It doesn't really change the shape except for the initial slope which is lower because it is undersampled using only monthly CLF measurements instead of weekly ICSA measurements:

The "real" model isn't that different:

[2] It's possible the "step response" in the unemployment rate in the 1950s and 60s is a similar effect, but nowhere near as rapid.

Sunday, December 6, 2020

Qualitative economics done right, part N [1]

I seem to have involved myself in a Twitter dispute with economist Roger Farmer about what it means to make macro models — or more broadly "the nature of the scientific enterprise", as Roger the economist kindly tried to explain to me, a physicist. Unfortunately, due to his prolific use of the quote tweet the argument is likely impossible to follow. You can see the various threads via this search.


This started when I noted that Roger Farmer's claims about unemployment — in particular in papers supporting his claims that he cites here Farmer (2011) and here Farmer (2015) — are inconsistent with the long run qualitative behavior of the unemployment rate data. That is to say the models are not consistent with the empirical fact that the unemployment rate between recessions falls at a logarithmic rate of about −0.09/y in the US (BYDHTTMWFI: here's a recent NBER paper).

Let me say right off that I actually appreciate Roger Farmer's work — he does seem to think outside the box compared to the DSGE approach to macro that has taken over the field.

I am going to structure this summary in a series of claims that I am not making because it seems many people have confused requiring qualitative agreement with data with precise measurements of the electron magnetic moment.

Here we go!

I am not saying models with RMS error ε ≥ x must be rejected.

The funniest part about this is that the figure I use to show the lack of qualitative agreement in Roger's model actually shows the DIEM has worse RMS error over the range of the data Roger shows in his graph [2].

The thing is it's easy to get low RMS error on past data simply by adding parameters to a fit. This does not necessarily work with projected data, but in general more parameters often yield a better fit to past data and sometimes a better short run projection.
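A small sketch of that point, using a made-up series (not real unemployment data) and two nested least-squares fits. The function names and the numbers are hypothetical. The only thing guaranteed here is the in-sample comparison: a line can always reproduce the constant fit by setting its slope to zero, so its RMS error on the fitted data cannot be worse. Nothing guarantees the ordering on the held-out points.

```python
import math


def rms(residuals):
    """Root-mean-square of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def fit_constant(xs, ys):
    """One-parameter least-squares fit: the mean of ys."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Two-parameter least-squares fit: intercept and slope."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: intercept + slope * x


# Made-up series for illustration only
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [5.0, 4.8, 4.9, 4.5, 4.6, 4.2, 4.4, 4.1]

# Fit on the "past" points, then score on the held-out "projected" points
past_x, past_y = xs[:6], ys[:6]
future_x, future_y = xs[6:], ys[6:]

for name, fit in (("1 parameter (constant)", fit_constant),
                  ("2 parameters (line)", fit_line)):
    model = fit(past_x, past_y)
    in_sample = rms([y - model(x) for x, y in zip(past_x, past_y)])
    projected = rms([y - model(x) for x, y in zip(future_x, future_y)])
    print(f"{name}: past RMS = {in_sample:.3f}, projected RMS = {projected:.3f}")
```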

However, my original claim that started this off was that his model based on shocks to the stock market in the supporting papers was "disconnected from the long run empirical behavior of the unemployment rate". It's true that if you take the shocks to the unemployment rate and add the dynamic equilibrium of the S&P 500 model, you get a short run correlation that lasts from 1998 to about 2010:

This correlation around the 2008 recession is pointed out in Farmer (2011) Figure 2. However, you only have to go back to the early 90s recession to get a counterexample to the idea that shocks to the S&P 500 match up with shocks to the unemployment rate.

Second, there is also no particular empirical evidence that the unemployment rate will flatten out at any particular level (be it the natural rate in neoclassical models, or in Farmer's models a rate based on asset prices). Third, Farmer's models do not show log-linear decline between recession shocks.

It is these three basic empirical facts about the unemployment rate that I was referencing when I made my claim in that initial tweet. Even if the RMS error is bad, a model of the unemployment rate is at least qualitatively consistent with the data if 1) the shocks are not entirely dependent on the stock market, 2) the rate does not flatten out at any level except possibly u = 0, or 3) it shows an average log-linear decline of −0.09/y between recessions (a fact that was called out in a recent NBER paper, BYDHTTMWFI).
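As a worked illustration of what a log-linear decline of −0.09/y means, here is a short sketch. The starting rate of 10% and the function names are hypothetical; only the −0.09/y figure comes from the text.

```python
import math

RATE = -0.09  # logarithmic rate per year between recessions (from the post)


def unemployment(u0, years, rate=RATE):
    """Unemployment rate after `years` of uninterrupted log-linear decline from u0."""
    return u0 * math.exp(rate * years)


def years_to_reach(u0, target, rate=RATE):
    """Years of uninterrupted decline needed to go from u0 to target."""
    return math.log(target / u0) / rate


# Hypothetical starting point of 10% unemployment
print(f"after 5 years: {unemployment(10.0, 5):.2f}%")
print(f"years to halve: {years_to_reach(10.0, 5.0):.2f}")
```

At this rate the unemployment rate halves roughly every 7.7 years (ln 2 / 0.09) for as long as no new shock arrives, and it keeps falling rather than settling at some positive floor. That is the content of points 2 and 3.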

What I am saying is that Roger's models are not qualitatively consistent with the data — think a model of gravity where things fall up — and should be rejected on those grounds. The unemployment rate literally levitates in his models. Additionally there exist models with lower RMS error and qualitative agreement with the data; the existence of those models should give us pause when considering Roger's models.

I am not calling for Roger Farmer to stop working on his models.

It's fine by me if he wants to give talks, write blog posts about his model, or think about improving it in the privacy of his own research notebook. I would prefer that he grapple with the fact that the models are not qualitatively consistent with the data instead of getting defensive and saying that they don't have to pass that low bar. I believe models that are not qualitatively consistent with the data should not be used for policy, though — and that is one of Roger's aims.

It's true that a lot of ideas start out kind of wrong — it's unrealistic to expect a model to match the data exactly right out of the gate. And that's fine! I've had a ton of bad ideas myself! But there is no reason we should expect half-baked ideas lacking qualitative agreement with the data to be taken seriously in the larger marketplace of ideas. 

So many comments on the feed were about working towards an insight or the models being just an initial idea that could be improved. Most of us don't get a chance to put even really good ideas in front of a lot of people, so why should we accept something that's apparently not ready for prime time just because it's from a tenured professor? I have a PhD and a lot of garbage models of economic systems that aren't even qualitatively accurate in my Mathematica notebook directory; should we consider all of those? In any case, "it may lead to future progress" is not a reason to say "oh, fine then" to models that aren't qualitatively consistent with empirical data.

What I am saying is that we should set the bar higher for what we consider useful models in macro than "it might qualitatively agree with data one day". We can leave discussion of those models out of journals and policy recommendations.

I am not saying we should apply the standards of physics to economics.

This goes along with people saying I shouldn't be applying "Popperian rejection" to economic models. First off, this misconstrues Popper who was talking about falsifiability as a condition for scientific theories as opposed to pseudoscience. Roger's models are falsifiable — I don't think they are pseudoscience. However, Popper didn't really say much about models being falsified despite the fact that lots of people think he did.

General Relativity is a better model than Newtonian gravity, but both models are falsifiable. We consider Newtonian gravity to be incorrect for strong gravitational fields, precise enough measurements in weak fields, or velocities close to the speed of light. We still use good old Newton all the time — I did just the other day for an orbital dynamics question at work. I fully understand the difference between a model that is an approximation and one that is supposed to be a precise representation of reality.

Popper, however, did not say anything about models that don't qualitatively agree with the data. That's because in most of science, such models are thrown out before they are ever published. Economics, especially macro, operates in a different mode where I guess they consider models that look nothing like the data. Ok, I know the time series data is an exponentially increasing amplitude sine wave and this model says it's a straight line, but hear me out!

If the standards for agreement with the data are below qualitative agreement with the data, then there's really no reason to throw out Steve Keen's models [3]. But that's the problem: there are models that agree with the data! David Andolfatto's simple model matches the data fairly well qualitatively! (It gets points 1 and 3 above and could be set to u* = 0 to get 2.) The existence of those models should set the bar for the level of empirical accuracy we should accept in macro models.

What I am saying is that there are existing models that more precisely match the data, and that is the standard I am using. It's not physics, but rather the performance of other economic models. If you have a model that has worse RMS error, but has better qualitative agreement with the data, then that's ok to bring to the table. Overall, there seems to be far too much garbage that is allowed in macro because, well, there apparently wouldn't be any macro papers at all if some basic standards were enforced. When I say these models that aren't even qualitatively consistent with the data should be thrown out, I'm not talking about Popperian rejection, I am talking about desk rejection.

One last point ... what is the use of a model that doesn't qualitatively agree with data?

I didn't have a way to phrase this one as something I'm not saying. I literally cannot fathom how you can extract anything useful from a model that does not qualitatively agree with the data. This is the lowest bar I can think of.

Yes this model looks nothing like the data but it's useful because I can use it to understand things based on ...

That ellipsis is where I cannot complete the sentence. Based on gut feelings? Based on divine revelation? If the model looks nothing like the data, what is anything derived from it derived from? The pure mathematical beauty of its construction?

It's like someone saying "Here's my model of a car!" and they show you a cat. Yes, this cat isn't qualitatively consistent with a car, but it's a useful first step in understanding a car. The cat gives me insights into how the car works. And you really shouldn't be using Popperian rejection of the cat model of a car because automobile engineering is not the same as physics. Making a detailed car model is unnecessary for figuring out how it works — a cat is perfectly acceptable. Eventually, this cat model will be improved and will get to a point where it matches car data well. The cat model also allows me to make repair recommendations for my car. You see the cat has a front and a back end, where the front has two things that match up with the car headlights, and yes the fuel goes in the front of the cat while it goes in the side of a car but that's at least qualitatively similar ...


Update 7 December 2020

Also realized Roger has made a major stats error here:

Jason Jason. @infotranecon I really don’t know where to start.  1. The unemployment rate is I(1) to a first approximation. 2. The S&P measured in real units is I(1) to a first approximation. The two series are cointegrated. The S&P Granger causes the unemployment rate.

Here's Dave Giles, econometrician emeritus extraordinaire:

If two time series, X and Y, are cointegrated, there must exist Granger causality either from X to Y, or from Y to X, or in both directions.



[1] The title is a reference to my old series that led, among other places, to realizing Wynne Godley has been maligned by people who ostensibly support him, and that Dirk Bezemer fabricated quotes in his widely cited paper.

[2] I do find it problematic that Roger not only cuts off the data early compared to data that was available at the time Farmer (2015) was published, but also cuts off data that was available at the time that would appear in the domain of his graph, data that emphasizes that the model does not qualitatively match the data. He also uses quarterly unemployment data which further reduces the disagreement.

[3] I mean c'mon!

Sunday, October 18, 2020

The four failure modes of Enlightenment values

I don't write about process as much these days, in part because I'm no longer working my previous project that had me effectively commuting across the country every month to the middle of nowhere, and in part because I'm now working a much bigger project that barely leaves me enough time to update even the existing dynamic information equilibrium model forecasts. But recently there seems to be an upswing in calls for civility, declarations of incivility, and long sighs about how to criticize the "correct" way. I saw George Mason economist Peter Boettke tweet this out the other day; it includes a list of "rules" for how to criticize:
How to compose a successful critical commentary:
  1. You should attempt to re-express your target’s position so clearly, vividly, and fairly that your target says, "Thanks, I wish I’d thought of putting it that way."
  2. You should list any points of agreement (especially if they are not matters of general or widespread agreement).
  3. You should mention anything you have learned from your target.
  4. Only then are you permitted to say so much as a word of rebuttal or criticism.
It seems fitting that Boettke would tweet this out given his defense of the racist economist/public choice theorist James Buchanan. It's pure "Enlightenment" rationalism: the same Enlightenment that gave us many advances in science, but also racism and eugenics. These rules are in general a great way to go about criticism, but if and only if certain norms are maintained. If these norms aren't maintained, these rules inculcate us with a vulnerability to what I've called viruses of the Enlightenment. To put it in terms of my job: this process has not been subjected to failure mode and effects analysis (FMEA) and risk management.

This isn't intended to be a historical analysis of what the "Enlightenment" was, how it came to be, or its purpose, but rather how the rational argument process aspect is used — and misused — in discourse today. I've identified a few failure modes — the vulnerabilities of "Enlightenment" values.

Failure mode 1: Morally repugnant positions

I'm under the impression that like bioethics, medical ethics, or scientific ethics, someone needs to convene an interdisciplinary ethics of rational thought. There are still occasions when science seems to think the pursuit of knowledge is an aim higher than any human ethics, and failures run the gamut from the recent protests against building another telescope on Mauna Kea (part of a longer series of protests) to unethical human experiments.

Rationalism seems to continue to hold this view — that anything should be up for discussion. But we've long since discovered that science can't just experiment on people without considering the ethics, so why should we believe rationalism can just say whatever it wants?

Unfortunately, since we are humans and not rational robots, the discussion of some ideas themselves might spread or exacerbate morally repugnant beliefs. This is contrary to the stated purpose of "Enlightenment values": open discussion that leads to the "best" ideas winning out in the "marketplace of ideas". And if that direct causality breaks (open discussion → better ideas), the rationale for open discussion is weakened [0]. Simply repeating a lie or conspiracy theory is known to strengthen the belief in it, in part from the familiarity heuristic. And we know that simply changing the framing of a question on polls can change people's agreement or disagreement. Right wing publications try to launder their ideas by simply getting mainstream publications to acknowledge them, pulling them out of the "conservative ecosystem", as Steve Bannon has specifically talked about (see here).

Rule #1 fails to acknowledge our humanity. Simply repeating a morally repugnant idea can help spread it, and at the very least requires the critic to carry water for a morally repugnant idea. I cannot be required to restate someone's position that's favorable to racism because that requires giving racism my voice, and immorally helping the cause of racism.

For example, Boettke's defense of Buchanan requires him to carry water for Buchanan. If we consider the possibility that Nancy MacLean's claims of a right-wing conspiracy to undermine democracy and promote segregation are true (I am not saying they are, and people I respect — e.g. Henry Farrell — strongly disagree with that interpretation of the evidence), then carrying that water should be held to a level of ethical scrutiny a bit higher than, say, discussing the differences between Bayesian and frequentist interpretations of probability.

This is not to say we shouldn't talk about Buchanan or racism. It's not like we don't experiment with human subjects (e.g. clinical trials). It's just that when we do, there are various ethical questions that need to be formally addressed from informed consent to what we plan to learn from that experiment. A human experiment where we ask the question about whether humans feel pain from being punched in the face is not ethical even if we have consent from the subjects because the likelihood of learning something from it is almost zero. "I'm just asking questions" here is not a persuasive ethical argument.

This is in part why I think shutting down racists from speaking on college campuses isn't problematic in any way. Would we authorize a human experiment where we engage in a campaign of intimidation of minorities just to measure the effects? We already know about racist thought — it's not like these are new ideas. They're already widely discussed — that's how students on campuses know what to protest. And in terms of ethical controls, we might well consider that the moral risk managed solution consistent with intellectual discourse is to have these “speakers” write their “ideas” down, have the forum led by someone who is not a famous racist, or possibly is even opposed to the “ideas” [3].

Failure mode 2: Over-representation of the elite

I criticized Roger Farmer's acceptance of Hayek's interpretation that prices contain information on Twitter a year or so ago (for more detail on my take, you can check out my Evonomics article). Farmer subsequently unfollowed me on Twitter which likely decreases the engagement I get through Twitter’s algorithms.

Now my point here is not that one is obligated to listen to every crackpot (such as myself) and engage with their “ideas”. It’s that we cannot feasibly exist in a world where all expression is heard and responded to — regardless of how misguided or uninformed. And who would want that?

But it does mean participation via the (purportedly) egalitarian Enlightenment ideals of “free speech” and “free expression” in the marketplace of ideas is already limited. And the presumption of “equals” engaging in mutual criticism behind Boettke's “rules” artificially limits the bounds of criticism further. Already elites pick and choose the criticism they engage with; giving them an additional power of “permission” distorts the power balance even more.

Unfortunately public speech and public attention end up being rationed the same way most scarce resources are rationed: by money. The elite gatekeepers at major publications push the opinions and findings of their elite comrades through the soda straw of public attention. We hear the opinions of millionaires and billionaires as well as people who find themselves in circles where they occasionally encounter billionaires far more often than is academically efficient. Bloomberg and Pinker talking about free speech. MMT. Charles Murray.

Bloomberg writing at is a particularly egregious example of breaking the egalitarian norm. Bloomberg's undergraduate education is in electrical engineering from the 1960s and he has a business degree from the same era. He has no particular qualifications to judge the quality of discourse, the merits of the freedom of speech, or who should be forced to tolerate right wing intimidation on college campuses. He is in the position he is in because he made a great deal of money which enabled him to take a chance on running for office and becoming mayor of New York.

That said, I don't have particular expertise in this area — but then I don't get to write at

As such, “cancelling” the speech of these members of the elite mitigates this bias almost regardless of the actual reason for the cancellation simply because they’re over-represented.

More market-oriented people might say having billions of dollars must mean you’ve done at least something right and therefore could result in being over-represented in the marketplace of ideas. That's an opinion you can argue — in the marketplace of ideas — not implement by fiat. Now this is just my own opinion, but I think having too much money seems to make people less intelligent. Maybe life gets too easy. Maybe you lose people around you that disagree with you because they're dependent on your largess. Lack of intellectual challenge seems to turn your brain to mush in the same way lack of physical activity turns your body to mush. You might have started out pretty sharp, but — whatever the reason — once the cash piles up it seems to take a toll. I mean, have you listened to Elon Musk lately? However, even if you believe having billions of dollars means you have something worthwhile to say, that is not the Enlightenment's egalitarian ethos. King George III had a lot more money than any of the founders of the United States, but it's not like they felt compelled to invite him or his representatives to speak at the signing of the Declaration of Independence.

While everyone has a right to say what they want, that right does not grant everyone a platform. The “illiberal suppression” of speech can be a practical prioritization of speech. "Cancelling" can mitigate systemic biases, enabling a less biased, more genuine discourse. Why should we have to listen to the same garbage arguments over and over again? Even if they aren’t garbage, why the repetition? And even if the repetition is valid, why must we have the same people doing the repeating? [1] An objective function optimized for academic discussion should prioritize novel ideas, not the same people rehashing racism, sexism, or even “enlightenment” values for 30 years.

It's true that novelty for novelty's sake creates its own bias in academia: journals are biased towards novel results rather than confirmation of last year's ideas, creating a whole new set of problems. In addition to novel ideas, verifiability and empirical accuracy would also be good heuristics. Expertise or credentials in a particular subject is often a good heuristic for priority, but like the other heuristics it is just that: a heuristic. Knowing when to break with a heuristic is just as valuable as the heuristic itself.

In any case, just assuming elites and experts should be free from criticism unless it meets particular forms of "civility" or that their "ideas" should be granted a platform free from being "cancelled" does not further the spirit of the Enlightenment values that most of us agree on: that what's true or optimal ought to win out in the marketplace of ideas.

Failure mode 3: Rational thought and academic research is not free speech

Something obvious in the norms in Boettke's list is that he appears to recognize rational argument differs from free speech. "Free speech" does not require you to speak in some prescribed manner; that would ipso facto fail to be free speech.

However, the ordinary process by which old ideas die off through rational argument seems to be conflated with suppressing free speech these days. Having your paper on race and IQ rejected for publication because it rehashes the old mistakes and poor data sets is normal rational progress, not the suppression of free speech. "Just asking questions" needs to come to grips with the fact that lots of those questions have been asked before and have lots of answers. Just as we don't need to continuously rehash 19th century aether theory, we don't need to continuously rehash 19th century race science [2].

When shouts of "free speech" are used as a cudgel to force academic discussion of degenerative research programs in Lakatos' sense, it represents a failure mode of "Enlightenment" values and science in general. In order for science and the academy to function, they need to rid themselves of these degenerative research programs regardless of whether rural white people in the United States continue to support them. If these research programs turn out to not be degenerative, well, there's a pretty direct avenue back into being discussed via those new results showing exactly that. Assuming they follow ethical research practices, of course.

Failure mode 4: People don't follow the spirit of the rules

Failure to follow the spirit of these rules tends to be rampant in any "school of thought" that claims to challenge orthodoxy from race science to Austrian economics. Feynman's famous "cargo cult science" commencement address is a paean to the spirit of the rules of science (and "Enlightenment" values generally), but unlike Boettke's rules for others Feynman asks fledgling scientists to direct the rules inward — "The first principle is that you must not fool yourself — and you are the easiest person to fool."

This failure mode is far less intense than discussing racism, unethical human experiments or plutocracy, but is far more common. Certainly, the "straw man" application of Rule #1 falls into this. But one of the most frustrating is the one many of us feel when engaging with e.g. MMT acolytes — never acknowledging that you have "re-express[ed] your target’s position ... clearly, vividly, and fairly."

Randall Wray or William Mitchell (e.g.) simply never acknowledge any criticism is valid or accurate. Criticism is dismissed as ad hominem attacks instead of being acknowledged. If "successful" critical commentary (per the "rules") requires the subjects to grant you permission, any criticism can be shut down by a claim that the critic doesn't know what they are talking about.

This failure to follow the spirit of the rules appears in numerous ways, from claims that simply expressing a counterargument isn't civil discourse to the failure of someone espousing racist views to admit that those views are actually racist [4] to general hypocrisy. However, the end effect is that failure to follow the spirit of the rules is an attempt to give the speaker the power to grant permission for which facts or counterarguments are allowed and which aren't. That's not really how "Enlightenment values" are supposed to work.

Being granted permission by the subject of criticism is also generally unnecessary to actual progress. Humans — especially established public figures — rarely listen to criticism. Upton Sinclair, Bertrand Russell, and Max Planck captured different dimensions of this (a rationale, a mechanism, and a real course of progress) in pithy quotes (respectively):
It is difficult to get a man to understand something, when his salary depends upon his not understanding it! 
If a man is offered a fact which goes against his instincts, he will scrutinize it closely, and unless the evidence is overwhelming, he will refuse to believe it. If, on the other hand, he is offered something which affords a reason for acting in accordance to his instincts, he will accept it even on the slightest evidence. 
A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
This is how the world has always been. Your audience for your criticism is never the subjects of the criticism, but rather the next generation. Explaining your subject's position before criticizing it is done as part of Feynman's "leaning over backward" — for yourself — not legitimacy.

Other failure modes

I wanted to collect my thoughts on free speech, "cancelling", and the terrible state of "the discourse" in one essay. This list is not meant to be exhaustive, and I may expand it in the future when I have new examples that don't fit in the previous four categories. For example, you might think that academic journals are a form of intellectual gatekeeping — and I'd agree — but I believe that falls under failure mode 2: the over-representation of the elite, not a separate category. There are also genuine workarounds in that case that everyone uses (arXiv, SSRN). You may also disagree with the particular choice of basis — and I'm certain another orthonormal set of failure modes could span the same failure effect space.

Also, because I talk about MMT along with Public Choice and racism, it doesn't mean I equate them. There are similarities (both get a leg up through the support of billionaires), but I am trying to find examples from across a broad spectrum of politics and political economy. There are major failures and minor. However, I think the examples I've chosen most clearly illustrate these failure modes.

I have been sitting on this essay for nearly a year. I was motivated to action by a tweet from Martin Kulldorff, a professor at the Harvard Medical School, about how Scott Atlas was "censored" [5] for spreading misinformation about the efficacy of various coronavirus mitigations (from masks to lockdowns). Atlas is on the current administration's "Coronavirus Task Force" and a fellow at the Hoover Institution, a front for right wing views funded by billionaires. There is literally no universe in which this is a true egalitarian "Enlightenment" discussion, from the elite over-representation with Harvard and the billionaires at Hoover to the lack of disclosure of conflicts of interest (failure modes 2 and 4, respectively). That far too many people think Atlas being "censored" is against the spirit of the Enlightenment is exactly how it can fail.



[0] This is similar to the argument against markets as mechanisms for knowledge discovery — information leakage in the causal mechanism breaks it.

[1] More on this here. Why do we have to hear specifically Charles Murray talk about race and IQ? (TL;DR because it's not about ideas, but rather signalling and authority.)

[2] Personally, I think IQ tests should include a true/false question that asks if you think there's nothing wrong with believing the racial or ethnic group to which you belong has on average a higher IQ than others. Answering "true" would indicate you're probably bad at understanding self-bias that is critical to scientific inquiry and should reduce your score by at least 1/2.  As George Bernard Shaw said, “Patriotism is your conviction that this country is superior to all other countries because you were born in it.” Racism is at its heart your conviction that your race is superior to all other races because you were born into it — the rest is confirmation bias.

[3] In Star Trek: The Next Generation "Measure of a Man" (S:2 E:9), Commander Riker is tasked with prosecuting the idea that the android Lt. Commander Data is not a person, but rather Federation property — something with which Riker personally disagrees.

[4] I have never really understood this. Unless you're hopelessly obtuse, you must know if you have racist views. Why would you be upset about other people identifying them as such? The typical argument being supported by racist views is that racism is correct and right! A racist (who happens to be white by pure coincidence) who believes that other non-white people have lower IQs through some genetic effect is trying to support racism. I have so much more respect for racists, like a pudgy white British man who appears in the beginning of The Filth and the Fury (2000) who openly admits he is racist. That's the Enlightenment!

[5] In no way is this censorship and calling it that is risible idiocy. The tweets were removed on Twitter, a private company, not by the US government. And Atlas still has access to multiple platforms — including amplification by elite Harvard professors, which is what is actually happening.

Saturday, July 18, 2020

Dynamic information equilibrium and COVID-19

Since I've gotten questions, I thought I'd put together a brief explainer on the Dynamic Information Equilibrium Model (DIEM) and its application to the path of COVID-19.


I wrote a preprint on the DIEM a couple years ago (posted at SSRN), and gave a talk about the approach at the UW economics department (see here). The primary application was to labor markets, specifically the unemployment rate. However, the model has many other applications in economics (and the original information equilibrium approach has applications to physics). So how did I end up applying this model to COVID-19? It started from laziness.

Back in April, I was looking at the various models of COVID-19 out there, in particular the IHME model. I wanted to compare the performance to the data, but instead of coding it up myself I took a screenshot and digitized the data. Digitizing adds error and digitizing exponentially falling functions creates all kinds of problems, so I instead fit the IHME forecasts with a DIEM model since I had the code readily available.

It turned out to do a decent job of describing the IHME models, but additionally when there were discrepancies with the observed data it turned out the DIEM worked better. Thinking about the foundations of the DIEM, the reason it worked became clear.


The DIEM is an application of "information equilibrium" — the idea that one process $A$ can be the source of information for another process $B$ such that it takes the same number of bits of (information theory) information to specify $A$ as it does to specify $B$. In a sense, if $A$ is in information equilibrium with $B$ then the two are informationally equivalent. Information equilibrium constrains what a process that matches e.g. $A$ with $B$ can look like.

That's all very abstract, but in economics we have demand for a good being matched with supply (creating a transaction) or job openings being matched with unemployed people (creating a hire), all in equilibrium. In the case of COVID-19, the analogous match is virus + healthy person $\rightarrow$ sick person.

Like any communication channel transferring information, these matches can fail to happen. When voices are garbled on a cell phone call, the information specifying the sound waves going into the speaker's phone fails to be transferred completely to the sound waves coming out of the listener's phone. Information equilibrium is something of an idealized state that can be interrupted by non-equilibrium. It may seem vacuous to say sometimes you have equilibrium and sometimes you have non-equilibrium, but the information theory underlying it gives us some useful handles (e.g. failures to fully sample the underlying space, correlations, or other changes in information entropy).

Dynamic information equilibrium asks what information equilibrium can tell us when the processes $A$ and $B$ are growth processes.

$$
\begin{aligned}
A & \sim e^{a t}\\
B & \sim e^{b t}
\end{aligned}
$$

Just because they are "growth" processes, that doesn't mean they are growing — they could be shrinking or $A$ could be growing and $B$ could be shrinking.

If you go to the paper you can get the details of the mathematics (including how this generalizes to ensembles of processes), but the key result is that information equilibrium requires

$$
\frac{d}{dt} \log \frac{A}{B} \simeq (k - 1) b \equiv \alpha
$$

where $k$ measures the relative information content of events in process $A$ versus events in process $B$. What this says is that if you look at the data on a log plot versus time, it will consist mostly of stretches where the growth or decline of the data follows a straight line (i.e. exponential growth or decay with constant log-linear slope).
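
To see where the constant slope comes from, here is a minimal numerical sketch. It assumes the equilibrium relation between the two processes takes the power-law form $A \propto B^{k}$ (so that $a = k b$); the parameter values and the helper `fit_slope` are made up purely for illustration and are not fitted to any data.

```python
import math

def fit_slope(ts, ys):
    """Ordinary least-squares slope of ys against ts."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    return sxy / sxx

# Illustrative (assumed) values, not estimates from data.
b = 0.03   # growth rate of process B per unit time
k = 0.5    # relative information content of A events versus B events

ts = [0.5 * i for i in range(201)]        # times 0 .. 100
B = [math.exp(b * t) for t in ts]
A = [x ** k for x in B]                   # assumed relation A = B^k, i.e. a = k * b

log_ratio = [math.log(x / y) for x, y in zip(A, B)]
slope = fit_slope(ts, log_ratio)          # measured slope on the log plot
alpha = (k - 1.0) * b                     # slope expected from the key result

print(f"fitted slope of log(A/B): {slope:.6f}")
print(f"(k - 1) * b:              {alpha:.6f}")
```

With $k < 1$ and $b > 0$ the ratio declines at a constant log-linear rate even though both processes are growing, which is the earlier point about "growth" processes not necessarily growing.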

Mostly. What makes this DIEM a model and not a theory is that there's an assumption about what happens in non-equilibrium. In the original application of the model to the unemployment rate, there was an assumption that the straight line isn't interrupted by non-equilibrium too much — that non-equilibrium events are sparse in the time series data. If this wasn't true, then it'd be impossible to measure that $\alpha$ and your model of non-equilibrium would be everything. In labor markets, recessions are the sparse non-equilibrium events in the unemployment rate and the recovery is the equilibrium:

Adding in a logistic step function to handle the recession shocks gives us a description of the unemployment rate (and other economic variables) over time:
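
A minimal sketch of that description, assuming the shocks enter as logistic steps added to the straight line in log space. The function name, the parameter names, and the numbers below are mine and purely illustrative; they are not fitted values.

```python
import math

def diem_log_level(t, alpha, c, shocks):
    """Log of the observable at time t.

    alpha  : dynamic equilibrium slope (log-linear rate)
    c      : constant offset
    shocks : list of (amplitude, center, width), one per logistic step
    """
    value = alpha * t + c
    for amplitude, center, width in shocks:
        value += amplitude / (1.0 + math.exp(-(t - center) / width))
    return value

# Hypothetical parameters: a slow decline interrupted by one upward shock.
alpha, c = -0.1, 2.0
shocks = [(0.6, 10.0, 0.3)]   # (step size in log units, center, width)

before = diem_log_level(5.0, alpha, c, shocks)    # well before the shock
middle = diem_log_level(10.0, alpha, c, shocks)   # at the shock center
after = diem_log_level(15.0, alpha, c, shocks)    # well after the shock
print(before, middle, after)
```

Away from the shock the curve is a straight line with slope `alpha`, and the shock only shifts the level by `amplitude`. That is why the slope can still be measured as long as the shocks are sparse.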


It turns out that the DIEM is a really good model of the data for COVID-19 cases and deaths, and the forecast from April for the path of the outbreak in the US was remarkably accurate, at least until the 2nd surge in the most recent data (i.e. a non-equilibrium event):

The model works well for most countries, for example here are Italy and the UK (click to enlarge):

The fact that we can't really see that 2nd surge until it starts is due to the model being too simple to predict non-equilibrium events. It can, however, be used to see when a non-equilibrium event is getting started and then monitor its progress. For example, back on May 20th I was predicting the beginning of a 2nd surge in Florida based on the DIEM model of cases there (and I later added a 2nd non-equilibrium shock, which can be handled using e.g. this algorithm):
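
One crude way to watch for the start of a non-equilibrium event is to fit the log-linear path over a baseline window and flag a sustained run of residuals above it. This is only a stand-in sketch, not the algorithm linked above: the function names, the threshold, the run length, and the synthetic data are all assumptions for illustration.

```python
import math

def fit_line(ts, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t

def first_departure(ts, counts, n_baseline, threshold, run_length):
    """Index where a run of `run_length` log residuals above `threshold` begins.

    The baseline line is fit to the first `n_baseline` points only.
    Returns None if no such run is found.
    """
    logs = [math.log(c) for c in counts]
    slope, intercept = fit_line(ts[:n_baseline], logs[:n_baseline])
    run = 0
    for i in range(n_baseline, len(ts)):
        residual = logs[i] - (slope * ts[i] + intercept)
        run = run + 1 if residual > threshold else 0
        if run >= run_length:
            return i - run_length + 1
    return None

# Synthetic series: 2% per day decline, then a surge starting on day 40.
ts = list(range(60))
counts = []
for t in ts:
    level = 1000.0 * math.exp(-0.02 * t)
    if t >= 40:
        level *= math.exp(0.06 * (t - 40))
    counts.append(level)

start = first_departure(ts, counts, n_baseline=30, threshold=0.2, run_length=3)
print("departure flagged at day index:", start)
```

For this synthetic series the surge begins on day 40 and the flag falls a few days later, once the departure from the baseline line exceeds the threshold. Real data would be noisy, so the threshold and run length would need tuning.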

Another limitation of the model is that it explicitly assumes the number of events $n$ you're seeing is large ($n \gg 1$). This means the model does not work well when there are just a few cases or deaths, or for the initial onset of the outbreak. For example, here is South Korea:

Related to the $n \gg 1$ assumption, we basically start an outbreak at $t_{0}$ in the midst of a non-equilibrium shock with dynamic equilibrium valid for $t \gt t_{0}$. This is effectively treated in the model as if a previous outbreak had recently ended (so that dynamic equilibrium is also valid for $t \lt t_{0}$). The model that would deal with the initial outbreak would almost certainly have to incorporate specifics of the individual virus and the networks it travels in, which is beyond the scope of information equilibrium (itself a "shortcut" in describing complex systems).

Other observations

One of the things the model predicts is that after a 2nd (or 3rd) surge, the data should return to the previous log-linear path unless something has changed. This appears to be happening for several regions, for example Germany and King County, WA:

It remains to be seen whether this holds up. In Sweden, the rate of decline after the 2nd surge in cases seems to have improved and is now comparable to Germany's.

Previously, Sweden's rate of decline in cases of $\alpha \simeq$ 2% per day was approximately the same as most of the US — about half the rate of 4-5% apparent in most of Europe as well as in NY state (dominated by counts from NYC). Did people in Sweden change behavior in the face of that 2nd surge? It's an open question. [See update 25 July 2020 below.]
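
To put those rates in more tangible terms, a constant log-linear decline at rate $\alpha$ halves the count every $\ln 2 / \alpha$ days. The sketch below assumes the quoted percentages are continuous (log-linear) rates; the function name is just for illustration.

```python
import math

def halving_time_days(rate_per_day):
    """Days for an exponentially declining count to fall by half."""
    return math.log(2.0) / rate_per_day

for rate in (0.02, 0.04, 0.05):
    days = halving_time_days(rate)
    print(f"{rate:.0%} per day -> halves about every {days:.0f} days")
```

So a 2% per day decline takes roughly five weeks to halve the case count, against about two to two and a half weeks at 4-5% per day.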

Another thing we need to keep in mind is that these are reported cases and deaths. With testing increasing in many countries, more and more cases are discovered. This results in an obvious difference between the rate of decline for cases in the US versus that for deaths:

Other countries have much more similar rates of decline for the two measures. For the US, this means the rate of decline for cases is somewhat lower than it would be if testing were widely available. That is to say, the observed $\alpha_{US} \simeq \alpha_{US}^{\text{cases}} + \alpha_{US}^{\text{testing}}$. It also means the observed rate of decline for cases must decrease at some point in the future (e.g. once testing far outpaces transmission). As it is, the "case fatality rate" (CFR) appears to be heading to zero:

This theoretically should flatten out at some point at the true population CFR (although it's complicated since more deaths can occur during a surge because hospitals are at capacity). Estimated CFRs are in the 0.1% order of magnitude so this point is likely far in the future for the US.
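
As a sketch of why the ratio keeps falling, treat each $\alpha$ as a signed log-linear slope (negative for a decline) and suppose, as an assumption for illustration rather than something estimated here, that reported deaths $D$ follow the underlying rate $\alpha_{US}^{\text{cases}}$ while reported cases $C$ follow the observed rate $\alpha_{US}$ from the decomposition above. Then

$$
\frac{d}{dt} \log \frac{D}{C} \simeq \alpha_{US}^{\text{cases}} - \alpha_{US} \simeq \alpha_{US}^{\text{cases}} - \left( \alpha_{US}^{\text{cases}} + \alpha_{US}^{\text{testing}} \right) = -\alpha_{US}^{\text{testing}}
$$

so with testing expanding ($\alpha_{US}^{\text{testing}} > 0$) the measured ratio declines at a roughly constant log-linear rate, and it only flattens out once the testing contribution goes to zero.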


The DIEM is an incredibly simple model. In the senses above — too simple. However, it has also proven useful for estimating the long run path of COVID-19 in several regions. In the places it applies, a given pandemic can be seen as an instance of a universal process with its specific parameters aggregating the effects of multiple aspects of society from policy to social networks to details of the specific virus.

Overall, we should keep in mind that the combination of policy, epidemiology, and social behavior is a social system. There might be empirical regularities from time to time, but humans can always change their behavior and thus change outcomes.


Update 21 July 2020

Minor edits and updated Sweden, Germany and US ratio graphs with more recent data.


Update 25 July 2020

The assumption of sparseness mentioned above may have failed us in the estimation of the dynamic equilibrium rate for Sweden — the first and second surges were too close together to properly measure it. It would resolve some inconsistencies (i.e. Sweden seeming to have a higher rate than the rest of Europe before the 2nd surge, Sweden oddly shifting to a rate more consistent with the rest of Europe after the 2nd surge). Here is the model using the most recent data (as of 11am PDT) to estimate the dynamic equilibrium $\alpha$ compared to the original fit (click or tap to enlarge):


Another way to visualize multiple DIEMs is via what I call "seismograms", which display the temporal information about the parameters (the shock width and the shock timing) on a timeline like this for several US states (click or tap to enlarge; the blue is only to differentiate the US aggregate, not direction of shock as in other uses):

The translation is fairly straightforward — a longer shock is represented by a wider band placed at the center (in time) of a non-equilibrium shock (above red-ish, below in gray). In the link above, you can add amplitude/magnitude information by scaling the color but this version just emphasizes time. Here's a graphical version of how these translate from my book:


Update 9 September 2020

The "return to equilibrium" has turned out to be remarkably accurate for the US:

A 3rd surge may be getting started in the US (associated with schools opening for the new year) — zoomed in on the gray box in the previous graph:

In Sweden, there is a 3rd surge ending ...

Also, the predicted path of deaths in the US using cases turned out to be fairly accurate with only the lag being uncertain in advance:

The ratio of deaths to cases for the US has returned to the "equilibrium" of a decline due to a likely combination of effects from demographics to increasing testing (the latter seeming like the primary contribution):


Update 2 October 2020

Another predictive success of the DIEM for COVID-19 — calling a 3rd surge in Florida on 9/13:

And its subsequent appearance:


Data sources:

International data from European CDC

US state data from the COVID Tracking Project

Friday, May 1, 2020

What's in a name?

That which we call a model by any other name would describe as well ... or not
Shakespeare, I think.

I'm in the process of trying to distract myself from obsessively modeling the COVID-19 outbreak, so I thought I'd write a bit about language in technical fields.

David Andolfatto didn't think this twitter thread was very illuminating, but at its heart is something that's a problem in economics in general, and not just macroeconomics. It's certainly a problem in economics communication, but I also believe it's a kind of professional economics version of "grade inflation" where "hypotheses" are inflated into "theorems" and "ideas" [1] are inflated into "models".

Now every economist I've ever met or interacted with is super smart, so I don't mean "grade inflation" in the sense that economists aren't actually good enough. I mean it in the sense that I think economics as a field feels that it's made up of smart people so it should have a few "theorems" and "models" in the bag instead of only "hypotheses" and "ideas" — like how students who got into Harvard feel like they deserve A's because they got into Harvard. Economics has been around for centuries, so shouldn't there be some hard won truths worthy of the term "theorem"?

This was triggered by his claim that Ricardian equivalence is a theorem (made again here). And I guess it is — in economics. He actually asked what definitions were being used for "model" and "theorem" at one point, and I responded (in the manner of an undergrad starting a philosophy essay [2]):
theorem: a general proposition not self-evident but proved by a chain of reasoning; a truth established by means of accepted truths
model: a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs
I emphasized those last clauses with asterisks in the original tweet (bolded them here) because they are important aspects that economics seems to either leave off or claim very loosely. No other field (as far as I know) uses "model" and "theorem" as loosely as economics does.

The Pythagorean theorem is established from Euclid's axioms (including the parallels axiom, which is why it's only valid in Euclidean space) that include things like "all right angles are equal to each other". Ricardian equivalence (per e.g. Barro) is instead based on axioms (assumptions) like "people will save in anticipation of a hypothetical future tax increase". This is not an accepted truth; therefore Ricardian equivalence so proven is not a theorem. It's a hypothesis.

You might argue that Ricardian equivalence as shown by Barro (1974) is a logical mathematical deduction from a series of axioms — just like the Pythagorean theorem — making it also a theorem. And I might be able to meet you halfway on that if Barro had just written e.g.:

$$
A_{1}^{y} + A_{0}^{o} = c_{1}^{o} + (1 - r) A_{1}^{o}
$$

and proceeded to make a bunch of mathematical manipulations and definitions — calling it "an algebraic theorem". But he didn't. He also wrote:
Using the letter $c$ to denote consumption, and assuming that consumption and receipt of interest income both occur at the start of the period, the budget equation for a member of generation 1, who is currently old, is [the equation above]. The total resources available are the assets held while young, $A_{1}^{y}$, plus the bequest from the previous generation, $A_{0}^{o}$. The total expenditure is consumption while old, $c_{1}^{o}$, plus the bequest provision, $A_{1}^{o}$, which goes to a member of generation 2, less interest earnings at rate $r$ on this asset holding.
It is this mapping from these real world concepts to the variable names that makes this a Ricardian Equivalence hypothesis, not a theorem, even if that equation was an accepted truth (it is not).

In the Pythagorean theorem, $a$, $b$, and $c$ aren't just nonspecific variables, but are lengths of the sides of a triangle in Euclidean space. I can't just call them apples, bananas, and cantaloupes and say I've derived a relationship between fruit such that apples² + bananas² = cantaloupes² called the Smith-Pythagoras Fruit Euclidean Metric Theorem.

There are real theorems that exist in the real world in the sense I mean here: the CPT theorem comes to mind, as well as the noisy channel coding theorem. That's what I mean by economists engaging in a little "grade inflation". I seriously doubt any theorems exist in social sciences at all.

The last clause is also important for the definition of "model" — a model describes the real world in some way. The Hodgkin-Huxley model of a neuron firing is an ideal example here. It's not perfect, but it's a) based on a system of postulates (in this case, an approximate electrical circuit equivalent), and b) presented as a mathematical description of a real entity.

Reproduced from Hodgkin and Huxley (1952)
The easiest way to do part b) is to compare with data but you can also compare with pseudo-data [3] or moments (while its performance is lackluster, a DSGE model meets this low bar of being a real "model" as I talk about here and here). *Ahem* — there's also this.

Moment matching itself gets the benefit of "grade inflation" in macro terminology. I'm not saying it's necessarily wrong or problematic — I'm saying a model that matches a few moments is too often inflated to being called "empirically accurate" when it really just means the model has "qualitatively similar statistics".

One of the problems with a lack of concern with describing a real state of affairs is that you can end up with what Paul Pfleiderer called chameleon models — models that are proffered for use in policy, but when someone questions the reality of the assumptions the proponent changes the representation (like a chameleon) to being more of a hypothesis or plausibility argument. You may think using a so-called "model" that isn't ready for prime time can be useful when policy makers need to make decisions, but Pfleiderer put it well in a chart:

But what about toy models? Don't we need those? Sure! But I'm going to say something you're probably going to disagree with — toy models should come after empirically successful theory. I am not referring to a model that matches data to 10-50% accuracy or even just gets the direction of effects right as a toy model — that's a qualitative model. A toy model is something different.

I didn't realize it until writing this, but apparently "toy model" on Wikipedia is a physics-only term. The first line is pretty good:
In the modeling of physics, a toy model is a deliberately simplistic model with many details removed so that it can be used to explain a mechanism concisely.
In grad school, the first discussion of renormalization in my quantum field theory class used a scalar (spin-0) field. At the time, there were no empirically known "fundamental" scalar fields (the Higgs boson was still theoretical) and the only empirically successful uses of renormalization were QED and QCD — both theories with spin-1 gauge bosons (photons or gluons) and spin-½ fermions (electrons or quarks). Those details complicate renormalization (e.g. you need a whole different quantization process to handle non-Abelian QCD). The scalar field theory was a toy model of renormalization of QED — used in a class to teach renormalization to students about to learn QED that had already been shown to be empirically accurate to 10s of decimal places.

The scalar field theory would be horribly inaccurate if you tried to use it to describe the interactions of electrons and photons.

The problem is not that many economic "toy models" are horribly inaccurate, but rather that they don't derive from even qualitatively accurate non-toy models. Often it seems no one even bothers to compare the models (toy or not) to data. It's like that amazing car your friend has been working on for years but never seems to drive — does it run? Does he even know how to fix it?

At this stage, I'm often subjected to all kinds of defenses — economics is social science, economics is too complex, there's too much uncertainty. The first and last of those would be arguments against using mathematical models or deriving theorems at all, which a fortiori makes my point that the words "model" and "theorem" are inflated from their common definition in most technical fields.

David's defense is (as many economists have said) that models and theorems "organize [his] thinking". In the past, my snarky comment on this has been that economists must have really disorganized minds if they need to be organizing their thinking all the time with models. Zing!

But the thing is we have a word for organized thought — idea [4]:
a formulated thought or opinion
But what's in a name? Does it matter if economists call Ricardian equivalence a theorem, a hypothesis, or an idea? Yes, because most people's exposure to a "theorem" (if any) is the Pythagorean Theorem. People will think that the same import applies to Ricardian Equivalence, but that is a false equivalence.

Ricardian Equivalence is nowhere near as useful as the Pythagorean Theorem, to say nothing about how true it is. Ricardian Equivalence may be true in Barro's model — one that has never been compared to actual data or shown to represent any entity or state of affairs. In contrast, you could right now with a ruler, paper, and pencil draw a right triangle with sides of length 3, 4, and 5 inches [5].

I hear the final defense now: But fields should be allowed their own jargon — and not policed by other fields! Who are you fooling? 

Well, it turns out economists are fooling people — scientists who take the pronouncements of economics at face value. I write about this in my book (using two examples of E. coli and capuchin monkeys):

We have trusting scientists going along with rational agent descriptions put out there by economists when these rational agent descriptions have little to no empirical evidence in their favor — and even fewer accurate descriptions of a genuine state of affairs. In fact, economics might do well to borrow the evolutionary idea of an ecosystem being the emergent result of agents randomly exploring the state space.



My "to be fair" items so that I'm not just "calling out economics" are "information" in information theory and "theory" in physics. The former is really unhelpful: I know it's information entropy, but people who know that often shorten it to just "information", and people who don't know that think information is like knowledge, despite the fact that information entropy is maximized for e.g. random strings.
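
The random-string point is easy to check with a few lines of Python. This is a minimal sketch that measures the empirical per-symbol Shannon entropy from character frequencies; the function name, the example strings, and the fixed random seed are my own choices for illustration.

```python
import math
import random
from collections import Counter

def entropy_bits_per_symbol(text):
    """Empirical Shannon entropy of the character frequencies, in bits."""
    counts = Counter(text)
    n = len(text)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

alphabet = "abcdefghijklmnopqrstuvwxyz"
rng = random.Random(0)  # fixed seed so the example is repeatable

repetitive = "a" * 2000
sentence = "information is not the same thing as knowledge " * 40
random_text = "".join(rng.choices(alphabet, k=2000))

samples = [("repetitive", repetitive), ("sentence", sentence), ("random", random_text)]
for name, text in samples:
    print(f"{name:>10}: {entropy_bits_per_symbol(text):.2f} bits per symbol")

print(f"upper bound for 26 letters: {math.log2(26):.2f} bits per symbol")
```

The meaningful sentence scores well below the random letters, which sit close to the upper bound. Note that this measure only looks at symbol frequencies, so it says nothing about what a string means.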

In physics, any quantum field theory Lagrangian is called a "theory" even if it doesn't describe anything in the real world. It is true that the completely made up ones don't get names like quantum electrodynamics but rather names like "φ⁴ theory". If it were economics, that scalar field φ would get a name like "savings" or "consumption".



[1] I had a hard time coming up with the word here — my first choice was actually "scratch work". Also "concepts" or "musings".

[2] ... at 2am in a 24 hour coffee shop on the Drag in Austin.

[3] "Lattice data" (for QCD) or data generated with VAR models (in the case of DSGE) are examples of pseudo-data.

[4] Per [1], this is also why I thought "concept" would work here:

something conceived in the mind
[5] This is actually how ancient Egyptians used to measure right angles — by creating 3-4-5 unit triangles [pdf].