Thursday, January 2, 2020

"It takes a model to beat a model"


I came across this old gem on Twitter (here), and Jo Michell sums it up pretty well in the thread:
It takes a model to beat a model has to be one of the stupider things, in a pretty crowded field, to come out of economics. ... I don’t get it. If a model is demonstrably wrong, that should surely be sufficient for rejection. I’m thinking of bridge engineers: ‘look I know they keep falling down but I’m gonna keep building em like this until you come up with a better way, OK?’
There are so many failure modes of the maxim "it takes a model to beat a model":

Formally rejecting a model with data. Enough said.

Premature declaration of "a model". It seems that various bits of math in econ are declared "models" before they have been shown to be empirically accurate more often than is optimal. Now empirical accuracy doesn't necessarily mean getting variables right within 2% (although it can) — it can mean 10% or even just getting the qualitative form of the data correct. I have two extended discussions on the failure to do this here (DSGE) and here (Keen). The failure mode here is that something (e.g. DSGE) is declared a model using a lower bar than is applied to, say, cursory inspection of data or linear fits.

Rejecting a model as useless even without formal rejection. I wrote about this more extensively here, but the basic idea is that a model a) can be way too complex for the data it's trying to explain (this inherently makes a model hard to reject: as a good heuristic you need ~ 20 or so data points per parameter to make a definitive call, so you can always add parameters and say "we'll wait for more data"), or b) can give the same results as another model that is entirely different (either use Occam's razor, or just give one of these ¯\_(ツ)_/¯ to both models). The latter case can be seen as a tie goes to no one. Essentially: heuristic rejection.

Rejecting a model with functional fits. Another one I've written more extensively about elsewhere, but if you have a complicated model that has more parameters than a functional fit that more accurately represents the data, you can likely reject that more complicated model. One of the great uses of functional fits is to reduce the relevant complexity (relevant dimension) of your data set. Without any foreknowledge, the dimension d of a data set is on the order of the number n of data points (d ~ n); the worst case is that you describe every data point with a parameter. However, if you can fit that data (within some error) to a function with k parameters with k < d, then you can (informally) reject any model that describes the same data set (within the same error) with p parameters, where k < p < d, as likely too complex. That functional fit doesn't even have to come from anywhere! (Note, this is effectively how Quantum Mechanics got its first leg up from Planck: lots of people were fitting the blackbody spectrum with fewer and fewer parameters until Planck gave us his one-parameter fit with Planck's constant.)
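
The parameter-counting heuristics in the last two items can be written down as a small check. This is only a sketch: the function name, the default of 20 data points per parameter, and the example numbers are illustrative assumptions, and nothing here is a formal statistical test.

```python
def complexity_check(n_points, k_fit, p_model, points_per_param=20):
    """Informal parameter-counting heuristics (not a formal test).

    n_points: number of data points (worst case, the dimension d ~ n)
    k_fit:    parameters in a functional fit that matches the data within some error
    p_model:  parameters in the model under consideration (same error)
    points_per_param: rule-of-thumb data points per parameter (assumed ~20)
    """
    d = n_points  # worst case: one parameter per data point
    return {
        # Too few data points per parameter to make a definitive call.
        "underdetermined": n_points < points_per_param * p_model,
        # A simpler functional fit does the same job: k < p < d.
        "likely_too_complex": k_fit < p_model < d,
    }


# Hypothetical example: 200 observations, a 3-parameter functional fit,
# and a 40-parameter model describing the same data.
result = complexity_check(n_points=200, k_fit=3, p_model=40)
print(result)
```

In this made-up case the model is flagged on both counts: it has too few data points per parameter to be pinned down, and a much smaller functional fit already describes the data.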

Failing to accept a model as rejected. One of the most maddening ways the "it takes a model to beat a model" maxim is deployed is by people who just don't accept that a model has been rejected or that another model outperforms it. This is more a failure mode of "enlightenment rationality" which assumes good faith argument from knowledgeable participants [1].

I make no particular argument that these represent an orthogonal spanning set (in fact, the 4th has non-zero projection along the 3rd). However, it's pretty clear that the maxim is generally false. In fact, it's pretty much the converse [2] of a true statement — if you have a better model, then you can reject a model — and as we all learned in logic the converse is not always true.

...

Update 14 January 2020

Somewhat related, there is also the idea that "there's always a least bad model" — to use Michell's analogy, there's always a least bad bridge. But there isn't. Sometimes there's just a shallow bit to ford.

Paul Pfleiderer takes on the compulsion to have something that gets called a "model" in his presentation here:


Making a model that isn't empirically accurate using unrealistic assumptions to make a theoretical argument is basically the same thing as making up data to make an empirical one.

My impression is that this compulsion is deeply related to "male answer syndrome" in the male-dominated field of economics.


...

Footnotes:

[1] Note that this is not necessarily a failure mode of science, which is a social process, but rather the application of that macro-scale social process to individual agents. Science does not require any agent to change their mind, only that on average at the aggregate level more accurate descriptions of reality survive over less accurate ones (e.g. Planck's maxim — people holding onto older ideas die and a younger generation grows up accepting the new ideas). The "enlightenment rationality" interpretation of this is that individuals change their minds when confronted with rational argument and evidence, but there is little evidence this occurs in practice (sure, it sometimes does).

[2] In logical if-then form "it takes a model to beat a model" is if you reject a model, then you have a better model.

Tuesday, December 24, 2019

Random odds and ends from December

I thought I'd put together a collection of some of the dynamic information equilibrium models (DIEMs) that only went out as tweets over the past couple weeks.

I looked at life expectancy in the US and UK (for all these, click to enlarge):


The US graph appears to show discrete shocks for antibiotics in the 40s & 50s, seatbelts in the 70s, airbags in the 90s & 2000s along with a negative shock for the opioid crisis. At least those are my best guesses! In the UK, there's the English Civil War (~ 1650s) and the British agricultural revolution (late 1700s). Again — my best guess.

Another long term data series is share prices in the UK:


Riffing on a tweet from Sri Thiruvadanthai I made this DIEM for truck tonnage data — it shows the two phases of the Great Recession in the US (housing bubble bursting and the financial crisis):


There's also PCE and PI (personal consumption expenditures and personal income). What's interesting is that the TCJA shows up in PCE but not PI — though that's likely due to the latter being a noisier series.


Here's a zoom in on the past few years:


Bitcoin continues to be something well-described by a DIEM, but with so many shocks it's difficult to forecast with the model:


We basically fail the sparseness requirement necessary to resolve the different shocks — the logistic function stair-step fails to be an actual stair-step:


A way to think about this is that the slope of this time series is a bunch of Gaussians (the "shocks"). When they get too close to each other and overlap, it's hard to resolve the individual shocks.
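
Here is a minimal sketch of that resolution problem. It assumes each shock is a logistic step, whose slope is a bell-shaped (roughly Gaussian) bump, and simply counts how many separate peaks survive in the summed slope. The function names and all the shock sizes, centers, and widths are made up for illustration.

```python
import math


def logistic_step(t, size, center, width):
    """One shock: a logistic step of height `size` centered at `center`."""
    return size / (1.0 + math.exp(-(t - center) / width))


def shock_rate(t, shocks):
    """Slope of the summed steps: one bell-shaped bump per shock."""
    total = 0.0
    for size, center, width in shocks:
        # Derivative of a logistic step, written in its sech^2 form.
        total += size / (4.0 * width * math.cosh((t - center) / (2.0 * width)) ** 2)
    return total


def count_peaks(shocks, t_min=0.0, t_max=20.0, steps=2000):
    """Count local maxima of the slope on an evenly spaced grid."""
    dt = (t_max - t_min) / steps
    ys = [shock_rate(t_min + i * dt, shocks) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])


# Two equal shocks given as (size, center, width); values are hypothetical.
well_separated = [(1.0, 5.0, 0.5), (1.0, 15.0, 0.5)]
overlapping = [(1.0, 9.7, 0.5), (1.0, 10.3, 0.5)]

print(count_peaks(well_separated))  # two distinct bumps
print(count_peaks(overlapping))     # the bumps merge into one
```

When the centers are far apart relative to the widths, the stair-step has two clean steps and the slope has two peaks. When they are close, the two bumps merge into a single peak and the data can no longer tell one big shock from two small ones.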

That's all for now, but I might update this with additional graphs as I make them — I'm in the process of a terrible cold and distracting myself with fitting the various time series I come across.

Saturday, December 14, 2019

Dynamic equilibrium: consumer sentiment

I looked at the University of Michigan's consumer sentiment index for signs of dynamic information equilibrium, and it turns out to be generally well described by it in the region for which we have monthly data [1].


The gray dashed lines are the dynamic equilibria. The beige bands are the NBER recessions, while the gray bands are the shocks to consumer sentiment. There might be an additional shock in ~ 2015 (the economic mini-boom) but the data is too noisy to clearly estimate it.

Overall, this has basically the same structure as the unemployment rate — and in fact the two models can be (roughly) transformed onto each other:



The lag is 1.20 y fitting CS to U and −1.24 y fitting U to CS, meaning that shocks to sentiment lead shocks to unemployment by about 14-15 months. This makes it comparable to the (much noisier) conceptions metric.

Of course, this is not always true — in particular in the conceptions data the 1991 recession was a "surprise" and in the sentiment data the 2001 recession was a surprise. It's better to visualize this timing with an economic seismogram (that just takes those gray bands on the first graph and puts them on a timeline, colored red for "negative"/bad shocks and blue for "positive"/good shocks):


As always, click to enlarge.

Note that in this part of the data (and as we'll see, the rest of the data), CS seems to largely match up with the stock market. I've added in the impossibly thin shock in the S&P 500 data (along with a boom right before that looks a bit like the situation in early 2018) in October of 1987, the largest percentage drop in the S&P 500 on record ("Black Monday", a loss of ~ 20%). Previously, I'd left that shock out because it's actually very close to being within the noise (it's a positive and a negative shock that are really close together, so it's difficult to resolve and looks like a random blip).

If we subtract out the dynamic equilibrium for consumer sentiment and the S&P 500, and then scale and shift the latter, we can pretty much match them except for the period between the mid 70s and the late 90s:


Remarkably, that period is also when a lot of other stuff was weird, and it matches up with women entering the workforce. It does mean that we could just drop down the shocks from the S&P 500 prior to 1975 into the consumer sentiment bar in the economic seismogram above.

I don't know if anyone has looked at this specific correlation before over this time scale — I haven't seen it, and was a bit surprised at exactly how well it worked!

...

Update 22 December 2019

Noah Smith tweeted a bunch of time series of surveys, so I took the opportunity to see how well the DIEM worked. Interestingly, there may be signs of running into a boundary (either the 100% hard limit, or something more behavioral — such as the 27% 'crazification factor'). Click to enlarge as always. First, the Gallup poll asking whether now is a good time to get a quality job:


And here is the poll result for the question about the economy being the most important issue in the US:


Both of these series are highly correlated with economic measures — the former with the JOLTS job openings rate (JOR), the latter with the unemployment rate:

 

...

Footnotes:

[1] Since many shocks — especially for recessions & the business cycle — have durations on the order of a few months, if the data is not resolved at monthly or quarterly frequency then the shocks can be extremely ambiguous. As shown later in the post (the S&P 500 correlation), we can look at some of the other lower resolution data as well.

Sunday, December 8, 2019

Unemployment in France (and Germany)

I thought I'd look into the unemployment rate in France using dynamic information equilibrium after seeing a tweet from Manu Saadia. Originally, this appeared as a Twitter thread, but I've expanded it into a blog post. Manu tells the story ...
The main economic problem of France is endemic, mass unemployment. It has been going on since I was born, in the early 70s. Left and Right governments have come and gone, reformed this and reformed that but mass unemployment has remained.
And that story is pretty much what the data says:


We have a series of non-equilibrium shocks that could easily be considered one long continuous shock from the late 60s until the 80s. Politically, this was under French Presidents de Gaulle, Pompidou, and Giscard, coming to an end under Mitterrand. This set the stage for the persistently high unemployment rate.

The unemployment rate does not come down as fast in France as it does in the US: the dynamic equilibrium is about d/dt log U = −0.05/y in France versus −0.08/y in the US, −0.09/y in Japan, or −0.07/y in Australia. A 10% unemployment rate will come down nearly a full percentage point in the US or Japan in a year on average in equilibrium, but only half a point in France.
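
As a quick check on those numbers, a constant dynamic equilibrium rate d/dt log U = r integrates to U(t) = U0 exp(r t). The short sketch below applies that to a 10% starting rate using the rates quoted above; the function name is just an illustrative choice, and it assumes no shocks occur during the year.

```python
import math


def unemployment_after(u0, rate_per_year, years=1.0):
    """Solve d/dt log U = rate, which gives U(t) = U0 * exp(rate * t)."""
    return u0 * math.exp(rate_per_year * years)


# Dynamic equilibrium rates (per year) as quoted in the text.
rates = {"France": -0.05, "US": -0.08, "Japan": -0.09, "Australia": -0.07}

for country, rate in rates.items():
    u1 = unemployment_after(10.0, rate)
    print(f"{country}: 10.0% -> {u1:.2f}% (drop of {10.0 - u1:.2f} points)")
```

The one-year drop comes out at roughly 0.5 points for France, 0.8 for the US, 0.9 for Japan, and 0.7 for Australia.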

France also experienced the double dip that the entire EU experienced in the global financial crisis. Without that double-dip, unemployment in France would be closer to 5% today (assuming the dynamic equilibrium model is correct, of course). Adding a shock in 2000 in France didn't improve the metrics much. It's likely a genuine shock (like in the broader EU), but it seems a borderline case in the data.


Well, that double dip was not exactly experienced by the entire EU ...

Germany doesn't really experience the global financial crisis except as a bit of "overshooting" in a recession that starts in the early 2000s and additionally has no subsequent 2012 recession.


Germany turns out to be a counterexample to the claim, made by Lars Christensen, that −0.05/y is representative of a structural problem unique to France:
And there you have the answer: THERE is a major STRUCTURAL problem in France - otherwise wages would adjust faster to shocks. This combined with the lack of a proper monetary policy is the cause of France's unemployment problem.
Germany has a similarly low 'matching' dynamic equilibrium rate on the order of −0.05/y. France is actually a bit better at −0.054/y compared to Germany at −0.049/y; however, we should be careful about reading too much into what is likely unrealistic precision. And Spain's matching rate appears to be closer to −0.12/y, making it the "best" managed country of the three on this metric.

The main policy failure (if there is one) is to be found in the shocks (or single big shock) to the French economy in the 70s that raised unemployment to the higher level. This is similar to the "path dependence" in the unemployment rate for black people in the US compared to white. The shocks and matching rate/dynamic equilibrium are almost identical; it's just that the black unemployment rate was at a higher level sometime before the 1970s (Jim Crow & general racism) and so, experiencing the same shocks to the same economy, has remained higher ever since.

Germany experienced a lesser version of those shocks to unemployment in the 60s and 70s, and it lacked a second shock in 2012 in the global financial crisis, putting it in a slightly better position today.

It's a possibility that Christensen may be right about France's lack of an independent monetary policy, with Eurozone policy set just right for Germany but too tight for France, leading to a "double dip" and 2.5 percentage points higher unemployment. But as with Spain having the best labor market when judging by the dynamic equilibrium, it becomes pretty weird pretty quickly to make this "double dip" story work.

In addition to Germany, monetary policy must have been just right for Estonia, Greece, and Ireland by this "lack of a double dip" metric. In addition to France, monetary policy was also too tight for the Netherlands, Spain, Italy, Portugal, Finland, Slovenia, Luxembourg, and Austria. Again, that's if we use this "double dip" metric. Turkey and Australia also experience a negative shock at the same time despite not being on the Euro.

A more likely explanation is much simpler — a huge surge in the price for oil in 2011 (in part due to the Arab Spring uprisings):


In fact, the oil shocks of the 70s are blamed for the economic malaise in France and the end of the Trente glorieuses. Not every country has the same exposure to commodities prices — for example, unemployment in the US continued on its downward path unabated.

A country's unemployment history could also be caused by the oldest factor on record — just a bit of bad luck. For example, the US could have been in a similar state in at least one of these Monte Carlo unemployment rate histories:



Saturday, December 7, 2019

Money velocity, interest rates, and ... robots?

I'd been ruminating on a question from commenter Anti on my end of the year wrap-up post:
I can't get past the idea that monetarism is legitimate, but you seem to have a point about women entering the workforce. How likely is it that such demographic changes change money velocity and that central banks seem to take a long time realizing such changes occur? Perhaps the surge in working women increased velocity in the 70s, making it easier to spur inflation, and we've seen the trend reverse since.
As a recovering monetary model-curious person myself, and having looked at the correlations between money velocity and interest rates (like here for MZM, or money with zero maturity), I can agree that there is probably macro-relevant information in those relationships. In fact if you look at a long run interest rate series (like Moody's AAA corporate rate, which tracks the 10-year rate quite closely), it appears that money velocity and interest rates are basically measures of the same underlying thing [click to enlarge]:


That dynamic information equilibrium model (DIEM) for the AAA rate was the subject of a blog post from last year, and true to the information equilibrium relationship between rates and MZM velocity, velocity is well-described by a log-linear transformation of the rate model with different non-equilibrium shock parameters.

And if you look closely at those shocks, you can see that a) the shock to interest rates comes well before velocity, and b) the shock to interest rates is actually earlier than the demographic shock to the level of women in the labor force (which is closer to the velocity shock).

Did rising interest rates cause the demographic shock and subsequent inflation ... and everything else? It's kind of a neo-Fisherite view that was the rage in the econoblogosphere a couple years ago. But let's look at the causality in terms of an economic seismogram [click to enlarge]:


First, as a side note, this picture makes the view that spending on the Vietnam war was a factor on interest rates and the Great Inflation look even sillier — the shock to interest rates begins in the late 50s or early 60s ... well before the acceleration in the war.

But does this cause problems for the view that demographics (more specifically, labor force size) are a controlling factor in inflation? Did the rising AAA rate cause velocity to increase, which then caused women to enter the workforce and subsequently inflation to rise?

The issue with this view of the causality between monetary measures and inflation comes down to some of the same problems with the Vietnam war view — specifically:

  • Inflation reaches its peak about 3.5 years after Civilian Labor Force (CLF) growth does, and when CLF declines in the Great Recession, inflation reaches its nadir (“lowflation”) about 3.5 years later in 2013.
  • Those two changes are of comparable magnitude — the smaller CLF decline in the Great Recession results in a commensurately smaller decline in the price level.
  • There is no visible shock to the velocity of MZM or AAA interest rates in the opposite direction ~ 12.5 or 6.5 years before the lowflation shock for AAA and MZM, respectively. These would have to be in June of 2000 and June of 2006 (again, for AAA and MZM, respectively). In fact, at those moments those metrics are at relative highs compared to the DIEM path.

As I talk about both in my post on Granger causality and economic seismograms and in my book, due to the extremely limited nature of macroeconomic data (both in time series as well as in the number of macro-relevant events) we have to be extremely careful about claiming causality. The two shocks to the labor force are of different sizes and in opposite directions, which matches up with the two shocks of different sizes in opposite directions to inflation with almost exactly the same delay.

I step on the accelerator and the car speeds up in the next couple seconds; I pull my foot off and it slows down over the next couple seconds. This is the situation we like to see in terms of causality. We don't want "long and variable lags" as Milton Friedman put it — that's just scientific nonsense. If I step on a pedal with the car accelerating a second later, it's hard to justify causality when I let off that pedal and the car accelerates more a few minutes later or worse ... does nothing.

That's the case we have with interest rates and velocity. That does not mean the shocks are unrelated to inflation; lowflation could have been caused by some other factor X. However, we'd then be comparing the simple causal model with a fixed lag dt between CLF and CPI, CPI(t) = a log CLF(t + dt) + b, to a model CPI = f(AAA, X) where we neither know what X is nor have any other non-equilibrium data with shocks to figure it out. Occam comes to our rescue!
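
To make the fixed-lag idea concrete, here is a rough sketch that scans candidate lags and fits a and b by least squares for each one. It is not the fitting procedure used for the graphs in this post: the function names, the lag convention (the response at sample t depends on the driver `lag` samples earlier), and the synthetic series are all assumptions for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_fixed_lag(driver, response, max_lag):
    """Scan lags (in samples) for response[t] = a * log(driver[t - lag]) + b."""
    best = None
    for lag in range(max_lag + 1):
        xs = [math.log(v) for v in driver[: len(driver) - lag]]
        ys = response[lag:]
        a, b, sse = fit_line(xs, ys)
        mse = sse / len(xs)  # compare lags on a per-sample basis
        if best is None or mse < best[3]:
            best = (lag, a, b, mse)
    return best


# Synthetic monthly "driver" with one logistic shock, and a "response"
# built from it with a known 42-month (3.5 year) delay. Purely made up.
driver = [
    100.0 * math.exp(0.001 * t) * (1.0 + 0.3 / (1.0 + math.exp(-(t - 100) / 10.0)))
    for t in range(240)
]
true_lag = 42
response = [2.0 * math.log(driver[max(t - true_lag, 0)]) + 1.0 for t in range(240)]

lag, a, b, mse = best_fixed_lag(driver, response, max_lag=60)
print(lag, round(a, 3), round(b, 3))
```

With a single clean shock the scan recovers the delay it was built with. The hard part with real macro data is that there are so few shocks to check the lag against, which is why a second shock with the same delay matters so much.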

That's one of the major things I tried to get a handle on in my book: causality. Causality is hard, especially in time series [1]. I made sure that I wasn't just relying on a single shock (any two DIEMs with a single shock of comparable width can be transformed into each other), and frequently went to other historical data to make sure any claims I was making bore out. For example, did you know that male paper authors started citing themselves much more in the years after women started working in greater numbers?

...

So why do we have this shock to monetary observables in the 50s and 60s? What is it about?

My guess is that it's about something entirely different: capital stock. There is a surge in capital stock around the same time period as the surge in interest rates. Post-war industrial automation resulted in a lot of investment in lots of new equipment to manufacture consumer goods, including the first robots. As I show on the economic seismogram, GM's first robot (that would show up on the Tonight Show a couple years later) is right at the leading edge of that shock to corporate AAA interest rates.

That's just my guess — but it's where my intuition would take me.

...

Addendum 10 December 2019

Another interesting thing we can add to this timeline are the shocks to the S&P 500 (treated as one extended shock) in the 60s and 70s:


Interest rates and velocity go up, and the S&P 500 comes down (link to more on the S&P 500 shocks).

...

Footnotes:

[1] This is not a serious statement, but rather a play on "Prediction is hard, especially about the future." This being the internet, someone almost certainly would have read that straight.

Monday, December 2, 2019

Information Transfer Economics: Year in Review 2019

It's the Information Transfer Economics Year in Review for 2019!

It's my annual meta-post where I try in vain to understand exactly how social media works. But most of all, it's a way to say thank you to everyone for reading. Perhaps there's a post that you missed. Personally, I'd forgotten that one of the top five below was written this year.

As the years go by (now well into the 7th year of the blog), the blog's name seems to be more and more of a relic. I do find it a helpful reminder of where I started each time I open up an editor or do a site search. Nowadays, I seem to talk much more about "dynamic information equilibrium" than "information transfer". In the general context, the former is a kind of subset of the latter:


All of the aspects have applications, it's just that the DIEM for the labor market measures a) gives different results from traditional econ, b) outperforms traditional econ models, and c) has been remarkably accurate for nearly the past three years.

Thanks to your help, I made it to 1000 followers on Twitter this year! It seems the days of RSS feeds are behind us (I for one am sad about this) and the way most people see the blog is through links on Twitter or Facebook. Speaking of which, the most shared article on social media (per Feedly) was this one:

Most shared
The post notes an interesting empirical correlation between the fluctuations in the JOLTS job openings rate (and even other JOLTS measures) around the dynamic equilibrium (i.e. mean log-linear path) with the fluctuations in the S&P 500 around the dynamic equilibrium. It's a kind of 2nd order effect beyond the 1st order DIEM description.
Feedly's algorithm for determining shares is strange, however. I'm not sure what counts as a share (since it's not tweets/retweets). Adding to the confusion as to what a share means, it didn't make the top 5 in terms of page views (per Blogger). Like most years, the top posts are mostly criticism. Those were:

Top 5 posts of the year

#1: MMT = Keynes + Monetary kookiness 
I wrote this soon after Doug Henwood's Jacobin piece that Noah Smith recently re-tweeted. For me, the whole "MMT" thing is not really theory because it doesn't produce any models with any kind of empirical accuracy. I actually have a long thread I'm still building where I'm reading the first few chapters of Mitchell and Wray's MMT macro textbook. Their entire approach to empirical science is misguided; it'd pretty much have to be because otherwise the MMT would've been discarded long ago. It's also politically misguided in the sense that it does not understand US politics. And as Doug Henwood points out, the US is probably the only country that meets MMT's criteria of being a sovereign nation issuing its own currency because of the role of the US dollar in the world. But this blog post points out another way MMT bothers me: it's just weird. MMT acolytes talk about national accounting identities like how socially stunted gamers talk about their waifu.
#2: Resolving the Cambridge capital controversy with MaxEnt 
This started out as a tongue in cheek sequel to my earlier post "Resolving the Cambridge capital controversy with abstract algebra". Here I showed that the re-switching argument that eventually convinced Paul Samuelson that Joan Robinson was right turns out to have a giant hole in it if your economy is bigger than, say, two firms. This sucked me into a massive argument on Twitter about Cobb-Douglas production functions where people brought up Anwar Shaikh's "Humbug" production function, which I found to be a serious case of academic dishonesty.
#3: JOLTS day: January 2019 
No idea why this became so popular, but it was an update of the JOLTS data. It turns out the "prediction" was likely wrong (and even if it turns out there is a recession in the next year, it would still be right for the wrong reasons). I go into detail about what I learned from that failed prediction in this post.  
#4: Milton Friedman's Thermostat, redux 
This is one of my fun (as in fun to write) "Socratic dialogs" where I try to explain why Milton Friedman's thermostat argument is actually just question begging. 
#5: Market updates, Fair's model, and Sahm's rule 
This is another post that consists mostly of updates (including the inaccurate model from Ray Fair, who is possibly more well known for his inaccurate models of US presidential elections). But it's also where I talk about Claudia Sahm's "rule" that was designed to be a way for automatic stabilizers to kick in in a more timely fashion based on the unemployment rate. There's a direct connection between her economic implementation of a CFAR detector (a threshold above a local average) and my (simpler) dynamic equilibrium threshold recession detector.
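
The "threshold above a local average" idea can be sketched in a few lines. The version below follows the commonly stated form of the Sahm rule (the three-month average unemployment rate sitting at least 0.5 points above its low over the previous 12 months). The function name, the defaults, and the toy series are illustrative assumptions, and this is not the dynamic equilibrium detector mentioned above.

```python
def sahm_style_signal(unemployment, window=3, lookback=12, threshold=0.5):
    """Flag months where the short moving average of the unemployment rate
    is `threshold` points or more above its low over the prior `lookback` months."""
    # Trailing moving averages; averages[k] ends at month k + window - 1.
    averages = [
        sum(unemployment[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(unemployment))
    ]
    signals = []
    for j in range(lookback, len(averages)):
        recent_low = min(averages[j - lookback : j])
        signals.append(averages[j] - recent_low >= threshold)
    return signals


# Toy monthly series (percent): flat at 4.0, then a rapid rise.
rates = [4.0] * 20 + [4.2, 4.5, 4.9, 5.3]
flags = sahm_style_signal(rates)
print(flags)
```

On the toy series the flag stays off while the rate is flat and only turns on once the three-month average has climbed half a point above its recent low, which is the sense in which it acts like a threshold detector rather than a forecast.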

The top 3 of 2019 made it into the top 10 of all time, which had been relatively stable for the past couple years. Overall, I'm posting less (I've been exceedingly busy at my real job this past year), but it seems that the ones I do post are having more of an impact. Nothing will likely ever dislodge my 2016 post comparing "stock-flow consistency" to Kirchhoff's laws (in the sense that both are relatively contentless without additional models) with tens of thousands of pageviews for reasons that are still baffling to me.

New book!

I also wrote my second short book and released it in June — A Workers' History of the United States 1948-2020. As you can tell from the title, it's a direct response to Friedman and Schwartz's Monetary History and essentially says the popular narratives of the US post-war economy are basically all wrong. Inflation, unionization, and the housing bubble are manifestations of social phenomena — but especially sexism and racism. Check it out if you haven't already.


Thank you!

Thank you again to everyone for your interest in my decidedly non-mainstream approach to economics. Thank you for reading, commenting, and tweeting. I think the ideas have started to gain some recognition — a little bit more each year.

(Here are the 2018, 2017, and 2016 years in review.)