Information Transfer Economics: Expectations destroy information

Thursday, May 1, 2014

Expectations destroy information

I have a beef with the use of expectations in economics. They seem to be so powerful that they are useless as an explanation; they also seem to lack empirical impact in the very places where they were originally designed to have one.

Here, I am going to describe the role expectations play in the economy using information theory -- it's not diving too deep, mostly just an argument using Shannon information and measuring information loss with the KL divergence.

The KL divergence measures the extra message length for a given amount of data that must be sent if you have a code optimal for the wrong distribution relative to the true distribution. In general, it represents the information loss (measured in nats or bits depending on which logarithm you use) by being wrong about a distribution relative to being right.
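For reference, the standard definition can be written out. The labels P (the true distribution) and Q (the wrong one the code was built for) are my notation for this aside; the post later calls these the actual (A) and expected (E) distributions. Terms with P_i = 0 are taken to contribute zero.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P_i \log\frac{P_i}{Q_i},
\qquad
\underbrace{-\sum_i P_i \log Q_i}_{\text{average length with the wrong code}}
= \underbrace{-\sum_i P_i \log P_i}_{\text{average length with the right code}}
+ D_{\mathrm{KL}}(P \,\|\, Q)
```

The second identity is the "extra message length" reading: the cost of coding for Q when the data follow P is the entropy of P plus the divergence.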

Relating this to information transfer economics, the KL divergence represents the extra information that must be sent in the market (lower information efficiency) given the wrong expected economic state distribution relative to the true future economic state distribution. I previously hypothesized that lower information efficiency was related to recessions.

In this post, I want to show that economic "expectations" generally destroy information unless they are right. I put together some simulations illustrating this point using a relatively simple model. Imagine there are 10 states the economy can be in (think a hidden semi-Markov model). There is a probability distribution that gives the chance it is in each of the states in the next time period. For example, if all states are equally likely, this distribution would be

0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1

i.e. a 10% chance it is in any of the ten states.

If the economy was going to be in state #2 with 100% probability, then it looks like this:

0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

We will compare two of these distributions, call them the actual (A) and the expected (E) distributions, with the KL divergence. Taking the second distribution above as A and the first as E, the KL divergence is

0.0 log(0.0/0.1) + 1.0 log(1.0/0.1) + 0.0 log(0.0/0.1) + ... + 0.0 log(0.0/0.1)

= 0 + 2.303 + 0 + 0 + ... + 0

= 2.303

That is, there is about 2.3 nats (3.3 bits) more information in the second distribution relative to the first. (Which makes sense: specifying a number from 1 to 10 -- in our case state #2 -- requires 3.3 bits.)
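The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not the code used for the post; the function name kl_divergence and its base argument are my own choices for illustration.

```python
import math

def kl_divergence(actual, expected, base=math.e):
    """D(actual || expected): extra message length from coding for `expected`."""
    total = 0.0
    for a, e in zip(actual, expected):
        if a == 0.0:
            continue  # 0 * log(0 / e) is taken as 0 (the limit of x log x as x -> 0)
        if e == 0.0:
            return math.inf  # the expectation rules out a state that can occur
        total += a * math.log(a / e, base)
    return total

uniform = [0.1] * 10                 # all ten states equally likely
certain = [0.0, 1.0] + [0.0] * 8     # state #2 with 100% probability

nats = kl_divergence(certain, uniform)
bits = kl_divergence(certain, uniform, base=2)
print(f"{nats:.3f} nats, {bits:.3f} bits")  # prints "2.303 nats, 3.322 bits"
```

Swapping the arguments gives an infinite divergence, since an expectation of "state #2 for certain" assigns zero probability to states that the uniform distribution allows.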

In the following, I actually show the negative of the KL divergence (partially because I accidentally did it that way and am too lazy to change it, and partially because it emphasizes that we are looking at information loss).

In the simulations below, I randomly generated 10,000 probability distributions on the 10 states and looked at the KL divergence. I did three cases: one, the expected distribution was a small perturbation from the actual distribution (E = A + δA); two, the expected distribution was the first distribution given above, with all states equally likely, aka the least informative prior (E = 0.1); and three, the expected distribution was another randomly generated distribution (E and A are uncorrelated). Here are the results:

We can see that if there aren't any big changes in the distribution (the small perturbation, blue), the information loss is minimal. If there are big changes (the new distribution is uncorrelated, red) then information loss is not only large, but has a long tail. Interestingly, the least informative prior (gray) is not only in the middle of these, but doesn't have much of a tail (there is a sharp cut-off). You'd imagine the blue histogram showing the typical case where past performance gives an indication of future performance, with only small deviations. The red histogram shows what it's like if you're wrong about the future.
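The three cases can be sketched in Python as follows. The post does not say how the random distributions or the perturbation were constructed, so two things here are assumptions of mine: distributions are normalized uniform draws, and the perturbation is modeled as a mixture E = (1 - size) * A + size * R with R an independent random distribution (so size = 1 is the uncorrelated case). The sketch prints the positive KL divergence rather than its negative.

```python
import math
import random

def kl_divergence(actual, expected):
    """D(actual || expected) in nats; terms with actual == 0 contribute 0."""
    return sum(a * math.log(a / e) for a, e in zip(actual, expected) if a > 0.0)

def random_distribution(rng, n=10):
    # Assumption: normalized uniform draws (the post does not say how it sampled).
    weights = [1.0 - rng.random() for _ in range(n)]  # each in (0, 1], never zero
    total = sum(weights)
    return [w / total for w in weights]

def mix(actual, other, size):
    # Assumption: "E = A + dA" modeled as E = (1 - size) * A + size * other.
    return [(1.0 - size) * a + size * o for a, o in zip(actual, other)]

def simulate(trials=10_000, n=10, size=0.1, seed=0):
    rng = random.Random(seed)
    uniform = [1.0 / n] * n  # least informative prior
    losses = {"perturbed": [], "uniform": [], "uncorrelated": []}
    for _ in range(trials):
        actual = random_distribution(rng, n)
        other = random_distribution(rng, n)  # independent of `actual`
        losses["perturbed"].append(kl_divergence(actual, mix(actual, other, size)))
        losses["uniform"].append(kl_divergence(actual, uniform))
        losses["uncorrelated"].append(kl_divergence(actual, other))
    return losses

losses = simulate()
for name, values in losses.items():
    mean = sum(values) / len(values)
    print(f"{name:>12}: mean {mean:.3f} nats, max {max(values):.3f} nats")
```

Whatever the sampling scheme, the loss against the uniform prior can never exceed log(10), about 2.3 nats, which is one way to see why that histogram has a sharp cut-off while the uncorrelated case has no such bound.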

What does this mean? Well, if expectations are accurate (people can accurately predict the future), then there is a limited amount of information loss. It may be reductio ad absurdum, but I'd say this pretty much proves that there is information loss in the market mechanism. Who can accurately predict the future? I'll take it a step further and say that expectations are the cause of that loss of information, and information loss appears to be a primary driver of recessions.

When people are especially bad at predicting the future, you not only get massive information loss on average, but there is a long tail of even greater loss (red histogram).

The average case for information loss, falling between the two extremes, is pretty well described by the least informative prior. This is potentially a reason why the information transfer model, which assumes this as a starting point, can do a good job with trends. This least informative prior is also effectively the efficient markets hypothesis in the real-world sense. There are trends and momentum, but price movements are unpredictable. They are not completely unpredictable; if they were, the information loss (red histogram) would be large. This least informative prior is also the assumption that expectations do not affect the long run trends.

Ah, but, you say, if you kind of know what you are talking about (you're able to predict the future), then you can do better than the least informative prior. That's why I have this next graph: it shows that the ability to do better than the least informative prior is very sensitive to how wrong you are -- you only need to be a little bit wrong for the tail risk to have a serious negative impact on your average performance. I plot the histograms for a 10% perturbation (used in the graph above) up to a 40% perturbation (in the last one, I show the 100% perturbation from the graph above in red):

The takeaway is that you basically always have to be right; otherwise the potential losses from being wrong will add up and cause you to lose more information than the least informative prior over time.
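A reader who wants to explore this sensitivity could sweep the perturbation size and compare each case against the least informative prior. The sketch below is hypothetical: it assumes normalized uniform draws for the distributions and a mixture E = (1 - size) * A + size * R for the perturbation, neither of which is specified in the post, so the numbers it prints need not match the histograms, and where (or whether) a perturbed expectation falls behind the uniform prior depends on those choices.

```python
import math
import random

def kl_divergence(actual, expected):
    """D(actual || expected) in nats; terms with actual == 0 contribute 0."""
    return sum(a * math.log(a / e) for a, e in zip(actual, expected) if a > 0.0)

def random_distribution(rng, n=10):
    # Assumption: normalized uniform draws, each weight in (0, 1].
    weights = [1.0 - rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def sweep(sizes=(0.1, 0.2, 0.3, 0.4, 1.0), trials=10_000, n=10, seed=0):
    rng = random.Random(seed)
    uniform = [1.0 / n] * n
    results = {size: [] for size in sizes}
    results["uniform"] = []
    for _ in range(trials):
        actual = random_distribution(rng, n)
        other = random_distribution(rng, n)  # independent of `actual`
        results["uniform"].append(kl_divergence(actual, uniform))
        for size in sizes:
            # Assumed perturbation model: mix the actual distribution with `other`.
            expected = [(1.0 - size) * a + size * o for a, o in zip(actual, other)]
            results[size].append(kl_divergence(actual, expected))
    return results

def summarize(values):
    """Return (mean, approximate 99th percentile) of a list of losses."""
    ordered = sorted(values)
    return sum(ordered) / len(ordered), ordered[int(0.99 * (len(ordered) - 1))]

results = sweep()
for label, values in results.items():
    mean, p99 = summarize(values)
    print(f"{str(label):>8}: mean {mean:.4f} nats, 99th percentile {p99:.4f} nats")
```

Every perturbation size is applied to the same pairs of distributions, so the rows differ only in how far the expectation has moved away from the actual distribution.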

Ah, but, you say, wouldn't an expectation based on the information transfer model give you at least some capability to predict the future -- contradicting your assertion that people are bad at predicting the future? Nope. The information transfer model is built on the least informative prior -- you are assuming maximum ignorance about the future, that is to say, exactly as much ignorance about the future as I am claiming people already possess.

14 comments:

Tom Brown, May 2, 2014 at 1:14 PM
Jason, you're back!... I will dig in later. In the meantime I had some of my own interchanges about expectations recently, which I've summarized here:

http://pragcap.com/forums/topic/expectations#post-64397
Tom Brown, May 3, 2014 at 10:05 AM
Jason, when you calculate the KL divergence of the two distributions above (call them D1 and D2) and the corresponding sets of 10 probabilities for each are say

D1: {P1(1), P1(2), .. P1(10)} = {0.1 ,0.1, ... 0.1}

and

D2: {P2(1), P2(2), ... P2(10)} = {0, 1, 0, 0, ... 0}

Then the KL divergence you calculate follows the following formula?

KL divergence = P2(1)*log(P2(1)/P1(1)) + P2(2)*log(P2(2)/P1(2)) + ... + P2(10)*log(P2(10)/P1(10))

I'm just trying to match up D1 and D2 with the numbers you plugged in there.

Do you say that is the KL divergence of D1 wrt D2?

Also, I guess I haven't thought about this before, but epsilon*log(epsilon) goes to 0 as epsilon goes to 0? L'Hopital's rule I guess?
Jason Smith, May 3, 2014 at 11:02 AM
I realized I wanted to link the phrase:

"information loss appears to be a primary driver of recessions"

to this post:

http://informationtransfereconomics.blogspot.com/2014/03/modeling-macroeconomic-fluctuations.html

But I already linked to it twice in the post, so that is probably good enough.
Tom Brown, May 3, 2014 at 11:21 AM
"It may be reductio ad absurdum, but I'd say this pretty much proves that there is information loss in the market mechanism."

Can you expand on this a bit? ... in terms of your example here, this even applies with the "maximally ignorant" expectation of your distribution A above (all 10 states assumed to have equal probability), correct? Hmmm... I don't know, where does the "market mechanism" come in here exactly? Thanks.
Tom Brown, May 3, 2014 at 11:34 AM
Jason, do you think it's possible to translate the concepts you introduce here into a Nick Rowe (or David Beckworth) style allegorical story?

I'm having a hard time imagining what the 10 market states might correspond to. I realize this is a super simplified example, and maybe there is not good correlation. A discrete distribution is probably not the best either for that purpose.

But say we had a very simple story... say interest rates could only take on discrete values, and only a finite number were possible. Then say... maybe some other variable... I don't know ... the "discount rate" (which I only learned about yesterday)... so say 5 interest rates and two possible discount rates. There's 10 possible states right there. Does that capture (conceptually) how one might go about relating your discrete states to states of the economy?
Tom Brown, May 3, 2014 at 2:07 PM
Jason, you feel like a challenge? I pointed your blog out to Mark A. Sadowski once. He seemed to already be aware of it and furthermore to have a generally favorable view of it (he seemed to regard you as being unlike some other physicists that dabbled in econ... he meant in a good way). However, when I asked you about your view of expectations I thought he'd have a problem with that, which was true:

http://www.themoneyillusion.com/?p=26552&cpage=2#comment-328760

I'm not at the level where I can understand what either you or Mark are talking about most of the time, but he seems to be pretty highly regarded as an empiricist. He regards himself as an empiricist, and although he won't label himself a Market Monetarist, he does spend a lot of time defending their views and I think he generally agrees with them. Did you happen to see the time that in a debate with Steve Randy Waldman (at interfluidity.com) Mark essentially "won" the debate, and Steve ended up crossing out at least one whole post (and maybe one other) in its entirety and putting red "BULLSHIIT" watermarks across all his own plots, with a brief explanation that he did so because Mark had convinced him he was wrong? That was amusing. Anyway, Mark's defense of MMism apparently includes their take on expectations (which is central to the MM story from what I can tell). I'd LOVE to see you guys debate the expectations issue at some point. I'd sit back with my bowl of popcorn and watch... trying to learn as much as possible (though I'm sure a lot would go over my head). Do you feel you have the data to back up your claims in such a match up with Mark? :D