Tuesday, September 2, 2014

Which way does the information flow?

I've been asked several times how I know information flows from the demand to the supply, most recently by commenter Jamie and David Glasner. The easiest answer is that I don't really know, and have just assumed it for now. However, there are some plausible arguments that I will try to present clearly here. For concreteness, let's call I(D) the information sent (or received) by "the demand" and I(S) the information received (or sent) by "the supply".

This is going to be both technical and philosophical, but first let me start off with three points:
  1. The direction of information flow is largely irrelevant to most of the results of the information transfer model I've presented on this blog because I nearly always take I(D) = I(S); in that case the direction of information flow essentially comes down to a sign convention for which solutions of the differential equation one chooses in order to reproduce supply and demand diagrams.
  2. "Information" in the information transfer framework is not specific insider knowledge about corporate earnings, understanding of the IS-LM model, political intuition, theories of inflation or really anything people colloquially equate with the word information. Actually, information in information theory generally represents a lack of knowledge (a random sequence of numbers has more information than a predictable one). The entirety of I(D) would consist of a list of numbers of widgets purchased at various prices by different individuals. Now it is true that e.g. consumers' theories about inflation might determine the numbers of widgets they might buy at various prices, but as far as the market is concerned (and the information transfer framework), those theories are irrelevant once you have that list of numbers of widgets purchased at various prices. This also helps us understand point three ...
  3. The selection of I(D) or I(S) as the information source is not ideological. Nor does it mean that corporations are "dumber" than consumers or vice versa.
Now for cases where I(S) < I(D) -- or I(D) < I(S) -- it does make a difference which one is the information source; let me present a couple of arguments for how one should be able to tell the difference.

Argument from abstract physical processes

This argument is based on email discussions with Peter Fielitz and can be found in his paper with Guenter Borchardt [1]. What follows is my version of his argument and any errors are mine. The idea is that if a process variable (our D or S above) could theoretically generate an infinite amount of information, then it cannot be an information source -- the amount of information transferred must be finite, i.e. I(source) < ∞.

This seems to point towards demand being the information source since one could (theoretically) produce an infinite supply of widgets (or an effectively infinite supply of widgets relative to the number of consumers -- so many that no more will be bought or the price goes to zero), but a market with an infinite quantity demanded relative to the number of consumers is nonsensical.

Peter gives an analogy with an ideal gas where the internal energy E maps to the demand D and the volume V maps to the supply S. We could set up an experiment where V is effectively infinite (relative to the number of particles) but an experiment where E is infinite relative to the number of particles is nonsensical.

Peter also allows that if one changes the conditions (e.g. keeping one variable constant), it might change the status of which process variable is the source or the destination.

Argument from information loss

Peter Fielitz also had an interesting observation in our email exchange. Using the ideal gas analogy where E maps to D and V maps to S, he pointed out that the internal energy of an ideal gas is hard to measure directly. However one can achieve a very accurate estimate by measuring the pressure and volume and using the ideal gas law E ~ pV.

Now demand is hard to measure directly, but if one uses the analog of the ideal gas law D ~ PS where P is the market price, one should be able to get an accurate estimate of the demand. This is especially interesting because this is exactly how the government tries to measure aggregate demand (AD) or NGDP. Either the statisticians tally up how much everyone made from selling all their goods (income method) or tally up how much all the goods were purchased for (expenditure method), but the result in both cases is an estimate of aggregate demand.

This analogy also points to demand being the information source so that I(S) ≤ I(D): a tally based on aggregate supply AS will at best give us aggregate demand. The estimate of NGDP is either below the true NGDP (in which case it is missing information or wrong) or it is above the true NGDP (in which case it is wrong). Overestimates of NGDP are always wrong; underestimates are missing information (but can be wrong, too). Even the best estimate from the market will result in NGDP below potential NGDP.

This was part of my original intuition behind why demand is typically the information source in trying to describe economic data -- the total sum of widgets sold (along with their prices) is at best a lower bound for the maximum possible number of widgets sold at various prices.

Let's imagine the demand as a probability distribution P(i, j, k) where i represents a given consumer, j represents a given number of widgets and k represents a given price, and Q(i, j, k) is the probability distribution of desired sales. You could imagine P as the number of people that will buy widgets (at a given price point) in Seattle vs Tacoma and Q as how many widgets at what prices suppliers put in stores in Seattle vs Tacoma.

It seems fairly obvious to me that the market mechanism is trying to figure out P and suppliers with incorrect Q's will lose money (or not make as much) relative to those with correct Q's. I personally don't care how many pounds of bacon the bacon industry is trying to sell in which markets (i.e. Q) -- and I am spending zero effort to find out.

As pointed out by commenter Jamie, firms will produce advertising to influence P. However, advertising does not travel through the price mechanism, but rather through human communication systems -- advertising does not represent information flowing from I(S) to I(D) or vice versa. Additionally, firms will spend money on market research in order to figure out P -- they are essentially bypassing the market estimate Q. This could represent an individual firm's lack of information about P (because they have competition and so don't know the entire market estimate Q) or the fact that the full market estimate Q is also imperfect ... giving further evidence that I(S|Q) ≤ I(D|P). [The notation I(x|y) means the information in the process variable x given the probability distribution y.]

The probability distribution Q as an incorrect estimate of the distribution P represents information loss calculated via the Kullback-Leibler divergence D(P||Q). Now in general 0 ≤ D(P||Q), so an incorrect estimate Q of the distribution P always represents information loss. This is relevant to e.g. this post on Walras' law where information loss could represent excess demand or excess supply at a given price -- the distribution in either case is wrong, and only when there is no excess supply or demand is P = Q.

In this view, the market appears to be a mechanism that attempts to find the best available Q to minimize D(P||Q) given potential constraints, but since D(P||Q) is positive semi-definite (it is never negative, and is zero only when P = Q), we generally have I(S|Q) ≤ I(D|P).
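As a rough illustration of the information loss argument, here is a minimal Python sketch that computes D(P||Q) for a made-up two-city demand distribution P and a few candidate supplier estimates Q. The numbers, the dictionary of guesses and the function name `kl_divergence` are assumptions invented for this example; they are not data or code from the model.

```python
import math

def kl_divergence(p, q):
    """D(P||Q) in bits for two discrete distributions given as equal-length lists."""
    # Terms with p_i = 0 contribute nothing by the usual convention.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical "true" demand shares for (Seattle, Tacoma); illustrative only.
p = [0.7, 0.3]

# Hypothetical supplier estimates Q of where the widgets should go.
q_guesses = {
    "wrong": [0.5, 0.5],
    "closer": [0.65, 0.35],
    "exact": [0.7, 0.3],
}

losses = {name: kl_divergence(p, q) for name, q in q_guesses.items()}
for name, loss in losses.items():
    print(f"{name}: D(P||Q) = {loss:.4f} bits")
```

The loss shrinks as Q approaches P and vanishes only when the two distributions agree, which is the sense in which an imperfect Q always represents lost information.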

Summary

While it's not set in stone (feel free to point out errors in comments!), I think these are some pretty plausible arguments for why we can assume information flows from demand to supply.

The other thing to keep in mind is point 1 at the top of this post: for most of this blog I assume I(D) = I(S), so the direction of flow doesn't really matter.

[1] P. Fielitz and G. Borchardt, Physics Essays 24 (2011) 350.

Saturday, August 30, 2014

Walras' law, information theory edition


Nick Rowe has a new post up and it inspired me to take up his challenge (entering as a non-economist). Rowe is probably one of the best economist bloggers out there if you want to get more technical than the typical post from Scott Sumner or Paul Krugman. His question is this:
Q. Assume an economy where there are (say) 7 markets. Suppose 6 of those markets are in equilibrium (with quantity demanded equal to quantity supplied). Is it necessarily true that the 7th market must also be in equilibrium (with quantity demanded equal to quantity supplied)?
I've looked at Walras' law before (e.g. this post). I'm going to answer this using information theory with progressively more complexities, but I'll start with some notation.

Define $I(D_{k})$ to be the source (demand) information in the $k^{\text{th}}$ market and $I(S_{k})$ to be the received information (supply). Define aggregate source information (aggregate demand, AD) and aggregate received information (aggregate supply, AS) as

$$
I(AD) =  I(\sum_{k} D_{k}) \;\;\text{and}\;\; I(AS) =  I(\sum_{k} S_{k})
$$

If the information in each market is independent, this becomes:

$$
I(AD) = \sum_{k}  I(D_{k}) \;\;\text{and}\;\; I(AS) = \sum_{k} I(S_{k})
$$

And lastly, define excess information in the $k^{\text{th}}$ market as

$$
\Delta I_{k} \equiv I(D_{k}) - I(S_{k})
$$

Rowe's question becomes

$$
\text{If } \Delta I_{k = 1 .. 6} = 0 \text{ then what is } \Delta I_{7} \text{ ?}
$$

First is the "Walras' law is correct" version [1] ...

We assume that the information in each market is independent and that $I(AD) = I(AS)$, so that

$$
0 = I(AD) - I(AS) = \sum_{k} I(D_{k}) - \sum_{k} I(S_{k}) = \sum_{k} \Delta I_{k}
$$

rearranging the terms

$$
0 = \Delta I_{7} + \sum_{k = 1}^{6} \Delta I_{k} = \Delta I_{7} + 0
$$

Therefore, $\Delta I_{7} = 0$.

Now the thing is that all we can really say is that $I(AS) \leq I(AD)$ (the market doesn't necessarily transfer all the information), so that brings us to the non-ideal information transfer version [2] ...

We assume that the information in each market is independent and that $I(AS) \leq I(AD)$, so that

$$
0 \leq I(AD) - I(AS) = \sum_{k} I(D_{k}) - \sum_{k} I(S_{k}) = \sum_{k} \Delta I_{k}
$$

rearranging the terms

$$
0 \leq \Delta I_{7} + \sum_{k = 1}^{6} \Delta I_{k} = \Delta I_{7} + 0
$$

Therefore, $\Delta I_{7} \geq 0$.

That means Walras' law doesn't pin down that last market, and says that there can be excess demand. But it's even worse than that, which brings us to the non-independent (i.e. mutual) information version [3] ...

As I keep mentioning, we're assuming the information in each market is independent, i.e.

$$
I(D_{j} + D_{k}) = I(D_{j}) + I(D_{k})
$$

But this isn't necessarily true and in general (e.g. Shannon joint entropy)

$$
\text{(1) } I(D_{j} + D_{k}) \leq I(D_{j}) + I(D_{k})
$$

This says for practical purposes that some of the information in the source in one market may be the same as the information in the source in another, hence they do not necessarily add to yield more information. So that all we really know is that

$$
I(\sum_{k = 1}^{6} D_{k}) \geq I(\sum_{k = 1}^{6} S_{k})
$$

based on the fact that you can't get more information out than you put in. This means that knowing the six markets clear doesn't necessarily even tell us about the aggregate demand of the 6 markets (ignoring the seventh).
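To make the sub-additivity in equation (1) concrete, here is a small Python sketch using Shannon entropy on a made-up sample of correlated "high"/"low" demand signals in two markets. The sample, the variable names and the `entropy` helper are assumptions for illustration only, and the sketch reads $I(D_{j} + D_{k})$ as a joint entropy, as suggested in the text.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)

# Hypothetical paired observations (market j signal, market k signal).
# The two markets mostly move together, so they share information.
pairs = (
    [("hi", "hi")] * 4
    + [("lo", "lo")] * 4
    + [("hi", "lo")] * 1
    + [("lo", "hi")] * 1
)

h_joint = entropy(Counter(pairs))               # information in both markets taken together
h_j = entropy(Counter(a for a, _ in pairs))     # information in market j alone
h_k = entropy(Counter(b for _, b in pairs))     # information in market k alone

print(f"joint = {h_joint:.3f} bits, sum of parts = {h_j + h_k:.3f} bits")
```

Because the two signals are correlated, the joint entropy comes out smaller than the sum of the individual entropies; the two would be equal only if the markets were independent.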

Nick Rowe basically arrives at this last version -- he says there can be excess demands/supplies of money in each of the six markets so Walras' law can't really tell us anything about the seventh. The information theory argument presented here does not require money, which is consistent with Rowe. He says that the same result could hold in a barter economy because some good could effectively operate as money and there would be excess demands for various barter goods in each of the individual markets. Rowe says that:
Walras' Law is true and useful for the economy as a whole only if there is only one market in the whole economy, where all goods are traded for all goods.
This appears to be saying that if you can't decompose $AD = D_{1} + D_{2} + \cdots$ (or the decomposition is trivial), then you get Walras' law back -- and it's true. If you can't decompose the markets, then there are no "joint entropies" that can be formed from their decomposition, so there is no information loss in equation (1) above. This doesn't rule out non-ideal information transfer in version [2] above, but assuming markets work, saying you can't decompose the markets (or the decomposition is trivial) gets you back to version [1] where Walras' law holds.

So is Nick's post essentially re-deriving the sub-additivity of joint entropy?


Was the Fed's quantitative easing serious overkill?

In an earlier post I tried to make a play for the null hypothesis in saying that David Beckworth's claim that the Fed is achieving its inflation target is hard to justify since the currency component of the monetary base (M0) seems to describe inflation over the past 50 years -- and thus the Quantitative Easing (QE) (or large scale asset purchases LSAPs) appearing in the monetary base (MB) is irrelevant. Tom Brown asks a great question in a comment: did the rounds of QE/LSAPs cause both MB and M0 to go up?

Here is a graph of the monetary base -- reserves in dotted red, currency in solid blue:


The rounds of QE appear as vertical lines. There is a hint that QE1 may have caused a jump in M0, but little evidence that subsequent rounds did anything. It will help to look at this data in another way. Here are the (logarithmic) derivatives of the data, scaled to the maximum value between 2007 and 2014:


The rise in M0 coinciding with QE1 jumps right out in this graph, but the other two rounds show little (obvious) impact. This becomes even clearer if we look at the data in yet another way:


In this graph, I show the blue line as the x-axis and the red line as the y-axis. If changes in MB reserves affected M0, then the dots should all appear along the line y = x. For QE1 (shown as blue dots) this is a reasonable model (ok, reasonable is a stretch given the data -- let's try plausible). For QE2 (red dots) and QE3 (green dots), the data seem to fall along the line y = constant with the rest of the data (gray dots) -- again, implying zero influence.

Interpretation?

It is possible that QE1 helped cause M0 to rise, but subsequent rounds of QE didn't do much of anything (i.e. maybe QE1 already did as much as could be done). Imagine M0 as a partially filled glass of water. QE1 filled it up; QE2 and 3 simply sloshed over the side. This view seems to be somewhat supported by data.

Before 2008, M0 (and MB) started to fall below trend. The crisis hit and the first round of QE starts sending M0 back to the trend. Subsequent rounds of QE do less because M0 is closer to trend (in a sense, this is a model where the impact of MB is proportional to the difference between M0 and the trend). Here is a graph that illustrates this point (M0 in blue, the pre-crisis trend of M0 is dashed blue and MB -- M0 including reserves -- is red):


The remaining slow return to trend may have more to do with waiting for NGDP and unemployment to return to normal than QE2 and QE3. This brings up the question: was QE1 enough? Actually, that question might not be strong enough: was QE1 serious overkill? Did we only need, say, 300 billion dollars worth of QE1 rather than 3 trillion over the course of three rounds?

The data is too limited at this point to draw solid conclusions. QE1 seems to have been concurrent with a rise in M0 (causality is difficult to determine without a model -- perhaps QE lowered interest rates and caused output to increase via the IS-LM model, which caused M0 to go up?), but there is no evidence that the other rounds of QE did anything to influence inflation in the model where M0 determines inflation. (Again, maybe MB lowered interest rates and caused output to increase through the IS-LM mechanism, causing inflation to increase.)

Thursday, August 28, 2014

Improved estimate of pre-Depression currency in circulation

I found this interesting data set from FRED on the currency in circulation in the US from 1875 to 1914 [1], which allows me to improve the estimate I used here to look at the pre-Great Depression trend. Here is the updated graph (the spike in currency in WWI was much sharper than previously shown):


[1] Other Fed data on currency goes back to 1918, leaving 1914-1918 still undetermined, hence it's still an estimate in the region shown in the graph.

Smooth move

Sometimes you make interesting mistakes. I wanted to address David Beckworth's claim that the Fed is hitting its inflation target where the evidence consists of looking where the core PCE inflation data is, defining that as the target, and saying therefore the Fed must be on target [1]. This involved me switching over from core CPI data to core PCE data. I also made the change from fitting the model to price level data to fitting to inflation data. That's not the interesting part.

In the process I accidentally over-smoothed the money supply and NGDP data (FYI, I normally don't do any smoothing at all) and found a pretty awesome result. This graph reveals, for the first time ever [2], trend inflation:


Here's the same result for CPI inflation:


For completeness, here is the PCE price level model fit and the error distribution for PCE inflation:


Pretty Gaussian!

Of course, these results are based on the monetary base minus reserves (aka currency in circulation, aka "M0"), which means the large scale asset purchases (LSAP) Beckworth claims are influencing core PCE inflation are irrelevant to describing PCE inflation [2]. In fact, the information transfer model explains the inflation trend even before the financial crisis (the ostensible onset of Beckworth's "corridor" of 1-2% core PCE inflation) [3].


[1] For many market monetarists, the Bayesian prior probability of the model that the central bank can achieve its target is P = 1, therefore whatever inflation is measured to be, that must be the target (or measurement error).

[2] Assuming the information transfer model is right :)

[3] I am calculating inflation by the instantaneous derivative of the logarithm (the local slope on a log scale), so it's a bit noisier than Beckworth's graph.
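As a sketch of what footnote [3] describes, the following Python snippet estimates inflation as the slope of the logarithm of a price level between consecutive samples. The synthetic price series, the sampling interval and the function name `log_derivative` are assumptions for illustration; this is a simple finite-difference stand-in, not the code used to make the graphs.

```python
import math

def log_derivative(times, levels):
    """Slope of log(level) between consecutive samples (finite-difference d/dt log P)."""
    return [
        (math.log(levels[i + 1]) - math.log(levels[i])) / (times[i + 1] - times[i])
        for i in range(len(levels) - 1)
    ]

# Hypothetical price level growing at a continuous 2% per year, sampled quarterly.
times = [0.25 * n for n in range(9)]                  # years
levels = [100.0 * math.exp(0.02 * t) for t in times]  # price index

rates = log_derivative(times, levels)
print([round(r, 6) for r in rates])
```

On real data each step inherits the noise of both neighboring samples, which is why a derivative taken this way looks noisier than a year-over-year series.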

Looking at the foundations of money

My copy of Money and the Early Greek Mind: Homer, Philosophy, Tragedy arrived the other day; I'll be looking for insights into the information transfer picture. In the meantime, here's a nice discussion. I like the way one of the properties of money is described there: "[money] facilitates transitive relations between objects".

This makes some sense of the evolution of money [what I say here isn't novel] -- one way to facilitate transitive relations between objects is to choose one of those objects and look at all of the relationships between every other object and that object. This happens when e.g. cigarettes become a form of currency in POW camps or prisons (or high school). Everything gains a price in cigarettes and the relative value of e.g. chocolate and bacon can be related to each other via cigarettes.

Money becomes money when it loses its own intrinsic value and becomes only valuable for its ability to facilitate these transitive relations -- a medium of communicating information.

Wednesday, August 27, 2014

Fisher's proto-information transfer economics

One of Irving Fisher's thesis advisors was Willard Gibbs (of thermodynamics fame, which I mention because of the connection between information theory and thermodynamics). Here's a link to his 1892 thesis; I was struck by how close some of the equations are to the information transfer model.

Fisher looks at the exchange of some number of gallons of $A$ for some number of bushels of $B$ and states: "the last increment $dB$ is exchanged at the same rate for $dA$ as $A$ was exchanged for $B$". Fisher writes this as an equation on page 5:

$$
\text{(1) } \frac{A}{B} = \frac{dA}{dB}
$$

The argument seems to have been introduced by both Jevons and Marshall. Of course it's generally false. Many goods exhibit economies of scale (i.e. buying in bulk) or other effects so that the last increments $dA$ and $dB$ are either cheaper or more expensive than the first increments. A somewhat less restrictive assumption is that if we scale the total amounts of $A$ and $B$, then the relationship between the rate at which $\alpha A$ is exchanged for $\alpha B$ and the rate at which the last increment $d(\alpha A) = \alpha dA$ is exchanged for the last increment $d(\alpha B) = \alpha dB$ is unchanged. This property is called homogeneity of degree zero, and you can think of it as what would happen if we doubled the price of everything along with how much money we make: i.e. nothing.

Equation (1) is not the most general equation consistent with homogeneity of degree zero; the most general one is

$$
\text{(2) } \frac{A}{B} = k \frac{dA}{dB}
$$

This is identical to the result from this argument and the basis for the information transfer model. What is actually equilibrating in the market is the information the market is moving around when $A$ is exchanged for $B$.
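
For reference, equation (2) can be integrated directly when $k$ is constant. The following is a short worked derivation; $A_{0}$ and $B_{0}$ are reference values introduced here only as integration constants (they do not appear in Fisher's text):

$$
\frac{dA}{A} = \frac{1}{k} \frac{dB}{B} \;\;\Rightarrow\;\; \log \frac{A}{A_{0}} = \frac{1}{k} \log \frac{B}{B_{0}} \;\;\Rightarrow\;\; A = A_{0} \left( \frac{B}{B_{0}} \right)^{1/k}
$$

Setting $k = 1$ gives the linear relationship $A \propto B$ that solves equation (1), and replacing $A \rightarrow \alpha A$ and $B \rightarrow \alpha B$ in equation (2) leaves it unchanged, which is the homogeneity of degree zero described above.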