## Tuesday, March 22, 2016

### Information equilibrium: a common language for multiple schools

Cameron Murray has a great post about the challenge of reforming economics in which he points out two challenges: social and technical. The social challenge is that different "schools" are tribal, and reconciliation isn't rewarded. Just read Murray on this.

The second challenge is something that I have tried to work towards answering:

> [H]ow do you teach a pluralist program when there is no recognised structure for presenting content from many schools of thought, which can often be contradictory, and when very few academics are themselves sufficiently trained to do so?
>
> ...
>
> What is needed is a way to structure the exploration of economic analysis by arranging around economic problems around some core domains. Approaches from various schools of thought can be brought into the analysis where appropriate, with the common ground and links between them highlighted.

Despite being completely out of the mainstream, the information equilibrium framework does not have to subscribe to a specific school of economic thought. I actually thought this is what you were supposed to mean by framework (other economists disagree and include model-specific results in what they call frameworks). In fact, I defined framework by something that is not model specific:

> One way to understand what a framework is is to ask whether the world could behave in a different way in your framework ... Can you build a market monetarist model in your framework? It doesn't have to be empirically accurate (the IT framework version is terrible), but you should be able to at least formulate it. If the answer is no, then you don't have a framework -- you have a set of priors.

This is what pushed me to try and formulate the MMT and Post-Keynesian (PK) models that use "Stock Flow Consistent" (SFC) analysis as information equilibrium models. The fact that I criticized an aspect of SFC analysis upset the MMT and PK tribes (see the post and comments), which led me to not end up posting the work I'd done.

But in the interest of completeness and showing that the information equilibrium framework allows you to talk about completely different schools of economics with the same language, let me show the model SIM from Godley and Lavoie as an information equilibrium model.

#### SFC models as an information equilibrium model

First, divide through by $Y$ (this represents an overall scale invariance), so all the variables below are fractions of total output (I didn't change the variable names, though, because it would get confusing).

Define the variable $B$ to be government spending minus taxes.

$$B \equiv G - T$$

Define $x$ to be a vector of consumption, the variable $B$, taxes, disposable income and high powered money:

$$x \equiv \begin{bmatrix} C \\ B \\ T \\ Y_{D} \\ H \end{bmatrix}$$

Define the matrix $A$ to be

$$A \equiv \begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & -\alpha_{1} & -\alpha_{2} \\ 0 & 0 & 1 & 1 & 0 \end{bmatrix}$$

Define the vector $b$ to be

$$b \equiv \begin{bmatrix} -1 \\ 0 \\ -\theta \\ 0 \\ -1 \end{bmatrix}$$

The SFC model SIM from Godley and Lavoie is then

$$A x + b = 0$$

$$H \rightleftarrows Y_{D}$$

with [Ed. note: I originally got my notes confused because I wrote $Y_{D}$ as $D$ through part of them and $B$ instead of the $Y_{D}$ I use here, so left off the following equation]

$$B \equiv \int_{\Gamma} dY_{D}$$

where the second equation is an information equilibrium relationship [and the third is a path integral; in the model SIM, they take $\Gamma$ to effectively be a time step]. The issue that I noticed (and that upset the SFC community) is that it's assumed that the information transfer index is 1 so that instead of:

$$H \sim Y_{D}^{k}$$

you just have

$$H \sim Y_{D}$$

and the velocity of high powered money is equal to 1. Also, there is no partial equilibrium -- only general equilibrium so you never have high powered money that isn't in correspondence with debt (or actually in the SFC model, exactly equal to debt).
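For reference, here is a sketch of where that exponent comes from, assuming the usual information equilibrium condition (it is not written out in this post) and an integration constant $c$ that I introduce only for illustration:

$$\frac{dH}{dY_{D}} = k \, \frac{H}{Y_{D}} \quad \Rightarrow \quad \log H = k \log Y_{D} + c \quad \Rightarrow \quad H \sim Y_{D}^{k}$$

Setting $k = 1$ makes $H$ proportional to $Y_{D}$, so the ratio $Y_{D}/H$ is a constant; the SIM assumption corresponds to taking that constant to be 1.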

Even with this assumption, however, the model can still be interpreted as an information equilibrium model. There is supply and demand for government debt that acts as money. This money is divided up to fund various measures e.g. consumption.
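As a sanity check on the linear part of the model, here is a minimal Python sketch that evaluates $A x + b$ at the steady state. The parameter values ($\alpha_{1} = 0.6$, $\alpha_{2} = 0.4$, $\theta = 0.2$) and the helper names `residual`, `solve_given_h` and `steady_state` are illustrative assumptions, not something defined above. The steady state is taken to be the point where $B = 0$, assuming (as in Godley and Lavoie's SIM) that $B = G - T$ is the per-period change in $H$.

```python
# Illustrative parameter values (assumptions, not taken from the post).
ALPHA1 = 0.6  # propensity to consume out of disposable income
ALPHA2 = 0.4  # propensity to consume out of high powered money
THETA = 0.2   # tax rate

# Matrix A and vector b as defined above; x = [C, B, T, Y_D, H],
# with every variable expressed as a fraction of output Y.
A = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, -1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, -ALPHA1, -ALPHA2],
    [0, 0, 1, 1, 0],
]
b = [-1, 0, -THETA, 0, -1]


def residual(x):
    """Return A x + b, which should vanish for a solution of the model."""
    return [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(A, b)
    ]


def solve_given_h(h):
    """Solve the linear system for C, B, T and Y_D at a given value of H."""
    t = THETA                       # row 3: T = theta
    y_d = 1 - t                     # row 5: T + Y_D = 1
    c = ALPHA1 * y_d + ALPHA2 * h   # row 4: consumption function
    big_b = y_d - c                 # row 2: C + B = Y_D
    return [c, big_b, t, y_d, h]


def steady_state():
    """Steady state under the assumption B = 0, which implies C = Y_D."""
    y_d = 1 - THETA
    h = (1 - ALPHA1) * y_d / ALPHA2
    return solve_given_h(h)


x_star = steady_state()
print("steady state [C, B, T, Y_D, H] =", x_star)
print("residual A x + b =", residual(x_star))
```

Row 2 of $A$ equals row 1 minus row 5, so the linear system alone leaves $H$ undetermined. That is why `solve_given_h` takes $H$ as an input: the linear transformation has to be closed with the one dynamical relationship between $H$ and $Y_{D}$.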

#### Market monetarism as an information equilibrium model

Over time, I have attempted to put the various models Scott Sumner writes down into the information equilibrium framework. The first three are described better here.

1) u : NGDP ⇄ W/H

The variable u is the unemployment rate. H is total hours worked and W is total nominal wages.

2) (W/H)/(PY/L) ⇄ u

PY is nominal output (P is the price level), L is the total number of people employed and u is the unemployment rate.

3) (1/P) : M/P ⇄ M

where M is the money supply. This may look a bit weird, but it could potentially work if Sumner didn't insist on an information transfer index k = 1 (if k is not 1, that opens the door to a liquidity trap, however). As it is, it predicts that the price level is constant in general equilibrium and unexpected growth shocks are deflationary in the short run.

The fourth one is described here.

4) V : NGDP ⇄ MB and i ⇄ V

where V is velocity, MB is the monetary base and i is the nominal interest rate. So that in general equilibrium we have:

V = k NGDP/MB

log i = α log V

Or more compactly:

log i = α log (NGDP/MB) + β
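Spelling out the substitution, with a constant $c$ that I add only for illustration (it is zero in the second equation as written above):

$$\log i = \alpha \log V + c = \alpha \log \left( k \, \frac{NGDP}{MB} \right) + c = \alpha \log \frac{NGDP}{MB} + \alpha \log k + c$$

On this reading, $\beta = \alpha \log k + c$, which collects the constants that do not depend on NGDP or MB.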

#### More models!

More mainstream Keynesian and other models all appear here or in my paper. Here's a model that is based on the Solow model. However, I think showing how the framework can illustrate both Market Monetarism and Post Keynesianism using the same tools gives an idea of how useful it is.

I can even put John Cochrane's asset pricing equation approach in the framework!

The interesting part is that it lays bare some assumptions (e.g. that the IS-LM model is an AD-AS model with low inflation).

And despite my protests, expectations can be included. It just involves looking at the model temporally rather than instantaneously.

1. Thanks Jason.

Before I respond, just know that I totally agree with your view that individual agents don't have to be rational for the aggregate to have the "rational" properties that mainstream economics derives from summing up rational agents. I read Becker's 1962 paper in 2014 and realised this is exactly what is going on. I now teach this way - economists saw some general patterns and codified them into supply and demand, but then went through a process of rationalising this same pattern at an individual level. Which was unnecessary, as you demonstrate.

But what I mean by framework is elevated to a higher level here. For example, you start by calling things Y and B etc. But what concept is Y? What unit? Is it a physical object? A property right? A monetary exchange value of an object at a certain point in time? Is it a measure of aggregate utility? This is where you need a higher-level framework, because once people know what concept the variables in different schools/methods mean, they are then able to see where they might be useful. This higher level model would be used to help provide clear translations from economic concepts to mathematical methods.

and I will put it on my blog next week.

1. Cheers, Cameron.

One thing I like about the Becker approach is that if agents clump up in that opportunity set (everyone wants to push consumption to period 2), you can get a breakdown of 'supply and demand' and (emergent) rationality. I don't think traditional econ identifies failure modes like that. I hope more people follow your lead! If starting that small change in teaching econ is all I end up contributing, that will be worth it!

I completely agree that there needs to be an even more fundamental definition of what the units of economic discourse are.

FWIW, in the IT framework, those variables represent probability distributions over some domain ... e.g. space, time. You have equilibrium when e.g. the spatial probability distribution of 'demand events' for X is equal to the spatial probability distribution of 'supply events' of X -- and there are a large number of units of X (so that the probability distribution comes close to being the actual distribution). In that case, the information required to construct both distributions is equal I(Pd(X)) = I(Ps(X)).

Since the whole thing simplifies if you talk about uniform distributions, you can think of e.g. NGDP as the total number of 'demand events' (measured in e.g. dollars). When uniformly distributed (not true, but works to leading order), the information in a string of 'demand events' drawn from that distribution is just proportional to the number of events ... I(Pd(D)) ~ NGDP.

However, since this definition is pretty malleable, it could easily fit distributions of property rights, accounts, etc. you discuss in your follow-up.

For example, I've used it for the distribution of price states in a time series. The distribution over nominal interest rate states is equal to the distribution over "the price of money", which is proportional to velocity in this very basic model.

The IT framework is just one example of a possible way to address this issue, but I agree that more needs to be done in this regard.

2. "The fact that I criticized an aspect of SFC analysis upset the MMT and PK tribes (see the post and comments) led me to not end up posting the work I'd done."

Don't let the critics put you off. The MMT and PKE people have been taking criticism for years and they are still going strong. Criticism and debate are necessary for the development and dissemination of knowledge. This is a great contribution of blogging.

1. Yes, I'm glad he posted it too. It's late, so I'll have to take a detailed look later though.

3. Regarding SFC models. With H⇄B, B is defined as over one period (B≡G−T), since G and T are wrt one period (rather than being integrated values). But H is an integrated value over all periods.

Also, I had a different thought, which I suppose isn't just SFC specific, and that is the idea of assuming the agents explore the whole space allowed them. In this case it would be letting alpha1 and alpha2 range over all possible values they are allowed... and then taking the "center of mass" of those results as the maximum entropy answer.

1. Also, it seems there is no "price" concept in SIM originally, true? Is that generally true of SFC models?

2. Back to B≡G−T ... eventually T=G (since its steady state gain is 1 no matter what the parameters are), so the steady state gain for B is 0, no matter what the parameters are. Again, interpreting all of them as defined over one period. So eventually H goes to zero if H goes as B^k (unless k=0, then H stays at 1). It seems to me Delata_H makes more sense there.

3. Should be "$\Delta H$ makes more sense." Also, if theta=0 then the steady state gain for T (and integrated T) are both 0 of course, but I think that's the only exception.

4. Typo in equation. Will fix when I get a chance. Not terribly important as the general idea is there. Linear transformation plus one dynamical equation.

5. Ah, so then it should be
\begin{align} \Delta H & \rightleftarrows B \tag 1\\ & \text{and}\\ \Delta H & \sim B^k \tag 2 \end{align}
Which will still mean that $\lim\limits_{t \to \infty} \Delta H = 0$ if $\theta \neq 0$, but that may be OK.

6. Nope. But I will fix it when I have time.

4. So B is defined both as G - T ((government spending over one time step) - (taxes collected over one time step)) and path integral of D? That 2nd part actually defining D I guess (I don't see D defined until that integral). I'm unclear on what D is in words.

1. B is the amount of bonds issued in a time period; D is the stock of debt.

D is defined by that integral.

2. Sorry Jason, I don't get it. I would sum all the deficits (B[n]) to get the total debt at period n (D[n]). What am I missing?

3. Sounds about right. Not sure what you are missing.

4. I'd do the sum (integrate) the other way around, like this:
$$D_n = \sum_{i=2}^n B_i \tag 3$$
Assuming $D_1 = 0$ and $B_1 = 0$ (as G&L apparently do)

5. It's not the integral of B, it's the integral of dD.
$$\int_{\Gamma} dD = \int_{\partial \Gamma} D = \Delta D = B = G - T$$

6. OK, I see. I'm reviewing path integrals. Would it be fair to characterize Γ as generally a function of time, between two sample times (nTs and (n+1)Ts say)?

7. Yes, that is one way you'd parameterize the path from one value of D to another.

I tried to get away from imposing a coordinate system.

8. Ok, thanks. I'm matching up this:

$$\int_{\Gamma} dD \tag 1$$

With what I found in this section of the Wikipedia article on line integrals:

$$\int\limits_{C} f\, ds = \int_{a}^{b} f(\mathbf{r}(t))\,|\mathbf{r}' (t)|\,dt \tag 2$$

It seems $\Gamma$ corresponds to the curve (or path) $C,$ the constant $1$ corresponds to the integrand $f,$ and $dD$ corresponds to what they call an "elementary arc length" $ds.$ That all seems clear enough. Apparently $\mathbf{r}$ gets you from one end of the curve $C$ to the other via its parameter $t$ as $t$ varies from $a$ to $b$ with $a < b.$ The result should be independent of the parameterization $\mathbf{r}$ of $C.$ That all makes sense. However, I'm not familiar with this notation:

$$\int_{\partial \Gamma} D \tag 3$$

What is the meaning of that exactly?

9. Or I guess instead of f=1 we could have dD/ds = f. I'm at my worst with math when I can't imagine some concrete examples to generalize from. So let me know if this makes sense as an example: going back to f=1 (for simplicity), we could have a=0, b=1, r(t)=2t then (1) above would evaluate to 2. We get the same answer for r(t) = +/- 2t^n, which makes sense... r just being a map to get us from 0 to 2 (i.e. draw out the 1-D "curve" (in this case) $\Gamma$): it doesn't much matter how we get there. What I'm not seeing is the advantage of introducing the path integral here... I'm not saying there isn't one, I'm just not seeing it. Do you have an example of a case where there's an advantage to the path integral concept?

10. ... I should have said for monotonic r it doesn't much matter, but if r should reverse directions on its path from a to b one or more times, it does matter (I think).

11. ... Ah, but then I forgot the bijective requirement for r... so for 1D cases, r can't reverse I guess.

12. The notation means "boundary". The integral isn't saying much more than the fundamental theorem of calculus.

https://en.wikipedia.org/wiki/Stokes%27_theorem

It's not so much the path integral as the differential forms. If you'd like you can take:

$$\int_{\Gamma} dD = \int_{t_{0}}^{t} dt' \frac{d}{dt'} D(t') = D(t) - D(t_{0}) \equiv \Delta D \equiv B \equiv G - T$$

All of physics is easier with differential forms.

13. Thanks, that's a big help. Brings back memories of second year electromagnetics class. I'd already concluded the expression you give above based on your earlier introduction of $\Delta D$ so I was going to stop worrying about it, but that Stokes' theorem link cleared up my remaining questions.

14. ...plus it was reassuring to see you write it out.