Wednesday, November 30, 2016

Economic theory and male answer syndrome

I found this via Pedro Serôdio. Early on, it hit a phrase that made me LOL:
For me the attraction of the work of Kondratieff, Schumpeter and Carlota Perez in the modern era, though I am critical of them all ...
The author probably would let us know that his likes and retweets aren't endorsements, either. There is literally no reason for the phrase "though I am critical of them all" in the paragraph in which it appears except as signalling (the criticisms are never discussed).

Anyway, let's look at a bit more:
First you would have to fix the problem Paul Romer identifies in “The Trouble With Macroeconomics”: over-abstract models, divorced from data, based on over-restricted assumptions. Deference to authority where “objective fact is displaced from its position as the ultimate determinant of scientific truth”.
Ah, good. Making macroeconomics more empirical is laudable.
Next, you would have to relentlessly chase down the sources of the massive miscalculation of risk preceding 2008. These include a failure to factor in crime and malfeasance; the inability to measure or trace risk in the shadow banking system; the failure even to model the dynamics of banking as a separate and distinct agent. And the complete failure to model or accept the need to account for irrational human behaviours.
Wait -- how do you know this? Didn't you just say in the previous paragraph that macroeconomics is divorced from data? Then there are no empirically grounded models you could use to establish the importance of these particular mechanisms in describing macroeconomic data. Basically, this paragraph is divorced from the data in exactly the same way the author just said macroeconomics is.

I've said this many times. Don't just say that including irrational human behaviors or banking yields better models. Build those models and show that they are better empirically. That is to say, understand the first paragraph before writing the second.
... macroeconomics should suddenly become instead of a theory based on the assumption of equilibrium and rationality, one based on the assumption of disequilibrium and shocks – not just external shocks but shocks generated inside the system.
Um, you probably shouldn't base a theory on an assumption about the very thing the theory is trying to understand.

This is something that many economists (of all stripes) seem to do and it baffles me. Well, it baffles me as a scientist -- I totally understand it from a sociological/human behavior perspective.

Let me call it "economist answer syndrome", which is very close (if not usually identical) to "male answer syndrome". What should be the fundamental questions of economics (What are recessions? What determines the economic state? Is there a useful definition of equilibrium?) are instead presented as answers (Recessions are monetary. Endogenous shocks. No.). The answers differ from economist to economist. The various "schools of economics" are probably best described as specific answers to what should be the research programs of economics.

A good example of this is that second paragraph above. It's all answers. Risk was miscalculated leading to a financial crisis that caused a recession that was missed because macro models left out banking. The question form is to ask what role banks played in the crisis. In fact, some theories out there say that the financial crisis was a symptom, not a cause of the recession. If we were being scientific, as Paul Romer would have us be, then we should present this as a question, potentially presenting a mechanism and some data as evidence backing up that mechanism. If we're just saying stuff, then there are people out there that say the financial crisis was a symptom. He said, he said.

People often say that economics is politically biased, but really I think the issue is more that economics simply uses the political mode of thinking (where there are answers for anything of political interest) rather than the scientific one (where there are questions about anything of scientific interest).
One thing that would happen is that the future would start sending signals to the present via the market ...
There is actually a way to turn this vague statement into something meaningful (using information theory to describe the communication channel carrying those signals). It leads to the theory advocated on this blog (which should be noted is not entirely at odds with mainstream economics).
... so that they assume breakdown, irrationality, crime, inadequate and asymmetric information ...
This is just more male answer syndrome, more assumptions.

But the information transfer framework does allow (from the start) for markets to breakdown (it's called non-ideal information transfer). It turns out that it might be useful when looking at recessions, but not for the typical market state.

Update 1 December 2016

I was one of the Dean's Scholars at the University of Texas as an undergrad, and the director of the program was Alan Cline. He was the one who introduced me to "male answer syndrome"; it was one of the things he highlighted in a message he gave us on graduation day. Ever since then I've tried to follow his advice -- to stop and listen first, to think before proffering theories.

Tuesday, November 29, 2016

Causality, Newcomb's paradox, and rational expectations

Quantum eraser experiment. From Wikipedia.

I've at times enjoyed philosophy, but things like Newcomb's problem (which was linked at Marginal Revolution today) generally make me roll my eyes. There are two basic questions with this thought experiment: are we ceding the infallibility of the predictor, and are we ceding the potential lack of causality in the prediction? The two turn out to be linked by causality.

There's a lot of persuasion in the problem that the predictor is infallible, but the problem doesn't come out and say so. Is the predictor an oracle in the computer science sense? There's really no reason to continue this discussion if we don't have an answer to this.

David Edmonds says "You cannot influence a decision made in the past by a decision made in the present!" At a fundamental level, the quantum eraser says that Edmonds' statement is generally wrong as stated (you just can't send a signal/communicate by influencing a past decision with a present decision). The way out is that we're dealing with a macroscopic system in the ordinary world, but in the ordinary world there's no such thing as an oracle. The Stanford Encyclopedia of Philosophy has more to say.

However, I think this problem is illustrative of a paradox with expectations in economics, so let's reformulate it.

Firm X (which determines most of the economy) can decide to cut output if it expects less aggregate demand. Normal output is 100 units, cut back is 50.

However, there's also a central bank. The central bank's forecasts have always been right. However, if the central bank forecasts that firm X will keep output the same, they will cut back on aggregate demand (raising interest rates, making firm X lose money). And if the central bank forecasts firm X will cut back on output, they'll boost aggregate demand. The boost/cut is +/-50 units.

Here's the original table:

Predicted choice  |  Actual choice  |  Payout

B (+1M)              B                 1M
B (+1M)              A+B               1M + 1k
A+B (0)              B                 0
A+B (0)              A+B               1k

And here's our new table:

Predicted choice  |  Actual choice  |  Payout

Cut back +50  (r-)   Cut back  (50)     100
Cut back +50  (r-)   Keep same (100)    150
Keep same -50 (r+)   Cut back  (50)       0
Keep same -50 (r+)   Keep same (100)     50

This might sound familiar: it's Scott Sumner's retro-causal Fed policy. An expected future rate hike yields lower expected output (expected avg = 25). Assuming the Fed is infallible (predicted = actual, i.e. rational/model-consistent expectations), the optimal choice is to cut back output (take only box B). However, assuming the Fed is fallible (not rational expectations), the optimal choice is to keep output the same (expected result = 100). Basically, this is Edmonds' answer above: take both boxes. When a model assumption (rational expectations) reproduces philosophical paradoxes, it's probably time to re-examine it.
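The expected payouts under the two assumptions are easy to check with a few lines of code (a toy calculation of the table above; the "cut"/"keep" labels are just my shorthand):

```python
# Toy expected-payout calculation for the Fed version of Newcomb's problem.
# Rows: (predicted, actual) -> payout, taken from the table above.
payout = {
    ("cut", "cut"): 100,   # Fed predicts cut (boosts AD), firm cuts
    ("cut", "keep"): 150,  # Fed predicts cut (boosts AD), firm keeps output
    ("keep", "cut"): 0,    # Fed predicts keep (raises rates), firm cuts
    ("keep", "keep"): 50,  # Fed predicts keep (raises rates), firm keeps
}

# Infallible Fed (rational expectations): prediction always equals the choice.
infallible = {choice: payout[(choice, choice)] for choice in ("cut", "keep")}

# Fallible Fed: prediction is uncorrelated with the choice (uniform 50/50).
fallible = {
    choice: sum(payout[(pred, choice)] for pred in ("cut", "keep")) / 2
    for choice in ("cut", "keep")
}

print(infallible)  # cut -> 100, keep -> 50: optimal to cut back
print(fallible)    # cut -> 50, keep -> 100: optimal to keep output
```

The infallible case reproduces "take box B" (cut back, expected 100 beats 50); the fallible case reproduces "take both boxes" (keep output, expected 100).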

The question at hand is whether rational expectations can move information from the future into the present. I've discussed this in more detail before in the context of so-called "neo-Fisherism". The rational expectations "operator", much like the oracle/predictor "operator", acts on a future (expected/predicted) state and moves information (sends a signal) into the present. In general such an operator -- were this information genuine (i.e. the predictor is infallible) -- violates causality. In quantum physics, there are cases where it appears on the surface that there might be causality violation (such as the quantum eraser above), but in every case no communication can occur (usually meaning locality is violated, but not causality).

So the question really is: are we suspending causality and allowing superluminal communication? If that is the premise of Newcomb's paradox or rational expectations, then there is nothing wrong with someone who can exactly predict the future (they're probably getting the information from the future) or with future actions causing present conditions. If we're not, then the obvious choice is to assume that even rational expectations or infallible predictors can be wrong and take both boxes.

This so-called "philosophical paradox" should explicitly say whether we are suspending causality in our thought experiment instead of being mealy-mouthed about it.

Monday, November 28, 2016

The scope of introductory economics

I butted into a conversation between David Andolfatto (DA) and Noah Smith (NS) on Twitter about methodology in economics. Let me start with the conversation (edited slightly for clarity) that led to me jumping in:
DA Noah, we all have models (thought-organizing frameworks) embedded in our brains. Unavoidable. No alternative. 
NS Understanding is not the same as thought-organization. Very different things. 
DA OK, let's step back and define terms. How do you define "understanding" something?
NS Let's say "understanding" means having a model that is both internally and externally valid.
DA "Validity" is a statement concerning the logical coherence of a sequence of statements, conditional on assumptions. 
NS That's internal validity. External validity means that the model matches data. 
DA Yes, but one needs a well-defined metric with which we judge "match the data." ... In my view, this judgement must be made in relation to the question being asked.
This is where I butted in:
JS I LOLed at 1st 1/2 of this. Well-defined metric is theory scope plus stats (works for every other field). Econ does not yet get scope. 
DA "Econ does not yet get scope." What does this mean? 
JS A model's scope (metrics for "matching data") should be derived alongside the model itself. Doesn't seem to happen in Econ texts. 
DA At the introductory level, the "empirics" we seek to explain/interpret are largely qualitative in nature. So it does happen. ... But better job could be done at upper levels, for sure.
So what did I mean? Basically, economics isn't approached as an empirical theoretical framework with well-defined scope. It is not set up from the beginning to be amenable to experiments that control the scope and produce data (quantitative or qualitative observations) that can be compared with theory. I'll try and show what I mean using introductory level material -- even qualitative.

Let me give a positive example first from physics. Among the first things taught are elastic and inelastic collisions. Inelastic collisions are almost always described qualitatively, because even graduate students probably couldn't quantitatively describe the energy absorption in a bouncing rubber ball or friction slowing something down. You can sometimes approximate the latter with a constant force. The experimental setup is not too terribly different from how Galileo set up his tracks (he used balls, but now we know about rotational inertia, so that comes later):

These are set up to mimic the scope of elastic collision theory: approximately frictionless, approximately conserved kinetic energy. That scope is directly related to the scope of the theory. And we show what happens when scope fails (friction slowing the little carts down).
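The in-scope and out-of-scope cases can be sketched numerically. This is just the standard 1D collision algebra (not anything from an economics source; the function names are my own):

```python
def elastic_collision(m1, v1, m2, v2):
    """1D elastic collision: conserves both momentum and kinetic energy."""
    u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2)
    u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2)
    return u1, u2

def kinetic_energy(m, v):
    return 0.5 * m * v * v

# In scope: equal-mass carts exchange velocities (the classic track result).
u1, u2 = elastic_collision(1.0, 2.0, 1.0, 0.0)
print(u1, u2)  # 0.0 2.0

# Out of scope: a perfectly inelastic collision conserves momentum
# but loses kinetic energy (absorbed as heat/deformation).
m1, v1, m2, v2 = 1.0, 2.0, 1.0, 0.0
v_stuck = (m1 * v1 + m2 * v2) / (m1 + m2)
ke_lost = kinetic_energy(m1, v1) + kinetic_energy(m2, v2) \
          - kinetic_energy(m1 + m2, v_stuck)
print(ke_lost)  # 1.0: half the initial kinetic energy is gone
```

The point is that both regimes are derived from the same conservation laws, with the scope (kinetic energy conserved or not) stated up front.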

Now I'd say it isn't critical that students carry out these cart experiments themselves (though it helps learning) -- it would probably be a hard hurdle to surmount for economics. Simply describing the setup showing the results of these experiments would be sufficient, and there exist real economics papers that do just this.

In introductory economics, some general principles are usually discussed (like here from Krugman, but Mankiw's book starts similarly), then things proceed to the production possibilities frontier (PPF), upward sloping supply curves and downward sloping demand curves. This is probably the best analogy with the physics scenario above.

The assumptions that go into this are rationality and a convex PPF -- these should define the scope of the (qualitative) theory (i.e. individuals are rational and the PPF is convex, which requires 2 or more goods). The way that demand is taught also requires more than one good (so there is a trade-off in marginal utility between the two).

Now first: rationality generally fails for individuals in most lab experiments (the charitable version is that it's a mixed bag). So either there is an implicit assumption that this only applies to a large number of individuals (i.e. collective/emergent rationality, which isn't ruled out by experiments) or the theory only deals with rational robots. As economics purports to be a social theory, we'll have to go with the former.

Additionally, the classic experimental tests of "supply and demand" (e.g. Vernon Smith, John List) do not approach economics this way. In those experiments, individuals are assigned utilities or reservation prices for a single good. You could imagine these as setting up "rational" agents analogous to the "frictionless" carts in the physics example, but we're still dealing with a single good. As an aside, there is an interesting classroom demo for the PPF using multiple goods, but like the experiments designed to show demand curves, this one isn't actually showing what it's trying to show (the area relationship of the pieces immediately leads to a quadratic PPF surface, which will have convex PPF level curves).

Here are a couple of graphics from Vernon Smith (1962):

So, are these results good? Are the fluctuations from theory due to failures of rationality? Or maybe the small number of participants? Is it experimental error? The second graph overshoots the price -- which is what the information equilibrium (IE) approach does by the way:

[Ed. note: this is a place holder using a positive supply shift until I get a chance to re-do it for a demand shift, which will give the same results, just inverted. Update: updated.]

In the IE model, the fluctuations are due to the number of participants, but the overshoot depends on the details of the size of the shift relative to the size of the entropic force maintaining equilibrium (the rate of approach to equilibrium, much like the time constant in a damped oscillator).

I use the IE model here just as a counterpoint (not arguing it is better or correct). The way the scope of introductory economic theory is taught, we have no idea how to think about that overshoot [1]. Rationality (or the assigned utility) tells us we should immediately transition to the new price. The analogy with introductory physics would be a brief large deviation from "frictionless" carts or from conservation of energy in an elastic collision.
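The damped-oscillator analogy for the overshoot can be made concrete. A minimal sketch (my own toy dynamics, not the IE model itself): an underdamped price relaxing toward a new equilibrium overshoots it, while an overdamped one does not:

```python
def relax(p0, p_star, omega=1.0, zeta=0.2, dt=0.01, steps=2000):
    """Euler integration of p'' + 2*zeta*omega*p' + omega^2*(p - p_star) = 0."""
    p, v = p0, 0.0
    path = []
    for _ in range(steps):
        a = -2 * zeta * omega * v - omega**2 * (p - p_star)
        v += a * dt
        p += v * dt
        path.append(p)
    return path

# Price shifts from an old equilibrium of 1.0 to a new equilibrium of 2.0.
under = relax(1.0, 2.0, zeta=0.2)  # underdamped: overshoots above 2.0
over = relax(1.0, 2.0, zeta=2.0)   # overdamped: creeps up, never overshoots
print(max(under) > 2.0, max(over) < 2.0)  # True True
```

The damping ratio here plays the role of the "rate of approach to equilibrium" mentioned above: weak restoring forces relative to the shift produce the overshoot, strong ones suppress it.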

Aside from rationality, which is a bit of a catch-all in terms of scope, there are issues with how fast shifts of supply and demand curves have to be to exhibit traditional supply and demand behavior. The changes Vernon Smith describes are effectively instantaneous. However, much of the microeconomics of supply and demand depends on whether the changes happen slowly (economic growth, typically accompanied by inflation) or quickly (such as this story about Magic cards). And what happens after supply and demand curves shift? Is the change permanent, or do we return to an equilibrium (as IE does)? Does the speed of changes have anything to do with bubbles (see Noah Smith as well)?

In a sense, much of this has to do with the fact that economics does not have a complete theory about transitions between different economic states -- but supply and demand curves are all about transitions between different states. And what happens if nothing happens? Does the price just stay constant (a kind of analogy with Newton's first law)? The EMH says it follows a random walk -- does it return to the equilibrium price as the supply and demand theory seems to suggest? With so much of economics and econometrics looking at time series (even Smith's experiment above), one would expect introductory economics to at least address this.

Another issue is what David Glasner and John Quiggin have called the macrofoundations of micro (here's Krugman as well) -- the necessary existence of a stable macroeconomy for microeconomic theory to make sense. This also impacts the scope, but could probably be left out of the introduction to supply and demand much like the Higgs vacuum can be left out of the introduction to physics.

Overall, one doesn't get a good sense of the true scope of the theory in introductory economics, and it isn't taught in such a way that is consistent with how the classic experiments are done.

This issue carries over into introductory macroeconomics. One of my favorite examples is that nearly all of the descriptions of the IS-LM model completely ignore that it makes an assumption about the (well-documented) relationship between the money supply and output/inflation in its derivation that effectively limits the scope to low inflation. But I never see any economist say that the IS-LM model is only valid (is in scope) for low inflation. In the IE version, this can be made more explicit.

Paul Pfleiderer's notion of chameleon models [linked here] points out one problem that arises from not treating scope properly in economics: models that flip back and forth between being toy models and policy-relevant models. This is best understood as flipping back and forth between different scope ("policy relevant" means the theory's scope is fairly broad, while toy models tend to have narrow or undefined scope). Generally, because of the lack of attention to scope, we have no idea if a given model is appropriate or not. One ends up using DSGE models to inform policy even if they have terrible track records with data.



[1] That overshoot is also the only thing in my mind that tells me this experiment actually measures something rather than being completely tautological (i.e. impossible for any result other than orthodox supply and demand to emerge).

Saturday, November 26, 2016

The effect of a December 2016 Fed interest rate hike

Last year when the Fed raised short term interest rates from a range between 0 and 25 basis points to a range between 25 and 50 basis points, I predicted (based on the information equilibrium [IE] model) that the monetary base (the path labeled C in the graph below) would start to fall (I had no idea how fast) relative to no change (the path labeled 0 in the graph below). That turned out to be a pretty successful prediction. The Fed now stands poised (according to Fed watchers like Tim Duy) to raise rates again after its December meeting to a range between 50 and 75 basis points. What will be the impact? What can we learn about the IE model?

The key question is whether the monetary base will start to fall faster or not. The IE model predicts a lower equilibrium monetary base for the 50-75 basis point range (labeled C' in the graph below). If the distance to the equilibrium matters, then the rate of fall should accelerate a bit. However, it is possible the drift rate depends on factors other than the distance to the equilibrium (such as the volume of trading). I illustrated both of these paths. The solid line is a time series model forecast based on the Mathematica function -- which auto-selects an ARIMA process -- for weekly source base data after the Dec 2015 announcement. The dashed line is a possible accelerated path that is simply adjusted to cover the new distance (to the new equilibrium C') a year later. The RMS error is shown as a yellow region (and blue for the period after the original estimate of reaching the equilibrium C).
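As an aside for readers without Mathematica, a rough stand-in for this kind of drift forecast is an AR(1)-with-drift fit. This sketch uses synthetic data (not the actual source base series) and plain least squares rather than Mathematica's automatic ARIMA selection:

```python
import random

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + noise."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    phi = cov / var
    c = my - phi * mx
    return c, phi

def forecast(series, c, phi, horizon):
    """Iterate the fitted AR(1) forward from the last observation."""
    out, last = [], series[-1]
    for _ in range(horizon):
        last = c + phi * last
        out.append(last)
    return out

# Synthetic stand-in series: mean-reverting from 4000 toward 3500
# (true phi = 0.95), loosely mimicking a base falling to a new equilibrium.
random.seed(0)
x = [4000.0]
for _ in range(100):
    x.append(3500 + 0.95 * (x[-1] - 3500) + random.gauss(0, 5))

c, phi = fit_ar1(x)
path = forecast(x, c, phi, 52)  # 52 weekly steps ahead
print(round(phi, 2))  # recovers something close to 0.95
```

The fitted path continues the drift toward the estimated equilibrium c/(1 - phi), which is the qualitative behavior the solid line in the graph is meant to capture.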

Data after the December 2015 Fed meeting is shown in orange (both weekly source base and monthly adjusted monetary base). The expected paths (assuming immediate adjustment) are shown in gray as 0 (no change), C (25-50 bp after 2015), and C' (50-75 bp after 2016). The black line represents zero reserves and the dashed black line represents the average 40 billion in reserves from 1990 to 2008.

*  *  *

The model is described in more detail in my paper.

Here's a cool 3D visualization of the model.

Some additional discussion about the December 2015 prediction is here.

Here is the IE model's performance compared with DSGE and other models.

Here are a series of interest rate forecasts compared with the Blue Chip Economic Indicators (BCEI) forecast. The BCEI forecast has continued to be incorrect -- much worse than the IE model.

Wednesday, November 23, 2016

Integrating out

This Twitter thread was/is pretty entertaining. However, there is a good lesson here about "subsuming things into an integral" (which I've put in a snarky way):

Just because you integrate out/integrate over some variable it doesn't mean it isn't doing anything.

*  *  *

Update 24 November 2016

Jo Michell replied to my snark:

However, this isn't true. The objective function, shown at the top of the original graphic above, is best thought of as an interaction between an individual agent and a mean field of labor. The easiest way to see this is that, despite being inside an integral, the "equation of motion" (functional derivative) is non-trivial (simplifying the objective function a bit for clarity: dropping time, writing x = ξ [standing in for C, M, and the other variables], and k = i [because that's what I wrote for some reason]):

This means that there are particular values of h(k), the supplied labor of type k, that will result from a given disutility functional v[h(k), x]. This means that individual utility will depend on the disutility of the "labor field" h, and the "labor field" will impact the individual utility. Actually, re-writing this functional allows us to put it in a form that explicitly shows it is just the leading order term of an interaction with a mean field potential ṽ[h(k)], like a (Wick rotated) Dyson series in physics:

In the original text in the graphic above, Woodford refers to the inclusion of v[h(k)]:
"I ... have written (1.1) as if the representative household simultaneously supplies all of the types of labor."
The second emphasis is Woodford's, but the first is mine: this is an example of "as if" being effective field theory in action. We have an "effective representative agent" that actually includes interaction with the entire labor market (to first order).

This is identical to, for example, the mass of an electron in the Higgs field. The J propagator in the graphic above is formally equivalent to the self-energy of a massless electron interacting with the Higgs field, resulting in an electron mass. When we write down an electron mass, we're not assuming it has no interaction with the Higgs field. We are using an effective field theory where all of the interactions with the Higgs field have been integrated out, leaving an effective mass term in the Lagrangian.

*  *  *

Update 29 November 2016

Jo Michell has replied again (and Pedro Serôdio has chimed in). I added a couple of blurbs above (about the variable x just representing other variables). However, I'd like to show more how there is a real interaction here. First, let me LaTeX this up a bit ... our objective function (that we maximize) is:

J[u, h] = u(x) - \int dk \; v[h(k), x]

I showed above that [1]

\frac{\delta J[u, h]}{\delta h} = \frac{\partial v}{\partial h}

We also have

\begin{align}
\frac{\delta J[u, h]}{\delta u} & = 1 - \int dk \; \frac{\partial v[h, x]}{\partial u} + \frac{d}{dx} \frac{\partial}{\partial u'} \int dk \; v[h, x]\\
& = 1 - \int dk \left[ \frac{\partial v[h, x]}{\partial u} - \frac{d}{dx} \frac{\partial v[h, x]}{\partial u'} \right]
\end{align}

Now if $\partial v/\partial u \equiv 0$ and $\partial v/\partial u' \equiv 0$ (i.e. $v$ does not depend on $u$), then we are stuck with

\frac{\delta J[u, h]}{\delta u} = 1

and therefore we aren't at an optimum of our objective function (the Euler-Lagrange equation isn't zero). However, the functional form $v = \alpha \; \tilde{v}[h] \; u(x)$ is a simple ansatz for which $\partial v/\partial u = \alpha \; \tilde{v}[h]$, and the integral gives 1 (hence $\alpha = 1$ for a normalized $\tilde{v}$), and we have

\frac{\delta J[u, h]}{\delta u} = 0

And we have the basic $J$ term in the "propagator" as shown in the handwritten Dyson series (with "Feynman diagrams") above.

Now it is true -- as Jo Michell points out -- that basically this all sums up into a coefficient $\zeta$ in the Phillips curve. It's not too different (abstractly) from the vertex corrections in the anomalous magnetic moment adding up to a factor of $\alpha/2\pi$. Our truncated mean field calculation above simply shifts utility by a bit, while integrating out all the virtual photons simply shifts $g$ by a bit.

Michell describes this state as lacking "serious heterogeneity or strategic interaction", and I guess it isn't serious if by "serious" you mean "beyond leading order" where fluctuations in heterogeneity start to matter (e.g. a recession hits manufacturing jobs harder than service sector jobs). However any average state of heterogeneity (labor market configuration) is going to be (at leading order in this model) simply a "mean field" -- a constant background.

Now none of this means the model is right or that this is even a good approach. I'm just saying that lacking heterogeneity and interactions is very different from keeping only the leading order effects of heterogeneity and interactions.

You'd actually want to compare this theory to data to see what to do next. If it doesn't get the data right (to the extent reasonable for a leading order theory), the solution is definitely not to go to higher order corrections in the Dyson series, but rather to scrap it and try something different [2].


[1] For example, $v[h]$ could be an entropy functional for input distributions $h$, thus a maximum entropy configuration might optimize this equation of motion.

[2] In my experience with macro models, the leading order theory is terrible at describing data so going forward with heterogeneity/strategic fluctuations (higher order) is not good methodology. Better to scrap it and try something different based on different empirical regularities. This one, maybe?

Tuesday, November 22, 2016

Prediction market failure?

Prediction market for the 2016 election.

On my drive home from work tonight, I heard Justin Wolfers express continued confidence in prediction markets in the wake of the 2016 election on Marketplace on NPR. Now the election result is perfectly consistent with a sub-50% probability (at some points as low as 10% in the Iowa Electronic Markets pictured above). Even a 10% probability has to come up sometimes. This may have some of you reaching for the various interpretations of probability. But the question is: How do we interpret results like this (from Predictwise)?

How do we handle this data in real time as an indicator of any property of the underlying system? Some might say this is a classic case of herding, being proven wrong only by incontrovertible results. Some might say that conditions actually changed at the last minute, with people changing their minds in the voting booths (so to speak). I am personally under the impression that partisan response bias was operating the entire time, so everyone was clustering around effectively bad polling data.

But what if this was a serious intelligence question, like whether a given country would test a nuclear weapon? This was actually the subject of a government funded project, so it isn't necessarily an academic question that will remain an academic question. We need metrics.

My work on this blog with the information equilibrium framework originally started out as an attempt to answer that academic question and provide metrics, but the result was basically: no, you can never really trust a prediction market. This is also why non-ideal information transfer comes up pretty early on my blog. A key thing to understand is that markets are good at solving allocation problems, not (knowledge) aggregation problems. The big take-away is that the only way you can tell a particular prediction market is working is if it never fails. Additionally, such markets tend to fail in the particular way these election markets failed -- by showing a price that is far too low (or high) because of insufficient exploration/sampling of the state space.
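Two of these points are easy to quantify with a toy calculation (my own illustration, not from the prediction market literature): a well-calibrated 10% forecast should "fail" fairly often across many elections, and herding around biased polls is penalized by a proper scoring rule like the Brier score:

```python
import random

random.seed(42)

# If a market says 10% in each of 20 independent elections, the chance
# of seeing at least one "upset" is already large:
print(1 - 0.9 ** 20)  # ~0.88

# Calibration check: a forecaster herding around biased polls (saying 20%
# when the true probability is 45%) is penalized by the Brier score.
def brier(forecasts, outcomes):
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)

true_p = 0.45
outcomes = [1 if random.random() < true_p else 0 for _ in range(10000)]
calibrated = brier([true_p] * len(outcomes), outcomes)
herding = brier([0.20] * len(outcomes), outcomes)
print(calibrated < herding)  # True: herding around biased polls costs accuracy
```

The catch, of course, is that the Brier score is only computable after many resolved events; it doesn't tell you in real time whether the price you're looking at reflects a well-sampled state space.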

These results haven't gone through peer review yet, so maybe I made a mistake somewhere. However, even if prediction markets were proven to be accurate, ask yourself: how useful were the results shown above? How much more did we get from them than we got from polling data?

How do maximum entropy and information equilibrium relate?

I think I need a better post about how information equilibrium and maximum entropy relate to one another. As I'm still learning myself, this is partially for my own benefit. The short answer is that information equilibrium is a way of saying that two maximum entropy probability distributions, from which two different random variables A and B are drawn, are in some sense (i.e. informationally) equivalent to each other. This equivalence will show up in specific relationships between A and B defined by an information equilibrium condition. For example, fluctuations in one will show up as fluctuations in the other.

Maximum entropy seeks to maximize the function H(X) over possible probability distributions p(X) for random variable X

p(X = xk) = pk

H(X) = - Σ pk log pk

subject to some constraint, where the pk represents a probability of an event in some abstract space indexed by k. For example, if that constraint is that X has a maximum and minimum value, then p is a uniform distribution. The outcome of entropy maximization is a distribution p(X) that maximizes H(X).
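As a quick numerical illustration of that statement (a sketch of my own, not from the paper): with only a support constraint over four states, the uniform distribution has the largest Shannon entropy:

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

# Over 4 states with only a support constraint, the uniform
# distribution maximizes H:
uniform = [0.25] * 4
peaked = [0.7, 0.1, 0.1, 0.1]
print(entropy(uniform))  # log(4), about 1.386 nats
print(entropy(peaked) < entropy(uniform))  # True
```

Any distribution more concentrated than uniform carries less entropy, which is why the support-only constraint picks out the uniform distribution.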

Information equilibrium is essentially a rephrasing of the question of whether two maximum entropy distributions of random variables A and B are (informationally) equivalent to each other (e.g. represent different observables based on an underlying variable C), and tells us how one distribution will change relative to the other if they are. Symbolically,

H(A) = H(B)

with underlying probability distributions p1(A) and p2(B). This is fairly meaningless for a single observation of A and B (any single probability is proportional to another), but for a series of observations (particularly time series) we can see if both series of random variables A and B represent the same underlying information (or at least approximately so). This doesn't tell us if A or B (or maybe some unknown C) is the true information source, but only that A and B are in some sense equivalent.

In the case of uniform distributions p1(A) and p2(B), we can rewrite this as a differential equation relating the stochastic variables A and B (in the limit where A and B each represent the sum of a large number of "events", for example A is a total energy made up of millions of atomic kinetic energy terms):

P ≡ dA/dB = k A/B

where P turns out to represent an abstract price in applications to economics when A is abstract demand and B is abstract supply. We can think of instances of the random variable A as "demand events", B as "supply events", and the two coinciding per the condition above as a "transaction event". The practical result is that this gives us a framework to understand time series of exponential growth functions and power laws (and fluctuations around them) that is directly related to supply and demand.
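Here's a minimal numerical check of that power-law connection (the constants below are arbitrary illustrations): A = c B^k satisfies the information equilibrium condition P = dA/dB = k A/B.

```python
# Numerical check (arbitrary constants) that the power law A = c * B**k
# satisfies the information equilibrium condition P = dA/dB = k * A / B.
c, k = 0.7, 2.0

def A(B):
    return c * B ** k

B = 5.0
h = 1e-6
dAdB = (A(B + h) - A(B - h)) / (2 * h)  # numerical derivative dA/dB
P = k * A(B) / B                         # abstract "price" from the condition
print(abs(dAdB - P) < 1e-4)              # → True
```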

Since p1(A) and p2(B) represent maximum entropy probability distributions, changes (shocks) will be accompanied by entropic forces restoring maximum entropy (i.e. the most likely state given prior information). We can interpret e.g. the forces of supply and demand as manifestations of these entropic forces (as well as things like sticky prices or so-called statistical equilibrium). Here are some simulations of these entropic forces for supply and demand.

Additionally, if we can identify A as the source of the information, we can turn the equivalence into a bound (you can't receive more information at B than is sent by A)

H(A) ≥ H(B)

This is called non-ideal information transfer. If we use uniform distributions as above, this means the differential equation becomes a bound: dA/dB ≤ k A/B.

These concepts are explored further in the paper. Here are some slides presenting these ideas. Here are some simulations showing the supply and demand (and price) relationships captured by the ideas above.

Update 23 November 2016

Here's the above relationship in picture format

Information equilibrium in agent-based models

In response to this post (or at least the link to it on Twitter), Ian Wright directed me to his paper on a Closed Social Architecture (CSA) agent-based model built from random agent actions. Happily, it included Mathematica code that I was able to get up and running in no time. Here is the abstract:
A large market economy has a huge number of degrees of freedom with weak micro-level coordination. The "implicit microfoundations" approach considers this property of micro-level interactions to more strongly determine macro-level outcomes compared to the precise details of individual choice behavior; that is, the "particle" nature of individuals dominates their "mechanical" nature. So rather than taking an "explicit microfoundations" approach, in which individuals are represented as "white-box" sources of fully-specified optimizing behavior ("rational agents"), we instead represent individuals as "black box" sources of unpredictable noise subject to objective constraints ("zero-intelligence agents"). To illustrate the potential of the approach we examine a parsimonious, agent-based macroeconomic model with implicit microfoundations. It generates many of the reported empirical distributions of capitalist economies, including the distribution of income, firm sizes, firm growth, GDP and recessions.
My emphasis. This is essentially the idea behind the maximum entropy approaches that underlie the information transfer framework. I decided to test for information equilibrium (IE) relationships in the outputs of the model (I did a simple version with only 100 agents and 100 time steps) and found a few. Specifically:


NGDP ⇄ W,   NGDP ⇄ U,   NGDP ⇄ P

where W is the total wages of employees, U is the total number of people unemployed, and P is the total profit of firms. In each case, I took the IT index to be 1 (allowing it to vary doesn't change much). The first two are IE relationships I've previously observed in macro data; the last one is new. The first one is not that surprising -- total output in this model is effectively NGDP = W + P, and since P is small (W ~ 70% of NGDP), W ~ NGDP. In the following, the IE model is blue while the data is yellow [1]:

The other two are non-trivial:

One wouldn't expect high unemployment to be associated with high output, and I'm not sure of the mechanism generating it (note that unemployment is very noisy in this model). Note that since NGDP ⇄ U is a reasonable description, this lets us define a price level p ~ NGDP/U per the IE version of the model. Here is an example trajectory for output in terms of unemployment NGDP = f(U) and the corresponding price level p = f(NGDP, U) = dNGDP/dU:

I plan on trying to add growth to the CSA model so that we can see some non-trivial IE relationships with IT index k ≠ 1.
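As a sketch of what the IE test above amounts to (with toy exponential-growth data standing in for the model output; every number here is made up), you can fit the slope of log A versus log B and read off the IT index:

```python
import math, random

# Toy check of an information equilibrium (IE) relationship A ⇄ B with
# IT index k = 1: log A = log B + const, so an ordinary least-squares fit
# of log A against log B should give slope ~ 1. All data here is made up.
random.seed(1)
B = [100 * math.exp(0.02 * t + random.gauss(0, 0.01)) for t in range(100)]
A = [0.7 * b * math.exp(random.gauss(0, 0.01)) for b in B]  # A ⇄ B with k = 1

x = [math.log(b) for b in B]
y = [math.log(a) for a in A]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
print(round(slope, 1))  # → 1.0 (the fitted IT index)
```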


PS As I was exploring, I came up with different things to look at, and since the model is stochastic, I didn't have the same outputs each time because I didn't seed the random number generator. Basically, the first 3 graphs in the post above are a single economy, the next 2 are a different economy (but correspond to each other), and the last two profit graphs and the two unemployment graphs in the footnote correspond to each other.



[1] I ran the model several times and here are some example outputs for profit:

And here are a couple of output-unemployment models (corresponding to the last two profit graphs above):

Monday, November 21, 2016

Exactly. People do not behave in similar or predictable or even stable ways.

I am not and never have been a fan of comparing economics to physics. People are more than particles ... they do not always behave (or think) in similar or predictable or even stable ways. On balance, that is a great thing!! But it leaves tons of room for people to draw their own (non-economist) conclusions. Reality does impose discipline ... for example, in some way a budget constraint has to hold and opportunities usually do have costs ... so I am not throwing arithmetic out the window.
This is exactly the premise of the maximum entropy/information equilibrium approach (see here). Individual people aren't predictable, but do have to conform to some basic constraints. This is the idea behind Gary Becker's "irrational" agents [update: see here]. If they are unpredictable to the extent that they can be treated as random, this is just maximum entropy. Sometimes large-scale correlations appear, which are described by non-ideal information transfer.

I do have a quibble with this not being "physics" or that this means people are "more than particles". I believe she is really just referring to e.g. classical mechanics of point masses, and not e.g. non-equilibrium thermodynamics (what the information equilibrium approach is closest to). The utility maximization approach [e.g. here pdf] is at its heart Lagrangian mechanics, but Lagrangian mechanics isn't all of physics. It's true that individual people don't follow exact trajectories in phase space. In fact, what tends to happen is the spontaneous creation of emergent behaviors [YouTube]. Macroeconomics tends to ignore this (though it might be the motivation for one of economists' more common approximations).

In most of physics, these "particles" aren't behaving in similar, predictable or even stable ways. Quantum physics takes into account all possible things that could happen with a weighted sum. Thermodynamics is precisely not knowing the microscopic state (hence entropy). One has to embrace the instability, unpredictability, and dissimilarity.

Update 22 November 2016:

These slides [png, with pdf link at link] are a good (more technical) overview of what I am talking about.

Never underestimate the power of abstract mathematics

Wild ideas. Reference.
I remember noticing this in quantum field theory class (which I took in 2000, five years after the original 1995 paper, so I wasn't breaking any new ground), and once posited that the various coefficients arising from amplitudes could be related via Galois theory. At the time I was quite skeptical, but I did jot down a note (pictured above) that I found after the article jogged my memory [1]. In fact, I was so skeptical that in this post I would later go on to say there were no real-world applications of Galois theory.

I think I spoke too soon; here is a quote from the article:
[Francis] Brown is looking to prove that there’s a kind of mathematical group—a Galois group—acting on the set of periods that come from Feynman diagrams. “The answer seems to be yes in every single case that’s ever been computed,” he said, but proof that the relationship holds categorically is still in the distance. “If it were true that there were a group acting on the numbers coming from physics, that means you’re finding a huge class of symmetries,” Brown said. “If that’s true, then the next step is to ask why there’s this big symmetry group and what possible physics meaning could it have.”

Anyway, the main point is to keep an open mind. That's why I look at the abstract mathematical properties of information equilibrium every once in a while (e.g. here, here, here, here, here), as well as argue against keeping math out of economics (e.g. here and here).


[1] The way I was approaching it was pretty much entirely wrong (focusing on the structure constants of the Lie algebras like SU(3)). 

Sunday, November 20, 2016

Inequality and maximum entropy

What does maximum entropy have to say about economic inequality? I am going to put forward a possible understanding of how inequality impacts macroeconomic fluctuations (aka the business cycle).

First we have to be precise about what we mean. One could say that the maximum entropy distribution of income should be a Pareto distribution, which can be highly unequal. However, the uniform distribution is also a maximum entropy distribution of income. How can that be? Constraints. The Pareto distribution is the maximum entropy distribution of x given a constraint on the value of ⟨log x⟩ (the average value of log x). The uniform distribution is the maximum entropy distribution given that x is (for example) constrained to 0 ≤ x ≤ c.

It is true that income appears to be Pareto distributed (at least in the tail, per Pareto's original investigations). Does this mean that inequality is an optimal equilibrium (i.e. maximum entropy)? Not exactly; it really just means that Pareto distributions of income are probably easiest to achieve given random processes. Let's build a simple model to look at the effects of inequality.

Let's say each agent (in the simulations below, I use N = 100 agents) has an income [1] that is drawn from a Pareto distribution, and can choose to spend some fraction of it in each time period on consumption. However, total consumption is limited (limited resources). This creates a bounded region in N-dimensional space. For three agents (N = 3), it might look something like this:

Over time, people adjust their consumption by a few percent (I used Brownian motion with σ = 5%), so the economy follows some random walk (blue squiggly line) bounded by this N–1 simplex (2-simplex for N = 3). You could add recessions (correlated motion among the dimensions) or growth (expanding the region as a function of time), but these are not necessary for my point here. We will compare the Pareto distributed income with uniform income which would be bounded by a symmetric simplex (yellow):

If we look at random walks inside each N-dimensional space bounded by these simplices (as I did here), we can see a key difference in the instability (variance) in the total consumption. Here is a typical time series path (blue is Pareto, yellow is uniform):

Note that these two time series were given the exact same set of log-linear shocks. It also manifests in a larger variation in the growth rate (average here is zero by construction, but as mentioned above, it could be positive), again blue is Pareto, yellow is uniform, and gray is the actual set of shocks applied to both series (which closely matches the uniform distribution):

What is happening here? Well, the Pareto distribution effectively turns the N-dimensional region into an n << N dimensional region. The variation in a few of the large-income dimensions becomes much more important. We could also imagine they'd become more susceptible to coordination (e.g. panic/financial crisis) since it is easier to obtain a correlation among n << N agents than N agents.
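Here's a stripped-down sketch of that effect in pure Python (the Pareto exponent, N, T and σ are illustrative, not calibrated to anything): the same shocks are applied to a Pareto-weighted and a uniformly-weighted economy, and the Pareto one shows larger growth fluctuations because a few large-income dimensions dominate the total.

```python
import math, random

# Toy version of the simulation: the *same* Brownian consumption shocks are
# applied to two economies, one with Pareto-distributed incomes and one with
# equal (uniform) incomes. All parameters are illustrative, not calibrated.
random.seed(42)
N, T, sigma = 100, 500, 0.05

pareto_income = [random.paretovariate(1.16) for _ in range(N)]
mean_income = sum(pareto_income) / N
uniform_income = [mean_income] * N  # same total income, equally shared

log_c = [0.0] * N                   # each agent's log consumption factor
growth_p, growth_u = [], []
prev_p = prev_u = None
for _ in range(T):
    for i in range(N):
        log_c[i] += random.gauss(0, sigma)
    tot_p = sum(math.exp(c) * w for c, w in zip(log_c, pareto_income))
    tot_u = sum(math.exp(c) * w for c, w in zip(log_c, uniform_income))
    if prev_p is not None:
        growth_p.append(math.log(tot_p / prev_p))
        growth_u.append(math.log(tot_u / prev_u))
    prev_p, prev_u = tot_p, tot_u

def std(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

# A few large-income dimensions dominate the Pareto economy's fluctuations
print(std(growth_p) > std(growth_u))
```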

What is also interesting is that one can easily visualize a couple of policies that address inequality in this framework. A progressive income tax or a progressive consumption tax would act as a uniformly-distributed (soft) bound on the allowed consumption states (income tax affects the bound itself, while consumption tax impacts the realized values of consumption). A universal basic income (UBI) would act as a lower bound. With these two policies, you could constrain the Pareto bound to an allowed region sandwiched between the two, making it more likely to be close to a uniform distribution. The figure below shows this with a 100% marginal income tax rate (red bound) and a universal basic income (green bound); the constrained Pareto bound is in blue.

I am not saying this is exactly what is happening in reality. This is just a possible understanding in terms of the maximum entropy/information equilibrium framework.


[1] This could easily be reconstructed in terms of wealth inequality or whatever form of economic inequality you prefer.

Tuesday, November 15, 2016

The surge in the 10-year rate

That was quite a surge in the 10-year rate after the election; however, it's still consistent with the error bands I put forward in 2015:

The Blue Chip Economic Indicators (BCEI) forecast continues to be incorrect.


Update 17 January 2017

That surge is starting to look transient ... stay tuned:

Thursday, November 10, 2016

Coupling things does not make them indistinguishable

Apart from my quibble with his use of the subjunctive in his piece, I also don't think that Noah Smith's argument that (aggregate) supply and demand are indistinguishable is a useful frame. Coupled, yes, but not indistinguishable.


"Coupled" is a physics term that might not be widely known; it's used everywhere from quantum field theory to classical mechanics to describe X and Y when X becomes a function of Y, i.e. X = X(Y), or when a term like X · Y is added to a Lagrangian (so that the equations of motion for X depend on Y when the partial derivative ∂/∂X is taken).

Noah describes several models in his post. Instead of supply, they are written in terms of productivity (where shocks are considered supply shocks). The first one couples expected future supply (productivity) to present demand:
Paul Beaudry and Franck Portier ... theorized that a recession could happen because people realize that a productivity slowdown could be coming in the future. Anticipating lower productivity tomorrow, companies would reduce investment, lay off workers and cut output. ... Traditionally, we think of random changes in productivity as supply shocks, because they affect how much companies are able to produce. But in the short term, the cutbacks in investment and hiring in Beaudry and Portier’s model would look a lot like a demand shock.
We could write in this model that present demand depends on future expected supply, coupling them:

Dt = f(Et St+1)

In the subsequent models, flagging demand today lowers supply (productivity) tomorrow:
productivity growth goes down after demand takes a hit, and that this can cause recessions to last years longer than they otherwise would
We could write this as future supply depending on present demand:

St+1 = f(Dt)

I think Noah's bigger point however is not that supply and demand are indistinguishable, but rather that supply and demand aren't related by the partial equilibrium supply and demand curves except under very specific conditions (scope). For example, the shocks might have to be small, be unanticipated, or happen quickly. The cases above concern times where shocks are anticipated or not small (e.g. high unemployment lasts long enough for it to degrade skills).

I think the information equilibrium framework is a useful corrective to this because it explicitly defines the scope where the partial equilibrium analysis is correct: when supply or demand adjusts slowly with respect to the other. I find it hard to believe that economists don't think of supply and demand curves this way, but Noah's article and Deirdre McCloskey's review of Piketty's book are evidence that this is not normal.

In any case, the information equilibrium framework looks at supply and demand as generally strongly coupled via the information equilibrium condition:

p = dD/dS = k D/S

You can solve the differential equation with two different scope conditions (supply adjusts faster than demand, demand adjusts faster than supply) yielding supply and demand curves:

There is of course the general solution where they move together (are strongly coupled). In this post, I show in simulations the various regimes of the supply and demand diagram; however, every simulation is done where general equilibrium holds (as an expected value) so even though I apply a demand shock or a supply shock, both supply and demand move together. Here is one animation from that post (a positive demand shock, with demand moving faster than supply adjusts):

In each case, supply and demand are distinguishable concepts (they represent different probability distributions) -- it's just that the market price strongly couples them (p = k D/S). It's true that in perfect equilibrium there is no real distinction between the two. We have I(D) = I(S) (the information revealed by supply events is equal to the information revealed by demand events when they meet in a transaction event), so there's no real difference between I(D) and I(S). However, much as the probability distribution of events in the double slit experiment is just the (squared) wavefunction, the wavefunction is still necessary to understand and solve the Schrödinger equation.

Nobody would have predicted

Noah Smith has a Bloomberg View article up about (aggregate) supply and demand. First, I have a quibble with Noah making a degenerating research program look like a progressive one:
... new technologies might get delayed for years, leading to slower long-term growth. ...This is the basis of two new theory papers [from 2015 and 2016] ... That seems to fit the experience of countries hit hard by the financial crisis and recession of 2007-9. ... It was a classic, textbook drop in aggregate demand. But the recession then dragged on for years, and productivity growth slowed, just as these models would have predicted.
These models "would have" predicted in the counterfactual (subjunctive) universe where they were developed before the financial crisis. However, both papers were explicitly inspired by the post-crisis recession. From one paper:
"We examine the hypothesis that the slowdown in productivity following the Great Recession was in significant part an endogenous response to the contraction in demand that induced the downturn."
and the other:
"... recently interest in this topic has reemerged motivated by ... the slow recoveries experienced by the US and the Euro area in the aftermath of the 2008 financial crisis."
Basically, the counterfactual universe where these models "would have" predicted slow growth post-crisis cannot logically exist.

Anyway, I have another post coming up that will discuss the substance of Noah's piece.

Update: here it is.

Wednesday, November 9, 2016

Quick stats refresher: analyzing the 2016 election

This is an economics and physics blog, but I'll do one post about some of the post-mortem analysis happening out there because it touches on math.

I've seen some arguments out there interpreting one candidate's loss of voters in one election, and a shift in another election, as voters moving from one candidate to the other. This is only true if either turnout is 100% (and therefore moves are zero-sum) or voters are drawn from the same pool, like this:

However, the more likely scenario is that voters are drawn from differing pools (e.g. polarized political affiliation):

In this case, "shifts" of votes from one side to the other could just be pulling more or fewer voters from your pool. This is behind partisan response bias.

A specific example I have in mind is the argument that Obama won among white voters in the Midwest in 2008 (or some similar situation) but Trump won them in 2016, therefore racism wasn't a factor. However, it could be that the two pools above represent racist and non-racist voters, and Obama pulled more from the non-racist pool in 2008 while Trump pulled more from the racist pool in 2016. (I'm not ruling out a candidate pulling from both pools; I've just simplified here.)

The thing is that some analysis out there appears to assume turnout was 100% (in which case voter shifts are zero-sum) or that voters come from the same pool. This isn't necessarily true.

Learning new tricks from an old book

In my previous post, I referenced a book review [pdf] of Bergmann's Theory of Relativity, a book I bought as a teenager not because I understood it, but because I aspired to understand it. I thought I'd call out a couple of quotes. First, one that is partially a response to Blackford's claim that classical mechanics didn't have logical contradictions:
Someone may look for a book on Relativity Theory which states clearly and axiomatically the assumptions of this theory and develops deductively the conclusions from these assumptions. This is not what Bergmann's book tries to do. What it tries to do, and does excellently, is to show how we were compelled to adopt these assumptions, how the structure of Relativity grew from logical contradictions in the classical theory, how their removal leads naturally and simply to the Theory of Relativity. The author presents not the painful historical process, not how Relativity was discovered, but how it should have been discovered if we had known the simple and straight road of logic leading to its formulation. Even in Relativity Theory, created almost by the genius of one man, this difference between the historically and logically reconstructed process is remarkable; it is the difference between the broad highway and the pioneer's narrow pathway.
This also is an example of noting the difference between what I call "Wikipedia science" (where everything is worked out) and "real science" (which is messier). It's that difference between the broad highway and the narrow pathway.

Also, I remembered that it was the first place I'd heard about Kaluza-Klein theory:
The third part (pp. 245-279) is of much more special character and deals with the unification of the gravitational and electromagnetic field. Here we find an exposition of Weyl's and Kaluza's theories and of their generalizations on which the author collaborated with Einstein. This part will rather interest specialists than students.
The review is from July of 1943. When it came up in string theory and the discussion of extra dimensions in the late 1990s, I thought back to the book and its strange (at the time) final chapters.

You never know where theory will lead, or end up becoming useful or relevant again.