Tuesday, January 17, 2017

A good series on macro and microfoundations

Brad DeLong linked to (what I think is) the third in a series of posts at Mean Squared Errors. At least I think they make a fine series:

  1. Houdini's Straightjacket
  2. The Microfoundations Hoax
  3. The Next New Macro

The story might be profitably read in reverse order as well. The main ideas are that microfoundations were adopted in a panic, essentially take the form of a straitjacket that the modern economist frees herself from in clever papers, and are largely useless at constraining the form of macroeconomic theories.

I actually used the straitjacket metaphor before myself. In fact, I've talked about a lot of the ideas mentioned in the three posts, which is why I found JEC (the blog's author) to be a kindred spirit [1]. However, JEC is a much better writer than I am. And because of that, I wanted to steal (with credit!) a series of things I wish I had said myself.

First, there's a great passage in The Microfoundations Hoax that I'd like to quote at length:
So when critics denigrated the [macroeconomic] models of the early '70's as "ad hoc," they had a pretty serious point. 
But what was the solution to all of this ad hoc-ery?  Where were we to look for the all-important virtue of discipline?   Ideally, in social science as in physical science, the source of discipline is data.  If you want to tell the difference between a true theory and a false one, you ask reality to settle the question.  But that was the heart of the problem: with so little data, all the models looked equally good in-sample, and no model looked especially good out-of-sample.  Discipline, if there was to be any, would have to come from theory instead.  And "microfoundations" was put forward as one form of theoretical discipline. 
The idea certainly sounded good: rather than simply making up relationships between aggregate variables like interest rates, output, unemployment, and inflation, we should show how those relationships arise from the behavior of individuals.  Or, failing that, we should at least restrict the relationships in our macro models to those which are consistent with our understanding of individual behavior.  For surely our standard assumptions about individual behavior (basically: people do the best they can under the circumstances they find themselves in) must imply restrictions on how the system behaves in the aggregate. 
Sadly, this intellectual bet was lost even before it was placed.  If we take Lucas (1976) as the beginning of the microfoundations movement, we may note with some puzzlement that the premise was mathematically proven false two years earlier, in Debreu (1974) and Mantel (1974).
Emphasis in the original. When I started this blog, I was seeking exactly the discipline JEC is talking about:
Put simply: there is limited empirical information to choose between alternatives. My plan is to produce an economic framework that captures at least a rich subset of the phenomena in a sufficiently rigorous way that it could be used to eliminate alternatives.
In a sense, that defines what I (and other physicists) mean by framework: a system or theory (in this case mathematical) that organizes empirical successes. Instead of human behavior, I started to build that framework out of information theory, based on the least controversial formulation of economics: a large set of demand events matches up with a large set of supply events on average (therefore in "equilibrium", the information contents of the two distributions are proportional to each other).

I'd like to paraphrase JEC above to describe the mission of the information equilibrium framework:
... rather than simply making up relationships between aggregate variables like interest rates, output, unemployment, and inflation, we should show how those relationships are consistent with information theory given any possible model of individual behavior.
In The Next New Macro, JEC shares my call for empirical success:
The fact is that, right now, we do not know the way forward, and no approach, no matter how promising (or how congenial to our pre-conceptions and policy preferences) should be allowed to dominate the field until it has proven itself empirically successful.
And I think JEC captures my feelings about a lot of economics:
You can't just say, "I feel in my heart that A, B, and C cause D, so here's a regression," and claim to be doing social science.  And when you're constantly saying, "Did I say 'B'?  I meant X!  A, X, and C cause D.  Also, maybe the log of B.  Here's another regression!" your credibility does not improve.
Now I'm not necessarily as uncharitable in this case. One of the key things you do in science is posit things like "A, B, and C cause D" and then look at the data. As I put it in this post, I'd like to formalize this process:
In a sense, I wanted to try to get away from the typical econoblogosphere (and sometimes even academic economics) BS where someone says "X depends on Y" and someone else (such as myself) would say "that doesn't work empirically in magnitude and/or direction over time" and that someone would come back with other factors A, B and C that are involved at different times. I wanted a world where someone asks: is X ⇄ Y? And then looks at the data and says yes or no.
The series is a fun read. You should check it out.

...

Footnotes:

[1] Not to imply JEC would agree with what I've done; he or she might look at it with the same derision as he or she looks at microfoundations.

Sunday, January 15, 2017

Curve fitting and relevant complexity

A continuing exchange with UnlearningEcon prompted me to revisit an old post from two and a half years ago. You don't need to go back and re-read it because I am basically going to repeat it with updated examples; hopefully my writing and explanation skills have improved since then.

The question UnlearningEcon asked was: what good are fits to the (economic) data that don't come from underlying explanatory (behavior) models?

The primary usefulness is that fits to various functional forms (ansatze) bound the relevant complexity of the underlying explanatory model. It is true that the underlying model may truly be incredibly complex, but if the macro-scale result is just log-linear (Y = a log X + b) in a single variable, then the relevant complexity of your underlying model is about two degrees of freedom (a, b). This is the idea behind "information criteria" like the AIC: more parameters represent a loss of information.
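To make that parameter-counting concrete, here is a minimal Python sketch (the data and the competing models are made up purely for illustration) comparing a two-parameter log-linear ansatz to a ten-parameter polynomial using the AIC:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "macro" data generated by a simple log-linear law plus noise
X = np.linspace(1.0, 10.0, 50)
Y = 2.0 * np.log(X) + 1.0 + rng.normal(0.0, 0.1, X.size)

def aic(y, y_fit, n_params):
    """AIC for a least-squares fit: n log(RSS/n) + 2k, up to an additive constant."""
    rss = np.sum((y - y_fit) ** 2)
    return y.size * np.log(rss / y.size) + 2 * n_params

# Two-parameter log-linear ansatz: Y = a log X + b
a, b = np.polyfit(np.log(X), Y, 1)
aic_loglinear = aic(Y, a * np.log(X) + b, 2)

# Ten-parameter polynomial ansatz: more structure than the data can support
coeffs = np.polyfit(X, Y, 9)
aic_poly = aic(Y, np.polyval(coeffs, X), 10)

# Lower AIC indicates the better complexity/fit trade-off
print(aic_loglinear, aic_poly)
```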

My old post looks at an example from thermodynamics. Let's say at constant temperature we find empirically that pressure and volume have the relationship p ~ 1/V (the ideal gas law). Since we are able to reliably reproduce this macroscopic behavior through the control of a small number of macroscopic degrees of freedom even though we have absolutely no control over the microstates, the microstates must be largely irrelevant.

And it is true: the quark and gluon microstate configurations (over which we have no control) inside the nucleus of each atom in every molecule in the gas are incredibly complicated (and not even completely understood ‒ it was the subject of my PhD thesis). But those microstates are irrelevant to understanding an ideal gas. The same applies to nearly all of the information about the molecules. Even the vibrational states of carbon dioxide are irrelevant at certain temperatures (we say those degrees of freedom are frozen out).

The information equilibrium framework represents a way to reduce the relevant complexity of the underlying explanatory model even further. It improves the usefulness of fitting to data by reducing the potential space of the functional forms (ansatze) used to fit the data. As Fielitz and Borchardt put it in the abstract of their paper, "Information theory provides shortcuts which allow one to deal with complex systems." In a sense, the DSGE framework does the same thing in macroeconomics. However, models with far less complexity do much better at forecasting ‒ therefore the complexity of DSGE models is probably largely irrelevant.

A really nice example of an information equilibrium model bounding the complexity of the underlying explanatory model is the recent series I did looking at unemployment. I was able to explain most of the unemployment variation across several countries (US and EU shown) in terms of a (constant) information equilibrium state (dynamic equilibrium) and shocks:


Any agent-based model that proposes to be the underlying explanatory model can't have much more relevant complexity than a single parameter for the dynamic equilibrium (the IT index restricted by information equilibrium) and three parameters (onset, width, magnitude) for each shock (a Gaussian shock ansatz). Because we can reliably reproduce the fluctuations in unemployment for several countries with no control over the microstates (people and firms), those microstates are largely irrelevant. If you want to explain unemployment during the "Great Recession", anything more than four parameters is going to "integrate out" (e.g. be replaced with an average). A thirty-parameter model of the income distribution is going to result in at best a single parameter in the final explanation of unemployment during the Great Recession.
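To be explicit about the parameter count, one way to write down that kind of ansatz (my paraphrase of the description above, not the exact form used in the fits) is

$$
\frac{d}{dt} \log \; u \; \approx \; \alpha + \sum_{i} a_{i} \exp \left( - \frac{(t - t_{i})^{2}}{2 \sigma_{i}^{2}} \right)
$$

with one dynamic equilibrium parameter $\alpha$ and three parameters $(t_{i}, \sigma_{i}, a_{i})$ ‒ onset, width, and magnitude ‒ per shock.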

This is not to say coming up with that thirty-parameter model is bad (it might be useful in understanding the effect of the Great Recession on income inequality). It potentially represents a step towards additional understanding. Lattice QCD calculations of nuclear states are not a bad thing to do; they represent additional understanding. They are just not relevant to the ideal gas law.

Another example is my recent post on income and consumption. There I showed that a single parameter (the IT index k, again restricted by information equilibrium) can explain the relationship between lifetime income and consumption as well as shocks to income and consumption:


Much like how I used an ansatz for the shocks in the unemployment model, I also used an ansatz for the lifetime income model (a simple three state model with Gaussian transitions):


This description of the data bounds the relevant complexity of any behavioral model you want to use to explain the consumption response to shocks to income or lifetime income and consumption (e.g. short term thinking in the former [pdf], or expectations-based loss aversion in the latter [pdf]).

That adjective "relevant" is important here. We all know that humans are complicated. We all know that financial transactions and the monetary system are complicated. However, how much of that complexity is relevant? One of the issues I have with a lot of mainstream and heterodox economics (that I discuss at length in this post and several posts linked there) is that the level of relevant complexity is simply asserted. You need to understand the data in order to know the level of relevant complexity. Curve fitting is the first step towards understanding relevant complexity. And that is the purpose of the information equilibrium model, as I stated in the second post on this blog:
My plan is to produce an economic framework that captures at least a rich subset of the phenomena in a sufficiently rigorous way that it could be used to eliminate alternatives.
"Alternatives" there includes not just different models, but also different relevant complexities.

The thing that I believe UnlearningEcon (as well as many others who disparage "curve fitting") misses is that curve fitting is a major step between macro observations and developing the micro theory. A good example is blackbody radiation (Planck's law) in physics [1]. This is basically the description of the color of objects as they heat up (black, red hot, white hot, blue ... like stars). Before Planck derived the shape of the blackbody spectrum, several curves had been used to fit the data. Some had been derived from various assumptions about e.g. the motions of atoms. These models had varying degrees of complexity. The general shape was basically understood, so the relevant complexity of the underlying model was basically bounded to a few parameters. Planck managed to come up with a one-parameter model of the energy of a photon (E = h ν), but the consequence was that photon energy is quantized. A simple curve fitting exercise led to the development of quantum mechanics! However, Planck probably would not have been able to come up with the correct micro theory (of light quanta) had others (Stewart, Kirchhoff, Wien, Rayleigh, Rubens) not been coming up with functional forms and fitting them to data.
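For reference, the curve Planck eventually arrived at (standard physics, included here just to show how few parameters are involved) is

$$
B_{\nu}(T) = \frac{2 h \nu^{3}}{c^{2}} \; \frac{1}{e^{h \nu / k_{B} T} - 1}
$$

with the earlier Wien and Rayleigh–Jeans forms appearing as its high- and low-frequency limits ‒ each functional form needing only a couple of parameters to pin down against data.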

Curve fitting is an important step, and I think it gets overlooked in economics because we humans immediately think we're important (see footnote [1]). We jump to reasoning using human agents. This approach hasn't produced a lot of empirical success in macroeconomics. In fact ‒ also about two and a half years ago ‒ I wrote a post questioning that approach. Are human agents relevant complexity? We don't know the answer yet. Until we understand the data with the simplest possible models, we don't really have any handle on what relevant complexity is.

...

Footnotes

[1] I think that physics was lucky relative to economics in its search for better and better understanding in more than one way. I've talked about one way extensively before (our intuitions about the physical universe evolved in that physical universe, but we did not evolve to understand e.g. interest rates). More relevant to the discussion here is that physics, chemistry, and biology were also lucky in the sense that we could not see the underlying degrees of freedom of the system (particles, atoms, and cells, respectively) while the main organizing theories were being developed. Planck couldn't see the photons, so he didn't have strong preconceived notions about them. Economics may be suffering from "can't see the forest for the trees" syndrome. The degrees of freedom of economic systems are readily visible (humans) and we evolved to have a lot of preconceptions about them. As such, many people think that macroeconomic systems must exhibit similar levels of complexity or need to be built up from complex agents ‒ or worse, that effects need to be expressed in terms of a human-centric story. Our knowledge of humans on a human scale forms a prior that biases against simple explanations. However, there is also the problem that there is insufficient data to support complex explanations (macro data is uninformative for complex models, aka the identification problem). This puts economics as a science in a terrible position. Add to that the fact that money is more important to people's lives than quantum mechanics and you have a toxic soup that makes me surprised there is any progress.

Saturday, January 14, 2017

Stocks and k-states, part V

The picture of stock markets presented here lends support to this picture of the financial sector. We can see the distribution of k-states (relative growth rate states) moves in concert during the financial crisis:


This means that, to first order, the financial sector (gray) can experience a shock behaving like an "anti-Keynesian" government sector (blue) where the rest of the distribution represents the entire economy:


The Keynesian government sector case was discussed here.

This was inspired by an exchange between Dan Davies and Noah Smith on Twitter:

Consumption, income, and the PIH


Noah Smith writes about the Permanent Income Hypothesis (PIH) failing to match the data at Bloomberg View. He also has a great follow up on his blog that takes the "it's good for certain cases" school of modeling to task.

On Twitter (and in a blog post), the paper Noah cited got some pushback because it measured expenses rather than consumption. Claudia Sahm got a good pwn in:
"because PIH is [about] smoothing marginal utility of consumption ... not market expenses"
But as she also says:
"[Milton] Friedman & [Gary] Becker were masters at writing down unfalsifiable/irrefutable [consumption] models ... "
That's the issue. If expenses do not reveal consumption utility, then that utility ‒ and thus the PIH ‒ is basically unobservable. However, I think the information equilibrium model can shed a bit of light on this subject and reconcile not only the paper Noah cites, but also a paper I discussed a few days ago. And we'll do it without utility.

First we'll start with Gary Becker's "irrational agents" with an intertemporal budget constraint. As I showed in this post, agents with random consumption will not only saturate the budget constraint (as long as there are many time periods), but will manifest consumption smoothing (i.e. the consequence of the PIH) in aggregate. And if income I is in information equilibrium with consumption C (i.e. $C \rightleftarrows I$, imagine the diagram at this link with A being C, and B being I) then both will be approximately constant.
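Here is a minimal simulation of that claim (my own sketch, not the calculation from the linked post): draw each agent's consumption path uniformly at random from the intertemporal budget constraint and average over agents.

```python
import numpy as np

rng = np.random.default_rng(42)

n_agents = 10_000   # number of "irrational" (random) agents
n_periods = 40      # periods in the intertemporal budget constraint
budget = 1.0        # lifetime budget (normalized)

# Uniform random points on the simplex {c_t >= 0, sum_t c_t = budget}
# are Dirichlet(1, ..., 1) draws scaled by the budget.
paths = rng.dirichlet(np.ones(n_periods), size=n_agents) * budget

aggregate = paths.mean(axis=0)  # average consumption in each period

# Individual paths are wildly uneven, but the aggregate is nearly flat
# (close to budget / n_periods in every period): emergent smoothing.
print(paths[0].round(3))
print(aggregate.round(3))
```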

In the paper discussed in this post, the reality is a bit more complex over the lifetime of agents. I put together a "social institution" model where there are three approximately constant states: childhood, working adult, and retirement. In this view there are actually three metastable constant states of income (set by the scales zero, NGDP per capita, and Social Security, for example). I fit a pair of logistic functions (representing the two transitions) to the income-as-a-function-of-age data (the transitions, at 16.3 years and 68.1 years, are shown in blue in the figure below). This means $I = I(t)$ is a function of age. Now if consumption and income are in information equilibrium (IE) with information transfer (IT) index $k$, we can say

$$
\frac{C}{C_{0}} = \left( \frac{I}{I_{0}} \right)^{k}
$$

Therefore $C(t) \sim I(t)^{k}$. Fitting the IT index and the constants $I_{0}$ and $C_{0}$ to the data, we get a pretty good description of the consumption data (yellow points):


The blue points are the income data. The blue line is the logistic model. The yellow points are the consumption data. And finally, the yellow line is the IE model with the three-state logistic model (fit to the income data) as input. We find $k \sim 0.53$ [1]. This is important because $k < 1$ means that the (fractional) change in consumption will be smaller than the (fractional) change in income since

$$
\frac{dC}{C} = \; k \; \frac{dI}{I}
$$

Note that the PIH could be considered to be the approximation that $k \ll 1$; in that case $C$ doesn't react to any changes in income. However, we found $k \sim 0.5$, so the PIH's scope condition is not met and the PIH isn't even an effective theory unless you are dealing with a scenario where both income and consumption are constant ‒ essentially, where you want the PIH to apply (changing income), it fails. But $k < 1$, so consumption does have a smaller reaction than income does in the presence of shocks. Additionally, the emergent consumption smoothing (linked above) relies on an assumption of no shocks. In the presence of shocks, the intertemporal budget constraint is not saturated, and therefore the PIH would in general be violated.
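For what it's worth, the fit for $k$ itself is just a linear regression in log-log space; a rough Python sketch (with placeholder arrays standing in for the paper's income and consumption series) looks like this:

```python
import numpy as np

# Placeholder arrays (levels); in the actual fit these come from the paper's
# lifetime income and consumption data as functions of age.
income = np.array([20.0, 35.0, 50.0, 55.0, 40.0, 30.0])
consumption = np.array([25.0, 32.0, 40.0, 42.0, 36.0, 31.0])

# Information equilibrium: C/C0 = (I/I0)^k  =>  log C = k log I + const,
# so the IT index k is just the slope of a log-log regression.
k, const = np.polyfit(np.log(income), np.log(consumption), 1)
print(f"IT index k = {k:.2f}")

# k < 1 means fractional changes in consumption are smaller than
# fractional changes in income: dC/C = k dI/I.
```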

Can we use this IT index value to explain the consumption (expense) and income data in Noah's BV post? If $k < 1$, then $C(t)$ will move much less than $I(t)$, which gives us a picture that looks exactly like the data Noah shows. In fact, fitting the income data [2] to the consumption (expense) data gives us almost the exact same IT index of $k \simeq 0.55$:


Here, the yellow line is the IE model shown above with the (interpolated) income data as the input function $I(t)$. Basically, consumption moves more slowly than income, so a large income shock leads to a smaller consumption shock. The PIH (if it is observable, i.e. expenses reveal consumption utility) would be the case where consumption doesn't move at all. What we have is a case somewhat between the $C = I$ case (i.e. $k = 1$) and the $C = \text{ constant}$ case (i.e. $k = 0$).

The information equilibrium model is able to capture both sets of data (lifetime consumption and consumption in the presence of shocks); it tells us that since $k \sim 0.5$ (empirically), the PIH is generally a bad approximation whenever income is dynamic (not constant). If $k$ had been closer to 0.1 or even smaller, the PIH would be a good approximation. But the $k \sim 0.5$ result means we should listen to Noah:
This means Friedman’s theory doesn’t just need a patch or two ‒ it needs a major overhaul.
...

Update 15 January 2017

@unlearningecon is "not convinced" [3] by fitting logistic curves to the income data above. As that step is actually incidental to the primary conclusion (that the information equilibrium model $C \sim I^{k}$ with $k \sim 0.5$ describes the data), let me do the lifetime model in the same way I did the shock model: by using an interpolation of the income data as the function $I(t)$. We obtain the same result:


...

Footnotes

[1] Also note that consumption is greater than income in childhood, less than income as a working adult, and again greater than income in retirement.

[2] In this case I just used the $I(t)$ data as the model, however I could probably fit the income data to a shock like this.

[3] Realistically, who is ever "convinced" of something which they don't already believe by anything they read on the internet? I think the originators of the (public) internet believed it would be an open forum for the exchange of ideas. A good behavioral model would have possibly "convinced" them that all that really happens is confirmation bias and the reinforcement of priors. Ah, but who am I kidding?

Thursday, January 12, 2017

Dynamic unemployment equilibrium: Japan edition

This is basically the previous post, but for Japan:



The centroids of the recessions in the fit are at 1974.8, 1983.1, 1993.9, 1998.4, 2001.6, and 2009.0. There does not seem to be any indication of another recession since 2009, contra this post.

Dynamic unemployment equilibrium: Eurozone edition

I thought I'd redo my old version of the EU unemployment rate using the updated logarithmic version of the naive dynamic equilibrium model. However, one of the funny things that happened was that I didn't realize right off the bat that the data at FRED only goes until the end of 2013. I ended up fitting the model using just that data, so when I went to retrieve the latest data from the ECB (shown in black on the next graph) it ended up becoming a prediction of sorts. It worked pretty well:


Here is the fit to the full dataset:


The  recession centers are at 2003.1, 2009.1, and 2012.2.

Update + 3 hrs

I forgot to include the derivative (visualizing the shocks):


Wednesday, January 11, 2017

Dynamic equilibrium: unemployment rate


In last night's post, I looked at the ratio of the level of unemployment $U$ to job vacancies $V$ in terms of the "naive dynamic equilibrium" model. There was a slight change from the original version describing the unemployment rate $u$: instead of taking the dynamic equilibrium to be

$$
\frac{du}{dt} \approx \; \text{constant}
$$

I used the logarithmic derivative:

$$
\frac{d}{dt} \log \; \frac{V}{U} \approx \; \text{constant}
$$

This latter form is more interesting (to me) because it can be directly related to a simple information equilibrium model $V \rightleftarrows U$ (vacancies in information equilibrium with unemployment) as shown in last night's post. Can we unify these two approaches so that both use the logarithmic form? Yes, it turns out it works just fine. Because $u$ doesn't have a high dynamic range, $\log u(t)$ is approximately a linear transform of $u(t)$, so we basically just end up with a different constant.
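Spelling that out (a standard first-order expansion, nothing specific to the model): for $u$ near a typical value $u_{0}$,

$$
\log \; u \approx \log \; u_{0} + \frac{u - u_{0}}{u_{0}}
$$

so a roughly constant $du/dt$ and a roughly constant $d(\log u)/dt$ differ only by the scale factor $u_{0}$.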

If we have an information equilibrium relationship $U \rightleftarrows L$ where $L$ is the civilian labor force such that $u = U/L$, and we posit a dynamic equilibrium:

$$
\frac{d}{dt} \log \; \frac{U}{L} =\;  \frac{d}{dt} \log \; u \;  \approx \; \text{constant}
$$

Then we can apply the same procedure as documented here and come up with a better version of the dynamic equilibrium model of the unemployment rate data:



[Update: added derivative graph.] The transition points are at 1991.0, 2001.7, 2008.8, and 2014.3 (the first three are the recessions, and the last one is the "Obamacare boom"). The improvements are mostly near the turnaround points (peaks and valleys). Additionally, the recent slowing in the decrease in the unemployment rate no longer looks as much like the leading edge of  a future recession (the recent data is consistent with the normal flattening out that is discussed here).

I just wanted to repeat the calculation I did in the previous post for the unemployment rate. We have an information equilibrium relationship $U \rightleftarrows L$ with a constant IT index $k$ so that:

$$
\log \; U = \; k \; \log \; L + c
$$

therefore if the total number of people in the labor force grows at a constant rate $r$, i.e. $L\sim e^{rt}$ (with, say, $r$ being population growth)

$$
\begin{align}
\log \; \frac{U}{L} = & \; \log U - \log L\\
= & \; k\; \log L + c - \log L\\
 = & \; (k-1)\; \log L + c\\
\frac{d}{dt}\log \; u = & \;  (k-1)\;\frac{d}{dt} \log L \\
 = & \; (k-1)r
\end{align}
$$

which is a constant (and $u \equiv U/L$).

...

Update +4 hours:

I went back and added the recessions through the 1960s. Because the logarithm of the unemployment rate is much better behaved, I was able to fit the entire series from 1960 to the present with a single function (a sum of 7 logistic functions) instead of the piecewise method I had used for the non-logarithmic version that fit the pre-1980s, the 1980s, and the post-1980s separately. Here is the result:



The recession years are: 1970.3, 1974.8, 1981.1, 1991.1, 2001.7, and 2008.8 ‒ plus the positive ACA shock in 2014.4. And except for one major difference, the model is basically within ± 0.2 percentage points (20 basis points). That major difference is the first half of the 1980s when Volcker was experimenting on the economy ...


It's possible that we need to resolve what may actually be multiple shocks in the early 1980s.

Update 14 January 2017

Later posts examine the EU and Japan. First the EU:



The recession centers are at 2003.1, 2009.1, and 2012.2. And for Japan:



The centroids of the recessions in the fit are at 1974.8, 1983.1, 1993.9, 1998.4, 2001.6, and 2009.0.

...

Update 17 January 2017

Employment level using the model above:


Tuesday, January 10, 2017

A dynamic equilibrium in JOLTS data?


Steve Roth linked to Ben Casselman in a Tweet about the latest JOLTS data, and both brought attention to the fact that the ratio of unemployed to job openings appears to have flattened out; Steve also made the observation that it doesn't seem to go below 1.

I thought that the JOLTS data might make a good candidate for my new favorite go-to model ‒ the naive dynamic equilibrium that I recently used to make an unwarranted forecast of future unemployment. However, in this case, I think the framing of the same data from voxeu.org in terms of the logarithm might be the better way to look at the data.

We will assume that the logarithmic derivative of the inverse ratio (i.e. openings over unemployed) is a constant

$$
\frac{d}{dt} \log \; \frac{V}{U} \simeq \;\text{ constant}
$$

where $\exp \; (- \log (V/U)) = U/V$ which is the ratio Ben graphs. Using the same entropy minimization procedure, fitting to a sum of logistic functions, and finally converting back to the $U/V$ variable Ben and Steve were talking about, we have a pretty decent fit (using two negative shocks):


The shocks are centered at 2000.6 and 2008.6, consistent with the NBER recessions just like the unemployment model. There's a bit of overshooting after the recession hits just like in the unemployment case as well. I've also projected the result out to 2020. If there weren't any shocks, we'd expect the ratio to fall below 1 in 2018 while spending approximately three to four years in the neighborhood of 1.3. And I think that explains why we haven't seen a ratio below 1 often: the economy would have to experience several years without a shock. A ratio near 2 requires a shorter time to pass; a ratio near 4 requires an even shorter time. Also in this picture, Ben's "stabilization" is not particularly unexpected.

However, there's a bit more apparent in this data that is not entirely obvious in the $U/V$ representation shown above. For example, Obamacare pushed job openings in health care higher, likely even reducing the unemployment rate by a percentage point or more (as I showed here). This also has a noticeable effect on the JOLTS data; here's the fit with three shocks ‒ two negative and one positive:


With the positive Obamacare shock, the onset of 2014.3 (about mid-April 2014) is consistent with the unemployment findings, and the negative shocks are in nearly the same places, at 2000.8 and 2008.7. We can see that what originally looked like mean reversion in the past year now looks a bit like the beginning of a deviation. As I asked in my unwarranted forecast at the link above, what if we're seeing the early signs of a future recession? The result hints at a coming rise, but is fairly inconclusive:


I wanted to keep the previous three graphs in order for you to be able to cycle through them in a web browser and see the subtle changes. This graph of the logarithm of $V/U$ that I mentioned at the top of the post allows you to see the Obamacare bump and the potential leading edge of the next recession a bit more clearly:


It's important to note that this leading edge may vanish; for example, look at 2002-2005 on the graph above, where leading edges of a recovery and a recession both appeared to vanish.

Now is this the proper frame to understand the JOLTS data? I don't know the answer to that. And the frame you put on the data can have pretty big impacts on what you think is the correct analysis. The dynamic equilibrium is at least an incredibly simple frame (much like my discussion of consumption smoothing here) that's pretty easy to generate with a matching model ($U$ and $V$ grow/shrink at a constant relative rate). Even better is an information equilibrium model, because the equation at the top of this page essentially says that $V \rightleftarrows U$ with a constant IT index $k$, so that:

$$
\log \; V = \; k \; \log \; U + c
$$

therefore if the total number of unemployed grows at a constant rate $r$, i.e. $U\sim e^{rt}$ (say, with population growth)

$$
\begin{align}
\log \; \frac{V}{U} = & \; \log V - \log U\\
= & \; k\; \log U + c - \log U\\
 = & \; (k-1)\; \log U + c\\
\frac{d}{dt}\log \; \frac{V}{U} = & \;  (k-1)\;\frac{d}{dt} \log U \\
 = & \; (k-1)r
\end{align}
$$

which is a constant (about 0.21, or about 21% of unemployed per year or 1.7% per month, at least since 2000 ‒ multiple equilibria are possible).
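A rough Python sketch of this kind of dynamic equilibrium fit (not the original code; the data here is synthetic, standing in for a log V/U series built from JOLTS openings and the unemployment level):

```python
import numpy as np
from scipy.optimize import curve_fit

def dynamic_equilibrium(t, slope, offset, a1, t1, w1, a2, t2, w2):
    """Constant d/dt log(V/U) plus two logistic shocks (each steps the
    level down by a_i around time t_i with width w_i)."""
    shocks = (a1 / (1.0 + np.exp((t - t1) / w1)) +
              a2 / (1.0 + np.exp((t - t2) / w2)))
    return slope * (t - 2000.0) + offset + shocks

# Synthetic stand-in for log(V/U); a real fit would use the JOLTS series.
t = np.arange(2001.0, 2017.0, 1.0 / 12.0)
truth = dynamic_equilibrium(t, 0.21, -0.5, 0.9, 2001.0, 0.3, 1.1, 2008.7, 0.4)
rng = np.random.default_rng(1)
log_vu = truth + rng.normal(0.0, 0.02, t.size)

guess = [0.2, -0.5, 1.0, 2001.0, 0.3, 1.0, 2008.5, 0.4]
params, _ = curve_fit(dynamic_equilibrium, t, log_vu, p0=guess)
print(dict(zip(["slope", "offset", "a1", "t1", "w1", "a2", "t2", "w2"],
               np.round(params, 2))))
```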

Update 11 January 2017

In case you are interested, here is the code (click to expand):


With an unbiased prior, market failure is the most likely outcome

Noah Smith has a short article at Bloomberg View about how free market proponents make a play for the null hypothesis; he convincingly argues that maybe economists shouldn't always have to bear the burden of proving market failure.

I'd like to point out that if we use the information transfer framework while assuming we know nothing about the situation, market failure should be our default hypothesis. That's because, generally, in a market where A is exchanged for B, we can at best assume the information entropy of the distribution of A is a bound on the information about that distribution received at B. Symbolically:

I(A) ≥ I(B)

The intuition here is that if you send a signal about a change [0] in A to which B should respond (e.g. to clear the excess supply or demand that results), at best B can receive all of the information in the signal. Given noise, irrational human actions, or sudden drops in the information content of the distributions (e.g. everyone wants to sell the same asset, so that it takes less information to describe the aggregate ‒ everyone wants to sell ‒ than to describe a typical case ‒ Alice wants to buy, Bob wants to sell, Carol wants to sell, David wants to sell, Edna wants to buy, etc. [1]), B will generally receive less information than was sent.

If we were completely ignorant about what is happening, then we'd expect

I(A) > I(B)

most of the time. The math can still be useful here; the resulting differential equation [2] for information equilibrium (ideal information transfer) ‒ i.e. I(A) = I(B) ‒ becomes a bound for I(A) > I(B). However, we really have to accept we know a lot less than we might think.
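For readers who haven't seen it elsewhere on this blog, that differential equation ‒ as it's usually written in this framework, with A as the information source ‒ is

$$
\frac{dA}{dB} = \; k \; \frac{A}{B}
$$

and non-ideal information transfer, I(A) > I(B), relaxes the equality to a bound, with the left-hand side falling below $k \, A/B$.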

In a sense, in order to say that markets are working, you first have to show they are working [3]. This will involve empirical tests; my favorites are predictions like this one:


Predictions are my favorite because the one way you can be absolutely sure you have data you haven't trained your model on is when that data comes from the future. Actually, long term interest rates are one of the more ideal markets out there ‒ at least so far. Exchange rates (forex markets) might be one of the worst.

There are a lot more ways a market can fail than it can work. The more surprising fact is that there are working markets. When some people say that governments are responsible for functioning markets, this is one of the reasons I'm inclined to believe it.

...

Footnotes

[0] You can actually derive the information equilibrium condition by assuming an infinitesimally small signal dA and the infinitesimal response dB.

[1] Think of this as the sequence 00000... vs 10001... so that the second one requires more information to specify. A constant sequence (a light that's always on or off) carries no information.

[2] The interesting part is that this differential equation was almost written down by Irving Fisher in 1892.

[3] The micro- and macro-economic theory I've been developing on this blog actually grew out of an attempt to understand whether prediction markets were working for a vastly different application. The answer seems to be as stated above: probably not.

Monday, January 9, 2017

Behavior, institutions, and other modeling assumptions

@Unlearningecon referred me to a paper [pdf] that tries to understand lifetime consumption based on a model of human behavior. Unfortunately, the model has 10+ parameters for ~ 40 data points, meaning the AIC is pretty bad (an indication of over-fitting). I challenged myself to do better, and I will try to do so using an "institutional" model.

Let's say there are three "institutions" (in physics, we might call these "states") in the culture under consideration, which we'll call childhood, working age, and retirement. As such, there will be two transitions that we'll take to have some width (not every agent transitions at the same time). This gives us a model that is the sum of two logistic functions with 7 parameters [1]. We'll fit these to the same data as the paper linked above.
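A minimal sketch of that functional form in Python (the parameter names, synthetic data, and starting values below are illustrative, not the fitted values quoted later):

```python
import numpy as np
from scipy.optimize import curve_fit

def lifetime_log_income(age, y_child, y_work, y_retire,
                        t_enter, w_enter, t_exit, w_exit):
    """Three income 'states' joined by two logistic transitions:
    childhood -> working adult at t_enter, working adult -> retirement
    at t_exit. Seven parameters in total."""
    enter = 1.0 / (1.0 + np.exp(-(age - t_enter) / w_enter))
    leave = 1.0 / (1.0 + np.exp(-(age - t_exit) / w_exit))
    return y_child + (y_work - y_child) * enter + (y_retire - y_work) * leave

# Synthetic stand-in for the paper's log-income-by-age data points
age = np.linspace(0.0, 90.0, 91)
rng = np.random.default_rng(3)
log_income = (lifetime_log_income(age, 0.0, 10.5, 9.8, 25.0, 4.0, 68.0, 1.0)
              + rng.normal(0.0, 0.05, age.size))

p0 = [0.0, 10.0, 10.0, 25.0, 5.0, 68.0, 1.0]  # rough starting guesses
params, _ = curve_fit(lifetime_log_income, age, log_income, p0=p0)
print(np.round(params, 1))
```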

This is the result:


The transitions are at 25.1 and 67.6 years; the former has a width of 4.7 years, while the latter has a width consistent with zero (the actual value translates to about 51 minutes which I found entertaining ‒ you'd probably want to look deeper into that than not at all if you were publishing a paper). I did add one (hopefully uncontroversial) point: income equals zero at birth. This helps pin down the childhood-to-adulthood transition and even gives us a look at expected income for people younger than 25:


I converted from log income to level (in thousands). The horizontal line shows full-time employment at the minimum wage (approximately 2000 hours at $3.35 per hour, the minimum wage in 1984) as a scale reference.

My primary motivation here was not necessarily to say: Look at me; I can do better! That was just my challenge to myself. My main motivation was to say that there are many ways to parse the data and different models that come out of those different "priors" are not necessarily distinguishable based on data alone. For example, you could view the institutional model as exactly a behavioral expectations model: our expectations and behavior are governed by social institutions. People in US society expect to be working between their 20s and 60s and base their consumption accordingly.

And there are even simpler models such as the emergent consumption smoothing of random agents. This effectively fits the data to a constant, and is basically the approximation made in a lot of standard economic analysis.

At a certain level, these modeling choices are judgment calls. But on another level they represent priors we push onto the data. The criterion should come down to simplicity, and not our pre-existing feelings about what kind of models we ought to use (behavioral, nonlinear, evolutionary, institutional). Basically all claims that "economists should do X" or "models should include Y" should come alongside more agnostic and simplistic models where X or Y are shown to be an improvement over the simplistic explanation of the data [2].

*  *  *

Footnotes

[1] Although one is an overall normalization degree of freedom (the scale of money) and another is used to explain the childhood state, for which there is not much real data in the given dataset. The data given could actually be understood with 5 parameters. A simpler model with no transition width would have only 5 parameters including the normalization (three step functions).

[2] That's part of the mission statement of this blog and the information transfer framework. It represents taking that ethos to an almost ridiculous extreme where everything is built out of log-linear relationships (not exactly, but it's not a completely unfair characterization).