Thursday, October 19, 2017

Real growth

In writing my post that went up this afternoon, I became interested in looking closely at US real GDP (RGDP) in terms of a dynamic equilibrium model for nominal GDP (NGDP) and the GDP deflator (DEF), where RGDP = NGDP/DEF. So I went ahead and tried to put together a detailed description of the NGDP data and the DEF data. This required several shocks, but one of the interesting aspects was that there appear to be two regimes:
1. The demographic transition/Phillips curve regime (1960-1990)
2. The asset bubble regime (1990-2010)
The DEF data looks much like the PCE data that I referenced in talking about a fading Phillips curve. The NGDP data is essentially one major transition coupled with two big asset bubbles (dot-com and housing):


These are decent models of the two time series:


Taking the ratio gives us RGDP, and the (log) derivative gives us the RGDP growth rate:


It's a pretty good model. The main difference is that for the "Phillips curve" recessions, there are large, narrow shocks to RGDP near the bottom of those business cycles, shocks that are both narrower and larger in magnitude than we might expect (these are in fact the shocks associated with spiking unemployment rates). We can also separate out the contributions from NGDP and DEF:
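In code, the decomposition itself is almost trivial. Here is a minimal sketch in Python, with made-up placeholder shock parameters standing in for the fitted values behind these graphs (the actual fits were done separately):

import numpy as np

def log_de_model(t, alpha, c, shocks):
    # Dynamic equilibrium model for log X(t): a constant log growth rate alpha
    # plus a sum of logistic shocks, each given by (amplitude, center, width).
    y = alpha * t + c
    for a, t0, w in shocks:
        y += a / (1.0 + np.exp(-(t - t0) / w))
    return y

# Placeholder (not fitted) parameters, for illustration only.
t = np.linspace(1960, 2017, 600)
log_ngdp = log_de_model(t, 0.040, -70.0, [(1.2, 1978, 6.0), (0.15, 1999, 2.0), (0.18, 2006, 2.5)])
log_def = log_de_model(t, 0.015, -26.0, [(0.9, 1977, 5.0)])

# RGDP = NGDP/DEF, so log RGDP = log NGDP - log DEF and the (log) growth
# rates just subtract.
log_rgdp = log_ngdp - log_def
rgdp_growth = np.gradient(log_rgdp, t)
ngdp_growth = np.gradient(log_ngdp, t)
def_growth = np.gradient(log_def, t)
assert np.allclose(rgdp_growth, ngdp_growth - def_growth)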


Without the data it's easier to see (and I added some labels as well):


It does not currently look like there is another asset bubble forming. This is consistent with the dynamic equilibrium model for household assets, and you can also tell the dot-com bubble was a stock bubble as it shows up in assets and the S&P 500 model. In fact, today's equilibrium in both NGDP and DEF is actually somewhat unprecedented. We might even call it a third regime:
3. The equilibrium (2010-present)
In the past, the things that caused business cycles were war, demographic transitions, and asset bubbles. What if there aren't any more recessions? That would be a strange world for macroeconomics. Maybe macro is confused today about productivity slowdowns and secular stagnation because we've only now reached equilibrium, whereas everyone assumed the economy had been at equilibrium at least some of the time in the past. In fact, the mid-50s and mid-90s were the only times we were even close.

I am pretty sure there will be some asset bubble (or war) in the future, because humans. I have no idea what that asset (or war) will be, but it's something we should keep our eyes on. At least, that's the case if this model is accurate, which is why I will continue to test it.

But maybe we've finally reached Keynes' flat ocean?

In the right frame, economies radically simplify

I was reading Simon Wren-Lewis on productivity, this out of NECSI, as well as this from David Andolfatto on monetary policy. It sent me down memory lane with some of my posts (linked below) where I've talked about various ways to frame macro data.

The thing is that certain ways of looking at the data can cause you to make either more complicated or less complicated models. And more complicated models don't always seem to be better at forecasting.

Because we tend to think of the Earth as being at rest, we have to add Coriolis and centrifugal "pseudo forces" to Newton's laws: the rotating Earth is a non-inertial frame. In an inertial frame, Newton's laws simplify.

Because ancient astronomers thought not only that they were seeing circles in the sky, but also that the Earth was at rest at the center, they had to add epicycle upon epicycle to the motions of the planets. In Copernicus's frame (with a bit of help from Kepler and Newton), the solar system is much simpler (on the time scale of human civilization).

Now let me stress that this is just a possibility, but maybe macroeconomic models are complex because people are looking at the data using the wrong frame and seeing a complex data series?

As I mentioned above, I have written several posts on how different ways of framing the data — different models — can affect how you view incoming data. Here is a selection:


https://informationtransfereconomics.blogspot.com/2017/03/the-recovery-and-using-models-to-frame.html

https://informationtransfereconomics.blogspot.com/2017/04/macroeconomics-has-no-equilibrium-data.html

https://informationtransfereconomics.blogspot.com/2017/04/productivity-growth-and-verdoorns-law.html

https://informationtransfereconomics.blogspot.com/2017/04/growth-regimes-lowflation-and-dynamic.html

One thing that ties these posts together is that not only do I use the dynamic equilibrium model as an alternative viewpoint to those of economists, but also that the dynamic equilibrium model radically simplifies these descriptions of economies.

What some see as the output of complex models riddled with puzzles becomes almost laughably simple exponential growth plus shocks. In fact, not much seems to have happened in the US economy at all since WWII except women entering the workforce — the business cycle fluctuations are trivially small compared to this effect.

We might expect our description of economies to radically simplify when we have the right framing. In fact, Erik Hoel has formalized this in terms of effective information: delivering the most information about the state of the system using the right agents.

Whether or not you believe Hoel about causal emergence — that these simplifications must arise — we know we are encoding the most data with the least amount of information because the dynamic equilibrium models described above for multiple different time series can be represented as functions of each other.

If one time series is exp(g(t)), then another time series exp(f(t)) is given by

f(t) = c g(a t + b) + d t + e

And if Y = f(X), then H(Y) ≤ H(X).

[ed. H(X) is the information entropy of the random variable X]

Now this only works for a single shock in the dynamic equilibrium model (the coefficients a and b adjust the relative widths and centroids of the single shocks in the series defined by f and g). But as I mentioned above, most of the variation in the US time series is captured by a single large shock associated with women entering the workforce.
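To see this explicitly for the single-shock case, write both (log) series in the logistic-shock form with σ(x) = 1/(1 + exp(−x)) — this particular shock shape is an assumption, but it's the parametrization used throughout these models:

g(t) = γ₁ t + A₁ σ((t − τ₁)/w₁) + k₁
f(t) = γ₂ t + A₂ σ((t − τ₂)/w₂) + k₂

Choosing a = w₁/w₂ and b = τ₁ − a τ₂ lines up the shock widths and centroids, c = A₂/A₁ rescales the shock amplitude, and d = γ₂ − c γ₁ a and e = k₂ − c (γ₁ b + k₁) absorb the difference in dynamic equilibrium growth rates and constant offsets, so that c g(a t + b) + d t + e = f(t) identically.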

The dynamic equilibrium frame not only radically simplifies the description of the data, but radically reduces the information content of the data. But the kicker is that this would be true regardless of whether you believe the derivation of the dynamic equilibrium model or not.

You don't have to believe there's a force called gravity acting between any two things with mass to see how elliptical orbits with the sun at one focus radically simplify the description of the solar system. Maybe there's another way to get those elliptical orbits. But you'd definitely avoid making a new model that requires you to look at the data as being more complex (i.e. having a higher information content).

This is all to say the dynamic equilibrium model bounds the relevant complexity of macroeconomic models. I've discussed this before here, but that was in the context of a particular effect. The dynamic equilibrium frame bounds the relevant complexity of all possible macroeconomic models. If a model is more complex than the dynamic equilibrium model, then it has to perform better empirically (with a smaller error, or encompass more variables with roughly the same error). More complex models should also reduce to the dynamic equilibrium model in some limit if only because the dynamic equilibrium model describes the data [1].

...

Footnotes:

[1] It is possible for effects to conspire to yield a model that looks superficially like the dynamic equilibrium model, but is in fact different. A prime example is a model that yields a dynamic equilibrium shock as the "normal" growth rate and the dynamic equilibrium "normal" as shocks. Think of a W-curve: are the two up strokes the normal, or the down strokes? Further data should eventually show that you have either longer up strokes or longer down strokes, and that it was possible you were just unlucky with the data you started with.

Wednesday, October 18, 2017

Miracles

Scott Sumner has a review of the Rethinking Macroeconomics conference in which he says:
On the negative side, I was extremely disappointed by some of the comments on monetary policy. In response to calls for a higher inflation target to avoid the zero bound problem, Jeremy Stein of Harvard University asked something to the effect "What makes you think the Fed can achieve higher inflation?" (Recall that Stein was recently a member of the Federal Reserve Board.) I was pleased to see Olivier Blanchard respond that there is no doubt that we can achieve 4% inflation, or indeed any trend inflation rate we want. But then Larry Summers also suggested that he shared Stein's doubts (albeit to a lesser extent.) 
I kept thinking to myself: Why do you guys think the Fed is currently engaged in steadily raising the fed funds target? What do you think the Fed is trying to achieve? How can a top Fed official not think the Fed could raise its inflation target during a period when we aren't even at the zero bound? Why has the US averaged 2% inflation since 1990---is it just a miracle?
I've addressed almost this exact statement before (with rather less derision), but the final question, about averaging 2% inflation since 1990, is either the most innumerate claim I've ever seen from a PhD economist or just an incredibly disingenuous one ... to the point of lying on purpose to deceive.

I tried to select the data series that makes Sumner's claim as close to true as possible. It requires headline PCE inflation, but regardless of the measure you use, you get the same result I will illustrate below.

Why does Sumner choose 1990? Well, it is in fact the only year that makes his claim true:


For later starting years, average inflation is lower; for earlier starting years, average inflation is higher. In fact, average inflation since year Y has been almost monotonically decreasing as a function of Y. Therefore, since it was higher than 2% at some time in the past, the statement "inflation has averaged 2% since Y = Y₀" is true for some Y₀ (and since it is almost monotonic, there is only one such Y₀). It just so happens Y₀ ≈ 1990. There's a miracle all right — but the miracle is that Sumner would pick 1990, not that the Fed would pick 2%. I'm more inclined to believe Sumner chose 1990 in order to keep his prior that the Fed can target whatever inflation rate it wants while the Fed says it's targeting 2% [1].
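If you want to check this yourself, the exercise is a few lines of Python. This sketch assumes a monthly headline PCE price index (FRED series PCEPI) saved as a CSV; the file name and column names are placeholders:

import pandas as pd

# Assumed input: FRED's monthly headline PCE price index (PCEPI), with a
# DATE column and a PCEPI column.
pce = pd.read_csv("PCEPI.csv", parse_dates=["DATE"], index_col="DATE")["PCEPI"]

# Year-over-year inflation in percent.
yoy = 100 * (pce / pce.shift(12) - 1)

# Average inflation from each candidate starting year to the present.
# Because this average falls (almost monotonically) as the start year moves
# later, there is essentially one start year where it crosses 2%.
for start in range(1960, 2017):
    print(start, round(yoy[str(start):].mean(), 2))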

The other possibility here (Occam's razor) is that inflation is just falling and the Fed has no control over it [2]. But regardless of what is actually happening, Sumner is either fooling himself or others with this "evidence". And as we add more data to this series, unless PCE inflation starts to come in above 2%, Sumner's claim is going to eventually become wrong [3]. Will he reconsider it then? 

This kind of numbers game is really upsetting to me. It is the inflation equivalent of the statements by global warming deniers that there's been "no statistically significant warming since 1997" (which uses the fact that a large volcanic eruption caused temperatures to not rise for a few years, and additionally is playing a rather loose game with the words 'statistically significant' — at the time they were making that claim there wasn't enough data to say any increase was statistically significant unless it was huge).

I know: Hanlon's razor. But in the case of global warming deniers it was a deliberate attempt to mislead.

...

Footnotes:

[1] Someone, I don't remember who (possibly Tim Duy?), noticed that the Fed seems to actually be looking at average headline inflation of 2%, which would mean that Sumner should have chosen 2005 instead of 1990.

[2] In fact, I think it might be a demographic effect. There is a period of "normal" core PCE inflation of 1.7% in the 1990s:


[3] My estimate says it'll be some time after 2020 for most plausible paths of inflation.

Tuesday, October 17, 2017

10 year interest rate forecasts in the US and UK

A couple of continuing forecast validations — this time, it's the interest rate model (which has been used by a Korean blog called Run Money Run for Korea, Japan, and Taiwan). Specifically, we're looking at the 10-year interest rate model for both the US (where the forecast has been running for 26 months now) and the UK (only a few months):



The US graph contains forecasts from the CBO from December of 2016 as well as a concurrent forecast from the Blue Chip Economic Indicators (BCEI) — which, I love to point out, charges thousands of dollars for access to the insights in its journal.

Social constructs are social constructs

Noah Smith stepped into a bit of a minefield with his "scientific facts are social constructs" thread — making fun of the idea here [tweet seems to be deleted; it was referring to this tweet], attempting to get a handle on the utter philosophical mess that followed here. With the latter tweet, he illustrates that there are many different things "scientific facts are social constructs" could mean. We have no idea of the original context of the statement, except that it was in an anthropology class [0].

Clearly on some level, scientific facts are not social constructs in the sense that they fail to exist or function differently in a different society. My computer and the network it is attached to function in exactly the way they are supposed to, based on scientific facts, in order for me to deliver this text to you via http. This is the universe of physics, computer science, and engineering. We are crossing model levels and scales here — from the human to the electron. As Erik Hoel shows, it is entirely possible that you cannot even begin to formulate what you mean by "social construct" and "electric current" at sufficient fidelity simultaneously (one is a description of macro states and the other is a description of micro states).

But this was an anthropology class. In anthropology, the process of science and the social constructs of society (including the process of science) are in a sense at the same level. It is entirely possible for the social process of science to interact with the anthropological states. Think of this as a "quantum uncertainty principle" for social theories. The process of measuring anthropological states depends on the social scientific process measuring it in the metaphorical sense that measuring the position of an electron depends on the momentum of the photon measuring it. It's a good thing to keep in mind.

However, in a sense, we have no possible logical understanding of what is a social construct and what isn't, because we have empirical evidence of exactly one human species on one planet. You need a second independent society to even have a chance at observing something that gives you insight as to how it could be different. Is an electron a social construct? Maybe an alien society kind of bypassed the whole "particle" stage and thinks of electrons instead as spin-1/2 representations of the Poincaré group with non-zero rest mass. The whole particle-wave duality and hydrogen atom orbitals would be seen as a weird socially constructed view of what this alien society views as simply a set of quantum numbers.

But that's the key: we don't have that alien society, so there's no way to know. Let's designate the scientific process by an operator P = Σ_p |p⟩ ⟨p|. We have one human society state |s⟩, so we can't really know anything about the decomposition of our operator in terms of all possible societies s':

P = Σ_p Σ_s' ⟨s'|p⟩ |s'⟩ ⟨p|

We have exactly one of those matrix elements ⟨s'|p⟩, i.e. s' = s for ⟨s|p⟩. Saying scientific facts are social constructs is basically an assumption about the entire space spanned by societies |s'⟩ based on its projection onto a single dimension.

If you project a circle onto a single dimension, you get a line segment. You can easily say that the line segment could be the projection of some complex shape. It could also be a projection of a circle. Saying scientific facts are social constructs in general is saying that the shape is definitely very complex based on zero information at all, only the possibility that it could be. And yes, that is good to keep in mind. It should be part of Feynman's "leaning over backwards" advice, and has in fact been useful at certain points in history. One of my favorites is the aether. That was a "scientific fact" that was a "social construct": humans thought "waves" traveled in "a medium", and therefore needed a medium for light waves to travel in. This turned out to be unnecessary, and it is possible that someone reading a PowerPoint slide that said "scientific facts are social constructs" might have gotten from the aether to special relativity a bit faster [1].

However, the other thing that anthropology tries to do is tease out these social constructs by considering the various human societies on Earth as sufficiently different that they represent a decent sampling of those matrix elements ⟨s'|p⟩. And it is true that random projections can yield sufficient information to extract the underlying fundamental signal behind the observations (i.e. the different scientific facts in different sociological bases).

But! All of these societies evolved on Earth from a limited set of human ancestors [2]. Can we really say our measurements of possible human societies are sufficiently diverse to extract information [3] about the invariant scientific truths in all possible societies including alien societies? Do we really have "random projections"? Aren't they going to be correlated?

So effectively we have come to the point where "scientific facts are social constructs" is either vacuous (we can't be sure that alien societies wouldn't have completely different sets of scientific facts) or hubris (you know for certain that alien societies that have never been observed have different scientific facts [4]). At best, we have a warning: be aware that you may exhibit biases due to the fact that you are a social being embedded in society. But as a scientist, you're supposed to be listing those biases anyway. Are anthropologists just now recognizing that they are potentially biased humans and, in their surprise and horror (like fresh graduate students being told every theory in physics is an effective theory), over-compensating by fascistically dictating that other fields see their light?
Yes, anthropology: 
Anthropologists can affect, and in fact are a part of, the system they're studying. We've been here for a while. 
xoxo, physics.
Now, can we get back to the search for some useful empirical regularities, and away from the philosophical argy-bargy?

...

Footnotes:

[0] Everyone was listing unpopular opinions the other day and I thought about putting mine up: it is impossible to understand even non-mathematical things without understanding math, because you have no idea whether or not what you are trying to understand has a mathematical description of which you are unaware. This post represents a bit of that put into practice.

[1] Funny enough, per [0], Einstein's "PowerPoint slide" was instead math. His teacher Minkowski showed him how to put space and time into a single spacetime manifold mathematically.

[2] Whether or not evolution itself is a social construct, you still must consider the possibility that evolution could in fact have happened, in which case we just turn this definitive absolute statement into a Bayesian probability.

[3] At some point, someone might point out that the math behind these abstract state spaces is itself a social construct and therefore powerless to yield this socially invariant information. However, at that point we've now effectively questioned what knowledge is and whether it exists at all. Which is fine.

[4] I find the fact that you could list "scientific facts are social constructs" as a "scientific fact" (in anthropology) that is itself a social construct to be a bit of delicious irony if not an outright Epimenides paradox.

Thursday, October 12, 2017

Bitcoin model fails usefulness criterion

Well, this would probably count as a new shock to the bitcoin exchange rate:


In fact, you can model it as a new shock:


Since we're on the leading edge of it, it's pretty uncertain. However, I'd like to talk about something I've mentioned before: usefulness. While there is no particular reason to reject the bitcoin dynamic equilibrium model forecast, it does not appear to be useful. If shocks are this frequent, then the forecasting horizon is cut short by those shocks — and as such we might not ever get enough data without having to posit another shock, thereby constantly increasing the number of parameters (and making e.g. the AIC worse).
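To make the AIC point concrete, here's a minimal sketch of the bookkeeping. The residuals below are random stand-ins, not the actual model errors; the point is just that each logistic shock adds three parameters, and a new shock has to buy a big enough reduction in the residuals to pay that penalty:

import numpy as np

def gaussian_aic(residuals, n_params):
    # AIC for a least-squares fit with Gaussian errors: n*ln(RSS/n) + 2k,
    # up to an additive constant that cancels when comparing models.
    r = np.asarray(residuals)
    n = r.size
    return n * np.log(np.sum(r ** 2) / n) + 2 * n_params

rng = np.random.default_rng(1)
resid_one_shock = rng.normal(0.0, 0.050, 300)   # stand-in residuals
resid_two_shocks = rng.normal(0.0, 0.048, 300)  # slightly better fit

# One extra logistic shock = 3 extra parameters (amplitude, center, width).
print(gaussian_aic(resid_one_shock, n_params=5))
print(gaussian_aic(resid_two_shocks, n_params=8))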

Another way to put this is that unless the dynamic equilibrium model of exchange rates is confirmed by some other data, we won't be able to use the model to say anything about bitcoin exchange rates. Basically, the P(model|bitcoin data) will remain low, but it is possible that P(model|other data) could eventually lead us to a concurrence model.

As such, I'm going to slow down my update rate following this model [I still want to track it to see how the data evolves]. Consider this a failure of model usefulness.

...

Update 17 October 2017

Starting to get a handle on the magnitude of the shock — it's on the order of the same size as the bitcoin fork shock (note: log scale):


Update 18 October 2017

More data just reduced uncertainty without affecting the path — which is actually a really good indication of a really good model! Too bad these shocks come too frequently.


Wednesday, October 11, 2017

Scaling of urban phenomena


Via Jason Potts, I came across an interesting Nature article [1] on the scaling of urban phenomena. In particular, the authors propose to explain the relationships in the graphic above.

Now the paper goes much further (explaining variance and the scaling exponents themselves) than I will, but I immediately noticed that these relationships are all information equilibrium relationships Y ⇄ N with information transfer (IT) indices β:

log Y/Y₀ = β log N/N₀

The reasoning behind this relationship is that the information entropy of the state space (opportunity set) of each phenomenon (Y) is in equilibrium with the information entropy of the population (N) state space. This falls under deriving the scaling from the relationship of surfaces to volumes mentioned in the paper (you can think of the information content of a state space as proportional to its volume if states are uniformly distributed, and the IT index measures the relative effective dimension of those two state spaces).
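Estimating the IT index from data like the graphic above is just a log-log regression. A minimal sketch with placeholder numbers (the population and count arrays here are invented for illustration, not taken from the paper):

import numpy as np

# Placeholder city populations N and counts Y for one phenomenon.
N = np.array([5e4, 2e5, 8e5, 1.5e6, 4e6, 9e6])
Y = np.array([2.1e4, 9.0e4, 4.0e5, 7.8e5, 2.3e6, 5.5e6])

# log Y/Y0 = beta log N/N0 is a straight line in log-log space, so the
# IT index beta is just its slope.
beta, intercept = np.polyfit(np.log(N), np.log(Y), 1)
print("IT index (scaling exponent) beta =", beta)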

I wonder if adding shocks to the dynamic equilibrium rate (d/dt log Y/N) handles some of the deviations from the linear fit. For example, the slope of the upper left graph should actually relate to the employment-population ratio — but as we know, there was a significant shock to that ratio in the 70s (due to women entering the workforce). I can't seem to find employment-population ratio data at the city level. There is some coarse data where I can take the number of employed in e.g. Seattle divided by King County population as a rough proxy. We can see at the link that there's a significant effect due to shocks (e.g. the recessions and the tail end of women entering the workforce). The model the authors use would imply that this graph should have a constant slope. However, the dynamic equilibrium model says that it has constant slope interrupted by non-equilibrium shocks (which would result in data off of the linear fit).

But this paper is interesting, especially in its description of an underlying model — a place where the information equilibrium approach is agnostic.

...

Footnotes:

[1] The article itself is oddly written. I imagine it is due to the house styles of Nature and Harvard, but being concise does not seem to be a primary concern. For example, this paragraph:
The central assumption of our framework is that any phenomenon depends on a number of complementary factors that must come together for it to occur. More complex phenomena are those that require, on average, more complementary factors to be simultaneously present. This assumption is the conceptual basis for the theory of economic complexity.
could easily be cut in half:
The central assumption of our framework is that phenomena depend on multiple simultaneous factors. This assumption is behind economic complexity theory.
Another example:
We observe scaling in the sense that the counts of people engaged in (or suffering from) each phenomenon scale as a power of population size. This relation takes the form E{Y|N} = Y₀ N^β, where E{⋅|N} is the expectation operator conditional on population size N, Y is the random variable representing the ‘output’ of a phenomenon in a city, Y₀ is a measure of general prevalence of the activity in the country and β is the scaling exponent, that is, the relative rate of change of Y with respect to N.
could also be cut in half:
The number of people experiencing each phenomenon is observed to scale as a function of population size E{Y|N} = Y₀ N^β, where E{⋅|N} is the expectation operator conditional on population size N, Y is the number of people experiencing a phenomenon in a city with scale parameter Y₀ and β, the scaling exponent.
I could even go a bit further:
The number of people experiencing each phenomenon is observed to scale as a function of population size Y ~ N^β, where N is the population size, Y is the number of people experiencing a phenomenon in a city, and β is the scaling exponent.

Dynamic equilibrium: US prime age population

There was a tweet saying that the US prime age population (25-54) hadn't increased in a decade. I decided to get a handle on the context in terms of the dynamic equilibrium model:


It's true this population measure hasn't increased in a decade, but that is more a measure of the size of the shock due to the recession (leading to e.g. reduced immigration) than anything special about today. In fact, the growth rate today is consistent with twenty-first century prime age population growth.
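For anyone who wants to reproduce this kind of fit, here's a minimal single-shock sketch using scipy. The series below is synthetic stand-in data; the actual fit uses the FRED prime-age (25-54) population series, and the parameter values here are placeholders:

import numpy as np
from scipy.optimize import curve_fit

def log_level(t, alpha, c, A, t0, w):
    # Dynamic equilibrium: constant log growth alpha plus one logistic shock.
    return alpha * t + c + A / (1.0 + np.exp(-(t - t0) / w))

# Synthetic stand-in for the log of the prime-age population (decimal years).
rng = np.random.default_rng(2)
t = np.linspace(2000, 2017, 205)
log_pop = log_level(t, 0.005, 1.6, -0.02, 2009.0, 1.5) + rng.normal(0, 5e-4, t.size)

p0 = [0.005, 1.6, -0.02, 2009.0, 1.5]  # rough initial guesses
params, cov = curve_fit(log_level, t, log_pop, p0=p0)
print("dynamic equilibrium growth rate:", params[0])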

JOLTS leading indicators update

The August 2017 JOLTS numbers are out (July numbers comparison is here), and the hires series is continuing a correlated deviation from the dynamic equilibrium:


There's still insufficient data to declare a shock, and the best fit results in only a small shock [1]:


...

Footnotes:

[1] The evolution of the shock counterfactual is relatively stable:


Saturday, October 7, 2017

Compressed sensing and the information bottleneck

For those that don't know, my day job is actually in signal processing research and development in the aerospace sector. As I document in my book, I came to economics research via a circuitous route. One subject I worked on for a while (and still do to some extent) is called compressed sensing (Igor Carron's blog is a great way to keep up with the state of the art in that field, and his Google site provides a nice introduction to the subject).

One of the best parts about Igor's blog is that he brings together several lines of research from machine learning, matrix factorization, compressed sensing, and other fields and frequently finds connections between them (they sometimes appear in his regular feature "Sunday Morning Insight").

In that spirit — although more of a Saturday Afternoon Insight — I thought I'd put a thought out there. I've been looking at how the price mechanism relates to the information bottleneck (here, here), but I've also mused about a possible connection between the price mechanism and compressed sensing. I think now there might be a connection between compressed sensing and the information bottleneck.


In compressed sensing, you are trying to measure a sparse signal (a signal that appears in only a sparse subset of your space x, like a point of light in a dark image or a single tone in a wide bandwidth). To do so, you set up your system to make measurements in what is called the dense domain — through some mechanism (Fourier transform, random linear combinations, labeled with Φ) you make the variable you wish to measure appear throughout the space y. Therefore a few random samples of the entire dense space give you information about your sparse signal, whereas a few random samples of an image with a single bright point would likely only return dark pixels with no information about the point.
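For readers who haven't seen it, here's what that looks like in practice: a minimal Python sketch with a random Gaussian Φ and a basic iterative soft-thresholding (ISTA) recovery. This is just one standard textbook setup, not anything specific to the models on this blog:

import numpy as np

rng = np.random.default_rng(0)

# Sparse signal: length-200 vector with only 5 nonzero entries
# (the "point of light in a dark image").
n, k, m = 200, 5, 60
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.normal(0.0, 1.0, k)

# Dense domain: m << n random linear combinations (the mechanism Phi).
Phi = rng.normal(0.0, 1.0 / np.sqrt(m), (m, n))
y = Phi @ x

# Recover the sparse signal with ISTA (iterative soft thresholding).
step = 1.0 / np.linalg.norm(Phi, 2) ** 2
lam = 0.01
x_hat = np.zeros(n)
for _ in range(500):
    z = x_hat + step * Phi.T @ (y - Phi @ x_hat)
    x_hat = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)

print("relative recovery error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))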


Is this how the information bottleneck works? We have some domain X in which our signal is just a small part (the set of all images vs the set of images of cats), and we train a feedforward deep neural network (DNN, h1 → h2 → ... → hm) that creates a new domain Y where our signal information is dense (cat or no cat). Every sample of that domain tells us information about whether there is an image of a cat being fed into the DNN (i.e. if it identifies cats and dogs, a result of dog tells us it's not a cat).

In compressed sensing, we usually know some properties of the signal that allow us to construct the dense domain (sparse images of points can be made dense by taking a 2D Fourier transform). However, random linear combinations can frequently function as a way to make your signal dense in your domain. In training a DNN, are we effectively constructing a useful random projection of the data in the sparse domain? As we push through the information bottleneck, are we compressing the relevant information into a dense domain?

The connection between compressed sensing and the structure of a neural net has been noted before (see e.g. here or here), the new part (for me at least) is the recognition of the information bottleneck as a useful tool to understand compressed sensing — "opening the black box" of compressed sensing.