Saturday, May 30, 2015

The mathematical properties of information equilibrium


This is mostly a summary of some of the properties I've worked out previously (here and here). I have added some pieces to the group theory and some other notes.

We define real-valued functions $A$ and $B$ to be in the binary relationship called information equilibrium (IE), denoted $A \rightarrow B$ -- or sometimes $p : A \rightarrow B$ for an explicit "detector" $p$ -- if

$$
p \equiv \frac{dA}{dB} = k \; \frac{A}{B} \;\;\; \text{for some}\; k \in R
$$
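Here's a quick numerical sanity check of the definition (a sketch, not a proof). For constant $k$, the condition above is solved by a power law $A = A_{0} (B/B_{0})^{k}$. The function names and parameter values below are illustrative assumptions.

```python
def power_law(B, k, A0=1.0, B0=1.0):
    """Power-law solution A(B) = A0 * (B/B0)**k of dA/dB = k A/B (constant k)."""
    return A0 * (B / B0) ** k


def ie_residual(B, k, h=1e-6):
    """Relative mismatch between a central-difference dA/dB and k * A / B."""
    dA_dB = (power_law(B + h, k) - power_law(B - h, k)) / (2 * h)
    rhs = k * power_law(B, k) / B
    return abs(dA_dB - rhs) / abs(rhs)


# Check the IE condition at a few sample points for an assumed k = 1.5
residuals = [ie_residual(B, k=1.5) for B in (2.0, 10.0, 50.0)]
print(residuals)
```

The residuals should be at the level of finite-difference noise, i.e. the power law satisfies the IE condition at every sampled point.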

Equivalence relation

Information equilibrium defines an equivalence relation:

$$
A \rightarrow A
$$

$$
\text{if}\; A \rightarrow B \; \text{then} \; B \rightarrow A
$$

$$
\text{if}\; A \rightarrow B \; \text{and} \; B \rightarrow C  \; \text{then} \; A \rightarrow C
$$

Group structure

The set $IE_{K} \equiv \{ A \; |  \; A \rightarrow K \; \text{and} \; A > 0 \}$ along with ordinary multiplication $(IE_{K}, \times)$ forms an infinite Abelian group.

$$
\text{if}\; A \in IE_{K}\; \text{and} \; B \in IE_{K} \; \text{then} \; AB \in IE_{K}
$$

Note that $BA = AB$. The inverse of multiplication by $A$ is $1/A$, and the identity element is the constant function $A = c$ because it's in the set:

$$
\frac{d}{dK} \; c = 0 = k \frac{c}{K} \;\;\; \text{for} \; k = 0
$$

and

$$
\frac{d}{dK} \; c A = c k \frac{A}{K} \;\; \rightarrow \;\; \frac{dA}{dK} =k \frac{A}{K}
$$

Essentially, $c A$ and $A$ represent the same element of the set since they obey the exact same IE differential equation. Additionally, IE technically requires that elements $A \gg 1$, hence why zero is excluded from the set.
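A small numerical illustration of the last two equations (a sketch with made-up example functions; `it_index` is a hypothetical helper, not something defined elsewhere on this blog). It estimates $k = d \log A / d \log K$ and shows that a constant has $k = 0$ while $A$ and $c A$ share the same $k$.

```python
import math


def it_index(f, K, h=1e-6):
    """Estimate k = d(log f)/d(log K) at K with a central difference in log K."""
    up = math.log(f(K * math.exp(h)))
    down = math.log(f(K * math.exp(-h)))
    return (up - down) / (2 * h)


def A(K):
    """Assumed example element of the set, with index k = 0.7."""
    return K ** 0.7


def scaled_A(K):
    """The same element multiplied by a constant c = 3."""
    return 3.0 * A(K)


def constant(K):
    """A constant function (the proposed identity element)."""
    return 5.0


k_A = it_index(A, 10.0)
k_scaled = it_index(scaled_A, 10.0)
k_const = it_index(constant, 10.0)
print(k_A, k_scaled, k_const)
```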

I haven't explicitly done the proofs here, so these are technically conjectures ... I may add proofs later. The group $IE_{K}$ is isomorphic to the group of real numbers under multiplication without zero (and therefore the logarithm is an isomorphism from this group to the additive group of real numbers). There is one non-trivial automorphism: inversion. This is where $f(A) \rightarrow A^{-1}$ (sorry for the notation collision there, that's a function -- homomorphism -- from $IE_{K}$ to $IE_{K}$ not an IE relationship).

Category theory

We can think of $\rightarrow$ representing IE as a category theory morphism that preserves a specific structure of real functions -- that structure being information content.

I may add more here as I think of stuff, but as category theory is also known as generalized abstract nonsense there might not be much of interest.

Usefulness?

You may be asking yourself how this stuff is useful. Short answer: it's probably not. But let's try this on for size ...

Let's assume the existence of a utility function $U$ that may not be directly observable so that we can discuss the set of possible observables $A$ in information equilibrium with utility $IE_{U} = \{ A \; |  \; A \rightarrow U \; \text{and} \; A > 0 \}$. Let's say we develop a theory with at least two observables $X$ and $Y$ (among others) that are in IE with $U$. This is the picture on the left in the graphic below.


We can draw in the other IE relationships that follow from IE being an equivalence relation in gray in the graphic above in the center. This means not only are $X$ and $Y$ in IE with each other (IE is an equivalence relation), but that the set $IE_{U}$ is equivalent to the sets $IE_{X}$ and $IE_{Y}$ (it forms a complete graph, shown on the right). Utility doesn't have a privileged position, and you could now write your theory of $X$, $Y$ and $U$ entirely in terms of $X$ and $Y$.

That is to say utility, $U$, is kind of like an unknown in information equilibrium economics -- once you find a few things that are in information equilibrium with utility, you don't need to refer to utility anymore. It's a bit like the $x$ back in algebra class. Once you've solved for $x$, you don't need to use it anymore.

It also says abstract concepts like capital $K$ may not be necessary components of final economic theories if they don't represent observables. They might be useful in building the theory and relating observables to each other, but if the theory is built out of information equilibrium relationships you can re-write the entire theory in terms of the observables. The $K$'s and $U$'s might be like wavefunctions in quantum mechanics -- not directly observable, but used to build the theory. But it's a bit more than that -- because you can rewrite everything without referencing utility.

It might be fun to re-read economics papers that refer to utility and substitute the word "unknown" ... :)

Friday, May 29, 2015

The political method

Let's say someone thinks the Fed should raise rates sooner rather than later. Let's say that person is known as a hawk on monetary policy. Now let's say that person put together a model [pdf] that says this:

We conclude that in economies where the key friction is NSCNC and the net nominal interest rate threatens to encounter the zero lower bound, monetary policymakers may wish to respond with a price level increase. A chief rival to this response observed in actual economies—forward guidance on the length of time the economy will remain at the zero lower bound beyond the time when that bound is actually binding—would be inappropriate in the theory presented here.

Lo and behold the model comes out with not only that person's preferred policy but also says the policy that person opposes is bad. What a coincidence!

And now that person sends a copy of that study to the most prominent advocate of the specific model.

Lo and behold, that advocate loves it!

Isn't politics the scientific method grand?

I bet you didn't think you'd be laughing when you read an economics paper ... laughing so hard you couldn't breathe ...

The private credit market completely solves the cross-sectional income inequality problem. It's quite awesome. I swear. (And can someone maybe label an axis, preferably two ... and what's with the Mathematica 8 default formatting? Good enough for government work, I guess. What exactly does "non-stochastic" mean anyway? I used to teach a lab class and if one of the students presented this graph in their lab report it would have been all marked up in red.) [Taken from the linked pdf, above]





2015 Q1 NGDP revised downward ...

... and I guess people (including myself) should take back all the praise heaped on the Atlanta Fed forecast. Hypermind gets more wrong and their implicit forecast for Q3 and Q4 (shown as a dotted line segment below) appears to be rather high at 4.4% (that's what's needed if the Q2 forecast is correct in order to achieve their annual number of 3.4%).


The arrow and the dark blue dot indicate the new downward revision from the BEA. I'll just quote from what I said when the first estimate came out a month ago:
Overall: a continuing trend of lukewarm economic performance, largely still in line with just about any model of the economy.

Thursday, May 28, 2015

Resolving the Cambridge capital controversy with abstract algebra



The title is a bit of a joke, and for the controversy see here. Looking into the Solow model bumps you into the question of what "capital" (K) is, and that met with the titular controversy a while back where Cambridge, MA said you could add up different stuff in a sensible way while Cambridge, UK said you couldn't.

The information equilibrium (IE) model calls the argument for the UK (and Joan Robinson), but allows (at least) two possibilities for definitions of capital that are sensible. These sensible definitions weren't advocated by Solow/Samuelson at MIT, hence why I say that the UK won the debate: you can't just add stuff up and get a sensible answer.

First, two quick "proofs". I already showed IE is an equivalence relation (you can use it to define a set of things in IE with some economic aggregate), but I need a bit more: IE is a group under multiplication.

If $A \rightarrow K$ (with IT index $a$), then $A^{x} \rightarrow K$ (with IT index $a x$) because:

$$
\frac{d}{dK} A^{x} = x A^{x - 1} \frac{dA}{dK} = a x \frac{A^{x}}{K}
$$

(I show this because it applies for real exponents rather than just natural numbers and that might be important for some reason in the future; for natural numbers $x$ the following result would suffice.)

If $A \rightarrow K$ and $B \rightarrow K$ (with IT indices $a$ and $b$), then $A B \rightarrow K$ (with IT index $a + b$) because:

$$
\frac{d}{dK} AB = \frac{dA}{dK} B + A \frac{dB}{dK} = (a + b) \frac{AB}{K}
$$
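Spelling out the middle step: substitute the two IE conditions $dA/dK = a \, A/K$ and $dB/dK = b \, B/K$ into the product rule,

$$
\frac{dA}{dK} B + A \frac{dB}{dK} = a \frac{A}{K} B + A \, b \frac{B}{K} = (a + b) \frac{AB}{K}
$$

so the IT indices simply add under multiplication.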

So we have the set of all things that are in IE with $K$, and the product of any two of those things is another thing in the set -- therefore she's a witch it's a group. It is not, however, a ring -- the set isn't closed under addition:

$$
\frac{d}{dK} (A + B) = \frac{dA}{dK} + \frac{dB}{dK} = \frac{a A + b B}{K}
$$

so $A + B$ is not in IE with $K$ unless $a = b$.
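A numerical way to see this (a sketch with assumed example values; `local_index` is a hypothetical helper): for $A + B$ to be in IE with $K$, the ratio $(a A + b B)/(A + B)$ would have to be a constant. The log-log slope of the sum drifts with $K$ when $a \neq b$ and stays fixed when $a = b$.

```python
import math


def local_index(f, K, h=1e-6):
    """Local log-log slope d(log f)/d(log K), estimated by central difference."""
    up = math.log(f(K * math.exp(h)))
    down = math.log(f(K * math.exp(-h)))
    return (up - down) / (2 * h)


a, b = 0.5, 2.0  # assumed, unequal IT indices


def unequal_sum(K):
    """A + B with A = K**a and B = K**b."""
    return K ** a + K ** b


def equal_sum(K):
    """A + B when both terms share the index a (B is just a rescaled A)."""
    return K ** a + 4.0 * K ** a


sample_points = (1.0, 10.0, 100.0)
# Slope starts at (a + b)/2 at K = 1 and drifts toward b as the K**b term dominates
unequal_slopes = [local_index(unequal_sum, K) for K in sample_points]
# Slope stays at a for every K, so this sum is still in IE with K
equal_slopes = [local_index(equal_sum, K) for K in sample_points]
print(unequal_slopes, equal_slopes)
```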

This basically was Joan Robinson's point -- unless $A$ and $B$ are the same thing, you're comparing apples and oranges. Money doesn't help us either and if you introduce it, the relative prices of the capital goods become important **.

Sensible definitions of capital

Of course, this points to the first sensible solution to the capital controversy: instead of adding up capital items, use the geometric mean. Using the two results above, you can show that if $A \rightarrow K$, $B \rightarrow K$, $C \rightarrow K$, etc, then

$$
(A B C \; ... \;)^{1/n} \rightarrow K
$$
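To see this, apply the product result repeatedly and then the power result with $x = 1/n$. Writing the IT indices of $A$, $B$, $C$, ... as $a$, $b$, $c$, ... (the same labeling as above, extended to $n$ goods):

$$
\frac{d}{dK} (A B C \; ... \;)^{1/n} = \frac{a + b + c + \; ...}{n} \; \frac{(A B C \; ... \;)^{1/n}}{K}
$$

That is, the geometric mean is in IE with $K$ with an IT index equal to the arithmetic mean of the individual indices.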

The geometric mean is also the only sensible mean for capital goods measured either as indices or in terms of money.

The second sensible solution to the controversy is a partition function approach (as I've done here) where we simply define capital to be the expected value of the capital operator, which is just the sum of the individual capital goods operators:

$$
\langle K \rangle \equiv \langle A + B + C + \; ... \; \rangle
$$

In that sense, "capital" would be more like NGDP than, say, a stock index.





** Update 5/29/2015: I thought I'd add in the details of the sentence I marked with ** above. We assume two goods markets (information equilibrium conditions) $p_{a} : N \rightarrow A$ and $p_{b} : N \rightarrow B$ where $N$ is aggregate demand/nominal output and the $p_{i}$ are prices. That gives us:

$$
k_{a} p_{a} A = N \; \text{and} \; k_{b} p_{b} B = N
$$

Substituting into the formula above

$$
\frac{d}{dK} (A + B) = \frac{a A + b B}{K} = \left( \frac{a}{k_{a} p_{a}} + \frac{b}{k_{b} p_{b}}  \right) \; \frac{N}{K}
$$

which basically shows that $K$ is in information equilibrium with aggregate demand. Note the appearance of the prices in the information transfer index.

Wednesday, May 27, 2015

The basic asset pricing equation as a maximum entropy condition


Commenter LAL has brought up the basic asset pricing equation a couple of times, and so I had a go at looking at it as a maximum entropy/information equilibrium model. Turns out it works out. In Cochrane's book (updated with link) the equation appears as:

$$
\text{(0) }\; p_{t} = E \left[ \beta \frac{u'(c_{t+1})}{u'(c_{t})} x_{t+1} \right]
$$

Where $p_{t}$ is the price at time $t$, $c_{t}$ is consumption at time $t$, $u$ is a utility function, and $\beta$ is a future discount factor. Now $x_{t}$ is also the price at time $t$ (although it's called the payoff) and of course there is the funny business of the $E$ that essentially says all the terms at a time $t+1$ exist only in the minds of humans (and turns an $x$ into a $p$). Rational expectations is the assumption that the $E$ is largely meaningless on average (i.e. approximately equal to the identity function).

As a physicist, I'm not particularly squeamish about the future appearing in an equation (or time dropping out of the model altogether), so I will rewrite equation (0) as:

$$
\text{(1) }\; p_{i} = \beta \frac{u'(c_{j})}{u'(c_{i})} p_{j}
$$

It turns out much of the machinery is the same as the Diamond-Dybvig model, so I'll just adapt the beginning of that post for this one.

The asset pricing equation is originally a model of consumption in two time periods, but we will take that to be a large number of time periods (for reasons that will be clear later). Time $t$ will be between 0 and 1.

Let's define a utility function $U(c_{1}, c_{2}, ...)$ to be the information source in the markets

$$
MU_{c_{i}} : U \rightarrow c_{i}
$$

for $i = 1 ... n$ where $MU_{c_{i}}$ is the marginal utility (a detector) for the consumption $c_{i}$ in the $i^{th}$ period (information destination). We can immediately write down the main information transfer model equation:

$$
MU_{c_{i}} = \frac{\partial U}{\partial c_{i}} = \alpha_{i} \; \frac{U}{c_{i}}
$$

Solving the differential equations, our utility function $U(c_{1}, c_{2}, ...)$ is

$$
U(c_{1}, c_{2}, ...) = a \prod_{i} \left( \frac{c_{i}}{C_{i}} \right)^{\alpha_{i}}
$$

Where the $C_{i}$ and $a$ are constants. The basic timeline we will consider is here:


Period $i$ is some "early" time period near $t = 0$ with consumption $c_{i}$ while period $j$ is some "late" time period near $t = 1$ with consumption $c_{j}$. We'll only be making changes in these two time periods. The "relevant" (i.e. changing) piece of the utility function is (taking a logarithm):

$$
\text{(2) }\; \log U \sim \;\; ... + \alpha_{i} \log c_{i} + ... + \alpha_{j} \log c_{j} + ... + \log U_{0}
$$

where all the various $C_{i}$'s, $\alpha_{i}$'s and $a$ ended up in $\log U_{0}$.

Now the derivation of the asset pricing equation sets up a utility maximization problem where normal consumption in period $i$ (called $e_{i}$) is reduced to purchase $\xi$ of some asset at price $p_{i}$, and added back to consumption in period $j$ at some new expected price $p_{j}$. So we have:

$$
\text{(3a) }\; c_{i} = e_{i} - p_{i} \xi
$$

$$
\text{(3b) }\; c_{j} = e_{j} + p_{j} \xi
$$

Normally, you'd plug these into the utility equation (2), and maximize (i.e. take a derivative with respect to $\xi$ and set equal to zero). The picture appears in this diagram (utility level curves are in gray):


The change in the amount $\xi$ of the asset held represents wiggling around the point $(e_{i}, e_{j})$ along a line with slope defined by the relative size of the prices $p_{i}$ and $p_{j}$ to reach the point labeled with an 'x': the utility maximum constrained to the light blue line.

Instead of doing that, we will use entropy maximization to find the 'equilibrium'. In that case, we can actually be more general, allowing for the case that e.g. you don't (in period $j$) sell all of the asset you acquired in period $i$ -- i.e. any combination below the blue line is allowed. However, if there are a large number of time periods (a high dimensional consumption space), the most probable values of consumption are still near the blue line (more on that here, here). Yes, that was a bit of a detour to get back to the same place, but I think it is important to emphasize the generality here.

If the states along the blue line are all equally probable (maximum entropy assumption), then the average state will appear at the midpoint of the blue line. I won't bore you with the algebra, but that gives us the maximum entropy equilibrium:

$$
\xi = \frac{e_{i} p_{j} - e_{j} p_{i}}{2 p_{i} p_{j}}
$$
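For anyone who does want the algebra, here is a minimal sketch of it in code. It assumes the blue line ends where consumption in one of the two periods hits zero, and that "equally probable" means uniform in $\xi$ (which is the same as uniform along the line, since equations (3a) and (3b) are linear in $\xi$). The numbers are made up for illustration.

```python
def budget_line_endpoints(e_i, e_j, p_i, p_j):
    """Values of xi at the two ends of the line defined by (3a) and (3b)."""
    xi_max = e_i / p_i   # c_i = 0: all early consumption is spent on the asset
    xi_min = -e_j / p_j  # c_j = 0: xi is negative here
    return xi_min, xi_max


def max_entropy_xi(e_i, e_j, p_i, p_j):
    """Mean xi when every point on the line is equally probable (the midpoint)."""
    xi_min, xi_max = budget_line_endpoints(e_i, e_j, p_i, p_j)
    return 0.5 * (xi_min + xi_max)


# Illustrative values (assumptions, not data)
e_i, e_j, p_i, p_j = 10.0, 6.0, 2.0, 3.0

xi = max_entropy_xi(e_i, e_j, p_i, p_j)
closed_form = (e_i * p_j - e_j * p_i) / (2 * p_i * p_j)

# Consumption at the midpoint, from (3a) and (3b)
c_i = e_i - p_i * xi
c_j = e_j + p_j * xi
print(xi, closed_form, c_i, c_j)
```

The midpoint computed from the endpoints agrees with the closed-form expression for $\xi$ above.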

If we assume we have an "optimal portfolio", i.e. we are already holding as much of the asset as we'd like, we can take $\xi = 0$, which tells us $e_{k} = c_{k}$ via the equations (3) above, and we obtain the condition:

$$
\text{(4) }\; p_{i} = \frac{c_{i}}{c_{j}} p_{j}
$$

Not quite equation (1), yet. However, note that

$$
\frac{1}{U} \frac{\partial U}{\partial c_{i}}  = \frac{\partial \log U}{\partial c_{i}} = \frac{\alpha_{i}}{c_{i}}
$$

So we can re-write (4) as (note that the $j$, i.e. the future, and $i$, i.e. the present, flip from numerator and denominator):

$$
\text{(5) }\; p_{i} = \frac{\alpha_{i}}{\alpha_{j}} \frac{\partial U/\partial c_{j}}{\partial U/\partial c_{i}} p_{j}
$$

Which is formally similar to equation (1) if we identify $\beta \equiv \alpha_{i}/\alpha_{j}$. You can stick the $E$ and brackets around it if you'd like.
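A quick numerical check that (4) and (5) agree (a sketch only: the reference levels $C_{i}$ are set to 1, only two periods are kept, and all numbers are made-up assumptions).

```python
def utility(c, alpha, a=1.0):
    """U = a * prod_k c_k**alpha_k, with the reference levels C_k set to 1."""
    U = a
    for c_k, alpha_k in zip(c, alpha):
        U *= c_k ** alpha_k
    return U


def marginal_utility(c, alpha, k, h=1e-6):
    """Central-difference estimate of dU/dc_k."""
    up = list(c)
    down = list(c)
    up[k] += h
    down[k] -= h
    return (utility(up, alpha) - utility(down, alpha)) / (2 * h)


# Illustrative two-period values (assumptions)
alpha = [0.4, 0.5]  # alpha_i (early period), alpha_j (late period)
c = [7.0, 10.5]     # c_i, c_j
p_j = 3.0

beta = alpha[0] / alpha[1]
mu_i = marginal_utility(c, alpha, 0)
mu_j = marginal_utility(c, alpha, 1)

p_i_eq5 = beta * (mu_j / mu_i) * p_j  # equation (5)
p_i_eq4 = (c[0] / c[1]) * p_j         # equation (4)
print(p_i_eq4, p_i_eq5)
```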

I thought this was pretty cool.

Now just because you can use the information equilibrium model and some maximum entropy arguments to arrive at equation (5) doesn't mean equation (1) is a correct model of asset prices -- much like how you can build the IS-LM model and the quantity theory of money in the information equilibrium framework, this is just another model with an information equilibrium description. Actually equation (4) is more fundamental in the information equilibrium view and basically says that the condition you'd meet for the optimal portfolio is simply that the ratio of the current to expected future consumption is equal to the ratio of the current to the expected price of that asset. Essentially if you think the price of some asset is going to go up 10%, you will adjust your portfolio so your expected future consumption goes up by 10%.

Paul Romer feels misunderstood


After an initial misunderstanding of his definition of mathiness, I think I passed Romer's reading comprehension quiz. Romer answers "False" to each of these questions ...
1. T/F: Romer thinks that economists should not try to use the mathematics of Debreu/Bourbaki and should instead use math in the less formal way that physicists and engineers use it.
I think this (and Mark Buchanan approvingly thinks Romer would answer true), but Romer answers false.
2. T/F: Romer thinks that abstract mathematical models that could turn out to be of no use in understanding data and evidence are examples of mathiness.

This captures my initial misunderstanding (I linked the original, and here is the corrective from the next day). Overall, Romer should have left off the word empirical when he said: "Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in natural versus formal language and between statements with theoretical as opposed to empirical content." (I crossed out the offending clause -- Romer's idea of mathiness is completely independent of data, so I'm not sure why he mentioned it.)

3. T/F: Romer thinks that errors in mathematical arguments are examples of mathiness.


As I said, it's lack of rigor (or "tight links" as Romer phrases it). A lack of rigor can be associated with errors, but is not identical to them.
4. T/F: Romer says that the economists he has accused of mathiness are using it to promote a right-wing political agenda designed to influence national politics.


The academic politics seems to line up with national politics, but as I mention here (in the PS) it's mostly about tribes of graduate students revolving around big names.
5. T/F: Romer thinks that economists should use less math.
I personally think economics should use less formal math, but I never attributed this to Romer.
6. T/F: Romer is angry.

I think the emotional states I attributed to Romer were being "upset", "zeal" and being "weird". I insinuated Lucas and Moll might be a bit annoyed with Romer.

Tuesday, May 26, 2015

Dynamics of the savings rate and Solow + IS-LM


Hello! I'm back from a short vacation and slowly getting to the comments.

As I mentioned here, there might be a bit more to the information equilibrium picture of the Solow model than just the basic mechanics -- in particular I pointed out we might be able to figure out some dynamics of the savings rate relative to demand shocks.

In the previous post, we built the model:

$$
Y \rightarrow K \rightarrow I
$$

Where $Y$ is output, $K$ is capital and $I$ is investment. Since information equilibrium (IE) is an equivalence relation, we have the model:

$$
p: Y \rightarrow I
$$

with abstract price $p$ which was described here (except using the symbol $N$ instead of $Y$) in the context of the IS-LM model. If we write down the differential equation resulting from that IE model

$$
\text{(1) }\;\; p = \frac{dY}{dI} = \frac{1}{\eta} \; \frac{Y}{I}
$$

There are a few things we can glean from this ...

I. General equilibrium

We can solve equation (1) under general equilibrium giving us $Y \sim I^{1/\eta}$. Empirically, we have $\eta \simeq 1$:


Combining that with the results from the Solow model, we have

$$
Y \sim K^{\alpha} \; \text{,} \; K \sim I^{\sigma} \; \text{and} \; Y \sim I
$$

which tells us that $\alpha \simeq 1/\sigma$ -- one of the conditions that gave us the original Solow model.
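Explicitly, chaining the first two relationships and comparing with the third:

$$
Y \sim K^{\alpha} \sim \left( I^{\sigma} \right)^{\alpha} = I^{\alpha \sigma} \;\;\; \text{and} \;\;\; Y \sim I^{1/\eta} \;\; \Rightarrow \;\; \alpha \sigma = 1/\eta \simeq 1
$$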

II. Partial equilibrium

Since $Y \rightarrow I$ we have a supply and demand relationship between output and investment in partial equilibrium. We can use equation (1) and $\eta = 1$ to write

$$
I = (1/p) Y \equiv s Y
$$

Here we have defined the savings rate $s \equiv 1/(p \eta)$, which for $\eta = 1$ is just the inverse of the abstract price $p$ in the investment market. The supply and demand diagram (including an aggregate demand shock) looks like this:


A shock to aggregate demand would be associated with a fall in the abstract price and thus a rise in the savings rate. There is some evidence of this in the empirical data:


Overall, you don't always have pure supply or demand shocks, so there might be some deviations from a pure demand shock view. In particular, a "supply shock" (investment shock) should lead to a fall in the savings rate.

III. Interest rates

If we update the model here (i.e. the IS-LM model mentioned above) to include the more recent interest rate ($r$) model written in terms of investment and the money supply/base money:

$$
(r \rightarrow p_{m}) : I \rightarrow M
$$

where $p_{m}$ is the abstract price of money (which is in IE with the interest rate), we have a pretty complete model of economic growth that combines the Solow model with the IS-LM model. The interest rate joins the already empirically accurate production function:


Since I inevitably get questions about causality, it is important to note that these are all IE relationships; therefore all relationships are effectively causal in either direction. However, it is also important to note that the direct impact of $M$ on $Y$ is neglected in the above treatment (including the interest rates) -- and the direct impact changes depending on the information transfer index in the price level model.

Summary

A full summary of the Solow + IS-LM model in terms of IE relationships is:

$$
Y \rightarrow K \rightarrow I \; \text{,} \; K \rightarrow D
$$

$$
Y \rightarrow L
$$

$$
1/s : Y \rightarrow I
$$

$$
(r \rightarrow p_{m}) : I \rightarrow M
$$



Update 5/27/2015: Forgot first graph; corrected.

Friday, May 22, 2015

The rest of the Solow model


Here, I mostly referred to the Cobb-Douglas production function piece, not the piece of the Solow model responsible for creating the equilibrium level of capital. That part is relatively straightforward. Here we go ...

Let's assume two additional information equilibrium relationships with capital $K$ being the information source and investment $I$ and depreciation $D$ (include population growth in here if you'd like) being information destinations. In the notation I've been using: $K \rightarrow I$ and $K \rightarrow D$.

This immediately leads to the solutions of the differential equations:

$$
\frac{K}{K_{0}} = \left( \frac{D}{D_{0}}\right)^{\delta}
$$

$$
\frac{K}{K_{0}} = \left( \frac{I}{I_{0}}\right)^{\sigma}
$$

Therefore we have (the first relationship coming from the Cobb-Douglas production function)

$$
Y \sim K^{\alpha} \text{ , }\;\;\;\; I \sim K^{1/\sigma} \text{ and }\;\;\;\; D \sim K^{1/\delta}
$$

If $\sigma = 1/\alpha$ and $\delta = 1$ we recover the original Solow model, but in general $\sigma > \delta$ allows there to be an equilibrium. Here is a generic plot:


Assuming the relationships $K \rightarrow I$ and $K \rightarrow D$ hold simultaneously gives us the equilibrium value of $K = K^{*}$:

$$
K^{*} = K_{0} \exp \left( \frac{\sigma \delta \log \left( I_{0}/D_{0} \right)}{\sigma - \delta} \right)
$$
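Here's a minimal numerical check of that formula (a sketch; the parameter values are arbitrary assumptions chosen so that $\sigma > \delta$). It evaluates $K^{*}$ and confirms that investment and depreciation, taken from the two power-law solutions above, coincide there.

```python
import math


def investment(K, K0, I0, sigma):
    """I(K) obtained by inverting K/K0 = (I/I0)**sigma."""
    return I0 * (K / K0) ** (1.0 / sigma)


def depreciation(K, K0, D0, delta):
    """D(K) obtained by inverting K/K0 = (D/D0)**delta."""
    return D0 * (K / K0) ** (1.0 / delta)


def k_star(K0, I0, D0, sigma, delta):
    """Equilibrium capital where I(K) = D(K); requires sigma != delta."""
    exponent = sigma * delta * math.log(I0 / D0) / (sigma - delta)
    return K0 * math.exp(exponent)


# Illustrative parameters (assumptions)
K0, I0, D0 = 1.0, 2.0, 1.0
sigma, delta = 3.0, 1.0

K_eq = k_star(K0, I0, D0, sigma, delta)
I_eq = investment(K_eq, K0, I0, sigma)
D_eq = depreciation(K_eq, K0, D0, delta)
print(K_eq, I_eq, D_eq)
```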

As a side note, I left the small $K$ region off on purpose. The information equilibrium model is not valid for small values of $K$ (or any variable). That allows one to choose parameters for investment and depreciation that could be e.g. greater than output for small $K$ -- a nonsense result in the Solow model, but just an invalid region of the model in the information equilibrium framework.

An interesting add-on is that $Y$ and $I$ have a supply and demand relationship in partial equilibrium with capital being demand and investment being supply (since $Y \rightarrow K$, by transitivity they are in information equilibrium). If $s$ is the savings rate (the price in the market $Y \rightarrow I = Y \rightarrow K \rightarrow I$), we should be able to work out how it changes depending on shocks to demand. There should be a direct connection to the IS-LM model as well.

What is economic growth?

Disclaimer: Ok; I mention mathiness, but that's not what this post is about. It's about modeling economic growth in general -- and concludes that the lack of accelerating growth is due to an entropic headwind (at least in the information equilibrium model).
Dietrich Vollrath at Growth Economics has a really good overview (and sensible path forward -- i.e. empirical research) of the whole Romer/mathiness brouhaha. Actually reading it reminds me of that old saying about rival research groups [0]:
The animosity between two research groups is inversely proportional to the difference between their research.
Vollrath also (sensibly) says that maybe price taking (the Lucas side) is valid in certain scenarios and monopolistic competition (the Romer side) in others. But in the end, both of these are models where 'innovation' just kind of happens:
The assumptions made by the market power [monopolistic competition] theories are just as impossible to justify as the competition [price taking] theory. The price-taking theory assumes that people just randomly walk around, bump into each other, and magically new ideas spring into existence. The market power theory assumes that people wander into a lab, and then magically new ideas just spring into existence, perhaps arriving in a Poisson-distributed process to make the math easier. Why is the magical arrival of ideas in the lab less fanciful than the magical arrival of ideas from people meeting each other? In the models, they are both governed by arbitrary statistical processes that bear no resemblance to how research actually works.
So apparently growth is in information equilibrium with some random process of innovation! (Seriously, economists, you should take a look.) Ok so what is the problem we're trying to solve with these different views of growth economics? Here's Vollrath again:
One thing we have come to a consensus on is that economic growth is driven by innovation, and not simply accumulating physical or human capital. That innovation, though, involves non-rival ideas. Non-rival ideas (e.g. calculus) can be used by anyone without limiting anyone else’s use of the idea. But modeling a non-rival idea is weird for standard economics, which was built around things that are rival (e.g. a hot dog). In particular, allowing for non-rival ideas in production means that we have increasing returns to scale (if you double all physical inputs *and* the number of ideas then you get more than twice as much output). But if we have increasing returns to scale, why don’t we see growth rates accelerating over time?
Well ... in the information equilibrium theory I'm working on here ... that's because as an economy grows in size, that economy finds itself less and less likely to be made up of many high growth states. There are more ways a larger economy can be organized using more low growth firms than high growth firms in much the same way that thermal energy can be carried away in more ways by many low energy photons than a few high energy photons. And this seems to be a good empirical description of economies:


We defined an economic temperature that goes as 1/log M where M is the monetary base and found it did a good job of reproducing the price level and nominal output of various economies.

Note that this lack of accelerating growth rates is a macro property -- it is a property of a large number of firms in a large economy. There may be differences in the bulk properties of the distribution over firms among different countries leading to different rates of deceleration, but the general lack of accelerating growth rates is a general property of the partition function and the economic temperature. That is to say the deceleration is an entropic force. Think of it as a headwind that increases with the size of the economy (under a particular monetary regime).

However, the economic temperature doesn't explain why specific businesses wouldn't have accelerating growth rates. And that's because that's totally fine! Many individual firms do in fact have rapid, accelerating growth rates -- how would any firm get started? Firms at the micro level have all kinds of growth rates (see here for more on that), but the aggregated maximum entropy distribution has a slight headwind to growth so that the aggregate doesn't have accelerating growth rates. It's another case similar to sticky macro prices, but flexible micro prices. In this case, we have a macro headwind, but no micro headwind [2]. 

This means that there's at least one model (the information equilibrium model) where both price taking (Lucas) and monopolistic competition (Romer) are theoretically misguided. Randomly running into each other in the hall or randomly forming an idea in a lab is an attempted micro model of an entropic force (see e.g. here). Like Calvo pricing, it is not correct at the micro level. It is similar to adding a diffusion force proportional to the local density gradient to the microscopic theory of atoms -- no such force exists, but it would help explain the phenomenon of diffusion if you didn't have the concept of entropy. But if you do have entropy, that microscopic diffusion force is unnecessary.

Likewise it may be possible that for the broad first order models of growth, models of innovation are unnecessary [1]. Additionally, attempting to use them to describe why there isn't accelerating growth is wrong ... the lack of accelerating growth is an emergent macroeconomic property.





Footnotes:

[0] I don't know if this is an actual saying, but at least I say it from time to time. It in a sense explains the Scott Sumner - Paul Krugman wars. Their entire disagreement could be summarized to hinge on the ability of expectations of monetary expansion to free a country from a liquidity trap. Krugman says it's hard (hence it being a trap); Sumner says it's easy.

[1] Innovation models may well describe what is happening at the micro level (individual firms), and may be useful to understand differences between either the first order information equilibrium model and empirical data or differences between different countries. One thing they aren't needed for is to explain the lack of accelerating growth.

[2] This headwind could be visualized as an economy taking longer and longer to explore (via dither) the nearby state space. A possibility is that not only is the volume of states growing, but the dimension is growing as well. That's a random placeholder thought. From Jaynes: "In constantly exploring the neighboring states, the economy is always more likely to move to one of higher than lower entropy, simply because there are more of them (greater multiplicity)."


Thursday, May 21, 2015

Frameworks and the Bohr model analogy

Not the Bohr model at all. From Wikimedia Commons.

I said I wouldn't do this, but here's another post inspired by Paul Romer. Romer, in this post, says this:
If you are an undergraduate thinking about studying economics in graduate school, I’d strongly recommend a physics degree, or at least lots of physics courses. Math is a tool, but math courses do not teach judgment about how to use this tool to make good abstract mental maps of real world terrain. Learning about models in physics–e.g. the Bohr model of the atom–exposes you to time-tested models that found a good balance between simplicity and insight about observables.
In physics we don't actually learn the Bohr model except maybe as part of a survey of the history of quantum theory. We re-derive the Bohr formula for the energy levels of the Hydrogen atom (only because it turned out to be correct at a given level of approximation). And the Bohr radius constant simplifies writing out the wavefunctions derived from the Schrodinger equation. But we don't really learn so-called "old quantum theory" ... we learn quantum mechanics.

It's not in any sense a "time-tested" model "that found a good balance between simplicity and insight about observables". It is an interesting middle stage in the history of quantum mechanics (i.e. it is relevant data in history of science research), but I can't imagine any insight you might glean from it. The Bohr model leads no further than its original formulation.

However, I think the way Romer sees the Bohr model really sheds some light on something that is truly missing in economics. That's because Romer sees the Bohr model with his economist's eyes.

Economics, as described by Romer, is a collection of maps:
There is no such thing as the perfect map. This does not mean that the incoherent scribbling of McGrattan and Prescott are on a par with the coherent, low-resolution Solow map that is so simple that all economists have memorized it. Nor with the Becker map that has become part of the everyday mental model of people inside and outside of economics. ... 
For specific purposes, some maps are better than others. Sometimes a subway map is better than a topographical map. Sometimes it is the other way around.
For Romer, the Solow model is analogous to his perception of the Bohr model -- it's some older map that's still useful. It's like a subway map that simplifies the geography of New York.

But the Bohr model is more like a mappa mundi -- an old map that gets right the idea of the three continents surrounding the Mediterranean, but is largely outdated in its technical accuracy and, more importantly, in the methods it was made with. These methods are what form a framework of map-making. A mappa mundi was created from an old framework where religion dominated (so e.g. Jerusalem is at the center). Today's maps are made via a modern framework: surveying (for physical maps) or the nodes and links of network mapping (for things like subway maps).

The Bohr model appears as part of a transition from a classical mechanics framework (where atoms explode in a burst of ultraviolet radiation) to a quantum mechanical framework that includes Schrodinger equations, operators and Hilbert spaces. Attempting to use a model of quantum phenomena made without the quantum mechanics framework would be odd to say the least. Making a reference to Bohr-Sommerfeld quantization in a physics seminar would be more like making reference to the aether (wrong) than to Newton's law of gravitation (accurate in specific limits).

As far as I can tell, there are no frameworks for economic models. Sure, there are some principles, but no frameworks. That is to say all economic models are effectively part of the same default framework that I'll call mathematical philosophy. Mathematical philosophy is basically making arguments with math. Physics was part of this "default" framework from about the time of Galileo to about the time of Newton. Newton created the first true framework of physics. Analogously, Darwin created the first framework for biology.

Now it is arguable that economics does have a framework. Utility, game theory, search/matching theory and DSGE are the best candidates I've seen. Supply and demand diagrams are actually a really good framework, but modern economists (except Paul Krugman) unfortunately tend to eschew them. And remember, the existence of a framework doesn't mean it's the correct way to think about problems (old quantum theory was wrong). From what is written out there, if DSGE is a framework for a large segment of economics, it doesn't seem to be a particularly successful framework.

How can you figure out what a framework is? Imagine you're given an economic question. Now ask yourself if there is something you immediately write down to start solving it. Is there something? That's your framework.

Given a problem in physics, I will write down a Hamiltonian or Lagrangian (or any of the several things that are immediately derivable from them like equations of motion or path integrals). There are two flavors -- classical and quantum. The Schrodinger equation is a specific Hamiltonian (quantum mechanics) and Feynman diagrams are ways to solve the equations of motion implied by a Lagrangian (quantum field theory). Einstein's general relativity has the rather beautiful Lagrangian (density) R√-g.

Now given a problem in economics (and this is just me), I would write down an information equilibrium relationship.
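For reference, here is what that looks like, using the definition from elsewhere on this blog with generic quantities $A$ and $B$ and a constant $k$ (the reference values $A_{0}$ and $B_{0}$ are just labels introduced here for the integration constants):

$$
\frac{dA}{dB} = k \; \frac{A}{B}
$$

For constant $k$, this integrates to a power law:

$$
\frac{A}{A_{0}} = \left( \frac{B}{B_{0}} \right)^{k}
$$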

Except for papers about DSGE models (which do tend to start with a definition of the DSGE system of equations, equilibrium conditions, etc), economists don't seem to start with anything other than some specific model. We'll start with a neoclassical growth model ... the Solow model ... monopolistic competition ... Diamond-Dybvig ... etc.

It would be like starting an atomic physics problem with the Bohr model (n.b. Romer doesn't think this is weird, but most physicists would). Even when physicists use "the Standard Model", they're actually referencing a specific Lagrangian.

And in the end, the reason physicists can easily dismiss certain models as garbage is that those models ignore the main frameworks: classical Newtonian mechanics, relativistic mechanics, quantum mechanics (nonrelativistic quantum theory) or quantum field theory (relativistic quantum theory). String theory is the most recently developed framework (it's not a specific model). If the model is working in a framework, you have to resort to empirical data to call it garbage.

Economists are still in the mathematical philosophy stage, so the only way you can eliminate garbage is empirical data. Romer is trying to set up some new framework that divides economics into "mathiness" and "science", but that's really just more mathematical philosophy! It's not going to work. Stephen Williamson has some good retorts ("What if the people disagreeing with us are idiots?"), but the gist is that you are arguing philosophy. You can only purge garbage with either empirical data or an empirically successful framework (which really is just a shortcut to comparing with all the data the framework succeeds in describing). Romer seems to believe that the framework should be whatever the recognized experts/majority think it is. But that's not how science works.





PS


Here are some examples from physics
Model = MIT bag model 
Framework = Quantum mechanics 

Model = Chiral perturbation theory 
Framework = Quantum field theory 

Model = Solar system 
Framework = Newtonian mechanics (some relativistic corrections) 

Model = Hydrogen atom 
Framework = Quantum mechanics (some relativistic corrections) 

Model = Bohr model 
Framework = "Old quantum theory" -- n.b. model works, but framework wrong
From economics, some candidates are ... (this gets hard because as I mentioned, there don't seem to be any real frameworks)
Model(s) = Various DSGE models  
Framework = DSGE 

Model = Diamond-Dybvig 
Framework = Utility maximization, game theory (?) 

Model = IS-LM, AD-AS 
Framework = Supply and demand 

Model = Dornbusch overshooting 
Framework = Supply and demand (?)
Note that "sticky prices" or "expectations" are specific effects -- they are not models or frameworks -- they are included in models and frameworks.

I put in these economics "examples" more as a starting point for discussion. I'm not particularly sure about them.

Wednesday, May 20, 2015

Graphical version of my view of mathiness

Here is this post represented in graphical form ...



About me

I'm adding a bit more to my biographical information than: "I am a physicist who messes around with economic theory." It will still be a bit vague because this is a completely open forum and I'd like to maintain some privacy.

I went to the University of Texas at Austin and graduated in the 1990s with a degree in math and a degree in physics; my primary interests were topology and group theory in the former and plasma physics in the latter.

Discovery and me at the OPF.
Upon graduating, I moved up to Seattle to study at the University of Washington for my PhD in theoretical physics. I worked on nuclear and particle physics and my thesis was on the quark structure of nuclei. I spent time looking at the EMC effect -- I was the one who created the original article on Wikipedia -- and also trying to predict the results of experiments at TJNAF. I have several publications from this era. I was what is called in physics a "phenomenologist" (a special breed of theorist who connects theory to experiment).

I briefly flirted with the idea of becoming a "quant" as I was writing my thesis. I studied up on finance models, and was actually offered a few positions but ended up turning them down. This was what started my interest in the econoblogosphere.

I instead got a job at a company that will remain nameless doing research and development work mostly revolving around signal processing. That was only interrupted by a government fellowship that included relocation to DC, attending some congressional committee meetings, and touring the OPF as pictured. One of the more economics-relevant things I saw was recent work on prediction markets (more about those here).

It was during that time I stumbled upon this paper by Fielitz and Borchardt and tried to apply the information transfer framework to what is essentially Hayek's description of the price mechanism. That didn't exactly work, but it did work if you thought about the problem differently. I thought it was sufficiently different that it might be an interesting development in economics -- or at least prediction markets -- so I started to write a paper. That paper became the first few posts of this blog which I started after I returned to Seattle (and that job at that nameless company).

I currently live in Seattle with my amazing wife and awesome step-daughter (who both are wonderful for putting up with the blogging and research).

PS Here's a post I wrote on what I considered to be formative reading for me -- at least with regard to this blog.

Update:

I have now put out my first economics paper as a preprint available at the arXiv and on EconPapers:

http://arxiv.org/abs/1510.02435
http://econpapers.repec.org/RePEc:arx:papers:1510.02435

Tuesday, May 19, 2015

Prescott and Lucas aren't Romer's problem

Show me the data!

Ok, this is the last post on mathiness -- there's real work to be done!

I'd like to start off by saying I'm overall sympathetic to the complaints Paul Romer is making about his field. I agree! Certain economists seem to obfuscate their political assumptions with a blizzard of math [0]. Overall, he seems like a good guy ... and was essentially a physics major, which is a plus in my book!

When Romer quoted this from his own mathiness paper:
The style that I am calling mathiness lets academic politics masquerade as science. Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in natural versus formal language and between statements with theoretical as opposed to empirical content.
I cheered. There was no mention of Solow in Romer's blog post -- I even thought he might be after the neoclassical embrace of the model as evidenced by Marginal Revolution University's whole unit on it (a unit filled with obfuscated political assumptions).

But I had mistakenly put the emphasis on the last clause
[mathiness] leaves ample room for slippage ... between statements with theoretical as opposed to empirical content.
Romer seems to be more upset about the lack of "tight links" -- essentially the lack of formal rigor. The key issue I have is that Romer believes this:
We assume that the measure of a country’s production locations is proportional to its population, since locations correspond to markets and some measure of people defines a market.
McGrattan and Prescott (2009)
is somehow less problematic than this [1]:
Output is to be understood as net output after making good the depreciation of capital. About production all we will say at the moment is that it shows constant returns to scale. Hence the production function is homogeneous of first degree. This amounts to assuming that there is no scarce nonaugmentable resource like land. Constant returns to scale seems the natural assumption to make in a theory of growth.
Solow (1956) 
Constant returns to scale are not assumed because they are empirically observed. Solow essentially says he assumes constant returns to scale because it seems to be just the sort of assumption one makes when defining a theory of growth. (Wait ... did you just say there is no such thing as land?) Why not just say you're assuming constant returns to scale because it makes the math easier and narrows down the possibilities of the production function?

But the worst part of this is that it doesn't even make sense in terms of the economics of growth. I double my factories and my labor force, I double my output? Here is a short list of possible effects that would fight in different directions, but generally away from constant returns to scale:

  • Additional factories mean that the managerial and labor force could be less skilled on average (limited resource, decreasing returns to scale) [2]
  • Additional factories allow for more eyes and more possibilities to find tweaks that improve production (increasing returns to scale)
  • Additional capital and labor mean better paying jobs that incentivize the creation of amenities and higher standards of living (increasing returns to scale)
  • Additional capital and labor develop more experience, expanding the source of new ideas and businesses (increasing returns to scale)
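To make "constant returns to scale" concrete, here is a minimal sketch using a Cobb-Douglas form as the illustration. The function names and the particular exponents and input values are assumptions made up for this example; the only point is that doubling both inputs multiplies output by $2^{\alpha + \beta}$, so output exactly doubles only when the exponents sum to one.

```python
def cobb_douglas(capital, labor, alpha, beta, a=1.0):
    """Cobb-Douglas output a * K**alpha * L**beta (illustrative helper)."""
    return a * capital**alpha * labor**beta


def scale_factor(alpha, beta, s=2.0, capital=3.0, labor=5.0):
    """Ratio of output after scaling both inputs by s to output before.

    The input levels are arbitrary; the ratio equals s**(alpha + beta)
    regardless of where you start.
    """
    before = cobb_douglas(capital, labor, alpha, beta)
    after = cobb_douglas(s * capital, s * labor, alpha, beta)
    return after / before


# Hypothetical exponent choices: sum equal to, below, and above one.
for alpha, beta in [(0.25, 0.75), (0.2, 0.6), (0.3, 0.8)]:
    ratio = scale_factor(alpha, beta)
    print(f"alpha + beta = {alpha + beta:.2f}: doubling inputs scales output by {ratio:.3f}")
```

The effects in the list above would show up as an exponent sum below one (decreasing returns) or above one (increasing returns).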

However, the assumption of constant returns to scale puts all of these effects into a new piece that gets the name Total Factor Productivity (TFP) ... that becomes a mysterious factor responsible for most of economic growth. If you look at it empirically, the Cobb-Douglas production function works just great if you just drop the extra theoretical assumptions.

I'm not saying economists don't know about these things -- Romer is one of the foremost experts on growth economics. I am saying theoretical models and assumptions are trumping empirical analysis. Theoretical assumptions create empirical problems that must be solved with additional assumptions (auxiliary hypotheses) -- a regressive research program.

Prescott and Lucas aren't Romer's problem. Romer's problem has been around for a lot longer than their recent papers (at least 1956, but really well before then). Connection with empirical data is the purgative growth economics needs. It is the purgative macroeconomics needs. Empirics aren't going to bow down to mathiness regardless of whether it is a lack of rigor or political theoretical assumptions.


...


PS Romer's mathiness doesn't just exist in economics -- it exists in theoretical particle physics and string theory. And the reason is similar: a lack of experimental data. There's no real politics in string theory, but that doesn't stop what are essentially alliances from forming around the big figures in the field. The way you get ahead is no longer solving empirical puzzles or predicting new experimental results. You get ahead by impressing a key figure in the field with your genius.

This is to say that it's not because economics has to deal with real world policy that politics dominates -- it's the lack of empirical data that lends itself to a system of alliances. And instead of forming a new type of alliance out of whole cloth, it takes one off the shelf (common everyday left-right politics). That politics is like the cosmic microwave background radiation -- the higher energy density areas lead to galaxies and the lower density to voids. "Saltwater" economics forms on the politically liberal coasts of the US, "Freshwater" in its more conservative interior.

No amount of discussion about methodology or mathiness will dislodge these alliances. Empirical data is the only thing that can. Noah Smith says macro data is uninformative. I call bullshit on that. Macro data can only be uninformative if your models are too complex for the data you have. Drop the complex models and show some simple models with lines going through data.

That's why I hope all of you out there reading these economics blogs start asking to see the data. Ask to see lines going through the data. If an econoblogger's model can't produce agreement with empirical data, don't be afraid to call it garbage. Tell that econoblogger they're away with the fairies. I get my stuff called garbage all the time; trust me it doesn't hurt my feelings (but then I like arguments). It's even helpful sometimes! And at least I disclose the models and show how they do with empirical data.


Footnotes:

[0] In a sense, that is something that is refreshing about the right-leaning economist/blogger Scott Sumner. Sumner seems to be saying there is no mathematical theory -- it's all political belief.

[1] Paul Romer: "The Solow model is an example of excellent theory."

[2] We all see this first hand when a restaurant we like opens another location that is never quite as good as the original. It's also the trend towards mediocrity that comes with ubiquity e.g. Wolfgang Puck canned soups.

Monday, May 18, 2015

Another mistake from Romer


I found another mistake in Romer's "proof", but I also got some (wrong-headed) pushback on the econ job rumors forum about my take on Paul Romer's mathiness. Here's the comment:
Yeah, [this post] is complete garbage. Mathematically, it's an issue of pointwise vs. uniform convergence, plain and simple. Economically, Romer says that Lucas + Moll don't get endogenous growth because of diffusion. Rather, they get endogenous growth because the "knowledge frontier" is already unbounded. Any empirical observations (which necessarily occur at finite T) depend critically on the rate of knowledge arrival. This guy just does not get it.
I think this shows us a bit where economics has gone off the rails with the mathiness. Economics should never refer to uniform vs pointwise convergence. This is a topic from real analysis [0]. If uniform vs pointwise convergence matters in economics, you're doing it wrong.

And it's not even a correct assessment -- it's actually a case of almost uniform convergence (the set of points where the limit doesn't converge uniformly has measure zero, and is in fact exactly the point $\beta = 0$).

An example from physics might help make this a bit clearer. Let's say we have a model of all possible collections of nuclei that decay via $\alpha$ decay with time constant $\tau$ so that the number of your original nuclei left at time t is given by

$$
N_{\tau}(t) = N_{0} e^{-t/\tau}
$$

This function has the essential properties of Lucas and Moll's growth function $g_{\beta}(t)$ where basically $1/\beta$ is now $\tau$.

$$
\lim_{\tau \rightarrow \infty} N_{\tau}(T) = N_{0}
$$

$$
\lim_{t \rightarrow \infty} N_{\tau}(t) = 0
$$

The English versions of these two limits:

  1. If the nuclei are stable, they never decay.
  2. Unstable nuclei always eventually decay.
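A quick numerical sketch of the two limits, with the count expressed as a fraction of $N_{0}$. The function name and the specific values of $T$, $\tau$ and $t$ are arbitrary choices for illustration.

```python
import math


def nuclei_remaining(t, tau, n0=1.0):
    """Nuclei left at time t for decay time constant tau: n0 * exp(-t / tau)."""
    return n0 * math.exp(-t / tau)


# Limit 1: tau -> infinity at a fixed, finite observation time T.
# The nuclei become stable and essentially none have decayed by T.
T = 10.0
for tau in (1e1, 1e3, 1e6):
    print(f"T = {T:g}, tau = {tau:g}: N/N0 = {nuclei_remaining(T, tau):.6f}")

# Limit 2: t -> infinity at a fixed, finite tau.
# Unstable nuclei always eventually decay.
tau = 10.0
for t in (1e1, 1e2, 1e3):
    print(f"tau = {tau:g}, t = {t:g}: N/N0 = {nuclei_remaining(t, tau):.3e}")
```

Which answer you get depends only on whether the observation time is small or large compared to $\tau$.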

Lucas and Moll (2014) essentially is a model of decaying nuclei and their limit is the second one. They don't care about stable nuclei (economies where nothing happens), they care about something else (economies where everything has already happened [1]). Probably why they ignored Romer's comment! I'd ignore it too. And send back a pithy comment.

Note that my detractor says:
 "Any empirical observations (which necessarily occur at finite T) depend critically on the rate of knowledge arrival."
That is exactly my point! They also critically depend on finite $\beta$, so why are you taking any limits at all? The whole concept of choosing $T$ only makes sense relative to $\beta$. And since the model only exists at finite $\beta$, whatever limit you are taking has to be relative to $\beta$.

This entire argument might have been avoided by a simple $\beta > 0$ statement. But then, we get to Paul Romer's second mistake. He apparently notices that $\beta > 0$. He takes a limit that depends critically on a point he himself says doesn't exist as a point in the space!

His mathy proposition says that $\beta \in (0, \tilde{\beta})$, not $\beta \in [0, \tilde{\beta})$, so the limit $\beta \rightarrow 0$ takes $\beta$ to be arbitrarily small, but not zero. One notation for that is $0^{+}$ and that limit is uniformly convergent. You can exchange the order of limits under this condition:

$$
\lim_{t \rightarrow \infty} \lim_{\beta \rightarrow 0^{+}} g_{\beta}(t) = \lim_{\beta \rightarrow 0^{+}} \lim_{t \rightarrow \infty} g_{\beta}(t) = \gamma
$$

All of this nonsense could have been avoided by lightening up on the real analysis and not forgetting that we are trying to describe a real-life system!

Economists is weird.

Footnotes:

[0] I know. In addition to physics, I have a math degree too. I've been through that punishment.

[1] Not sure why they want to do this -- the interesting bit is where $T \sim 1/\beta$. And if this is what Romer is objecting to, why do it with the silly proof and the argument about limits? Just say the economy Lucas and Moll describe doesn't make economic sense -- not that they didn't properly account for the order of limits.

Saturday, May 16, 2015

The irony of Paul Romer's mathiness


From Paul Romer's extended mathiness appendix [pdf].

I apparently completely misunderstood Paul Romer's comment about "mathiness". My original interpretation was that Romer was upset about obfuscating political assumptions that aren't substantiated empirically by using fancy math. But then I started reading some of his appendix (pictured above). I was completely wrong.

Paul Romer is upset about the technical rigor of those political assumptions. If the function Lucas and Moll (2014) used allowed you to exchange the order of the limits everything apparently would have been fine! Unfortunately, it turns out that Romer's takedown of Lucas and Moll is the true mathiness.

Lucas and Moll (LM) put forward some economic growth function $g_{\beta}(t)$ where $\beta$ is the "rate of innovation". LM then tell us:

$$
\lim_{t \rightarrow \infty} \; g_{\beta}(t) = 2\%
$$

which is independent of $\beta$. Fine. But Romer objects! He shows (somewhere -- couldn't find the comment on LM he was referring to on his website)

$$
\lim_{\beta \rightarrow 0} \; g_{\beta}(T) = 0
$$

Oh noes! We have a case where the limit depends on the order you take it:

$$
\lim_{\beta \rightarrow 0} \; \lim_{T \rightarrow \infty} \; g_{\beta}(T) \neq \lim_{T \rightarrow \infty} \; \lim_{\beta \rightarrow 0}  \; g_{\beta}(T)
$$

He even goes on to state this in a mathy "proposition" form (pictured above).

Now if $g$ represented the average polarization of an Ising model, this might be interesting (even lending itself to the possibility of a topological solution). But infinity is not readily encountered in a real economy and the situation being described is the idea of observing the economy at a date $T$ when the typical time until "knowledge arrives" (I know!) is $1/\beta$. What do these two limits mean in real life?

  1. You are observing an economy in which knowledge never arrives ($\beta \rightarrow 0$ first)
  2. You are observing an economy after which the knowledge has arrived ($T \rightarrow \infty$ first)

The second one is the sensible limit of Lucas and Moll (2014).

Now what does the time of observation $T$ mean in real life? I am assuming economists don't really pay attention to the big bang or eventual heat death of the universe and that an economy can happen at any time in the lifetime of the universe. That is to say -- economics has a time translation invariance where you could relabel the year 1991 as 10191 and it wouldn't make a bit of difference. Relative times matter, but not absolute times. Which means that I could shift $T$ to any finite, but large value arbitrarily ... in particular I can choose $T \gg 1/\beta$. I can't choose $\beta$ as it represents a real thing: the time between two "knowledge" events. 

That is to say there is an existing scale in the model $1/\beta$, the time difference between "knowledge" events. There is no such scale for $T$ unless it is $1/\beta$ and therefore the only sensible limit is:

$$
T \gg 1/\beta
$$

which is situation 2 above.
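Here is a toy illustration of the scale argument. To be clear, `toy_growth` is a hypothetical stand-in chosen only because it reproduces the two limits quoted above; it is not Lucas and Moll's actual growth function, and the 2% long-run rate is the only number taken from the text.

```python
import math

GAMMA = 0.02  # long-run growth rate (2%), as quoted above


def toy_growth(beta, t, gamma=GAMMA):
    """Hypothetical stand-in for g_beta(t): gamma * (1 - exp(-beta * t)).

    It goes to 0 as beta -> 0 at fixed t, and to gamma as t -> infinity
    at fixed beta. It depends on beta and t only through the product beta * t.
    """
    return gamma * (1.0 - math.exp(-beta * t))


beta = 0.05  # arbitrary rate, so the model's time scale is 1/beta = 20
for ratio in (0.01, 1.0, 100.0):  # ratio = T / (1/beta) = beta * T
    t = ratio / beta
    print(f"T = {ratio:g} x (1/beta): g = {toy_growth(beta, t):.5f}")

# Changing beta while holding beta * T fixed gives the same growth rate.
print(toy_growth(0.05, 20.0), toy_growth(0.5, 2.0))
```

In this toy version only the ratio of $T$ to $1/\beta$ matters, and once $T \gg 1/\beta$ the growth rate is indistinguishable from the long-run value.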

Romer turns out to be ironically wrong because he uses too mathematical a description in a physical theory. You can discard the first limit ($\beta \rightarrow 0$ first) as a mathematical curiosity that doesn't represent a real life economy.

I don't have any particular fondness for Lucas, but I think in his zeal to take him down Romer falls for the exact mathiness he purports to dislike!

Cobb and Douglas didn't have changing TFP, and Is TFP entropy?


Continuing in this series (here, here and here), I found Cobb and Douglas's original paper from 1928 [pdf] where their least squares fit gives them the function:

$$
P = 1.01 L^{3/4} C^{1/4}
$$
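As a rough sketch of how a fit like this can be done: if the exponents are constrained to sum to one (as they do in the function above), dividing through by $C$ gives $\log(P/C) = \log b + k \log(L/C)$, which is an ordinary straight-line least squares problem. The data below are made-up index numbers generated from the fitted function itself, not Cobb and Douglas's data, and the helper name is invented for this example.

```python
import math


def fit_constant_returns(labor, capital, product):
    """Least-squares fit of P = b * L**k * C**(1 - k).

    Uses the linear form log(P/C) = log(b) + k * log(L/C) and the
    textbook slope/intercept formulas for simple linear regression.
    """
    xs = [math.log(l / c) for l, c in zip(labor, capital)]
    ys = [math.log(p / c) for p, c in zip(product, capital)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    k = sxy / sxx                         # slope = labor exponent
    b = math.exp(y_mean - k * x_mean)     # intercept = log of the prefactor
    return b, k


# Synthetic, noiseless inputs (hypothetical index numbers).
labor = [100.0, 110.0, 125.0, 140.0, 150.0]
capital = [100.0, 130.0, 170.0, 230.0, 300.0]
product = [1.01 * l**0.75 * c**0.25 for l, c in zip(labor, capital)]

b, k = fit_constant_returns(labor, capital, product)
print(f"b = {b:.3f}, labor exponent k = {k:.3f}, capital exponent = {1 - k:.3f}")
```

With noiseless synthetic data the fit should simply recover the coefficients it was generated from; real data would scatter around the line.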

And they get a pretty good result:




Also, Noah Smith writes today:
Yes, in a Solow model you can tie capital K to observable things like structures and machines and vehicles. But you'll be left with a big residual, A.


$$
NGDP = A \; K^{\alpha} \; L^{\beta}
$$

And use the "economic potential" (see also here):

$$
NGDP = TS + X + Y + ...
$$

$$
NGDP \approx (c/\kappa + \xi + \eta + ... ) NGDP
$$

So that ...

$$
NGDP \approx (c/\kappa + \xi + \eta + ... ) A \; K^{\alpha} \; L^{\beta}
$$

$$
= (A c/\kappa + A \xi + A \eta + ... ) \; K^{\alpha} \; L^{\beta}
$$

$$
= (\underbrace{A c/\kappa}_{\text{residual productivity}} + \underbrace{A \xi + A \eta + ...}_{\text{measurable output}}) \; K^{\alpha} \; L^{\beta}
$$

or

$$
= (\underbrace{A c/\kappa}_{\text{entropy}} + \underbrace{A \xi + A \eta + ...}_{\text{real output}}) \; K^{\alpha} \; L^{\beta}
$$

So that we say

$$
NGDP \approx (A_{TS} + A_{0}) \; K^{\alpha} \; L^{\beta}
$$

Noah's statement is essentially that we expect a number the size of $A_{0}$, but it turns out it is large (i.e. the size of $A_{TS} + A_{0}$) and $A_{TS}$ is this large residual (or the whole term is the large residual). In this description, the Cobb-Douglas production function works because the entropy term is approximately proportional to output: $TS \approx (c/\kappa) NGDP$.