Diane Coyle has a review of a new book on statistics for a general audience: Truth or Truthiness: Distinguishing Fact From Fiction By Learning to Think Like a Data Scientist by Howard Wainer. It sounds fun and definitely seems like the kind of book needed in today's data environment.
One of the things Diane writes about in the review is the ecological fallacy:
I also discovered that one aspect of something that’s bugged me since my thesis days – when I started disaggregating macro data – namely the pitfalls of aggregation, has a name elsewhere in the scholarly forest: “The ecological fallacy, in which apparent structure exists in grouped (eg average) data that disappears or even reverses on the individual level.” It seems it’s a commonplace in statistics ... Actually, I think the aggregation issues are more extensive in economics; for example I once heard Dave Giles do a brilliant lecture on how time aggregation can lead to spurious autocorrelation results.
Now I am not 100% sure I read this correctly, so I'm not going to attribute this interpretation to Diane. However, the way this is written could be taken to impugn the macro structure: this is not the meaning of the ecological fallacy.
The ecological fallacy states that observed macro structures do not imply anything about the micro agents. It does not say that the converse is true, i.e. that agents failing to behave consistently with the macro structure implies the macro structure is spurious (it may or may not be).
I think a good example here is diffusion. The macro structure (an entropic force pushing density to become e.g. a uniform distribution) does not imply that individual molecules are seeking out areas of low density. Individual molecules are just moving randomly. A graphic from the Wikipedia article on diffusion illustrates this nicely:
However, the random motion of individual molecules does not make us question the validity of the macro observable diffusion. In a sense, all emergent properties would be suspect if this were true.
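As a toy illustration of that point, here is a minimal simulation sketch (my own, not part of the original post): every molecule follows purely random, unbiased steps, yet the ensemble still spreads from an initial cluster toward a roughly uniform density, which is the macro observable we call diffusion.

```python
# Toy diffusion: unbiased random walkers produce macro-level spreading of
# density even though no individual walker "seeks out" low density.
import numpy as np

rng = np.random.default_rng(0)
n_molecules, n_steps, box = 10_000, 2_000, 50.0

# Start every molecule clustered in the left 10% of the box.
x = rng.uniform(0.0, 0.1 * box, size=n_molecules)

for _ in range(n_steps):
    x += rng.normal(0.0, 1.0, size=n_molecules)  # purely random micro motion
    x = np.abs(x)                   # reflect off the left wall (x = 0)
    x = box - np.abs(box - x)       # reflect off the right wall (x = box)

# The density histogram is now roughly flat rather than a spike on the left.
counts, _ = np.histogram(x, bins=10, range=(0.0, box))
print(counts / n_molecules)
```

The only "force" here is statistical: there are vastly more ways to arrange the molecules spread out than clustered, which is all an entropic force is.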
But Diane also said she noticed that macro structures tend to fall apart when disaggregated; this is exactly what we'd expect if macroeconomic forces are entropic forces like diffusion. I've already noted that nominal rigidity (sticky prices and wages) appears to be the result of an entropic force [1] (nominal rigidity appears in aggregate data, but isn't true for individual prices). We can see e.g. Calvo pricing as a "microfoundation" for something that doesn't exist at the micro level, much like (erroneously) positing a density-dependent force acting on individual molecules in diffusion. I also showed how consumption smoothing, transitive preferences, and rational agents can arise from agents that fail to meet any of those properties.
Essentially, the issues with the ecological fallacy should be ubiquitous in economics if it really is about entropic forces and macro is different from aggregated micro.
...
Footnotes:
[1] Possibly even more interesting are causal entropic forces; this formulation can make inanimate objects appear to do intelligent things. I constructed a demand curve from them here. As I noted in the first link in this footnote, the causality may be deeply related to Duncan Foley and Eric Smith's observation that the real difference between economics and the physics of thermodynamics is the former's focus on irreversible transformations (agents don't willingly undo gains) and the latter's focus on reversible ones (e.g. for experiments).
Hey Jason, I have a question about the basics of information transfer economics. Like, "the very foundations of this blog" basics.
In your post "information theory 101", you say that the Shannon information entropy of a single supply event is equal to the Shannon information entropy of a single demand event. Much of your blog is focused on substituting different variables for demand and supply into this simple case and seeing how the results fit the data.
I'm a little confused as to the jump you make between saying that H(demand symbols) = H(supply symbols) and (D/dD)H(demand symbols) = (S/dS)H(supply symbols). If a single sampling of the supply distribution offers the same (expected) information as a single sampling of the demand distribution, then it makes sense that repeated sampling should create the relationship (D/dD)H(demand symbols) = (S/dS)H(supply symbols). But the original equation still holds. Can't we divide both sides of the second equation by each side of the first, so that D/dD = S/dS, and wouldn't this imply that k is always equal to 1?
I think a lot of this confusion stems from a non-rigorous definition of a "supply event" or "demand event". The fact that supply and demand have units makes them strikingly different phenomena from discrete, dimensionless symbols. I'm sure I'm missing something obvious, but I feel like only one of the equations ((D/dD)H(demand symbols) = (S/dS)H(supply symbols) or H(demand symbols) = H(supply symbols)) can be legitimate.
Could you help me out? This has been bothering me for a while.
That's a great question.
Let's just call D/dD = nd and S/dS = ns the number of draws from the single-event distributions with information entropies H(d) and H(s). When you say H(d) = H(s), you are implicitly saying you have the same number of "draws" (i.e. nd = ns = 1), and you're correct that this would imply k = 1. However, in general we allow not just different distributions, but different numbers of draws from them. So the information entropy of some number of demand events I(d) = nd H(d) is equal to the information entropy of some number of supply events I(s) = ns H(s).
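To make the step explicit, here's the algebra written out (just a sketch; the only assumption is that dD and dS are the sizes of a single demand and supply event, so that nd = D/dD and ns = S/dS count the draws):

```latex
\begin{align*}
I(d) = I(s) \;\Longrightarrow\; n_d\, H(d) &= n_s\, H(s) \\
\frac{D}{dD}\, H(d) &= \frac{S}{dS}\, H(s) \\
\Longrightarrow\; \frac{dD}{dS} &= \frac{H(d)}{H(s)}\,\frac{D}{S} \;\equiv\; k\,\frac{D}{S},
\qquad k = \frac{H(d)}{H(s)}
\end{align*}
```

Dividing through by H(d) = H(s) is only legitimate in the special case nd = ns, which is exactly the k = 1 case you identified.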
Let's take demand to be made up of die rolls (6-sided) and supply to be made up of coin flips. In order for the information entropy to be the same, you'd need 2.6 coin flips (each coin flip is one bit) for every die roll (2.6 bits per roll). So if nd goes up by one, ns needs to go up by 2.6.
In this case we have k = log 6/log 2 ≈ 2.6. And a "supply event" is 2.6 coin flips (2.6 draws with 2 supply symbols), while a "demand event" is one die roll (one draw with 6 demand symbols).
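A quick numerical check of that example (again just a sketch of my own):

```python
# Verify the die/coin bookkeeping: one fair die roll carries log2(6) bits of
# information entropy, one fair coin flip carries 1 bit.
import math

H_d = math.log2(6)   # entropy of one demand event (die roll), in bits
H_s = math.log2(2)   # entropy of one supply event (coin flip), in bits

k = H_d / H_s
print(round(k, 3))   # ~2.585, the "2.6" quoted above

# Matching total information: n_d die rolls need n_s = k * n_d coin flips.
n_d = 10
n_s = k * n_d
print(math.isclose(n_d * H_d, n_s * H_s))   # True
```

The ratio ns/nd stays pinned at k, which is how the totals I(d) and I(s) can match even though the per-event entropies differ.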
The thing is: if the information theory derivation isn't intuitive, there's always the path of simply generalizing Fisher's marginal utility equation, where k is just the utility of an element of D (i.e. satisfying the need, purchasing the item) relative to the utility of an element of S (i.e. the seller's utility derived from selling the item).