Tuesday, May 3, 2016

A review of Cesar Hidalgo's Why Information Grows


I have been pondering a review of Cesar Hidalgo's Why Information Grows for a few months now because I wasn't sure how to pin down exactly how I feel about it (I've mentioned it before here, here and here). On one hand, his book is an excellent "first principles" look into the theory of economic growth by a physicist; I think Schrödinger's What is Life? was a big influence on Hidalgo. As such, it's really great! He's much better than I am at explaining things like entropy and information in a readable way.

On the other hand, the book motivates yet another case of "graphing an index of things I think are important versus economic output". John Cochrane has one in the WSJ today:


John Cochrane chooses an "ease of doing business" factor. In his book, Hidalgo chooses "economic diversity" or "economic complexity":


One problem with this kind of economics is that it suffers from a bias in how the index is constructed: elements that make the conclusion you want to draw more obvious are given preference over those that make it less obvious. The process seems to stop once you achieve an R² of about 0.6 to 0.7 (meaning a Pearson's correlation coefficient of about 0.8, shown below from Wikipedia). I basically take any such graph with a grain of salt.

Why do so many graphs of output versus an index look like this?

Hidalgo's complexity is a bit better since it uses pretty well-defined properties of graphs (connecting nations with products). He measures a nation's economic diversity -- the number of connections to a nation -- and a product's ubiquity -- the number of connections to a product. Lots of nations produce fishing products (fish has high ubiquity) and a few nations export things across the economic spectrum (e.g. the US produces all kinds of things from fish to airplanes, so it has high diversity). In the same way that MZM is a better measure of money because it uses a concrete definition, there is less ability to game the index with Hidalgo's complexity measure.
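As a minimal sketch of the two counts described above, here is how diversity and ubiquity fall out of a nation-product graph. The countries, products, and links in `exports` are made up purely for illustration, and the function names are my own. This only computes the raw connection counts; the index in the book may involve further steps beyond this.

```python
# Toy nation-product graph. Every entry here is hypothetical,
# chosen only to illustrate the two counts.
exports = {
    "A": {"fish", "wheat", "cars", "airplanes"},
    "B": {"fish", "wheat"},
    "C": {"fish"},
}

def diversity(exports):
    """Diversity: number of products each nation is connected to."""
    return {nation: len(products) for nation, products in exports.items()}

def ubiquity(exports):
    """Ubiquity: number of nations each product is connected to."""
    counts = {}
    for products in exports.values():
        for product in products:
            counts[product] = counts.get(product, 0) + 1
    return counts

div = diversity(exports)
ubi = ubiquity(exports)
print("diversity:", div)
print("ubiquity:", ubi)
```

In this toy graph, fish is the most ubiquitous product and nation A is the most diverse. Note that the result depends entirely on how finely the products are divided up, which is exactly where the classification scheme enters.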

Dietrich Vollrath had a good comment on Hidalgo's book -- the title of his review is "Why Industrial Classification Diversity Grows", which points to his main argument: industrial classifications are actually pretty arbitrary divisions (nearly all software belongs to a single classification, while physical products are divided up finely). At least it's NAICS and not Hidalgo making those decisions, unlike the case of Cochrane's "ease of doing business" index.

However, I have an additional complaint: Hidalgo's complexity index isn't really measuring "crystals of imagination" (a great phrase) per his thesis. It is essentially the argument in these two posts ([1], [2]).

If we assume each NAICS classification is a dimension, and we have an overall budget constraint (a maximum possible economic output at a given time), then the greater the number of dimensions (higher complexity), the more likely we will find ourselves closer to that budget constraint.

For d dimensions (e.g. d = 3 for three NAICS classifications) and a maximum amount M, the distance Δ of the average location from the budget constraint hyperplane goes as Δ/M ~ 1 - d/(d+1) = 1/(d+1). This goes to zero as d increases. In general, an opportunity set with more dimensions has more of its possible states near its surface, and so a higher average economic output. Here are some pictures, and the black dot is the average location. For d = 3, we're still a bit away.


This is the value of diversity [2], and it's where I agree with Hidalgo. One thing to note is that we've assumed all sides are equal -- things are a bit more complicated if your shape is asymmetric (as it likely is), but the same general principle holds. If one dimension dominates (major oil exporting countries, for example), you can effectively treat the country as having only one dimension.
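The scaling Δ/M ~ 1 - d/(d+1) can be checked numerically. The following is a rough sketch, assuming states are occupied uniformly at random inside the symmetric opportunity set (all x_i ≥ 0 with x_1 + ... + x_d ≤ M, and M set to 1). The function name and sample size are my own choices. The sampling trick is the standard one: normalize d + 1 exponential draws and keep the first d coordinates.

```python
import random

def mean_fraction_of_budget(d, n_samples=10000, seed=0):
    """Average of (x_1 + ... + x_d) / M for points drawn uniformly from
    the region x_i >= 0, sum(x) <= M, with M = 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Normalizing d + 1 exponential draws gives a uniform point on the
        # d-simplex; its first d coordinates are uniform over the region
        # under the budget constraint.
        draws = [rng.expovariate(1.0) for _ in range(d + 1)]
        total += sum(draws[:d]) / sum(draws)
    return total / n_samples

for d in (1, 3, 10, 100):
    simulated = mean_fraction_of_budget(d)
    predicted = d / (d + 1)
    print(f"d = {d:3d}: simulated {simulated:.3f}, d/(d+1) = {predicted:.3f}")
```

The simulated average should track d/(d+1), so the shortfall from the budget constraint shrinks roughly as 1/(d+1). Nothing in the sampling is optimizing anything; the points are purely random.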

However, the argument above assumes random occupation of the available states. As I mention in [1], this is also why you find the products of natural evolution in states close to maximum fitness -- even though evolution is exploring the available states randomly. This takes some of the shine off of Hidalgo's crystals of imagination. They're really products of tâtonnement, full of path dependence and optimal only on average.

Overall, I'd recommend the book -- it's a well-done meditation on information theory with lots of intuitive examples. Hidalgo also has a Google talk that goes over the same basic material. I'd take the complexity index with a grain of salt, though.
