[Update: A (in my opinion, better) version of this blog post has been reprinted as an article at Evonomics.]
Vox talked with Chris Hayes of MSNBC in one of their podcasts. One of the topics that was discussed was neoclassical economics:
[Vox:] The center-right ideas the left ought to engage[?]
[Hayes:] The entirety of the corpus of Hayek, Friedman, and neoclassical economics. I think it’s an incredibly powerful intellectual tradition and a really important one to understand, these basic frameworks of neoclassical economics, the sort of ideas about market clearing prices, about the functioning of supply and demand, about thinking in marginal terms.
I think the tradition of economic thinking has been really influential. I think it's actually a thing that people on the left really should do — take the time to understand all of that. There is a tremendous amount of incredible insight into some of the things we're talking about, like non-zero-sum settings, and the way in which human exchange can be generative in this sort of amazing way. Understanding how capitalism works has been really, really important for me, and has been something that I feel like I'm a better thinker and an analyst because of the time and reading I put into a lot of conservative authors on that topic.
I can hear some of you asking: Do I have to?
The answer is: No.
Why? Because you can get the same understanding while also understanding where these ideas fall apart ‒ that is to say understanding the limited scope of neoclassical economics – using information theory.
Prices and Hayek
One thing that I think needs to be more widely understood is that Hayek did have some insight into prices having something to do with information, but got the details wrong. He saw market prices aggregating information: a crop failure, a population boom, speculation on turning rice into ethanol ‒ these events would cause food prices to increase, and that price change represented knowledge about the state of the world being communicated. However, Hayek was writing in a time before communication theory (Hayek's The Use of Knowledge in Society was written in 1945, a few years before Shannon's A Mathematical Theory of Communication in 1948). The issue is evident in my list: a large amount of knowledge about biological and ecological systems, populations, and social systems is all condensed into a single number that goes up. Can you imagine the number of variables you'd need to describe crop failures, population booms, and market bubbles? Thousands? Millions? How many variables of information do you get out via the price of rice in the market? One.
What we have is a complex multidimensional space of possibilities being compressed into a one-dimensional space of possibilities (i.e. prices), so if the price represents information aggregation, we are losing a great deal of that information in the process. As I talk about in more detail here, one way neoclassical economics deals with this is to turn that multidimensional space into a single variable (utility), but that just means we've compressed all that information into something else (e.g. non-transitive or unstable preferences).
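As a rough worked example (the specific counts here are illustrative assumptions, not measurements): describing the state of crops, populations, and speculative positions with even 1,000 independent yes/no facts requires about 1,000 bits (2^1000 possible states), while a price quoted to one part in 10,000 carries at most log₂ 10,000 ≈ 13 bits per quote. A single price simply cannot carry more than a sliver of the underlying state.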
However, we can re-think the price mechanism's relationship with information. Stable prices mean a balance of crop failures and crop booms (supply), population declines and population booms (demand), speculation and risk-aversion (demand). The distribution of demand for rice is equal to the distribution of the supply of rice (see the pictures above: the transparent one is the "demand", the blue one is the "supply"). If prices change, the two distributions would have to have been unequal. If they come back to the original stable price ‒ or another stable price ‒ the two distributions must have become equal again. That is to say, prices represent information about the differences (or changes) in the distributions. Coming back to a stable price means information about the differences in one distribution must have flowed (through a communication channel) to the other distribution. We can call one distribution D (demand) and the other S (supply). The price is then a function of changes in D and changes in S, or
p = f(ΔD, ΔS)
Note that we observe that an increase in S that's bigger than an increase in D generally leads to a falling price, while an increase in D that is bigger than the increase in S generally leads to a rising price. That means we can try
p = ΔD/ΔS
for our initial guess. Instead of a price aggregating information, we have a price detecting the flow of information. Constant prices tell us nothing. Price changes tell us information has flowed (or been lost) between one distribution and the other.
This picture also gets rid of the dimensionality problem: the distribution of demand can be as complex and multidimensional (i.e. depend on as many variables) as the distribution of supply.
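To make the "price as an information-flow detector" reading a bit more concrete, here is a minimal numerical sketch. Everything in it is an assumption made purely for illustration: the Poisson event rates, the timing of the demand shock, and the whole toy setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all numbers are assumptions for illustration): "demand events"
# and "supply events" arrive at equal average rates at first, then the
# demand rate jumps halfway through.
steps = 200
demand_rate = np.where(np.arange(steps) < 100, 10.0, 13.0)
supply_rate = np.full(steps, 10.0)

dD = rng.poisson(demand_rate)   # per-period change in demand events
dS = rng.poisson(supply_rate)   # per-period change in supply events
dS = np.maximum(dS, 1)          # avoid dividing by zero in the toy

p = dD / dS                     # the initial guess from the post: p = ΔD/ΔS

print("mean price before the shock:", round(p[:100].mean(), 2))
print("mean price after the shock: ", round(p[100:].mean(), 2))
# A roughly constant p means the two distributions stay matched (no news);
# a persistent shift in p signals information flowing between them.
```

The point of the sketch is only that the price series is informative when it moves, not when it sits still.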
Marginalism and supply and demand
Marginalism is far older than Friedman or Hayek, going back at least to Jevons and Marshall. In his 1892 thesis, Irving Fisher tried to argue that if gallons of one good A and bushels of another good B were exchanged for each other, then the last increment (the margin) was exchanged at the same rate as the total quantities A and B, i.e.
ΔA/ΔB = A/B
calling both sides of the equation the price of B in terms of A. Note that the left side is our price equation above, just in terms of A and B (you could call A the demand for B). In fact, we can get a bit more out of this equation if we say
pₐ = A/B
If you hold A = A₀ constant and change B, the price goes down: for fixed demand, increasing supply causes prices to fall – a demand curve. Likewise, if you hold B = B₀ constant and change A, the price goes up – a supply curve. However, if we take tiny increments of A and B and use a bit of calculus (ΔA/ΔB → dA/dB), the equation only allows A to be proportional to B (the one-line derivation is spelled out just below). It's quite limited, and Fisher attempts to break out of this by introducing marginal utility. However, thinking in terms of information can again help us.
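For readers who want the calculus step spelled out, here it is, using nothing beyond the relation above. Separating variables in dA/dB = A/B gives dA/A = dB/B, integrating gives log A = log B + constant, and exponentiating leaves

A = c B

for some constant c. That is, taken to the infinitesimal limit, Fisher's relation forces A to be strictly proportional to B, which is why it is so limited.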
Matching distributions
If we think of our distribution of A and distribution of B (like the distributions of supply and demand above), each "draw" event from those distributions (like a draw of a card, a flip of a coin, or a roll of a die) contains I₁ information for A (e.g. a flip of a coin contains 1 bit of information) and I₂ for B. If the distributions of A and B are in balance ("equilibrium"), the draw events from each distribution (transaction events) will match in terms of information. Now it might cost two or three gallons of A for each bushel of B, so the numbers of draws on either side will be different in general, but as long as the number of draws is large the total information from those draws will be the same:
n₁ I₁ = n₂ I₂
k n₁ = n₂
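As a quick illustrative example (the coin and die are stand-ins, not anything measured): if each draw from A carried I₁ = 1 bit (a coin flip) and each draw from B carried I₂ = log₂ 6 ≈ 2.58 bits (a die roll), then matching total information requires n₁ ≈ 2.58 n₂, i.e. two or three A-draws for every B-draw, and k = I₁/I₂ ≈ 0.39.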
Now say the smallest amount of A is ΔA and likewise for B. That means
n₁ = A/ΔA
n₂ = B/ΔB
i.e. the number of gallons of A is the amount of A (i.e. A) divided by 1 gallon of A (i.e. ΔA). Putting this together and re-arranging a bit we have
ΔA/ΔB = k A/B
This is just Fisher's equation again except there's a coefficient in it, which makes the result a bit more interesting when you take tiny increments (ΔA/ΔB → dA/dB) and use calculus. But there's a more useful bit of understanding you get from this approach that you don't get from neoclassical economics. What we have is information flowing between A and B, and we've assumed that information transfer is perfect. But markets aren't perfect, and all we can really say is that the most information that can get from the distribution of A to the distribution of B is all of the information in the distribution of A. Basically
n₁ I₁ ≥ n₂ I₂
p = ΔA/ΔB ≤ k A/B
The real prices in a real economy will fall below the neoclassical prices. There's also another assumption in that derivation – that the number of transaction events is large. So even if the information transfer were ideal, neoclassical economics would only apply in markets that are frequently traded.
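For completeness, here is the calculus result alluded to above, sketched using nothing beyond the relations already written down. In the ideal case, dA/dB = k A/B separates to dA/A = k dB/B, which integrates to

A = A₀ (B/B₀)^k

so the ideal price is p = dA/dB = k (A₀/B₀) (B/B₀)^(k−1). In the non-ideal case the equality becomes the bound dA/dB ≤ k A/B, and observed prices sit at or below that ideal value.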
Another insight we get is that supply and demand doesn't always work in the simple way described in Marshall's diagrams. We had to make the assumption that A or B was relatively constant while the other changed. In many real world examples we can't make that assumption. A salient one today is the (empirically incorrect) claim that immigration lowers wages. A naive application of supply and demand (increase supply of labor lowers the price of labor) ignores the fact that more people means more people to buy goods and services produced by labor. Thinking in terms of information, it is impossible to say that you've increased the number of labor supply events without increasing the number of labor demand events, so A and B must both increase.
Instead of the neoclassical picture of ideal markets and simple supply and demand, we have the picture the left (and, to be fair, many economists) tries to convey: not only market failures and inefficiency, but also more complex interactions of supply and demand. However, it is also possible through collective action to mend or mitigate some of these failures. We shouldn't assume that just because a market spontaneously formed or produced a result it is working, and we shouldn't assume that because a price went up either demand went up or supply went down.
The market as an algorithm
The picture above is of a market as an algorithm matching distributions by raising and lowering a price until it reaches a stable price. In fact, this picture is that of a specific machine learning algorithm called a Generative Adversarial Network (GAN, described in this Medium article or in the original paper). The idea of the market as an algorithm to solve a problem is not new. For example, one of the best blog posts of all time uses linear programming as the algorithm, giving an argument for why planned economies will likely fail; but the same reasons imply we cannot check the optimality of the market allocation of resources (therefore claims of markets as optimal are entirely faith-based). The Medium article uses a good analogy that I will repeat here:
Instead of the complex multidimensional distributions we have paintings. The "supply" B is the forged painting, the demand A is the "real" painting. Instead of the random initial input, we have the complex, irrational, entrepreneurial, animal spirits of people. The detective is the price p. When the detective can't tell the difference between the paintings (i.e. when the price reaches a relatively stable value because the distributions are the same), we've reached our solution (a market equilibrium).
Note that the problem the GAN algorithm tackles can be represented as a two-player minimax game from game theory. The thing is that with the wrong settings, algorithms fail and you get garbage. I know this from experience in my regular job researching machine learning, sparse reconstruction, and signal processing algorithms. So depending on the input data (i.e. human behavior), we shouldn't expect to get good results all of the time. These failures are exactly the failure of information to flow from the real painting to the forger through the detective – the failure of information from the demand to reach the supply via the price mechanism.
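To make the "market as an algorithm" picture a bit more tangible, here is a deliberately oversimplified sketch. It is not a GAN, just a loop that raises and lowers a single price until quantity demanded matches quantity supplied; the demand and supply functions are made-up assumptions, not anything estimated.

```python
# A deliberately oversimplified "market as an algorithm" loop: raise the
# price when demand exceeds supply, lower it when supply exceeds demand,
# and stop when the mismatch (the only information the price carries)
# has gone away. The curves below are made-up assumptions.

def demand(p):
    return max(0.0, 100.0 - 8.0 * p)   # hypothetical downward-sloping demand

def supply(p):
    return 5.0 * p                     # hypothetical upward-sloping supply

price = 1.0     # arbitrary starting price
step = 0.01     # how aggressively the "detective" reacts to a mismatch

for _ in range(10_000):
    excess = demand(price) - supply(price)   # mismatch between the two sides
    if abs(excess) < 1e-6:                   # price has stopped moving: "equilibrium"
        break
    price += step * excess

print(f"stable price = {price:.3f}, "
      f"demand = {demand(price):.2f}, supply = {supply(price):.2f}")
```

With a larger step size this loop oscillates or diverges instead of settling, which is the toy version of the "wrong settings produce garbage" point above.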
The understanding of neoclassical economics provided by information theory and machine learning algorithms leaves us better equipped to understand markets. Ideas that were posited as articles of faith or created through incomplete arguments by Hayek and Friedman are not the whole story, and they leave you with no knowledge of the ways the price mechanism, marginalism, or supply and demand can go wrong. In fact, leaving out the failure modes effectively declares many of the concerns of the left moot by fiat. The potential and actual failures of markets are a major concern of the left, and are frequently part of discussions of inequality and social justice.
The left doesn't need to follow Chris Hayes's advice and engage with Hayek, Friedman, and the rest of neoclassical economics. The left instead needs to engage with a real-world vision of economics that recognizes its potential failures. Understanding economics in terms of information flow is one way of doing just that.
...
Update 26 April 2017
I must add that the derivation of the information equilibrium condition (i.e. dA/dB = k A/B) is originally from a paper by Peter Fielitz and Guenter Borchardt and applied to physical systems. The paper is always linked in the side bar, but it doesn't appear on mobile devices.
Nice post! I am playing around with some machine learning techniques and one problem I have run into for biological applications is that frequently they produce "black box" models that overfit the data, so they end up being less useful than traditional data mining techniques. Any general suggestions?
Thanks.
RE: machine learning
My basic experience is that machine learning is useful to create an algorithm for something you basically understand via some other reason (e.g. theory), and you have some intuition about how to use that information but don't know how to program it directly. In a sense, you already know what "a signal" is or what "an anomaly" is but you don't know the exact filter you need to separate noise, signals, and anomalies. In those cases, machine learning can work eerily well.
One of my key interests is where and how these "black box" algorithms fail, so that you can understand the failures directly instead of having to know what's in the black box and derive where the algorithms fail.
Prediction markets are an example of something similar outside of machine learning. You don't know how the "market" arrives at its decisions (black box), and sometimes it fails -- I think overfitting is a good way to describe prediction markets in the 2016 election. (Note: by "fail" I don't mean the probability was high for a Clinton win and she lost; that happens.) It was these failures that led me to the information transfer framework in order to produce metrics for prediction markets ... and I'm thinking possibly machine learning algorithms.
In any case, I think it's still a place for research.
Let me add a soupçon about overfitting. I think that there is a fair literature about it, so this is just a thought or two. One is to fit only part of the data and use the rest of the data to test the fit. Another is to take a Chebyshev approach, to minimize the maximum errors. I have used that for economic data, in part because I am interested in the extremes more than in day to day events. But it is almost impossible to overfit using a Chebyshev approach. For instance, to fit a line to a 2D graph you only use three points to derive the equation of the line. (OC, you may have to use more points to derive the best fitting line. :)) How can you overfit when you ignore the specifics of most of the data?
Hi Bill,
In machine learning the issue of over-fitting is a bit more complex:
https://en.wikipedia.org/wiki/Overfitting#Machine_learning
If you don't have a good idea of what is noise and what isn't (from theory or some estimate of measurement error), then your algorithm can potentially learn on noise.
For example, when we look at NGDP growth data I see that most people in the econoblogosphere view fluctuations of 0.1 percentage points as "real" (i.e. that represents something being measured in the actual economy, not measurement noise). I tend to view the quarter to quarter fluctuations as almost entirely "noise" (that is to say random or complex factors that cause the measurements to miss the underlying trend). This is also related to how much you smooth with an HP filter.
If you don't know what you are going for, then you probably don't know what noise is. In those cases, your machine learning algorithm is likely to learn on noise (i.e. over-fit) regardless of how much data you set aside for learning and how much you set aside for testing.
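As a tiny concrete sketch of that point (entirely made-up data, just to show the mechanics of holding data out): fit a straight line and a much more flexible polynomial to a noisy linear trend, then compare errors on a held-out half.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: the "true" signal is a straight line, the wiggles are noise.
x = np.linspace(0.0, 1.0, 40)
y = 2.0 * x + rng.normal(scale=0.3, size=x.size)

x_train, y_train = x[::2], y[::2]    # learn on half the data...
x_test, y_test = x[1::2], y[1::2]    # ...hold the rest out for testing

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# The flexible fit typically does better on the training half and worse on
# the held-out half: it has "learned on noise" because nothing told it
# which fluctuations were noise in the first place.
```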
Seems like a general problem. In reviewing some more successful applications of machine learning, it looks like pre-filtering the inputs to find those that are likely to be relevant for understanding the phenomenon can produce better, more accurate and meaningful results. I will play around with some pre-selection of the data in that case.
That's what happens most of the time. That pre-selection usually ends up basically being implicit theorizing, though. This is fine if you have some sort of theory behind what you're looking at, but is problematic in e.g. economics, as I wrote a bit about here:
http://informationtransfereconomics.blogspot.com/2015/09/machine-learning-and-implicit-theorizing.html
"The left doesn't need to follow Chris Hayes advice and engage with Hayek, Friedman, and the rest of neoclassical economics. The left instead needs to engage with a real world vision of economics that recognizes its potential failures. Understanding economics in terms of information flow is one way of doing just that."
The left shouldn't at all follow the moron's advice, since neoclassicism is a counter-science and plain ideology invented to inhibit any effective understanding of capitalism. It is pure garbage and nonsense, apparently largely dismissed in the post-war period, but it came back on steroids in the 70s, when such an irrelevant tautological misconstruction was ideologically instrumental to the elites and their project, as Noam Chomsky has been telling, of lessening democracy, wages and growth and partially abandoning capitalism. M. Friedman, with his opportunism and insane faith, was the most useful propagandist, even getting the chance to implement his free-market dream in Chile, though with the help of the visible hand of dictatorship and mass murder.
Ironically Hayek, having been against the religion of engineers and therefore even against excessive use of statistics, maybe wouldn't entirely appreciate the information flow approach. He actually, for ideological reasons, never got capitalism, and though he also rejected misleading neoclassicism, he was a total mystic about markets, reaching in practice the same neoclassical conclusions on a purely cabalistic faith basis.
There is no doubt that the left should engage at last with a real-world vision of economics. The brainwashing has been so effective that most self-proclaimed leftists and liberals lack a personal grammar for understanding capitalism, so when they talk about world and economic facts they involuntarily use the same ideological neoclassical categories, producing comical effects and contradictory statements and proposals. Information theory, being an objective and minimally biased way of manipulating and mastering statistics, can be very helpful for developing an insight into world economic facts and also for becoming aware of the frequent distortions. Nevertheless, understanding capitalism would still demand something more in terms of critical point of view and logic.
As a postscript, the claim "A naive application of supply and demand (increase supply of labor lowers the price of labor) ignores the fact that more people means more people to buy goods and services produced by labor." is questionable.
It is more probable, especially in an environment of low growth, that an increase in labor supply causes a decline in the wage rate, which in turn could even cause a lower rate of accumulation.
M
On the last paragraph, empirical studies show no decline in wages.
But I find it odd that you decry neoclassical Econ in the first paragraphs but employ it in the last?
The concepts of demand and supply, like others, are not the prerogative of the mythological and ideological neoclassical narrative, which just abuses them.
Then, I stand by my opinion: any labor supply excess produces effects on nominal wages; they might not go down, but they won't go up either. It is not by chance that the NAIRU mythology has been religiously proclaimed for so long. Prohibiting the government from pursuing full-employment policies keeps power relations in order and wages down, eliminating possible risks of pressure on nominal wages and inflation. It is a political choice. The Center for Immigration Studies released an analysis in 2014 about the North Carolina job situation showing that the picture is not easy.
That the wage effect doesn't exhaust the immigration issue is, however, obvious.
M
I think I am something of an expert on Hayek.
https://mises.org/library/hayek-meet-press
Jason Smith really doesn't understand Hayek at all. The information imparted by prices is an approximation of what is going on inside the heads of the people doing the exchanges and not necessarily all the newly discovered technical information that might be collected. Such knowledge displayed by prices is not otherwise accessible or replaceable. The sophistication and complexity of new information outside of what is inside the heads of individual people is irrelevant to the Hayekian analysis. The interventionists simply cannot wrap their heads around the socialist calculation problem, or that economic crises, depressions and recessions result from the price distortions of Keynesian and/or monetarist interventionism and are a variation on the socialist calculation problem.
Further, Friedman is essentially a Keynesian and monetarism is a form of Keynesianism because both support violent intervention in the area of the government providing fiat funny money. Conflating Friedman with Hayek as "neo-classical" is like conflating Linda Ronstadt with The Beastie Boys.
https://www.auburn.edu/~garriro/fm2friedman.htm
Finally, “equilibrium” is not a concept generally used in Austrian School analysis.
Neither Jason Smith nor any of the other interventionists has any familiarity whatsoever with Austrian concepts or analysis, and now he's telling his compadres to be ever more vigilant in pursuing that level of total ignorance. What are you afraid of?
Bob Roddis not only doesn't really understand information theory at all, but talks about Jason Smith in the third person on his blog.
DeleteInformation theory is not about "newly discovered technical information that might be collected", but rather counting numbers of states, such as brain states "going on inside the heads of the people doing the exchanges".
As such, Bob Roddis in fact makes the point Jason Smith is trying to convey: the number of brain states (even measured crudely by e.g. EEG) is exponentially larger than the number of price states, and those brain states reside in a much higher-dimensional vector space than prices (which live in a one-dimensional space). Therefore any communication channel carrying information about the brain states that travels via the price must necessarily lose a great deal of information.
In the same way that a violation of the second law of thermodynamics lets you build perpetual motion machines, or traveling faster than light lets you travel back in time, Hayek's price mechanism (were it true) would allow you to send an entire HD movie over a 9600 baud modem in seconds. It violates the noisy channel coding theorem, and is probably equivalent to asserting P = NP.
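To put rough numbers on that (the movie size is an assumption for illustration): an HD movie of about 4 GB is roughly 3.2×10¹⁰ bits, and a 9600 baud modem moves on the order of 9.6×10³ bits per second, so the transfer would take about 3×10⁶ seconds, i.e. well over a month rather than seconds.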
Jason Smith offers a novel interpretation of Hayek's price mechanism that does not assert P = NP or violate Shannon's theorem, but it involves re-interpreting the price not as an information aggregator but as an information speedometer.
Bob Roddis is also apparently just a lawyer and as such probably hasn't had a lot of experience with vector spaces, probability theory, information theory, or entropy. One should probably include this information in one's Bayesian prior when considering whether Jason Smith or Bob Roddis has the more appropriate expertise regarding the argument at hand.
Additionally, Milton Friedman is a full-fledged member of the neoclassical synthesis, which is Keynesian macroeconomics plus neoclassical microeconomics. The discussion above is decidedly microeconomics, not macroeconomics. Whether or not Friedman had Keynesian views of macro is irrelevant to neoclassical microeconomics.
However, Jason Smith appreciates the change of pace in arguing with an Austrian economics acolyte about information rather than the usual Post-Keynesian belaboring of accounting.
"Bob Roddis not only doesn't really understand information theory at all, but talks about Jason Smith in the third person on his blog." Lol!