Saturday, May 13, 2017

Theory and evidence in science versus economics

Noah Smith has a fine post on theory and evidence in economics so I suggest you read it. It is very true that there should be a combined approach:
In other words, econ seems too focused on "theory vs. evidence" instead of using the two in conjunction. And when they do get used in conjunction, it's often in a tacked-on, pro-forma sort of way, without a real meaningful interplay between the two. ... I see very few economists explicitly calling for the kind of "combined approach" to modeling that exists in other sciences - i.e., using evidence to continuously restrict the set of usable models.

This does assume the same definition of theory in economics and science, though. However there is a massive difference between "theory" in economics and "theory" in sciences. 

"Theory" in science

In science, "theory" generally speaking is the amalgamation of successful descriptions of empirical regularities in nature concisely packaged into a set of general principles that is sometimes called a framework. Theory for biology tends to stem from the theory of evolution, which was empirically successful at explaining a large amount of the variation in species that had been documented by many people for decades. There is also the cell model. In geology you have plate tectonics that captures a lot of empirical evidence about earthquakes and volcanoes. Plate tectonics explains some of the fossil record as well (South America and Africa have some of the same fossils up to a point, after which they diverge because the continents split apart). In medicine, you have the germ theory of disease.

The quantum field theory framework is the most numerically precise amalgamation of empirical successes known to exist. But physics has been working with this kind of theory since the 1600s when Newton first came up with a concise set of principles that captured nearly all of the astronomical data about planets that had been recorded up to that point (along with Galileo's work on projectile motion).

But it is important to understand that the general usage of the word "theory" in the sciences is just shorthand for being consistent with past empirical successes. That's why string theory can be theory: it appears to be consistent with general relativity and quantum field theory and therefore can function as a kind of shorthand for the empirical successes of those theories ... at least in certain limits. This is not to say your new theoretical model will automatically be correct, but at least it doesn't obviously contradict Einstein's E = mc² or Newton's F = ma in the respective limits.

Theoretical biology (say, determining the effect of a change in habitat on a species) or theoretical geology (say, computing how the Earth's magnetic field changes) is similarly based on the empirical successes of biology and geology. These theories are then used to understand data and evidence and can be rejected if evidence contradicting them arises.

As an aside, experimental sciences (physics) have an advantage over observational ones (astronomy) in that the former can conduct experiments in order to extract the empirical regularities used to build theoretical frameworks. But even in experimental sciences, experiments might be harder to do in some fields than others. Everyone seems to consider physics the epitome of science, but in reality probably the only reason physics had a leg up in developing the first real scientific framework is that the experiments required to observe the empirical regularities are incredibly easy to set up: a pendulum, some rocks, and some rolling balls and you're pretty much ready to experimentally confirm everything necessary to posit Newton's laws. In order to confirm the theory of evolution, you needed to collect species from around the world, breed some pigeons, and look at fossil evidence. That's a bit more of a chore than rolling a ball down a ramp.

"Theory" in economics

Theory in economics primarily appears to be solving utility maximization problems, but unlike science there does not appear to be any empirical regularity that is motivating that framework. Instead there are a couple of stylized facts that can be represented with the framework: marginalism and demand curves. However these stylized facts can also be represented with ... supply and demand curves. The question becomes what empirical regularity is described by utility maximization problems but not by supply and demand curves. Even the empirical work of Vernon Smith and John List can be described by supply and demand curves (in fact, at the link they can also be described by information equilibrium relationships).

Now there is nothing wrong with using utility maximization as a proposed framework. That is to say there's nothing wrong with positing any bit of mathematics as a potential framework for understanding and organizing empirical data. I've done as much with information equilibrium.

However the utility maximization "theory" in economics is not the same as "theory" in science. It isn't a shorthand for a bunch of empirical regularities that have been successfully described. It's just a proposed framework; it's mathematical philosophy.

The method of nascent science

This isn't necessarily bad, but it does mean that the interplay between theory and evidence reinforcing or refuting each other isn't the iterative process we need to be thinking about. I think a good analogy is an iterative algorithm. Such an algorithm produces a result that is used to adjust some parameters or an initial guess, which is then fed back into the same algorithm. This can converge to a final result, but only if your initial guess starts off close enough to it. This is the case of science: the current state of knowledge is probably decent enough that the iterative process of theory and evidence will converge. You can think of this as the scientific method ... for established science.
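As a minimal sketch of that analogy, here is Newton's method applied to arctan(x), used purely as a stand-in for "the iterative process" and not as a model of anything in economics. The function name newton_arctan and the thresholds are illustrative assumptions. The same update rule is run from two different starting guesses.

```python
import math


def newton_arctan(x0, max_steps=20, tol=1e-10, blowup=1e6):
    """Run Newton's method for the root of arctan(x) = 0, which is x = 0.

    Returns (last_iterate, converged). This is only a stand-in for the
    theory/evidence feedback loop; the names and thresholds are assumptions.
    """
    x = x0
    for _ in range(max_steps):
        # Newton update x - f(x)/f'(x) with f = arctan and f' = 1/(1 + x^2).
        x = x - math.atan(x) * (1.0 + x * x)
        if abs(x) < tol:
            return x, True   # settled on the true answer
        if abs(x) > blowup:
            return x, False  # each pass made the guess worse
    return x, False


# Same algorithm, different starting points.
good_start = newton_arctan(0.5)  # starts close to the answer
bad_start = newton_arctan(2.0)   # starts too far away
print("start at 0.5 converged:", good_start[1])
print("start at 2.0 converged:", bad_start[1])
```

Under these assumptions the first run settles at zero within a few passes, while the second overshoots by more on every pass. Nothing about the update rule changed, only how good the starting point was.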

For economics, it does not appear that the utility maximization framework is close enough to the "true theory" of economics for the method of established science to converge. What's needed is the scientific method that was used back when science first got its start. In a post from about a year ago, I called this the method of nascent science. That method was based on a different metric: usefulness, rather than the model rejection used in established science. Here's a quote from that post:
Awhile ago, Noah Smith brought up the issue in economics that there are millions of theories and no way to reject them scientifically. And that's true! But I'm fairly sure we can reject most of them for being useless.

"Useless" is a much less rigorous and much broader category than "rejected". It also isn't necessarily a property of a single model on its own. If two independently useful models are completely different but are both consistent with the empirical data, then both models are useless. Because both models exist, they are useless. If one didn't [exist], the other would be useful.
Noah Smith (in the post linked at the beginning of this post) put forward three scenarios of theory and evidence in economics:
1. Some papers make structural models, observe that these models can fit (or sort-of fit) a couple of stylized facts, and call it a day. Economists who like these theories (based on intuition, plausibility, or the fact that their dissertation adviser made the model) then use them for policy predictions forever after, without ever checking them rigorously against empirical evidence. 
2. Other papers do purely empirical work, using simple linear models. Economists then use these linear models to make policy predictions ("Minimum wages don't have significant disemployment effects"). 
3. A third group of papers do empirical work, observe the results, and then make one structural model per paper to "explain" the empirical result they just found. These models are generally never used or seen again.
Using these categories, we can immediately say 1 & 3 are useless. If a model is never checked rigorously against data or is never seen again, it can't possibly be useful.

In this case, the theories represent at best mathematical philosophy (as I mentioned at the end of the previous section). It's not really theory in the (established) scientific sense.


Mathematical Principles of Natural Philosophy

Sometimes a little bit of mathematical philosophy will have legs. Isaac Newton's work, when it was proposed, was mathematical philosophy. It says so right in the title. So there's nothing wrong with the proliferation of "theory" (by which we mean mathematical philosophy) in economics. But it shouldn't be treated as "theory" in the same sense as in science. Most of it will turn out to be useless, which is fine if you don't take it seriously in the first place. And using economic "theory" for policy would be like using Descartes to build a mag-lev train ...


Update 15 May 2017: Nascent versus "soft" science

I made a couple of grammatical corrections and added a "does" and a "though" to the sentence after the first Noah Smith quote in my post above.

But I did also want to add the point that by "established science" vs "nascent science" I don't mean the same thing as many people mean when they say "hard science" vs "soft science". So-called "soft" sciences can be established or nascent. I think of economics as a nascent science (economies and many of the questions about them barely existed until modern nation states came into being). I also think that some portions will eventually become a "hard" science (e.g. questions about the dynamics of the unemployment rate), while others might become a "soft" science with the soft science pieces being consumed by sociology (e.g. questions about what makes a group of people panic or behave as they do in a financial crisis).

I wrote up a post that goes into that in more detail about a year ago. However, the main idea is that economics might be explicable -- as a hard science even -- in cases where the law of large numbers kicks in and agents do not highly correlate (where economics becomes more about the state space itself than the actions of agents in that state space ... Lee Smolin called this "statistical economics" in an analogy with statistical mechanics). 
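To make the role of the law of large numbers concrete, here is a small sketch under assumed toy conditions (Gaussian "actions" built from one shared shock and one private shock per agent). None of this comes from the earlier post; the function aggregate_spread and its parameters are hypothetical. It compares how much the average action fluctuates when agents are independent versus when they are substantially correlated.

```python
import random
import statistics


def aggregate_spread(n_agents, correlation, n_trials=300, seed=0):
    """Standard deviation of the average action across repeated trials.

    Each agent's action has unit variance and is a mix of a shock shared by
    everyone and a private shock; `correlation` is the pairwise correlation
    between agents. All names and numbers are illustrative assumptions.
    """
    rng = random.Random(seed)
    shared_w = correlation ** 0.5
    private_w = (1.0 - correlation) ** 0.5
    averages = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # one draw felt by every agent
        actions = [
            shared_w * shared + private_w * rng.gauss(0.0, 1.0)
            for _ in range(n_agents)
        ]
        averages.append(sum(actions) / n_agents)
    return statistics.stdev(averages)


independent = aggregate_spread(400, correlation=0.0)
herding = aggregate_spread(400, correlation=0.5)
print("spread of the average, independent agents:", round(independent, 3))
print("spread of the average, correlated agents: ", round(herding, 3))
```

In this toy setup the average over independent agents is far steadier than any single agent, so the aggregate is close to predictable without knowing what individuals do. With a shared shock, adding more agents does not average the fluctuation away.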

I think for example psychology is an established soft science. Its theoretical underpinnings are in medicine and neuroscience. That's what makes the replication crisis in psychology a pretty big problem for the field. In economics, it's actually less of a problem (the real problem is not the replication issue, but that we should all be taking the econ studies less seriously than we take psychology studies).

Exobiology or exogeology could be considered nascent hard sciences. Another nascent hard science might be so-called "data science": we don't quite know how to deal with the huge amounts of data that are only recently available to us and the traditional ways we treat data in science may not be optimal.


  1. I think this ties in nicely with the common counterargument economists make against the type of curve fitting you do: namely, that the maximisation paradigm allows economists to understand individual behaviour (a version of the Lucas Critique). I think this argument is somewhat inconsistent with the 'as-if' argument: if you only care about prediction, then you should do it parsimoniously. Utility maximisation quite clearly isn't a literal description of behaviour so I'm not sure by how much it advances our understanding. But perhaps there is something to the argument that we should look at the people underlying the data, as well as using the data to do science.

    1. I agree that building up models of humans interacting (e.g. agent based approaches) is a perfectly fine course of action.

      It may even turn out that "rational utility maximization" is a decent effective theory in certain scenarios. I wrote something about this a while ago discussing it with David Glasner ... RUM might be fine near a macro equilibrium, but if you deviate from EQ the approximation may fail.

      The thing is that the RUM framework does not appear to have the right "scope conditions" to say: "you can use RUM here but not here".

      Understanding the human behaviors underlying the data is one possible way to potentially understand those scope conditions. I can envision a possible scenario where RUM only works as an approximation when nothing "interesting" is happening in the economy.

    2. Also the so-called "Lucas critique" itself is really just mathematical philosophy ... it's probably one of the best examples. From it are born those expectations operators E_t, but a lot of mathematical work should be going into what those operators mean (in the rational expectations model, they seem to serve as time translation operators).

      Brad DeLong had a post talking about "neo-Fisherism" where he makes a list of things that make a "good economist". In that list he has an item that I think points out that the math is just mathematical philosophy:

      "5. Accurately and sensibly evaluate whether the conclusions of the toy model are in fact robust and generalizable to our real world."

      DeLong is basically talking about the neo-Fisherite use of those expectations operators. Michael Woodford confirmed that you can get that same neo-Fisher behavior mathematically, so I don't think this is a Sokal hoax as DeLong puts it. What we have is a situation where those E-operators lead to a result that is really weird. DeLong says a "good economist" would "sensibly evaluate" whether or not the result "is robust" and applicable to "the real world". That is to say: philosophize using math.

      And that is fine, but really what should happen is that the result should cause you to question those E-operators rather than making up some ad hoc explanation for why you can't use them to obtain this particular result.

    3. I'm not 100% sure my last comment wasn't just rambling.

    4. Utility maximisation is useful for auction theory and matching; and possibly on a case by case basis for some applied problems (I'm thinking of Chris Carroll's paper on 'why do the rich save so much?'). But in general its use in economics has wildly exceeded its scope of usefulness.

    5. Yes, I've also seen some applications where it seems like it might be useful. However not understanding where and when it is useful (i.e. the scope of utility maximization) is the issue.

      "Established" sciences can make those determinations (physicists can even tell you specifically that their theories break down at distance scales near 10^-31 to 10^-35 meters). But generally you probably shouldn't use e.g. evolution in an argument for something that happens during a time period that is short compared to the length of a "generation" nor make arguments using strata and time in geology for a place that didn't have any water.

      That is to say we really need to understand utility better in order to understand where it fails as an "effective theory".

    6. "But in general its use in economics has wildly exceeded its scope of usefulness."

      I don't know how one would evaluate that claim, I really don't know why you would make such a claim unless you have read very widely and have some basis for concluding Carroll is exception not rule. My take is that claim is very wrong.

      lots of models need an engine, they need purposeful behaviour, so you need to write down an objective function and you need to perform some sort of operation on it. Sorry if my terminology is inaccurate, try to run with me here.

      so suppose you want to understand structural change during development process, you need workers and investors to move across sectors as time passes and conditions, prices etc. change. writing down a simple utility function and getting your economic agents to maximize it is just the easiest way to do that. And sure yes of course people have far more complicated objectives and they don't really maximise, etc. etc. but often that's beside the point - you've written down your model in an attempt to understand something else - I don't know, let's say the interaction of openness to trade and structural change or something - and coming up with a more realistic model of human behaviour would just massively complicate things and (probably) just get in the way of understanding what you are trying to understand. If (and I am making this up) the data suggests that openness to trade retards structural change under some initial conditions and accelerates it under others, and your model provides a mechanism that would explain that, maybe predicts a few other things we ought to observe if that mechanism is at work, and they seem to check out empirically too, well there you go, you might have a nice model that's adding something to our understanding of the world. I contend there's very many models that use u-max like that.

      of course it might be the case that systematic deviations from utility max, if you are capable of modeling them, *might* produce a materially different result in this model, and if so, well OK, you need to incorporate that or your model is 'wrong', but otherwise all that stuff is just noise, metaphorically your model is tracing out a line in 2D space, the 'truth' looks more like a cloud of data on a scatter plot, but if your line is a good fit, you're golden, and criticism that your model fails to include things that might explain away some of the residual error, are beside the point.

      so my contention is that the scope of utility-max is enormous, there are thousands and thousands of applied problems where you can go a long way with that simple engine of purposeful behaviour.

      Why just take that Carroll paper, why not also his buffer stock saving work, also based on u max. etc. etc.

    7. Luis,

      You seem to be making (a decent) argument for optimization, but not necessarily "utility" optimization. If you sort of abandon any particular definition of utility to mean "whatever function is maximized" then it's almost tautological. Any equilibrium will be a stationary point of some mathematical object.

      The issue is that utility max represents a particular form of human behavior -- one that doesn't hold up very well when tested empirically on individuals. Maybe it holds as an effective theory for ensembles of individuals. Maybe it's just generally wrong. Maybe it's only right in certain limits. The point is that it should capture empirical successes if it's going to be your framework.

      But when you say:

      "... writing down a simple utility function and getting your economic agents to maximize it is just the easiest way to do that. "

      It is not the easiest. Entropy maximization is the easiest way to do that and it makes limited assumptions about the underlying behavior. Gary Becker even derived demand curves from entropy max (though he didn't call it that) in his 1962 paper on irrational behavior. That's also the approach on this blog :)

    8. Also one more thing: by scope I mean something more technical than the realm of possibilities. I mean the realm of empirical validity.

      The pertinent question is whether you can say ahead of time whether a utility max approach will fail or not. If you don't know that, you don't yet understand the scope of the approach.

    9. thanks. i shall have to find the time to read up on your entropy approach.

      meanwhile, I suppose yes, not necessarily utility maximization, although if you want an objective where workers may care about, say, consumption and leisure, combining those things in a utility function seems a fairly natural way to go.

      I am not 100% confident I understand what you mean by scope, but from what I've read it wouldn't surprise me if the scope of much extant economics was not clear.

    10. Jason, yes I completely agree on scope conditions.

      Luis, alternatives to utility maximisation include: (a) simple hydraulic functions as used in SFC models (b) Gigerenzer's 'heuristics' rule (c) Jason's information transfer stuff!

      One of my issues with utility maximisation is a version of Rabin (2000): sometimes it provides useful predictions in one domain, but as soon as you apply the same model/functional form to other domains, it is very wrong. This paper summarises the issue nicely in the context of the life cycle framework:

      Utility maximisation is the default paradigm for economists, but that isn't evidence that it's right. Your comment just shows that *you* consider it useful in many contexts, but it doesn't give me any reason to prefer it empirically.

  2. I get the impression this post is getting retweeted a lot as if it's a take down of economics, but I don't read it that way. It fits with how I understand theory within economics - which is often a very limited, 'parable' or single-mechanism illustrating, sort of thing, and not "our best and most complete understanding of how everything works suitable for drawing policy conclusions from" - in fact I think many critics of economics mistakenly think economists are doing the latter when they are doing the former. And - like most everyone else - I would like to see more synthesis and more systematic work on 'model selection' for when models are needed for policy purposes - I mostly think the way economists do things mostly reflects the nature of the problems they face, not a deficiency in economists. Although - again agreeing with most everyone here - economists are guilty of leaping from toy model to policy conclusion without sufficient caveats.

    I cannot recall - sorry I do read your blog but not necessarily every post and my memory is poor - if you have commented on this line of work (i.e this paper and others by same authors):

    1. I personally didn't intend it as a "take down" of economics, but rather more of a "take down" of comparing econ to science.

      I said a couple times above that mathematical philosophy is fine, and I actually think the utility maximization framework might well be a good approximation in some cases.

      The thing is that I don't necessarily think treating econ as you treat established science will quite work in the same way -- it mostly comes down to the fact that model rejection is too weak a criterion for economics.

      Science didn't start off with any metrics for model rejection. Galileo didn't test statistical significance. There weren't error bars on the astronomical data. The question was whether it was useful. And Newton's laws and Kepler's elliptical orbits let you aim your cannons and get the calendar right. That is to say the mathematical philosophy was useful.

      RE: the Postlewaite paper

      I do remember reading that paper, but I don't think I remember writing anything about it (I tried a couple of searches on my blog).

      That paper is a fine work of mathematical philosophy (in fact, it reminds me of some of the old formal logic papers I read when I was an undergrad math major). I think that is their claim: that econ need not be "scientific" or produce "predictions" ... it can serve as a critique of bad reasoning used to motivate policy.

      I think that paper's claim is basically contained in what Keynes said about economics in his letter to Roy Harrod:

      "It seems to me that economics is a branch of logic, a way of thinking; and that you do not repel sufficiently firmly attempts à la Schultz to turn it into a pseudo-natural-science. One can make some quite worthwhile progress merely by using your axioms and maxims. But one cannot get very far except by devising new and improved models. This requires, as you say, "a vigilant observation of the actual working of our system". Progress in economics consists almost entirely in a progressive improvement in the choice of models. The grave fault of the later classical school, exemplified by Pigou, has been to overwork a too simple or out of date model, and in not seeing that progress lay in improving the model; whilst Marshall often confused his models, for the devising of which he had great genius, by wanting to be realistic and by being unnecessarily ashamed of lean and abstract outlines.

      But it is of the essence of a model that one does not fill in real values for the variable functions. To do so would make it useless as a model. For as soon as this is done, the model loses its generality and its value as a mode of thought."

      And this is fine, but if this is the aim it really should come with a warning label: economics is philosophy, and should be taken with the same weight as Rawls or Kant.

      I personally think the phenomena that fall under the study of economics can be analyzed scientifically and that retreat into logic and philosophy away from empirical successes and failures (of which prediction is just one manifestation) will just make it less relevant to people not more.

