Wednesday, August 12, 2015

Explicit implicit models

Not sure why I am doing this, but I thought it might be helpful to see "explicit" implicit models and how they frame the data. This is in regard to this (ongoing) discussion with Mark Sadowski.

Let's take (the log of the) monetary base (MB) data from 2009 to 2015 and fit it to two theoretical functions. One is a line (dashed) and the other is a series of three Fermi-Dirac distribution step functions (solid):
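This kind of two-model fit can be sketched as follows. This is a minimal illustration with synthetic data standing in for log MB (the post uses FRED data, which isn't reproduced here); the step centers, heights, and widths are made up:

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic stand-in for log MB: a noisy staircase. The real post fits
# FRED monetary base data from 2009-2015; these parameters are invented.
rng = np.random.default_rng(0)
t = np.arange(72, dtype=float)  # months
signal = 7.5 + sum(0.3 / (1 + np.exp(-(t - t0) / 1.5)) for t0 in (10, 30, 50))
y = signal + rng.normal(0, 0.01, t.size)

def line(t, a, b):
    """Linear theory: log MB = a*t + b."""
    return a * t + b

def steps(t, c, h1, t1, h2, t2, h3, t3, w):
    """Step theory: a constant plus three Fermi-Dirac (logistic) steps."""
    out = c + 0 * t
    for h, t0 in ((h1, t1), (h2, t2), (h3, t3)):
        out = out + h / (1 + np.exp(-(t - t0) / w))
    return out

p_line, _ = curve_fit(line, t, y)
p_steps, _ = curve_fit(steps, t, y, p0=[7.5, 0.3, 10, 0.3, 30, 0.3, 50, 2])

# Residual sum of squares for each theory:
rss_line = np.sum((y - line(t, *p_line)) ** 2)
rss_steps = np.sum((y - steps(t, *p_steps)) ** 2)
```

On data that actually contains steps, the step model's residuals come out much smaller than the line's, which is the situation in the level plot above.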

The first difference of the data (yellow), theoretical line (dashed) and theoretical steps (solid, filled) are shown in this plot:

If we expect a linear model, we can see the data as fluctuations around a constant level. If we expect the step model, we can see the data as fluctuations around three "pulses". It's not super-obvious from inspection that either of these is the better model of Δ log MB. The Spearman test for correlation [1] of the first differences is -0.07 (p = 0.53) for the line and 0.51 (p = 2e-6) for the steps.  The steps win this test. However if you use the data instead of the theoretical curves to compare to other variables, you can't actually conduct this test so you don't know which model of Δ log MB is best.
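The Spearman comparison of first differences can be sketched like this (again with a synthetic, made-up staircase standing in for the data; only the step model is shown):

```python
import numpy as np
from scipy.stats import spearmanr

# Synthetic stand-in for the log MB data and the fitted step curve.
rng = np.random.default_rng(1)
t = np.arange(72, dtype=float)
step_curve = 7.5 + sum(0.3 / (1 + np.exp(-(t - t0) / 1.5))
                       for t0 in (10, 30, 50))
data = step_curve + rng.normal(0, 0.01, t.size)

# Rank correlation between the first differences of the theoretical
# curve and the first differences of the data:
rho, p = spearmanr(np.diff(step_curve), np.diff(data))
```

A significant positive rho with a small p-value is what "the steps win this test" means above.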

Now let's assume a linear model between the (log of the) price level P and log MB and fit the two theories:

Again, the first differences (data = yellow, line theory = dashed, step theory = filled):

Although it wasn't obvious from the difference data which model of Δ log MB was better, it's now super-obvious which model of Δ log MB is the better model of Δ log P (hint: it's the line). The Spearman test for correlation of the first differences is 0.23 (p = 0.04) for the line and 0.006 (p = 0.95) for the steps (i.e. the line is correlated with the data). This would imply that:

• If you believe the linear theory of log MB, then log MB and log P have a relationship.
• If you believe the step theory of log MB, then log MB and log P don't have a relationship.

This is what I mean by model dependence introduced by the underlying theory. If you think log MB is log-linear, you can tease a relationship out of the data.

Now if you go through this process with (the log of) short term interest rates (3-month secondary market rate), you end up with something pretty inconclusive on its face:

You might conclude (as I originally attributed to Mark Sadowski; see the correction in the comments below) that short term interest rates and the monetary base don't have a relationship. The Spearman test for correlation of the first differences says otherwise; it gives us -0.09 (p = 0.44) for the line and 0.29 (p = 0.01) for the steps (i.e. the steps are correlated with the data).

...

However, Mark left off the first part of QE1 in his investigation -- he started with Dec 2008.  So what happens if we include that data? It's the same as before, except now we use 4 Fermi-Dirac step functions for the step model:

Note that the linear model already looks worse ... here are the first differences:

The Spearman test for correlation of the first differences is -0.03 (p = 0.77) for the line and 0.59 (p = 6e-10) for the steps (i.e. the steps are correlated with the data).

The step theory (filled) captures many more of the features of the data (yellow) than the linear model (dashed). The price level first differences pretty obviously follow the line, and pretty obviously not the steps:

The Spearman test for correlation of the first differences says both are uncorrelated with the data; it gives -0.01 (p = 0.93) for the line and 0.04 (p = 0.72) for the steps (i.e. neither are correlated with the data).

But the really interesting part is in the (log of the) short term interest rates:

In the first differences, you can see the downward pulses associated with each step of QE:

The Spearman test for correlation of the first differences is -0.1 (p = 0.36) for the line and 0.25 (p = 0.02) for the steps (i.e. the steps are correlated with the data).  Actually, in the plot above there seems to be a market over-reaction to each step of QE -- rates fall too far, but then rise back up. The linear theory just says it's all noise.

So the results of the 4 step model from 2008 to 2015?

• log MB is not related to log P
• log MB is related to log r

But remember -- all of these results are model-dependent (linear vs steps).

Footnotes:

[1] I used Spearman because Pearson's test assumes Gaussian errors, and for some of the data the errors weren't Gaussian. Mathematica automatically selected Spearman for most of the tests, so I decided to be consistent.

1. Jason, this is pretty interesting. One thing I don't understand is this sentence:

"However if you use the data instead of the theoretical curves to compare to other variables, you can't actually conduct this test so you don't know which model of Δ log MB is best."

Also, did you do any procedure to find a lag? The 7th plot looks like there's a definite lag between the yellow curve and the blue curve, especially on that 1st spike (but actually it's evident on the next two spikes as well... the blue curve leads the yellow, but about the same amount in all three cases).

Why did you choose to color inside the solid blue curve with blue? How did you define the horizontal line to stop coloring at? Was that just y=0?

Plot 8 shows the solid blue steps going down... does that just mean you took the four Fermi-Dirac steps fit to log MB in plot 6, and then scaled them and added a constant (a two parameter OLS fit) to fit them to the short term interest rate data? Solving this, where all are column vectors

min over s and c || r - [f 1]*[s c]' ||^2

Where r is the vector of log interest rates, f is the vector of samples of the Fermi-Dirac approximation (previously fit to MB), 1 is a vector of 1s, and s and c are scalars (representing the scale factor and constant offset to f respectively)?

Of course you must have done the same thing previously when you just had three Fermi-Dirac steps in fitting it to the log P data.

Then you do your correlation check after this kind of fit, correct? I guess you could have resampled the Fermi-Dirac approximations too if you'd needed to in order to line up with sample times for either r or P that did not coincide with sample times for MB (I don't know if you did that).

How far off am I on the above?

1. Hi Tom,

"One thing I don't understand is this sentence ..."

I effectively created two 'idealized data sets': a line and some steps (MBline, MBstep). I then compared these two idealized data sets (MBline, MBstep) to the original data (MB). I then used MBline and MBstep to look at MBline vs P and MBstep vs P, etc. I could also compare MBline vs MB and MBstep vs MB. When Mark does his analysis, he doesn't explicitly create a model of MB -- he compared the MB data vs P.

If you compare the original data to itself (MB vs MB), it would line right up and you wouldn't learn anything. You can't do the analyses MBline vs MB and MBstep vs MB ... which is the meaning of that sentence.

The Granger causality tests implicitly create a bunch of linear models (that are translated in time by the lags). I wanted to show you could get different results if you assumed a single line (explicit implicit model) vs assuming a series of steps (a different explicit implicit model).

...

I didn't look for lags in the step model (because I was lazy), and the locations of the steps weren't constrained to the grid points (monthly data), so they appear off because of the sampling. For the linear model, a single lag is chosen implicitly by the slope/intercept. Take a lag of 1 ...

y = m (x - 1) + b = m x - m + b = m x + b'

with b' = b - m

...

The coloring was chosen to go to the axis because in the graph of delta log MB, the curves lined up almost too well to be seen clearly.

...

I fit the function f(t, {p}) where p are the parameters fit to the log MB data. To fit the interest rate and price level data, I fit

a f(t, {p}) + b

to the respective data using the fit parameters {p} from the original fit to MB, only fitting a and b.

2. "The Granger causality tests implicitly create a bunch of linear models..."

So was your single line fit meant to be an (explicit) approximation of what's implicitly going on in a Granger test?

3. This comment has been removed by the author.

4. Given the Granger test (more specifically the T&Y procedure, say) determined model parameters and lags, etc, is it possible to more directly use the resultant "bunch of linear models" in a comparison like you do in the above (computing the Spearman rho, etc) rather than a single line approximation? I've been trying to imagine how that would work, but my over-taxed brain can't connect all the dots. Maybe it's a fool's errand. (c:

5. Yes, it would involve looking at a multi-dimensional space containing data set 1 and data set 2 and finding the multi-dimensional linear transformation (a matrix) that best maps one to the other (given some restrictions for causality, i.e. a triangular matrix -- each data point could be caused by all the previous ones, but not by following ones).
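A simplified sketch of that idea: if the causal map is assumed to be the same at every time (a banded lower-triangular Toeplitz matrix), it reduces to a distributed-lag regression. All the series and coefficients below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, L = 200, 2  # series length and maximum lag
x = rng.normal(size=n)

# Hypothetical "caused" series: current plus one-lagged x plus noise,
# i.e. y_t depends only on x_t and earlier values (causal structure).
y = 0.8 * x + 0.5 * np.concatenate([[0.0], x[:-1]]) + rng.normal(0, 0.1, n)

# Design matrix of lagged copies of x (columns = lags 0..L); fitting it
# is equivalent to fitting a banded lower-triangular Toeplitz matrix.
X = np.column_stack([np.concatenate([np.zeros(k), x[:n - k]])
                     for k in range(L + 1)])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The recovered coefficients at each lag are the nonzero band of the triangular matrix; the general case would let them vary with time.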

6. Thanks Jason. You ever check your request box on the right? I think I'll send one.

2. It was not so much a "discussion" as a battle between two 500 foot tall mega-robots for the streets of Tokyo, laying waste to much of the city and countryside in the process. However, with this post you strapped him (metaphorically speaking) to a nuclear missile and detonated him in a low Earth orbit (so to speak).

1. Ha!

What's funny is that I actually don't entirely disagree with the result as I wrote about here:

The order of magnitude of the effects Mark finds are consistent with a view that monetary policy became about ten times less effective immediately after the financial crisis. It's consistent with an IT index going from 0.6 to 0.9.

3. Jason, after 1st differencing, were you concerned with stationarity at all? Is the idea that the 1st difference was supposed to render the various sequences above stationary, or at least true data points?

1. All of the series are almost but not entirely unlike tea. I mean, fail to reject the null hypothesis of stationarity ...

2. tea? ... I'm missing the joke there.

3. From the Hitchhiker's Guide ...

https://en.wikiquote.org/wiki/The_Hitchhiker%27s_Guide_to_the_Galaxy#Chapter_17

Generally, statistics has a way of making its technical language into an unnecessarily confusing chain of double negatives.

"the test on X fails to reject the null hypothesis of stationarity"

i.e. "X is stationary"

"the test on X rejects the null hypothesis of stationarity"

i.e. "X is non-stationary"

4. Lol... yes, I find I have to put my "speed reading" to the side when deciphering those sentences.

4. I found this to be an interesting overview of techniques:
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm

1. Seems like a good reference.

5. O/T, does the ITM have anything to say about China recently? As I recall it's one of the countries whose information transfer coefficient puts it squarely in the QTM region.

1. Nothing specific -- it is a QTM country. So devaluing their currency relative to the US means there will be inflation and increased output. Just like normal.

7. A "stationary" series is *stochastic* process. The probabilistic counterpart of a stochastic process is a *deterministic* process.

A line is a deterministic process. So is a Fermi-Dirac distribution, which is simply an example of a logistic function. And if you splice three (or more) logistic functions together you still have a deterministic process.

Deterministic processes cannot be rendered stochastic by differencing them, so they cannot be made stationary. (And unit root tests will of course produce a "near singular matrix" error message.)

Correlation can only be meaningfully computed for stationary series. This is true even if the measure of correlation is Spearman's rho.

In other words, every single one of the p-values reported in this post is invalid, since the corresponding Spearman's rho values each involve at least one nonstationary series.

Jason:
"But remember -- all of these results are model-dependent (linear vs steps)."

All of these results are yet more examples of Jason's time series derpometrics.

Jason:
"You might conclude (as Mark Sadowski does) that short term interest rates and the monetary base don't have a relationship."

I've never said any such thing. In fact the monetary base Granger causes the three-month Treasury bill yield at the 5% significance level in the Age of ZIRP.

What I said was there is "no credible mechanism by which short term interest rates directly and significantly impact the economy".

There is no *short term interest rate channel of monetary transmission* in any monetary textbook, because no such thing exists.

P.S. When 3-month T-Bills are added to the baseline VAR, they have no significant effect on output or inflation in any month.

1. Imagine the limit of a stochastic process with drift as the variance goes to zero.

That this is acceptable is backed up by the fact that the results here are unchanged by adding a bit of noise.

As a physicist everything is the limit of stochastic processes :)

If short term interest rates aren't correlated with the economy while the monetary base is correlated with the economy, it stands to reason that there shouldn't be a strong correlation between the monetary base and short term rates. But I guess you show there is in some new calculation?

I will stand corrected on that, and will correct the above post.

Question: do 3-month rates Granger-cause the monetary base?

2. Mark, you wrote:

"A "stationary" series is *stochastic* process. The probabilistic counterpart of a stochastic process is a *deterministic* process."

Is that what you meant to write? I would have thought that a stochastic process would be the probabilistic counterpart to a deterministic process.

3. Matched filters work best on (correlating) deterministic signals. It's true they are intended to be robust to additive noise, but the less noise there is, the better they work (i.e. the more meaningful is their output).

4. Jason:
"That this is acceptable is backed up by the fact that the results here are unchanged by adding a bit of noise."

By themselves graphs are not sufficient evidence of anything.

Jason:
"If short term interest rates aren't correlated with the economy while the monetary base is correlated with the economy, it stands to reason that there shouldn't be a strong correlation between the monetary base and short term rates."

Why? The monetary base is strongly correlated to bank deposits and bank credit and yet there isn't a correlation between bank deposits or bank credit and the economy in the age of ZIRP.

Jason:
"Question: do 3-month rates Granger-cause the monetary base?"

No, the p-value is 22.0%.

5. Tom:
"Is that what you meant to write?"

Yes.

Tom:
"Matched filters work best on (correlating) deterministic signals."

Matched filters work via cross correlation, and cross correlation refers to the correlation between stochastic signals.

6. Mark,

"By themselves graphs are not sufficient evidence of anything."

I did everything I did above but with added noise (~ 1%) and it came out the same to the accuracy of the numbers reported above. The graph was just a sample because I am lazy and didn't want to just repeat everything. Your argument didn't hold after checking it, so there really isn't any reason to say anything more than that. But I'll post a link to a pdf of the notebook when I get a chance.
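A sketch of that robustness check, with synthetic stand-in data and made-up step parameters: redo the first-difference Spearman comparison after perturbing the idealized curve with ~1% noise and see that the conclusion survives.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
t = np.arange(72, dtype=float)
step_curve = sum(0.3 / (1 + np.exp(-(t - t0) / 1.5)) for t0 in (10, 30, 50))
data = 7.5 + step_curve + rng.normal(0, 0.01, t.size)

# Original comparison: idealized step curve vs. data, first differences.
rho_clean, _ = spearmanr(np.diff(step_curve), np.diff(data))

# Same comparison with ~1% (of the curve's range) noise added to the
# idealized curve, turning it into a (weakly) stochastic series.
noisy = step_curve + rng.normal(0, 0.01 * np.ptp(step_curve), step_curve.size)
rho_noisy, _ = spearmanr(np.diff(noisy), np.diff(data))
```

Both correlations come out positive; the added noise doesn't flip the qualitative result.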

"No, the p-value is 22.0%."

Interesting. Are you comparing log MB with log r, or log MB with r (r = interest rate)?

Regarding Tom's question I think you might have messed up the sentence there (I don't think it's a big deal since I understood what you were trying to say). A stochastic process is the probabilistic counterpart to a deterministic process.

7. This comment has been removed by the author.

8. Jason:
"I did everything I did above but with added noise (~ 1%) and it came out the same to the accuracy of the numbers reported above."

Did you perform unit root tests to make sure differencing them makes them stationary? It's very likely that they are not integrated of order one.

Jason:
"Interesting. Are you comparing log MB with log r, or log MB with r (r = interest rate)?"

With r, not ln r.

In applied macroeconomics, interest rates are almost never logged.

In fact, in time series analysis in general, the rule of thumb is to only log those series which are expected to grow exponentially (e.g. NGDP, population etc.) but not those series expected to fluctuate around a fixed level (e.g. unemployment rates, interest rates etc.).

And, interest rates can be nonpositive, for which values the log will of course be undefined.

Furthermore, some master econometricians argue that the primary benefit of logging a series is that it stabilizes the variance. But if you take the log of an interest rate series near zero in value, variance becomes less stable, not more. So logging interest rates is, if anything, counterproductive.

Jason,
"Regarding Tom's question..."

9. "Did you perform unit root tests to make sure differencing them makes them stationary?"

Yes.

"In applied macroeconomics, interest rates are almost never logged. ... "

Of course that makes sense, but there are two reasons to look into the logged versions:

1) Interest rates change by a couple orders of magnitude (from ~ 1% to ~ 0.01%) at the onset of QE. They're not just fluctuating around a fixed level.

2) There seems to be a pattern across several countries where you can write a function a log MB + b ~ log r

US: Graph on FRED