I've done a bit of site redesign: changing out some of the graphics, updating the blog description, and adding a set of random posts (instead of popular posts) to the sidebar. This post serves mainly as an explanation of the new diagram appearing in the upper right of this blog when viewed with a desktop browser (it also appears schematically in miniature as the favicon).

Imagine supplying pints of blueberries to various stores (our "space" coordinate). Blueberries go bad if left on the shelf too long, so each unit of supply and each unit of demand represents a single square on a space-time grid. A person shows up at store #4 (space grid point 4) at time t = 2 (time grid point 2) wanting to buy blueberries, represented on the grid as a white translucent box. If there are blueberries available (represented by a blue cube), then the blue cube and the white box match up and a transaction occurs. If not, the blueberries go to waste or some demand goes unsatisfied. Here is a basic picture:

This is an allocation problem where we are trying to match up the blue cubes with the white boxes. I labeled the blue and white boxes destination and source respectively because they represent an information destination and an information source ... what does this have to do with information, you ask?

I'm going to simplify this picture a bit by saying the blueberries don't go bad ... so we can add up all the boxes along the time direction (integrate over time) like this:

This creates a bar chart that is essentially a probability distribution:
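As a minimal sketch of this time integration (the grid numbers below are hypothetical, not read off the figure), summing the demand events at each store over the time axis and normalizing gives exactly such a distribution:

```python
# Hypothetical 3x5 space-time grid of demand events (rows = time, cols = stores)
grid = [
    [1, 0, 2, 0, 1],
    [0, 1, 1, 1, 0],
    [2, 0, 0, 1, 1],
]

# Integrate over time: column sums give total demand at each store
totals = [sum(row[s] for row in grid) for s in range(len(grid[0]))]

# Normalize into a probability distribution over the space coordinate
n = sum(totals)
dist = [t / n for t in totals]
print(dist)  # each entry is the fraction of total demand at that store
```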

Matching these two probability distributions is the least restrictive constraint on the solution to the allocation problem. In communication systems, one effectively tries to build a channel that gets the distribution of symbols (e.g. letters in the English alphabet or binary 1's and 0's) right on the input side and the output side; this is the basis of information theory [1].

Now you can't sell blueberries if no one comes to the store to buy them, so any difference between the destination distribution (blue) and the source distribution represents a loss ... a loss of the information available in the source distribution. In information theory, this loss can be measured via the Kullback-Leibler divergence (calculated in the graph). In general, *I(destination) ≤ I(source)*. Assuming equality is called ideal information transfer or information equilibrium. It turns out to be equivalent to assuming ideal markets where there is no excess supply or demand -- the probability distributions are equal.

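For concreteness, here's a minimal sketch of that loss measure with made-up source and destination distributions; the Kullback-Leibler divergence vanishes exactly when the two distributions are equal (information equilibrium):

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

source = [0.4, 0.3, 0.2, 0.1]           # hypothetical demand (source) distribution
destination = [0.25, 0.25, 0.25, 0.25]  # hypothetical supply (destination) distribution

print(kl_divergence(source, destination))  # positive: information is lost
print(kl_divergence(source, source))       # zero: information equilibrium
```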
You may be asking where the price comes into this. Well, if the blue (*B*) and white (*W*) boxes stay the same, then the framework is completely agnostic as to what the price is. All we can say is that the price is some constant value, call it *p0*, that we'd have to measure empirically. But if the allocations change by small amounts *dB* and *dW*, we can look at how the price fluctuates with changes in supply and demand. In this framework, *p = dW/dB* ... the change in demand given a change in supply.

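As a toy numerical illustration of *p = dW/dB* (all the numbers here are hypothetical), the price is the rate at which demand events trade off against supply events between two nearby allocations:

```python
# Hypothetical demand (W) and supply (B) allocations at two nearby times
W0, B0 = 100.0, 50.0  # baseline totals of white and blue boxes
W1, B1 = 104.0, 52.0  # after a small change in the allocation

dW = W1 - W0  # small change in demand
dB = B1 - B0  # small change in supply

p = dW / dB   # the price: change in demand per unit change in supply
print(p)
```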
Now you can change some of the details, but the general principle of looking at the information in the distributions that solve the allocation problem -- and how they have to change when either the source or destination changes -- is the framework set up by the information equilibrium (information transfer) model.

The framework is a somewhat more general (i.e. less restrictive) take on utility maximization and search/matching theory, where one tries to solve for the optimal allocation (more on that here).

**Footnotes:**

[1]

**Added 5/12/2015**: Borrowing from Shannon (1948), the probability distribution of letters in English evolves as we add dimensions:

Symbol frequency in English (1D categorical distribution)

First-order approximation (letters drawn independently with English frequencies; Shannon 1948):

OCRO HLI RGWR NMIELWIS EU LL NBNESEBYA TH EEI ALHENHTTPA OOBTTVA NAH BRL

Third-order (trigram) approximation (Shannon 1948):

IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID PONDENOME OF DEMONSTURES OF THE REPTAGIN IS REGOACTIONA OF CRE
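The first-order sample above can be reproduced in spirit by drawing letters independently from a frequency table. The abbreviated frequencies below are illustrative (rough approximations of English, not the full table Shannon used):

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Abbreviated, approximate English symbol frequencies (illustrative only)
freqs = {'E': 0.127, 'T': 0.091, 'A': 0.082, 'O': 0.075, 'I': 0.070,
         'N': 0.067, 'S': 0.063, 'H': 0.061, 'R': 0.060, ' ': 0.190}

letters = list(freqs)
weights = [freqs[c] for c in letters]

# First-order approximation: each symbol drawn independently
sample = ''.join(random.choices(letters, weights=weights, k=40))
print(sample)
```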

"A person shows up at store #4 (space grid point 4) at time t = 2 (time grid point 2) wanting to buy blueberries, represented on the grid as a white translucent box. If there are blueberries available (represented by a blue cube), then blue cube and white box match up and a transaction occurs."

Whether we start counting from 0 or 1 at the apparent 1st square in either dimension, this appears not to be the case.

Assuming we count from 1, then t=2, s=4 has one pint of blueberries, but no demand. Counting from 0 is a square with neither supply nor demand.

OK, not very important perhaps, but I like to check these things.

You write:

"I'm going to simplify this picture a bit by saying the blueberries don't go bad"

... and that demand is persistent as well.

"In information theory, this loss can be measured via the Kullback-Leibler divergence (calculated in the graph). In general, I(destination) ≤ I(source). Assuming equality is called ideal information transfer or information equilibrium. It turns out to be equivalent to assuming ideal markets where there is no excess supply or demand -- the probability distributions are equal."

However, simplifying to two space possibilities and integrating over time again, and assuming (like I did in your more recent draft lecture post) that P(demand at space = 1) = 0.1 and P(supply at space = 1) = 0.9 (i.e. both Bernoulli processes, unequal to each other but with equal information entropy), what happens? I realize that the KL divergence still applies and that information has to be lost ... but are your words in this section 100% accurate?
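The commenter's Bernoulli example can be checked directly: H(0.1) = H(0.9), yet the KL divergence between Bernoulli(0.1) and Bernoulli(0.9) is far from zero. A sketch, in natural-log units:

```python
import math

def entropy(p):
    """Entropy of a Bernoulli(p) distribution, in nats."""
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def kl(p, q):
    """D_KL(Bernoulli(p) || Bernoulli(q)), in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

print(entropy(0.1), entropy(0.9))  # equal: H(p) is symmetric under p -> 1 - p
print(kl(0.1, 0.9))                # nonzero: the distributions are not equal
```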

I think I regenerated the figure (which was randomly generated) after writing the text with an earlier version.

I think it is best to view information equilibrium as a necessary, but not sufficient, condition for two distributions to be equal. As in the previous post you commented on, there is a hierarchy: equal, matched, information equilibrium. We are generalizing the idea of equilibrium/equality.

For example, a normal distribution, uniform distribution and a pareto distribution can have the same information entropy for a particular choice of parameters σ vs a,b vs x0,α ...

normal ~ log σ + ...

pareto ~ log x0/α + ...

uniform ~ log (b-a)
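Those leading terms can be checked numerically. A sketch that matches the three differential entropies by choosing parameters (σ = 1 and α = 1 are arbitrary choices; the other parameters are solved for):

```python
import math

# Differential entropies (in nats) for the three families
def H_normal(sigma):
    return 0.5 * math.log(2 * math.pi * math.e * sigma**2)

def H_uniform(a, b):
    return math.log(b - a)

def H_pareto(x0, alpha):
    return math.log(x0 / alpha) + 1 + 1 / alpha

sigma = 1.0
H = H_normal(sigma)

# Uniform [0, b] with the same entropy: log(b - 0) = H
b = math.exp(H)

# Pareto(x0, alpha) with the same entropy: log(x0/alpha) + 1 + 1/alpha = H
alpha = 1.0
x0 = alpha * math.exp(H - 1 - 1 / alpha)

print(H, H_uniform(0.0, b), H_pareto(x0, alpha))  # all three agree
```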

but the sequence of draws will show differences. In your case, they'd show almost exactly the inverse frequencies of 1's and 0's. One averages 9 0's out of 10 and the other averages 9 1's out of 10. Changing the labels in your case eliminates the issue, but in the case of these distributions, the KL divergence will be nonzero and you'd have to set up a more complex model to handle it. This is also why the KL divergence is only a measure of information loss, not the absolute measure of information loss.

In your example, we have the case where almost every demand isn't met and almost every supply doesn't fall on a demand. The two sequences contain the same information (they are complements of each other).

Your example just comes down to the symmetry of the binary information function: there are two probabilities, p and 1 - p, that yield the same information.

One way to think of it is as the other solution to the quadratic equation given that we've made our measure of information positive definite.