A (strict) formulation of the optimal transport problem. It’s always taught with dirt piles; why deviate from tradition.

Consider dirt piles at locations and holes at locations . Suppose pile has units of dirt and hole can hold units of dirt. It costs to transport a unit of dirt from to . Our question is: Which dirt pile do we send to which hole in order to minimize the overall cost?

More formally, we want a transportation plan which maps piles to holes, telling us where to send the -th dirt pile. Each pile can only be sent to one hole, and at the end of the process, each hole should be full. That is, for each , we should have .

We are thus looking to solve the following problem:

This is the Monge formulation in the discrete case.

To see what this has to do with probability, suppose that and represent two discrete distributions, with mass at and mass at . That is, and constitute two discrete probability measures and . is a map between these two measures that obeys

That is, the pushforward measure must be equivalent to the distribution . With this perspective in mind, we can also notice that is simply the expected value of when . This motivates the extension of the discrete Monge formulation to the continuous case:

This is the (continuous) Monge formulation, and a map solving the above problem is sometimes called a Monge map (or a transportation plan). There’s an obvious problem with this formulation, however: What if there is no mapping between piles and holes such that each hole will be filled by each pile. That is, what if there is no map such that . In fact, this is a very restrictive requirement in many applications. This is what motivates the Kantorovich relaxation and optimal transport costs.