Mathematical optimization is the selection of a best element, with regard to some criterion, from some set of available alternatives. The criterion is specified by an objective function on a given domain, and the aim is to find the value of the variable that maximizes or minimizes this function within that domain. Mathematical optimization is generally divided into two sub-fields: continuous optimization and discrete optimization. In continuous optimization, the domain is a set of real numbers with no gaps, such as an interval. Because of this no-gap assumption, continuous optimization allows the use of calculus techniques. In the simplest case, to find the maximum or minimum of a function f(), we simply set its derivative, f′(), to zero and solve the resulting equation.

For example, suppose we want to find the minimum of the objective function f(x) = x² − x + 1 defined on the real numbers. First we take the derivative: f′(x) = 2x − 1; then we set it to zero, which yields x = 1/2. So the objective function is maximized or minimized when x = 1/2. To determine whether it is a maximum or a minimum we take the second derivative: f″(x) = 2. As this is positive, we know we have found a minimum, and therefore the minimum value of f() is f(1/2) = 1/4 − 1/2 + 1 = 3/4. Easy! Thus there is a fundamental and perhaps surprising link between calculus, the mathematical study of change, and mathematical optimization. I say surprising because calculus was invented for an entirely different purpose: to describe the motion of projectiles and other objects.
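
As a quick sanity check, here is a minimal Python sketch of the same computation using the sympy library (an illustrative choice of mine, not something from the post itself):

```python
import sympy as sp

x = sp.symbols('x')
f = x**2 - x + 1

# First-order condition: set the derivative to zero and solve.
critical_points = sp.solve(sp.diff(f, x), x)   # [1/2]

# Second-order condition: a positive second derivative means a minimum.
second = sp.diff(f, x, 2)                      # 2

for p in critical_points:
    kind = "minimum" if second.subs(x, p) > 0 else "maximum/saddle"
    print(p, kind, f.subs(x, p))               # 1/2 minimum 3/4
```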

In discrete optimization, on the other hand, the domain is a discrete (usually finite) set. In a previous blog post I outlined a discrete analogue of continuous calculus for functions defined on finite domains. This raises the question of whether discrete calculus can be used for discrete optimization in the same way that continuous calculus can be used for continuous optimization. Consider a function f() defined on the finite domain X = {0, 1, …, N}. The discrete forward derivative of f() is given by D⁺f(x) = f(x+1) − f(x), and the discrete backwards derivative is defined by D⁻f(x) = f(x) − f(x−1). Suppose again that we want to find the minimum of the function f(x) = x² − x + 1, this time defined on X. Can we use discrete calculus to do this?

Taking the forward derivative of f() gives: D⁺f(x) = f(x+1) − f(x) = (x+1)² − (x+1) + 1 − x² + x − 1 = 2x; and setting this to zero gives x = 0. On the other hand, taking the backwards derivative of f() gives: D⁻f(x) = f(x) − f(x−1) = x² − x + 1 − (x−1)² + (x−1) − 1 = 2x − 2; and setting this to zero gives x = 1. Immediately we see a couple of problems: we get a different value depending on which derivative we use, and both values are different from the value we get using the continuous derivative! The latter problem shouldn’t concern us, as the value we obtained using the continuous derivative (x = 1/2) is not even in the domain X. But the former problem remains: the two discrete derivatives point at different elements of X. Clearly, we cannot simply apply the same technique for discrete optimization as we used in the continuous case.
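
A quick numeric check of these hand computations, as a minimal sketch (the helper names forward_diff and backward_diff are mine):

```python
def f(x):
    return x**2 - x + 1

def forward_diff(f, x):    # D+f(x) = f(x+1) - f(x)
    return f(x + 1) - f(x)

def backward_diff(f, x):   # D-f(x) = f(x) - f(x-1)
    return f(x) - f(x - 1)

# Matches the algebra: D+f(x) = 2x and D-f(x) = 2x - 2.
# (Note backward_diff at x = 0 peeks at f(-1), outside X.)
print([forward_diff(f, x) for x in range(4)])   # [0, 2, 4, 6]
print([backward_diff(f, x) for x in range(4)])  # [-2, 0, 2, 4]
```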

We can, however, apply the same logic. If a minimum (maximum) exists at a point x in X, then we would expect D⁺f(x) to be nonnegative (nonpositive) and D⁻f(x) to be nonpositive (nonnegative). Thus, to find the minimum of the function f(x) = x² − x + 1 we need to find a value x such that 2x ≥ 0 and 2x − 2 ≤ 0, i.e. x ≥ 0 and x ≤ 1. The only values of x in the set X that satisfy these inequalities are 0 and 1. Do these give us the minimum? The answer is yes, because f(0) = f(1) = 1, and this is indeed the minimum value of f(x) when x is constrained to lie in the set X. So we have successfully used discrete calculus to find the minimum of the function f(x) = x² − x + 1 on the domain X. The question that naturally arises is: does this method work in general, or did we just get lucky?!
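
This logic is easy to turn into a search procedure. Here is a minimal sketch of my own (it assumes, as a convention, that only the inward-pointing derivative is checked at the endpoints of the domain, since the other one would leave it):

```python
def discrete_minima(f, N):
    """Points x in {0, ..., N} with D+f(x) >= 0 and D-f(x) <= 0.

    At the endpoints only the inward derivative is checked (an
    assumed convention). These conditions are necessary; for a
    convex f, any point satisfying them is a global minimizer.
    """
    minima = []
    for x in range(N + 1):
        ok_fwd = (x == N) or f(x + 1) - f(x) >= 0   # D+f(x) >= 0
        ok_bwd = (x == 0) or f(x) - f(x - 1) <= 0   # D-f(x) <= 0
        if ok_fwd and ok_bwd:
            minima.append(x)
    return minima

print(discrete_minima(lambda x: x**2 - x + 1, 10))  # [0, 1]
```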

Let us consider the more general problem of finding the minimum of the function f(x) = ax² + bx + c, with a > 0. Taking the forward derivative gives D⁺f(x) = f(x+1) − f(x) = a(x+1)² + b(x+1) + c − ax² − bx − c = a(2x+1) + b, and taking the backwards derivative gives D⁻f(x) = ax² + bx + c − a(x−1)² − b(x−1) − c = a(2x−1) + b. Setting the former to be nonnegative gives x ≥ −1/2 − b/(2a), and setting the latter to be nonpositive gives x ≤ 1/2 − b/(2a). For both inequalities to hold, x must lie in the interval of width 1 centred on the point −b/(2a). In other words, x must equal −b/(2a) rounded to the nearest integer, which gives two integers if −b/(2a) lies exactly halfway between them (as in our example above). We know from continuous calculus that the function f(x) = ax² + bx + c defined on the reals is minimized at x = −b/(2a), so this is exactly the right answer.
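
As a sanity check on the rounding rule, here is a minimal sketch of mine comparing it against a brute-force search (it assumes a > 0, as above; the names are illustrative):

```python
import math

def quadratic_argmin(a, b, N):
    """Integer minimizer(s) of a*x**2 + b*x + c on {0, ..., N}.

    Implements the rounding rule derived above: -b/(2a) rounded to
    the nearest integer(s), clipped to the domain. Assumes a > 0;
    c does not affect where the minimum occurs.
    """
    centre = -b / (2 * a)
    # Two candidates when centre lies exactly halfway between integers.
    candidates = {math.floor(centre + 0.5), math.ceil(centre - 0.5)}
    return sorted({min(max(x, 0), N) for x in candidates})

print(quadratic_argmin(1, -1, 10))   # [0, 1]  (the example above: centre 1/2)
print(quadratic_argmin(3, -10, 20))  # [2]     (centre 5/3 rounds to 2)
print(min(range(21), key=lambda x: 3*x*x - 10*x + 7))  # 2 (brute force)
```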

In fact, it is obvious when you think about it that D⁺f(x) ≥ 0 and D⁻f(x) ≤ 0 are necessary conditions for x to be a minimum of the discrete function f(), and that D⁺f(x) ≤ 0 and D⁻f(x) ≥ 0 are necessary for x to be a maximum: if, say, f(x+1) − f(x) were negative, then x could not be a minimum, since f(x+1) would be smaller. These are analogous to the conditions f′(x) = 0, f″(x) > 0 and f′(x) = 0, f″(x) < 0, respectively, for continuous functions.
