HOANG TUY
CONCAVE PROGRAMMING WITH LINEAR CONSTRAINTS
(Presented by Academician S. L. Sobolev, 11 V 1964)
1. In this note we shall consider the problem of finding the minimum of a concave function \(f(x)\) on a convex polyhedron \(D \subset R^n\), where the function \(f(x)\) is called concave on the convex set \(D\) if
\[ f(\lambda x + (1-\lambda)y) \geq \lambda f(x) + (1-\lambda)f(y) \tag{1} \]
for any \(x, y \in D\) and any number \(\lambda \in [0,1]\).
As is known, the main difficulty of the problem under consideration is connected with the fact that a local minimum is not necessarily global. To overcome this difficulty, a general method is proposed here, consisting in the systematic use of the following two simple properties:
I. The minimum of a concave function on a convex polyhedron, if it exists, is attained at least at one vertex.
Let us call a concave extension of the function \(f(x)\) any function concave on the whole space \(R^n\) and coinciding with \(f(x)\) on \(D\).
II. For every function \(f(x)\) concave on a convex set \(D\), there exists a concave extension \(F(x)\) such that \(F(x) \geq \overline{F}(x)\) for any point \(x \in R^n\) and any concave extension \(\overline{F}(x)\).
Property I shows that, in solving the problem, it suffices to consider only the vertices of the polyhedron \(D\), while property II makes it possible, using the values of \(F(x)\) outside \(D\), to limit the number of vertices subject to enumeration.
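As a small numerical illustration of property I (a sketch only: the function `f`, the unit square taken as \(D\), and the grid resolution are hypothetical choices, not taken from the note), one can compare the smallest vertex value of a concave function with its values on a grid of points of \(D\):

```python
from itertools import product

def f(x):
    # Hypothetical concave function: negative squared distance from (0.25, 0.25).
    return -((x[0] - 0.25) ** 2 + (x[1] - 0.25) ** 2)

# D is taken to be the unit square; its vertices are the four corners.
vertices = list(product([0.0, 1.0], repeat=2))
best_vertex = min(vertices, key=f)

# Sample D on a grid (the grid includes the corners themselves).
steps = 20
grid = [(i / steps, j / steps) for i in range(steps + 1) for j in range(steps + 1)]
grid_min = min(f(p) for p in grid)

# No sampled point of D does better than the best vertex.
print(best_vertex, f(best_vertex), grid_min)
```

In this example the corner \((0,0)\) is a local minimum among vertices adjacent to it only if its neighbours are worse, which is not the case here; the global minimum sits at the corner farthest from \((0.25, 0.25)\).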
2. Let us first consider the following auxiliary problem:
\((*)\) Let \(u^1, u^2, \ldots, u^n\) be linearly independent vectors, and let \(P = P(u^1, \ldots, u^n)\) be the hyperplane passing through the points \(u^1, \ldots, u^n\). Find, on the side of the hyperplane \(P\) opposite to the origin, the vertex of the polyhedron \(D\) farthest from \(P\).
If \(g(x)=0\) is the equation of the hyperplane \(P\), with \(g(0)<0\), then the problem \((*)\) is evidently reduced to finding the maximum of the linear functional \(g(x)\) under the conditions
\[ x \in D, \tag{2} \]
\[ g(x) > 0. \tag{3} \]
But one can proceed differently. Since the points \(x\) of the hyperplane \(P\) are characterized by the equalities
\[ x = \sum_{k=1}^{n} \lambda_k u^k, \tag{4} \]
\[ \sum_{k=1}^{n} \lambda_k = 1, \tag{5} \]
the solution of problem \((*)\) corresponds to the maximum of the quantity \(\sum_{k=1}^{n} \lambda_k\) under the conditions (2), (4), and
\[ \sum_{k=1}^{n} \lambda_k > 1. \tag{6} \]
Equality (4) can be written in the form \(x = B\lambda\), where \(\lambda = (\lambda_1,\lambda_2,\ldots,\lambda_n)\), \(B\) is the matrix with columns \(u^1,u^2,\ldots,u^n\), and, since these vectors are linearly independent, there exists a matrix \(B^{-1}\) such that
\[ \lambda = B^{-1}x. \tag{7} \]
The last formula defines \(\sum_{k=1}^{n}\lambda_k\) as a certain linear function \(h(x)\) of \(x\). Thus, if the matrix \(B^{-1}\) is known, then problem \((*)\) can be solved by finding (for example, by the simplex method) a vertex of the polyhedron \(D\) corresponding to the maximum of \(h(x)\): if this maximum is greater than 1, then the solution has been obtained; otherwise problem \((*)\) has no solution.
In what follows, for brevity, we shall call the matrix \(B\) the defining matrix of problem (*).
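The linear form \(h(x)\) can be written as \(h(x)=c\cdot x\), where \(c\) is the vector of column sums of \(B^{-1}\), i.e. the solution of \(B^{T}c=(1,\ldots,1)\). The following is a minimal sketch of that computation in exact rational arithmetic; the helper names `solve`, `h_coefficients`, and `h`, and the sample points, are illustrative assumptions rather than part of the note.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def h_coefficients(points):
    """Return c with h(x) = c . x, where h(x) is the sum of the entries of B^{-1} x
    and the columns of B are the given points u^1, ..., u^n."""
    # c solves B^T c = (1, ..., 1); row k of B^T is the point u^k.
    return solve([list(u) for u in points], [1] * len(points))

def h(c, x):
    return sum(ci * xi for ci, xi in zip(c, x))

# Hypothetical points u^1, u^2 in the plane.
u = [(2, 0), (1, 3)]
c = h_coefficients(u)
print(c, h(c, u[0]), h(c, u[1]), h(c, (3, 3)))
```

Here \(h\) equals 1 at each \(u^k\) (so on the whole hyperplane \(P\)), equals 0 at the origin, and exceeds 1 at the point \((3,3)\), which lies on the far side of \(P\).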
3. We now describe the method for solving the problem posed. Without loss of generality one may assume that the polyhedron is “nondegenerate” (otherwise one must use the well-known perturbation device). By any of the known methods of linear programming we find some vertex of the polyhedron \(D\), then pass, if possible, to a vertex adjacent to it with a smaller value of \(f(x)\), and so on, until a vertex \(x^0\) is obtained such that
\[ f(x)\geq f(x^0)=\alpha_1 \tag{8} \]
for all vertices \(x\) adjacent to it. To simplify the notation of some formulas, we take \(x^0\) as the origin of coordinates and begin the process of finding the global minimum.
In general, the process will consist of several steps, at each of which one has to solve a certain number of auxiliary problems (*) such that: 1) if none of these auxiliary problems has a solution, then the process is completed at this step and the solution of the problem posed has been obtained; 2) otherwise, each solvable auxiliary problem of the given step generates certain new auxiliary problems for the next step.
At the first step there is a single auxiliary problem, constructed as follows. In view of the nondegeneracy of the polyhedron \(D\), exactly \(n\) edges emanate from the vertex \(x^0=(0,\ldots,0)\). On the \(k\)-th edge \((k=1,2,\ldots,n)\) take the point \(y^{1,k}=\theta_{1,k}\xi^k\), where \(\xi^k\) is the direction vector of this edge, \(\theta_{1,k}=\max\{\theta:F(\theta\xi^k)\geq \alpha_1\}\), and \(F(x)\) is the concave extension of \(f(x)\) mentioned in II; moreover, if the set indicated in braces is unbounded, then \(\theta_{1,k}\) may be taken arbitrarily large. The matrix \((y^{1,1},y^{1,2},\ldots,y^{1,n})\) will serve as the defining matrix for the first auxiliary problem. Since, according to I, \(f(x^0)=\alpha_1\) is the minimum of the function \(F(x)\) in the polyhedron with vertices \(x^0,y^{1,1},y^{1,2},\ldots,y^{1,n}\), it is clear that the absence of a solution to the auxiliary problem under consideration will mean that \(x^0\) is the desired solution of the problem posed.
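The numbers \(\theta_{1,k}\) can be approximated numerically. Because \(F\) is concave, the set \(\{\theta\geq 0: F(\theta\xi^k)\geq\alpha_1\}\) is an interval starting at 0, so bisection applies. The sketch below assumes that \(F\) can be evaluated anywhere on the ray; the name `theta_max`, the cut-off `cap` standing in for "arbitrarily large", the tolerance, and the sample function are all hypothetical.

```python
def theta_max(F, xi, alpha, cap=1e6, tol=1e-9):
    """Approximate max{theta >= 0 : F(theta * xi) >= alpha} for F concave along the ray.
    Returns `cap` when the level set appears unbounded (an illustrative cut-off)."""
    def phi(t):
        return F([t * v for v in xi])

    # Double the step until the value drops below alpha or the cut-off is reached.
    hi = 1.0
    while phi(hi) >= alpha:
        hi *= 2.0
        if hi >= cap:
            return cap

    lo = 0.0  # phi(0) >= alpha is assumed: the origin is the current vertex x^0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) >= alpha:
            lo = mid
        else:
            hi = mid
    return lo

def F(x):
    # Hypothetical concave function with F(0) = 0.
    return x[0] * (3.0 - x[0]) + x[1] * (2.0 - x[1])

alpha_1 = F([0.0, 0.0])
theta_1 = theta_max(F, (1.0, 0.0), alpha_1)  # along the first edge direction
theta_2 = theta_max(F, (0.0, 1.0), alpha_1)  # along the second edge direction
print(theta_1, theta_2)
```

With these values the defining matrix of the first auxiliary problem would have the columns \(\theta_1\cdot(1,0)\) and \(\theta_2\cdot(0,1)\).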
Suppose now that all auxiliary problems at the \(q\)-th step have already been solved. Put \(\alpha_q=\min\{\alpha_{q-1},\alpha'_q\}\), where \(\alpha'_q\) is the least value of \(f(x)\) at all vertices encountered in the process of solving these problems. If the \(k\)-th auxiliary problem with defining matrix
\[ B_{q,k}=(y^{q,k_1},y^{q,k_2},\ldots,y^{q,k_n}) \]
has a solution \(x^{q,k}\), then we construct the points \(\bar{x}^{q,k}=\theta_{q,k}x^{q,k}\), \(\bar{y}^{q,k_i}=\theta_{q,k_i}y^{q,k_i}\ (i=1,2,\ldots,n)\), where \(\theta_{q,k}=\max\{\theta:F(\theta x^{q,k})\geq \alpha_q\}\), \(\theta_{q,k_i}=\max\{\theta:F(\theta y^{q,k_i})\geq \alpha_q\}\); next, from equation (7), with \(x\) replaced by \(x^{q,k}\) and \(B\) by \(B_{q,k}\), we determine \(\lambda^{q,k}=(\lambda^{q,k}_1,\lambda^{q,k}_2,\ldots,\lambda^{q,k}_n)\), and for each \(j\) such that \(\lambda^{q,k}_j\ne 0\) we consider the auxiliary problem whose defining matrix is obtained from \(B_{q,k}\) by replacing \(y^{q,k_j}\) by \(\bar{x}^{q,k}\), and \(y^{q,k_i}\) \((i \ne j)\) by \(\bar{y}^{q,k_i}\). The totality of all auxiliary problems generated in this way constitutes the set of auxiliary problems that will have to be solved at the \((q+1)\)-st step.
It can be shown that the process must terminate after a finite number of steps; moreover, if it terminates at the \(q\)-th step, then \(\alpha_q\) is the desired global minimum.
4. Remark 1. All auxiliary problems have the same constraints (namely, the constraints of the original problem \(x \in D\)) and differ only in the linear form \(h(x)\) that is to be maximized. But, as indicated in Sec. 2, \(h(x)\) is easily computed from the inverse defining matrix \(B^{-1}\). If the auxiliary problem under consideration generates new ones, then the defining matrix \(B'\) of each new problem is obtained from the old one simply by multiplying each column by some number and by replacing one column. Hence, knowing \(B^{-1}\), one can easily find \((B')^{-1}\) by well-known formulas (using, for example, the multiplicative form of the inverse matrix, see (1)). In addition, when applying the simplex method to the solution of each new auxiliary problem, one can always use as the first (and, in general, quite good) approximation the solution of the old problem. These circumstances greatly facilitate the solution of the auxiliary problems.
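One way to carry out the update of the inverse described in Remark 1 is sketched below, under stated assumptions: scaling column \(i\) of \(B\) by a nonzero number divides row \(i\) of \(B^{-1}\) by that number, and replacing column \(j\) by a vector \(v\) is a single pivot on the vector \(w=B^{-1}v\) (the product-form update familiar from the simplex method). The function name `update_inverse` and the sample data are hypothetical.

```python
from fractions import Fraction

def update_inverse(B_inv, scales, j, v):
    """Given B^{-1}, return (B')^{-1}, where B' is B with column i multiplied by
    scales[i] (all nonzero) and then column j replaced by the vector v."""
    n = len(B_inv)
    # Scaling column i of B by s divides row i of the inverse by s.
    inv = [[Fraction(a) / Fraction(scales[i]) for a in B_inv[i]] for i in range(n)]
    # w expresses v in terms of the scaled columns.
    w = [sum(inv[i][k] * Fraction(v[k]) for k in range(n)) for i in range(n)]
    if w[j] == 0:
        raise ValueError("replacing column j would make the matrix singular")
    # Pivot on w[j]: row j is rescaled, every other row is cleared against it.
    pivot_row = [a / w[j] for a in inv[j]]
    return [pivot_row if i == j else
            [a - w[i] * p for a, p in zip(inv[i], pivot_row)]
            for i in range(n)]

# Hypothetical data: B has columns (2, 0) and (0, 4).
B_inv = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1, 4)]]
# Double the first column and replace the second column by (1, 1),
# so that B' has columns (4, 0) and (1, 1).
new_inv = update_inverse(B_inv, scales=[2, 1], j=1, v=(1, 1))
print(new_inv)
```

The cost is of order \(n^2\) operations per new problem, compared with order \(n^3\) for inverting \(B'\) from scratch.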
Remark 2. It is easy to see that, for our method, the concave extension mentioned in property II is the best one. In the case where \(f(x)\) has the form
\[
f(x)=\sum_{j=1}^{n} f_j(x_j),
\]
where each function \(f_j(t)\) is concave, the best extension is obtained simply by linear extrapolation outside the interval \(\{t:t=x_j \text{ for at least one point } x\in D\}\).
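A sketch of the linear extrapolation mentioned in Remark 2 for a single term \(f_j\): outside the interval \([a,b]\) the function is continued along the tangent lines at the endpoints. It is assumed here that the one-sided slopes at \(a\) and \(b\) are supplied by the caller; the name `extend_linearly` and the sample function \(\sqrt{t}\) on \([1,4]\) are hypothetical.

```python
import math

def extend_linearly(f, a, b, slope_a, slope_b):
    """Return F equal to f on [a, b] and continued linearly outside it.
    slope_a: right-hand derivative of f at a; slope_b: left-hand derivative at b."""
    def F(t):
        if t < a:
            return f(a) + slope_a * (t - a)
        if t > b:
            return f(b) + slope_b * (t - b)
        return f(t)
    return F

# Hypothetical term: sqrt(t) on [1, 4], with derivative 1 / (2 sqrt(t)).
F_j = extend_linearly(math.sqrt, 1.0, 4.0, slope_a=0.5, slope_b=0.25)

def F_separable(terms, x):
    # A separable extension is the sum of the one-dimensional extensions.
    return sum(Fj(xj) for Fj, xj in zip(terms, x))

print(F_j(0.0), F_j(2.25), F_j(8.0), F_j(-1.0))
```

Note that the extension is defined for negative arguments, where \(\sqrt{t}\) itself is not.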
Remark 3. Apparently, the method can be given various forms. For example, after solving the auxiliary problem at the first step, we have \(f(x) \geq f(x^0)=\alpha_1\) for all \(x\in D\) such that \(g_1(x)\leq 0\), where \(g_1(x)=0\) is the equation of the hyperplane \(P(y^{1,1},y^{1,2},\ldots,y^{1,n})\) \((g_1(0)<0)\), and therefore the same procedure that was applied to \(D\) may be applied to the remaining polyhedron \(\{x\in D,\ g_1(x)\geq 0\}\). As a result we obtain a new remaining polyhedron \(\{x\in D,\ g_1(x)\geq 0,\ g_2(x)\geq 0\}\), and so on. This variant has the advantage that at each step only one auxiliary problem need be solved, but its disadvantage is that more and more new constraints are added to the original ones. The latter is especially unpleasant when the original constraints have a simple specific structure (as, for example, constraints of the transportation type).
5. Consider the case where the polyhedron \(D\) is given in canonical form: \(Ax=b,\ x\geq 0\) \((x\in R^n,\ b\in R^m,\ m\leq n)\). Let the initial vertex \(x^0\) be such that \(x^0_j>0\) for \(j=1,2,\ldots,m\); \(x^0_j=0\) for \(j=m+1,\ldots,n\). If \(\xi^k\) \((k=m+1,\ldots,n)\) is the direction vector of the \(k\)-th edge issuing from the vertex \(x^0\), and the points \(y^{1,k}\) \((k=m+1,\ldots,n)\) are defined as above:
\[ y^{1,k}=x^0+\theta_{1,k}\xi^k, \]
where
\[ \theta_{1,k}=\max\{\theta:F(x^0+\theta\xi^k)\geq f(x^0)\}, \]
then it is easy to show that \(f(x)\geq f(x^0)\) for all points \(x\in D\) satisfying the condition
\[ \sum_{k=m+1}^{n}\frac{x_k}{\theta_{1,k}}-1\leq 0. \]
Therefore, when the proposed method is applied to this case, the linear form of the first auxiliary problem is
\[ h(x)=\sum_{k=m+1}^{n}\frac{x_k}{\theta_{1,k}}-1, \]
so that its inverse defining matrix has diagonal form with diagonal elements
\[ \frac{1}{\theta_{1,k}} \]
\((k=m+1,\ldots,n)\). Hence it is very easy to find, recursively, the linear forms of all subsequent auxiliary problems.
6. Finally, in the case of a problem of the transportation type, where one seeks the minimum of
\[ f(x)=\sum_{i,j} f_{ij}(x_{ij}) \]
under the conditions
\[ \sum_j x_{ij}=a_i,\qquad \sum_i x_{ij}=b_j,\qquad x_{ij}\geq 0 \quad (i=1,2,\ldots,m;\ j=1,2,\ldots,n), \tag{9} \]
the proposed method has the following form.
Take a vertex of the polyhedron (9), i.e., a vector \(x^0=\{x^0_{ij}\}\) satisfying the conditions (9) and such that the set of “occupied cells”
\[ T=\{(i,j): x^0_{ij}>0\} \]
consists of exactly \(n+m-1\) elements not forming cycles (the problem is assumed to be nondegenerate). Then each cell \((i',j')\in \bar T\) forms, together with the occupied cells, a unique cycle, and one may, as in the MODI method, try to improve the feasible solution by making corrections along cycles. If such improvements are impossible, then \(x^0=\{x^0_{ij}\}\) is a local minimum. Then for each cell \((i',j')\in \bar T\) we define the positive number
\[ \theta_{i',j'}=\max\{\theta:\ \Sigma^+[f_{ij}(x^0_{ij}+\theta)-f_{ij}(x^0_{ij})] +\Sigma^-[f_{ij}(x^0_{ij}-\theta)-f_{ij}(x^0_{ij})]\geq 0\}, \]
where \(\Sigma^+\), \(\Sigma^-\) denote sums extended respectively over all odd and all even cells of the cycle determined by the cell \((i',j')\) (the cell \((i',j')\) is considered the first cell). It can be shown that \(f(x)\geq f(x^0)\) for all feasible \(x=\{x_{ij}\}\) such that
\[ g(x)=\sum_{(i,j)\in \bar T}\frac{x_{ij}}{\theta_{ij}}-1\leq 0. \]
Therefore one should solve the auxiliary problem:
\[ \max\left(\sum_{(i,j)\in \bar T}\frac{x_{ij}}{\theta_{ij}}\right) \]
under the conditions (9).
This is already an ordinary transportation problem. If this auxiliary problem has no solution \(x\) with the condition \(g(x)>0\), then \(x^0\) is the desired global minimum. Otherwise it generates a certain number of new auxiliary problems, which again will be transportation problems, and so on. It is known that the placement problem reduces precisely to a problem of the type considered here. Therefore the proposed method is applicable to it.
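The number \(\theta_{i',j'}\) attached to an unoccupied cell can be approximated in the same spirit as before: the cost change along the cycle is a concave function of \(\theta\) that vanishes at \(\theta=0\), so bisection finds the largest \(\theta\) at which it is still nonnegative. The sketch below assumes that each cost function is already defined (extended) for all real arguments; the name `cycle_theta`, the cut-off `cap`, and the 2-by-2 sample data are hypothetical.

```python
def cycle_theta(costs, x0, cycle, cap=1e6, tol=1e-9):
    """Approximate max{theta >= 0 : change(theta) >= 0}, where change(theta) is the
    cost change when theta units are shifted around the cycle: added at odd
    positions (1st, 3rd, ...) and removed at even positions (2nd, 4th, ...)."""
    def change(theta):
        total = 0.0
        for pos, cell in enumerate(cycle):
            sign = 1.0 if pos % 2 == 0 else -1.0  # index 0 is the first (odd) cell
            total += costs[cell](x0[cell] + sign * theta) - costs[cell](x0[cell])
        return total

    hi = 1.0
    while change(hi) >= 0.0:
        hi *= 2.0
        if hi >= cap:
            return cap
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if change(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical 2-by-2 problem with supplies (3, 5) and demands (4, 4).
# Occupied cells: (0,1), (1,1), (1,0); the unoccupied cell is (0,0).
x0 = {(0, 0): 0.0, (0, 1): 3.0, (1, 1): 1.0, (1, 0): 4.0}
costs = {
    (0, 0): lambda t: 5.0 * t - t * t,  # concave cost on the unoccupied cell
    (0, 1): lambda t: t,
    (1, 1): lambda t: t,
    (1, 0): lambda t: t,
}
cycle = [(0, 0), (0, 1), (1, 1), (1, 0)]  # starts at the unoccupied cell
theta = cycle_theta(costs, x0, cycle)
print(theta)
```

In this example the cost change along the cycle is \(4\theta-\theta^2\), so the computed value is close to 4, which is larger than the feasible step of 3 along the cycle; this is where the values of the extension outside \(D\) come into play.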
Hanoi University
Hanoi, Democratic Republic of Vietnam
Received 27 IV 1964
REFERENCES
1. S. I. Gass, Linear Programming, Moscow, 1960.