UDC 519.3
MATHEMATICS
A. A. KAPLAN
ON FINDING THE EXTREMUM OF A LINEAR FUNCTION ON A CONVEX SET
(Presented by Academician L. V. Kantorovich, 18 IV 1967)
We consider the problem of finding the maximum of the linear function
\(f(x) \equiv (c,x)\) on a convex, closed, bounded, solid set \(X\) (that is, one with interior points) of the Euclidean space \(R^n\).
The following method for solving the problem is proposed. We shall assume that, in some way, an interior point \(x^* \in X\) has been determined and that a bounded polyhedron containing the set \(X\) has been constructed,
\[ \Omega_0 \equiv \{x:\ \psi_l(x) \geq 0,\quad l=1,2,\ldots,t\}, \]
where \(\psi_l(x) \equiv (a_l,x)-b_l\).
In the proposed process a sequence of polyhedra is constructed
\(\Omega_0 \supset \Omega_1 \supset \Omega_2 \supset \cdots\). On each of them a linear programming problem is solved, as a result of which we obtain a sequence of points \(\{y^k\}\), \(k=0,1,2,\ldots\), such that
\[ f(y^k)=\max_{x\in\Omega_k} f(x). \]
If at some step \(y^k \in X\), then \(y^k\) is the solution of the original problem, and the process terminates.
If \(y^k \notin X\), we find the point \(x^k\) at which the ray
\(x^*+\lambda(y^k-x^*)\), \(\lambda>0\), intersects the boundary of the set \(X\), and determine any nonzero linear functional
\(\eta_{t+k+1}(x^k)\in R^{n*}\) supporting the set \(X\) at the point \(x^k\). Adding to the constraints defining the polyhedron \(\Omega_k\) the linear constraint
\[ \psi_{t+k+1}(x)\equiv(\eta_{t+k+1}(x^k),\,x-x^k)\geq 0, \]
we cut off the part of \(\Omega_k\) that contains no points of the set \(X\). A new polyhedron is obtained,
\[ \Omega_{k+1}\equiv\{x:\ \psi_l(x)\geq 0,\quad l=1,2,\ldots,t+k+1\}, \]
and the process can be continued.
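The steps above can be sketched in code. The following is a minimal illustration, not the author's implementation. It assumes the plane \(R^2\), so that the linear programming step can be replaced by brute-force vertex enumeration. It takes the set \(X\) to be the unit disk with \(x^*=0\) and \(\Omega_0\) to be the square \([-2,2]^2\), and finds \(x^k\) by bisection along the ray. All function names are hypothetical, and the stopping rule based on the gap \(f(y^k)-\max_{i\le k} f(x^i)\) is an added assumption.

```python
import math
from itertools import combinations

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def solve_lp_2d(c, cons, tol=1e-9):
    """Maximize (c, x) over {x : (a, x) - b >= 0 for (a, b) in cons}.

    Brute-force stand-in for an LP solver (assumption): in the plane the
    maximum over a bounded polygon is attained at a vertex, so every
    pairwise intersection of constraint lines is tried.
    """
    best, best_val = None, -math.inf
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:              # parallel lines give no vertex
            continue
        x = ((b1 * a2[1] - b2 * a1[1]) / det,
             (a1[0] * b2 - a2[0] * b1) / det)
        feasible = all(dot(a, x) - b >= -tol for a, b in cons)
        if feasible and dot(c, x) > best_val:
            best, best_val = x, dot(c, x)
    return best

def boundary_point(in_X, x_star, y, iters=60):
    """Bisection on lambda in [0, 1] along the ray x* + lambda (y - x*)."""
    lo, hi = 0.0, 1.0                     # lambda = 0 is in X, lambda = 1 is not
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = (x_star[0] + mid * (y[0] - x_star[0]),
             x_star[1] + mid * (y[1] - x_star[1]))
        if in_X(p):
            lo = mid
        else:
            hi = mid
    return (x_star[0] + lo * (y[0] - x_star[0]),
            x_star[1] + lo * (y[1] - x_star[1]))

def cutting_plane(c, cons, in_X, support, x_star, eps=1e-6, max_iter=200):
    """Return (best boundary point found, last LP solution, steps used)."""
    cons = list(cons)
    best_x, lower = None, -math.inf
    y = None
    for k in range(max_iter):
        y = solve_lp_2d(c, cons)          # y^k maximizes f on Omega_k
        if in_X(y):
            return y, y, k                # y^k solves the original problem
        x = boundary_point(in_X, x_star, y)
        if dot(c, x) > lower:
            best_x, lower = x, dot(c, x)
        if dot(c, y) - lower <= eps:      # added stopping rule (assumption)
            return best_x, y, k
        eta = support(x)                  # supporting functional at x^k
        cons.append((eta, dot(eta, x)))   # cut (eta, z - x^k) >= 0
    return best_x, y, max_iter

# Illustrative data: unit disk, x* = 0, Omega_0 = [-2, 2]^2, c = (1, 2).
c = (1.0, 2.0)
box = [((1.0, 0.0), -2.0), ((-1.0, 0.0), -2.0),
       ((0.0, 1.0), -2.0), ((0.0, -1.0), -2.0)]
in_disk = lambda p: dot(p, p) <= 1.0
inward_normal = lambda p: (-p[0], -p[1])  # gradient direction of 1 - |p|^2

x_low, y_up, steps = cutting_plane(c, box, in_disk, inward_normal, (0.0, 0.0))
print(dot(c, x_low), dot(c, y_up), steps)
```

For this data the exact maximum is \(\sqrt{5}\), and the two printed values bracket it from below and above. In higher dimensions the vertex enumeration would have to be replaced by a genuine LP solver.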
Let us additionally note two points in the implementation of the method.
1) In solving the linear programming problem at each step it is expedient to pass to the dual problem. Then the optimal basis of the preceding problem can be used as the initial basis in solving the next problem.
2) If \(X=\{x:\ g_i(x)\geq 0,\ i=1,2,\ldots,m\}\) and at the point \(x^k\) we have
\(g_j(x^k)=0\), \(j\in J\subset\{1,2,\ldots,m\}\), then as \(\eta_{t+k+1}(x^k)\) one may take the gradient of any of the functions \(g_j\), \(j\in J\), computed at the point \(x^k\) (provided that it exists and is not equal to 0).
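Point 2) can be illustrated by a short sketch. It assumes that each \(g_i\) is supplied together with a routine for its gradient and that a constraint counts as active when \(|g_i(x^k)|\) is below a small tolerance. The function name, the tolerance, and the example set are hypothetical.

```python
def active_gradient(constraints, x, tol=1e-8):
    """Return the gradient of some active g_j at x, to be used as eta.

    constraints: list of (g, grad_g) pairs describing X = {g_i(x) >= 0}.
    The first active constraint with a nonzero gradient is chosen; the
    text allows any j in J, so this choice is an arbitrary assumption.
    """
    for g, grad_g in constraints:
        if abs(g(x)) <= tol:              # g_j(x) = 0 up to tolerance
            eta = grad_g(x)
            if any(v != 0.0 for v in eta):
                return eta
    raise ValueError("no active constraint with a nonzero gradient at x")

# Illustrative set: right half of the unit disk.
constraints = [
    (lambda p: 1.0 - p[0] ** 2 - p[1] ** 2, lambda p: (-2.0 * p[0], -2.0 * p[1])),
    (lambda p: p[0], lambda p: (1.0, 0.0)),
]

eta_arc = active_gradient(constraints, (1.0, 0.0))    # circular arc is active
eta_flat = active_gradient(constraints, (0.0, 0.5))   # flat side is active
```

With the convention \(g_i(x)\geq 0\), the gradient points into the set, which matches the sign of the cut \((\eta,\,x-x^k)\geq 0\).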
To prove the convergence of the method, we introduce the concave function
\[ \varphi(x)\equiv \max_{\lambda>0}\left\{\frac{\lambda-1}{\lambda}:\ \lambda(x-x^*)+x^*\in X\right\}, \]
which is connected in an obvious way with the Minkowski functional of the original set. The set \(\{x:\varphi(x)\geqslant 0\}\), as is easy to see, coincides with \(X\). Consequently, we may assume that we are solving the problem of finding the maximum of the function \(f\) subject to the constraint \(\varphi(x)\geqslant 0\).
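The connection with the Minkowski functional can be written out explicitly. The notation \(\mu\) is introduced here only for illustration: let \(\mu(z)\equiv\inf\{s>0:\ z\in s(X-x^*)\}\) be the Minkowski functional of the shifted set \(X-x^*\). For \(x\ne x^*\) the point \(\lambda(x-x^*)+x^*\) belongs to \(X\) exactly when \(x-x^*\in\lambda^{-1}(X-x^*)\), that is, when \(\lambda\leqslant 1/\mu(x-x^*)\). Since \((\lambda-1)/\lambda=1-1/\lambda\) increases with \(\lambda\), the maximum is attained at \(\lambda=1/\mu(x-x^*)\), and
\[ \varphi(x)=1-\mu(x-x^*). \]
Concavity of \(\varphi\) then follows from convexity of \(\mu\), and \(\varphi(x)\geqslant 0\) is equivalent to \(\mu(x-x^*)\leqslant 1\), that is, to \(x\in X\).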
Remark. The sequences of vectors \(\{y^i\}\), \(i=0,1,2,\ldots\), and \(\{x^i\}\), \(i=0,1,2,\ldots\), obtained in the course of implementing the method are the same for different ways of specifying the set \(X\).
To each \(\delta>0\) we associate the set
\[ Y_\delta \equiv \{x:\varphi(x)\leqslant -\delta\}. \]
Next, if \(x\ne x^*\), there exists a unique \(\lambda>0\) such that \(\varphi(x^*+\lambda(x-x^*))=0\). We shall denote the indicated point \(x^*+\lambda(x-x^*)\) by \(\xi(x)\).
Lemma 1. Let \(m(x)\) be a nonzero linear functional supporting the set \(X\) at a boundary point \(x\). Then the functional
\[ d(x) \equiv m(x)/(m(x),x^*-x) \]
supports the function \(\varphi\) at the point \(x\).
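As a check of Lemma 1 on a simple case (an illustration, not part of the original argument), take \(X\) to be the unit ball and \(x^*=0\), so that \(\varphi(z)=1-\|z\|\). At a boundary point \(x\), \(\|x\|=1\), the functional \(m(x)=-x\) supports \(X\) in the sense \((m(x),z-x)\geqslant 0\) for \(z\in X\). Then \((m(x),x^*-x)=(x,x)=1\) and \(d(x)=-x\). The supporting inequality \(\varphi(z)\leqslant\varphi(x)+(d(x),z-x)\) becomes
\[ 1-\|z\|\leqslant 0+(-x,\,z-x)=1-(x,z), \]
that is, \((x,z)\leqslant\|z\|\), which holds for all \(z\) by the Cauchy–Schwarz inequality.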
Lemma 2. To each linear functional \(d(x)\) supporting the function \(\varphi\) at some point \(x\ne x^*\), there can be assigned some functional \(m(\xi(x))\), supporting the set \(X\) at the point \(\xi(x)\), such that
\[ d(x)=m(\xi(x))/(m(\xi(x)),x^*-\xi(x)). \]
Lemma 3. For each \(\delta>0\) there exists \(\varepsilon>0\) such that
\[ (d(\xi(x)),x-\xi(x))\leqslant -\varepsilon \]
for all \(x\in Y_\delta\), where \(d(\xi(x))\) is any linear functional supporting \(\varphi\) at the point \(\xi(x)\).
Theorem 1. All limit points of the sequence \(\{y^k\}\), \(k=0,1,2,\ldots\), determined in the process of implementing the method belong to the set \(X\).
Theorem 2. Every limit point of the sequence \(\{y^k\}\), \(k=0,1,2,\ldots\), yields a solution of the original problem.
From the last theorem it is clear that, if the solution of the original problem is unique, then the sequence \(\{y^k\}\) converges to this solution.
An important feature of the method is the existence of two-sided estimates of the solution \(\bigl(f(x^k)\leqslant \max_{x\in X} f(x)\leqslant f(y^k)\) for all \(k\bigr)\); moreover, it should be noted that if a subsequence \(\{y^{n_j}\}\) converges to \(\bar{x}\), then also
\[ \lim_{j\to\infty} x^{n_j}=\bar{x}. \]
It should be noted that most of the known methods for solving the problem of maximizing a linear (or concave*) function on a convex set are applicable only in the case where the set is specified by constraints of the form \(g_i(x)\geqslant 0\), where the \(g_i\) are concave functions. For the proposed method it is immaterial how the set \(X\) is specified, and the difference in the implementation of the method for different ways of specifying \(X\) appears only in the solution of the one-dimensional problem of determining the points \(x^k\).
The method considered here, in terms of its mode of approximation, should be assigned to the group of cutting-plane methods \((^1,^2)\). Comparing it with Kelley’s cutting-plane method**, one should first of all note that the latter is applicable only to the solution of concave programming problems. In addition, important advantages of the proposed method are the existence of two-sided estimates of the solution and the naturalness of the resulting linear approximation of the set \(X\). The drawback of the method is the need to determine an interior point \(x^*\). In practical problems, however, such a point is, as a rule, known or can easily be determined.
* The problem of maximizing a concave function, by adding one variable and one constraint, is reduced to the problem of maximizing a linear function. For most methods such a substitution is inadvisable.
** Other methods of this group are of no substantial interest in the finite-dimensional case.
Let us turn to the question of applying some methods of concave programming to the solution of the original problem. We return to the function \(\varphi\), which so far has played only an auxiliary role in the proof of convergence of the method.
We consider the case where
\[
X=\{x:\ g_i(x)\geq 0,\ i=1,2,\ldots,m\},
\]
and some of the functions \(g_i\) are not concave. Representing the set \(X\) by means of the constraint \(\varphi(x)\geq 0\) makes it possible to use, for solving the problem, a number of methods of concave programming that are not applicable under the original way of specifying the set \(X\), for example, the method of centers, penalty-function methods \((^3)\), and Kelley’s method. The only operations these methods perform on \(\varphi\) are the computation of its value and its gradient and, in some of them, the computation of the maximum of \(\varphi\) on a segment.* All these operations are feasible under the chosen way of specifying \(\varphi\) (taking Lemmas 1 and 2 into account). The implementation of the indicated methods in the present case is complicated only by the fact that, in order to compute the function or the gradient at some point \(x\), we first have to solve the one-dimensional problem of finding the point of intersection of the ray
\[
\lambda(x-x^*)+x^*,\quad \lambda>0,
\]
with the boundary of the set \(X\).
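A minimal sketch of these two operations follows. It assumes a membership test for \(X\) and a routine returning a supporting functional at a boundary point, and it solves the one-dimensional problem by doubling and bisection. The value uses the definition of \(\varphi\), and the supporting functional uses the formula of Lemma 2. All names and the example ellipse are hypothetical.

```python
import math

def ray_scale(in_X, x_star, x, iters=80):
    """Largest lam > 0 with x* + lam (x - x*) in X (doubling, then bisection)."""
    if tuple(x) == tuple(x_star):
        raise ValueError("x must differ from x*")
    point = lambda lam: tuple(s + lam * (xi - s) for s, xi in zip(x_star, x))
    hi = 1.0
    while in_X(point(hi)):                # X is bounded, so this loop ends
        hi *= 2.0
    lo = 0.0                              # lam = 0 gives x*, which is in X
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_X(point(mid)):
            lo = mid
        else:
            hi = mid
    return lo

def phi_and_support(in_X, support, x_star, x):
    """Return phi(x) and d(x) = m(xi) / (m(xi), x* - xi), with xi = xi(x)."""
    lam = ray_scale(in_X, x_star, x)
    xi = tuple(s + lam * (xj - s) for s, xj in zip(x_star, x))
    m = support(xi)                       # supports X at the boundary point xi
    denom = sum(mj * (s - xj) for mj, s, xj in zip(m, x_star, xi))
    d = tuple(mj / denom for mj in m)
    return (lam - 1.0) / lam, d

# Illustrative set: ellipse x1^2 / 4 + x2^2 <= 1 with x* = 0.
in_ellipse = lambda p: p[0] ** 2 / 4.0 + p[1] ** 2 <= 1.0
grad_g = lambda p: (-p[0] / 2.0, -2.0 * p[1])   # gradient of 1 - x1^2/4 - x2^2

value, d = phi_and_support(in_ellipse, grad_g, (0.0, 0.0), (1.0, 1.0))
print(value, d)
```

For this ellipse \(\varphi(x)=1-\sqrt{x_1^2/4+x_2^2}\) in closed form, so the computed value and functional can be compared with the exact value and gradient.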
In conclusion, I express my gratitude to G. Sh. Rubinstein for a number of valuable suggestions during the discussion of the present work.
Institute of Mathematics
Siberian Branch of the Academy of Sciences of the USSR
Received 10 IV 1967
REFERENCES
1. E. S. Levitin, B. T. Polyak, Journal of Computational Mathematics and Mathematical Physics, 6, No. 5 (1966).
2. J. E. Kelley, J. Soc. Ind. Appl. Math., 8, No. 4 (1960).
3. A. Fiacco, G. McCormick, Manag. Sci., 10, No. 2, No. 4 (1964).
* Here the determination of the maximum on a segment is a laborious operation.