
Local and global minimum. Symbolic calculus

  1. Using the necessary first-order conditions, find the minimum of the function
$$ f(x,y,z)=2x^{2}+xy+y^{2}+yz+z^{2}-6x-7y-8z+9. $$
  2. Check that this point is a local minimum using sufficient second-order conditions.

  3. Prove that the local minimum found is, in fact, global.

Introduction

Minimum of a scalar function of several variables

Given a function $f:\Omega\subset \mathbb{R}^n\to \mathbb{R}$ we say that $x^*\in \Omega$ is a minimum of $f$ in $\Omega$ if $f(x^*)\leq f(x)$ for all $x\in\Omega$.

Main cases

  • Optimization without constraints: $\Omega=\mathbb{R}^n.$
  • Optimization with constraints: $\Omega \subsetneq \mathbb{R}^n$, usually determined by a set of constraints given by equations or inequalities.

There are no general techniques for solving the problem of global optimization. Therefore, one usually solves the problem of local optimization:

Find $x^*\in \Omega$ such that $f(x^*)\leq f(x)$ for all $x\in\Omega$ such that $\left\Vert x-x^*\right\Vert\leq R$, for some $R>0$.

If $f$ is a strictly convex function and $\Omega$ is a convex set, then any local minimum of $f$ in $\Omega$ is its unique global minimum.

On the other hand, the problem of finding a maximum is easily transformed into the problem of finding a minimum by changing the sign of the function. For example, the two-variable function $f:\mathbb{R}^2\to \mathbb{R}$, $f(x,y)=-x^2-y^2$ has a maximum at $(0,0)$ and the function $-f(x,y)=x^2+y^2$ has a minimum at $(0,0)$.

So we will only deal with the problem of finding a minimum. Let's start by reviewing the necessary and sufficient conditions for a minimum of a sufficiently regular function of several variables with the example of the two-variable function $f(x,y) = x\,e^{-x^2-y^2}$. If we plot it for $(x,y)\in[-2,2]\times[-2,2]$:
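As an illustration, here is a minimal plotting sketch, assuming numpy and matplotlib are available (the original page shows the resulting figure rather than code):

```python
# Sketch: surface plot of f(x, y) = x*exp(-x**2 - y**2) on [-2, 2] x [-2, 2].
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2, 2, 200)
y = np.linspace(-2, 2, 200)
X, Y = np.meshgrid(x, y)
Z = X * np.exp(-X**2 - Y**2)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, cmap="viridis")
ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("f(x, y)")
plt.show()
```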

We can see that it has a relative maximum and a relative minimum: the minimum at $(x_0,y_0)=\left(-\dfrac{1}{\sqrt{2}},0\right)\approx (-0.71,0)$ and the maximum at $(x_1,y_1)=\left(\dfrac{1}{\sqrt{2}},0\right)\approx (0.71,0)$.

Necessary condition of relative minimum

If we hold $y$ constant and equal to zero, we obtain the curve $g_1(x)=f(x,0)=x\,e^{-x^2}$, which is the curve obtained by intersecting the surface with the plane $y=0$.

And we see that the derivative of this curve at $x_0=-\dfrac{1}{\sqrt{2}}$ and $x_1=\dfrac{1}{\sqrt{2}}$ is equal to zero (horizontal tangent line). That is, in both cases, the partial derivative of the surface $f$ with respect to the variable $x$ is zero at these points.

If we now make $x$ constant and equal to $\dfrac{1}{\sqrt{2}}$ it is equivalent to the intersection of the surface with the plane $x=\dfrac{1}{\sqrt{2}}$ and we have the curve $g_2(y)=f\left(\dfrac{1}{\sqrt{2}},y\right)= \dfrac{1}{\sqrt{2}}e^{-\frac{1}{2}-y^2}$

And we see that the derivative of this curve at $y_1=0$ is equal to zero (horizontal tangent line). That is, the partial derivative of the surface $f$ with respect to the variable $y$ is zero at this point.

And if we now make $x$ constant and equal to $-\dfrac{1}{\sqrt{2}}$, it is equivalent to cutting the surface along the plane $x=-\dfrac{1}{\sqrt{2}}$, and we have the curve $g_3(y)=f\left(-\dfrac{1}{\sqrt{2}},y\right)= -\dfrac{1}{\sqrt{2}}e^{-\frac{1}{2}-y^2}$

And we see that the derivative of this curve at $y_0=0$ is equal to zero (horizontal tangent line). That is, the partial derivative of the surface $f$ with respect to the variable $y$ is zero at this point.

Therefore, if a function of several variables has a maximum or a minimum at a point, the necessary condition is that its partial derivatives vanish there. In particular, for a two-variable function like $f$

$$\dfrac{\partial f}{\partial x}=0 \quad \dfrac{\partial f}{\partial y}=0$$

And for the example function $f(x,y)= x\,e^{-x^2-y^2}$

$$ \begin{array}{lllllll}\\ \dfrac{\partial f}{\partial x}=0 & & e^{-x^2-y^2}+x\,e^{-x^2-y^2}(-2x)=0 & & e^{-x^2-y^2}(1-2x^2)=0 & & 1-2x^2=0\\ \dfrac{\partial f}{\partial y}=0 & & x\,e^{-x^2-y^2}(-2y)=0 & & -2xy=0 & & y=0 \ \ (\text{since } x\neq 0 \text{ by the first equation}) \end{array} $$

And the solution of this non-linear system is

$$ x = \pm\dfrac{1}{\sqrt{2}} \quad y=0 $$

And therefore the solutions are

$$(x_0,y_0)=\left(-\dfrac{1}{\sqrt{2}},0\right)\approx (-0.71,0)\qquad(x_1,y_1)=\left(\dfrac{1}{\sqrt{2}},0\right)\approx (0.71,0)$$
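The same first-order conditions can be set up and solved symbolically. The following is a sketch assuming sympy as the symbolic tool (the course page does not show this code):

```python
# Sketch: build f, differentiate, and solve the nonlinear system grad f = 0.
import sympy as sp

x, y = sp.symbols("x y", real=True)
f = x * sp.exp(-x**2 - y**2)

critical_points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(critical_points)
# Expected critical points: x = -sqrt(2)/2 and x = sqrt(2)/2, both with y = 0.
```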

These conditions are necessary but not sufficient: they can hold at a point that is neither a maximum nor a minimum. Such a point is called a saddle point. For example, for the function $f(x,y)=x^2-y^2$, if we intersect the surface with the plane $y=0$, the resulting curve has a minimum at $(0,0)$, but if we intersect it with the plane $x=0$, it has a maximum. Therefore at $(0,0)$ the surface has neither a maximum nor a minimum, although the necessary conditions are met.

Interactive plot (in a rectangular region)

Interactive plot (in a circular region)
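For the saddle-point example $f(x,y)=x^2-y^2$ we can also verify symbolically that the first-order conditions hold at $(0,0)$ while the Hessian there is indefinite. A sympy sketch (not from the original page):

```python
# Sketch: first-order conditions hold at (0, 0), but the Hessian is indefinite.
import sympy as sp

x, y = sp.symbols("x y", real=True)
f = x**2 - y**2

grad = [sp.diff(f, v) for v in (x, y)]
H = sp.hessian(f, (x, y))

print([g.subs({x: 0, y: 0}) for g in grad])  # [0, 0]: necessary conditions hold
print(H.eigenvals())  # expected {2: 1, -2: 1}: one positive, one negative eigenvalue
```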

Sufficient condition of relative minimum

Once we have established that the necessary conditions are met at the point, the sufficient condition for a minimum is that the function is convex at the point. In one dimension this was $f''(x_0) \gt 0$. In more than one dimension, the Hessian matrix at the point must be positive definite. The Hessian matrix, which collects the second-order partial derivatives, is for our example

$$ H(x,y)=\left(\begin{array}{cc} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\partial y}\\ \dfrac{\partial^2 f}{\partial y\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{array}\right) = \left(\begin{array}{cc} 2x(2x^2-3)e^{-x^2-y^2} & 2y(2x^2-1)e^{-x^2-y^2}\\ 2y(2x^2-1)e^{-x^2-y^2} & 2x(2y^2-1)e^{-x^2-y^2} \end{array}\right) = e^{-x^2-y^2}\left(\begin{array}{cc} 2x(2x^2-3) & 2y(2x^2-1)\\ 2y(2x^2-1) & 2x(2y^2-1) \end{array}\right) $$

And at the point $(x_0,y_0)=\left(-\dfrac{1}{\sqrt{2}},0\right)\approx (-0.71,0)$

$$ H\left(-\dfrac{1}{\sqrt{2}},0\right)= e^{-1/2}\left(\begin{array}{cc} -\dfrac{2}{\sqrt{2}}\left(2\cdot\dfrac{1}{2}-3\right) & 0\\ 0 & \dfrac{2}{\sqrt{2}} \end{array}\right) = \frac{2}{\sqrt{2}}e^{-1/2}\left(\begin{array}{cc} -\left(2\cdot\dfrac{1}{2}-3\right) & 0\\ 0 & 1 \end{array}\right) = \frac{2}{\sqrt{2}}e^{-1/2}\left(\begin{array}{rr} 2 & 0\\ 0 & 1 \end{array}\right) $$

A matrix is positive definite if and only if all its leading principal minors are positive. Since the scalar factor $\frac{2}{\sqrt{2}}e^{-1/2}$ is positive, it is enough to check the minors of the remaining matrix

$$ \left|2\right| = 2\gt 0 \quad \left|\begin{array}{rr} 2 & 0\\ 0 & 1 \end{array}\right|=2\gt 0 $$

and the Hessian matrix is positive definite at the point $(x_0,y_0)$, so we can guarantee that there is a local minimum there.
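The same check can be sketched symbolically, again assuming sympy (the point $-1/\sqrt{2}$ comes from the computation above):

```python
# Sketch: Hessian of f(x, y) = x*exp(-x**2 - y**2) evaluated at (-1/sqrt(2), 0).
import sympy as sp

x, y = sp.symbols("x y", real=True)
f = x * sp.exp(-x**2 - y**2)

H = sp.hessian(f, (x, y))
H0 = H.subs({x: -1/sp.sqrt(2), y: 0})

# Leading principal minors must both be positive for positive definiteness.
print(sp.simplify(H0[0, 0]))    # expected 2*sqrt(2)*exp(-1/2) > 0
print(sp.simplify(H0.det()))    # expected 4*exp(-1) > 0
print(H0.is_positive_definite)  # expected True
```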

Exercise

Using the necessary first-order conditions, find a minimum point of the function

$$ f(x,y,z)=2x^{2}+xy+y^{2}+yz+z^{2}-6x-7y-8z+9. $$

The necessary condition for a minimum at a point $\left(x_{m},y_{m},z_{m}\right)$ is that $\nabla f\left(x_{m},y_{m},z_{m}\right)=\left(f^{\prime}_{x},f^{\prime}_{y},f^{\prime}_{z}\right)=\left(0,0,0\right)$. That is

$$ \begin{cases} f^{\prime}_{x}=0\\ f^{\prime}_{y}=0\\ f^{\prime}_{z}=0\\ \end{cases} \qquad \begin{cases} 4x + y - 6=0\\ x + 2y + z - 7=0\\ y + 2z - 8=0 \end{cases} \qquad \begin{cases} x=1.2\\ y=1.2\\ z=3.4 \end{cases} $$

And the point $\left(x_{m},y_{m},z_{m}\right)=\left(1.2,1.2,3.4\right)$ meets the necessary conditions for a minimum.

We want to solve the system

$$ \begin{array}{rcrcrcr} 4x & + & y & & & = & 6\\ x & + & 2y & + & z & = & 7\\ & & y & + & 2z & = & 8 \end{array} $$

We will solve it by Gaussian elimination. We make zeros under the pivot $\mathbf{{\color{red}4}}$ in the first column.

$$ \begin{array}{c} r_{1}\\ r_{2}\\ r_{3} \end{array}\left(\begin{array}{rrr|r} \mathbf{{\color{red}4}} & 1 & 0 & 6\\ \mathbf{{\color{blue}1}} & 2 & 1 & 7\\ \mathbf{{\color{ForestGreen}0}} & 1 & 2 & 8 \end{array}\right)\begin{array}{rrl} r'_{1} & = & r_{1}\\ r'_{2} & = & r_{2}-\tfrac{\mathbf{{\color{blue}1}}}{\mathbf{{\color{red}4}}}r_{1}\\ r'_{3} & = & r_{3}-\tfrac{\mathbf{{\color{ForestGreen}0}}}{\mathbf{{\color{red}4}}}r_{1} \end{array} $$

We make zeros under the pivot $\mathbf{{\color{red}{7/4}}}$ in the second column.

$$ \begin{array}{c} r'_{1}\\ r'_{2}\\ r'_{3} \end{array}\left(\begin{array}{rrr|r} 4 & 1 & 0 & 6\\ 0 & \mathbf{{\color{red}{7/4}}} & 1 & 11/2\\ 0 & \mathbf{\color{blue}{1}} & 2 & 8 \end{array}\right)\begin{array}{rrl} r''_{1} & = & r'_{1}\\ r''_{2} & = & r'_{2}\\ r''_{3} & = & r'_{3}-\tfrac{\mathbf{\color{blue}{1}}}{\mathbf{{\color{red}{(7/4)}}}}r'_{2} \end{array} $$

And we already have an upper triangular matrix (with zeros below the main diagonal).

$$ \begin{array}{c} r''_{1}\\ r''_{2}\\ r''_{3} \end{array}\left(\begin{array}{ccc|c} 4 & 1 & 0 & 6\\ 0 & 7/4 & 1 & 11/2\\ 0 & 0 & 10/7 & 34/7 \end{array}\right) $$
$$ \begin{array}{cccccccc} & 4x & + & y & & & = & 6\\ & & & \dfrac{7}{4}y & + & z & = & \dfrac{11}{2}\\ & & & & & \dfrac{10}{7}z & = & \dfrac{34}{7} \end{array} $$

From the latter equation, we get $$z = \dfrac{\frac{34}{7}}{\frac{10}{7}} = \dfrac{34}{10}=3.4$$ From the second equation $$y =\dfrac{\frac{11}{2}-z}{\frac{7}{4}}=\dfrac{\frac{11}{2}-\frac{34}{10}}{\frac{7}{4}}= \dfrac{6}{5}=1.2$$ And from the first one $$x = \dfrac{6-y}{4}= \dfrac{6-\frac{6}{5}}{4} =\dfrac{6}{5}=1.2$$

And then $$ \fbox{$x = 1.2 \quad y = 1.2 \quad z = 3.4$} $$
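As a check, the same linear system can be solved directly; the following is a sketch assuming sympy as the symbolic tool (not code shown on the original page):

```python
# Sketch: solve the linear system coming from grad f = 0 for the exercise.
import sympy as sp

x, y, z = sp.symbols("x y z")
system = [sp.Eq(4*x + y, 6),
          sp.Eq(x + 2*y + z, 7),
          sp.Eq(y + 2*z, 8)]

print(sp.solve(system, [x, y, z]))  # expected {x: 6/5, y: 6/5, z: 17/5}
```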

Check that this point is a local minimum using sufficient second-order conditions

As

$$ \begin{cases} f^{\prime}_{x}=4x + y - 6\\ f^{\prime}_{y}=x + 2y + z - 7\\ f^{\prime}_{z}=y + 2z - 8 \end{cases} $$

The sufficient condition is that the Hessian matrix at the point $\left(x_{m},y_{m},z_{m}\right)$ is positive definite. The Hessian matrix of the function $f$ at any point $\left(x,y,z\right)$ is

$$ H_{f}=\left(\begin{array}{ccc} f^{\prime\prime}_{xx} & f^{\prime\prime}_{xy}& f^{\prime\prime}_{xz}\\ f^{\prime\prime}_{yx} & f^{\prime\prime}_{yy}& f^{\prime\prime}_{yz}\\ f^{\prime\prime}_{zx} & f^{\prime\prime}_{zy}& f^{\prime\prime}_{zz} \end{array}\right)=\left(\begin{array}{ccc} 4 & 1 & 0\\ 1 & 2 & 1\\ 0 & 1 & 2 \end{array}\right). $$

And in particular, at the point $\left(x_{m},y_{m},z_{m}\right)=\left(1.2,1.2,3.4\right)$, since all its entries are constant, its value is the same.

The sufficient minimum condition is that the Hessian matrix is positive definite. A matrix is positive definite if and only if all its leading principal minors are positive. So we check them $$ \left|4\right|=4\gt0, \qquad \left|\begin{array}{cc} 4 & 1\\ 1 & 2 \end{array}\right|=7\gt0 \qquad\text{and}\qquad \left|\begin{array}{ccc} 4 & 1 & 0\\ 1 & 2 & 1\\ 0 & 1 & 2 \end{array}\right|= 10\gt0. $$

Therefore the matrix is positive definite and the sufficient condition of minimum is met.
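A short symbolic verification of this second-order check, sketched with sympy (assumed tooling; the leading principal minors and the definiteness test mirror the computation above):

```python
# Sketch: Hessian and leading principal minors for the exercise function.
import sympy as sp

x, y, z = sp.symbols("x y z", real=True)
f = 2*x**2 + x*y + y**2 + y*z + z**2 - 6*x - 7*y - 8*z + 9

H = sp.hessian(f, (x, y, z))
print(H)  # constant matrix [[4, 1, 0], [1, 2, 1], [0, 1, 2]]
print([H[:k, :k].det() for k in (1, 2, 3)])  # expected [4, 7, 10], all positive
print(H.is_positive_definite)                # expected True
```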

Prove that the local minimum found is, in fact, global

To prove that the local minimum is also global, it is enough to prove that the function is convex: every local minimum of a convex function on $\mathbb{R}^n$ is a global minimum, and if the function is strictly convex, that minimum is unique.

If the Hessian matrix is positive definite at every point, the function is strictly convex. Since the Hessian matrix of $f$ has constant entries and is positive definite, it is positive definite on all of $\mathbb{R}^{3}$, so $f$ is strictly convex and the local minimum found at $\left(1.2,1.2,3.4\right)$ is its unique global minimum.