Prof. Vladimir Dobrushkin
Department of Mathematics
 

MTH243 (Calculus for Functions of Several Variables)
MATLAB. Chapter 15:
Optimization

Vladimir A. Dobrushkin, Lippitt Hall 202C, 874-5095, dobrush@uri.edu

In this course we will use the Mathematica computer algebra system (CAS), which is available in the computer labs at URI. The Mathematica projects are designed to help you learn new concepts. Mathematica is very useful for visualizing graphs and surfaces in three dimensions. Matlab (commercial software) is also available in the engineering labs; its free counterpart is called Octave. A student can also use the free CASs SymPy (based on Python), Maxima, or Sage.

 

Section 15.1. Local Extrema

Example 1. Plotting the graph of the function \( f (x, y) = x^2+y^2 \)
Plot3D[(x^2 + y^2), {x, -3, 3}, {y, -3, 3}, Axes -> True, PlotStyle -> Green]

Now we create a new graph: \( g (x, y) = x^2 + y^2 + 3 \)
Plot3D[(x^2 + y^2 + 3), {x, -3, 3}, {y, -3, 3}, Axes -> True]
Another graph of \( h(x, y) = 5 - x^2 - y^2 \)
Plot3D[(5 - x^2 - y^2), {x, -3, 3}, {y, -3, 3}, Axes -> True, PlotStyle -> Orange]
One more: \( k (x, y) = x^2 + (y - 1)^2 \)
Plot3D[(x^2 + (y - 1)^2), {x, -3, 3}, {y, -3, 3}, PlotStyle -> None]
Example 2. Plotting the graph of the function \( G(x,y)=e^{-(x^2+y^2)} \)
Plot3D[(E^-(x^2 + y^2)), {x, -5, 5}, {y, -5, 5}, PlotStyle -> Opacity[.8]]
Cross sections and the graph of a function: the cross section of \( f(x,y) = x^2 + y^2 \) where \( x=2 \) is \( 4+y^2 \)
Plot3D[{(x^2 + y^2), (4 + y^2)}, {x, -3, 3}, {y, -3, 3}]
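Since the page recommends free Python-based alternatives, the cross-section claim above can also be checked numerically; here is a minimal pure-Python sketch (the helper names are our own):

```python
def f(x, y):
    """The surface from Example 1 of this section: f(x, y) = x^2 + y^2."""
    return x**2 + y**2

def cross_section(y):
    """f restricted to the plane x = 2; this should equal 4 + y^2."""
    return f(2, y)

# Compare the restriction with the parabola 4 + y^2 on a grid of sample points.
samples = [-3 + 0.5 * k for k in range(13)]     # y from -3 to 3
max_gap = max(abs(cross_section(y) - (4 + y**2)) for y in samples)
```

The gap is exactly zero, confirming that fixing \( x=2 \) collapses the surface to the one-variable parabola plotted by the second argument of Plot3D.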

 

Section 15.2. Optimization

 

Section 15.3. Constrained Optimization

The method of Lagrange multipliers (named after Joseph Louis Lagrange, 1736--1813) is a strategy for finding the local maxima and minima of a function subject to equality constraints. For the case of only one constraint and only two choice variables (as exemplified in Figure 1), consider the optimization problem
\[ \begin{cases} \mbox{maximize/minimize} & \quad f(x,y), \\ \mbox{subject to} & \quad g(x,y) =c . \end{cases} \]
We assume that both f and g have continuous first partial derivatives. We introduce a new variable (λ) called a Lagrange multiplier and study the Lagrange function (or Lagrangian or Lagrangian expression) defined by
\[ {\cal L}(x,y,\lambda ) = f(x,y) - \lambda \left( g(x,y) -c \right) , \]

where the λ term may be either added or subtracted. If \( f(x_0, y_0) \) is a maximum of \( f(x, y) \) for the original constrained problem, then there exists \( \lambda_0 \) such that \( (x_0, y_0, \lambda_0 ) \) is a stationary point of the Lagrange function (stationary points are those points where the partial derivatives of the Lagrange function are zero). However, not all stationary points yield a solution of the original problem. Thus, the method of Lagrange multipliers yields a necessary condition for optimality in constrained problems. Sufficient conditions for a minimum or maximum also exist.

For the general case of an arbitrary number n of choice variables and an arbitrary number M of constraints, the Lagrangian takes the form

\[ {\cal L}(x_1, x_2, \ldots , x_n , \lambda_1 , \ldots , \lambda_M ) = f(x_1 , \ldots , x_n ) - \sum_{k=1}^M \lambda_k \left( g_k (x_1 , \ldots , x_n ) -c_k \right) ; \]

again the constrained optimum of f coincides with a stationary point of the Lagrangian.

For the two-dimensional problem introduced above, the extreme points are determined by solving the following nonlinear system of equations:

\[ \nabla_{x,y,\lambda} {\cal L} =0 \qquad \Longleftrightarrow \qquad \begin{cases} \nabla_{x,y} f(x,y) &= \lambda \,\nabla_{x,y} g(x,y) , \\ g(x,y) &= c. \end{cases} \]
The method generalizes readily to functions of \( n \) variables:
\[ \nabla_{x_1 ,\ldots , x_n , \lambda} {\cal L} =0 , \]

which amounts to solving \( n+1 \) equations in \( n+1 \) unknowns.

The constrained extrema of f are critical points of the Lagrangian \( {\cal L}, \) but they are not necessarily local extrema of \( {\cal L} \) (see Example 2 below). One may reformulate the Lagrangian as a Hamiltonian, in which case the solutions are local minima for the Hamiltonian. This is done in optimal control theory, in the form of Pontryagin's minimum principle. It was formulated in 1956 by the Russian mathematician Lev Pontryagin and his students.

Example 1. We start with a simple (actually linear) objective function \( f(x,y) = x+ y \) subject to a single constraint \( g(x,y) = x^2 + y^2 =2 .\)

First, we can plot the function to be optimized, \( f(x,y) = x+ y .\) Let’s open up Matlab and create our function.

function lagrange_multiplier
clear all; close all
x = linspace(-1.5,1.5);
[X,Y] = meshgrid(x,x);
figure; hold all
surf(X,Y,X+Y);
shading interp;
end

As a review of Matlab, notice that the function, titled “lagrange_multiplier,” must be closed with an end. The commands clear all and close all eliminate any variables and figures left over from previous Matlab work. linspace stands for “linearly spaced” and creates a row vector of equally spaced points between X1 and X2 in linspace(X1,X2). The values here are chosen to be \( \pm 1.5 \) because this is the general neighborhood of our constraint, the circle \( x^2 + y^2 = 2 \) of radius \( \sqrt{2} .\) We can adjust the linspace arguments as needed. The function meshgrid creates a Cartesian grid in 2D or 3D; such grids are typically used for evaluating functions of two or three variables and for surface plots. Our expression \( x+ y \) is passed to the surf function, which operates similarly to Matlab’s plot function except that color is proportional to surface height, which will give us additional information about our function.
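For readers following along in Octave or Python (both mentioned above as free alternatives), the equal spacing that the script relies on from linspace is easy to reproduce by hand. This is a minimal Python sketch of the idea, not Matlab's actual implementation:

```python
def linspace(a, b, n=100):
    """n equally spaced points from a to b; Matlab's linspace also defaults to 100."""
    step = (b - a) / (n - 1)
    return [a + k * step for k in range(n)]

# Same call as in the Matlab script: 100 points from -1.5 to 1.5.
pts = linspace(-1.5, 1.5)
# Consecutive gaps are all equal (up to floating-point rounding).
gaps = [pts[k + 1] - pts[k] for k in range(len(pts) - 1)]
```

The endpoints land on \( \pm 1.5 \) and every gap equals \( 3/99 \), which is exactly the row vector the surf call is evaluated on.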

Figure 1. - plot of function to be optimized.

From the graph, we may observe that \( f(x,y) = x+ y \) is an unbounded plane whose level sets are diagonal lines of slope \( -1 .\)

We may now plot our boundary conditions with the following script:

theta = linspace(0,2*pi);
[x1,y1] = pol2cart(theta,sqrt(2));
plot3(x1,y1,x1+y1)
The bounds for linspace here are 0 to 2π because we are plotting a full circle. The Matlab function pol2cart is a neat little function that transforms polar to Cartesian coordinates; the radius here is sqrt(2), so the points lie on the constraint circle \( x^2 + y^2 = 2 .\) To assist in debugging: theta must be in radians, and the two input arrays must be of the same size. plot3 works very similarly to the more standard function plot except that it specifically plots lines and points in 3D. Once again, check that the matrices passed to it are the same size to avoid errors. The plot with the constraint circle added looks like this:
Figure 2. - plot of the original function and the constraint function.
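The polar-to-Cartesian conversion that pol2cart performs is simple enough to sketch in pure Python; this hypothetical helper mirrors it and confirms that the generated points really sit on the constraint circle \( x^2+y^2=2 \) of radius \( \sqrt{2} \):

```python
import math

def pol2cart(theta, r):
    """Polar (theta, r) -> Cartesian (x, y), like Matlab's pol2cart."""
    return r * math.cos(theta), r * math.sin(theta)

# Sample the constraint circle x^2 + y^2 = 2, whose radius is sqrt(2).
r = math.sqrt(2)
thetas = [2 * math.pi * k / 100 for k in range(101)]
circle = [pol2cart(t, r) for t in thetas]

# Every generated point satisfies the constraint to machine precision.
residual = max(abs(x**2 + y**2 - 2) for x, y in circle)
```

As in the Matlab script, theta is in radians; passing degrees would trace out the wrong curve.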

We can glean an even better idea of what is going on by swapping our view around, either by adding rotate3d to the end of the script or by setting the view to a certain angle. Add the following two lines of code to your .m file:
view(48,8)
rotate3d on
The first fixes the view to a set point, while rotate3d allows you to drag the graph with your cursor.
Figure 3. - plot of the original function and the constraint function (adjusted view).


Graphically, the maximum occurs at \( (1,1) \) and the minimum at \( (-1,-1) .\) We will focus on the maximum for now. Let’s now use Lagrange multipliers to find the minimum and maximum values.

Our constraint function must vanish on the boundary. We can arrange this by subtracting 2 from both sides of the constraint equation: \( g(x,y) = x^2 + y^2 - 2 = 0 . \) We can now write our Lagrangian function as:

\[ {\cal L} (x,y,\lambda ) = f(x,y) + \lambda \,g(x,y) = x+y + \lambda \left( x^2 + y^2 -2 \right) . \]
We can now create a new function for our Lagrangian function:
function gamma = func(X)
    x = X(1);
    y = X(2);
    lambda = X(3);
    gamma = x + y + lambda*(x^2 + y^2 - 2);
end

Note that adding \( \lambda \,g(x,y) \) to our function does not change its value on the constraint, because there g(x,y) is by definition equal to zero (we set it equal to zero earlier!). Off the constraint curve the two functions differ, so this identification is valid only where the constraint is satisfied.
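This observation is easy to verify numerically. The pure-Python sketch below (helper names are our own) evaluates both f and the Lagrangian at a point on the constraint circle and checks that the multiplier term contributes nothing there, whatever value λ takes:

```python
import math

def f(x, y):
    return x + y

def g(x, y):
    return x**2 + y**2 - 2        # identically zero on the constraint circle

def L(x, y, lam):
    return f(x, y) + lam * g(x, y)

# Pick a point on the circle x^2 + y^2 = 2; there the penalty term vanishes
# no matter what value the multiplier lambda takes.
x0, y0 = math.sqrt(2) * math.cos(0.3), math.sqrt(2) * math.sin(0.3)
gap = max(abs(L(x0, y0, lam) - f(x0, y0)) for lam in (-5.0, 0.0, 1.0, 7.5))
```

Off the circle, g(x0, y0) would be nonzero and the gap would scale with λ.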

Next, we need to find the partial derivatives of the “augmented function.” We can do this in Matlab:

function dLambda = dfunc(X)
    % setting the partial derivative to have the same size as the function
       dLambda = nan(size(X));
 
       h = 1e-3; % this is a tiny step we are using.
       for i=1:numel(X)
           dX=zeros(size(X));
           dX(i) = h;
           dLambda(i) = (func(X+dX)-func(X-dX))/(2*h);
       end
end

This is a numerical approximation to the partial derivatives: for each variable, the loop takes a small symmetric step of size h in that coordinate and forms the central difference (func(X+dX) − func(X−dX))/(2h). It is nice to have a computer do this instead of manually computing partial derivatives. We’ve been nesting smaller functions within the larger function; ensure that each of your functions is closed with end!
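The same central-difference idea carries over directly to Python. This sketch mirrors the Matlab dfunc above and compares the numerical gradient of the Example 1 Lagrangian with the exact one (the test point is our own arbitrary choice):

```python
def central_diff(func, X, h=1e-3):
    """Gradient by central differences, mirroring the Matlab dfunc above."""
    grad = []
    for i in range(len(X)):
        Xp, Xm = list(X), list(X)
        Xp[i] += h                # step forward in coordinate i
        Xm[i] -= h                # step backward in coordinate i
        grad.append((func(Xp) - func(Xm)) / (2 * h))
    return grad

# The Lagrangian of Example 1: L = x + y + lam*(x^2 + y^2 - 2).
def lagrangian(X):
    x, y, lam = X
    return x + y + lam * (x**2 + y**2 - 2)

# Exact gradient for comparison: [1 + 2*lam*x, 1 + 2*lam*y, x^2 + y^2 - 2].
P = [0.5, -0.25, 2.0]
numeric = central_diff(lagrangian, P)
exact = [1 + 2*2.0*0.5, 1 + 2*2.0*(-0.25), 0.5**2 + (-0.25)**2 - 2]
err = max(abs(a - b) for a, b in zip(numeric, exact))
```

Because the Lagrangian is quadratic in each variable, the central difference is exact up to floating-point rounding.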

From here, the function we already defined will equal zero at any extrema. We can use the following code to evaluate the zeroes in the partial derivatives.

X1 = fsolve(@dfunc,[1 1 0],optimset('display','off'))
% let's check the value of the function at this solution
fval1 = func(X1)
sprintf('%f',fval1)
Because there are two solutions to the system (a minimum and a maximum), we plot the computed point to check that we obtained the one we wanted. sprintf simply writes the data obtained as a string. optimset creates or alters the optimization options structure in which named parameters are given specific values; here 'display','off' suppresses the solver’s output. The vector [1 1 0] is the initial guess for (x, y, λ), chosen near the expected maximum. Note that other Matlab optimization functions could be used here, notably fmincon. For maximization with fmincon, simply set the function to be maximized to be negative and find the minimum!
plot3(X1(1),X1(2),fval1,'ro','markerfacecolor','b')
plot3 is used again here because our plot is in 3D. Whether fsolve returns the minimum or the maximum depends on its starting point: you may either move the initial guess of the solver until the solution is the maximum, or simply rerun the code from different starting guesses until the plotted point sits at the maximum.
Figure 4. - Complete solution of Lagrange Multiplier Function.
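If fsolve is not available (for instance when working in plain Python rather than Matlab), the same stationarity system \( \nabla {\cal L} = 0 \) can be solved with a hand-rolled Newton iteration. This is a sketch with a starting guess of our own choosing, not Matlab's algorithm:

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))   # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= t * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                          # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def newton(X0, steps=50):
    """Newton's method on grad L = 0 for L = x + y + lam*(x^2 + y^2 - 2)."""
    x, y, lam = X0
    for _ in range(steps):
        F = [1 + 2*lam*x, 1 + 2*lam*y, x*x + y*y - 2]       # gradient of L
        J = [[2*lam, 0.0, 2*x],                              # its Jacobian
             [0.0, 2*lam, 2*y],
             [2*x, 2*y, 0.0]]
        dx, dy, dlam = solve3(J, [-v for v in F])
        x, y, lam = x + dx, y + dy, lam + dlam
    return x, y, lam

# Start near the maximum; the iteration converges to (1, 1, -1/2).
xs, ys, lams = newton([1.2, 0.8, -0.4])
```

As with fsolve, the starting guess selects which stationary point the iteration converges to.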


Add the final two lines to close the initial function and finish the solution:

'done'
end

X1 =

1.0000 1.0000 -0.5000

fval1 =

2

ans =

2.000000

Let’s check our work to confirm that Matlab has done its job correctly. Since \( {\cal L} (x,y, \lambda ) = x+y + \lambda \left( x^2 + y^2 -2 \right) , \) we may calculate the gradient of the Lagrangian function by taking partial derivatives of \( {\cal L} (x,y ,\lambda ) \) with respect to x, y, and λ.

\[ \nabla {\cal L} = \left[ 1 + 2\lambda x , 1 + 2\lambda y , x^2 + y^2 -2 \right] . \]
If we solve the three individual equations to find the value of λ, we find that λ is equal to \( \pm 1/2 , \) with \( x = y = -1/(2\lambda ) . \) The stationary points of \( {\cal L} (x,y ,\lambda ) \) are
\[ \left( 1 , 1 , -\frac{1}{2} \right) \qquad \mbox{and} \qquad \left( -1 , -1 , \frac{1}{2} \right) . \]
Plugging these points into our original function, we obtain the maximum value
\[ f\left( 1 , 1 \right) =2 \]
and the minimum value \( f(-1,-1) = -2 . \)
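The analytic check can itself be verified in a few lines of Python: at both stationary points the gradient of the Lagrangian vanishes exactly, and the objective takes the values 2 and −2.

```python
def grad_L(x, y, lam):
    """Gradient of L(x, y, lam) = x + y + lam*(x^2 + y^2 - 2)."""
    return (1 + 2*lam*x, 1 + 2*lam*y, x**2 + y**2 - 2)

# The gradient vanishes exactly at both stationary points ...
g_max = grad_L(1, 1, -0.5)
g_min = grad_L(-1, -1, 0.5)
# ... and the objective f = x + y takes the values 2 and -2 there.
f_max, f_min = 1 + 1, -1 + -1
```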

Example 2. Suppose we want to find the maximum values of \( f(x,y) = x^2 y \) with the condition that the x and y coordinates lie on the circle around the origin with radius 3, that is, subject to the constraint

\[ x^2 + y^2 -9 =0 . \]

As there is just a single constraint, we will use only one multiplier, say λ.

The constraint g(x, y) is identically zero on the circle of radius 3, so any multiple of g(x, y) may be added to f(x, y) without changing f(x, y) in the region of interest (on the circle, where our original constraint is satisfied). Apply the ordinary Lagrange multiplier method. Let:

\[ {\cal L}(x,y,\lambda ) = f(x,y) - \lambda \left( g(x,y) -c \right) = x^2 y - \lambda \left( x^2 + y^2 -9 \right) . \]
Now we can calculate the gradient of the Lagrangian and get the system of equations
\[ \begin{cases} 2xy &= 2\lambda x, \\ x^2 &= 2\lambda y , \\ x^2 + y^2 &= 9 , \end{cases} \qquad \Longleftrightarrow \qquad \begin{cases} 2x\left( y - \lambda \right) &= 0, \\ x^2 &= 2\lambda y , \\ x^2 + y^2 &= 9 . \end{cases} \]
If \( x=0 ,\) then \( y=\pm 3 \) and \( \lambda =0 .\) If \( y=\lambda , \)

we get \( x^2 = 2 y^2 . \) Substituting this into the constraint condition and solving for y gives \( y = \pm \sqrt{3} , \) and hence \( x = \pm \sqrt{6} . \) Thus, there are six critical points of the Lagrangian:
\[ (0,3), \quad (0,-3), \quad (\sqrt{6} , \sqrt{3} ) , \quad (-\sqrt{6} , \sqrt{3} ) , \quad (\sqrt{6} , -\sqrt{3} ) , \quad (-\sqrt{6} , -\sqrt{3} ) . \]
Evaluating the objective at these points, we find that
\[ f(0,\pm 3) =0, \qquad f\left( \pm \sqrt{6} , -\sqrt{3} \right) = -6\sqrt{3} , \qquad f\left( \pm \sqrt{6} , \sqrt{3} \right) = 6\sqrt{3} . \]

Therefore, the objective function attains its global maximum (subject to the constraint) at \( \left( \pm \sqrt{6} , \sqrt{3} \right) \) and its global minimum at \( \left( \pm \sqrt{6} , -\sqrt{3} \right) .\) The point \( ( 0 , 3 ) \) is a local minimum of f and \( ( 0 , - 3 ) \) is a local maximum of f, as may be determined by consideration of the Hessian matrix of \( {\cal L} ( x , y , 0 ) . \)

Note that while \( \left( \sqrt{6} , \sqrt{3} , \sqrt{3} \right) \) is a critical point of \( {\cal L} ( x , y , \lambda ) ,\) it is not a local extremum of \( {\cal L} ( x , y , \lambda ) . \)
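As a final numerical sanity check on Example 2 (recall that \( x^2 = 2y^2 \) together with \( y = \pm\sqrt{3} \) gives \( x = \pm\sqrt{6} \)), a short pure-Python verification of the critical points and objective values:

```python
import math

def f(x, y):
    """Objective of Example 2."""
    return x**2 * y

r3, r6 = math.sqrt(3), math.sqrt(6)
critical = [(0, 3), (0, -3),
            (r6, r3), (-r6, r3), (r6, -r3), (-r6, -r3)]

# All six points lie on the circle x^2 + y^2 = 9 ...
constraint_err = max(abs(x**2 + y**2 - 9) for x, y in critical)
# ... and f takes the values 0, 0, 6*sqrt(3), 6*sqrt(3), -6*sqrt(3), -6*sqrt(3).
values = [f(x, y) for x, y in critical]
```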

 

Lagrange multipliers and constrained optimization play an important role in economic theory, nonlinear programming, and control theory. In economics, maximizing utility functions subject to budget constraints forms the basis of the consumer choice problem, and the Lagrange multiplier itself has an economic meaning as a marginal change in value. In nonlinear programming it appears in the convex multiplier rule, and in control theory Lagrange multipliers serve as costate variables. Lagrange multipliers are a useful way to solve optimization problems with constraints, and the finite-difference approach is handy when we do not want to compute partial derivatives by hand. Moreover, the numerical-derivative code handles any number of variables without significant changes in code or processing power!