Re: Optimization for a positive definite constraint




In article <dtd2ut$mv3$1@xxxxxxxxxxxx>,
Hiu Chung Law <antispam@xxxxxxxxxxxx> writes:
see below under the ========= -line
two points:
the approach of cutting the step in order to remain feasible works often if
the solution itself is unconstrained.
but it may lead to inefficient small steps if one comes near to a boundary.
in your problem below the only point there positive definiteness enters
formally is the sqrt(det(..)).
hence you might replace this with sqrt(abs(det(..))) and
add a penalty term for indefiniteness:
i propose to use a smoothed version of the exact penalty term
-alpha*min{0,det(..)}
(not the squared one, which is too weak):
psi(det(.),alpha)= -alpha*det(..)+log(1+exp(alpha*det(..))) with alpha=100
say.
psi is a convex function of det which has precision

abs( psi(det,alpha) + alpha*min{0,det} )
<=min{log(2),(4/3)*exp(-alpha*abs(det)) }
maybe this becomes more efficient.
a side remark: trace(B_j*y_i*y_i^T) = y_i^T*B_j*y_i
hth
peter
===========================
Peter Spellucci <spellucci@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

In article <dt2aoc$q5p$1@xxxxxxxxxxxx>,
Hiu Chung Law <antispam@xxxxxxxxxxxx> writes:
>Let X be a n by n matrix.
>
>The goal is to minimize a real-value function f(X)
>subject to the constraint that X is symmetric, and X is positive definite.
>(The function f is not defined for X that is not positive definite).
>
>I am now treating this as an optimization problem with n^2 variables,
>and the gradient and the Hessian can be computed.
>
>The symmetric constraint can be handled easily because it is linear.
>However, the constraint that X should be positive-definite is more difficult
>to handle. Right now, I simply force my line-search procedure to reject
>any X that is not positive definite. However, this does not seem to be
>robust. I have also re-parameterized X by X = F * F', but the Hessian
>with respect to F is much more nasty than the Hessian with respect to X.
>
>
>I thought something related to how to handle bound constraint may help, but
>I cannot find anything similar. Also, this is quite different from
>semi-definite programming....
>
>Do you have any advice on how to tackle this type of problem?
>
>Thank you.
>

if f(X) is not sum_{i=1to n} x(i)*A(i) with given symmetric positive
semidefinite matrices A(i) (semidefinite programming)
but rather a "general" smooth function
then the approach you already tried
X=F*F' F lower triangular with positive diagonal
seems to me the best one, because, if for
example you go the way of parametrizing by the eigenvalues,
X=U*diag(lambda(i))*U' , lambda(i) >=0
you will also have the U as unknowns and the nasty orthogonality constraints
U'*U = I . Although there are special tricks to deal with orthogonality
constraints (paper by alan edelman and coworkers) this is even more nasty.
cutting the step is a faulty approach.
can you tell more about your f?
hth
peter


Thank you for the advice given by Peter and Brian.
I shall give the explicit form of my objective function.


The variables to be optimized consist of
{ a_1, a_2, ..., a_k, B_1, B_2, ..., B_k}

Here, a_j is a d by 1 vector, and B_j is a d by d matrix.

We are also given a set of fixed y_i, where i runs from 1 to n,
and each y_i is a d by 1 vector.
We are also given a set of numbers a_hi, where h runs from 1 to m,
i runs from 1 to n, and for each h, only two of the a_hi are non-zero,
and they equal to one half.

Define

q_ij = exp( - 0.5 * trace( B_j y_i y_i^T ) + y_i^T a_j ) ( det B_j )^(1/2)
r_ij = q_ij / sum_l q_il
s_hj = sum_i a_hi r_ij
g_h = sum_j s_hj log s_hj

The objective function to be minimized is

- sum_i log sum_j q_ij - sum_h g_h

subject to the constraints that all the B_j are positive definite and
that B_j are symmetric.

As one can see, this is a highly non-linear function which is not
convex, though its derivative is not difficult to compute.

Although there are k d (d+1) variables, it turns out that computing

inv(H) * u,

where H is the approximate Hessian matrix with respect to all the variables,
can be done efficiently. It is also not difficult to enforce the
restriction that H is positive definite. So, the Newton method can be
used.... if this is an unconstrained optimization problem.

Using the re-parameterization of B_j = C_j C_j^T, will, unfortunately,
destroy this nice property of the Hessian matrix.


I have also implemented an alternative approach to solve this problem
by optimizing wrt to a_j and C_j, using damped Newton method.
However, it is significantly slower than the version that optimize
with respect to a_j and B_j, probably because a large value of delta I
is needed to be added to the Hessian matrix.


Since my hack of "truncated line search" seems to be working okay and
is quite efficient, I hope I can stay with it. However, I also know
that I am lucky so far, and this is not a robust approach....


Can you see any method that can utilize the nice structure of the Hessian
matrix robustly?


Thank you in advance for any comment and insight!


.



Relevant Pages

  • Re: Optimization for a positive definite constraint
    ... >subject to the constraint that X is symmetric, ... the constraint that X should be positive-definite is more difficult ... if this is an unconstrained optimization problem. ... destroy this nice property of the Hessian matrix. ...
    (sci.math.num-analysis)
  • Re: Optimization for a positive definite constraint
    ... subject to the constraint that X is symmetric, ... If fis linear in the elements of X, ... semidefinite programming problem. ... the constraint that X should be positive-definite is more difficult ...
    (sci.math.num-analysis)
  • Re: Optimization for a positive definite constraint
    ... The symmetric constraint can be handled easily because it is linear. ... the constraint that X should be positive-definite is more difficult ... I have also re-parameterized X by X = F * F', but the Hessian ... with respect to F is much more nasty than the Hessian with respect to X. ...
    (sci.math.num-analysis)
  • Optimization for a positive definite constraint
    ... The goal is to minimize a real-value function f ... The symmetric constraint can be handled easily because it is linear. ... the constraint that X should be positive-definite is more difficult ... I have also re-parameterized X by X = F * F', but the Hessian ...
    (sci.math.num-analysis)