Re: Step Control for Gradient Descent ?
- From: dave.rudolf@xxxxxxxx
- Date: Wed, 18 Feb 2009 17:34:36 -0800 (PST)
On Feb 18, 3:31 pm, rgvick...@xxxxxxxxx wrote:
How large is the problem? (That is, how many variables and constraints
do you have?) If it is small (say, no more than 200-300 variables and
100-200 constraints) you can try using readily-available software,
such as the built-in EXCEL Solver.
Roughly 1000 variables with perhaps 50 or 100 constraints.
Alternatively, if the problem is
much smaller, you could try using the latest Global optimization
package in the LINGO solver, which is available from the LINDO Corp. I
think that limited-capacity "student" versions are available for free,
while larger-capacity implementations are available for a modest
cost---or much larger versions for around $1000. If you are at a
university that has a site licence for some optimization packages, why
not try using them?
Certainly, I can't use Excel for anything other than testing, as
eventually this has to be integrated with other software. However,
buying software might be an option down the road when I have more time
to redesign what is in front of me. There are other reasons to keep
along this track, but I will spare you the details. For now, I'm just
trying to squeeze a tiny bit more juice out of the current solver
until I have time to integrate a different one. I heard that people
had worked on dynamic stepping for gradient descent, and so I thought
I would ask around.
Why am I making an issue of this? Well, there are two aspects to your
problem: (1) getting a local optimum in reasonable time and with
reasonable accuracy; and (2) getting a global optimum. Let's just look
at (1) for the moment. It has long been known through examples that
simple gradient searches (of the type you seem to be using) are
dangerous: you can have convergence to a NON-OPTIMAL point.
What do you mean non-optimal? Of course it can converge on a local
minimum, but that is a problem with any local method, as far as I
understand. Newton's method, BFGS, conjugate gradient all have this
problem. Gradient methods will find the local minimum, given enough
time. Or do you have a counter-example of this?
Furthermore, even in the absence of constraints, the method can be
incredible slow, and can even converge to a non-optimal point when
using finite-wordlength arithmetic on a real computer (even though it
converges theoretically if one uses infinite-precision computations).
That is why numerous teams of researchers working for decades have
devoted so much effort to dealing with such problems. Typical,
powerful implementations would include better unconstrained
optimization methods (such as conjugate-gradient or quasi-Newton
methods) having quadratic convergence or better. And there is the
important issue of how to handle constraints, especially of inequality
type. There a typical, powerful approach is the reduced-gradient or
generalized reduced-gradient method, perhaps with modern "augmented
Lagrangian" methods. For example, some algorithms (such as MINOS) for
nonlinearly-constrained problems use generalized reduced-gradient
methods to handle linear constraints and augmented lagrangians to deal
with nonlinear constraints.
Alright, I will look into these as well.
As I said, teams of researchers have spent decades developing powerful
methods to handle such problems---because real-worls problems are too
hard to be handled naively---and have come up with techniques that
will be orders of magnitudes better than what you are attempting. Why
re-invent the wheel, and ineffectively at that? Even a tool such as
the EXCEL Solver (also available on open-source3 spreadsheets) embody
much of this advanced research, and utilize quasi-Newton mehtods
together with generalized reduced-gradient methods. The LINGO package
uses similar recent researh results.
Next, there is the issue (2): the question of global optimization. It
would likely pay you to read some of the extant research on this
topic; even doing a Google search will turn up a lot of useful
material free of charge. It is still a difficult problem, and without
some special structure to exploit, one can probably never be
guaranteed that a true global optimum has been reached. There are,
however, numerous effective heuristics that seem to have good
performance. Often one must apply (local) optimization packages
numerous times, perhaps using several randomly-generated starting
points. Other techniques, such as simulated annealing, taboo search or
genetic algorithms are brought to bear on such problems.
You might get better information and advice by posting your message to
the Operations Research newsgroup 'sci.op-research'. The majority of
postings over there deal with issues related to optimization.
Good luck.
R.G. Vickson
Adjunct Professor, University of Waterloo
Thanks Ray. This gives me a good start.
Dave
.
- Follow-Ups:
- Re: Step Control for Gradient Descent ?
- From: Ray Vickson
- Re: Step Control for Gradient Descent ?
- References:
- Step Control for Gradient Descent ?
- From: dave . rudolf
- Re: Step Control for Gradient Descent ?
- From: rgvickson
- Step Control for Gradient Descent ?
- Prev by Date: Re: compact set's definition
- Next by Date: Re: degree 61 polynomial
- Previous by thread: Re: Step Control for Gradient Descent ?
- Next by thread: Re: Step Control for Gradient Descent ?
- Index(es):
Loading