Ipopt, limited-memory, and linear solver performance


Tim Holy

Oct 21, 2015, 2:13:18 PM
to julia-opt
(Apologies for any duplicates; I sent this earlier from a different address,
one that I've not registered with this mailing list, and very likely it got
munched by the spamivore. So trying again.)

I've got an optimization involving ~10^7 variables for which I'm testing
Ipopt, with options that include the following:
hessian_approximation="limited-memory",
linear_system_scaling="none", # better performance? doesn't help
print_level=print_level,
sb="yes") # suppress the annoying banner

My objective function is quite complicated (among other things, it involves
performing calculations on a 100GB array stored in a file), but I've put in a
lot of work to make its evaluation efficient. The very first function evaluation
is I/O limited and takes about 20 minutes, but once the relevant data are
cached in memory it drops to about 0.5s per function evaluation.

Now that I've achieved good performance with my objective function, I'm
distressed to find that Ipopt is exacting a huge cost: something like
95% of the time is spent in routines that profiling (with C=true) reveals to be
the MUMPS solver. This seems surprising, since with "limited-memory" I presume
it's using l-bfgs (see http://www.coin-or.org/Ipopt/documentation/node52.html), and the linear
algebra operations of l-bfgs are not very taxing (they should be much
cheaper than my objective function).

Any ideas? I just submitted a registration to download the HSL routines under
an academic license, but the notice suggested it could be a day or two before
that's approved. I don't really want to twiddle my thumbs in the meantime,
especially since I have no idea whether it will help.

(Since I first wrote this I've been playing with the teh/constrained branch of
Optim with very promising results, but I'll discuss that elsewhere.)

--Tim

Tony Kelman

Oct 21, 2015, 2:49:11 PM
to julia-opt
It's using a limited-memory quasi-Newton approximation to the Hessian when you specify limited-memory, but it's still a direct Newton-based method: it solves the KKT system at each iteration to compute a descent direction. If you can provide an exact Hessian, you will get better convergence in fewer iterations.
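By contrast, a *pure* L-BFGS step needs only the two-loop recursion over the stored curvature pairs — O(m·n) vector arithmetic with no factorization at all — which is presumably why the linear algebra was expected to be cheap. A minimal numpy sketch of that recursion (illustrative only; names are mine, not Ipopt's internals):

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: returns an approximation of -H^{-1} @ grad
    built from stored pairs s_i = x_{i+1} - x_i, y_i = g_{i+1} - g_i.
    Cost is O(m*n) flops for m pairs; no matrix is ever factorized."""
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest.
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        a = rho * np.dot(s, q)
        alphas.append(a)
        q -= a * y
    # Scale by gamma = s'y / y'y, a common initial-Hessian choice.
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= np.dot(s, y) / np.dot(y, y)
    # Second loop: oldest pair to newest.
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * np.dot(y, q)
        q += (a - b) * s
    return -q  # descent direction
```

With a typical memory of ~10 pairs and 10^7 variables, each step is a handful of axpy/dot operations over dense vectors — nothing like the cost of factorizing a sparse 10^7-row KKT matrix, which is what Ipopt's interior-point iteration does even in limited-memory mode.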

HSL solvers should give you at least a factor of 3 speedup and possibly far more, depending on your problem characteristics. Number of variables doesn't matter as much as the number of nonzeros in the Jacobian and Hessian (either exact or quasi-newton).

Tim Holy

Oct 21, 2015, 5:14:18 PM
to juli...@googlegroups.com
Thanks for explaining that it's not actually L-BFGS. I'll probably check out
what other solvers deliver when I get usage rights.

Regarding your suggestion to supply the Hessian, it's not inconceivable, but
neither is it trivial. For my information, if the Hessian is 10^7x10^7 and
there are ~10^11 nonzeros, where does that fall on the scale from "walk in
the park" to "would take weeks of computation"? Alternatively, I could compute
Hessian-vector products and probably gain an effective 100x higher sparsity.
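The Hessian-vector product route can be prototyped without writing any new derivative code, via a forward difference of the gradient: Hv ≈ (∇f(x+εv) − ∇f(x))/ε, at the cost of one extra gradient evaluation per product. A hedged numpy sketch (the ε heuristic is a standard rule of thumb, not tuned for this problem):

```python
import numpy as np

def hessian_vector_product(grad_f, x, v, eps=None):
    """Forward-difference approximation of H(x) @ v using only gradient calls."""
    if eps is None:
        # Rule-of-thumb step: sqrt(machine eps), scaled by |x| and |v|.
        eps = np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(x)) \
              / max(np.linalg.norm(v), 1e-12)
    return (grad_f(x + eps * v) - grad_f(x)) / eps

# Check on a quadratic f(x) = 0.5 x'Ax, where the Hessian is exactly A.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
grad = lambda x: A @ x
x0 = np.array([1.0, -2.0])
v = np.array([0.5, 1.0])
hv = hessian_vector_product(grad, x0, v)
```

For a quadratic the forward difference is exact up to rounding; for a general objective the truncation error is O(ε), which is usually fine for a Krylov-type inner solve.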

But the Optim branch is looking great now (convergence in ~2hrs to the same
minimum Ipopt seems to be zeroing in on after 2 days) using just derivatives
and L-BFGS. So unless there's much better performance to be had through extra
effort, I may just go with what works.

--Tim

Tony Kelman

Oct 21, 2015, 5:30:41 PM
to julia-opt
Probably closer to days or weeks. Worth the time to build a custom version of Ipopt with better linear solvers (also worth trying Pardiso and WSMP), linking against an LP64 build of openblas or MKL, etc. You have an order of magnitude more variables than the largest problems in http://www.optimization-online.org/DB_FILE/2011/04/2992.pdf, though they don't report sparsity statistics and you haven't said anything about what your constraints look like.

If your problem is well-behaved enough and has few enough constraints that you can get a simpler algorithm to converge well, then you should probably do so. But what you're seeing, with sparse linear algebra taking all of the time in Ipopt, is perfectly normal and expected behavior.

Alex Dowling

Dec 12, 2015, 11:11:34 AM
to julia-opt
Hello Tim,

I'd like to reiterate Tony's comments. IPOPT is typically much faster and more reliable with the HSL routines. You can download them here: http://www.hsl.rl.ac.uk/ipopt/ There is a free license for academics, and a free personal license for some of the older HSL solvers.

In my experience, IPOPT really shines on sparse problems with an exact user-supplied Hessian. Convergence is much slower (and less robust) with a Hessian approximation.
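If it helps anyone reading along: Ipopt takes the exact Hessian (of the Lagrangian) in lower-triangular triplet format, and the nonzero count of that triangle is what drives the factorization cost Tony mentioned. A toy scipy sketch of extracting that structure (illustrative only, obviously nowhere near Tim's 10^7-variable scale):

```python
import numpy as np
from scipy import sparse

# Toy sparse symmetric Hessian.
H = sparse.csc_matrix(np.array([[2.0, 1.0, 0.0],
                                [1.0, 3.0, 0.5],
                                [0.0, 0.5, 4.0]]))

# Ipopt wants only the lower triangle, as (row, col, value) triplets;
# by symmetry the strict upper triangle is redundant.
L = sparse.tril(H).tocoo()
rows, cols, vals = L.row, L.col, L.data
```

The sparsity pattern (rows, cols) is fixed once up front; only the values are re-evaluated at each iterate.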

Good luck,
Alex Dowling