Passing a function as an objective in JuMP


cd

Feb 6, 2015, 4:34:18 PM2/6/15
to juli...@googlegroups.com
Hi,

I've tried different solvers in the past, and trying to understand each one's peculiar syntax made my life miserable. I've recently been trying out JuMP and I think it's a great project because of how it unifies the syntax. Also, I've never found it so easy to get different solvers to work.

My apologies if this question should just be solved by looking at the docs. I looked around, but couldn't figure out the best approach.

I have a simple maximum likelihood problem (conditional logit estimation) that I'd like to solve with the solvers linked to JuMP.
My workflow is to define the functions, see how they work, and then use Optim to do a test maximization. Later I'd like to use them with solvers such as KNITRO.

Here is a self-contained example:

using DataFrames
using Optim
using JuMP
using KNITRO


global depvar = [1,1,1,4,4]


global x_mat = [866.0  962.64 859.9  995.76 1135.5 199.69 151.72 553.34 505.6  237.88
                727.93 758.89 796.82 894.69 968.9  168.66 168.66 520.24 486.49 199.19
                599.48 783.05 719.86 900.11 1048.3 165.58 137.8  439.06 404.74 171.47
                835.17 793.06 761.25 831.04 1048.7 180.88 147.14 483.0  425.22 222.95
                755.59 846.29 858.86 985.64 883.05 174.91 138.9  404.41 389.52 178.49]


function log_pn{T}(b::Vector{T}, xs::Vector{T}; J=5)
    # Returns the J x 1 vector of log choice probabilities given beta = b
    #   b  : 2 x 1 parameter vector
    #   xs : data on the alternatives, 10 x 1 vector
    num    = Array(Float64, J)
    base_x = [xs[1], xs[J]]
    for option in 1:J
        num[option] = e^(dot(b, [xs[option], xs[option+J]] - base_x))
    end
    den = sum(num)
    return log(num) - log(den)
end


function neg_log_lik(beta)
    # Negative log-likelihood: sum the chosen alternative's log probability over observations
    N = size(depvar)[1]
    sum_ll = 0.0
    for i in 1:N
        xs = vec(x_mat[i, :])
        sum_ll += log_pn(beta, xs)[depvar[i]]
    end
    return -sum_ll
end

## Solve with Optim.optimize
function optim_s( ; beta0=[0.01, 0.01])
    optimum = optimize(neg_log_lik, beta0, method=:cg, iterations=50)
    println(optimum.minimum)
end

optim_s()


Is there an easy way to pass the neg_log_lik function to a JuMP @setNLObjective and then call KNITRO? Should I switch to MathProgBase? Also, ideally I would like to use JuMP's automatic differentiation instead of having to provide exact first derivatives.

I could probably rewrite this problem in matrix form so that it fits into JuMP, but I'm wondering in general what the workflow should be for this kind of problem.


Any general or specific tips appreciated, thanks!

cd

Miles Lubin

Feb 6, 2015, 7:52:50 PM2/6/15
to juli...@googlegroups.com
No, there isn't an easy way to use a user-defined function inside @setNLObjective (unless it takes scalar input and produces scalar output, which isn't the case here). Your options are to:
1. Rewrite everything as closed-form algebraic expressions in the format that JuMP accepts.
2. For the case of convex objectives, use Convex.jl, which has built-in support for primitives like log_sum_exp that seem to appear in your model (see the sketch below). (Note that Convex.jl doesn't support derivative-based nonlinear solvers like Ipopt and KNITRO; instead it uses conic solvers like SCS.)
3. Implement your model in the MathProgBase nonlinear form. You're still on the hook for providing derivatives, though. There's currently no interoperability between the MPB nonlinear form and the input to Optim, but it wouldn't be hard to write a wrapper in one direction or the other.
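
For option 2, a rough, untested sketch of the conditional logit negative log-likelihood in Convex.jl might look like the following. It reuses x_mat and depvar from the original post; the logsumexp atom and SCSSolver are the assumed Convex.jl/SCS API, and the data layout (first 5 columns are one attribute, last 5 the other) follows the log_pn function above.

# Rough sketch only (untested): conditional logit MLE via Convex.jl's logsumexp atom
using Convex, SCS

J = 5                                   # number of alternatives
b = Variable(2)                         # the two taste parameters
nll = 0                                 # negative log-likelihood expression
for i in 1:size(x_mat, 1)
    xs = vec(x_mat[i, :])
    Xi = [xs[1:J] xs[J+1:2J]]           # J x 2 matrix of alternative attributes
    V  = Xi * b                         # utilities of the J alternatives
    nll += logsumexp(V) - V[depvar[i]]  # minus the chosen alternative's log probability
end
solve!(minimize(nll), SCSSolver())
# b.value then holds the fitted coefficients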

cd

Feb 7, 2015, 9:10:25 PM2/7/15
to juli...@googlegroups.com
Thanks for your answer!

Is there a reference for what algebraic expressions are accepted in JuMP? From the docs and examples I could gather that sum() and prod() work.
Also, is the AffExpr() object intended to be used for building more complicated expressions?

I'll check MathProgBase and see if I can formulate things there.

I think the documentation is quite sparse on some things. Should I open an issue on GitHub for particular requests?

Miles Lubin

Feb 8, 2015, 2:39:59 PM2/8/15
to juli...@googlegroups.com
You can use sum{} and prod{} (but not sum() or prod() or AffExpr()) inside nonlinear expressions.
You're right that the syntax isn't well documented. Please open an issue on GitHub with specific requests.
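
For example, a minimal, untested sketch of the sum{} syntax inside a nonlinear objective (Ipopt is used here only as a stand-in solver; any derivative-based NLP solver would do):

using JuMP, Ipopt

m = Model(solver=IpoptSolver())
@defVar(m, b[1:2])
# sum{} (not sum()) is the accepted form inside @setNLObjective
@setNLObjective(m, Min, sum{(b[j] - j)^2, j = 1:2})
solve(m)
println(getValue(b))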

cd

Feb 9, 2015, 9:16:27 PM2/9/15
to juli...@googlegroups.com

Thanks again! I opened an issue on KNITRO.jl.
I couldn't find many examples of the MathProgBase nonlinear form. Can someone provide a link to some examples?

Yee Sian Ng

Feb 9, 2015, 11:31:04 PM2/9/15
to juli...@googlegroups.com
Thanks for filing the issue! Please see the reply here: https://github.com/JuliaOpt/KNITRO.jl/issues/25, and let us know if it's resolved?

Miles Lubin

Feb 9, 2015, 11:38:22 PM2/9/15
to juli...@googlegroups.com, jung....@gmail.com
Currently the largest collection of examples for specifying a problem in MPB nonlinear form is here: https://github.com/JuliaOpt/MathProgBase.jl/blob/master/test/nlp.jl.

I know Felix Jung has also been developing some code; maybe he can share some experience or examples.

This would be a good chance to "soft" launch juliaopt-notebooks: https://github.com/JuliaOpt/juliaopt-notebooks. We're starting to collect example IJulia notebooks in a form that's easy to view and share. Contributions are welcome.

Felix Jung

Feb 13, 2015, 4:45:21 AM2/13/15
to juli...@googlegroups.com, jung....@gmail.com
True, I've been implementing an NLP with MathProgBase, currently testing with Ipopt and NLopt, but I haven't obtained any results yet. I can post my code, but it's undocumented and basically just picks up bread crumbs from the tests Miles posted and the MathProgBase docs. Anyway, here's my current code.

using MathProgBase
import MathProgBase.MathProgSolverInterface


type NSOpt <: MathProgSolverInterface.AbstractNLPEvaluator
    ## Constant model input
    sample::Matrix{Float64}         # Stores the data set
    surv_firms::Vector{Bool}        # Which firms in sample survive
    surv_dur_lim::Matrix{Float64}   # Limits of the survival duration interval
    exit_firms::Vector{Bool}        # Which firms in sample default
    exit_lim::Matrix{Float64}       # Limits of the default period (relative to t)

    ## Computation results storage
    params::Vector{Float64}         # Temporarily stores the current parameters used by solver
    nsfactors::Matrix{Float64}      # Temporarily stores the current ns factors for use by solver
    nsdecay::Vector{Float64}        # Temporarily stores the current ns decay parameter for use by solver
    int_stor::Vector{Float64}       # Storage vector for intensity integral
    int_stor_g::Matrix{Float64}     # Storage matrix for intensity integral gradient
    int_stor_h::Array{Float64, 3}   # Storage array for intensity integral Hessian
    int_stor_h_com::Matrix{Float64} # Storage matrix for common factors in intensity integral Hessian

    function NSOpt(sample::Matrix{Float64}, surv_firms::Vector{Bool}, surv_dur_lim::Matrix{Float64},
        exit_firms::Vector{Bool}, exit_lim::Matrix{Float64}, params::Vector{Float64},
        int_stor::Vector{Float64}, int_stor_g::Matrix{Float64}, int_stor_h::Array{Float64, 3},
        int_stor_h_com::Matrix{Float64})

        ## Compute Nelson-Siegel factors and decay parameter
        nsfactors, nsdecay = compute_nsfactors(sample, params)
        ## Initialize NSOpt object
        new(sample, surv_firms, surv_dur_lim, exit_firms, exit_lim, params, nsfactors, nsdecay,
            int_stor, int_stor_g, int_stor_h, int_stor_h_com)
    end
end


function compute_nsfactors(sample::Matrix{Float64}, params::Vector{Float64})
    trans     = reshape(params, (convert(Int64, length(params) / 4), 4)) # Transformation matrix for sample
    nsfactors = sample * trans[:, 1:3]                   # Update NS factors in model
    nsdecay   = exp(sample * trans[:, 4])                # Update NS decay parameters in model

    return nsfactors, nsdecay
end


# TODO (Felix, 2015-02-04): Figure out way to deal with different number of parameters
function update_model!(params::Vector{Float64}, d::NSOpt)
    if params != d.params
        d.params    = params                                   # Update parameters in model
        trans       = reshape(params, (convert(Int64, length(params) / 4), 4)) # Transformation matrix for sample
        d.nsfactors = d.sample * trans[:, 1:3]                 # Update NS factors in model
        d.nsdecay   = exp(d.sample * trans[:, 4])              # Update NS decay parameters in model
    end
end


# TODO (Felix, 2015-02-03): Continue implementation from here.
#   - Try to implement evaluation functions as wrappers for functions in ns_likelihood.jl
#   - Put execution code into ns_estimation.jl


function MathProgSolverInterface.initialize(d::NSOpt, requested_features::Vector{Symbol})
    for feat in requested_features
        if !(feat in [:Grad, :Jac, :Hess])
            error("Unsupported feature $feat")
            # TODO: implement Jac-vec and Hess-vec products
            # for solvers that need them
        end
    end
end


MathProgSolverInterface.features_available(d::NSOpt) = [:Grad, :Jac, :Hess]


## Compute target function
function MathProgSolverInterface.eval_f(d::NSOpt, x)
    #= update_model!(x, d) =#
    default_lik!(x, d.sample, d.surv_firms, d.surv_dur_lim, d.exit_firms, d.exit_lim, d.int_stor)
end


## Compute constraints (this problem is unconstrained)
MathProgSolverInterface.eval_g(d::NSOpt, g, x) = nothing


## Compute target function gradient
function MathProgSolverInterface.eval_grad_f(d::NSOpt, grad_f, x)
    #= update_model!(x, d) =#
    default_lik_g!(x, d.sample, d.surv_firms, d.surv_dur_lim, d.exit_firms, d.exit_lim,
        d.int_stor, d.int_stor_g, grad_f)
end


## Compute Hessian of the Lagrangian matrix
function MathProgSolverInterface.eval_hesslag(d::NSOpt, H, x, σ, μ)
    # TODO (Felix, 2015-02-04): How to deal with non existing constraints?
    #= update_model!(x, d) =#
    ## Compute likelihood Hessian vector
    default_lik_h!(x, d.sample, d.surv_firms, d.surv_dur_lim, d.exit_firms, d.exit_lim, d.int_stor,
        d.int_stor_g, d.int_stor_h, d.int_stor_h_com, H)

    ## Multiply with weight σ to obtain the Lagrangian Hessian
    H *= σ
end


### Structure of the Lagrangian Hessian
function MathProgSolverInterface.hesslag_structure(d::NSOpt)
    # TODO (Felix, 2015-02-04): figure out how to incorporate models with fewer factors
    dim = length(d.params)
    return findn(sparse(triu(ones(dim, dim))))
end


# Structure of the Jacobian of all constraints (this problem is unconstrained).
MathProgSolverInterface.jac_structure(d::NSOpt) = [],[]


## Compute Jacobian matrix of constraints
MathProgSolverInterface.eval_jac_g(d::NSOpt, J, x) = nothing

You can see that this is an unconstrained optimization problem (actually an MLE). I store the input data for the objective, gradient, and Hessian as arrays inside the NLPEvaluator, along with some arrays used for temporary storage. However, I've come to find that this can be troublesome. For example, with certain of NLopt's global optimization algorithms, the seemingly parallel optimizations of the local sub-problems will overwrite these storage arrays without respecting the results other sub-problems have stored in them. So I guess you shouldn't use this technique in those cases (a sketch of an alternative appears after the setup code below). The actual optimization setup and call is done in another file:

## Setup optimization input
startvals = [randn(3 * length(COVAR_SEL)), randn(length(COVAR_SEL)) / 1000]
l         = -20 .* ones(length(startvals))
u         = 20 .* ones(length(startvals))
lb        = Float64[]
ub        = Float64[]


## Setup optimization
solver = NLoptSolver(algorithm = :GN_ISRES)
m      = MathProgSolverInterface.model(solver)
p      = NSOpt(COVARIATES[:, COVAR_SEL], DEF_FIRMS_SURV, DEF_SURV_LIMITS, DEF_FIRMS_EXIT,
               DEF_EXIT_LIMITS, startvals, stor, stor_g, stor_h, stor_h_com)


MathProgSolverInterface.loadnonlinearproblem!(m, length(startvals), 0, l, u, lb, ub, :Max, p)
MathProgSolverInterface.setwarmstart!(m, startvals)


## Run optimization
info("Starting optimisation...")
MathProgSolverInterface.optimize!(m)


As I said, the implementation of the MathProgBase interface here seems to work. However, my objective, gradient, and Hessian still seem to have some problems. Also, the choice of the proper optimization algorithm is non-trivial. So far Ipopt returns garbage, and I've hence started looking at NLopt.
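
Incidentally, here is a minimal, untested sketch of the alternative mentioned above: allocate the scratch storage per evaluation instead of sharing it through the evaluator, trading extra allocations for safety when sub-problems run concurrently. The default_lik! call is the one from the code above; everything else is illustrative.

## Sketch: per-call scratch storage instead of the shared d.int_stor
function MathProgSolverInterface.eval_f(d::NSOpt, x)
    int_stor = similar(d.int_stor)   # fresh temporary for this evaluation only
    default_lik!(x, d.sample, d.surv_firms, d.surv_dur_lim, d.exit_firms, d.exit_lim, int_stor)
end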

Good luck!

Tyler Ransom

Jul 10, 2015, 3:10:26 PM7/10/15
to juli...@googlegroups.com
Hi cd,

I don't know how much progress you've made in the past 5 months, but I have the conditional logit model coded up in JuMP. I'd be happy to share it with you.

Tyler



David Anthoff

Jul 10, 2015, 4:14:23 PM7/10/15
to juli...@googlegroups.com

It would actually be nice to have an interface to MathProgBase that is at the complexity level of the Optim interface. Autodiff could be supported in the same way as in Optim, via Duals. The whole thing would be less fancy than JuMP, but probably still very useful in some cases.

 

Having worked with MathProgBase a fair bit lately, I think there is actually nothing preventing this from happening, other than a volunteer to try to code it up :)
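
To sketch what that might look like (purely illustrative and untested; FunctionEvaluator is a made-up name, and Calculus.jl finite differences stand in for the Dual-based autodiff mentioned above), a thin wrapper over the MathProgSolverInterface API used earlier in this thread could be:

# Rough sketch: wrap a plain f(x) as an MPB nonlinear evaluator
using MathProgBase
import MathProgBase.MathProgSolverInterface
using Calculus

type FunctionEvaluator <: MathProgSolverInterface.AbstractNLPEvaluator
    f::Function                          # plain Julia objective, f(x::Vector) -> scalar
end

function MathProgSolverInterface.initialize(d::FunctionEvaluator, requested_features::Vector{Symbol})
    for feat in requested_features
        feat in [:Grad, :Jac] || error("Unsupported feature $feat")
    end
end

MathProgSolverInterface.features_available(d::FunctionEvaluator) = [:Grad, :Jac]
MathProgSolverInterface.eval_f(d::FunctionEvaluator, x) = d.f(x)
MathProgSolverInterface.eval_grad_f(d::FunctionEvaluator, grad_f, x) =
    copy!(grad_f, Calculus.gradient(d.f, x))         # finite-difference gradient

# Unconstrained problem: no constraints, empty Jacobian structure
MathProgSolverInterface.eval_g(d::FunctionEvaluator, g, x) = nothing
MathProgSolverInterface.eval_jac_g(d::FunctionEvaluator, J, x) = nothing
MathProgSolverInterface.jac_structure(d::FunctionEvaluator) = Int[], Int[]

# Usage with a gradient-based NLopt algorithm (requires `using NLopt`),
# reusing neg_log_lik from the first post in this thread:
# m = MathProgSolverInterface.model(NLoptSolver(algorithm = :LD_LBFGS))
# MathProgSolverInterface.loadnonlinearproblem!(m, 2, 0, fill(-20.0, 2), fill(20.0, 2),
#                                               Float64[], Float64[], :Min, FunctionEvaluator(neg_log_lik))
# MathProgSolverInterface.setwarmstart!(m, [0.01, 0.01])
# MathProgSolverInterface.optimize!(m)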


Miles Lubin

Jul 10, 2015, 4:18:44 PM7/10/15
to juli...@googlegroups.com, ant...@berkeley.edu
Agreed, this is low-hanging fruit. Contributions are welcome :)