Using external function (scipy.stats.ks_2samp) in objective function

146 views
Skip to first unread message

herge...@gmail.com

unread,
Jul 14, 2016, 6:25:20 AM7/14/16
to Gurobi Optimization
Hi all,

I am trying to apply the 2-sample KS-test to some probability distributions as part of my objective function. I would like to use the following built-in python function: scipy.stats.ks_2samp. The aim is to minimise the p-value which the function outputs (the lower the p-value the closer the 2 distributions of interest).

Does the objective function (m.setObjective) in Gurobi's Python interface support external functions which are loaded from separate modules? In this case, I would be interested in the ks_2samp function from the scipy module.

If not, how should I proceed to still achieve my goal of applying a 2-sample KS-test?

Thanks for your help!

Kind regards,
Nadja


Sascha S.

unread,
Jul 14, 2016, 9:28:17 AM7/14/16
to Gurobi Optimization
Of course that's NOT possible and there are so many good reasons for that.

Just a few remarks:
- Gurobi, while tackling a lot of different optimization problems, only can deal with convex ones
- Taking some arbitrary external function means, that there might be some nonlinear-function which Gurobi can't handle (there are other solvers, but nonlinear-optimization is hard!)
- Gurobi can't convert this (possibly) nonlinear-function and convert it to some proxy-model (which is convex). This is active research and has some theoretical limits.
- The KS-test doesn't look like it's easy to describe in a convex-setting. Looking at some empirical distribution function (which can be non-convex) and the sup of the difference between two nonconvex functions makes me believe that this is very hard to approximate within a convex-setting

It also looks like you are missing some basics about the problems Gurobi is able to tackle. You should look up the docs.

Oliver Angelil

unread,
Jul 14, 2016, 7:26:31 PM7/14/16
to Gurobi Optimization

Hi Sascha and Nadja,

I am not too sure whether the two sample KS test is non-linear as Sascha is assuming. However I am also confused because, Sascha, you say that Gurobi can solve any convex problem, which would include x^p for p>=2. These are non-linear, which you say Gurobi can't handle. However, Gurobi is clearly able to solve quadratic functions according to this site.

When looking at the source code for the scipy ks_samp2 function, the following is used:
data_all =np.concatenate([data1,data2])
KS = np.max(np.abs(data1.searchsorted(data_all, side='right') / n1 - data2.searchsorted(data_all, side='right') / n2

So, all it does is concatenate the 2 data sets, then sort them, take the absolute value of a difference and apply the max() function. Are any of those functions non-linear?

hth,
Oliver

Daniel Espinoza

unread,
Jul 15, 2016, 4:23:10 PM7/15/16
to Gurobi Optimization
Hi Nadja,

While Gurobi does not allow to call external functions in the objective function; it can deal with many thinks.

If I understand you correctly, what you would need to do is to pose the KS function as a MIP (i.e. a mixed integer program/optimization problem).

For that you need to `sort' the values within the optimization problem; one possible trick for that is to use a CVaR-lilke formulation for each sample level; or try some approximation of the permutahedron (see Goeman's formulation which is theoretical)
Then, once sorted, you need some integer variables that would count the comparison sense between each (sorted) sample/ cumulative distribution level.

I am not aware of anybody actually trying this approach, but in principle should work.

Best,
Daniel E.
Reply all
Reply to author
Forward
0 new messages