Optimization of a complicated expensive closed black box function


Devraj Mandal

Apr 1, 2016, 6:37:39 AM4/1/16
to BayesOpt discussion
Hi

I am working on tuning the hyper-parameters of a complicated black-box function whose functional form I do not know. Is it possible to tune these hyper-parameters with BayesOpt to get the best possible result?

Let me explain the scenario.

My function does face recognition, and ideally I would like to get the best possible result, so the output of the function is the accuracy.
Suppose the function is 'f' and its parameters are 'a', 'b' and 'c'. I have a fair guess that these parameters should lie within certain intervals, for example:
a between [0.0001, 1], but I only want to search over log-spaced values, i.e., 1, 0.1, 0.01, ...
b between [1, 10], but I only want to search in steps of 2
c can take any value.
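
In MATLAB, such candidate grids might look like this (the values for 'a' and 'b' are only illustrative, and 'c' has no fixed grid yet):

    a_candidates = 10.^(0:-1:-4);   % 1, 0.1, 0.01, 0.001, 0.0001 (log-spaced)
    b_candidates = 1:2:9;           % 1, 3, 5, 7, 9 (steps of 2 within [1, 10])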

Is it possible to set up such bounds in BayesOpt?

Any help will be greatly appreciated.

José Nogueira

Apr 1, 2016, 7:42:53 AM4/1/16
to Devraj Mandal, BayesOpt discussion

Do you mean you wish to search over discrete values of these parameters inside certain ranges? Or are they continuous?


Devraj Mandal

Apr 1, 2016, 7:55:29 AM4/1/16
to BayesOpt discussion
Yes, that's exactly what I need. For example, I generally know that my algorithm will perform similarly for values like 0.1, 0.2, ..., but will improve or get worse for values like 0.01, 0.001, etc. So, given that my search space is discrete and different for each parameter, can such an optimization be done in the BayesOpt framework?

José Nogueira

Apr 1, 2016, 8:08:00 AM4/1/16
to Devraj Mandal, BayesOpt discussion

I have never tried using discrete models with BayesOpt, but I think there is a discrete implementation of Bayesian optimization in the BayesOpt toolbox. There you can choose the range of the search space for each of the variables.

Still, I don't know whether, in the discrete case, you can specify exactly which discrete values are used in BO. You can check the BayesOpt documentation to find out, or you can simply write a function that converts a search table for each variable into an index n = 1, ..., nmax.
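
For example (purely illustrative; 'my_black_box' and the tables are placeholders), the optimizer would only see integer indices, and a small wrapper would look up the actual parameter values:

    % Hypothetical lookup tables with the allowed values of each parameter.
    a_table = [1 0.1 0.01 0.001 0.0001];
    b_table = 1:2:9;

    % The objective BayesOpt sees: it receives indices and maps them to the
    % real parameter values before calling the expensive black box.
    obj = @(idx) my_black_box(a_table(round(idx(1))), b_table(round(idx(2))));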
Do you have any sample code already? If I find the time I can help you out.


Devraj Mandal

Apr 1, 2016, 8:31:31 AM4/1/16
to bayesopt-...@googlegroups.com
I do have sample code, but before sending it to you I need to explain a few things. I am working on cross-modal data analysis, which basically means that I will be matching data from one modality to another. In my case I am using the features from this dataset: https://www.google.co.in/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=multiple%20features%20dataset. The dataset contains images of digits encoded in six different feature representations, and I want to perform matching across the different feature types.

To test the effectiveness of BayesOpt, I intend to set up a very simple example in which I use the BayesOpt framework to tune the hyper-parameters of a dictionary-based cross-modal approach, the CDL approach (found here: http://www.cs.toronto.edu/~slwang/SCDL.pdf). The code is available at the author's webpage: http://www4.comp.polyu.edu.hk/~cslzhang/SCDL.htm. The algorithm's performance depends on parameters such as lambda, gamma and the dictionary size, and these parameters live on very different scales.

For example, the dictionary size generally varies from 30 to 100 and beyond, and we usually tune it manually in steps of 10, while lambda and gamma generally vary between 1e-5 and 100 (they weight the regularizers in the objective function). So you can see that the upper and lower bounds of the individual parameters are hugely different.

I made a working model of this in MATLAB and am trying to run it. The main lines of my code are as follows:

            disp('Discrete optimization');
            fun = 'cdl_acc'; n = 3;

            % 'cdl_acc' is the BLACK BOX I intend to optimize: I feed my data
            % to the algorithm and, for a given hyper-parameter setting, it
            % returns an accuracy. My objective is to maximize the accuracy
            % of cdl_acc.

            % The set of candidate points must be numDimension x numPoints.
            np = 10;
            xset = zeros(n, np);

            % bounds for dict_size (integer-valued)
            lb = 30; ub = 100;
            xset(1,:) = round(lb + (ub - lb) .* rand(1, np));

            % bounds for lambda
            lb = 1e-5; ub = 100;
            xset(2,:) = lb + (ub - lb) .* rand(1, np);

            % bounds for gamma
            lb = 1e-5; ub = 100;
            xset(3,:) = lb + (ub - lb) .* rand(1, np);

            tic;
            bayesoptdisc(fun, xset, params);   % params: struct of BayesOpt settings, defined earlier
            toc;

            % Brute-force check: evaluate every candidate and report the true optimum.
            yset = zeros(np, 1);
            for i = 1:np
                yset(i) = feval(fun, xset(:,i));
            end
            [y_min, id] = min(yset);
            disp('Actual optimal');
            disp(xset(:,id));
            disp(y_min);
            disp('Press ENTER');
            pause;


I am not sure that I am doing the correct thing here. Does the same number of candidate points need to be sampled for each parameter, i.e., 10 in this case? Please note that this is a very basic model; in the future I intend to optimize the hyper-parameters of complex algorithms with many parameters, but in all cases I will have some prior knowledge of the search space (though it will still be huge). Regarding code sharing, is it okay if I share the MATLAB code with you?
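
One alternative I considered (just a sketch on top of the same bayesoptdisc call; I am not sure this is the intended usage) is to enumerate the full Cartesian product of per-parameter candidate lists instead of sampling random points, so that each parameter keeps its own grid:

    % Build every combination of the per-parameter candidate values.
    dict_vals   = 30:10:100;        % dictionary sizes in steps of 10
    lambda_vals = 10.^(-5:2);       % 1e-5 ... 100 on a log grid
    gamma_vals  = 10.^(-5:2);

    [D, L, G] = ndgrid(dict_vals, lambda_vals, gamma_vals);
    xset = [D(:)'; L(:)'; G(:)'];   % numDimension x numPoints, as bayesoptdisc expects

    bayesoptdisc(fun, xset, params);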

Thanks in advance.



Ruben Martinez-Cantin

Apr 8, 2016, 5:27:10 AM4/8/16
to BayesOpt discussion
Hi Devraj,

The discrete optimization requires a vector/list of inputs. That is by design. It is more general, because the inputs might not need to be integers or even lie on a grid. If you only care about abstract discrete values without ordering, the categorical model is better.

For the log part, you can add it inside the target function. That would be better than making the surrogate model work in log-space.
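
For instance (a rough sketch, assuming a hypothetical wrapper called cdl_acc_logwrap), the candidate set passed to BayesOpt would contain the exponents, e.g. -5:2, and the wrapper would undo the transform before calling your function:

    % Hypothetical wrapper: x(1) is the dictionary size, x(2) and x(3) are
    % log10(lambda) and log10(gamma).
    function y = cdl_acc_logwrap(x)
        y = cdl_acc([round(x(1)); 10.^x(2); 10.^x(3)]);
    end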

Best,

Ruben