Distributed Optimization


nonlinear5

Sep 6, 2008, 3:56:06 PM
to JBookTrader
Florent brought GridGain (http://www.gridgain.com/index.html) to my
attention. It is open source software for parallel processing, aimed
specifically at Java applications. I looked at some of their
documentation and I like the concepts. I am going to see what it
would take to integrate JBT with it. If anyone has already tried it,
let me know what you found. If I can distribute an optimization job
across two computers (one running at work, the other at home), that
would be valuable.

Florent Guiliani

Sep 6, 2008, 8:45:50 PM
to jbook...@googlegroups.com
nonlinear5 wrote:

- since everything is serialized between the nodes (bytecode and data)
- since our data is heavy and our bytecode is light

I suggest manually putting the data on every node. Our code would
load it from the node's local disk after it has been serialized to the node.

I'm pretty sure that it would be practically impossible to serialize the
data between the nodes over a slow network.
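
One way to read the suggestion above, as a rough sketch rather than GridGain-specific code: the job object that travels over the network holds only the light pieces (a parameter and a file name), while the heavy market data is marked transient and read from each node's local disk when the job runs. The class name LocalDataJob, the one-price-per-line file format, and the averaging placeholder are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical job: only the path and the parameter are serialized, never the data. */
public class LocalDataJob implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String dataPath;      // assumed to exist at the same location on every node
    private final int period;           // light strategy parameter
    private transient double[] prices;  // heavy data: skipped by Java serialization

    public LocalDataJob(String dataPath, int period) {
        this.dataPath = dataPath;
        this.period = period;
    }

    /** Reads one price per line from the node's local disk. */
    private void loadLocalData() throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(dataPath));
        prices = lines.stream()
                .filter(s -> !s.trim().isEmpty())
                .mapToDouble(s -> Double.parseDouble(s.trim()))
                .toArray();
    }

    /** Placeholder for a backtest: the average of the last `period` prices. */
    public double execute() throws IOException {
        if (prices == null) loadLocalData();
        int n = Math.min(period, prices.length);
        double sum = 0;
        for (int i = prices.length - n; i < prices.length; i++) sum += prices[i];
        return n == 0 ? 0 : sum / n;
    }

    /** Size of the serialized job in bytes, i.e. what would cross the network. */
    public int serializedSize() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("prices", ".txt");
        Files.write(tmp, Arrays.asList("10.0", "11.0", "12.0", "13.0"));
        LocalDataJob job = new LocalDataJob(tmp.toString(), 2);
        System.out.println("result = " + job.execute());
        System.out.println("serialized bytes = " + job.serializedSize());
    }
}
```

The serialized size stays small no matter how large the price file is, which is the point of keeping the data on the nodes.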

--
Florent,

Florent Guiliani

Sep 6, 2008, 9:51:51 PM
to jbook...@googlegroups.com

Kelvin

Sep 7, 2008, 12:01:31 AM
to JBookTrader
I know there are powerful clusters at some universities, each made up
of N low-cost towers, where N is usually 16 to 128. Distributed
optimization would therefore be very useful. However, those clusters
only support Java without a GUI.

On Sep 6, 9:51 pm, Florent Guiliani <flor...@guiliani.fr> wrote:
> Time to buy some low-cost used servers? http://cgi.ebay.com/HP-DL380-G3-Server-DUAL-XEON-3-06GHz-4GB-DVD-CD-R...39%3A1|66%3A2|65%3A12|240%3A1308&_trksid=p3286.c0.m14

rnicoll

Sep 7, 2008, 6:42:32 AM
to JBookTrader
I imagine a headless optimization-only client could be used, so that's
not really much of an issue.
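
A minimal sketch of what such a headless client could check at startup, assuming nothing about JBookTrader's actual entry points: the standard java.awt.GraphicsEnvironment.isHeadless() call reports whether a display is available, and the hypothetical HeadlessLauncher below falls back to a console-only mode when it is not (or when the JVM is started with -Djava.awt.headless=true). The --console flag is made up for illustration.

```java
import java.awt.GraphicsEnvironment;

public class HeadlessLauncher {

    enum Mode { GUI, CONSOLE }

    /** Console mode is used when no display exists or when the user asks for it. */
    static Mode chooseMode(boolean headless, boolean forceConsole) {
        return (headless || forceConsole) ? Mode.CONSOLE : Mode.GUI;
    }

    public static void main(String[] args) {
        // Hypothetical command-line switch for forcing the console client.
        boolean forceConsole = args.length > 0 && "--console".equals(args[0]);
        Mode mode = chooseMode(GraphicsEnvironment.isHeadless(), forceConsole);
        System.out.println("Starting in " + mode + " mode");
    }
}
```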

On Sep 7, 5:01 am, Kelvin <baoxing...@gmail.com> wrote:
> I know there are powerful clusters in some universities, which
> includes N low-cost towers. N usually can be 16-128. Hence, the
> distributed optimization is highly useful. But, those clusters only
> support java without GI.
>
> On Sep 6, 9:51 pm, Florent Guiliani <flor...@guiliani.fr> wrote:
>
> > Time to buy some low-cost used servers? http://cgi.ebay.com/HP-DL380-G3-Server-DUAL-XEON-3-06GHz-4GB-DVD-CD-R...66%3A2|65%3A12|240%3A1308&_trksid=p3286.c0.m14

Florent Guiliani

Sep 7, 2008, 7:42:57 AM
to jbook...@googlegroups.com
Exactly. For me, hardware and software aren't an issue.

Room, power supply, noise, and the time to set up and maintain the machines ARE the real issue ;)

That's why Amazon EC2 could be a good option for me.

rnicoll wrote:

xjustin

Sep 8, 2008, 9:07:40 PM
to JBookTrader
I included GridGain in my backtesting system. If the grid option is
selected, the updaetIndicatorsExectue() method starts GridGain and
invokes the updaetIndicatorsExectueGridified() method, which is
annotated with
@Gridify(taskClass=PriceBarUpdate_UpdateIndicatorsExecuteTask.class).
PriceBarUpdate_UpdateIndicatorsExecuteTask extends
GridTaskSplitAdapter<GridifyArgument, ArrayList<OptimizationResult>>,
and its overridden split() does the job.

BTW, I have integrated RapidMiner into my system. If you plan to
include it as well, I'd like to contribute part of my code.


public static ArrayList<OptimizationResult> updaetIndicatorsExectue(
        OptimizerRunner optimizerRunner,
        LinkedList<Strategy> stratCandidates,
        QuoteHistoryBars quoteHistoryOfBars,
        MidPointHistory midQuoteHistory) throws JBookTraderException
{
    // The optimizer dialog decides whether to run on the grid or locally.
    boolean toGridGain = optimizerRunner.optimizerDialog.getGridOrNot();
    if (!toGridGain) {
        return updaetIndicatorsExectueGridifiedNot(
                optimizerRunner, stratCandidates, quoteHistoryOfBars, midQuoteHistory);
    }

    try {
        GridFactory.start();
        try {
            return updaetIndicatorsExectueGridified(
                    optimizerRunner,
                    stratCandidates,
                    quoteHistoryOfBars,
                    midQuoteHistory);
        } finally {
            // Stop the local node even if the gridified call throws.
            GridFactory.stop(true);
        }
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}



@Gridify(taskClass=PriceBarUpdate_UpdateIndicatorsExecuteTask.class)
protected static ArrayList<OptimizationResult> updaetIndicatorsExectueGridified(
        OptimizerRunner optimizerRunner,
        LinkedList<Strategy> stratCandidates,
        QuoteHistoryBars quoteHistoryOfBars,
        MidPointHistory midQuoteHistory) throws JBookTraderException
{
    // System.out.println("<<<<<<<<<<<<<80:" + optimizerRunner + " "
    //         + stratCandidates.size() + " " + quoteHistoryOfBars.size() + " "
    //         + midQuoteHistory.size() + " ");

    return updaetIndicatorsExectueGridifiedNot(optimizerRunner,
            stratCandidates, quoteHistoryOfBars, midQuoteHistory);
}



import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

public class PriceBarUpdate_UpdateIndicatorsExecuteTask extends
        GridTaskSplitAdapter<GridifyArgument, ArrayList<OptimizationResult>>
{
    @Override
    protected Collection<? extends GridJob> split(int gridSize,
            final GridifyArgument arg) throws GridException {
        List<GridJobAdapter<ArrayList<Strategy>>> jobs =
                new ArrayList<GridJobAdapter<ArrayList<Strategy>>>();

        try {
            // Unpack the arguments of the gridified method.
            Object[] params = arg.getMethodParameters();
            final OptimizerRunner optimizerRunner = (OptimizerRunner) params[0];
            final LinkedList<Strategy> stratCands = (LinkedList<Strategy>) params[1];
            final QuoteHistoryBars quoteHistoryOfBars = (QuoteHistoryBars) params[2];
            final MidPointHistory midQuoteHistory = (MidPointHistory) params[3];

            // Strategies per job; at least 1, so the modulo below never
            // divides by zero when there are fewer strategies than nodes.
            int sizeInGrid = Math.max(1, stratCands.size() / Math.max(1, gridSize));

            final ArrayList<Strategy> list = new ArrayList<Strategy>();
            int stratCount = 0;
            for (final Strategy strat : stratCands) {
                stratCount++;
                list.add(strat);

                // Emit a job when a chunk is full, and also at the last
                // strategy so that a partial final chunk is not dropped.
                if (stratCount % sizeInGrid == 0 || stratCount == stratCands.size()) {
                    System.out.println("60 " + list.size());
                    System.out.println("75: " + list.get(0) + " " + list.get(list.size() - 1));

                    // Copy the chunk, because list is cleared and reused below.
                    ArrayList<Strategy> listArg = new ArrayList<Strategy>(list);

                    jobs.add(new GridJobAdapter<ArrayList<Strategy>>(listArg) {
                        @Override
                        public Serializable execute() throws GridException {
                            try {
                                // Run only this job's chunk, not the full candidate list.
                                ArrayList<Strategy> argInner = getArgument();
                                // It is necessary to return the result.
                                return OptimizerRunner.updaetIndicatorsExectueGridified(
                                        optimizerRunner,
                                        new LinkedList<Strategy>(argInner),
                                        quoteHistoryOfBars,
                                        midQuoteHistory);
                            } catch (Exception e) {
                                e.printStackTrace();
                                return null;
                            }
                        }
                    });

                    list.clear();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            throw new GridException(e.getMessage());
        }

        return jobs;
    }

    @Override
    public ArrayList<OptimizationResult> reduce(List<GridJobResult> gridJobResults)
            throws GridException {
        ArrayList<OptimizationResult> resultsAll = new ArrayList<OptimizationResult>();
        for (GridJobResult gridResult : gridJobResults) {
            if (gridResult == null) continue;

            // A job returns null when it failed; skip it.
            ArrayList<OptimizationResult> results = gridResult.getData();
            if (results == null) continue;

            resultsAll.addAll(results);
        }
        return resultsAll;
    }
}

Shane

Sep 8, 2008, 9:21:41 PM
to jbook...@googlegroups.com
Good work, xjustin! I have gone through some RapidMiner tutorials in the past, but I really don't know where to start in using it to analyze trading strategies. I would appreciate any help or ideas you may have in that regard.

xjustin

Sep 8, 2008, 9:33:55 PM
to JBookTrader
Sure. GridGain is very easy to use. Basically, you just need to start
GridGain on every machine in your network, and GridGain will find all
the nodes and handle the configuration for you.

nonlinear5

Sep 8, 2008, 9:50:34 PM
to JBookTrader
> I included GridGain into my backtesting system. If a choice is made,
> the updaetIndicatorsExectue() method starts GridGain and invokes
> updaetIndicatorsExectueGridified() method, which is specified with

Thanks, xjustin. Backtesting in JBT is very fast (on the order of seconds)
even with very large data sets. I think you mean to grid-enable
*optimization*. I have not looked at your code in detail yet, but it
seems that your "gridification" takes place at the "update indicators"
level, which is too low. A more natural "break up and compute" point
is right before or inside the execute(List<Strategy> strategies)
method of OptimizerRunner.
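
To illustrate the granularity being suggested, here is a sketch that splits a list of strategies into chunks and runs each chunk as one independent task. It uses a local java.util.concurrent thread pool as a stand-in for grid nodes, and the Strategy and result types are replaced by hypothetical placeholders (an Integer parameter and its square), not JBookTrader classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedOptimizer {

    /** Splits items into at most `parts` contiguous chunks; no item is dropped. */
    static <T> List<List<T>> partition(List<T> items, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        if (items.isEmpty()) return chunks;
        int p = Math.max(1, parts);
        int chunkSize = (items.size() + p - 1) / p; // ceiling division
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(items.size(), from + chunkSize);
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    /** Placeholder for backtesting one strategy. */
    static long evaluate(int parameter) {
        return (long) parameter * parameter;
    }

    /** Evaluates every parameter, one chunk per task, and merges the results in order. */
    static List<Long> optimize(List<Integer> parameters, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, workers));
        try {
            List<Future<List<Long>>> futures = new ArrayList<>();
            for (final List<Integer> chunk : partition(parameters, workers)) {
                futures.add(pool.submit(new Callable<List<Long>>() {
                    @Override
                    public List<Long> call() {
                        List<Long> out = new ArrayList<>();
                        for (int p : chunk) out.add(evaluate(p));
                        return out;
                    }
                }));
            }
            List<Long> all = new ArrayList<>();
            for (Future<List<Long>> f : futures) all.addAll(f.get()); // submission order
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> params = new ArrayList<>();
        for (int i = 1; i <= 10; i++) params.add(i);
        System.out.println(optimize(params, 3));
    }
}
```

Each chunk is a coarse unit of work, so the per-task overhead is paid once per chunk rather than once per indicator update.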

So, does your code run distributed?

>BTW, I have integrated RapidMiner into my system, if you have plan to
>include it as well, I'd like to contribute part of my code.

I glanced at it quickly, and it looks interesting. It appears to have
a heavy focus on visualization, so I wonder if it can do 4-dimensional
graphs. I need these to construct optimization maps for strategies
with 3 parameters. Three parameters plus the resulting performance
metric for each triplet of parameters make a 4-dimensional
hypersurface. So, I imagine, there would be a way to construct a
3-dimensional surface and designate the color as the performance metric.
I hope I am explaining this well.
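
As a sketch of the color idea (not tied to RapidMiner or any plotting library): each parameter triplet becomes a point in 3-D space, and its performance metric is normalized to [0, 1] and mapped to a hue, here from blue (worst) to red (best). The class name and the sample values are made up for illustration.

```java
import java.awt.Color;

public class MetricColorMap {

    /** Maps value in [min, max] to [0, 1], clamping out-of-range input. */
    static double normalize(double value, double min, double max) {
        if (max <= min) return 0.0;
        double t = (value - min) / (max - min);
        return Math.max(0.0, Math.min(1.0, t));
    }

    /** Blue (hue 240 degrees) for the worst metric, red (hue 0 degrees) for the best. */
    static Color toColor(double value, double min, double max) {
        double t = normalize(value, min, max);
        float hue = (float) ((1.0 - t) * 240.0 / 360.0);
        return Color.getHSBColor(hue, 1.0f, 1.0f);
    }

    public static void main(String[] args) {
        // Hypothetical {p1, p2, p3, metric} rows.
        double[][] rows = {
            {5, 10, 20, -150.0},
            {5, 15, 20, 40.0},
            {10, 15, 30, 310.0}
        };
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] r : rows) {
            min = Math.min(min, r[3]);
            max = Math.max(max, r[3]);
        }
        for (double[] r : rows) {
            Color c = toColor(r[3], min, max);
            System.out.printf("(%.0f, %.0f, %.0f) -> rgb(%d, %d, %d)%n",
                    r[0], r[1], r[2], c.getRed(), c.getGreen(), c.getBlue());
        }
    }
}
```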

xjustin

Sep 9, 2008, 12:56:16 AM
to JBookTrader


On Sep 8, 9:50 pm, nonlinear5 <eugene.kono...@gmail.com> wrote:
> > I included GridGain into my backtesting system. If a choice is made,
> > the updaetIndicatorsExectue() method starts GridGain and invokes
> > updaetIndicatorsExectueGridified() method, which is specified with
>
> Thanks, xjusting. Backtesting in JBT is very fast (order of seconds)
> even with very large data sets. I think you mean to grid-enable
> *optimization*. I have not looked at your code in detail yet, but it
> seems that your "gridification" takes place at the "update indicators"
> level, which is too low. A more natural "break up and compute" point
> is right before or inside the execute(List<Strategy> strategies)
> method of OptimizerRunner.

In my code, you can see the method
updaetIndicatorsExectue(OptimizerRunner,
LinkedList<Strategy>, QuoteHistoryBars,
MidPointHistory),
which wraps updaetIndicatorsExectueGridified, which executes the
List<Strategy> after updating indicators.
My system has both TA and FA indicators, and not all indicators need to
be updated according to strategy parameters.
So I put the strategy-parameter-dependent part into one method.


> So, does your code run distributed?
Yes. I run my system on 4 computers. Actually, it can also split into
multiple jobs on a
computer with multiple processors. My system is split into 6 jobs with
2 dual-core.


> >BTW, I have integrated RapidMiner into my system, if you have plan to
> >include it as well, I'd like to contribute part of my code.
>
> I glanced at it quickly, it looks interesting. It appears that it has
> a heavy focus on visualization, so I wonder if it can do 4-dimentional
> graphs. I need these to construct optimizations maps for strategies
> with 3 parameters. Three parameters plus the resulting performance
> metric for each triplet of parameters make it a 4-dimensional hyper-
> surface. So, I imagine, there would be way to construct a 3-
> dimensional surface and designate the color as the performance metric.
> I hope I am explaining this well.

RapidMiner is more about modeling and calculation. I did not find a 4-D
surface plot. I will check further.

Florent Guiliani

Sep 9, 2008, 6:31:12 AM
to jbook...@googlegroups.com
xjustin wrote:
> In my code, you can see the method
> updaetIndicatorsExectue(OptimizerRunner,
> LinkedList<Strategy>, QuoteHistoryBars ,
> MidPointHistory )
> which wraps updaetIndicatorsExectueGridified, which executes
> List<Strategy>, after upadting indicators.
> My system has both TA and FA indicators and not all indicators need to
> be updated according to strategy parameters.
> So I put strategy-parameter dependent part into one methods.

Did you put the data locally on each node, or did you let GridGain serialize the
whole market snapshots across the nodes?

What is your time ratio? (1 local thread / X remote nodes)

If you let GridGain serialize the whole data set, how long did it take to
serialize?
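
For anyone wanting to answer the serialization question on their own data, a rough way to measure it with plain Java serialization is sketched below. It only times writing an object to an in-memory buffer, so it ignores the network itself and whatever marshalling GridGain is configured to use. The array of random doubles is a stand-in for real market snapshots, and the 1 Mbit/s link speed is an arbitrary example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Random;

public class SerializationCost {

    /** Serializes the object to memory and returns the number of bytes produced. */
    static int serializedBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.size();
    }

    /** Estimated transfer time in seconds over a link of the given speed. */
    static double transferSeconds(long bytes, double megabitsPerSecond) {
        return (bytes * 8.0) / (megabitsPerSecond * 1_000_000.0);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data set: one million random doubles (about 8 MB).
        double[] snapshots = new Random(42).doubles(1_000_000).toArray();

        long start = System.nanoTime();
        int bytes = serializedBytes(snapshots);
        double millis = (System.nanoTime() - start) / 1e6;

        System.out.printf("%d bytes serialized in %.1f ms%n", bytes, millis);
        System.out.printf("about %.1f s to send over a 1 Mbit/s link%n",
                transferSeconds(bytes, 1.0));
    }
}
```

Comparing the estimated transfer time with the duration of a single local optimization run gives a first idea of whether shipping the data is worthwhile.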

--
Florent,

B Rannode

Dec 8, 2008, 6:02:50 PM
to xjustin, jbook...@googlegroups.com
I'd like to hear the answer to Florent's questions as well.

Are there any limitations (obvious bottlenecks) with GridGain? Such as
the lack of a distributed file system and so forth?

Are you still using the system? Have you made changes other than the
diff you posted in the thread?

Thank you.
