This is puzzling - Need an explanation


Bill

Apr 17, 2012, 10:53:25 AM
to Adaptrade Builder
Post from here: http://groups.google.com/group/adaptrade-builder/browse_thread/thread/6c330b2415585189?hl=en

This is with a project file Mark provided, using daily ES data from
1993 to 2012. I have not changed anything.

Build on IS only: 15 long trades, win rate = 80%. Manual OOS test
produces no trades.

Build on IS+OOS: 364 trades in IS, win rate = 76.65%, failed in OOS.

The troubling issue here is the significant increase in trades in the
IS when the OOS is added. I need a good explanation for the difference
in the number of trades in the IS (11/1993 - 12/2008). Maybe there is
something I am missing here.

Michael R. Bryant

Apr 17, 2012, 12:20:04 PM
to adaptrad...@googlegroups.com
Can you explain what you mean with "Build on IS+OOS"? That appears to be a
contradiction. You can only build on in-sample data. Put another way, if
you're using the OOS data for building, it's not OOS data. Unless you divide
the data into in-sample and OOS segments, there's no OOS data.

The fitness, which is used to guide the build process, is calculated only on
the in-sample data, but it's up to you to divide the data using the slider
control on the Markets tab into in-sample and OOS segments.

Mike Bryant
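
The division described above can be sketched roughly as follows. This is a minimal illustration, not Builder's actual code: `split_by_fraction`, `evaluate`, and the toy `buy_and_hold` scorer are hypothetical names, and the fraction argument only stands in for the slider on the Markets tab.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: a scorer maps a price series to a fitness value.
Scorer = Callable[[Sequence[float]], float]


def split_by_fraction(
    bars: Sequence[float], in_sample_fraction: float
) -> Tuple[Sequence[float], Sequence[float]]:
    """Split a chronologically ordered series into (IS, OOS) segments."""
    if not 0.0 < in_sample_fraction <= 1.0:
        raise ValueError("in_sample_fraction must be in (0, 1]")
    cut = round(len(bars) * in_sample_fraction)
    return bars[:cut], bars[cut:]


def evaluate(
    score: Scorer, bars: Sequence[float], in_sample_fraction: float
) -> Tuple[float, Optional[float]]:
    """Return (fitness, oos_report); only fitness would guide a build."""
    is_bars, oos_bars = split_by_fraction(bars, in_sample_fraction)
    fitness = score(is_bars)                            # used for selection
    oos_report = score(oos_bars) if oos_bars else None  # reporting only
    return fitness, oos_report


def buy_and_hold(bars: Sequence[float]) -> float:
    """Toy scorer (assumption): profit from holding one unit over the segment."""
    return bars[-1] - bars[0] if len(bars) > 1 else 0.0


prices = [100.0, 101.0, 103.0, 102.0, 105.0, 104.0, 108.0, 110.0, 109.0, 111.0]
print(evaluate(buy_and_hold, prices, 0.8))  # IS = first 8 bars, OOS = last 2
print(evaluate(buy_and_hold, prices, 1.0))  # 100% in-sample: no OOS segment
```

With the fraction at 1.0 there is no OOS segment at all, which corresponds to building on a separate IS-only file with the slider at 100%.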


Bill

Apr 17, 2012, 2:20:15 PM
to Adaptrade Builder


On Apr 17, 12:20 pm, "Michael R. Bryant" <m...@BreakoutFutures.com>
wrote:
> Can you explain what you mean with "Build on IS+OOS"? That appears to be a
> contradiction. You can only build on in-sample data. Put another way, if
> you're using the OOS data for building, it's not OOS data. Unless you divide
> the data into in-sample and OOS segments, there's no OOS data.

Sure, I can explain. In the first test I physically split the data
into an IS file and an OOS file, so I have two files saved at two
different locations on my hard drive. I use the IS file in Builder
with the slider set to 100%, and then run the OOS file manually to
check the performance of the best strategies. I get 15 trades for
highest net profit (the process actually converges to that number for
the top 100 strategies) and no trades in the manual OOS test.

Then I use the undivided file and the slider to define the IS and OOS
in the same proportions as the physical split. In this case I get
364 trades in IS, win rate = 76.65%, but failure in OOS.

I think this should be clear. I argue that this should not be
happening unless I'm missing something.


Michael R. Bryant

Apr 17, 2012, 2:48:15 PM
to adaptrad...@googlegroups.com
I don't know what else I can add to what I've already said. The fitness is
calculated on in-sample results only. The OOS data is only for reporting
purposes. As I'm sure you've learned, the GP process has an element of
randomness to it. That's how it works. What you're seeing seems a bit
unusual (at least if it's repeated) but is not impossible given the random
elements involved. Perhaps it has something to do with your settings on the
different files. I don't know, but, just to be clear, there's no reason to
do what you've been doing unless you believe I've misled you about the role
of OOS data in the program. The whole reason there's a selection for OOS
data is so you don't have to manually divide the data. You can also change
the start and/or end date of the data using the calendar selectors on the
Markets tab and in that way create a third period for further testing if you
want, all on the same, single file.

Mike Bryant

Bill

Apr 17, 2012, 6:39:34 PM
to Adaptrade Builder


On Apr 17, 2:48 pm, "Michael R. Bryant" <m...@BreakoutFutures.com>
wrote:
> I don't know what else I can add to what I've already said. The fitness is
> calculated on in-sample results only. The OOS data is only for reporting
> purposes. As I'm sure you've learned, the GP process has an element of
> randomness to it. That's how it works. What you're seeing seems a bit
> unusual (at least if it's repeated) but is not impossible given the random
> elements involved. Perhaps it has something to do with your settings on the
> different files. I don't know, but, just to be clear, there's no reason to
> do what you've been doing unless you believe I've misled you about the role
> of OOS data in the program.

I just did it that way because I already had the data files and I was
used to doing it like that in the past.

I ran 10 pairs of tests: an IS build followed by a manual OOS test,
and a build on the combined IS+OOS file with the slider set so that
the end dates correspond exactly to those of the manually divided
data files. I used the project file Mark provided, and the only change
I made was to increase the number of generations from 10 to 20. I
noticed wide variations between the number of trades I get from the
manually defined IS and the number I get when the program defines the
same IS. But this was not the only problem. Here are some statistics:

1) For the 10 tests with the manually divided files, 8 failed big time
in the manual OOS test and just 2 passed (highest net profit strategy).
2) For the corresponding 10 tests with the combined IS+OOS divided
using the slider, 6 showed positive OOS performance, 2 did not have
any trades in OOS, and just 2 failed.

These tests indicate that something improves the results in OOS when
the file is divided by the program. It could be anything, such as a small
bug in the code that fails to reset the IS or OOS dates. The tendency
is clear from the results I got.

I suggest you try to reproduce the results. I used ES daily data from
09/11/1997 to 1/22/2010 for the IS and 01/15/2010 to 03/30/2012 for
the OOS.
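
Read as MM/DD/YYYY (an assumption about the format), the dates quoted above put the OOS start (01/15/2010) before the IS end (1/22/2010). That may simply be a typo in the post, but when files are split by hand a quick check along these lines can rule out overlapping segments. The function name `overlap_days` is hypothetical and not part of Builder.

```python
from datetime import date


def overlap_days(is_start: date, is_end: date,
                 oos_start: date, oos_end: date) -> int:
    """Calendar days shared by two inclusive date ranges (0 if disjoint)."""
    latest_start = max(is_start, oos_start)
    earliest_end = min(is_end, oos_end)
    return max(0, (earliest_end - latest_start).days + 1)


# Dates exactly as quoted in the post, assumed to be MM/DD/YYYY.
shared = overlap_days(
    date(1997, 9, 11), date(2010, 1, 22),   # IS range
    date(2010, 1, 15), date(2012, 3, 30),   # OOS range
)
print(f"IS and OOS share {shared} calendar day(s)")
```

A result of zero would mean the two segments are cleanly separated; anything larger means some bars appear in both the build data and the manual OOS test.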

Michael R. Bryant

Apr 17, 2012, 10:12:20 PM
to adaptrad...@googlegroups.com
You really think it's more likely that there's an obvious error in the
program that's been there for over a year and that I and all my customers
overlooked it or just missed it rather than the other possibility, which is
that you, who've used the program for just a few days and don't seem to
understand the GP process, may be misinterpreting the results?

Let me add that I've also performed the same type of tests that you have,
with data divided into two files vs. all the data in the same file. I built
over the first file, tested on the second, then built over the combined file
with the divider set to match the division between the two files. Guess
what? No significant difference. Of course there is variability from one run
to the next, but no pattern or any other reason to suspect you've somehow
uncovered a mysterious bug.

Please move on. This is my last comment on this non-issue.

Bill

Apr 18, 2012, 2:27:43 AM
to Adaptrade Builder
Please understand that I noticed some behavior with your program and I
thought it would be a good idea to discuss it here, since this is the
forum for it. I did not express my feelings about the program; instead
I have done a lot of work with it for no pay and spent many hours on
it. Please be advised that I understand GPs very well, I program in
several languages, and I have used many trading platforms.

I must tell you again that with my extensive ES tests based on a
project file Mark provided, there is a clear tendency of the program
to show better OOS behavior when the whole IS+OOS is available to it
and the exact opposite tendency shows when the build and OOS tests
occur on physically divided and separated files.

I will give the program and you the benefit of the doubt and repeat
the tests with a completely different file, something like a stock
maybe. If the same behavior is noticed maybe you should look at it
more seriously instead of (a) attacking me and (b) asking me to move
on.




Bill

Apr 18, 2012, 10:15:01 AM
to Adaptrade Builder
I repeated the ES tests but this time I did them a little differently.
I first ran 10 builds using the whole data file and the slider to
define the IS and OOS (Test IS+OOS) and then I ran 10 builds with the
manually divided IS and OOS files (Build IS, then manual OOS
test). I used the project file provided by Mark and data from
09/11/1997 to 03/30/2012. Here are the results:

IS+OOS: for highest net profit: 6 pass in OOS - 4 fail in OOS. For
highest correlation: 6 pass in OOS - 4 fail in OOS
IS then OOS: for highest net profit: 3 pass - 7 fail. For highest
correlation: 3 pass - 7 fail.

The above results are consistent with the results I initially got and
reported in previous posts, and the sample is now statistically
significant. It shows a clear bias of the program to generate a higher
number of positive OOS results when the OOS is included in the file
and the slider is used to define it, as opposed to when the build
takes place on a physically separate but otherwise identical IS file
and the OOS test is done manually on a separate but otherwise
identical OOS file.

I repeated the above test using a file for DIA from 01/20/1998 -
04/17/2012. The IS was: 01/20/1998 - 01/07/2009 to match exactly the
slider dates. Here are the results for 10 tests:

IS+OOS: for highest net profit: 6 pass in OOS - 4 fail in OOS. For
highest correlation: 6 pass - 4 fail (in total agreement with ES
results)
IS then OOS: for highest net profit: 3 pass - 7 fail. For highest
correlation: 3 pass - 7 fail. (in total agreement with ES results)

The agreement was stunning. I cannot know the cause of this, but the
tendency is clear. It may be related to populations not being reset
after every generation, so that the next generation knows the
previous generation's OOS results and performs mutations
accordingly.

Thank you for your attention.
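
One way to check the significance claim above is a Fisher exact test on the pass/fail counts as posted (6 of 10 vs. 3 of 10). The sketch below uses only the standard library and assumes each build is an independent trial, which may not hold exactly for GP runs. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def weight(k: int) -> int:
        # Unnormalized hypergeometric probability of k "passes" in row 1.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as unlikely as the observed one.
    # Integer weights avoid floating-point trouble with ties.
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / total


# Rows: slider-defined IS+OOS vs. manually split files; columns: pass, fail.
p_single = fisher_exact_two_sided(6, 4, 3, 7)
# ES and DIA pooled (12 of 20 vs. 6 of 20), assuming the runs are independent.
p_pooled = fisher_exact_two_sided(12, 8, 6, 14)
print(f"10 runs per arm: p = {p_single:.3f}")
print(f"20 runs per arm: p = {p_pooled:.3f}")
```

Under these assumptions the p-value is roughly 0.37 for a single 10-run comparison and roughly 0.11 for the pooled ES and DIA runs, so a 6-4 vs. 3-7 split is well within what chance alone can produce at the usual 0.05 level. More runs per arm would be needed to separate a real bias from run-to-run variability.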

Mark Knecht

Apr 18, 2012, 12:51:12 PM
to adaptrad...@googlegroups.com
On Wed, Apr 18, 2012 at 7:15 AM, Bill <billch...@gmail.com> wrote:
> I repeated the ES tests but this time I did them a little differently.

Bill,
I'm still very much under the weather (6 days in pajamas, etc.) so
I'm really not following what you're trying to do. However let me
point out one thing and maybe give you one additional way to consider
running your tests.

1) I want to make sure you understand that the gpstrat file I provided
you (which was just a quick file I put together) resets the population
after 10 generations if the OS isn't profitable. That implies that the
OS is NOT really OS. It's part of the overall IS but it's not used to
optimize the parameter set when Builder is creating the model. (Mike
or others: Please correct me if I'm wrong on that point.) Anyway, WRT
that _specific_ gpstrat file I personally don't think there is
_ANY_ true OOS data.

2) If I understand what you're trying to discuss (I'm pretty sure
I don't) I think an alternative and possibly better test would be to
use a shorter data set within Builder and then test real OS data in
TS. (Or Builder, but personally I would use TS for that part.) For
instance, set up your Builder file using maybe 1997-2010 data. Make 10
runs where there is no OOS data, and make another 10 runs where maybe
1997-2009 is in-sample and 2009-2010 is OOS. Then take all 20 of those
runs back to TS and check how they all work from 2010 through to now.
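
The three-period layout suggested in point 2 can be sketched as below. The boundary dates are assumptions chosen only for illustration (in-sample through the end of 2008, OOS through the end of 2009, everything later held back), and `partition_by_date` is a hypothetical helper, not a Builder or TS function.

```python
from datetime import date
from typing import Dict, List, Tuple

Row = Tuple[date, float]  # (trading day, closing price)


def partition_by_date(
    rows: List[Row], build_end: date, oos_end: date
) -> Dict[str, List[Row]]:
    """Split rows into in-sample, OOS, and untouched holdout periods."""
    if build_end >= oos_end:
        raise ValueError("build_end must come before oos_end")
    periods: Dict[str, List[Row]] = {"in_sample": [], "oos": [], "holdout": []}
    for day, close in rows:
        if day <= build_end:
            periods["in_sample"].append((day, close))  # used for building
        elif day <= oos_end:
            periods["oos"].append((day, close))        # reported during the build
        else:
            periods["holdout"].append((day, close))    # never seen while building
    return periods


# Made-up sample rows, for illustration only.
sample = [
    (date(2005, 6, 1), 1200.0),
    (date(2009, 3, 2), 700.0),
    (date(2011, 7, 5), 1330.0),
]
parts = partition_by_date(sample, date(2008, 12, 31), date(2009, 12, 31))
print({name: len(rows) for name, rows in parts.items()})
```

The point of the third segment is that no setting in the build, including any reset based on OOS results, can touch it.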

Please note that I don't build many models using daily data so maybe
someone else could give more appropriate guidance.

Good luck.

Cheers,
Mark

Bill

Apr 19, 2012, 1:44:15 AM
to Adaptrade Builder


On Apr 18, 12:51 pm, Mark Knecht <markkne...@gmail.com> wrote:
> On Wed, Apr 18, 2012 at 7:15 AM, Bill <billcheno...@gmail.com> wrote:
> > I repeated the ES tests but this time I did them a little differently.
>
> Bill,
>    I'm still very much under the weather (6 days in pajamas, etc.) so
> I'm really not following what you're trying to do. However let me
> point out one thing and maybe give you one additional way to consider
> running your tests.
>
> 1) I want to make sure you understand that the gpstrat file I provided
> you (which was just a quick file I put together) resets the population
> after 10 generations if the OS isn't profitable. That implies that the
> OS is NOT really OS. It's part of the overall IS but it's not used to
> optimize the parameter set when Builder is creating the model. (Mike
> or others: Please correct me if I'm wrong on that point.) Anyway, WRT
> that _specific_ gpstrat file I personally don't think there is
> _ANY_ true OOS data.

Mark, the option to reset the populations if OOS performance is not
profitable was not checked in the project options. I specifically
checked that and it was not the case. Actually, my analysis implies
that this may be happening internally although it should not be the
case, possibly due to a bug.


>
> 2) If I understand what you're trying to discuss (I'm pretty sure
> I don't) I think an alternative and possibly better test would be to
> use a shorter data set within Builder and then test real OS data in
> TS. (Or Builder, but personally I would use TS for that part.) For
> instance, set up your Builder file using maybe 1997-2010 data. Make 10
> runs where there is no OOS data, and make another 10 runs where maybe
> 1997-2009 is in-sample and 2009-2010 is OOS. Then take all 20 of those
> runs back to TS and check how they all work from 2010 through to now.
>

I do not think there should be a need to use another backtesting
program if one trusts the backtesting capability of Builder, and I
do. Otherwise there is no point in talking about this program at all,
if we assume we have to use another program to check it. The issue
here is whether there is some bug that gives an edge to strategies in
the OOS when the slider is used to define the IS and the OOS. I found
that there is. I spent a lot of time on this. It is now your turn,
guys. Thank you.

Mark Knecht

Apr 19, 2012, 9:22:59 AM
to adaptrad...@googlegroups.com
On Wed, Apr 18, 2012 at 10:44 PM, Bill <billch...@gmail.com> wrote:
> Mark, the option to reset the populations if OOS performance is not
> profitable was not checked in the project options. I specifically
> checked that and it was not the case. Actually, my analysis implies
> that this may be happening internally although it should not be the
> case, possibly due to a bug.
>

OK. Well, that option is set in the file I emailed. You said you're
using that file. Possibly you turned it off. Possibly there's a bug in
Builder. Possibly this is only a problem you're running into for some
reason. I don't know.

Good luck. I cannot help more I think. In fact I've probably helped
too much and caused some problem. Who knows? Time to just shut up and
do other things.

Bye,
Mark

Bill

Apr 19, 2012, 1:49:16 PM
to Adaptrade Builder


On Apr 19, 9:22 am, Mark Knecht <markkne...@gmail.com> wrote:
Mark - Thanks. I think that option was not set in the file I got, but I
might have turned it off anyway. I am running the last tests with NQ
futures. I will let you know what comes out.

>
> Good luck. I cannot help more I think. In fact I've probably helped
> too much and caused some problem. Who knows? Time to just shut up and
> do other things.

I see no problem with what you did. It was a good sanity check for me
and helped me a lot in evaluating the program.

I will only post once more to report the NQ test results.

Bye

