Interview with Dr. Howard Bandy


John Verbrugge

Sep 6, 2013, 11:49:10 AM
to adaptrad...@googlegroups.com
Hello everyone,

Many of you recognize Dr. Howard Bandy's name; he is the author of three excellent books on trading system development (Quantitative Trading Systems, Mean Reversion Trading Systems, and Modeling Trading System Performance).  He is also a systems developer and speaker, and he posts to this forum on such topics.

Please take a few minutes and listen to my interview with Dr. Bandy on the Trader Tech Talk podcast.  Dr. Bandy gives out really excellent material in the interview, and I learned a lot from him.

Some highlights from the interview:
    The harsh realities of back testing (and some hope, too!)
    How to validate your trading system, and statistically show how it will do in the future
    A tool you can use to run some statistics on your trading results
    The most important piece we miss in system development

You can go right to the podcast here.  It's episode 10:

http://blog.tradertechtalk.com/10
https://itunes.apple.com/us/podcast/trader-tech-talk/id663583667

Thank you!


Ell

Sep 7, 2013, 8:09:40 AM
to adaptrad...@googlegroups.com
Very interesting interview, but in my opinion it raised more questions than it answered.

I think nobody should disagree with what Dr. Bandy said about skills and competition; he makes good points along those lines.

But he was asked twice what the best choice of parameters from the walk-forward testing is, and he did not answer.

We all know that if a system is developed on an uptrend, and the last walk-forward test also falls on an uptrend, but the market then moves sideways, the system will fail. I simply cannot see how walk-forward testing increases the chances of developing better systems when the future is unknown.

Then Dr. Bandy said that each time a trade is placed, an inefficiency is removed from the market. Here too I fail to see the relevance of his statement. Every tick is caused by a real trade, and if real trades remove inefficiencies, then past data cannot be used to develop systems -- unless there is not really any connection between inefficiencies and trading system performance.

Lastly, Dr. Bandy said that system drawdown is roughly proportional to the square root of holding time. Is there a rigorous justification for this statement? One may hold trades for only a very short period of time and still have a very large drawdown, even larger than that of a trend follower who holds positions for months. Here I get the impression that everything Dr. Bandy said relates to a specific class of systems he works with and should not be taken as generally true.

I would also like to say that I know a few individuals who develop systems for hedge funds, and the way they describe their process to me sounds nothing like what Dr. Bandy described in the interview, other than the use of statistical analysis. For example, some work with artificial data first to develop systems, which they then test on past data. In this case walk-forward and out-of-sample testing are really redundant operations.

Howard B

Sep 8, 2013, 2:02:41 PM
to adaptrad...@googlegroups.com
Hi Ell, and all --

There is no general guideline to determine the length of either the in-sample period or the out-of-sample period.

The length of the in-sample period depends on the frequency of the patterns the model is programmed to identify.  Each trade contributes one data point to the analysis.  There must be enough data points in the in-sample period to draw some generalizations.  If the system intends to trade infrequently and hold a long time, the in-sample period must be long.  If the system intends to trade frequently, the in-sample period might be either long or short.  The decision then depends on the stability of the signals relative to the indicators.  If the indicators require frequent resynchronization, then the in-sample period must be short enough so that there is not a dilution of signal strength.  If the signals are quite stable, then the in-sample period could be longer.  But in any case, the length can only be determined through experimentation.

The length of the out-of-sample period is more straightforward.  It is however long the model and the data remain synchronized and the signals remain profitable.  Again, there is no general guideline.

The future is always unknown.  Whenever the characteristics of the data change significantly for any system -- trend following, mean reversion, pattern, seasonality, etc -- the profitability declines.  The walk forward process gives the developer an opportunity to observe the transition from development to trading at each in-sample to out-of-sample transition.  One of the advantages of the walk forward process is that the out-of-sample trades provide a data set that can be used to compare future performance.  If a system is being traded (live or paper), having a baseline with which to compare recent performance helps decide whether the system is working or is broken -- whether to continue to trade it or to take it offline.  Even better, that "best estimate" set of trades provides the data for use in a Monte Carlo process to estimate the maximum safe position size.
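
One way that "best estimate" set of out-of-sample trades can feed a Monte Carlo estimate of maximum safe position size is sketched below.  This is a minimal Python illustration of the general idea; the trade returns, the 20% drawdown limit, and the 5% tolerance are hypothetical numbers of my own, not figures from Dr. Bandy's books.

```python
import random

def max_drawdown(equity):
    """Largest fractional decline from any running equity peak."""
    peak, worst = equity[0], 0.0
    for e in equity:
        peak = max(peak, e)
        worst = max(worst, (peak - e) / peak)
    return worst

def breach_probability(trade_returns, fraction, n_trades=100,
                       n_sims=1000, dd_limit=0.20, seed=7):
    """Estimate P(max drawdown > dd_limit) when risking `fraction`
    of equity per trade, by bootstrap-resampling the trade list."""
    rng = random.Random(seed)
    breaches = 0
    for _ in range(n_sims):
        equity = [1.0]
        for _ in range(n_trades):
            r = rng.choice(trade_returns)   # draw one historical trade
            equity.append(equity[-1] * (1.0 + fraction * r))
        if max_drawdown(equity) > dd_limit:
            breaches += 1
    return breaches / n_sims

# Hypothetical out-of-sample per-trade returns (fully invested)
oos_trades = [0.03, -0.02, 0.01, 0.04, -0.03, 0.02, -0.01, 0.05, -0.04, 0.02]

# Largest tested fraction whose breach probability stays under 5%
safe_fraction = 0.0
for f in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]:
    if breach_probability(oos_trades, f) < 0.05:
        safe_fraction = f
    else:
        break
```

The drawdown limit and tolerance are personal risk preferences; the point is only that the out-of-sample trade distribution, not the in-sample one, supplies the samples.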

I am open to alternative methods of validation.   I have been studying and applying modeling, simulation, and forecasting for about 50 years.  I began using it in the 1960s in industrial applications.  I continued studying it, and wrote papers explaining the use of the technique, calling it walk forward at that time, in the late 1960s while in graduate school.  I have been using it regularly since.  I have found no better technique for use with financial time series than the walk forward method.   Perhaps readers have some to offer? 

I believe the markets are nearly,  but not completely, efficient.  If they were completely efficient, then only inside information would enable profitable trading.  If there is some residual inefficiency, technical analysis / quantitative analysis depends on that inefficiency being discoverable in the price and volume series.  Trading systems attempt to identify inefficiencies and make profitable trades based on those inefficiencies.  Each profitable trade removes some of the very inefficiency the system was designed to identify.  Future trades made by that system will be less profitable because there is less inefficiency.  Eventually all trading systems fail because the inefficiency they identify has been removed.  The process is similar to heat transfer and entropy.  We can see examples of this by examining the performance of well known systems over time.  For example, the Donchian breakout system worked well on futures until the 1980s and 1990s, has since become ineffective, and will never be profitable in those markets again.

Drawdown does increase in proportion to the square root of the holding period.  I have a section in my "Quantitative Trading Systems" book where I show the data, the charts, the methods, the statistics.  Theoretically, based on the near-efficiency of the financial markets, the phenomenon also follows from the diffusion equation that describes how far randomly moving particles travel from their initial point.  The result is not dependent on any particular class of systems.  The guideline holds throughout -- longer holding periods expose the trade to higher drawdown in proportion to the square root of the holding period.  There is a design feature that follows from this -- given alternative systems that have equivalent performance based on closed trades, prefer those that have the shorter holding period.  For a silly example, assume I must walk from one place to another on a typical day in the northwest in December (I live in Oregon).  The probability of my getting wet from rain increases in proportion to the number of blocks I must travel.  Given a desire to stay dry, I should prefer the route that takes less time.
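
A quick way to see the square-root behavior is to simulate it.  The Python sketch below is my own illustration, not material from the book; the path count, step size, and the 25- and 100-step holding periods are arbitrary choices.  It measures the average maximum drawdown of a zero-drift random walk at two holding periods and compares them.

```python
import random

def avg_max_drawdown(hold, n_paths=3000, sigma=0.01, seed=3):
    """Mean maximum peak-to-trough decline of a zero-drift random
    walk observed over `hold` steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x = peak = dd = 0.0
        for _ in range(hold):
            x += rng.gauss(0.0, sigma)
            peak = max(peak, x)     # running high-water mark
            dd = max(dd, peak - x)  # deepest decline from the peak
        total += dd
    return total / n_paths

# Quadrupling the holding period should roughly double the drawdown:
# sqrt(100 / 25) = 2, so the ratio should sit near 2, not near 4.
dd_short = avg_max_drawdown(25)
dd_long = avg_max_drawdown(100)
ratio = dd_long / dd_short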

The use of artificial data in the training of trading systems has extremely little value.  In my opinion, it is of no value to non-hedge fund individuals developing systems for their own use.  In order for a system to be profitable, it must identify patterns in the data that precede profitable trades.  If we were able to generate artificial data that was useful in predicting profitable trades, we would already have knowledge of the desired patterns and we would not need the artificial data.  There is an adequate amount of real data that has real potential for real profit if real patterns can be detected.  Why work with anything else?

The final comment compares my process to that used by hedge funds, and concludes that they are similar only in their use of statistics.  I do not disagree, and thank Ell for making my point.  My point is that the goal of every trading system development project is to give the developer / trader confidence that the signals generated by the system will provide reward adequate to compensate for the risk.  The key word is confidence.  The primary limitation is risk.  The modeling, simulation, and statistical methods I recommend and teach are designed to help developers and traders quantify the risk, estimate the profit potential, determine the maximum safe position size, and provide metrics -- often in the form of statistical measurement -- to gauge the confidence.

Thanks for listening,
Howard    






Lawrence Lewis

Sep 9, 2013, 11:04:43 AM
to adaptrad...@googlegroups.com
Michael, Dr. Bandy, All...

I'd like to get your opinion on the following concept for robustness testing.

If you have a trading system with, say, 3 OR conditions for entry:

if cond1 then enter;
if cond2 then enter;
if cond3 then enter;

It seems to me you have to be careful when reviewing your results to make sure you have a statistically significant number of trades for each condition. Also, it seems that the percent of profit contributed by each entry condition should be somewhat proportional to its share of the trades. Otherwise, you might have, say, 50 trades for cond1 that contribute 10% of the profit, and 2 trades for cond2 that contribute 80% of the profit, and conclude that because you had more than 50 trades, you had a robust system.
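
A rough sketch of the per-condition bookkeeping this would require, in Python; the trade list mirrors the hypothetical 50-trade/2-trade example above, and the function names are my own.

```python
from collections import defaultdict

def contribution_report(trades):
    """Summarize trade count and profit share per entry condition.
    `trades` is a list of (condition_name, profit) pairs."""
    count = defaultdict(int)
    profit = defaultdict(float)
    for cond, p in trades:
        count[cond] += 1
        profit[cond] += p
    total = sum(profit.values())
    return {cond: {"trades": count[cond],
                   "profit": profit[cond],
                   "profit_share": profit[cond] / total if total else 0.0}
            for cond in count}

# Hypothetical results: many small cond1 trades, but two outsized
# cond2 trades dominating the profit -- the pattern described above.
trades = [("cond1", 20.0)] * 50 + [("cond2", 4000.0)] * 2 + [("cond3", 1000.0)]
report = contribution_report(trades)
```

A report like this makes the imbalance visible: cond1 supplies most of the trades but only a tenth of the profit, so the trade count alone says little about robustness.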

If you think this might have some validity, it would lead me to believe we need to come up with a better robustness statistic that would try to capture this concept.

Larry Lewis

Howard B

Sep 9, 2013, 1:23:40 PM
to adaptrad...@googlegroups.com
Hi Lawrence --

I am a pragmatist when it comes to statistical tests of trading system results.  Working with time series is different from working with non-time series -- voter preference, agricultural productivity, or industrial quality control, for example.  And working with financial time series and trading results is more different still.  It is very difficult to develop a trading system that passes a t-test with a high degree of confidence.  In an industrial quality control environment, the standard deviation is small relative to the mean.  Ball bearings might be 2.00 centimeters in diameter, with a standard deviation of 0.01 centimeters.  It is relatively easy to tell when manufacturing quality has slipped.  A trading system might have a mean gain per trade of 1.0 percent, but the standard deviation is typically three times that, or more -- plus or minus 3.0 percent.  This puts the mean, 1.0 percent, well within the +/- one standard deviation control band, which would be the -2.0 to +4.0 percent range.  The two standard deviation bands would be -5.0 to +7.0 percent, and it would be nearly impossible to tell whether the system was out of control.  Additionally, the performance of a trading system is not stationary -- the mean, standard deviation, and other metrics change over time.
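
To make the arithmetic concrete, here is a small Python sketch of the one-sample t statistic for mean gain per trade against zero, using the 1.0 percent mean and 3.0 percent standard deviation from the paragraph above.  The sample sizes are my own illustrative choices.

```python
import math

def t_statistic(mean, std, n):
    """One-sample t statistic for mean gain per trade against zero."""
    return mean / (std / math.sqrt(n))

# With a 1% mean and a 3% standard deviation per trade, a single
# trade sits deep inside the noise band; only a fairly large sample
# pushes the t statistic past the usual ~2.0 significance threshold.
t_30 = t_statistic(0.01, 0.03, 30)   # roughly 1.8 -- not yet significant
t_50 = t_statistic(0.01, 0.03, 50)   # roughly 2.4 -- significant at ~95%
```

Even 30 trades is not enough to distinguish this system from noise at the usual 95% level, which illustrates why per-trade significance testing is such a weak instrument here.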

You are describing a multi-rule system.  Most trading systems are based on ANDing or ORing several rules together to create Buy and Sell signals.  I agree that each rule should contribute to the performance of the system.  You can test this as you are doing preliminary development and remove rules that do not contribute "enough."  Enough is subjective, as are many of the decisions that must be made during system development.  The developer could create separate versions of the system with different combinations of the rules, test and validate each, then analyze each to determine risk and reward.  Certainly, as the complexity of the model is greatly increased, robustness decreases.  But some complexity is required.  In fact, I believe the difference between a profitable system and an unprofitable one depends on the system being clever in a way that competing systems are not.

When the system has several rules, each of which could be treated as a system on its own, it should be treated as a system of systems -- a multi-system system.  I have a chapter in my Mean Reversion book that addresses this.  The tradeoff that the developer faces is a choice between increased complexity and the associated increase in the number of variables in the system (the curse of dimensionality), versus individual systems that do not trade often enough for any one of them to be validated independently.  There may be no easy way around this.  Note the additional complication of serial correlation of trades if each system is traded independently.  The Monte Carlo methods that Michael and I both use assume the data points are independent and identically distributed (iid, in statistical jargon).  A group of six systems that all trade SPY, all have about six trades a year, and all hold about three days each are certain to be picking the same trades -- there is significant serial correlation -- meaning that the sample cannot be populated by using all 36 trades.  Using all 36 trades will overestimate reward and seriously underestimate risk -- perhaps by a factor of six.

So, what to do?  I know of no easy answer that always results in trading systems that are profitable and safe.  My best advice is to continue to design systems that are complex enough to find good trades, but not so complex that they are overly fit to the noise and fail in live trading.  Use the best modeling and simulation techniques and tools.  Be wary of introducing biases -- there are no benign biases -- every bias overestimates profit and underestimates risk.  Use whatever validation techniques you need to give yourself confidence in your system.  Monitor live trading and resynchronize as necessary.  Be quick to reduce position size on a system whose performance is declining.

All "in my opinion" of course.

Thanks for listening,
Howard

 

Michael R. Bryant

Sep 9, 2013, 1:51:19 PM
to adaptrad...@googlegroups.com

I think you have a valid point. However, there are probably a number of such edge cases one could come up with that might require slightly different tests to thoroughly capture the statistical results. No test or even set of tests will be perfect. That’s ultimately why we rely on out-of-sample and, particularly, real-time tracking as the final test of validity.

 

Mike Bryant

 


Ell

Sep 10, 2013, 10:23:56 AM
to adaptrad...@googlegroups.com
Hello Dr. Bandy,

I appreciate your long reply, and I agree with some of the things you say, but your statement that "Drawdown does increase in proportion to the square root of the holding period" must be wrong; otherwise we should not be trading at all. This is because what you claim is true for a random walk with zero drift. In that case no edge is possible, and trading leads to exhaustion of the account due to friction.

Intuitively I agree that the longer one stays in the market, the higher the chance of a large drawdown, but exactly how that chance depends on holding time should depend on the particular distribution of returns.

Also, I still do not see the practical validity of walk-forward testing, and specifically I cannot find a practical answer to the question of how one determines the parameters to use based on it.

1. Do you use the last parameter values?
2. Do you use an average?

Finally, some well-known fund managers use randomly synthesized data, but their methods are not published and are kept secret.

Lawrence Lewis

Sep 10, 2013, 5:10:16 PM
to adaptrad...@googlegroups.com
Walk-forward testing does two things. It confirms that whatever method you use to select the best trading parameters based on some historical test is statistically valid, and it is based on the concept that the underlying process may change slowly with time. If you don't do walk-forward testing, then you are assuming that the underlying process doesn't change, or that the changes can't be captured by selecting different parameter sets based on recent past performance, in which case you are satisfied with an "average" set based on the entire test period. Either one could be true, depending on your belief about how the underlying process works. I believe that it does usually change slowly and that it is possible to capture some of that change. If you don't believe that, that's fine. In the end, long-term trading results will point the way.
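
The re-optimize-then-trade loop being discussed can be sketched in a few lines of Python.  Here `backtest` is a placeholder for whatever evaluator a given system uses, and the window lengths and parameter grid are arbitrary assumptions.

```python
def walk_forward(prices, param_grid, backtest, in_len=500, oos_len=100):
    """Minimal walk-forward loop: re-optimize on each in-sample window,
    then record performance on the following out-of-sample window.
    `backtest(prices_slice, params)` must return a performance figure."""
    oos_results, chosen = [], []
    start = 0
    while start + in_len + oos_len <= len(prices):
        is_slice = prices[start:start + in_len]
        oos_slice = prices[start + in_len:start + in_len + oos_len]
        # Pick the parameter set with the best in-sample performance
        best = max(param_grid, key=lambda p: backtest(is_slice, p))
        oos_results.append(backtest(oos_slice, best))
        chosen.append(best)
        start += oos_len          # slide forward by one OOS window
    return oos_results, chosen

# Toy usage with a made-up evaluator: score = params * sum(prices)
results, params_used = walk_forward(
    list(range(700)), param_grid=[1, 2],
    backtest=lambda px, p: p * sum(px))
```

The stitched-together `oos_results` is the "best estimate" trade record discussed earlier in the thread; the in-sample scores are discarded for evaluation purposes.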

With regard to walk forward testing, I might suggest you take a look at meyersanalytics.com. He points out that the maximum equity metric may not be the best one to use to select a parameter set for walk forward testing.

In fact, it seems to me that selecting the metric that best predicts future results, versus the one that best measures past results, is an interesting area in itself. It is quite similar to selecting the best fitness function. This is non-trivial, as I know users of Adaptrade understand. Yet it seems most people just look at maximum equity. I think this is definitely an area where Adaptrade shines.

Howard B

Sep 10, 2013, 5:24:04 PM
to adaptrad...@googlegroups.com
Hi Lawrence --

I agree with your comments.

I am in the process of writing another book.  One of the topics is metrics.  During development, the goal is accurate identification of patterns that precede profitable trades.  The best, that is, the least biased, metrics focus on the trades themselves (actually, the distribution of trades) -- such as number of trades, gain per trade, maximum adverse excursion, maximum favorable excursion -- rather than on results that depend on a specific sequence of trades -- such as drawdown.  Particularly if compounding is allowed during testing, using maximum equity as a metric introduces an unfavorable bias.  Alternatives that rank high in maximum equity often perform poorly out-of-sample.

Best regards,
Howard

 

Howard B

Sep 10, 2013, 5:13:55 PM
to adaptrad...@googlegroups.com
Hi Ell --

I do stand with my statement about drawdown increasing in proportion to the square root of holding period.  And I believe the markets are very nearly efficient -- near enough that the diffusion equations that describe random motion are reasonable approximations of financial data.  I do hold the opinion that the markets are becoming more efficient, and that developing profitable trading systems is becoming increasingly difficult.  One of the sections in one of my books discusses this.  I list the steps I recommend when there seems to be nothing that works.  One of the steps is to find another vocation.

If you agree that the goal of system development is to have systems with quantifiable measures of risk, return, and confidence, I see no better validation technique than walk forward.  What would you use as an alternative?  Or, if you have a different goal, what is it?  And how do you achieve it?

The steps for using walk forward validation, including the transitions from development to trading, are described in complete detail in my books and presentations.  In short -- following validation use the final top-ranked model to trade live.  Resynchronize periodically.  There is a brief explanation in Chapter 2 of the Mean Reversion book.  It is a free download:
http://www.meanreversiontradingsystems.com/book.html

I am a skeptic as to the value of synthetic data.  And I will be until I hear a reasonable and detailed explanation of what can be accomplished using it.  (I spent some time in the managed fund industry developing and implementing systems.  I know of no credible use of synthetic data.)

Best regards,
Howard

 

Ell

Sep 11, 2013, 8:05:31 AM
to adaptrad...@googlegroups.com
Dr. Bandy,

I am puzzled by your statement that 

" I do stand with my statement about drawdown increasing in proportion to the square root of holding period.  And I believe the markets are very nearly efficient -"

System traders, consciously or unconsciously, subscribe to the view that markets are by no means efficient. Otherwise there is no point in using software like Adaptrade, or technical analysis in general. There is also no point in writing books about trading system development, other than to discourage traders from developing systems. So I must say I see a lot of contradiction between your statement above and what you do. Nearly efficient means that no system can be profitable once one includes friction.

As far as walk-forward testing goes, I am aware that this method was proposed many years ago by a trader in one of his books, but there is no formal justification for it, as results depend highly on the lengths of the intervals. On the contrary, out-of-sample cross-validation is justified formally in data mining texts. To accept walk-forward testing as superior to plain out-of-sample testing, I would have to see extensive studies comparing false acceptance and false rejection rates. I am simply not going to use a method because it sounds good, or because someone says, without justification, that it is superior to others. In that respect I agree with you about the use of artificial, random data: there is simply no justification for it, only sketchy arguments.

If you present evidence that the markets are nearly efficient, I will quit trading the same day. Please note that what you said is exactly the same argument many use in forums to discourage people from trading. I hope that was not your intention.

Thank you.

Michael R. Bryant

Sep 11, 2013, 1:53:15 PM
to adaptrad...@googlegroups.com

Ell,

I think Dr. Bandy has been quite clear in his explanations, which are entirely reasonable. Anyone who has traded the markets would probably agree that they are nearly efficient, as evidenced by the difficulty of being a successful trader for a sustained period of time. However, there is a big difference between "nearly efficient" and entirely efficient. The entire CTA (Commodity Trading Advisor) business, for example, is based on this difference, as is the business of anyone else who trades for a living. Nearly efficient in this context clearly means it's possible to find an edge even after accounting for trading costs.

 

Drawdown increasing in proportion to the square root of the holding period is an approximation. So is Newtonian physics. I don’t think anyone would argue against the utility of Newton’s laws of physics.

 

As a practical matter, walk-forward testing is essentially the same as cross-validation. It serves the same purpose in essentially the same way. Whether or not it's superior to using a single out-of-sample interval, I suppose, would depend on your purpose. I've always thought the main benefit of walk-forward testing is that it more accurately simulates how a strategy would be traded live; i.e., you trade it for a while, re-optimize the parameter values, trade for a while longer, re-optimize, etc. If that's how you plan to trade, walk-forward testing is clearly how you want to test your strategy, since the results will be more representative.

 

Mike Bryant

 


Howard B

Sep 11, 2013, 1:56:33 PM
to adaptrad...@googlegroups.com
Hi Ell --

Please interpret "very nearly efficient" as "very difficult to detect profitable opportunities" -- not as "no point in writing ..."  "Nearly efficient" does not mean "no system can be profitable."

Cross validation is discussed in data mining books and it is appropriate for use in validating some categories of models, but it is not appropriate for systems trading financial time series.

I am confused.  Your postings include statements in precise agreement with mine, followed by rejection of mine.  You have expressed your point that you do not believe the techniques I recommend are valid.  Short of reiterating about 1000 pages of text and hundreds of examples, I will be unable to provide more. 

If your methods work for you and help you develop profitable systems that you are confident in using, then continue using them -- you do not need my advice. 

Let's end this.  Have a good day.    

Best regards,
Howard
 