Re: Eureqa - Eureqa Time Series Models

Andrew Lamb

May 10, 2013, 2:04:40 PM
to eureqa...@googlegroups.com
If you would be willing to help me reproduce the problem locally on my machine, I would love to help you figure out what is going on.


 


On Fri, May 10, 2013 at 12:28 PM, Devon Kyle <dsk...@gmail.com> wrote:
Hello
I'm using Eureqa to build models for financial time series data (daily gold prices) of 400+ rows.
The value I'm attempting to predict is an optimal BUY, HOLD, SELL signal, a decimal value from approximately -2.0 to +2.0.
I have software that generates these numeric signals for me based on what would have been the perfect time to BUY or SELL
were I to have a "crystal ball" to see into the future
(SELL <= -0.5, BUY >= +0.5, all values in between are a HOLD signal).
 I have built a Eureqa model with this data using Correlation Coefficient, with 70% training and 30% validation data,
 with a returned Correlation Coefficient of .99 and a Fit of 0.006 (Solid Green).
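
The signal thresholds quoted above (SELL <= -0.5, BUY >= +0.5, HOLD in between) can be written down as a small helper. This is only a sketch of the stated rule; the function name `classify_signal` is hypothetical and not part of Eureqa or of the poster's software.

```python
def classify_signal(value: float) -> str:
    """Map a numeric signal to SELL / HOLD / BUY using the thresholds from the post."""
    if value <= -0.5:
        return "SELL"
    if value >= 0.5:
        return "BUY"
    return "HOLD"


if __name__ == "__main__":
    # 0.667 is the target value mentioned later in the post; the others are boundary checks.
    for v in (-2.0, -0.5, 0.0, 0.667, 2.0):
        print(v, classify_signal(v))
```

Under this rule a model output of exactly 0 always falls in the HOLD band.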
 
 I've taken the solution formula that Eureqa has given me
 
 Prediction( my Optimal Signal) =
less(PERCT_CHG_STOCHASTIC, -0.1638)/cosh(-1.245*PERCT_CHG_MACDHistogram*PERCT_CHG_STOCHASTIC)
 
 WHERE (1 example row)

 PERCT_CHG_STOCHASTIC = 1.105438  (Percent change of the Stochastic Indicator  from the previous day close value)
 PERCT_CHG_MACDHistogram = 0.004059  (Percent change of the MACD Histogram Indicator from the previous day close value)

 
 I believe the result I'm getting in the example above breaks down to 0 / cosh(-1.245 * 0.004059 * 1.105438),
 which evaluates to 0. The Optimal Signal for that row should have been close to 0.667 (a BUY signal).
 I'm assuming cosh is the hyperbolic cosine function even though it's not listed in the Eureqa Building Blocks documentation.
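
The arithmetic for the example row can be checked outside Excel with a short Python sketch. It assumes that `less(a, b)` returns 1 when a < b and 0 otherwise, and that `cosh` is the hyperbolic cosine; both are assumptions about the Eureqa building blocks, not something confirmed in this thread. The function name `predict` is hypothetical.

```python
import math


def less(a: float, b: float) -> float:
    # Assumed semantics of the Boolean building block: 1.0 if a < b, else 0.0.
    return 1.0 if a < b else 0.0


def predict(perct_chg_stochastic: float, perct_chg_macd_histogram: float) -> float:
    """Evaluate the solution formula from the post on raw (unnormalized) inputs."""
    numerator = less(perct_chg_stochastic, -0.1638)
    # cosh(x) >= 1 for every real x, so the division can never fail.
    denominator = math.cosh(-1.245 * perct_chg_macd_histogram * perct_chg_stochastic)
    return numerator / denominator


if __name__ == "__main__":
    # Example row from the post: 1.105438 is not less than -0.1638, so the numerator is 0.
    print(predict(1.105438, 0.004059))
```

Under these assumptions the numerator is either 0 or 1 and the denominator is at least 1, so the raw formula can only produce values between 0 and 1. It cannot reach the negative (SELL) side of the -2.0 to +2.0 target range without some additional scaling or offset.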
 
 I've implemented the formula in Excel with the raw data that I used to build the model
 for hundreds of rows, and the values this formula returns are not even CLOSE to the Optimal Signal values that I'm attempting to model. I wasn't expecting it to be 100% accurate, but certainly not 0% accurate.

Just as a sanity check, I've done this with about 5-10 different datasets, resulting in very different formulas, and I'm getting the same type of (bad) results when I test the formulas in Excel on the original datasets and their Optimal Signals.
I'm obviously missing something very fundamental in my model building process or understanding.
 

Also, one thing I've noticed is that using any of the Boolean Building Block functions seems to result in most equations with division or multiplication operators returning a value of zero (0 / (some long equation) = zero, or (some long equation) * 0 = zero). Zero very
rarely represents the value I'm attempting to model.
Should I be using Boolean functions only when I'm processing binary datasets?

Any feedback would be appreciated.
Thanks in advance.
Devon Kyle

 
 

Andrew Lamb

May 11, 2013, 4:34:37 PM
to eureqa...@googlegroups.com

Thanks!

On May 11, 2013 4:22 PM, "Devon Kyle" <dsk...@gmail.com> wrote:
Will do, Andrew. Let me prep a small zip file and forward it to you; I should be able to get to it this weekend.
After we determine the issue, I'll make sure to post back here to share the info with the entire forum. Thanks
devon

Michael Schmidt

May 15, 2013, 7:28:18 PM
to eureqa...@googlegroups.com
Kyle, have you tried using the "Evaluate model on a dataset" report in the "Report/Analyze" tab? You can calculate the outputs from this, and it will take into account any normalization settings you set, and produce the exact calculation used internally.

Michael


Michael Schmidt

May 22, 2013, 2:15:26 PM
to eureqa...@googlegroups.com
I definitely agree: any preprocessing you can do ahead of time outside of Eureqa is always preferred.


On Thu, May 16, 2013 at 7:42 PM, Devon Kyle <dsk...@gmail.com> wrote:
I JUST discovered that feature (totally by accident) two days ago. It's a really wonderful feature within Eureqa and I'm doing a lot of experiments with it. It's great because I can create a model, merely add new rows of data, and get an immediate output response based on a pre-existing model I created within Eureqa. Much, much easier than mucking around in Excel.
I also realized that the issues I mentioned in this original thread are probably due to the preprocessing (outliers, normalization, smoothing, etc.) that I was performing on my raw dataset within Eureqa, which I wasn't factoring into or duplicating in my Excel equation process. I'm sure that has something to do with my discrepancies.
 
My ultimate goal is to create time series models that I can backtest and use in a real-time automated algo environment: create the models in Eureqa and recode those equations/models in C++, C#, or Python. Because the preprocessing functions within Eureqa seem to be a bit of a black box, my latest direction is to do my preprocessing (smoothing, outliers, moving averages, etc.) before I input the dataset into Eureqa, build the model on that "raw" set of data points, and then match the exact same preprocessing in my algo app code. Hopefully that will get me comparing apples to apples.
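
One way to keep the two sides identical is to put the preprocessing in a single function and call it both when exporting the dataset for Eureqa and when feeding live rows to the recoded model. The sketch below is a minimal illustration under stated assumptions: a trailing simple moving average followed by a row-to-row fractional change. The function names, the window of 3, and the sample numbers are all hypothetical and are not taken from the thread or from Eureqa.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing simple moving average; the first window - 1 rows are dropped."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def percent_change(values: Sequence[float]) -> List[float]:
    """Fractional change from the previous row (assumes no previous value is 0)."""
    return [(cur - prev) / prev for prev, cur in zip(values, values[1:])]


def preprocess(raw: Sequence[float], window: int = 3) -> List[float]:
    """Single preprocessing path shared by model building and live prediction."""
    return percent_change(moving_average(raw, window))


if __name__ == "__main__":
    # Made-up indicator values, for illustration only.
    raw_indicator = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(preprocess(raw_indicator))
```

The design choice here is simply that the same `preprocess` call produces both the columns given to Eureqa and the inputs passed to the recoded equation, so any change to the smoothing is applied in both places at once.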
 
I know you and your Nutonian team are busy with lots of things, so I wanted to thank you, Michael, for taking time out of your busy day and reaching out with your input and feedback. Appreciate it.
Devon Kyle
P.S. A possible idea for a future version of Eureqa: some kind of API where I could create my own preprocessing functions (C++, Python, etc.) and apply them to my raw datasets before
processing the models?