Hello
I'm using Eureqa to build models for financial time series data (daily gold prices) of 400+ rows.
The value I'm attempting to predict is an optimal BUY, HOLD, or SELL signal, a decimal value from approximately -2.0 to +2.0.
I have software that generates these numeric signals for me based on what would have been the perfect time to BUY or SELL, were I to have a "crystal ball" to see into the future
(SELL <= -0.5, BUY >= +0.5, all values in between are a HOLD signal).
I have processed a Eureqa model with this data using Correlation Coefficient, with 70% training and 30% validation,
and got back a Correlation Coefficient of 0.99 and a Fit of 0.006 (solid green).
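For reference, the signal thresholds described above can be written as a small helper. This is only a sketch: the function name `classify_signal` and its parameter names are made up for illustration, and only the -0.5 / +0.5 cutoffs come from the description.

```python
def classify_signal(value, sell_threshold=-0.5, buy_threshold=0.5):
    """Map a numeric optimal-signal value to BUY, HOLD, or SELL.

    The thresholds follow the description in the post:
    SELL <= -0.5, BUY >= +0.5, anything in between is HOLD.
    """
    if value <= sell_threshold:
        return "SELL"
    if value >= buy_threshold:
        return "BUY"
    return "HOLD"


# Example: the row discussed below has an optimal signal near 0.667.
print(classify_signal(0.667))  # BUY
```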
I've taken the solution formula that Eureqa has given me:
Prediction (my Optimal Signal) =
less(PERCT_CHG_STOCHASTIC, -0.1638)/cosh(-1.245*PERCT_CHG_MACDHistogram*PERCT_CHG_STOCHASTIC)
where, for one example row:
PERCT_CHG_STOCHASTIC = 1.105438 (Percent change of the Stochastic Indicator from the previous day close value)
PERCT_CHG_MACDHistogram = 0.004059 (Percent change of the MACD Histogram Indicator from the previous day close value)
I believe the result I'm getting in the example above breaks down to 0 / cosh(-1.245 * 0.004059 * 1.105438),
which evaluates to 0. The Optimal Signal for that row should have been close to 0.667 (a BUY signal).
I'm assuming cosh is the hyperbolic cosine function even though it's not listed in the Eureqa Building Blocks documentation.
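The evaluation of the example row can be reproduced outside Excel with a short Python sketch. It assumes that `less(a, b)` returns 1 when a < b and 0 otherwise, and that `cosh` is the standard hyperbolic cosine; both are assumptions about how Eureqa defines these building blocks, and the function name `predict` is made up for illustration.

```python
import math


def less(a, b):
    # Assumption: Eureqa's less(a, b) yields 1 if a < b, otherwise 0.
    return 1.0 if a < b else 0.0


def predict(pct_chg_stochastic, pct_chg_macd_hist):
    """Evaluate the formula from the post for one row of data."""
    numerator = less(pct_chg_stochastic, -0.1638)
    denominator = math.cosh(-1.245 * pct_chg_macd_hist * pct_chg_stochastic)
    return numerator / denominator


# Example row from the post: the numerator is 0 because 1.105438 is not < -0.1638.
print(predict(1.105438, 0.004059))  # 0.0
```

Under these assumptions the numerator is either 0 or 1 and cosh is never smaller than 1, so the formula can only return values between 0 and 1. It can never produce the negative values needed for a SELL signal, whatever the inputs are.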
I've implemented the formula in Excel with the raw data that I used to build the model,
for hundreds of rows, and the values this formula returns are not even CLOSE to the Optimal Signal values that I'm attempting to model. I wasn't expecting it to be 100% accurate, but certainly not 0% accurate.
Just as a sanity check, I've done this with about 5-10 different datasets, resulting in very different formulas, and I'm getting the same (bad) type of results when I test the formulas in Excel on the original datasets and their Optimal Signals.
I'm obviously missing something very fundamental in my model building process or understanding.
Also, one thing I've noticed is that using any of the Boolean Building Block functions seems to result in most equations with division or multiplication operators returning a value of zero: 0 / (some long equation) = 0, or (some long equation) * 0 = 0. Zero very rarely represents the value I'm attempting to model.
Should I be using Boolean functions only when I'm processing binary datasets?
Any feedback would be appreciated.
--
Thanks in advance.
Devon Kyle
Thanks!
Will do, Andrew. Let me prep a small zip file and forward it to you; I should be able to get to this over the weekend.
After we determine the issue, I'll make sure to post back here to share the info with the entire forum.
Thanks
devon
I JUST discovered that feature (totally by accident) two days ago. It's a really wonderful feature within Eureqa and I'm doing a lot of experiments with it. It's great because I can create a model, merely add new rows of data, and get an immediate output response based on a pre-existing model I created within Eureqa. Much, much easier than mucking around in Excel.

I also realized that the issues I mentioned in this original thread are probably due to the preprocessing (outliers, normalization, smoothing, etc.) that I was performing on my raw dataset within Eureqa, which I wasn't factoring into or duplicating within my Excel equation process. I'm sure that would have something to do with my discrepancies.

My ultimate goal is to create time series models that I can backtest and use in a real-time automated algo environment: create the models in Eureqa and recode those equations/models in C++, C#, or Python. Because the preprocessing functions within Eureqa seem to be a bit of a black box, my latest direction is to do my preprocessing before I input the dataset into Eureqa (smoothing, outliers, moving averages, etc.), build the model on that "raw" set of data points, and then match the exact same preprocessing in my algo app code. Hopefully that will get me comparing apples to apples.

I know you and your Nutonian team are busy with lots of things, so I wanted to thank you, Michael, for taking time out of your busy day and reaching out with your input and feedback. Appreciate it.

Devon Kyle

p.s. A possible idea for a future version of Eureqa: some kind of API where I could create my own preprocessing functions (C++, Python, etc.) and apply them to my raw datasets before processing the models?
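As a rough illustration of the "preprocess outside Eureqa, then reuse the same code in the algo app" idea above, here is a minimal Python sketch. The function names, the clipping bounds, and the window size are all hypothetical choices for illustration; this does not reproduce whatever Eureqa does internally.

```python
def clip_outliers(values, lower, upper):
    """Clamp each value into the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def moving_average(values, window):
    """Trailing moving average; each point uses only current and past values."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def preprocess(values, lower=-3.0, upper=3.0, window=3):
    """Single preprocessing pipeline shared by model building and live use."""
    return moving_average(clip_outliers(values, lower, upper), window)


# The same call would be used to prepare the dataset fed to Eureqa and,
# later, to prepare live inputs before evaluating the recoded formula.
print(preprocess([1.0, 2.0, 3.0, 100.0], lower=0.0, upper=4.0, window=2))
# [1.0, 1.5, 2.5, 3.5]
```

The trailing window is deliberate: a centered window would pull in future values, which are not available in a real-time setting.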