NHL Trend Aggregator

summerlink

Jan 28, 2016, 10:05:23 AM

to SportsDataBase

Hi everyone,

I've collected a set of 37 NHL trends ( some made by me, some taken from the SBR NHL Situational plays thread; see the nhlsdql attachment ). All of them are profitable in their own right.
I thought I'd do something to see how they work together.

So I put together a small tool that takes all the valid plays in the sports database suggested by all the trends and calculates the ROI and Yield for the "aggregated" super system.
A play in this aggregated system is any play of a particular trend that is not contradicted by another trend.

The ROI is calculated based on $100 bets, and all the plays are made on the ML ( moneyline ). The results below run from the beginning of the database up to and including today ( 27th January 2016 ).
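For readers who want to reproduce this kind of bookkeeping, here is a minimal sketch of flat-stake moneyline accounting. It is not the code of the tool described in this post: the class and method names are made up for illustration, and it assumes American odds with the same amount risked on every play, which may differ from how the figures below were actually computed.

```java
import java.util.Arrays;
import java.util.List;

/** Flat-stake moneyline bookkeeping (illustrative sketch, not the author's tool). */
final class MoneylineMath {

    /** One settled play: the American moneyline taken and whether it won. */
    static final class Play {
        final int line;
        final boolean won;

        Play(int line, boolean won) {
            this.line = line;
            this.won = won;
        }
    }

    /** Net profit of a single play when `stake` dollars are risked (line must be non-zero). */
    static double profit(Play p, double stake) {
        if (!p.won) {
            return -stake;
        }
        // Underdog (+150): win 150 per 100 risked. Favorite (-200): win 100 per 200 risked.
        return p.line > 0 ? stake * p.line / 100.0 : stake * 100.0 / -p.line;
    }

    /** Sum of net profits over all plays. */
    static double totalProfit(List<Play> plays, double stake) {
        double sum = 0;
        for (Play p : plays) {
            sum += profit(p, stake);
        }
        return sum;
    }

    /** Yield in percent: total profit divided by total amount staked. */
    static double yieldPercent(List<Play> plays, double stake) {
        return plays.isEmpty() ? 0 : 100.0 * totalProfit(plays, stake) / (stake * plays.size());
    }

    public static void main(String[] args) {
        List<Play> plays = Arrays.asList(
                new Play(150, true), new Play(-200, true), new Play(120, false));
        System.out.println(totalProfit(plays, 100));
        System.out.println(yieldPercent(plays, 100));
    }
}
```

With a different staking convention (for example, risking more on favorites to win a fixed amount), the yield changes even though the win/loss record stays the same.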

Period       Yield (%)   ROI ($)   Win    Loss

2006-2007    20.31       14244     329    214
2007-2008    18.64       12722     334    241
2008-2009     9.76        6945     308    272
2009-2010    11.40        7881     316    263
2010-2011    14.89       10753     346    260
2011-2012    14.98        9913     321    232
2012-2013    15.99        6768     208    150    ( lock-out shortened season, that's why there are only 358 plays )
2013-2014    16.43       11628     340    249
2014-2015    12.17        8497     319    249
2015-2016     1.50         563     159    151    ( see attachment for the list of plays )

Overall      14.11       89914     2980   2281   ( 56.64% hit rate, not necessarily that relevant since all the plays are on the ML )

As can be seen, after 9 strong years of almost 100 units of profit per year and a Yield of roughly 10% or better, came this year, with just 1.5% Yield and 5.6 u profit.

I'm sure that the set of trends I am using is not perfect, and I kinda need your help, guys, to polish the trends I use ( either add yours or improve the existing ones ).

I will get back sometime in the next couple of days with a mini version of the tool you guys can use for yourself.

Of course, with what I've done so far it's easy to transition to other sports and to over/under categories.
I will most likely be doing something similar for MLB before the season starts and then for NFL and NBA in the summer. So again, your input there will be valuable.

Cheers everyone

20152016plays.txt

nhlsdql

JJ 21

Feb 4, 2016, 2:49:02 PM

to SportsDataBase

I'd be interested to take a look at the tool you're building that helps rule out contradictory trends/systems. Does it come in the form of an SDQL code that we're able to use with any of our own trends?

Pcg

Feb 4, 2016, 7:44:15 PM

to SportsDataBase

The key thing now will be to see how this mass aggregation does moving forward.

A little trick to save some time is to remove random samples from your "research" pool, find your systems on what is left, and then test them on the samples you removed, to sort of move forward in time.
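The hold-out idea above can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name, the fixed random seed and the idea of splitting a plain list of games are mine, not something the thread specifies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Random hold-out split of a pool of games (illustrative sketch). */
final class HoldOutSplit {

    /**
     * Shuffles a copy of `games` and splits it in two.
     * Returns a list with two entries: index 0 is the research pool,
     * index 1 is the held-out sample to test on afterwards.
     */
    static <T> List<List<T>> split(List<T> games, double holdOutFraction, long seed) {
        if (holdOutFraction < 0 || holdOutFraction > 1) {
            throw new IllegalArgumentException("holdOutFraction must be in [0, 1]");
        }
        List<T> shuffled = new ArrayList<>(games);
        Collections.shuffle(shuffled, new Random(seed)); // fixed seed keeps the split reproducible
        int holdOutSize = (int) Math.round(shuffled.size() * holdOutFraction);

        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(holdOutSize, shuffled.size())));
        parts.add(new ArrayList<>(shuffled.subList(0, holdOutSize)));
        return parts;
    }
}
```

The important discipline is to look at the held-out part only once, after the systems have been fixed; otherwise it quietly becomes part of the research pool again.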

summerlink

Feb 6, 2016, 5:25:45 PM

to SportsDataBase

It's code-based.

The aggregating algorithm is this:

1) I take each trend, one at a time, and populate two maps ( an "on" map and a "fade" map ). More on these two in the notes section.

2) I build an auxiliary map, an exact duplicate of the "on" map.

3) For each game in the "on" map, I check whether its ID also exists in the fade map.
If it does, I remove it from the auxiliary map.

4) After the whole "on" map has been iterated through, the auxiliary map contains the list of all the games that exist only in the "on" map and not in the fade map.

Notes:
- a gameID is, for example: "20160206:Devils-Capitals"
- the fade map is built by reversing each game ID from the "on" map.
For example, if the first trend says "20160206:Devils-Capitals" is a play, then I add "20160206:Capitals-Devils" to the fade map and "20160206:Devils-Capitals" to the on map.

I'm almost sure there must be a more efficient way to filter out the contradictory games ( or to optimize my own ), but the method I just described does what it's supposed to and it doesn't have a performance penalty.
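The steps above can be sketched in Java roughly as follows. This is a reconstruction from the description, not the code inside the tool: it uses sets instead of maps, the class and method names are hypothetical, and it assumes the gameID format "date:Team-Opponent" with no hyphen inside a team name.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Keeps only the plays that no other trend contradicts (illustrative sketch). */
final class ContradictionFilter {

    /** "20160206:Devils-Capitals" becomes "20160206:Capitals-Devils". */
    static String reverse(String gameId) {
        int colon = gameId.indexOf(':');
        int dash = gameId.indexOf('-', colon + 1);
        String date = gameId.substring(0, colon);
        String team = gameId.substring(colon + 1, dash);
        String opponent = gameId.substring(dash + 1);
        return date + ":" + opponent + "-" + team;
    }

    /** Each inner list holds the game IDs that one trend flags as plays. */
    static Set<String> uncontradicted(List<List<String>> playsPerTrend) {
        Set<String> on = new LinkedHashSet<>();
        Set<String> fade = new LinkedHashSet<>();
        for (List<String> plays : playsPerTrend) {
            for (String id : plays) {
                on.add(id);            // step 1: the play itself
                fade.add(reverse(id)); // step 1: its mirror image
            }
        }
        Set<String> result = new LinkedHashSet<>(on); // step 2: auxiliary copy
        result.removeAll(fade);                       // steps 3 and 4
        return result;
    }
}
```

Because this sketch stores IDs in sets, a game picked by several trends appears only once in the result. Whether the actual tool counts such a game once or several times is not stated in the thread.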

Any decent junior high computer science student should be able to not only come up with the steps above, but also translate them into actual code.

I have tried to remove the worst performing trends from the set, but invariably there was an ROI and Yield penalty. Even the worst performing trends in the set I use have 7% ROI and give a couple of dozen units profit.

I'm more concerned with not having the best trends or having too much correlation between them. The main idea here is to come up with the correct set of trends. That's why I posted here, a place with SDQL gurus. That's why I need your help.

ps: I am aware there are other types of aggregating strategies that can be used ( like only considering trend consensus: at least 2 separate trends that give the same play and are not contradicted by some other trend ), as well as staking plans.

JJ, I am working on making it as easy to use as possible. I will get back when I have a version I am comfortable sharing.

Sydney

Feb 7, 2016, 5:16:00 PM

to SportsDataBase

Did you try to calculate the Sharpe ratio for your trends? The Sharpe Ratio is from the financial domain. Calculate the average return per season and the standard deviation. Then divide the average return by the standard deviation. Everything greater than 1 is pretty good. You can use my tool to calculate it: https://blooming-stream-4451.herokuapp.com/ (might be slow to start).

summerlink

Feb 10, 2016, 10:47:44 AM

to SportsDataBase

- Please read How To Use before getting your hands dirty

- Only NHL and MLB are supported so far ( only moneyline performance calculation is done so far, I still need to do spread based calculations for NBA and NFL )

- Please get back with feedback or problems you find

- Don't forget to share your own trends, as much as you are willing.

Enjoy. :)

ps: @Sydney: I have yet to even scratch the surface of all the available statistical tools that can show the actual value of the trends. That part will come in the future, of course.

TrendAggregator.jar

How to Use.txt

vitor marcos

Mar 19, 2016, 2:58:20 PM

to SportsDataBase

Summer,

can you share the MLB trends you have in your aggregator file?

summerlink

Mar 20, 2016, 4:56:36 AM

to SportsDataBase

I'm just in the midst of crunching through some 4K+ posts on the SBR Forums MLB situational plays threads from the past 2 years. I already have about 60 trends and I hope to collect as many as possible, to then be able to select the very, very best. I will include the Sharpe Ratio and Z score for each trend and supply a new version of the aggregator in the following days.

I will most likely not be sharing my findings if they are extremely valuable in terms of ROI.
I have already given you the tool that does the "hardest" part ( integrating with SportsDatabase API and aggregating the trends ). I gave some decent NHL trends. I even told you where I'm mining the trends from.

Your job is simple, put the puzzle together. Feed your trends to the machine, and hope for positive ROI over big sample sizes.

Vitor, you will see that in life the biggest joy comes from the things you worked the hardest for.

SystemSeeker

Mar 24, 2016, 5:52:15 PM

to SportsDataBase

So am I correct in assuming this Portfolio takes all of the qualifying teams from multiple queries and gives you the total results for each play?
For example, If I have 3 systems and they all 3 pick team A, and one system picks team B and one picks team C, and A, B and C all win, it will show a total of 5 wins (A,A,A,B,C), or just 3 wins (A,B,C)? Really liking what I've seen so far and want to make sure I have the correct instructions on how to use it.

Thanks


SystemSeeker

Mar 24, 2016, 5:59:33 PM

to SportsDataBase

Guess in a sense, what I am trying to say is, does it count the same team in multiple systems as multiple wins, or just as one win?


Message has been deleted

Sydney

Mar 26, 2016, 11:43:01 AM

to SportsDataBase

Can you tell what kind of adjustment you made? Are these trends impacted by the 3-on-3 overtime rule change?

Thanks

On Saturday, March 26, 2016 at 1:49:52 PM UTC+1, DrFill28 wrote:

This season is a severe outlier when it comes to several trends. A quarter of the way through the season my ROI was similar to yours. I was able to make the necessary adjustments after extensive troubleshooting, and my ROI is at 8% this season. I would be happy to show you some of my analysis and trends.

By the way, you say you extract your trends from SBR situational threads? Could you link that forum? I have developed all of my own trends and would like to compare.


May 1, 2016, 3:46:22 AM

to SportsDataBase

@drfill:

sumer...@yahoo.com

@ rest of you degens out there : I know it's a long post, but it's an interesting read, I promise :)

A month has passed in MLB and the 65 trend set I use went 41-41 with 2.44 units of profit and 2.61% ROI. A far, far cry from the lifetime results of 6156-3423 ( 30.65% ROI & 3414.55 units profit ).

The NHL trend set ( which I actually refined a bit since the first post ) is 141-116 this season, 37.95 units up ( 12.97% ROI ). Lifetime it's 1425-1059 with 636 units up and 22% ROI.

If you are curious, I did follow the MLB system by placing bets at the beginning of the season ( first couple of weeks ). After being approx. 10 units in the negative ( MLB poor start combined with an NHL bad stretch, and with a negative start on the over/under plays in MLB given by the trends ( yes, I tried THAT too ) ), I reverted to my old habits of poor bankroll management and strayed from the SDQL systems. You kinda know what happened next... ;) ( damn you OKC :) )
I should have been following only the MLB season, and only with moneyline plays. I'm not prepared yet to follow a system to the bone ( not disciplined enough ), but I hope one day I will be.

As a note, I found some problems I should have anticipated with following such a system to the letter. There are days on which I place bets at 5PM local time ( with the first games starting at 8PM ), and by the time the games actually start, line movement has made those plays ineligible. Conversely, there are situations where line movement makes a non-play active. Of course, in the long run, such situations should even out to 50-50 ( wins-losses ), but I was on the bad end of multiple such situations at the beginning of the season, which added to the frustration.

Also, line shopping is critical, to say the least ( I did anticipate this ). I recommend having at least 3 bookies ( 2 are simply not enough IMO ) where you can find the best line.

As far as the project goes, I haven't worked on it much in the past few weeks, but I have some tricks up my sleeve I want to try. Coming soon... :)

Best regards
Remus

summerlink

May 23, 2016, 9:17:10 AM

to SportsDataBase

quick update:

I started working on the random trend generator. It's easier than I first thought... ;)

Using a micro set of possible query parameters ( see below ), MLB as a sport of choice and letting the "beast" run for 30 minutes....

String[] elements1 = {"A", "H", "F", "D"};
String[] elements2 = {"p:H", "p:A", "p:W", "p:L", "p:D", "p:F"};
String[] elements3 = {"p:DAY", "p:NGT", "p:X"};
String[] elements4 = {"SG=1", "SG=2", "SG=3", "1<SG<4", "1<=SG<4", "n:SG=1"};
String[] elements5 = {"DIV", "C", "not DIV", "not C"};

... returned no fewer than 107 different trends with ROI of > 4% ! ( albeit 7 of them had tiny sample sizes of less than 12 ).

39 of them have ROI > 10%

The "beast" randomly picks an element from each of the arrays ( elements1 .... elements5 ) and concatenates them to form a query. So all queries have 5 parameters.

For Example :

F and p:F and p:DAY and 1<SG<4 and not C

ROI : 4.03 Profit : 1343 ( 134u ) 137W - 82L

There are 1728 total combinations for these 5 arrays and I suspect that most of them were covered since my initial result list had a lot of duplications ( :-) )
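Since the search space is this small, the queries could also be enumerated exhaustively instead of sampled at random, which avoids the duplicates. The sketch below is not the "beast" itself: the class and method names are hypothetical, and only the five arrays and the " and " joining rule come from the post.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates every query that picks one element from each group (illustrative sketch). */
final class QueryEnumerator {

    static final String[][] ELEMENTS = {
        {"A", "H", "F", "D"},
        {"p:H", "p:A", "p:W", "p:L", "p:D", "p:F"},
        {"p:DAY", "p:NGT", "p:X"},
        {"SG=1", "SG=2", "SG=3", "1<SG<4", "1<=SG<4", "n:SG=1"},
        {"DIV", "C", "not DIV", "not C"}
    };

    /** Builds the Cartesian product of the groups, joining the picks with " and ". */
    static List<String> allQueries(String[][] groups) {
        List<String> queries = new ArrayList<>();
        queries.add(""); // empty prefix to start from
        for (String[] group : groups) {
            List<String> next = new ArrayList<>();
            for (String prefix : queries) {
                for (String element : group) {
                    next.add(prefix.isEmpty() ? element : prefix + " and " + element);
                }
            }
            queries = next;
        }
        return queries;
    }

    public static void main(String[] args) {
        // 4 * 6 * 3 * 6 * 4 = 1728 queries, each generated exactly once
        System.out.println(allQueries(ELEMENTS).size());
    }
}
```

Exhaustive enumeration stops being practical once more parameter groups are added, which is where random sampling or a smarter search earns its keep.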

Next steps ... cover all the query parameters supported by SDQL, and introduce Z scores as filter, alongside ROI. Also will need a way to introduce trend correlation calculation into the fold.

summerlink

May 26, 2016, 4:20:03 AM

to SportsDataBase

I want to add more aggregation strategies, so I'm rewriting the code that performs the aggregation logic.

I'm thinking about the following strategies :

1) Unanimity of n ( Un ) , which means there has to be a consensus among all trends that have an "opinion" on a game, and the consensus should be of at least n trends.

2) Majority of p ( Mp ), which means there has to be a majority of at least p% among all trends that have an "opinion" on a game. This means trends that contradict the majority are allowed as long as they don't exceed ( 100 - p )% of the trends with an opinion.

Currently, the strategy used is U1 ( at least 1 trend "for" and no trends against )
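Read literally, the two rules reduce to simple predicates over how many trends are "for" and "against" a given side of a game. The sketch below is one possible reading with hypothetical names; in particular, treating "at least p%" as inclusive and requiring at least one opinion for Mp are assumptions.

```java
/** Un and Mp as predicates over vote counts for one side of a game (illustrative sketch). */
final class AggregationStrategies {

    /** Un: nobody disagrees and at least n trends back the play (n is expected to be >= 1). */
    static boolean unanimity(int votesFor, int votesAgainst, int n) {
        return votesAgainst == 0 && votesFor >= n;
    }

    /** Mp: at least p percent of the trends with an opinion back the play. */
    static boolean majority(int votesFor, int votesAgainst, double p) {
        int total = votesFor + votesAgainst;
        return total > 0 && 100.0 * votesFor >= p * total;
    }

    public static void main(String[] args) {
        System.out.println(unanimity(1, 0, 1));  // U1: one trend for, none against
        System.out.println(majority(3, 1, 70));  // M70: 75% of opinions are for
        System.out.println(majority(3, 2, 70));  // M70: only 60% are for
    }
}
```

Under this reading U1 is the current behaviour, and M100 coincides with U1 as well, since a 100% majority leaves no room for a dissenting trend.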

What other aggregation strategies do you think can be valuable , besides Un and Mp ?

ps: I've added average line calculation to the mix, so Z score is next ( based on the thread that discussed Z scores ).

summerlink

May 28, 2016, 9:46:08 AM

to SportsDataBase

Things are getting really interesting... :)

After successfully implementing Un and Mp, adding Z scores to the output and enhancing the random trend generator for MLB ( it now considers 7 params ), I can safely chillax while the laptop is running and doing the scavenging work. In ~ 30 minutes it spit out a half dozen trends with a Z score of > 2.5, which is not bad at all in my estimation.

Of course, I would like something better than just random generation. So my next idea will be to somehow make this dummy generator a bit smarter, to be able to self-refine/adjust trends that show potential.

And the search for a way to automatically generate optimal trend sets can and should have another feature : for a given set of trends with big Z scores ( TZ ), find the optimal subset. Which means the one containing the optimal trends along with the optimal strategy ( Un for a given n, or Mp for a given p )

This is computationally expensive, to say the least: there are (2^k)*(n+p) possible combinations, where k is the number of trends in TZ and n and p here stand for the number of candidate values tried for Un and Mp. I figure that the partial trend set TZ will have at least a couple dozen trends. With k ~= 40, n taking values from 1 to 5 and p in { 51, 60, 70, 80, 90 }, that means somewhere along the lines of 10^13 ( does 10 trillion sound better? ) combinations to be analyzed. To find the very "fucking" best one of course.
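The order of magnitude is easy to check. The small helper below is purely illustrative (the class name is made up); it counts every subset of k trends paired with every candidate strategy, using the numbers from the paragraph above: k = 40 and 5 + 5 = 10 strategies.

```java
/** Size of the brute-force search space: subsets of k trends times candidate strategies. */
final class SearchSpace {

    static long size(int k, int strategies) {
        if (k < 0 || k > 58) {
            throw new IllegalArgumentException("k out of range for long arithmetic");
        }
        // 2^k subsets, each evaluated under every strategy; throws on overflow
        return Math.multiplyExact(1L << k, (long) strategies);
    }

    public static void main(String[] args) {
        // 2^40 * 10 = 10,995,116,277,760, i.e. about 1.1 * 10^13
        System.out.println(size(40, 10));
    }
}
```

At a million evaluations per second that is roughly four months of compute, so an exhaustive search is out of reach on a laptop and some heuristic (greedy selection, random restarts, and so on) would be needed.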

summerlink

Jun 1, 2016, 9:08:09 AM

to SportsDataBase

After 2 months of MLB, my trend set has performed way below the expected results from previous years. YoY results are below ( 2016 includes games up to and including 31st of May ). Is this set due for a regression to the mean? Or did something happen in 2014 with the game itself, so that most trends that held up until then started reversing?

Who knows....

Season	ROI	Zscore	AVG	Profit	Wins	Losses	Pushes
2005	29.92	8.79	103	25800	473	269	0
2006	30.68	8.74	103.4	24915	451	251	0
2007	21.93	6.57	-101.5	19895	460	301	0
2008	26.44	7.77	102	22995	469	280	0
2009	30.69	8.99	101	26485	478	258	0
2010	31.85	9.24	103.6	26815	461	251	0
2011	23	6.68	-100.8	19430	444	276	0
2012	30.52	8.82	-100.9	25697	464	247	0
2013	21.68	6.36	104.1	18756	448	303	0
2014	9.68	2.84	102.9	8451	414	350	0
2015	8.57	2.49	101.5	7367	402	343	0
2016	-6.72	-1.02	-103	-1664	96	111	0

Ognj3n

Jun 1, 2016, 2:28:32 PM

to SportsDataBase

Well You seem to have picked up a lot of stuff ( queries ) that used to work, but the bookies have caught up to over time.
The betting market is a dynamic environment and if something works for a while, more and more people start using it until it influences the market prices ( lines ).

I will give You an example of a simple system that got outdated :

http://sportsdatabase.com/nhl/query?output=default&sdql=A+and+wins%3Co%3Awins+and+season&submit=++S+D+Q+L+!++

( sort it by season )

Plain betting the worse team in an away situation worked flat out until 2009 then the market caught up til 2011 and it got very unreliable.
This system can be optimized, for example if You add the opponent coming off the loss :

http://sportsdatabase.com/nhl/query?output=default&sdql=A+and+wins+%3C+o%3Awins+and+op%3AL+and+season&submit=++S+D+Q+L+!++

The ROI jumps, but if You add "and season" it still shows the same unreliability after the lines slowly adjusted 2009>>2011.
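For anyone generating such links from code, the query string in the two URLs above is just the SDQL text in standard form encoding. The sketch below rebuilds the first link's `sdql` parameter with the JDK's `URLEncoder`. The helper name is hypothetical, and it leaves out the `submit` parameter seen in the links; whether the site needs that parameter is not something this thread establishes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds a query link shaped like the ones in this thread (illustrative sketch). */
final class SdqlUrl {

    static String queryUrl(String sport, String sdql) {
        try {
            // URLEncoder turns spaces into '+' and characters like '<' and ':' into %XX escapes
            return "http://sportsdatabase.com/" + sport + "/query?output=default&sdql="
                    + URLEncoder.encode(sdql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(queryUrl("nhl", "A and wins<o:wins and season"));
    }
}
```

Appending " and season" to any trend before encoding it gives the per-season breakdown discussed here.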

That seems to have happened to Your dataset. The cumulative ROI goes from 30ish in 2005>>2012
to 9-10 in 2014>>15 to going kerplunk this year.

"and season" is a very important parameter to use in addition to the z-score to see if stuff makes sense in the end, and if it is valid.

Sydney

Jun 1, 2016, 5:24:38 PM

to SportsDataBase

@summerlink: Try to calculate the Sharpe Ratio for your system by adding "and season". Then the formula is pretty simple: average profit by season / standard deviation of profit by season. A Sharpe Ratio > 1 is considered pretty good.

For the first trend Ognj3n posted:

-2203
-4886
-294
-729
-754
1158
2646
2975
3783
7259

Average: 895.5
Standard Deviation: 3433.340947
Sharpe Ratio: 0.260824664

which is poor.

The second trend is better but not great with a Sharpe Ratio of 0.55
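The calculation is short enough to do without any tool. The sketch below uses the ten values listed above for the first trend and the simplified definition from this thread (mean divided by standard deviation, with no risk-free rate). It assumes the sample standard deviation (dividing by n - 1), which is what reproduces the quoted 3433.34; the class and method names are hypothetical.

```java
/** Simplified per-season Sharpe ratio: mean profit / sample standard deviation. */
final class SharpeRatio {

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 in the denominator); needs at least two values. */
    static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    static double sharpe(double[] seasonProfits) {
        return mean(seasonProfits) / sampleStdDev(seasonProfits);
    }

    public static void main(String[] args) {
        double[] profits = {-2203, -4886, -294, -729, -754, 1158, 2646, 2975, 3783, 7259};
        System.out.println(mean(profits));
        System.out.println(sampleStdDev(profits));
        System.out.println(sharpe(profits));
    }
}
```

With only about ten seasons per trend the estimate is noisy, so small differences between two trends' ratios should not be over-interpreted.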

Ognj3n

Jun 2, 2016, 4:23:59 AM

to SportsDataBase

To make it clearer let's add "and p:H" then :

http://sportsdatabase.com/nhl/query?output=default&sdql=A+and+p%3AH+and+wins%3C%3Do%3Awins+and+op%3AL+and+season&submit=++S+D+Q+L+!++

I was trying to point out the decline over time: in the first five years You'll find a Sharpe ratio of ~1.3, in the last five much less. That was the point.
We had to cut the sample size in half, though. Here it makes sense, but that is not often the case.

JJ 21

Jun 29, 2016, 2:06:20 PM

to SportsDataBase

Wow, this stuff is intense guys. Very impressive. I've been following along and once in a while just like to read back through and try to absorb the theories. One thing that can't be left out of the equation is good old fashioned handicapping. Knowledge of the game, line shopping, 'feel', homework (checking injuries, etc). When all of the numbers point one direction and then the capping notes line up, that's something that (hopefully) goes above what the bookies and consensus bettors are leaning on.

Sometimes when we see a once-strong angle that worked from say, 1989-2009, and then fell off, we also have to consider how the game has changed. There have been some pretty big shifts in the way NFL and NHL games are played between 2000 and 2016.

Anyway, continued success in your hard work. I'm inspired to want to take a computer science course and/or programming course one of these days to advance my knowledge.

Cheers!

summerlink

Jul 6, 2016, 11:00:11 AM

to SportsDataBase

Completely agree with what JJ said above.

Just an example : I just cannot ( in good faith ) put money on Cincy @ Cubbies, when Cody Reed ( one of the biggest fades in recent memory ) goes up against a prolific offense and arguably the best team in MLB, which just happens to play at home after a road sweep. All these arguments led me to CHC ( which I correctly backed ), even though my Trend Aggregator said to happily put money on Cincy at +200.

Nevertheless, I want to finish what I started. There will be a tool made available to this forum sometime in the near future.

But big cautions/warnings will have to be addressed before believing it will be (anything remotely like) a money-making machine.

Steve S

Oct 24, 2016, 11:09:43 PM

to SportsDataBase

Apologies for such a late reply, but I just tried the TrendAggregator, and got a java exception:

$ java -version
java version "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b16, mixed mode)

$ java -jar TrendAggregator.jar nhl nhltrends
java.net.UnknownHostException: proxy.houston.hp.com
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    ...

Suggestions?

Thanks.

Steve


Message has been deleted