Mission statement

Henry Harrison

Jan 2, 2015, 9:27:05 PM
to dfs-portfo...@googlegroups.com
Here is the reddit thread that prompted the creation of this group: http://www.reddit.com/r/dfsports/comments/2qtwmm/looking_to_team_up_on_a_dfs_statisticswebsite/

Beyond projections

In the DFS world, everyone thinks they have what it takes to come up with better projections than everyone else. As a result, the market is saturated with projections. Many of these are publicly available. Therefore, any value added by coming up with projections is only marginal, as it must be compared to the value you could get for free from others' projections. Furthermore, while the advantage of more accurate projections is self-evident, I think it is smaller than people realize. Having a slightly more accurate projection will lead to a higher win rate, but given the volatility of fantasy sports, it may not be that much higher. For these reasons, I want to pioneer a strategy that goes beyond projections. I want projections to be not the endpoint of a DFS strategy, but rather the starting point.

What is beyond projections, you ask? Probability distributions. Whereas projections may predict the most likely outcome, a probability distribution estimates the likelihood of all possible outcomes. Having probability distributions on hand allows one to come up with more advanced strategies than the status-quo approach of maximizing a lineup's projected score. I believe a distribution-based approach is especially promising for constructing portfolios of many unique lineups in a single slate, and more promising for GPPs than for cash games.
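To make the distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the two players, their means and spreads, the 30-point "huge game" threshold, and the assumption that scores are normally distributed (real fantasy-point distributions are probably skewed). It shows two players with the same projection but very different chances of a huge game.

```python
import random
import statistics

def simulate_scores(mean, stdev, n_sims=20000, seed=0):
    """Draw simulated fantasy-point scores from an assumed normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n_sims)]

def prob_at_least(scores, threshold):
    """Fraction of simulated scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

# Two hypothetical players with the same projection (mean) but different spreads.
steady = simulate_scores(mean=15.0, stdev=3.0, seed=1)
boom_bust = simulate_scores(mean=15.0, stdev=9.0, seed=2)

for name, scores in [("steady", steady), ("boom_bust", boom_bust)]:
    projection = statistics.mean(scores)
    huge_game = prob_at_least(scores, 30.0)
    print(f"{name:>9}: projection {projection:.1f}, P(score >= 30) = {huge_game:.3f}")
```

A projection-only approach cannot tell these two players apart, while the distribution shows that only one of them has a realistic shot at the kind of score that wins a GPP.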

Note: I don't intend to discourage anyone from working on projections (in the traditional sense of producing a single number per player). I can certainly see the appeal in that. But I want to make it clear that producing better projections is explicitly not the purpose of this group. If you produce better projections, we will use them, but that is not the reason I put this group together.

A note about different sports

I believe this strategy is sports-independent. However, as a disclaimer I will add that my interest lies mainly in the NFL. While I believe we can keep many of the analytical tools sport-agnostic, this will not always be possible. For example, when it comes to procuring data, there will not be a sports-independent solution. In these cases, my personal efforts will focus on NFL. However, I don't mean that to be part of this group's mission; I encourage everyone to produce sport-agnostic work where possible, and otherwise to focus on their personal interests.

A portfolio strategy for GPPs

A portfolio strategy is the approach of entering many unique lineups in a single slate (not necessarily on a single site). There are many informal ways of describing this strategy. The way I like to think of it is that we are searching for players that have huge games. On any given slate, there are only so many of these players. To win a GPP, you need multiple in a single lineup. So, a portfolio strategy might be to consider all the players that could potentially have huge games, and enter all (or many) possible combinations of these players. The hope is that eventually you will "hit" with at least one lineup. You only need one first-place finish to cover thousands of entries.

That is a basic approach. Of course, it will get much more complicated as you estimate probability distributions for every player. It is no longer just about having some pool of players that could hit big, but about taking into account each player's particular likelihood of doing so and, importantly, understanding the way different players' outcomes are correlated with each other.
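As a rough illustration of why correlation matters, here is a small Python simulation. It is only a sketch under assumptions of my own choosing: two hypothetical players (think of a quarterback and his receiver) whose standardized scores share a common "game" factor, normal noise, and a threshold of one standard deviation above the mean standing in for a huge game. The function name and parameters are made up for this example.

```python
import random

def joint_boom_rate(shared_weight, threshold=1.0, n_sims=50000, seed=0):
    """Share of simulations in which both players beat `threshold`.

    Each standardized score is shared_weight * game_factor plus
    sqrt(1 - shared_weight**2) * individual noise, so each score has unit
    variance and the two scores have correlation shared_weight**2.
    """
    rng = random.Random(seed)
    own_weight = (1.0 - shared_weight ** 2) ** 0.5
    both = 0
    for _ in range(n_sims):
        game = rng.gauss(0.0, 1.0)  # factor common to both players
        a = shared_weight * game + own_weight * rng.gauss(0.0, 1.0)
        b = shared_weight * game + own_weight * rng.gauss(0.0, 1.0)
        if a >= threshold and b >= threshold:
            both += 1
    return both / n_sims

independent = joint_boom_rate(shared_weight=0.0)
correlated = joint_boom_rate(shared_weight=0.8)
print(f"independent players: both hit in {independent:.2%} of simulations")
print(f"correlated players:  both hit in {correlated:.2%} of simulations")
```

Each player's individual chance of a huge game is the same in both runs; only the chance that they hit together changes, which is what matters when a lineup needs several huge games at once.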

Goals

The dream of many an analytically minded DFS player is to automate lineup creation. That is not our immediate goal, for two reasons. First, to stay realistic: if we start with a goal that is too far off, we will get discouraged along the way. Better to work incrementally toward smaller goals. Second, the flip side of that: to mesh organically with our existing DFS workflows. We're not looking to replace these workflows, but rather to augment them. With that in mind, my general, long-term goals are:
  • to develop strategies for estimating the probability distributions of player outcomes;
  • to use these distributions to generate useful information, in table and graphical form, as aids in lineup construction;
  • to develop strategies for constructing lineup portfolios; and
  • to generate useful information, in table and graphical form, about lineup portfolios.

Steps for action

I will post a separate discussion thread for each of these.

- Secure data. There are four categories of data I have considered, in order of importance:
  1. Performances. The statistics from the players' actual performances. In many cases the fantasy-point scores will be sufficient, but complete data sets should be sought in order to produce tools that can be applied to any arbitrary fantasy-point system.
  2. Projections. If we are to consider projections as the starting point, we need projections. Current projections are easy enough, but if we want to fit our model, we'll also need historical projections. This is the challenge. Since they are of limited availability, it is of critical importance to begin archiving as many projections as possible.
  3. Salaries. Past salaries, for this year at least, seem to be available on a few different sites. Still, we should archive them so we're not at the mercy of these sites.
  4. Ownership percentages from past games. These may not be possible to get historically, but it is probably possible to begin archiving them. I haven't yet brought up ownership percentages, but they are something we should think about at some point.
- Develop statistical models for estimating fantasy-point probability distributions. This is the hardest part, I think, and the part that will require the most brainstorming and trial-and-error. This is where we need people with training in statistics.

- Develop and test strategies for constructing portfolios. 

- Develop a programming API for manipulating data and hypothetical lineups. My language of choice is Python, and there is already some infrastructure in the Python ecosystem. But I don't think there's anything wrong with exploring other directions if someone wants to.

- Develop visualizations and interactive tools. I've already started this with IPython widgets, although the code is too messy at this point.

- Put these interactive tools online. I listed the other action items in no particular order, but this one I think goes last. I don't want to put anything online until we have figured out what we're offering to the casual user. But there's no harm in putting up prototypes of various tools for our internal use, so that people who are interested in helping out but are not developers can see what we're doing. Not a priority for me, but someone may be interested in setting this up.

I'll leave this thread open to discuss any of the points I've raised or to have other big-picture discussions.

Andrew Knox

Jan 4, 2015, 5:17:32 AM
to dfs-portfo...@googlegroups.com
One question that the portfolio approach introduces is: at what point does it become unprofitable to play more unique lineups?

This, of course, can be built into any system (the user selects a desired risk level, which dictates how many unique lineups to produce and how often certain players are used), but I think the question should at least be in everyone's mind.

Out of curiosity, are you just starting from the assumption that the portfolio approach is more profitable than running a handful of lineups that you're extremely confident in?  Is there any data behind this belief?

Henry Harrison

Jan 4, 2015, 12:13:04 PM
to dfs-portfo...@googlegroups.com
That's an important question. My answer is NO. The most profitable strategy is unequivocally to play only one lineup each slate, the one that maximizes your expected winnings. The problem is that you can only play so many slates in a season, and it would take many lifetimes to play enough volume to overcome variance with a single-lineup approach.

I come from poker, where the equivalent of the GPP is the multi-table tournament (MTT). Check out this post. Basically, to overcome variance and have reasonable confidence that you are a profitable player, you need to play tens of thousands of MTTs. This is more than a full-time poker player is likely to play in a year. Now, I don't know how the variance of GPPs compares to that of MTTs; there could easily be an order of magnitude difference one way or the other. Point is, how long is it going to take to play ten thousand GPPs? The only possible way to do that in a reasonable amount of time is to enter multiple lineups. Otherwise, as far as I'm concerned, GPPs are little more than playing the lottery, with slightly better odds.
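To put rough numbers on the lottery comparison, here is a back-of-the-envelope Python sketch. It rests on two simplifying assumptions that are mine, not measured: each entry has the same hypothetical chance of a first-place finish (1 in 10,000 here), and entries are independent of one another. Lineups in the same slate are in fact correlated, so treat this as an idealization.

```python
def prob_no_first_place(p_win, n_entries):
    """Chance of zero first-place finishes in n_entries independent tries."""
    return (1.0 - p_win) ** n_entries

# Hypothetical edge: one first-place finish per 10,000 entries on average.
P_WIN = 1.0 / 10_000

for n in (100, 1_000, 10_000, 50_000):
    miss = prob_no_first_place(P_WIN, n)
    print(f"{n:>6} entries: {miss:.1%} chance of no first-place finish")
```

Under these assumptions, a player entering a few lineups per slate will almost certainly go a whole season without the one big finish that the strategy depends on, no matter how good the lineups are.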

By playing multiple entries, you lose some expected value, sure, but I don't think it's that much. For example, if you play the second-most optimal lineup every week, I'll contend (without doing any actual analysis, mind you) that you'll have something like 99.999% of the EV of a player who plays only the most optimal lineup. Even if you play the 100th-most optimal lineup, you're probably not looking at more than a 1% decrease in EV. IMO, the variance in outcomes overshadows the difference in EV between near-optimal lineups to such an extent that the latter becomes a near-meaningless consideration. This is why I think linear solvers are a good starting point but not a good ending point. I think a human can pick a lineup by hand that is "close enough" to the linear optimal solution that it basically doesn't matter.

Anyways, the take-away is that there's a tradeoff between EV and variance (someone mentioned this in one of the other threads). In my opinion this tradeoff is a no-brainer and the dropoff in EV is relatively small. Still, this is a decision that everyone needs to make for themselves, as it determines how many lineups you are comfortable entering in any given slate. That's one reason why I don't think we can have any kind of automatic tool that spits out a "best" portfolio without human interaction. What we should shoot for is visual aids that help you keep track of even simple things, like what percent of your lineups include each player. Even simple tools like that are completely missing for multi-entry DFS players.
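As an example of how simple such a tool can be, here is a minimal Python sketch of an exposure table. The function name, the player labels, and the representation of a lineup as a plain list of names are all placeholders for illustration, not an existing API.

```python
from collections import Counter

def player_exposure(lineups):
    """Return {player: percent of lineups containing that player},
    ordered from most-used to least-used."""
    # set(lineup) guards against counting a player twice within one lineup.
    counts = Counter(player for lineup in lineups for player in set(lineup))
    total = len(lineups)
    return {player: 100.0 * n / total for player, n in counts.most_common()}

# Hypothetical four-lineup portfolio with placeholder player names.
portfolio = [
    ["QB1", "RB1", "WR1"],
    ["QB1", "RB1", "WR2"],
    ["QB1", "RB2", "WR3"],
    ["QB2", "RB1", "WR1"],
]

for player, pct in player_exposure(portfolio).items():
    print(f"{player:<4} {pct:5.1f}%")
```

The same counts could feed a bar chart or a widget, but even the plain table answers the basic question of how much of the portfolio rides on each player.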