Volume, Sample Size, and Predicted Performance

Every two months or so, a rookie comes along that everyone treats like a shiny new toy. Last year it was Ollie Schniederjans, early this year it was Bryson DeChambeau, and currently it’s Jon Rahm. The ‘shiny new toy syndrome’ often happens for a reason: most rookies arrive with some sort of hype, from either college or amateur success. Oftentimes, they’ll come out swinging (no pun intended) and post some good initial results, and then their ownership levels are high the next week. A previously unknown rookie who starts strong is almost always a value in other DFS sports, so why aren’t fast-start rookies a lock in golf?

First, other sports don’t have odds-based pricing, so golf player pricing is far more efficient. Second, fast-start rookies consistently regress to a lower level of play. I’ve referred to this phenomenon on previous podcasts, where I talked about why I’ve been bearish on Bryson DeChambeau, and this is as good a place as any to elaborate.

Take a hypothetical: If two players have equal stats (Long-Term Adjusted Round Score, Driving Distance, Greens in Regulation, etc.), but one player has played in five tournaments and the other in 55, will they both do equally well on average next time? We can test this by introducing a volume index for each player. I define a player’s volume index as their total number of tournaments divided by the highest number of tournaments played by any player in the ranking pool. Everyone’s volume index therefore falls between 0 and 1, and the player with the most tournaments sits at exactly 1. Using my previous metric for relative performance, I took the average relative performance of all players and plotted it against their volume indices. Here’s how that plot looks:

[Figure: average relative performance plotted against volume index]
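For anyone who wants to try this at home, here is a minimal Python sketch of the volume index as defined above. The function names, the equal-width bucketing, and every number in the sample pool are my own placeholders for illustration; they are not the actual data or code behind the chart, and the relative performance values simply stand in for whatever metric you use.

```python
from statistics import mean


def volume_indices(tournaments_played):
    """Map each player to tournaments played / most tournaments played in the pool."""
    if not tournaments_played:
        return {}
    max_played = max(tournaments_played.values())
    if max_played <= 0:
        raise ValueError("at least one player must have played a tournament")
    return {player: n / max_played for player, n in tournaments_played.items()}


def average_by_bucket(indices, relative_performance, n_buckets=5):
    """Average relative performance within equal-width volume-index buckets.

    Bucketing is an assumption made here to get one point per volume range.
    """
    buckets = {}
    for player, idx in indices.items():
        # An index of exactly 1.0 is placed in the top bucket.
        bucket = min(int(idx * n_buckets), n_buckets - 1)
        buckets.setdefault(bucket, []).append(relative_performance[player])
    return {bucket: mean(values) for bucket, values in sorted(buckets.items())}


# Made-up pool: tournament counts and relative performance are placeholders.
pool = {"rookie": 5, "mid_career": 22, "veteran": 55}
placeholder_performance = {"rookie": -0.5, "mid_career": 0.0, "veteran": 0.5}

indices = volume_indices(pool)
averages = average_by_bucket(indices, placeholder_performance)
```

With this pool, the veteran sets the maximum and gets an index of 1, while the five-tournament rookie lands at 5/55. Plotting the bucket averages against the bucket ranges gives a curve of the same kind as the chart.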

This chart, in a nutshell, is why I will never roster rookies. Players with a low sample size perform an abysmal eight percent worse than their raw numbers would otherwise suggest. Now, to be fair, I could believe that the Rahms and DeChambeaus of the world are not your “typical” low-volume players, since they have known amateur success that other rookies don’t, so maybe the generic curve doesn’t apply to them.

But how much would you say it wouldn’t apply? 100 percent? 50 percent? Coming up with a number is completely speculative without actual data. And given how strong the pattern is otherwise, I remain very conservative about breaking from it without an ironclad reason. It’s admittedly anecdotal, but in most of these cases to date, even the highly touted rookies have in fact regressed from their hot starts, so I haven’t seen anything that would make me doubt that the volume curve applies to the hyped rookies as well.

One thing this chart doesn’t factor in, though, is the effect of low-volume players on overall risk. Remember, average performance represents average outcomes, but it doesn’t capture the range of outcomes, which is what we’ve been focusing on in our non-projection metrics series. In addition to correlating volume with average outcomes, we can piece together its effect on a player’s overall risk profile. Once the series on establishing our non-projection metrics is complete, we’ll get into how to do that.
