Here at FantasyLabs, we have a Trends tool that allows you to query tons of situations and see how they’ve historically led to DFS value. For example, you can see how a golfer has done at specific events when coming in with awful recent form or having never played the course before.
Using all of that data, specifically the data points in our PGA Models, we can see which metrics have been the most valuable at the U.S. Open and then use them to optimize a model for this week.
To measure value, we use a proprietary metric called Plus/Minus. It’s simple: Based on history, we know how many DraftKings points a golfer should score at his salary, and then we can measure performance above or below that expectation.
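In code form, the idea looks something like the minimal sketch below. It is illustrative only: the actual FantasyLabs expectation curve is proprietary, and the salary band, sample data and function names here are hypothetical.

```python
def expected_points(salary, historical_results):
    """Average DK points historically scored by golfers at a similar salary."""
    window = 300  # hypothetical salary band of +/- $300
    similar = [pts for (sal, pts) in historical_results if abs(sal - salary) <= window]
    return sum(similar) / len(similar) if similar else 0.0

def plus_minus(actual_points, salary, historical_results):
    """Performance above or below the salary-based expectation."""
    return actual_points - expected_points(salary, historical_results)

# Example: a $9,000 golfer scoring 85 DK points against made-up history
history = [(8800, 70.0), (9000, 72.5), (9200, 75.0), (9100, 68.0)]
print(plus_minus(85.0, 9000, history))  # positive => exceeded expectation
```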
Building an Optimal Model for the 2019 U.S. Open
Justin Bailey used our Trends tool to backtest previous U.S. Opens and see which metrics entering the tournament were the most predictive of success. He looked at the top 20% of golfers for each metric, and here’s how a corresponding model in Labs would look based on his research (total model points = 100); a sketch of how those weights combine into a single score follows the list.
- Recent Par-5 Scoring: 19 points
- Recent Scrambling: 14
- Long-Term Adjusted Round Score: 8
- Long-Term Scrambling: 8
- Recent Putts Per Round: 8
- Long-Term Tournament Count: 8
- Recent Adjusted Round Score: 7
- Long-Term Par-4 Scoring: 6
- Long-Term Birdies: 5
- Long-Term Bogeys: 5
- Long-Term Par-3 Scoring: 5
- Long-Term Driving Distance: 4
- Long-Term Eagles: 3
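Here’s a rough sketch of how a weighted model like that turns metric rankings into one score. The percentile inputs are made up for illustration; in Labs the underlying numbers come from the PGA Models themselves.

```python
# Weights from the top-20% backtest above (they sum to 100)
WEIGHTS = {
    "Recent Par-5 Scoring": 19,
    "Recent Scrambling": 14,
    "Long-Term Adjusted Round Score": 8,
    "Long-Term Scrambling": 8,
    "Recent Putts Per Round": 8,
    "Long-Term Tournament Count": 8,
    "Recent Adjusted Round Score": 7,
    "Long-Term Par-4 Scoring": 6,
    "Long-Term Birdies": 5,
    "Long-Term Bogeys": 5,
    "Long-Term Par-3 Scoring": 5,
    "Long-Term Driving Distance": 4,
    "Long-Term Eagles": 3,
}

def model_score(percentiles):
    """Weighted average of a golfer's 0-100 percentile rank in each metric."""
    return sum(WEIGHTS[m] * percentiles.get(m, 0) for m in WEIGHTS) / 100

# Hypothetical golfer who ranks well in the heavily weighted categories
golfer = {"Recent Par-5 Scoring": 95, "Recent Scrambling": 88,
          "Long-Term Adjusted Round Score": 80}
print(round(model_score(golfer), 1))
```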
Meanwhile, to highlight upside, I backtested using the top 10% of golfers for each metric, and here’s how an optimized model would look based on the most important factors.
- Long-Term Field Score: 17
- Recent Par-5 Scoring: 15
- Long-Term Putts Per Round: 9
- Long-Term Eagles: 8
- Long-Term Driving Distance: 8
- Recent Driving Distance: 8
- Recent Birdies: 7
- Long-Term Scrambling: 6
- Long-Term Tournament Count: 5
- Adjusted Round Differential: 5
- Recent Par-3 Scoring: 4
- Recent Adjusted Round Score: 4
- Recent Scrambling: 4
There are some similarities but some notable differences, too, likely because some of the bigger names in the field have struggled at the U.S. Open over the past couple of years (except for Koepka, who is ridiculous): Rory McIlroy, Sergio Garcia, Jon Rahm, Jordan Spieth, Adam Scott, Jason Day and Tiger Woods all badly missed the cut last year. Shinnecock had brutally penalizing rough and wind gusts over 20 miles per hour. That added randomness to the results, and thus our data is going to be less useful moving forward.
For this tournament, Peter Jennings (CSURAM88) has broken down his personal model, which he has kept relatively simple, relying on big-hitting metrics like Adjusted Round Score and Greens in Regulation. He also emphasizes Vegas data, which hasn’t been predictive in recent years, but (again) that’s probably because last year’s course and weather enhanced the overall randomness and some big-name guys missed the cut.
Takeaways
I always want to use all the data available, but this week I would lean toward a simple model like Peter’s. I think the past U.S. Open data might be skewed, and building solely around Pebble Beach data seems flawed since the course is being set up tougher for the major. I would use that course-history data on top of the model as a tiebreaker or to identify golfers with consistently terrible history at the course.
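One way to layer course history on top of a model is sketched below. The field names, sample numbers and the fade threshold are all hypothetical, just to show the tiebreaker idea.

```python
golfers = [
    {"name": "Golfer A", "model_score": 92.0, "pebble_history_pm": 3.1},
    {"name": "Golfer B", "model_score": 92.0, "pebble_history_pm": -6.4},
    {"name": "Golfer C", "model_score": 88.5, "pebble_history_pm": 1.0},
]

# Sort primarily by model score, break ties with course-history Plus/Minus,
# and flag anyone with consistently terrible history at the course.
ranked = sorted(golfers, key=lambda g: (g["model_score"], g["pebble_history_pm"]),
                reverse=True)
for g in ranked:
    flag = "  <- fade candidate" if g["pebble_history_pm"] < -5 else ""
    print(f'{g["name"]}: {g["model_score"]}{flag}')
```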
Photo credit: Eric Bolte-USA TODAY Sports
Pictured: Rory McIlroy