Golf Analytics

How Golfers Win

Tag Archives: flawed stats

What Stats Don’t Suggest

Posts like this really agitate me as a golf fan interested in analytics. I’m not sure what annoys me more: the complete disregard for stats as more than trivia or the complete lack of insight it provides. Either way, this post is a perfect example of everything that’s wrong with how people talk about golf stats. Nothing in that post has even a whiff of predictive value; there’s no attempt to actually figure out what the stats suggest about which players should be more successful at Summerlin. Instead, we get pseudo-insight like “par 4 scoring demands our attention” and “all five winners ranked inside the top 20 in Strokes Gained Putting”. Well, of course par 4 scoring is important – over half the holes on every course on Tour are par 4s. Guys who can’t score on par 4s can’t be successful on Tour. And of course strokes gained putting is important. Gaining strokes on the field is a prerequisite for winning or contending in a tournament.

That’s enough picking on Rob Bolton, who might have a handle on fantasy golf, but is monumentally out of his league when forced to discuss stats. The point of this post is to discuss how well a golfer has to putt to contend at or win a golf tournament. There’s nothing predictive here; everything I’m going to talk about is descriptive. I downloaded the per-tournament results for every 2013 PGA Tour player, including their finishing position and their average Strokes Gained Putting/round during the tournament. Using that data, I measured how well the golfers who finished near the top putted. Results below in bullet form.

  • Tournament winners exceeded their season average for Strokes Gained Putting/round by 1.3 strokes/round – 5.2 strokes/tournament.
  • Those who finished T10 or better exceeded their season average for SGP by 0.9 strokes/round – 3.6 strokes/tournament.
  • Tournament winners averaged 1.44 strokes gained/round, while those T10 or better averaged 0.92 strokes gained/round.
  • Tournament winners averaged finishing 13th in Strokes Gained Putting for the week, while those T10 or better averaged finishing 27th for the week.
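The calculation behind those bullets can be sketched as follows. This is a minimal illustration, not the code used for the post: the rows are hypothetical, the season average is taken as a simple mean of each player's weekly SGP/round, and the per-tournament figure assumes a four-round event.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (player, tournament, finish position, SGP per round that week).
# A T10 finish would be stored as 10.
results = [
    ("Player A", "Event 1", 1, 1.6),
    ("Player A", "Event 2", 40, -0.2),
    ("Player B", "Event 1", 8, 0.9),
    ("Player B", "Event 2", 1, 1.5),
    ("Player B", "Event 3", 55, -0.6),
]

ROUNDS_PER_TOURNAMENT = 4  # assumption: a full 72-hole event


def season_averages(rows):
    """Mean weekly SGP/round per player across all of that player's events."""
    by_player = defaultdict(list)
    for player, _event, _finish, sgp in rows:
        by_player[player].append(sgp)
    return {player: mean(values) for player, values in by_player.items()}


def mean_excess(rows, max_finish):
    """Average of (weekly SGP - season SGP) for finishes of max_finish or better."""
    season = season_averages(rows)
    excess = [
        sgp - season[player]
        for player, _event, finish, sgp in rows
        if finish <= max_finish
    ]
    return mean(excess)


for label, cutoff in (("winners", 1), ("T10 or better", 10)):
    per_round = mean_excess(results, cutoff)
    per_event = per_round * ROUNDS_PER_TOURNAMENT
    print(f"{label}: {per_round:+.2f} strokes/round, {per_event:+.2f} strokes/tournament")
```

Multiplying by four rounds is the same step that turns 1.3 strokes/round into 5.2 strokes/tournament above.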

Clearly, putting very well is necessary to contend for or win a tournament. Nothing in that is novel in the least. Nothing about that is predictive in the least. It just indicates that guys who win do so because they’re playing and putting better than they normally do. Claiming that SGP is important this week completely ignores that it’s important every single week. Moreover, it’s not more important in birdie-fests like this week. SGP counts strokes gained on the field. Golfers this week are going to hole a lot more putts than is normal on Tour, both because they’ll face disproportionately shorter putts and because putts of a given length will be easier than normal. However, that just means the threshold for gaining strokes on the field is higher.


The Intersection of Driving Distance & Accuracy

If you have ever watched televised golf I’m sure you have heard an announcer bemoan the wildness of a golfer’s drive. Tiger Woods and Phil Mickelson in particular seem to be dogged by comments about how often they end up in the rough compared to the field. However, I cannot recall hearing much talk at all about the distance golfers are hitting the ball. Now, a lot of that is due to it being easy to convey the advantage of hitting an approach shot from the fairway rather than the rough. We see the thick rough and remember the times golfers have been forced to pitch out into the fairway when they are behind obstructions. On the other hand, it’s difficult to convey the advantage that hitting an approach shot from 20 yards closer provides to a golfer. However, that advantage is very real.

The 2013 ShotLink data shows that, on average, PGA golfers hit the green on 71% of their shots from 125-150 yards, but on only 64% of their shots from 150-175 yards. In his seminal Assessing Golfer Performance on the PGA Tour, Mark Broadie shows that, on average, a golfer will take 2.89 shots to finish a hole from 137.5 yards, but 3.00 shots to finish from 162.5 yards. In other words, driving the ball 25 yards further provides a substantial advantage in hitting greens and scoring low. There is certainly an advantage to avoiding the rough also. According to ShotLink data, golfers hit the green nearly 76% of the time from the fairway, but only 51% of the time when they missed the fairway. Birdies are 50% more likely when you hit the fairway versus the rough (21% to 14% of holes).
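To put those figures side by side, here is a short arithmetic sketch. It uses only the numbers quoted in the paragraph above; the variable names are mine.

```python
# Figures quoted in the text above.
strokes_from_137_5 = 2.89   # Broadie: average strokes to hole out from 137.5 yards
strokes_from_162_5 = 3.00   # Broadie: average strokes to hole out from 162.5 yards
gir_125_150, gir_150_175 = 0.71, 0.64
gir_fairway, gir_rough = 0.76, 0.51
birdie_fairway, birdie_rough = 0.21, 0.14

# Value of being 25 yards closer, in strokes per hole.
strokes_saved = strokes_from_162_5 - strokes_from_137_5

# Gaps in green-hitting rate, in percentage points.
gir_gap_distance = (gir_125_150 - gir_150_175) * 100
gir_gap_lie = (gir_fairway - gir_rough) * 100

# Relative birdie rate, fairway versus rough.
birdie_ratio = birdie_fairway / birdie_rough

print(f"25 yards closer: {strokes_saved:.2f} strokes/hole, {gir_gap_distance:.0f} GIR points")
print(f"fairway vs rough: {gir_gap_lie:.0f} GIR points, birdie ratio {birdie_ratio:.2f}")
```

The birdie ratio of about 1.5 is where the "50% more likely" figure comes from.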

However, almost every golfer is forced to choose which skill – distance or accuracy – they want to attempt to excel at. Driving Distance and Driving Accuracy are strongly negatively correlated (R = -0.51), meaning that very few players perform well in both categories. For example, of the 216 golfers who exceeded 10 tournaments played or finished in the FedEx Cup top 200, Dustin Johnson ranked 1st in driving distance and 195th in driving accuracy. Rory McIlroy followed at 2nd in distance, but 181st in accuracy. Opposite those two, Russell Knox finished 1st in accuracy, but only 135th in distance, while Chez Reavie was 5th in accuracy, but only 159th in distance. As the following graph shows, only one of six PGA golfers exceeds the mean for both distance and accuracy (shown in red) and no one is +1 standard deviation from the mean in both distance and accuracy (shown in yellow).

[Figure: 2013 Driving Distance Accuracy Correlation]
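The two quantities in that paragraph, the correlation and the share of golfers above the mean in both stats, can be computed as in this sketch. The six golfers are hypothetical, not real Tour data; they were chosen so that one of six clears both means, and their correlation is much stronger than the -0.51 reported above.

```python
from statistics import mean, pstdev

# Hypothetical (distance in yards, accuracy as a fraction) pairs.
drivers = [
    (305.0, 0.55),
    (298.0, 0.58),
    (292.0, 0.63),
    (288.0, 0.61),
    (283.0, 0.66),
    (278.0, 0.70),
]


def pearson_r(pairs):
    """Pearson correlation between the two columns of pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    covariance = mean([(x - mx) * (y - my) for x, y in pairs])
    return covariance / (pstdev(xs) * pstdev(ys))


def share_above(pairs, k=0.0):
    """Fraction of golfers more than k standard deviations above the mean in both stats."""
    xs, ys = zip(*pairs)
    x_cut = mean(xs) + k * pstdev(xs)
    y_cut = mean(ys) + k * pstdev(ys)
    return sum(1 for x, y in pairs if x > x_cut and y > y_cut) / len(pairs)


print(f"R = {pearson_r(drivers):.2f}")
print(f"above the mean in both: {share_above(drivers):.0%}")
print(f"+1 SD in both: {share_above(drivers, k=1.0):.0%}")
```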

However, knowing that it is important to do both well, but difficult to do both well, is there one skill that predominates? To determine just how important each factor is to driving skill, I set up a regression of a golfer’s greens in regulation (GIR) on driving distance and driving accuracy. Because the courses played can vary in difficulty, I used my course-adjusted stats, which determine how much better or worse than field average a golfer performed each week in each stat. For most golfers the adjustment is slight, but for golfers like Tiger Woods who typically play tougher courses than average it can be significant. I’ve attached a Google Doc of every PGA player to finish in the FedEx Cup top 200 plus anyone else with >10 tournaments entered showing these adjusted stats.

The results show that distance and accuracy together explain roughly half of the variance in GIR (R^2=0.494). The p-values for both variables are indistinguishable from zero, so both are highly significant, which certainly squares with the empirical stats provided in the second paragraph. To predict GIR, the equation is Y = (0.00283*Distance in yards) + (0.4418*Accuracy in %) - 0.4429. Basically, hitting the ball an extra three yards is worth around 2% in driving accuracy, meaning a golfer should be indifferent to adding three yards of distance if it means giving up 2% in accuracy.
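A minimal sketch of that equation in code, which also checks the three-yards-for-2% trade. It assumes accuracy is entered as a fraction (0.58 for 58%) and that GIR comes out as a fraction, which is the reading that produces values in the 56-69% range quoted in this post. The function names are mine, and the 288-yard, 58% inputs are only an illustration.

```python
INTERCEPT = -0.4429
PER_YARD = 0.00283      # predicted GIR gained per extra yard of driving distance
PER_ACCURACY = 0.4418   # predicted GIR gained per unit of accuracy (1.0 = 100%)


def predicted_gir(distance_yards, accuracy):
    """Predicted GIR as a fraction, from distance in yards and accuracy as a fraction."""
    return PER_YARD * distance_yards + PER_ACCURACY * accuracy + INTERCEPT


def accuracy_equivalent(extra_yards):
    """Accuracy (as a fraction) worth the same predicted GIR as extra_yards of distance."""
    return PER_YARD * extra_yards / PER_ACCURACY


base = predicted_gir(288.0, 0.58)
traded = predicted_gir(288.0 + 3.0, 0.58 - accuracy_equivalent(3.0))

print(f"predicted GIR at 288 yards, 58% accuracy: {base:.1%}")
print(f"3 yards is worth {accuracy_equivalent(3.0):.1%} of accuracy")
print(f"after trading 3 yards for that accuracy: {traded:.1%}")
```

Three yards works out to a little under two percentage points of accuracy, which is the indifference point described above.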

If a golfer were given the choice of being one standard deviation better than average in one skill and one standard deviation below average in the other, there is almost no difference between being good at driving distance and bad at accuracy or vice-versa (a predicted GIR of 63.9% for good at distance and 63.6% for good at accuracy). This shows that performing well at either skill is a legitimate path to success on Tour.

Using this equation, we can also calculate a Total Driving skill stat. The PGA Tour has such a stat, which they calculate solely by adding together a golfer’s rank in distance and accuracy. Mine simply ranks golfers based on their predicted GIR based on their driving distance and accuracy. The leader, Henrik Stenson, finished 8th in accuracy and 55th in distance, with a predicted GIR of 69.2%, meaning a golfer with average approach shot ability would’ve hit the green 69% of the time shooting from his average location. The worst golfer by this metric, Mike Weir, finished 213th in distance and 196th in accuracy, with a predicted GIR of 56.2%.
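The difference between the two approaches can be sketched as follows. The three golfers are hypothetical, and the rank-sum function is only my reading of the Tour's stat as described above (distance rank plus accuracy rank). Accuracy is assumed to be a fraction.

```python
def predicted_gir(distance_yards, accuracy):
    """Regression equation from this post; accuracy is a fraction (0.58 for 58%)."""
    return 0.00283 * distance_yards + 0.4418 * accuracy - 0.4429


# Hypothetical golfers: name -> (distance in yards, accuracy as a fraction).
field = {
    "Bomber": (305.0, 0.55),
    "Balanced": (292.0, 0.63),
    "Fairway finder": (278.0, 0.70),
}


def ranks(values):
    """Rank 1 = largest value. Ties are not handled in this sketch."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


distance_rank = ranks({name: d for name, (d, _a) in field.items()})
accuracy_rank = ranks({name: a for name, (_d, a) in field.items()})

# Rank-sum version: lower is better.
rank_sum = {name: distance_rank[name] + accuracy_rank[name] for name in field}

# Predicted-GIR version: higher is better.
gir = {name: predicted_gir(d, a) for name, (d, a) in field.items()}
gir_order = sorted(gir, key=gir.get, reverse=True)

for name in gir_order:
    print(f"{name}: rank sum {rank_sum[name]}, predicted GIR {gir[name]:.1%}")
```

In this toy field all three golfers tie on rank sum, while predicted GIR still separates them, because it weighs how far apart the golfers are rather than only their order.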

Tiger Woods, who is regularly criticized for his wayward drives, actually finishes 20th in Total Driving on the strength of his 34th-ranked distance and 78th-ranked (above average!) accuracy. His predicted GIR was 66.6%. On the other hand, Phil Mickelson is also criticized for being wild with the driver, and he has been wild this season (58% accuracy; 163rd on Tour), but his distance has hurt him nearly as much. He’s only driven it 288 yards on average (98th on Tour). As a result, he was the 149th best driver on Tour last year.

I’ve attached the predicted GIR/Total Driving stats in this Google Doc.

An Accurate Measurement of Scrambling Skill

(This is the first of three planned posts explaining the flaws in commonly used stats for evaluating golfers’ skills at driving, hitting approach shots, and scrambling, and laying out a replacement based on publicly available PGA Tour stats.)

The Scrambling stat was developed to measure how often a golfer avoids bogey after missing the green with their approach shot. That requires a golfer to (usually) chip or pitch onto the green and then hole a par putt. Scrambling is simply calculated by dividing successful scrambles (par or less) by total GIR missed. The average for PGA Tour players in 2012 was ~59%, indicating that, when missing the green, they made par a bit more than half the time.

Scrambling is often used to evaluate whether a player has a good “short game”. Luke Donald has ranked 5th, 8th, and 4th in recent years and is generally proclaimed as one of the best short game players on Tour, along with Steve Stricker, Ian Poulter, and Brian Gay (who all have multiple top 10 finishes in the last few years). However, what scrambling really measures is a combination of three skills. First, it measures what it purports to – the ability to hit chips, pitches, sand shots, etc. around the green close to the pin. But it also measures the ability to putt, because scrambling requires a putting stroke to finish up, and the ability to hit approach shots, because players hit their chips, pitches, sand shots, etc. from locations that vary in difficulty. A very good putter will have an inflated scrambling ratio because they make a lot of putts after leaving their ball short of the hole that an average putter would miss. A good approach shot player will have an inflated scrambling ratio because when they miss the green, they leave themselves closer to the pin and in better locations (fairway or fringe instead of bunker or rough).

So with those shortcomings, how do you see through the noise and capture only the ability to hit good chips, pitches, sand shots, etc.? I first downloaded the PGA Tour data for scrambling from >30 yards, 20-30 yards, 10-20 yards, and <10 yards. This data represents all scrambling shots taken in tournaments where ShotLink is used (US-based tournaments except Majors). I then found each golfer’s GIR in ShotLink-measured rounds. Then I calculated how often PGA Tour golfers successfully scramble from each of the aforementioned distance bins (>30 yards – 27%, 20-30 yards – 52%, 10-20 yards – 64%, <10 yards – 85%). Next, I used each player’s data to find how often they hit from each of the four distance bins, multiplied each share by how often the average golfer successfully scrambled from that distance, and summed the results. The result for each golfer is how often the average PGA golfer would be expected to scramble successfully based on where that golfer hit their scrambling shots from. That solves the problem of golfers hitting from varying locations.
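The location adjustment is a weighted average, which can be sketched like this. The Tour-average rates are the ones quoted above; the shot counts for the two golfers are hypothetical, and the function name is mine.

```python
# Tour-average scrambling success by distance bin, as quoted above.
TOUR_RATE = {">30": 0.27, "20-30": 0.52, "10-20": 0.64, "<10": 0.85}


def expected_by_location(attempts):
    """Scrambling rate an average Tour golfer would post from this mix of distances.

    attempts maps each distance bin to the number of scrambling shots hit from it.
    """
    total = sum(attempts.values())
    return sum(TOUR_RATE[bin_name] * count / total for bin_name, count in attempts.items())


# Hypothetical shot counts for two golfers with the same number of missed greens.
easy_locations = {">30": 20, "20-30": 60, "10-20": 120, "<10": 100}
hard_locations = {">30": 60, "20-30": 90, "10-20": 100, "<10": 50}

print(f"easy mix: {expected_by_location(easy_locations):.1%}")
print(f"hard mix: {expected_by_location(hard_locations):.1%}")
```

With identical short-game skill, the golfer hitting from the easier mix would be expected to scramble almost ten points more often, which is the distortion this step removes.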

To adjust for putting skill, I downloaded each golfer’s Strokes Gained Putting for 2012. This stat measures how well a player putts compared to PGA average based on the length of the putt (i.e., players who make more 20 footers than average will be above-average). I ran a linear regression with Strokes Gained Putting and the earlier calculated expected scrambling by distance stat as the independent variables and a golfer’s overall Scrambling ratio as the dependent variable. I had 191 golfers in the regression. My R=0.70, which indicates that putting and location of the shot explain 70% of the Scrambling stat, which is extremely large for a stat that is used to rank golfers’ ability to hit around the green. Both SGP and expected scrambling by distance were significant at the 0.001 level. The regression produced an equation: y = -0.038+(1.061*Putting)+(.0659*Location). I calculated the Expected Scrambling stat from that equation for each golfer. This measures how often a golfer should get up-and-down given a certain skill at putting and a certain location before the scrambling shot. When adjusting for these factors, Bo Van Pelt, Luke Donald, Brandt Snedeker, and Zach Johnson faced the easiest scrambling situations, being expected to make par or better on 65% of their missed greens.

From there, determining actual skill around the greens was simple. I subtracted a golfer’s Expected Scrambling from their actual Scrambling performance. The result indicates how much more often a player successfully scrambled than expected, corrected for the location of their scrambling shots and their putting skill.
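As a sketch of that last step, the adjusted stat is just the difference between the two rates, and ranking on it can reorder golfers relative to raw Scrambling. The three golfers and their rates are hypothetical; Expected Scrambling is taken as an input rather than recomputed here.

```python
def adjusted_scrambling(actual, expected):
    """Scrambling above (+) or below (-) expectation, as a fraction of missed greens."""
    return actual - expected


# Hypothetical golfers: name -> (actual Scrambling, Expected Scrambling).
golfers = {
    "Golfer A": (0.62, 0.65),  # good putter hitting from easy locations
    "Golfer B": (0.59, 0.55),  # poor putter hitting from hard locations
    "Golfer C": (0.57, 0.58),
}

adjusted = {
    name: adjusted_scrambling(actual, expected)
    for name, (actual, expected) in golfers.items()
}

raw_order = sorted(golfers, key=lambda name: golfers[name][0], reverse=True)
adjusted_order = sorted(adjusted, key=adjusted.get, reverse=True)

print("raw Scrambling order:     ", raw_order)
print("adjusted Scrambling order:", adjusted_order)
for name in adjusted_order:
    print(f"{name}: {adjusted[name]:+.1%}")
```

Golfer A leads on raw Scrambling but falls to last once expectation is subtracted, the same kind of reversal the adjusted rankings are meant to surface.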

Top 10 and Bottom 10 in Adjusted Scrambling: [Image]

Several of the golfers in the old Scrambling rankings look good based on this adjusted ranking – Dufner, Poulter, and Rose were in the top 10 before and remain there. But most of the rest were not highly rated by old Scrambling, highlighted by Nick O’Hern ranking 93rd (roughly average) in Scrambling despite being a poor putter and hitting his shots from the 2nd worst locations of anyone on Tour.

The trailers are more reflective of the old Scrambling rankings, though Bo Van Pelt is an exception: he was ranked 104th in scrambling by the old system, but because he putted very well last year and hit from the best locations, he comes out 2nd worst in this ranking.

It is worth noting that this analysis ignores the difficulty of courses played. It’s probable that certain courses are more difficult to scramble successfully on, while others are easier. In his seminal Assessing Golfer Performance on the PGA Tour, Broadie found that there were differences in course difficulty (~4 strokes between the most & least difficult), but that they were heavily concentrated in the long game (drives and approaches over 100 yards). The difference between the ten most difficult and ten least difficult courses overall was only 0.3 strokes when considering the short game (basically any shots inside 100 yards). Under his definition of short game, the scrambling shots considered above make up roughly 2/3rds of the shots counted. I would expect that differences in scrambling difficulty between courses shouldn’t affect these adjusted Scrambling numbers by more than 2%, though I will revisit the topic of course difficulty in a future post.