My main interest in analyzing golf is using past data to predict future golf performance as accurately as possible. Inherent in that are two questions: how much randomness is affecting the data, and how to remove its effects. The easiest way to gauge how much randomness is involved is to find the correlation between subsequent measures of performance. For example, in this post I found the correlation between a golfer’s performance in various samples of years from 2009-12 and their performance in 2013. Based on that, I concluded that such basic methods produce a correlation of around 0.70, meaning that in a subsequent season we can expect a golfer to repeat about 70% of their prior performance above or below the mean. I’ve achieved correlations slightly higher than that using more sophisticated methods and more detailed data, but an R of 0.70 should be viewed as the typical baseline for judging the repeatability of golf performance.
In this post, I attempt to find the repeatability of performance on different shot types using the same methodology as above. I will use Strokes Gained Putting to measure putting performance relative to the field, my own Z-Score measure minus Strokes Gained Putting to measure tee to green (driving, approach shots, short game) performance relative to the field, and I’ll use my own Scrambling metric (methodology here) to measure performance on only scrambling shots (a scrambling shot is the first shot following a missed green). I have not stripped these scrambling shots out from the tee to green measure; tee to green measures all non-putting strokes.
I gathered data for all PGA Tour golfers for 2008-2013 who had a qualifying number of rounds played (50). I then paired consecutive seasons for each of my three performance measures and graphed the correlations below.
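As a rough sketch of the pairing step described above, the snippet below matches each golfer's season with the following one and computes a Pearson correlation over those pairs. The golfer labels and values are invented toy data, and the function names `pair_consecutive_seasons` and `pearson_r` are hypothetical helpers for illustration, not the code behind the graphs.

```python
from math import sqrt

# Hypothetical toy data: (golfer, season) -> performance in standard deviations.
# These values are invented for illustration only; negative is better than the field.
scores = {
    ("A", 2011): -0.8, ("A", 2012): -0.5, ("A", 2013): -0.6,
    ("B", 2011): 0.4, ("B", 2012): 0.1,
    ("C", 2012): 0.9, ("C", 2013): 0.3,
    ("D", 2011): -0.2, ("D", 2013): 0.5,  # gap year, so no consecutive pair
}

def pair_consecutive_seasons(scores):
    """Return (season N, season N+1) value pairs for golfers who have both."""
    pairs = []
    for (golfer, season), value in sorted(scores.items()):
        following = scores.get((golfer, season + 1))
        if following is not None:
            pairs.append((value, following))
    return pairs

def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

pairs = pair_consecutive_seasons(scores)
r = pearson_r(pairs)
print(f"{len(pairs)} season pairs, R = {r:.2f}")
```

A golfer who qualifies in three straight seasons contributes two pairs, so the same golfer can appear more than once in the sample. The 50-round qualification filter is assumed to have been applied before the data reaches this step.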
For all three graphs, negative values represent performances better than the field. Values are expressed in standard deviations above or below the PGA Tour average.
The measure that was most strongly correlated from season to season was tee to green performance. That makes intuitive sense as tee to green performance includes the most shots/round of any of my measures (roughly 40/round). In fact, tee to green performance is almost as repeatable as overall performance (R = 0.69 compared to 0.69 to 0.72 in the study linked above). This indicates that a golfer who performs well above or below average in a season should be expected to sustain most of that over or under-performance in the following season.
Putting was less repeatable than the tee to green measure, but skill still shows strongly through the noise. An R of 0.54 indicates that a golfer’s putting performance should be regressed by nearly 50% toward the mean in the following season (provided you only know their performance in that one season). I would mainly attribute the lower correlation to sample size; golfers normally hit 25-30 putts/round, but many of these putts are short gimmes that are converted upwards of 95% of the time. The number of meaningful putts struck in a round is more like 15. At that rate, it takes a golfer over 2.5 seasons of putting to reach a season’s worth of tee to green shots (roughly 40/round). Because putting is less repeatable from season to season than tee to green play, we should be wary of golfers who build their success in a season largely on very good putting.
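To make the regression idea concrete, here is a minimal sketch that treats the season-to-season R as a shrinkage factor on a single season's result. The golfer and the function name `project_next_season` are hypothetical; the R values are the ones reported above, and the simple linear shrinkage assumes both seasons are measured on the same standardized scale.

```python
def project_next_season(observed_sd, r, mean_sd=0.0):
    """Shrink a one-season result toward the mean; a fraction r of the gap is kept."""
    return mean_sd + r * (observed_sd - mean_sd)

PUTTING_R = 0.54        # season-to-season correlation reported for putting
TEE_TO_GREEN_R = 0.69   # season-to-season correlation reported for tee to green

# Hypothetical golfer who was one standard deviation better than the field
# (negative values are better) in each measure.
observed = -1.0

putting_projection = project_next_season(observed, PUTTING_R)
tee_to_green_projection = project_next_season(observed, TEE_TO_GREEN_R)

print(f"putting: {putting_projection:+.2f} SD")
print(f"tee to green: {tee_to_green_projection:+.2f} SD")
```

Under these assumptions the same one-season result is worth noticeably less as a forecast when it comes from putting than when it comes from tee to green play.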
Of the three measures tested, Scrambling was the least repeatable (R = 0.30). This indicates that performance on these short shots around the green is heavily influenced by randomness. It’s not atypical for a golfer to rank among the best 10% on Tour one season and be average the next. Again, this is likely a function of sample size. A golfer hits only 6-8 scrambling shots/round (one every time they miss a green). It takes a golfer around five seasons of scrambling shots to reach one season’s worth of drives/approaches.
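The sample-size comparisons can be checked with simple arithmetic. The sketch below uses the rough 40 shots/round tee to green figure as the baseline for both comparisons, which is an assumption on my part; a baseline of drives and approaches alone would be smaller and would give a lower scrambling figure. The helper name `seasons_to_match` is hypothetical.

```python
TEE_TO_GREEN_PER_ROUND = 40  # rough tee to green shots per round quoted in the text

def seasons_to_match(shots_per_round, baseline_per_round=TEE_TO_GREEN_PER_ROUND):
    """Seasons of one shot type needed to equal one season of baseline shots,
    assuming the same number of rounds is played each season."""
    return baseline_per_round / shots_per_round

meaningful_putts = seasons_to_match(15)   # about 15 meaningful putts per round
scrambling_low = seasons_to_match(8)      # upper end of 6-8 scrambling shots per round
scrambling_high = seasons_to_match(6)     # lower end of 6-8 scrambling shots per round

print(f"putting: {meaningful_putts:.1f} seasons")
print(f"scrambling: {scrambling_low:.1f} to {scrambling_high:.1f} seasons")
```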
There are two important caveats with this approach. First, these correlations explicitly measure only one season compared to the following season. Measures like scrambling and putting are likely more strongly correlated when several seasons are used as the initial sample. I intend to test this in a future post. Second, this study only considers golfers who played consecutive seasons with >50 rounds/season. This leaves out golfers who played so poorly in the first year that they were not able to play >50 rounds the following season. These stats might be slightly more repeatable if those golfers were included.