Golf Analytics

How Golfers Win

Is Going Low a Skill?

Winning a PGA Tour tournament normally requires besting the field average by 12-15 strokes. However, Tour courses vary widely in difficulty, from the US Open tracks which normally play several strokes over par to the easier resort style courses which can play several strokes under par. On the former, winning scores hover around par, while on the latter it often takes 25 under par to win the tournament. On difficult courses making mostly pars puts you in contention, while on easy courses a golfer needs to make a lot of birdies to even have a chance. What I was interested in is whether certain types of golfers perform better on difficult or easy courses. That is, are golfers who make more birdies than the field better set to tackle courses where you have to go low? And are golfers whose games are built on avoiding bogeys a better bet on hard courses where you have to grind for par? My research showed that there was a small effect in each direction, but one that doesn’t significantly impact most golfers’ results.

To set up my study I collected the results for every golfer who played a qualifying number of rounds on the PGA Tour in 2013 and 2014 (~360 golfer seasons). I then adjusted their rates of birdie holes and bogey holes to account for the courses they played each season. In most cases the adjustments weren’t significant, but the better golfers tend to play harder courses and the worse golfers tend to play easier courses (there is a positive relationship between the prestige of the tournament and the difficulty of the course). That yielded a relative birdie% and relative bogey% for each golfer. By subtracting bogey% from birdie%, you can find how dependent each golfer is on making birdies versus avoiding bogeys for their scoring.

Last year, Rory McIlroy ranked 1st in birdie% and 4th in bogey%; he was more dependent on making birdies for his scoring. Jim Furyk ranked 39th in birdie% and 1st in bogey%; he was more dependent on avoiding bogeys for his scoring. I divided every golfer into one of six groups depending on how dependent they were on making birdies or avoiding bogeys. I’ll refer to those dependent on making birdies as Birdie Generators and those dependent on avoiding bogeys as Bogey Avoiders.

I then gathered the course specific scoring averages for each event, adjusted them by the quality of the field, and compared them to each course’s par. For example, at the Sony Open in 2014 the field averaged 69.3, the field was 0.3 strokes worse than PGA Tour average, and the course par was 70. Adding all that together, an average field would have played that course in 1 stroke under par (70-(69.3-0.3) = 1.0). For the 2014 Masters, the field averaged 73.95, the field was 0.3 strokes better than PGA Tour average, and the course par was 72. An average field would have played that course in about 2.2 strokes over par (72-(73.95+0.3) = -2.25, where a negative value means over par). I then divided those courses into four groups based on their average difficulty: Easiest (>1 stroke under par), Easy (0-1 stroke under par), Hard (0-1 stroke over par), and Hardest (>1 stroke over par).
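The field adjustment and grouping described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the result is expressed as strokes relative to par (positive = over par), and the handling of courses that land exactly on a group boundary is an assumption, since the post doesn't specify it.

```python
def strokes_vs_par(field_avg: float, field_quality: float, par: int) -> float:
    """Score relative to par for an average PGA Tour field (positive = over par).

    field_quality is in strokes per round relative to Tour average:
    positive for a weaker-than-average field, negative for a stronger one.
    """
    # A weak field inflates the raw scoring average, so subtract its deficit.
    # Rounding avoids floating-point noise pushing a course across a boundary.
    return round((field_avg - field_quality) - par, 2)


def difficulty_group(rel_to_par: float) -> str:
    """Assign one of the four difficulty groups (boundary handling is assumed)."""
    if rel_to_par < -1:
        return "Easiest"
    if rel_to_par < 0:
        return "Easy"
    if rel_to_par <= 1:
        return "Hard"
    return "Hardest"


sony = strokes_vs_par(69.3, 0.3, 70)       # 2014 Sony Open
masters = strokes_vs_par(73.95, -0.3, 72)  # 2014 Masters
print(sony, difficulty_group(sony))
print(masters, difficulty_group(masters))
```

With the numbers quoted above, the Sony Open comes out at one stroke under par and the Masters at 2.25 strokes over par, which puts the Masters in the Hardest group.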

[Figure: bogey vs. birdie course fit study]

Read the chart with more negative values indicating better performance. The “Difference” column indicates the difference between performance on the easiest courses and hardest courses. Negative values indicate golfers in that group performed better on the easiest courses.

The results indicate that the most extreme Bogey Avoiders perform almost 0.2 strokes worse on the easiest courses relative to the hardest courses. The most extreme Birdie Generators don’t display the reverse, but with only eleven golfers, those results may reflect randomness because the next two less extreme groups of Birdie Generators definitely perform better on the easiest courses.

However, the results also indicate these values are not impacting tournament results for most golfers. Perhaps the most extreme Bogey Avoiders should be downgraded slightly on the easiest courses (like at the Humana this week), while it looks like the Birdie Generators definitely do play slightly better on easy courses and slightly worse on hard courses. However, even for the most extreme golfers, playing an unfavorable course shifts performance by only 0.1 to 0.2 strokes from their regular level – enough to shift the odds of winning by around 1% for the best golfers and by much smaller fractions for lesser golfers. Going low is a skill, but it only makes a very small impact on real-world results.

Anatomy of a Breakout

Predicting breakouts and new tournament winners is one of the main allures in golf prognostication. Not only do you get the satisfaction that comes from watching a golfer that you’ve touted succeed, you also can bask in the glow of having identified that golfer before other golf pundits (and brag on Twitter). What we as a golf community don’t have is a good understanding of what goes into a breakout. Who typically breaks out? What type of guys win their first PGA Tour tournament? Do guys sustain these breakouts in following seasons? My research has concluded that 1. younger players are more likely to dramatically improve their performance than middle-aged or older golfers, 2. breakouts are more likely for bad players (even for bad players with a consistent track record of poor performance), 3. most first time Tour winners are above-average or better PGA Tour players already, 4. most first time winners are established as good players (that is, they don’t play much better in the season when they first win), and 5. first time winners don’t carry over any particular boost in their performance the following season.

To judge performance I’m using my z-score ratings based on performance relative to the field and to judge expected performance I’m using my projected z-score ratings that I generate weekly based on overall performance, adjusted for recency. I gathered a sample of every golfer between the PGA, European, Web.com, and Challenge Tours who played at least 25 rounds in consecutive seasons. I compared their performance over the 2nd season to their projected performance from after the 1st season. That yielded a change in performance. On average, my sample improved slightly (by around 0.1 strokes/round), likely because I’m excluding some players who performed so poorly in their 1st season rounds that they didn’t record enough rounds in the 2nd season.

I found that for golfers under 30, 33% improved their performance by at least 0.5 strokes/round. Improving by that amount would generally move an average PGA Tour golfer from 125th in FedEx Cup points to around 65th – a fairly clear breakout. For both golfers in their 30s and their 40s, only 25% broke out to such an extent. Players rated at around the level of an average Web.com or Challenge Tour player broke out at a 39% rate, while those established as very good or better PGA Tour players broke out at only a 22% rate. And these situations aren’t examples of guys like Paul Casey or Mike Weir completely losing their games and bouncing back. On average, guys who break out in a big way show fairly consistent performance in the three seasons prior to their breakout.

So these large improvements in performance season to season are more likely for the worst pros (the idea that there’s nowhere to go but up) and for younger golfers (which is certainly intuitive).

Next I wanted to look just at first time PGA Tour winners. I gathered 63 players who had won for the first time since 2010 (51 who had won for the first time in 2010-2013). These guys ran the gamut from Charl Schwartzel at the 2011 Masters to Matt Bettencourt, Bill Lunde, and Arjun Atwal in a two month stretch in 2010. The first thing I found was that their performance in the season they won for the first time hardly increased from the previous year (0.15 strokes versus the 0.1 strokes I found a few paragraphs ago in the general pro population who played 25+ rounds). That is, first time winners generally play only slightly better in the season they win than they did in the previous season. For every Jason Dufner or Graeme McDowell who goes from solid Tour pro to superstar in the season they first win, there’s a Matt Jones (declined by 0.75 strokes) or Tommy Gainey (declined by 0.60 strokes).

The average first time winner played about 0.3 strokes better than PGA Tour average the year they won (around 50th best in the world).

What about the following season, though? Do first time winners carry momentum over and perform better the next season? The 51 first time winners from 2010-2013 didn’t perform any better than in their previous season (in fact losing around 0.1 strokes). Youth is no guarantee here, as for every McIlroy or Patrick Reed who reached new heights in the season after they first won there’s a Gary Woodland or Kyle Stanley who slumped.

In general, predicting first time winners mainly comes down to identifying who the clearly above-average PGA Tour golfers are and then waiting. Of the top fifty golfers in my predictive ratings at the beginning of 2010, twenty had never won a PGA Tour tournament. Of those, there are ten mainly European based players (guys like Francesco Molinari or Anders Hansen). Of the remaining ten who spent a lot of time playing in the US, seven won in 2010 or 2011.

The guys who clearly stand-out as the most likely to be first time winners this year who both hold PGA Tour membership and are good at golf are the obvious names like Brooks Koepka, Victor Dubuisson, and Graham DeLaet and elite rookies like Tony Finau, Justin Thomas, and Blayne Barber, but also established pros like Russell Knox, David Hearn, and Brendon de Jonge.

Research on Psychological Impacts on Performance

I have written a lot on the consistency of performance and using past performance to predict future performance. Once you have the data, those studies are straightforward to conduct and produce intuitive results. I’ve neglected much discussion of the mental side of the game because, on the whole, there isn’t any data out there that directly measures whether a player is confident, nervous, distracted, overwhelmed, able to cope with pressure, etc. I’ve just read two papers – Confidence Enhanced Performance by Rosenqvist & Skans and The Impact of Pressure on Performance by Hickman & Metz – that attempt to measure the psychological impacts on performance inherent in golf using performance data.

Confidence Enhanced Performance:

Rosenqvist & Skans use European Tour data from the past decade to measure the impact of confidence on performance. Because of the existence of the cut in most tournaments and the natural division of the field into successes and failures by the cut, it’s possible to look at how making or missing the cut affects performance in the subsequent tournament. Players who make or miss the cut are separated by very small differences in performance (as little as a single stroke for those directly on either side of the cut line) and are also nearly identical in terms of long-term talent. That means we should expect their performances to be similar in subsequent weeks – assuming that there isn’t any impact from prior weeks.

What Rosenqvist & Skans find is that there is a difference in performance between those who barely made the cut and those who barely missed the cut (they create these groups using players within six strokes of the cut in either direction, though they also compared smaller ranges). Players who just made the cut in the treatment tournament are ~3% more likely to make it in the outcome tournament. Players who make the cut also play ~0.125 strokes better per round in the first two rounds of the following tournament. The authors explain this outcome as a product of enhanced or diminished confidence affecting the players’ performance.

I’ve found similar impacts on performance in my own work. Players who exceed their normal or expected performance one week retain a small portion of that over-performance the following week; that is, they continue to perform slightly better the following week. The same is true for those who play worse than expected. It’s very difficult to say precisely why this occurs.

The authors say it’s because the player in question is more or less confident than normal. They designed their study to directly compare players with similar performance who were clearly separated into successes and failures. However, the difference in strokes between making and missing the cut is still large – even comparing only those players on either side of the cut line. In their study, the smallest range they examined was those players within four strokes to either side of the cut line. There is still a substantial difference in performance between the two groups; the missed cut group on average played one stroke per round worse than the cut line and the made cut group on average played one stroke per round better than the cut line. That shows a two stroke difference between the two groups of players. The authors showed that both groups were comprised of fairly similar players, so the made cut group slightly overachieved by roughly one stroke per round and the missed cut group slightly underachieved by roughly one stroke per round.

So the authors were not comparing players who were separated only into a successful and an unsuccessful group. They were comparing players who were successful/had overachieved versus a group who were unsuccessful/had underachieved. It’s impossible, with this data, to state that the observed differences in making or missing the next cut can be explained by confidence versus other factors. The slightly higher probability of making the cut in subsequent tournaments can be just as easily explained by saying the player’s swing was slightly better than normal or that his mind/body were in better physical condition. Teasing apart the impact of psychological vs. physical is difficult – perhaps impossible without administering a psychological evaluation and analyzing Trackman launch monitor data.

The authors’ finding of a small carry-over in performance is the important discovery, though. They show that players who perform well the previous week perform slightly better the following week than those who did not perform well the previous week – by around 0.125 strokes per round. However, this impact is extremely small and is surely overstated by those writing about and discussing golf. Good play the previous week only barely increases the probability that you will play well the following week. This reinforces the importance of looking at long-term performance when attempting to predict future performance.

The Impact of Pressure on Performance:

Hickman & Metz use ShotLink data to examine the probability of making a putt on the last hole of a tournament, considering the amount of money riding on the putt, the distance of the putt, the experience of the player, the amount of money the player won in the previous season, and the putting performance of the player so far in the tournament. They find that for every ~$30,000 that is riding on a putt (i.e., if someone makes it they win $100,000 and if they miss they win $70,000) the probability of making the putt drops by 1%, all else being equal. They also find that this most impacts short putts of between 3 and 12 feet.

I don’t have much to say about this study except that I wish the authors had included a better control for putting ability. They controlled for ability using Total Putts Gained (basically Strokes Gained Putting throughout the tournament). Unfortunately, putting performance in small samples like one tournament is essentially noise. If you know how well a player putts in general, how well they are putting in a tournament isn’t predictive at all. In their study you’ll see that TPG is a significant factor in predicting the probability of making a putt, but if you look at the coefficients you can see that for every Total Putt Gained in the tournament (or for every 0.25 TPG in each round) the probability of making a putt increases by 0.9%. Over 18 holes this would translate to 0.16 putts gained. So players who putt well, in general, will tend to putt well in a tournament, so they’ll tend to make slightly more putts on the 18th hole. I don’t think this affects their results, but I do wish they’d controlled for ability in a better way. It’s possible that good putters block out the pressure better, or that bad putters are less affected by pressure.

Edit: It’s also unclear whether they take putts gained on the final hole out of their TPG measure.

However, the effect they’ve found is significant in terms of examining the impact of pressure. It indicates that for putts with very large differences in prize money (for example the Bubba Watson example they quoted on page 9 where $300,000 was on the line) the difference in probability of making the putt compared to a non-pressure situation could be up to 10%.
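As a rough check on that 10% figure, here is a minimal Python sketch of the arithmetic. It assumes the roughly 1-point-per-$30,000 effect scales linearly with the money at stake, which is a simplification for illustration and not necessarily how Hickman & Metz specify their model; the function name is made up for this example.

```python
def make_probability_drop(money_at_stake: float,
                          dollars_per_point: float = 30_000.0) -> float:
    """Estimated drop in make probability, in percentage points.

    Assumes the ~1 point per ~$30,000 effect scales linearly (a simplification).
    """
    return money_at_stake / dollars_per_point


# The $100,000 vs. $70,000 example: $30,000 riding on the putt.
print(make_probability_drop(100_000 - 70_000))
# The $300,000 example.
print(make_probability_drop(300_000))
```

Under that linear assumption, $30,000 at stake corresponds to a 1 point drop and $300,000 to a 10 point drop.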

The Impact of Tournament Position on Performance:

I’ve dug into my database of results for the last half decade and examined the impact on performance of starting position in terms of strokes behind the leader. I’ve found two interesting results: 1. players who begin the second and fourth rounds a large number of strokes behind the leader perform worse than expected (I have compared all performance to my expected z-score performance) and 2. players who begin the fourth round in the lead or one stroke back perform worse than those who begin the round near the leader, but further back. For #1, it’s possible that these players are out-of-form (whether because of injury, swing, fatigue, etc.) or that they’re “giving-up” – focusing less because there’s less on the line for them. For #2, I suggest that players near the lead “choke” or play slightly worse than normal because of pressure.

It’s the nature of a golf tournament that after the second round a cut is taken that eliminates the worse half of the field. After the fourth round, prize money is awarded based on finish –  with those at the top earning as much as 17% of the purse and most of those near the bottom earning prizes of around 0.5-1% of the purse. In other words, players who begin the second round far behind the leaders normally will not be able to make the cut, while those who begin the fourth round far behind the leaders are normally locked into a very small prize. This means that players who are near the bottom beginning the second and fourth rounds aren’t playing for much; the first group is likely to miss the cut and earn nothing while the second group is largely locked into a small prize. I’ve found the negative impact on performance to be as much as a quarter to half a stroke for those near the bottom of a leaderboard (the green and blue lines on the below graph). It’s impossible, with this data, to attribute this effect to “giving-up” or to physical factors – out-of-form swing, injury, fatigue, etc.

[Figure: impact of starting position]

Similarly, if you focus on the fourth round results in blue, you can see that players in positions zero and one strokes behind the leader performed approximately 0.1 strokes worse than expected while those in positions two to seven strokes behind the leader performed approximately 0.06 strokes better than expected. All of these players had a reason to remain fully engaged mentally with the tournament. Those finishing in the positions they started the round in stand to earn the large prizes. This shows that players who begin the round in or near the lead typically play slightly worse than would be expected by their prior performance, and more importantly that the same is not true for players who begin the round in close positions, but not in or right behind the leader. This reinforces the idea that pressure exerts a negative effect on those in the lead.

I also should address the red line for the third round. Players in the third round begin in opposite order of performance so far, meaning those furthest back of the leader tee off early, while those closest to the leader tee off late. PGA Tour courses play more difficult, in general, in the afternoon than the morning. Steven Rachesky found a difference of roughly 0.15 strokes between early and late tee-times – similar to the results I’ve observed from my data. That is the major reason why the data for the third round looks drastically different.

Predicting Putting Performance by Distance

Mark Broadie’s research on the ShotLink data established a clear relationship between putt distance and % of putts made. PGA Tour pros make a very high percentage of their close putts, but only about half of their putts around 10 feet and only around one in six around 20 feet. Pros hole very few (~5%) of their longest efforts from 25 feet and beyond. That data on % of putts made for each distance now forms the backbone of the PGA Tour’s Strokes Gained Putting statistic where players are credited and debited for making or missing every putt from every distance. Over a single season Strokes Gained Putting is often an unreliable indicator of putting performance, particularly at the extremes and also for players who have putted much worse or much better than in previous seasons.

Putting performance is polluted by randomness; Tour players just don’t attempt enough putts over the course of the season to get an accurate picture of their underlying putting ability. However, to make accurate projections of putting ability, you need to know whether Graeme McDowell’s 0.9 putts gained this season represents more talent or more luck. I’ve broken down putting performance into four different distance buckets from the PGA Tour data: putts inside 5 feet, 5-15 footers, 15-25 footers, and putts outside 25 feet. The results show that putting performance is far more predictable and consistent at the short distances. Long putting is so noisy that it’s difficult to say anyone gains much of an advantage from their long putting over the long-term.

Inside 5 Feet:

These putts are almost always converted (average 96%). The spread in performance between 2011-14 was 93% to 99%. The spread in expected performance derived from weighting the previous four seasons is 94.3% to 97.8%. This indicates that we should expect every regular Tour player’s true talent from inside 5 feet to fall somewhere inside that 3.5% range. Based on an average of over 900 putts attempted inside 5 feet over a season, we should expect every regular Tour player’s talent in terms of putts gained or lost to fall between +0.2/round and -0.3/round.

The graph below shows the correlation between a three year average (2011-13) and 2014 performance for all players with qualifying rounds in all four seasons. The correlation (R=0.56) between prior performance and 2014 performance is strongest in this distance range.

[Figure: inside 5 feet]

5-15 foot Putts:

Putts of this length are either short birdie putts or par putts after a scrambling shot, and they are converted approximately half the time. The spread in performance between 2011-14 was 36% to 54%. The spread in expected performance derived from weighting the previous four seasons is 40% to 52%. Based on around 450 putts attempted from 5-15 feet over a season, we should expect every regular Tour player’s talent in terms of putts gained or lost to fall between +0.4/round and -0.5/round. Compare that to the best putters on Tour gaining about 0.75 putts/round.

The correlation between three year average and 2014 performance is below. The correlation (R=0.53) is similar to that for the short <5 foot putts.

[Figure: 5-15 footers]

15-25 foot Putts:

Putts of this length are normally longer birdie putts and are converted about 16% of the time. The spread in performance between 2011-14 was 8% to 26%. The spread in expected performance derived from weighting the previous four seasons is 12% to 20%. Based on around 225 putts attempted from 15-25 feet over a season, we should expect every regular Tour player’s talent in terms of putts gained or lost to fall between +0.15/round and -0.15/round. There’s much less at stake from this range than the previous two, just because so few putts are attempted from 15-25 feet.

The correlation between three year average and 2014 performance is below. There’s not much of a relationship (R=0.28), showing that putting performance from this range is much more affected by random chance over a full season than the shorter length putts.

[Figure: 15-25 footers]

Putts outside 25 feet:

Putts of this length are the longest birdie putts, often really lag putts just to get it close for par. The spread in performance between 2011-14 was 2% to 13%. The spread in expected performance derived from weighting the previous four seasons is 4% to 9%. Based on around 300 putts attempted from beyond 25 feet over a season, we should expect every regular Tour player’s talent in terms of putts gained or lost to fall between +0.1/round and -0.1/round. Again, there’s very little difference in expected performance from this distance. Even the very best long putter on Tour will gain little from these putts – over the long term.

The correlation between three year average and 2014 performance is below. There’s almost no relationship (R=0.10), which means it’s almost impossible to predict how well a player will putt on these long putts. The top ten long putters from 2011-13 averaged holing 7.6% of their putts (versus 5.5% average). They only holed 6.7% of their putts in 2014 – a regression of almost 50% to the mean.

[Figure: outside 25 feet]
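The regression figure for the top ten long putters can be reproduced from the rounded make rates quoted above. This is a minimal sketch with a made-up helper name; because the inputs are rounded, it gives about 43%, a little under the "almost 50%" quoted, which may have come from unrounded data.

```python
def regression_to_mean(prior_rate: float, later_rate: float, mean_rate: float) -> float:
    """Share of the gap to the mean that was given back in the later period."""
    return (prior_rate - later_rate) / (prior_rate - mean_rate)


# Top ten long putters: 7.6% in 2011-13, 6.7% in 2014, Tour average 5.5%.
share = regression_to_mean(7.6, 6.7, 5.5)
print(f"{share:.0%}")
```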

The Big Picture:

This graph shows performance in all four ranges. The longer putts show little relationship to future performance, while the shorter putts do show a more consistent relationship. This means that players who gained a lot of putts last season based off their longer putts will start making putts at a lower rate, while those who gained a lot of putts based on shorter putts are better bets to retain that putting ability.

[Figure: big picture]

Most Improved Putters from 5-15 feet in 2014:

1. Graeme McDowell

2. Charley Hoffman

3. Billy Horschel

4. Justin Leonard

5. Michael Thompson

These guys have a better chance of retaining their putting performance into 2015.

Most Improved Putters from > 25 feet in 2014:

1. Rory McIlroy

2. Y.E. Yang

3. David Toms

4. Brendan Steele

5. Brian Gay

These guys look likely to regress in terms of putting performance, especially McIlroy who performed to career average on all other putts, but holed 8% more of his long putts – gaining almost a third of a putt per round over his career average.

Measuring the Signal in First Round Performance

After the 1st Round of the Deutsche Bank Championship a month ago, Keegan Bradley sat two strokes off the lead. Playing in front of the home fans, Bradley fired a six under 65 fueled by great putting (4.2 strokes gained) and a solid long game (2.3 strokes gained on tee shots and approach shots). At that point he looked in great shape to keep it going and capture his first win of the season. However, he came out the next three rounds and shot 71-69-71 to finish T16. The culprit wasn’t his long game either; he gained 1.6 strokes on the field per round in the second, third, and fourth rounds, good enough to finish in the top ten for the event in strokes gained off tee shots and approach shots. No, it was the putter that let him down. After being hot in the opening round, he actually lost 0.4 strokes per round from his putting.

My question is: how common is Bradley’s experience? When golfers come out in the 1st round and play/putt very well, how often do they keep playing/putting well? What about when they come out hitting the tee shots and approach shots well? Does that carry over to the next day? Many around the game act like one round of performance is really meaningful (just look at everyone who advocated for playing Jordan Spieth and Patrick Reed after their Friday morning 5&4 win at the Ryder Cup), but does first round performance tell us anything about how a player will perform in the following round?

Looking at Putting:

I gathered round by round Strokes Gained Putting data from the twelve most recent PGA Tour tournaments (Travelers Championship through the Tour Championship). First, I checked how 1st round putting performance predicted 2nd round putting performance. That’s the first graph below, and the results show that how a player putted in the 1st round hardly sheds any light on how they will putt in the 2nd round (R^2 of 0.001). In fact, someone who putted as well as Keegan Bradley did in the above mentioned round would be predicted only to putt 0.2 strokes above average the following round.

[Figure: round 1 SGP vs. round 2 SGP]

Next I generated prior expectations of Strokes Gained Putting performance from the past several years of data. I’ve shown before that putting performance isn’t very consistent season-to-season, so I’m using performance from 2011 to 2014 to generate the prior. The below graph shows how well the prior expectation predicted 2nd round putting. The results still were not highly predictive – R^2 of 0.01 (performance round to round is highly variable in golf) – but the regression line produced tracks pretty closely with results. Players predicted by the prior to putt well generally putted well and those predicted to putt poorly generally putted poorly.

[Figure: prior SGP vs. round 2 SGP]

Finally, I tied both pieces of information together. The prior estimate proved way more predictive than just 1st round performance, but does 1st round performance have any information to add? I set up a linear regression with the prior estimate as x1 and the 1st round performance as x2. The results indicated 1st round putting performance provides no extra information to predict 2nd round putting performance (the coefficient was indistinguishable from zero). If you have a good guess of how well a player will putt, you can safely ignore first round putting performance.
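For readers who want to try this kind of two-predictor regression themselves, here is a minimal pure-Python sketch. Nothing here is the actual data or tooling behind the study: the data is synthetic, generated under the assumption that each round equals a player's underlying putting talent plus independent noise, and the talent spread (0.3) and round-to-round noise (1.7) are illustrative guesses.

```python
import random


def ols_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Center everything so the intercept drops out of the 2x2 system.
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Synthetic player-rounds (assumed values, not real Tour data).
rng = random.Random(0)
prior = [rng.gauss(0.0, 0.3) for _ in range(5000)]  # x1: prior putting estimate
rd1 = [p + rng.gauss(0.0, 1.7) for p in prior]      # x2: 1st round strokes gained
rd2 = [p + rng.gauss(0.0, 1.7) for p in prior]      # y: 2nd round strokes gained

b0, b1, b2 = ols_two_predictors(prior, rd1, rd2)
print(f"prior coefficient: {b1:.2f}, 1st round coefficient: {b2:.2f}")
```

In this simulated world the 1st round carries no information beyond the prior by construction, so its fitted coefficient should land near zero while the prior's lands near one. That is the same pattern reported above for putting.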

Looking at Long Game Performance:

The long game is tee shots and approach shots (drivers/woods/irons essentially). I gathered long game performance data from the same twelve PGA Tour tournaments for the first and second rounds. I then ran the same studies as above, just substituting long game data for putting data. The correlation between 1st round long game performance and 2nd round long game performance was higher than with putting, but still didn’t contain a lot of information (R^2 of 0.03). If a player plays four strokes above field average in long game strokes gained, they’re expected to play 0.6 strokes better in the long game in the 2nd round.

[Figure: round 1 long game vs. round 2 long game]

There was also a higher correlation between my prior estimate for long game ability and 2nd round long game performance (R^2 = 0.10). Again though, the regression line tracks closely with the results. Top ten long game players (around +1.2 strokes or above) generally performed to that level in the 2nd round.

[Figure: prior long game vs. round 2 long game]

Tying both pieces together indicated that there is a small amount of signal in 1st round long game performance. Combining the prior estimate with 1st round performance slightly increases the fit of the model. The regression equation suggests that you should weight your prior estimate at twelve times the strength of first round performance. This indicates that someone who is PGA Tour average in long game shots, but produces an elite round of 4.0 long game strokes gained, should be expected to play about 0.3 strokes above average in long game shots. That seems like a small difference, but it’s enough of a shift in talent to move a player from around 50th best in the world to about 30th best in the world.
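One way to read the twelve-to-one weighting is as a simple weighted average of the prior estimate and the 1st round result. That reading is an assumption made for this sketch (the post doesn't give the fitted regression equation), and the function name is made up, but it does reproduce the 0.3 stroke figure above.

```python
def blended_long_game(prior: float, first_round: float,
                      prior_weight: float = 12.0) -> float:
    """Weighted average of the prior estimate and 1st round long game strokes gained.

    Treating the 12:1 weighting as a plain weighted average is an assumption.
    """
    return (prior_weight * prior + first_round) / (prior_weight + 1.0)


# A Tour-average long game player (prior 0.0) who gains 4.0 strokes in round one.
estimate = blended_long_game(0.0, 4.0)
print(round(estimate, 2))
```

Here 4.0 / 13 is roughly 0.31 strokes, in line with the "about 0.3 strokes above average" quoted above.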

The Takeaway:

Based on these results, it looks like 1. a single round of performance is much less predictive than an estimate built on past-performance and 2. the small amount of signal contained in single rounds is from performance on tee shots and approach shots. Putting results from one round provide no more information than was available before the round. On the other hand, golfers who play particularly well on tee shots and approach shots in a round should perform slightly better than expected the following round.

 

Is Distance or Accuracy Preferable Off the Tee?

The driving distance vs. driving accuracy debate is one of the longest running in the golf community. You cannot go a single televised round without a commentator stating the critical importance of hitting fairways, while a lot of the new statistical research supports the idea of driving distance being of preeminent importance. Players have also shaped their games towards one of the extremes; Tiger Woods under Sean Foley struggled with hitting fairways, so he often opted for irons/woods off the tee – shorter distance, but more accuracy. What I have found is that you can find information to support both viewpoints depending on what stats you’re looking at, but in terms of producing lower scores, distance is king.

Impact on Hitting Greens:

Greens in regulation is the best of the traditional stats in judging performance in professional golf. Each missed green relative to the field average costs a player around 0.6 strokes – a huge sum. In comparison, each missed fairway relative to the field costs only around 0.3 strokes. One of my first posts examined the relationship between driving distance and accuracy (it’s negative: R = -0.48) and how driving distance and accuracy predict greens in regulation (pretty well: R^2 = 0.49). A linear regression of greens in regulation on driving distance and driving accuracy showed that distance and accuracy were almost equally predictive of greens in regulation [1]. That means that long/inaccurate hitters hit about as many greens as short/accurate hitters.
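To see how the two effects roughly cancel, here is a minimal sketch using the coefficients from note [1]. The player profiles (15 yards and 10 percentage points either side of average) are hypothetical, and I’m assuming fairways hit and greens in regulation are both expressed as fractions (0.10 = 10 percentage points). Note [1] gives no intercept, so everything is relative to Tour average.

```python
# Coefficients from note [1]; no intercept is listed, so the output is
# greens in regulation relative to PGA Tour average.
GIR_PER_YARD = 0.003
GIR_PER_FAIRWAY_FRACTION = 0.447  # assumes fairways hit is a fraction


def gir_vs_average(yards_above_avg, fairway_fraction_above_avg):
    """Predicted GIR (as a fraction) above or below Tour average."""
    return (GIR_PER_YARD * yards_above_avg
            + GIR_PER_FAIRWAY_FRACTION * fairway_fraction_above_avg)


# Hypothetical profiles, not real players.
long_inaccurate = gir_vs_average(15, -0.10)
short_accurate = gir_vs_average(-15, 0.10)

print(round(long_inaccurate, 4), round(short_accurate, 4))
```

With these made-up inputs both profiles land within a tenth of a percentage point of Tour average in greens hit, which is the sense in which distance and accuracy are about equally predictive.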

I replicated this study with 2014 season data and found almost the exact same results. The only difference was that accuracy was slightly better than distance at producing greens in regulation. But in general the results were consistent; if hitting greens is a player’s main concern, it doesn’t matter whether they’re long/inaccurate or short/accurate.

Impact on Scoring:

However, that’s not the conclusion reached when you look deeper into the data. I have calculated Strokes Gained Driving for all qualifying Tour players this season. Strokes Gained Driving was invented by Mark Broadie (look here & here) and measures the value of all par 4 and par 5 tee shots depending on how far they end up from the pin and whether they’re in the rough, fairway, first cut, bunker, out of bounds, water, etc. In short, it tells you how well each player drove the ball this season. Mark has posts here and here on PGA Tour.com discussing the leaders and the results are pretty intuitive on who is normally seen as good at driving.

Using driving distance and accuracy, I attempted to predict Strokes Gained Driving using a linear regression. The two stats were highly predictive (R^2 = 0.88), but driving distance was more valuable to predicting Strokes Gained Driving [2]. In fact, a PGA Tour player in the top ten in driving distance and near the bottom in accuracy (think Dustin Johnson or Jimmy Walker) gains about 0.45 strokes per round on another player who is near the bottom in driving distance and top ten in accuracy (think Zach Johnson or Graeme McDowell). The same relationship holds closer to average: it’s preferable to be slightly longer than slightly more accurate in general. You hit closer approach shots which produce easier looks at birdie.

Impact on Birdie & Bogey Rates:

I finally looked at the impact of driving distance and accuracy on Birdie% and Bogey%. For Birdie% I used the PGA Tour’s Par 4 Birdie or Better stat and for Bogey% I used the PGA Tour’s Bogey Avoidance stat. Both were the best available, though they aren’t adjusted for course difficulty which may limit these results. Distance and accuracy proved much less predictive of Birdie% and Bogey% than they were of Strokes Gained Driving and GIR (which makes sense because we’re ignoring scrambling ability and putting). The R^2 for predicting Birdie% was only 0.15 and only 0.06 for predicting Bogey%.

However, the results for Birdie% showed a clear advantage for those long and inaccurate drivers in generating birdies (about 1% more Birdies for long/inaccurate hitters) [3]. The results for Bogey% showed a clear disadvantage for those long and inaccurate drivers in avoiding bogeys (about 1% more Bogeys for long/inaccurate hitters) [4]. Long hitters make more birdies, but also more bogeys. Short hitters make fewer birdies, but avoid bogeys. That’s fairly intuitive.

Summing it all up:

Based on these results it’s clear that long/inaccurate hitters hit about as many greens as short/accurate hitters, but they produce way more value with their tee-shots. They hit their approach shots from easier positions in general than the shorter hitters and produce more birdies. However, they hit into dangerous areas slightly more and make more bogeys than the short/accurate hitters. In fact, these results indicate that it’s easier to consistently hit greens when you’re in the fairway, but it’s easier to produce those close approach shots that turn into birdies when you’re closer to the pin.

The main take-away though is that long/inaccurate hitters produce more value with their drives over the course of the season. They’re constantly hitting closer approach shots which leads to more birdies. The only advantage possessed by short/accurate hitters is avoiding bogeys, but at the cost of making fewer birdies.


 

Notes:

[1] the coefficients were: 0.003 for yards above PGA Tour average and 0.447 for % fairways hit above PGA Tour average

[2] the coefficients were: 0.0645 for yards above PGA Tour average and 4.504 for % fairways hit above PGA Tour average

[3] the coefficients were: 0.1595 for the intercept, 0.00121 for yards above PGA Tour average, and 0.0635 for % fairways hit above PGA Tour average

[4] the coefficients were: 0.1709 for the intercept, -0.00032 for yards above PGA Tour average, and 0.0981 for % fairways hit above PGA Tour average

[5] All stats here have been adjusted for course except for the Birdie% and Bogey%

How Real are Hot Streaks in Golf?

Whenever a golfer goes on a high profile hot streak – think Rickie Fowler since June, Billy Horschel for the FedEx Playoffs, Henrik Stenson at the end of last year – there’s always a ton of talk in the media and among fans about how their new swing/putting stroke finally clicked, or that player is returning to form, or they’re finally mature enough to win, etc. Humans love writing narratives to explain why things happen. The end result of all that talk is that a guy in the middle of a hot streak is considered to be much better than they would’ve been considered before the hot streak. No one thought Billy Horschel was deserving of a Ryder Cup pick a month ago, but now everyone thinks we should toss Webb/Mahan off to make room for him. No one thought Rickie Fowler was one of the 1-2 best American players in May, but now that’s almost assumed. Everyone around golf seems to think that hot streaks are real – that they actually predict who’s going to continue to play well. In this post I’ll provide evidence that shows that hot streaks are retained to a small degree – even months later – but that extreme performances still regress strongly to prior expectations.

Methodology:

I settled on using five week periods to measure performance. My sample was everyone who had recorded at least 8 rounds in a five week period and then recorded at least 8 rounds in the next five weeks. All my data is from the 2011-2014 seasons. The actual metric I used to measure performance was my z-score ratings, which are basically strokes better or worse than the field adjusted for the strength of the field. I compared each player’s z-score over that five week sample to my prior z-score rating. I have a prior rating for every player in my sample generated each week which mostly uses prior performance and very recent play to predict how well a player will play that week. They’re designed to be the most accurate prediction of performance. I subtract the prior expectation from the sample performance to get the change in performance which I’ll call the Initial Delta.

So my metric looks like this:

(Sample performance over 5 weeks) – (Prior expectation) = (Initial Delta)

I generated an Initial Delta for every player who qualified for my sample, generating over 27,000 separate data points.

I then calculated a Subsequent Delta for every player using the same method only using the next five weeks as my Sample performance and the same Prior expectation used above (meaning I don’t consider any recent results). I then compare the Initial Delta to the Subsequent Delta. If players get hot and stay hot, the two should be strongly correlated. If whether a player has been hot or cold does not predict their subsequent performance, the two will not be correlated.
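Here is a rough sketch of that comparison. The records are toy numbers I made up (deliberately chosen so the slope comes out at exactly 0.2), positive values are assumed to mean better than expected, and using an ordinary least squares slope as the "share retained" is my assumption about how to summarize the relationship.

```python
def delta(sample_performance, prior_expectation):
    """Performance over a window minus the prior expectation."""
    return sample_performance - prior_expectation


def retention_slope(initial_deltas, subsequent_deltas):
    """Least-squares slope of Subsequent Delta on Initial Delta."""
    n = len(initial_deltas)
    mean_x = sum(initial_deltas) / n
    mean_y = sum(subsequent_deltas) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(initial_deltas, subsequent_deltas))
    var = sum((x - mean_x) ** 2 for x in initial_deltas)
    return cov / var


# Toy records: (prior expectation, first five weeks, next five weeks).
records = [
    (0.0, 1.0, 0.2),
    (0.5, -0.5, 0.3),
    (-0.2, 0.3, -0.1),
    (1.0, 1.5, 1.1),
]

initial = [delta(first, prior) for prior, first, _ in records]
subsequent = [delta(nxt, prior) for prior, _, nxt in records]

print(round(retention_slope(initial, subsequent), 2))  # 0.2
```

A slope near 1 would mean hot streaks carry over fully; a slope near 0 would mean they carry no information beyond the prior.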

tl;dr of the above is I’m comparing how much better/worse a guy played over the first 5 or 10 weeks to how much better/worse he played over the next 5 weeks.

Results:

The results show that in general players retain only a small portion of their over or under-performance. Overall, about 20% of the Initial Delta is retained over the next five weeks. This means that if Billy Horschel played 1.8 strokes better than expected over the last five weeks, he should be expected to play about 0.36 strokes better than previously expected in the next five weeks. Now, 0.36 strokes is a large amount, but it’s not enough to bring him up to Bubba/Fowler/Keegan’s level (here is an example of the distribution of talent among the top 50 in the world). Right now, he should be considered slightly better than Mahan or Webb, but not by some ridiculous amount and certainly not to any degree that’s going to affect the outcome of the Cup.

5weekNOPRIOR

Looking Further Ahead:

The above shows that hot streaks can be retained to some degree over a short period of time, but how much is retained further down the road? Is Billy Horschel going to be able to retain any of that ability he showed to win the FedEx Cup going into next season? I set up the same study as above, only instead of looking at performance in the next five weeks, I looked over the next four months (16 weeks to be precise). Everything is calculated the same, though I only included players with at least 20 rounds over that four month span.

The results here showed that about 18% of the Initial Delta is retained over the next four months, a similar amount to what is retained over the next five weeks. Golfers who play significantly better than expected over five weeks should perform better than previously expected, but only to a small degree. To give you a sense of when recent performance becomes mostly insignificant, if a player performs 0.5 strokes better than expected over five weeks (basically what Chris Kirk has done in the FedEx Cup Playoffs), he is expected to retain only around 0.1 strokes (which is insignificant, basically a rounding error in predictive terms).
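The two retention rates can be applied directly to the examples in the text. This is just the multiplication spelled out; the constant and function names are mine.

```python
RETENTION_FIVE_WEEKS = 0.20    # share of Initial Delta kept over next 5 weeks
RETENTION_FOUR_MONTHS = 0.18   # share kept over the next 16 weeks


def retained_strokes(initial_delta, retention_rate):
    """Expected over-performance going forward, in strokes."""
    return initial_delta * retention_rate


# 1.8 strokes better than expected, projected over the next five weeks.
print(round(retained_strokes(1.8, RETENTION_FIVE_WEEKS), 2))   # 0.36

# 0.5 strokes better than expected, projected over the next four months.
print(round(retained_strokes(0.5, RETENTION_FOUR_MONTHS), 2))  # 0.09
```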

4monthNOPRIOR

Adjusting Expectations:

I’ve attached a list of the top and bottom ten guys who have most over or under-performed over the last five weeks (PGA/European Tour only).

Movers9182014

Obviously Horschel is at the top along with some FedEx Playoff stalwarts like Palmer/Fowler/Day. Ryder Cupper Jamie Donaldson has been killing it over in Europe as well. Among the trailers, Phil’s name sticks out like a sore thumb. The US team has to hope his multiple weeks off can help him rediscover his game before the next week. Probably the most terrifying thing is how close Ryan Moore came to making this team – he finished 11th in points, but was only a stroke away from jumping Zach Johnson in points at the PGA Championship. Moore is dead last of 339 pro golfers in terms of his performance relative to expectation.

Does High Ball Flight Hurt at the British Open?

During the coverage of The Open Championship on ESPN I’ve heard numerous references to how a high ball flight hurts a golfer (because it exposes the ball to the wind longer) or how golfers have to adjust their ball flight lower to avoid the wind. The conventional wisdom certainly makes sense here, but I’ve never actually looked at it. If it does make a significant difference then guys like McIlroy, Fowler, Jason Day, Stenson, etc. who bomb it high would either have to adjust their games or suffer the penalty. To test whether this is actually happening I’ve taken the PGA Tour’s posted Trackman data on Apex Height and used the results of the past six Open Championships to measure whether there is any connection.

Design Assumptions:

1. That a golfer’s ball flight with their driver, woods, irons, etc. is represented by their Trackman Apex Height measured only on drives. My limited knowledge of projectile physics and golf confirms this is likely true.

2. That a golfer is hurt by either 1. high ball flight or 2. adjusting their game away from normal high ball flight. It may be that high ball flight harms golfers at The Open, but golfers easily compensate for it without harming their ability. It seems unlikely, but keep in mind I’m not compensating for whether golfers play differently at The Open.

Design:

The PGA Tour has radar measurements since 2007. My Apex Height for each golfer is simply their average of the two prior seasons (or one prior season if they have only one). I’ve tossed out anyone without listed data; this removes most European Tour golfers. This is unfortunate, but there’s no data available (and it avoids confounding the results with any home continent advantage).
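As a sketch of that rule, the function below averages whichever of the two prior seasons have data and returns nothing for golfers with no data. The data layout (a dict of season to apex height) and the numbers in the example are hypothetical.

```python
def prior_apex_height(apex_by_season, open_year):
    """Average apex height over the two seasons before open_year.

    Falls back to a single season if only one is available, and returns
    None if neither prior season has data (those golfers are dropped).
    """
    prior_seasons = [apex_by_season[year]
                     for year in (open_year - 1, open_year - 2)
                     if year in apex_by_season]
    if not prior_seasons:
        return None
    return sum(prior_seasons) / len(prior_seasons)


# Made-up apex heights, for illustration only.
print(prior_apex_height({2011: 100.0, 2012: 104.0}, 2013))  # 102.0
print(prior_apex_height({2012: 95.0}, 2013))                # 95.0
print(prior_apex_height({}, 2013))                          # None
```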

To judge performance I simply compared each golfer’s performance relative to the field in that year’s Open Championship to what I expected from them entering the event, using my ratings up to the date that tournament was played. These ratings are heavily based on the previous couple of seasons of overall performance. This allows me to measure performance changes due to the impact of ball flight. A golfer who performs to expectation will receive credit for no change, while one that overperforms by a lot will be credited for overperforming. I’ve used Open Championship results from 2008-2013 because that’s when there is Trackman data available.

Results:

First, ball flight is strongly correlated between seasons for golfers. Season N is correlated at R=0.77 to Season N-1 and R=0.68 to Season N-2. What Apex Height measures is definitely a persistent ability for golfers.

Apex Height British Open

It appears that a high ball flight does not impact performance in The Open Championship positively or negatively. This is based on 437 golfer tournaments over 2008-2013.

Discussion:

The results indicate there is no impact on performance, however some caveats are needed. 1. This assumes that the randomness of putting has washed out of the results. Golfers have hot or cold putting rounds all the time which are not impacted by ball flight at all, and I cannot be certain this hasn’t impacted the results. 2. This also assumes that weather effects have been distributed evenly. It seems unlikely that the high ball flight golfers have benefited from disproportionately calmer weather conditions, but weather is highly variable and the sample is only 437 golfer tournaments.

Despite those caveats, I think it’s likely that either high ball flight doesn’t harm a golfer’s chances at The Open or that golfers can modify their ball flight at The Open without seeing any negative impacts on performance.

Performance Impact of Playing John Deere Classic vs. Scottish Open

The week before The Open Championship features a choice for the elite golfers on the PGA Tour; they can enter the John Deere Classic, a weak field PGA Tour event with the smallest purse of regular full-field events, they can take the week off to prepare for The Open Championship, or they can enter the Scottish Open, one of the premier European Tour events held on a links style course which exempts the OWGR top 60 – giving PGA Tour golfers a chance to play in Europe. This week, in addition to golfers who actually hold European Tour cards, Phil Mickelson, Rickie Fowler, Jimmy Walker, and several others all chose to tee it up in Scotland, while Jordan Spieth, Zach Johnson, Harris English, and most others entered the John Deere Classic.

The arguments in favor of entering the Scottish Open are two-fold in my eyes: first, you get to acclimate to the time difference a week before many of the other golfers, and second, you get to play a links style course rather than a typical PGA Tour layout which presumably better prepares you to play the following week. Phil Mickelson famously won the Scottish Open last summer, before winning his first Open Championship at Muirfield the next week. As a bonus the Scottish Open purse is roughly the same size as the John Deere Classic’s (£3,000,000 vs. $4,500,000).

As to the advantages of playing in the John Deere? First you accumulate FedEx Cup points to either secure your card or bolster your chances of advancing further in the Playoffs. The PGA Tour awards no points for guys playing in Scotland. That obviously goes for the exemptions that come with winning the tournament as well. It’s possible that certain players couldn’t secure entrance to the Scottish Open, but it exempts the top 60 in the World Rankings. Spieth, Zach Johnson, English, Ryan Moore, Kevin Na, Chris Kirk, Kevin Streelman, and John Senden all could’ve entered but chose to play in Illinois. As a bonus, the John Deere provides a Sunday night charter that transports golfers from Illinois to Britain.

When I tested my assumptions that playing the Scottish Open the week before was an advantage my results indicated that golfers who played in Scotland rather than Illinois the week before enjoyed a significant advantage relative to those who played in the John Deere. I’ll explain my process and results below.

First I gathered the results of the last six Open Championships that I had handy for another project. I then found basic performance expectations for each golfer with results based on their prior performances (basically using results from the two years prior to each Open). That gave me results to compare and a baseline of expectations. If players significantly over or underperformed that baseline in each Open, that likely indicates playing the Scottish Open or John Deere the week before was a better strategy. I then gathered data for the prior week, finding who played in Scotland or Illinois.

My results were very convincing that playing the Scottish Open gave a player an advantage. On average, golfers who played the Scottish Open performed around 0.65 strokes/round better than those who played in the John Deere, relative to their baseline expectation (John Deere players were around 0.9 strokes worse than expected and Scottish Open players were around 0.25 strokes worse than expected). I had 144 players in the John Deere sample and 310 in the Scottish Open sample. The results are below denominated in strokes compared to PGA Tour average; negative numbers indicate performance better than average and in the difference column positive numbers indicate worse performances.

british open prep

The results were consistent in each of the six seasons I examined. Players who played the Scottish Open were always advantaged over those playing the John Deere, by between 0.2 and 1.5 strokes/round. I ran simulations using the baseline expectations, normal observed standard deviations, and the sample sizes from the last six years. The simulations showed a difference of the size measured in favor of the Scottish Open players less than 1% of the time. Based on the size of the effect and the fairly large samples for both tournaments, I think it’s very unlikely this is simply variance.
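A simplified version of that kind of simulation might look like the sketch below. It is not the original simulation: it assumes no true difference between the groups, draws each group’s average directly from a normal distribution instead of simulating individual golfers, and uses a hypothetical standard deviation of 2.0 strokes/round per golfer tournament. Only the sample sizes (144 and 310) and the 0.65 stroke gap come from the text.

```python
import random


def simulate_gap_probability(n_deere=144, n_scottish=310, sd=2.0,
                             observed_gap=0.65, trials=20000, seed=0):
    """Share of simulated samples with a gap at least as large as observed.

    Values are strokes/round worse than expectation, so a positive gap
    means the John Deere group did worse. sd is a hypothetical value.
    """
    rng = random.Random(seed)
    # Standard error of each group's mean under the no-difference assumption.
    se_deere = sd / n_deere ** 0.5
    se_scottish = sd / n_scottish ** 0.5
    hits = 0
    for _ in range(trials):
        deere_mean = rng.gauss(0.0, se_deere)
        scottish_mean = rng.gauss(0.0, se_scottish)
        if deere_mean - scottish_mean >= observed_gap:
            hits += 1
    return hits / trials


print(simulate_gap_probability())
```

With these assumptions a 0.65 stroke gap is more than three standard errors from zero, so it shows up by chance well under 1% of the time; a larger assumed standard deviation would make such a gap more common.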

My current methods don’t allow me to completely tease out what factors are contributing to that advantage. The group of golfers who played neither event the week before underperformed their expectation by 0.43 strokes (compared to 0.83 for John Deere and 0.25 for Scottish Open). Mixed in that are golfers who competed elsewhere in the world the week before The Open (Asian Tour, Japan, etc.), golfers who arrived in Britain for prep the same week as the Scottish Open and didn’t compete anywhere, golfers who arrived in Britain at the same time as the John Deere competitors, and European based golfers who didn’t compete anywhere. All this indicates is that there appears to be some advantage to playing the Scottish Open and some disadvantage to playing the John Deere Classic.

Perhaps there is some residual home continent advantage to getting used to the time change over the course of ten days, rather than three. Perhaps the value of an extra week preparing on a links course is actually sizable. There’s also the fact that most of the sample of Scottish Open players are regular European Tour players with slightly more experience playing links style courses. While I don’t think overall experience plays much of a role in this difference, it’s likely a small factor. It’s clear however that something is causing players from the John Deere to dramatically under-perform relative to those who are playing in the Scottish Open. In future seasons PGA Tour players qualifying for the Scottish Open would be better served taking advantage of the invitation in order to better perform the following week.

Aging Curves for Scrambling and Driving Distance

In the past months I’ve posted about aging on the PGA Tour several times, including a general aging curve and an aging curve for putting performance. The general shape of the aging curve for PGA Tour players is a slight improvement from the early 20s to early 30s, followed by a period of relative stability through the mid 30s, and then a steady decline from the late 30s until 50. I assume from limited post-50 PGA Tour data and the paucity of Champions Tour golfers over 60 that aging continues to steadily erode a golfer’s game after 50. For putting, the curve was similar, but much less pronounced. Putting accounts for little of the improvement experienced in a golfer’s 20s and less than 10% of the decline experienced from the late 30s onward. The source of age-related improvement and decline is clearly some other part of a golfer’s game – either off the tee, the short game, or the long iron approach shot game. I’ve constructed aging curves for the first two components.

SGPAgingCompared

Aging Curve for the Short Game

I’ve calculated my own adjusted scrambling statistic previously which uses the PGA Tour’s scrambling stat as its base, but attempts to remove the influence of putting and difficulty of the lie to figure out which golfers play the ball into the best positions when they miss the green. I’ve described the calculation in this post. The spread in talent between the best and worst golfers each year by this metric is roughly a stroke. That is, the best golfers at scrambling gain half a stroke/round on the field and the worst golfers lose half a stroke/round. For comparison, the best putters gain nearly a stroke/round on the field and the worst putters lose nearly a stroke/round while the best golfers tee to green gain two strokes and the worst lose two strokes. Scrambling shots make up only 10% of a golfer’s strokes per round (6-8 strokes).

Unfortunately, this method does not include strokes where a golfer went for the green in two on a par 5, but missed the green. I estimate these shots comprise only roughly 1.5 strokes/round for the average golfer, though as many as two strokes for longer hitters and as few as one for shorter hitters. These shots are no different than scrambling shots on par 3s or 4s (besides scrambling for birdie rather than par), but the data just isn’t there to include them. So around 15-20% of the sample of short game strokes is missing. I’m confident this will not materially affect the results of this study.

I gathered data from 2008-2013 for every golfer who played consecutive seasons with at least 30 PGA Tour rounds in each. This resulted in 693 pairs of seasons to compare. As with prior aging studies, I used the delta method popularized by Mitchel Lichtman for use in baseball research. This method simply aggregates all improvements and declines between golfers at each age to see whether an age cohort generally improved or declined. From that data, a curve can be constructed.
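A bare-bones version of the delta method is sketched below. The input layout and the toy numbers are hypothetical, and this version takes an unweighted average of the changes at each age and chains them together; a fuller treatment might weight by rounds played and handle ages with no data.

```python
from collections import defaultdict


def delta_method_curve(season_pairs):
    """Build an aging curve from consecutive-season pairs.

    season_pairs: iterable of (age_in_first_season, first, second), where
    first and second are performance in the consecutive seasons.
    Returns {age: cumulative change relative to the youngest age seen}.
    """
    deltas_by_age = defaultdict(list)
    for age, first, second in season_pairs:
        deltas_by_age[age].append(second - first)

    curve = {}
    running_total = 0.0
    for age in sorted(deltas_by_age):
        changes = deltas_by_age[age]
        # Average year-over-year change for golfers of this age.
        running_total += sum(changes) / len(changes)
        curve[age + 1] = running_total
    return curve


# Toy data: two golfers aged 24 and one aged 25 in their first season.
pairs = [(24, 0.00, 0.02), (24, -0.10, -0.10), (25, 0.05, 0.06)]
print(delta_method_curve(pairs))
```

In the toy data the 24 year olds improve by 0.01 on average and the 25 year old by 0.01, so the curve rises by about 0.01 at 25 and 0.02 at 26.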

Scrambling Aging Curve

That is not even a curve, but instead a steadily increasing trendline. The data indicates that scrambling ability increases linearly with age. Golfers under the age of 24 (34 seasons since 2008) performed around 0.05 strokes worse than the field average at scrambling. From there, golfers improved by around 0.01 strokes/season – a small amount, but one that indicates that experience on Tour leads to improvement in short game play. This finding runs contrary to any other aging study I’ve conducted.

I suppose it is not that shocking, however. Short game play involves a lot of technical skill – stance, position of hands, correct judgement of swing speed, club head position, etc. (in addition to strategy) – while not involving much of the physical strength that declines with age. Any golfer who can play on Tour can generate the proper amount of swing speed to play <50 yard shots. Not so much for the ability to generate the swing speed to hit 300+ yard drives or reach the green with 3 wood from 275 yards.

Compare scrambling here with the other component curves I’ve introduced. Putting generally declines slightly, while tee to green play (meaning all non-putting/scrambling strokes) improves only until around 27, stagnates until the mid 30s, and then sharply declines. I’ll discuss why I believe tee to green play stagnates around 27 next.

component aging curves

Aging Curve for Driving Distance

My catch-all category “tee to green” from above includes a wide array of shots: drives on par 4s/5s, tee shots on par 3s, going for the green shots on par 5s, long approach shots on par 4s/layups on par 5s, and short wedges on par 4s/par 5s. Some of those shots require a golfer to exert maximum effort to hit the ball near his peak ability (most drives/going for green shots), while most others require at least a full swing. In short, most of the shots contained in the “tee to green” bucket are going to be heavily affected by how much physical strength a golfer can exert. There are other factors certainly (strategy, precision, etc.), but physical strength is a large part of it.

The problem with that is the type of physical ability that combines body coordination with physical power – think driving a golf ball or hitting for power in baseball – begins declining as early as the mid 20s. This study from 2012 observed that while baseball hitters have tended to peak around 27, their ability to hit for power (home runs, doubles, etc.) has peaked at 25. I’ve observed the same phenomenon in golfers’ ability to drive for distance.

To measure the impact of aging on driving distance I gathered the Trackman data the PGA Tour has from 2008-2013. Typically the PGA Tour sets up Trackman on one hole per tournament to gather information about the club head speed, ball speed, launch angle, carry distance, etc. for the drive. I prefer using this data to measure driving distance because it places all golfers on an even surface. The hole is selected for whether most golfers will hit driver and the carry distance measures only distance in the air (removing the effects of firm or soft fairways). I gathered data for all golfers with a qualifying number of Trackman readings (>20/year) from 2008-2013 (the extent of the data collected by the Tour). I used the same delta method as above to measure the increase or decline in carry distance between consecutive seasons. That yielded 696 pairs of seasons.

Carry Distance

This indicates that a golfer’s peak driving distance comes at age 25 or earlier. Golfers out-drive the field by 6 yards before the age of 24, decline to roughly average by age 35, and then decline heavily from that point onward – losing almost 20 yards to the field by age 48. This is the exact pattern suggested by the baseball power hitting aging curve above.

But What About…?

Now, at this point you may be wondering how a golfer can lose so much driving distance over the course of their career and still remain competitive. However, this aging curve doesn’t prove that every golfer ages similarly and it especially doesn’t mean that we should observe elite golfers losing so much distance. Elite golfers are elite because they have overcome many of the age-related obstacles that derail other golfers. This curve merely shows what we should expect out of the typical PGA Tour golfer. Very few golfers survive to have the type of career Davis Love III, Jim Furyk, or Phil Mickelson have had. That is a direct consequence of aging; a large number of golfers simply do not age well (whether due to injury, lack of commitment to practice, or general physical decline) and find themselves off the PGA Tour by age 40. Below is a graph of three golfers – two elite, top-ten-of-the-last-25-years types and one above-average player. It gives you an idea of how even very good golfers decline in driving distance.

normanvijayappleby

Greg Norman was one of the best golfers in the world in the 1980s-90s; he won 20 times on the PGA Tour and 14 times in Europe, largely by relying on his superior distance off the tee. 1983 was Norman’s first season on Tour with reliable driving distance data. That year at age 28 he out-drove the field by 19 yards – equal in performance to Bubba Watson and Dustin Johnson currently. Norman continued to out-drive the field by large margins, but his advantage fell to 14 yards (’87-’89, age 33), 13 yards (’92-’94, age 38), and finally 2 yards (’97-’99, age 43). Norman’s game couldn’t sustain the massive drop in driving distance and he declined from one of the best players on Tour in his late 30s to one who didn’t win a PGA/European Tour tournament after 1997 (age 42). He was only a part-time player from that point.

Vijay Singh was a late-bloomer on Tour, not becoming a full-time member until 1993 when he was 30. He out-drove the field by 14 yards that season. Even at age 40 he out-drove the field by 16 yards as he rivaled Tiger Woods for the #1 ranking. However, in 2013 and 2014 Singh has been exactly PGA Tour average at driving – a decline of ~15 yards over ten years; exactly what the aging curve predicts between 40 and 50.

Stuart Appleby was a very good PGA Tour player by 2006, winning eight times and recording top 25 stroke averages in several seasons. He out-drove the field by 10 yards on average between ages 25-35 (1996-2006). However from 2006 onward he declined sharply, managing only average driving distance between 2006 and 2013 – winning only a single tournament and basically being a non-factor on leaderboards.

Applying Driving Distance to Performance:

Driving distance is correlated strongly with performance. Mark Broadie’s work has shown that for every yard closer to the pin your tee shot lands, you save around 0.004 strokes. That looks small, but it suggests that the absolute best drivers are gaining roughly 0.1 strokes/drive based on their distance. Applying the 0.004 figure to the above aging curve means we should expect a golfer to decline by 0.12 standard deviations between 25-35, another 0.12 standard deviations between 35-40, and twice that amount between 40-50. In all, a decline in driving distance explains roughly half of the decline in tee to green play. The component aging curve graph from above is reproduced with tee to green game separated into driving and non-driving shots (“Approach”).
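The per-yard figure is easy to play with. The sketch below treats the 0.004 strokes/yard value as a straight linear conversion, which is a simplification, and the function names are mine.

```python
STROKES_PER_YARD = 0.004  # Broadie's per-yard value quoted in the text


def strokes_gained_per_drive(yards_advantage):
    """Strokes saved on one drive from a given distance advantage."""
    return STROKES_PER_YARD * yards_advantage


def yards_needed(strokes_per_drive):
    """Distance advantage implied by a given strokes/drive gain."""
    return strokes_per_drive / STROKES_PER_YARD


# The 0.1 strokes/drive figure implies roughly a 25 yard advantage.
print(round(yards_needed(0.1), 1))               # 25.0

# A 19 yard advantage (Norman at age 28) is worth about 0.076 strokes/drive.
print(round(strokes_gained_per_drive(19), 3))    # 0.076
```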

comp with driving

This shows that approach shots exist in a middle-ground between the largely power based driving strokes (which begin declining by 25) and the precision/technique based scrambling strokes (which never decline). This makes sense as iron/wedge shots with a full swing combine both power and precision elements.