Golf Analytics

How Golfers Win

Similarity Model Projections for 2015

I’ve been playing around with a projection model that compares golfers to past golfers in order to predict performance for the upcoming season. These types of models are common; most notably, Baseball Prospectus has PECOTA for MLB and Kevin Pelton has SCHOENE for the NBA. They work by taking certain identifying characteristics (stats, physical measurements, age, etc.) and generating a list of comparable players who are most similar to each player being projected. From those comparable players you can generate statistical projections, confidence intervals, breakout/decline probabilities, etc. The hope is that by comparing players to thousands of past player seasons you can identify characteristics of players who improve or decline that aren’t immediately obvious if you just have point estimates of their talent level.

For my model, I selected six inputs (N = the season prior to the one being projected): performance in seasons N, N-1, and N-2; the change in performance between seasons N-2 and N-1 and between N-1 and N; and age in the middle of season N+1. For this year’s projections that means I’m using performance in 2012, 2013, and 2014, the changes in performance from 2012 to 2013 and from 2013 to 2014, and age as of 7/1/2015. Based on these inputs I’ve generated a list of the 100 most comparable golfers from 1992-2013 (or fewer if that golfer had fewer comparables that met my threshold). All projections are based on the subsequent performance of those comparable golfers.
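As a rough illustration of the comparables step, here’s a minimal sketch in Python: the six inputs become a feature vector, and comparables are the nearest past seasons by distance. The distance metric, threshold value, and lack of feature weighting here are my guesses for illustration, not the model’s actual choices.

```python
import numpy as np

def similarity_features(perf, age):
    """Build the six-input vector. perf = performance in seasons
    (N-2, N-1, N) in strokes vs. the field; age is at mid-season N+1."""
    p2, p1, p0 = perf
    return np.array([p0, p1, p2, p1 - p2, p0 - p1, age])

def most_comparable(target, candidates, k=100, threshold=2.0):
    """Indices of the k nearest candidate seasons within the threshold,
    sorted from most to least similar (plain Euclidean distance here)."""
    dists = np.linalg.norm(candidates - target, axis=1)
    order = np.argsort(dists)
    return [i for i in order if dists[i] <= threshold][:k]
```

From the seasons this returns, the projection is then just a summary (mean, percentiles, share improving by 0.5+ strokes) of what those comparable golfers did the following year.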

I’ve generated probabilities for each golfer to breakout (improve by 0.5 strokes/round), collapse (decline by 0.5 strokes/round), improve (any improvement), and decline (any decline). I’ve also generated a mean projection and 95% confidence interval for each golfer’s performance this season. Each golfer’s five most comparable seasons are listed, as well as the number of comparable golfers that met my threshold (no one reaches 100 because of the presence of a handful of 2014 seasons; these are ignored). I’ve attached the results in the Google Doc at the end.

Back-testing using last season’s projections, this model predicted that 18% of golfers would break out and 22% would collapse. In reality, 16% broke out and 24% collapsed. 94% of projected golfers fell within their 95% confidence intervals. The correlation between projected and actual mean performance was 0.75, slightly stronger than past attempts to project using prior performance alone. The mean absolute error was 0.5 strokes, meaning the average projection missed by 0.5 strokes in either direction.
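For concreteness, the three back-test numbers quoted above (correlation, miss size, and confidence-interval coverage) can be computed like this; the function name and data below are hypothetical, just showing the arithmetic.

```python
import numpy as np

def backtest_metrics(projected, actual, ci_lo, ci_hi):
    """Correlation, mean absolute error, and 95% CI coverage
    of projections vs. actual strokes/round performance."""
    projected, actual = np.asarray(projected), np.asarray(actual)
    r = np.corrcoef(projected, actual)[0, 1]
    mae = np.mean(np.abs(projected - actual))
    coverage = np.mean((actual >= np.asarray(ci_lo)) & (actual <= np.asarray(ci_hi)))
    return r, mae, coverage
```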

Most likely to breakout:

1. Webb Simpson

2. Charl Schwartzel

3. Hideki Matsuyama

4. Ryo Ishikawa

5. Brandt Snedeker

By breakout I mean that a golfer plays significantly better (by 0.5 strokes/round) than they did the year before. Examples from last year are G-Mac, Jason Kokrak, and Ryan Palmer. Simpson, Schwartzel, and Snedeker all had down years last year, but the model sees outstanding prior performance and ideal age and predicts a comeback. Matsuyama’s 2014 compared very favorably to a lot of outstanding seasons by young players, including McIlroy, Sergio, Jason Day, and Justin Leonard. Ishikawa gets credit for consistently average play at a young age.

Most likely to collapse:

1. Rory McIlroy (50%)

2. Angel Cabrera (44%)

3. Jim Furyk (44%)

4. Bubba Watson (42%)

5. Jason Kokrak (41%)

I define a collapse as a golfer declining in performance by 0.5 strokes/round. Examples from last year are Tiger Woods, Jason Dufner, and Nicolas Colsaerts. McIlroy’s place here will likely look stupid in 12 months, but his projection is generated from a very short list of comparables. Almost no one has had the up-and-down last three seasons that he has had while also playing at an absurdly high level. The model is more uncertain about his projection than about any of the other 200+ golfers projected. The model hates older golfers – it gives Angel only a 28% chance of playing better than he did last year – and John Senden, Thongchai Jaidee, and Robert Karlsson all appear near the top.


Link to projections (Google Doc)


Anatomy of a Breakout

Predicting breakouts and new tournament winners is one of the main allures of golf prognostication. Not only do you get the satisfaction that comes from watching a golfer that you’ve touted succeed, you also can bask in the glow of having identified that golfer before other golf pundits (and brag on Twitter). What we as a golf community don’t have is a good understanding of what goes into a breakout. Who typically breaks out, what type of guys win their first PGA Tour tournament, do guys sustain these breakouts in following seasons, etc. My research has concluded that 1. younger players are more likely to dramatically improve their performance than middle-aged or older golfers, 2. breakouts are more likely for bad players (even bad players with a consistent track record of poor performance), 3. most first time Tour winners are already above-average or better PGA Tour players, 4. most first time winners are established as good players (that is, they don’t play much better in the season when they first win), and 5. first time winners don’t carry over any particular boost in their performance the following season.

To judge performance I’m using my z-score ratings based on performance relative to the field, and to judge expected performance I’m using my projected z-score ratings, which I generate weekly based on overall performance adjusted for recency. I gathered a sample of every golfer on the PGA, European, and Challenge Tours who played at least 25 rounds in consecutive seasons. I compared their performance over the 2nd season to their projected performance from after the 1st season. That yielded a change in performance. On average, my sample improved slightly (by around 0.1 strokes/round), likely because I’m excluding some players who performed so poorly in their 1st season rounds that they didn’t record enough rounds in the 2nd season.

I found that for golfers under 30, 33% improved their performance by at least 0.5 strokes/round. Improving by that amount would generally take an average PGA Tour golfer from 125th in FedEx Cup points to around 65th – a fairly clear breakout. For golfers in their 30s and their 40s alike, only 25% broke out to such an extent. Players rated at around the level of an average or Challenge Tour player broke out at a 39% rate, while those established as very good or better PGA Tour players broke out at only a 22% rate. And these situations aren’t examples of guys like Paul Casey or Mike Weir completely losing their games and bouncing back. On average, guys who break out in a big way show fairly consistent performance in the three seasons prior to their breakout.

So these large improvements in performance season to season are more likely for the worst pros (the idea that there’s nowhere to go but up) and for younger golfers (which is certainly intuitive).

Next I wanted to look just at first time PGA Tour winners. I gathered 63 players who had won for the first time since 2010 (51 of whom won for the first time in 2010-2013). These guys ran the gamut from Charl Schwartzel at the 2011 Masters to Matt Bettencourt, Bill Lunde, and Arjun Atwal in a two month stretch in 2010. The first thing I found was that their performance in the season they won for the first time hardly increased from the previous year (+0.15 strokes, versus the 0.1 strokes I found a few paragraphs ago for the general pro population who played 25+ rounds). That is, first time winners generally play only slightly better in the season they win than they did in the previous season. For every Jason Dufner or Graeme McDowell who goes from solid Tour pro to superstar in the season they first win, there’s a Matt Jones (declined by 0.75 strokes) or Tommy Gainey (declined by 0.60 strokes).

The average first time winner played about 0.3 strokes better than PGA Tour average the year they won (approximately around 50th best in the world).

What about the following season, though? Do first time winners carry momentum over and perform better the next season? No – the 51 first time winners from 2010-2013 didn’t perform any better than in their previous season (in fact, they lost around 0.1 strokes on average). Youth is no guarantee here: for every McIlroy or Patrick Reed who reached new heights in the season after they first won, there’s a Gary Woodland or Kyle Stanley who slumped.

In general, predicting first time winners mainly comes down to identifying who the clearly above-average PGA Tour golfers are and then waiting. Of the top fifty golfers in my predictive ratings at the beginning of 2010, twenty had never won a PGA Tour tournament. Of those, there are ten mainly European based players (guys like Francesco Molinari or Anders Hansen). Of the remaining ten who spent a lot of time playing in the US, seven won in 2010 or 2011.

The guys who clearly stand out as the most likely first time winners this year – those who both hold PGA Tour membership and are good at golf – are the obvious names like Brooks Koepka, Victor Dubuisson, and Graham DeLaet, elite rookies like Tony Finau, Justin Thomas, and Blayne Barber, and also established pros like Russell Knox, David Hearn, and Brendon de Jonge.

Measuring the Signal in First Round Performance

After the 1st Round of the Deutsche Bank Championship a month ago, Keegan Bradley sat two strokes off the lead. Playing in front of the home fans, Bradley fired a six under 65 fueled by great putting (4.2 strokes gained) and a solid long game (2.3 strokes gained on tee shots and approach shots). At that point he looked in great shape to keep it going and capture his first win of the season. However, he came out the next three rounds and shot 71-69-71 to finish T16. The culprit wasn’t his long game either; he gained 1.6 strokes on the field per round in the second, third, and fourth rounds, good enough to finish in the top ten for the event in strokes gained off tee shots and approach shots. No, it was the putter that let him down. After being hot in the opening round, he actually lost 0.4 strokes per round from his putting.

My question is: how common is Bradley’s experience? When golfers come out in the 1st round and play/putt very well, how often do they keep playing/putting well? What about when they come out hitting the tee shots and approach shots well? Does that carry over to the next day? Many around the game act like one round of performance is really meaningful (just look at everyone who advocated for playing Jordan Spieth and Patrick Reed after their Friday morning 5&4 win at the Ryder Cup), but does first round performance tell us anything about how a player will perform in the following round?

Looking at Putting:

I gathered round by round Strokes Gained Putting data from the twelve most recent PGA Tour tournaments (Travelers Championship through the Tour Championship). First, I checked how 1st round putting performance predicted 2nd round putting performance. That’s the first graph below, and the results show that how a player putted in the 1st round hardly sheds any light on how they will putt in the 2nd round (R^2 of 0.001). In fact, someone who putted as well as Keegan Bradley did in the above-mentioned round would be predicted to putt only 0.2 strokes above average the following round.

rd1SGP v rd2SGP

Next I generated prior expectations of Strokes Gained Putting performance from the past several years of data. I’ve shown before that putting performance isn’t very consistent season-to-season, so I’m using performance from 2011 to 2014 to generate the prior. The graph below shows how well the prior expectation predicted 2nd round putting. The results still were not highly predictive – R^2 of 0.01 (performance round to round is highly variable in golf) – but the regression line tracks pretty closely with the results. Players predicted by the prior to putt well generally putted well, and those predicted to putt poorly generally putted poorly.

priorSGP v Rd2SGP

Finally, I tied both pieces of information together. The prior estimate proved far more predictive than 1st round performance alone, but does 1st round performance have any information to add? I set up a linear regression with the prior estimate as x1 and 1st round performance as x2. The results indicated 1st round putting performance provides no extra information for predicting 2nd round putting performance (the coefficient was indistinguishable from zero). If you have a good estimate of how well a player will putt, you can safely ignore first round putting performance.
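The regression setup is easy to reproduce on synthetic data. In the sketch below (my own construction, not the post’s data), round 1 is skill plus heavy round-to-round noise; regressing round 2 on both the prior and round 1 then recovers a near-zero round-1 coefficient, which is exactly the pattern described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
prior = rng.normal(0.0, 0.3, n)         # prior putting estimate (true skill)
rd1 = prior + rng.normal(0.0, 0.8, n)   # round 1 = skill + heavy noise
rd2 = prior + rng.normal(0.0, 0.8, n)   # round 2 generated the same way

# OLS of round 2 on [intercept, prior, round 1]
X = np.column_stack([np.ones(n), prior, rd1])
beta, *_ = np.linalg.lstsq(X, rd2, rcond=None)
# beta[1] (prior) comes out near 1; beta[2] (round 1) near 0
```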

Looking at Long Game Performance:

The long game is tee shots and approach shots (drivers/woods/irons, essentially). I gathered long game performance data from the same twelve PGA Tour tournaments for the first and second rounds. I then ran the exact same studies as above, just substituting long game data for putting data. The correlation between 1st round long game performance and 2nd round long game performance was higher than with putting, but still didn’t contain a lot of information (R^2 of 0.03). If a player plays four strokes above field average in long game strokes gained in the 1st round, they’re expected to play only 0.6 strokes better in the long game in the 2nd round.

rd1LONG vs rd2LONG

There was also a higher correlation between my prior estimate for long game ability and 2nd round long game performance (R^2 = 0.10). Again though, the regression line tracks closely with the results. Top ten long game players (around +1.2 strokes or above) generally performed to that level in the 2nd round.

priorLONG vs. rd2LONG

Tying both pieces together indicated that there is a small amount of signal in 1st round long game performance. Combining the prior estimate with 1st round performance slightly increases the fit of the model. The regression equation suggests you should weight your prior estimate at twelve times the strength of first round performance. This means that someone who is PGA Tour average in long game shots, but produces an elite round of 4.0 long game strokes gained, should be expected to play about 0.3 strokes above average in long game shots the next round. That seems like a small difference, but it’s enough of a shift in talent to move a player from around 50th best in the world to about 30th best in the world.
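That 12:1 weighting works out to a simple weighted average; a quick sketch (the helper name is mine):

```python
def updated_long_game(prior_est, rd1_sg, prior_weight=12.0):
    """Blend a prior long-game estimate with one round of strokes gained,
    weighting the prior twelve times as heavily (per the regression above)."""
    return (prior_weight * prior_est + rd1_sg) / (prior_weight + 1.0)

# A Tour-average long game (0.0) plus an elite 4.0-stroke round:
est = updated_long_game(0.0, 4.0)  # ~ +0.31 strokes/round
```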

The Takeaway:

Based on these results, it looks like 1. a single round of performance is much less predictive than an estimate built on past-performance and 2. the small amount of signal contained in single rounds is from performance on tee shots and approach shots. Putting results from one round provide no more information than was available before the round. On the other hand, golfers who play particularly well on tee shots and approach shots in a round should perform slightly better than expected the following round.


How Real are Hot Streaks in Golf?

Whenever a golfer goes on a high profile hot streak – think Rickie Fowler since June, Billy Horschel for the FedEx Playoffs, Henrik Stenson at the end of last year – there’s always a ton of talk in the media and among fans about how their new swing/putting stroke finally clicked, or that player is returning to form, or they’re finally mature enough to win, etc. Humans love writing narratives to explain why things happen. The end result of all that talk is that a guy in the middle of a hot streak is considered to be much better than they would’ve been considered before the hot streak. No one thought Billy Horschel was deserving of a Ryder Cup pick a month ago, but now everyone thinks we should toss Webb/Mahan off to make room for him. No one thought Rickie Fowler was one of the 1-2 best American players in May, but now that’s almost assumed. Everyone around golf seems to think that hot streaks are real – that they actually predict who’s going to continue to play well. In this post I’ll provide evidence that shows that hot streaks are retained to a small degree – even months later – but that extreme performances still regress strongly to prior expectations.


I settled on using five week periods to measure performance. My sample was everyone who had recorded at least 8 rounds in a five week period and then recorded at least 8 rounds in the next five weeks. All my data is from the 2011-2014 seasons. The actual metric I used to measure performance was my z-score ratings, which are basically strokes better or worse than the field adjusted for the strength of the field. I compared each player’s z-score over that five week sample to my prior z-score rating. I have a prior rating for every player in my sample generated each week which mostly uses prior performance and very recent play to predict how well a player will play that week. They’re designed to be the most accurate prediction of performance. I subtract the prior expectation from the sample performance to get the change in performance which I’ll call the Initial Delta.

So my metric looks like this:

(Sample performance over 5 weeks) – (Prior expectation) = (Initial Delta)

I generated an Initial Delta for every player who qualified for my sample, yielding over 27,000 separate data points.

I then calculated a Subsequent Delta for every player using the same method only using the next five weeks as my Sample performance and the same Prior expectation used above (meaning I don’t consider any recent results). I then compare the Initial Delta to the Subsequent Delta. If players get hot and stay hot, the two should be strongly correlated. If whether a player has been hot or cold does not predict their subsequent performance, the two will not be correlated.

tl;dr of the above: I’m comparing how much better/worse a guy played over an initial five weeks to how much better/worse he played over the next five weeks.
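In code, the bookkeeping is just two subtractions against the same prior (in my z-score ratings negative is better than the field, so a hot streak shows up as a negative delta); the numbers below are illustrative:

```python
def deltas(sample1, sample2, prior):
    """Initial and Subsequent Delta, both measured against the SAME
    prior expectation (no updating between the two windows)."""
    return sample1 - prior, sample2 - prior

# A golfer expected at +0.3 who plays -1.5 then -0.8 over the two windows:
initial, subsequent = deltas(-1.5, -0.8, 0.3)  # -1.8 and -1.1
```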


The results show that in general players retain only a small portion of their over- or under-performance. Overall, about 20% of the Initial Delta is retained over the next five weeks. This means that if Billy Horschel played 1.8 strokes better than expected over the last five weeks, he should be expected to play about 0.36 strokes better than previously expected over the next five weeks. Now, 0.36 strokes is a large amount, but it’s not enough to bring him up to Bubba/Fowler/Keegan’s level (here is an example of the distribution of talent among the top 50 in the world). Right now, he should be considered slightly better than Mahan or Webb, but not by some ridiculous amount and certainly not to any degree that’s going to affect the outcome of the Cup.


Looking Further Ahead:

The above shows that hot streaks are retained to some degree over a short period of time, but how much is retained further down the road? Is Billy Horschel going to be able to retain any of the ability he showed in winning the FedEx Cup going into next season? I set up the same study as above, only instead of looking at performance in the next five weeks, I looked over the next four months (16 weeks to be precise). Everything is calculated the same, though I only included players with at least 20 rounds over that four month span.

The results here showed that about 18% of the Initial Delta is retained over the next four months, a similar amount to what is retained over the next five weeks. Golfers who play significantly better than expected over five weeks should perform better than previously expected, but only to a small degree. To give you a sense of when recent performance becomes mostly insignificant, if a player performs 0.5 strokes better than expected over five weeks (basically what Chris Kirk has done in the FedEx Cup Playoffs), he is expected to retain only around 0.1 strokes (which is insignificant, basically a rounding error in predictive terms).


Adjusting Expectations:

I’ve attached a list of the top and bottom ten guys who have most over or under-performed over the last five weeks (PGA/European Tour only).


Obviously Horschel is at the top along with some FedEx Playoff stalwarts like Palmer/Fowler/Day. Ryder Cupper Jamie Donaldson has been killing it over in Europe as well. Among the trailers, Phil’s name sticks out like a sore thumb. The US team has to hope his multiple weeks off can help him rediscover his game before next week. Probably the most terrifying thing is how close Ryan Moore came to making this team – he finished 11th in points, and was only a stroke away from jumping Zach Johnson in points at the PGA Championship. Moore is dead last of 339 pro golfers in terms of performance relative to expectation.

Repeatability of Golf Performance by Shot Type

My main interest in analyzing golf is using past data to most accurately predict future performance. Inherent in that are the questions of how much randomness is affecting the data and how to remove its effects. The easiest way to find how much randomness is involved is to find the correlation between subsequent measures of performance. For example, in this post I found the correlation between a golfer’s performance in various samples of years from 2009-12 and their performance in 2013. Based on that, I concluded that such basic methods produce a correlation of around 0.70 – meaning that in a subsequent season we can expect a golfer to repeat about 70% of their prior performance above or below the mean. I’ve achieved correlations slightly higher than that using more sophisticated methods and more detailed data, but an R of 0.70 should be viewed as the typical baseline for judging the repeatability of golf performance.

In this post, I attempt to find the repeatability of performance on different shot types using the same methodology as above. I will use Strokes Gained Putting to measure putting performance relative to the field, my own Z-Score measure minus Strokes Gained Putting to measure tee to green (driving, approach shots, short game) performance relative to the field, and I’ll use my own Scrambling metric (methodology here) to measure performance on only scrambling shots (a scrambling shot is the first shot following a missed green). I have not stripped these scrambling shots out from the tee to green measure; tee to green measures all non-putting strokes.

I gathered data for all PGA Tour golfers for 2008-2013 who had a qualifying number of rounds played (50). I then paired consecutive seasons for each of my three performance measures and graphed the correlations below.
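The pairing-and-correlating step looks roughly like this (toy data below; the real inputs are the z-score, Strokes Gained Putting, and scrambling measures described above):

```python
import numpy as np

def season_pair_correlation(by_player):
    """Correlate each season with the next for every player.
    by_player maps a player to their chronological season ratings."""
    xs, ys = [], []
    for seasons in by_player.values():
        for a, b in zip(seasons, seasons[1:]):  # consecutive-season pairs
            xs.append(a)
            ys.append(b)
    return np.corrcoef(xs, ys)[0, 1]
```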

repeat teetogreen

repeat putting

repeat scram

For all three graphs negative values represent performances better than the field. Values are expressed in standard deviations above or below PGA Tour average.

The measure that was most strongly correlated from season to season was tee to green performance. That makes intuitive sense as tee to green performance includes the most shots/round of any of my measures (roughly 40/round). In fact, tee to green performance is almost as repeatable as overall performance (R = 0.69 compared to 0.69 to 0.72 in the study linked above). This indicates that a golfer who performs well above or below average in a season should be expected to sustain most of that over or under-performance in the following season.

Putting was less repeatable than the tee to green measure, but skill still shows strongly through the noise. An R of 0.54 indicates that a golfer’s putting performance should be regressed by nearly 50% toward the mean in the following season (provided you only know their performance in that one season). I would mainly attribute the lower correlation to sample size: golfers normally hit 25-30 putts/round, but many of these putts are short gimmes that are converted upwards of 95% of the time, so the number of meaningful putts struck in a round is more like 15. At that rate it takes a golfer over 2.5 seasons of putting to accumulate a season’s worth of tee to green shots. Since putting is less repeatable from season to season than tee to green play, we should be wary of golfers who build their success in a season largely off of very good putting.
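Using R as the regression fraction, the next-season expectation is just the observed number shrunk toward the mean; a one-line sketch with the putting figure above (negative values are better than the field, as in the graphs):

```python
def regressed_estimate(observed, r, mean=0.0):
    """Shrink a single-season rating toward the population mean,
    keeping fraction r of the deviation (r = season-to-season R)."""
    return mean + r * (observed - mean)

# A putter one SD better than average, with putting's R of 0.54:
nxt = regressed_estimate(-1.0, 0.54)  # expect about -0.54 next season
```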

Of the three measures tested, scrambling was the least repeatable (R = 0.30). This indicates that performance on these short shots around the green is very random. It’s not atypical for a golfer to perform as one of the best 10% on Tour one season and average the next. Again, this is likely a function of sample size: a golfer hits only 6-8 scrambling shots/round (every time they miss a green). It takes a golfer around five seasons of scrambling shots to reach one season’s worth of drives/approaches.

There are two important caveats with this approach. These correlations explicitly measure only one season compared to the following season. Measures like scrambling and putting likely are more strongly correlated when several seasons are used as the initial sample. I intend to test this in a future post. In addition, this study only considers golfers who played consecutive seasons with >50 rounds/season. This leaves out a certain sample of golfers who played so poorly in the first year that they were not allowed to play >50 rounds the following season. These stats may be slightly more repeatable if those golfers were included.

Will Jimmy Walker Continue to Putt at an Elite Level?

I got some push-back from Chris Catena on Twitter today about my contention that Jimmy Walker’s recent run of great play was driven by lucky putting. In that post, I showed that Walker had established himself as an above-average, but not elite, putter (strokes gained putting of around +0.25-0.30/round over the last five years). During Walker’s recent run ( Open through Northern Trust Open), he’s putted at a +1.20 level. That +0.9 strokes/round improvement is entirely what carried him to three wins in the last four months. I also contended that Walker continuing to putt at this level is very unlikely, simply because no one ever has for a full season. Moreover, Walker’s best putting season (+0.46) and average putting season (+0.26 from 2009-2013) are far short of the kind of elite, sustained play we often see from the golfers who lead the Tour in strokes gained putting. This post defends those claims in more depth and shows why I think it’s very unlikely that Jimmy Walker will continue putting and playing as well as he has in the last four months.


Above is a graph of Walker’s strokes gained putting performance per tournament in every tournament the PGA Tour has data for since the start of 2012. The red dashed line is a linear trendline of his performance. It has basically zero relationship with the passage of time (R = 0.03), indicating that on the whole, Walker’s putting hasn’t improved over time. This is important to note because if we hypothesize that Walker changed something in his ability to putt, it clicked only weeks after his worst putting stretch of the past 2+ years. Now, poor performance is certainly a motivator to change and try to improve, but a simpler explanation is that Walker got unlucky during the summer and has been riding a combination of luck and improved putting since.

What Walker has done in the past 23 rounds on Tour isn’t unprecedented, even within the 2013 season. I divided the tournaments in 2013 (Hyundai ToC to Tour Championship) into four quartiles with 7-8 tournaments in each. I then found the golfers who had participated in 4+ tournaments in each bucket and averaged their SGP for each quartile. I gathered all golfers with qualifying consecutive quartiles and compared them using Q1->Q2, Q2->Q3, etc. For Q4, I compared it to performance so far in 2013-14, from the Open to the Northern Trust Open. From all that, I had 365 pairs of quartiles where a golfer had played at least four tournaments during each quartile. A graph of those pairs follows.

pairs of SGP quartiles

There was very little relationship between a golfer’s performance in one set of tournaments and their performance in the following set of tournaments (R=0.04, indicating a tenuous at best relationship). I had 61 quartiles with a performance > +0.50, averaging 0.72. Those quartiles played to only +0.12 in the next set of tournaments. In fact, in only 12 of those samples of > +0.50 performance did a golfer again average > +0.50 the next quartile. None of the six samples of > +1.00 SGP had > +0.52 SGP in the following quartile. In short, we should be very skeptical of elite putting performances over fairly short periods of time.

Now, when I said that Jimmy Walker’s performance was largely driven by luck I meant the “largely” part. I think it’s extremely unlikely that all of his putting performance can be explained by variance alone. Jimmy Walker has +1.20 strokes gained putting/round in 23 measured rounds so far this season. The observed standard deviation between 23 round samples for PGA Tour players is around 0.35 strokes. That means if an average (+0.00) putter plays an infinite number of 23 round samples, 68% of them will yield an SGP average of -0.35 to +0.35, while 95% of them will yield an SGP average of -0.70 to +0.70. In short, there’s a ton of variation between 23 round samples. For an average golfer, it wouldn’t be shocking for them to putt extremely poorly or very well over 23 rounds. Plugging that standard deviation (0.35), Walker’s 2013-14 SGP (+1.20) and Walker’s five year SGP average (+0.26) into a Z-score equation yields a Z of 2.7 which indicates <1% chance that Walker’s SGP is entirely due to chance. That means there is some signal in all that noise.
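The z-score arithmetic in that paragraph, spelled out (all numbers from the post):

```python
def z_score(sample_mean, prior_mean, sd):
    """How many sampling standard deviations the observed SGP
    sits above the prior expectation."""
    return (sample_mean - prior_mean) / sd

# Walker: +1.20 observed vs. +0.26 prior, with a 23-round SD of 0.35
z = z_score(1.20, 0.26, 0.35)  # ~2.7
```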

But how much? I consider myself a Bayesian in that I think it’s very important to compare any observed performance to our prior expectation for that performance. Up until October 2013, Jimmy Walker was an above-average, but not elite putter. Since then, in 23 rounds, Walker has putted out of his mind. Surely we should consider Walker a better putter than we did in October, but how much better? Fortunately, there’s a simple equation we can use to estimate how the 23 round sample should change our expectation for him. It’s ((Prior Performance)/(Prior variance) + (Sample performance)/(Sample variance))/((1/Prior variance)+(1/Sample variance)). Basically, this equation tells us how confident, statistically, we should be about a golfer establishing a new level of performance based on how far his performance is from the prior expectation and how large of a sample we’re dealing with.

We know the prior performance and sample performance from the previous paragraph. The sample variance is simply the 23 round standard deviation from above (0.35) squared (0.12). To find the prior variance, I was forced to run some simulations as my data was limited. I know the variance for a 100 round sample is around 0.025, so the prior variance for Walker over his 300+ rounds in 2009-2013 must be no greater than that. Simulations indicated a figure of around 0.02.

Plugging those values into the equation yielded a new expectation for Walker of around +0.40. That’s significantly higher than his five year average, but also much less than what he’s done recently. The equation is saying that Walker’s been much better, but that 23 rounds isn’t nearly enough to say that he should be expected to continue to putt at an elite level. If we had just seen Walker putt at a +1.20 SGP level for 80 rounds, we’d be much more confident in him continuing to putt at an elite level.
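Here is the precision-weighted update from the equation above, plugged in with the post’s numbers; it reproduces the roughly +0.40 figure:

```python
def bayes_update(prior_mean, prior_var, sample_mean, sample_var):
    """Precision-weighted blend of the prior and sample means."""
    num = prior_mean / prior_var + sample_mean / sample_var
    den = 1.0 / prior_var + 1.0 / sample_var
    return num / den

# Prior: +0.26 with variance 0.02; sample: +1.20 over 23 rounds (SD 0.35)
new_est = bayes_update(0.26, 0.02, 1.20, 0.35 ** 2)  # ~ +0.39
```

Note how the small prior variance (a lot of rounds behind it) keeps the estimate much closer to +0.26 than to +1.20; with an 80-round hot streak the sample variance would shrink and the blend would move much further toward the recent level.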

The tl;dr here is that extremely good SGP performances over small samples (~4-8 tournaments) sharply regress to the mean in the following 4-8 tournaments. Sustaining the kind of putting Walker has shown recently is unprecedented over a large sample of rounds from 2013-14. Moreover, the expected level of variance of 23 rounds is very large. It would not be abnormal for an average putter to putt at a top 20 or bottom 20 level over 23 rounds. Considering all that, we should expect Walker to putt better over the rest of the season than he did in 2009-2013, but not nearly as well as he has since October.

Bayesian Prediction of Golfer Performance (Individual Tournament)

I’ve posted several studies attempting to predict golfer performance. This attempted to find the importance of the previous week when predicting the following week. The study was not particularly sophisticated (simple linear regression), but the results indicated that the previous week’s performance should be valued at around 10% of the projection for the golfer the following week (the other 90% being two-year performance). This other study attempted to predict golfer performance for an entire season using prior-season data. That study found that no matter how many years are used or whether those years are weighted for recency, the resulting correlation is ~70%. Doing better than that for full-season prediction would require an additional level of sophistication beyond aggregating prior seasons or weighting data for recency.

This post, however, concerns predicting individual tournament performance using my Bayesian rankings. These rankings are generated each week by combining prior performance and sample performance using the equation ((prior mean/prior variance)+(observed mean/observed variance))/((1/prior variance)+(1/observed variance)). In this way, each golfer’s prediction for a week is updated when new information is encountered. The prior mean for a week is the Bayesian mean generated the prior week. My rankings also slowly regress to a golfer’s two-year performance if he is inactive for a period of weeks. For each week, the prior mean is calculated using the equation (((Divisor – (Weeks since competed)) / Divisor) * (Prior Mean)) + ((1 – ((Divisor – (Weeks since competed)) / Divisor)) * (Two-year Z-Score)). I use 50 as the Divisor, which weights two-year performance at 2% after 1 week off, 27% after 5 weeks off, and 69% after 10 weeks off.
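The post gives the one-step formula; reproducing the quoted weights (2% / 27% / 69%) requires applying it recursively each idle week, with the step size growing as (weeks since competed) / Divisor. A minimal sketch under that interpretation:

```python
# Regression toward two-year performance during inactivity, applied week by
# week. Each idle week k moves the prior a share k/divisor toward the golfer's
# two-year z-score; the shares compound across weeks.

def two_year_weight(weeks_off, divisor=50):
    """Cumulative weight on the two-year z-score after `weeks_off` idle weeks."""
    prior_share = 1.0
    for week in range(1, weeks_off + 1):
        prior_share *= (divisor - week) / divisor
    return 1 - prior_share

for weeks in (1, 5, 10):
    print(weeks, round(two_year_weight(weeks), 2))  # 0.02, 0.27, 0.69
```

The compounding is what makes the two-year weight accelerate: a single week off barely moves the prior, but ten consecutive weeks off leave it mostly replaced.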

To measure how predictive these rankings were, I gathered data for all golfers who had accumulated 100 rounds on the PGA, European, or Challenge Tour between 1-2010 and 7-2013. My sample was 643 golfers. I then examined performance in all tournaments between 3-28-2013 and 8-8-2013. My sample was 6246 tournaments played. I then generated Bayesian rankings predicting performance before each of these tournaments. The mean of my predictions was +0.08, indicating I expected the sample to be slightly worse than PGA average. I then compared each prediction to the golfer’s actual performance.

The table below shows the performance of Bayesian and pure Two-year predictions by including all predictions within +/- 0.05 from the displayed prediction (ie, -0.50 includes all predictions between -0.45 and -0.55). The accompanying graph shows the same information with best-fit lines.



Obviously, the Bayesian and Two-year predictions perform similarly. To test which is better, I compared mean square error, which shows how closely the predictions matched actual performance. I also included “dumb” predictions, which simply predict that all rounds will perform to the mean of all predictions (+0.08 for Bayesian, +0.055 for Two-year). The “dumb” predictions are the baseline for judging any predictions; if a prediction can’t beat them, it’s worthless.
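The evaluation reduces to a simple comparison. This sketch uses made-up predictions and results, not the study’s data, just to show the mechanics of scoring a prediction set against its “dumb” baseline:

```python
# Mean square error of a prediction set versus a "dumb" baseline that
# predicts every golfer at the mean of all predictions. All values invented.

def mse(predictions, actuals):
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

preds   = [-0.50, -0.10, 0.08, 0.30, 0.60]   # hypothetical Bayesian predictions
actuals = [-0.20, -0.40, 0.55, 0.10, 0.90]   # hypothetical observed z-scores

baseline = sum(preds) / len(preds)           # one "dumb" value for everyone
dumb = [baseline] * len(actuals)

# The predictions add value only if their MSE beats the dumb baseline's.
print(mse(preds, actuals) < mse(dumb, actuals))
```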

The mean square error for the Bayesian predictions was 0.381, versus 0.446 for their “dumb” predictions. The mean square error for the Two-year predictions was 0.389, versus 0.452 for their “dumb” predictions. So both sets of predictions provide value over the “dumb” predictions, and by very similar margins (-0.065 for Bayesian and -0.063 for Two-year).

This study indicates two things: first, using Bayesian methods to predict golfer performance doesn’t substantially improve accuracy relative to unweighted aggregation of the last two years of performance; and second, predicting golfer performance in individual tournaments is very difficult. A mean square error of 0.38 indicates an average miss of 3.5 strokes for golfers playing four rounds and 2.5 strokes for golfers playing two rounds.

The Aging Curve for PGA Tour Golfers (Part III) – Using Bayesian Prior

Several weeks ago I posted two studies on aging among PGA Tour golfers, the most recent of which compared sequential seasons, regressing both seasons to PGA Tour average based on the number of rounds a golfer had played in the seasons. DSMok1 suggested modifying the amount and degree of regression by including a better prior, which makes more sense than regressing every golfer to the same mean. Instead of simply adding 25.5 rounds of average play to each golfer’s season, I found a Bayesian prior based on play in the prior season and measured the change in performance from that prior in the following season.

Sample and Design:

I included every player with >20 PGA Tour rounds in a season for 2010, 2011, and 2012. This limited my sample to 703 seasons. I then gathered data for YR N-1, YR N, and YR N+1 (ie, 2009, 2010, and 2011 for golfers with >20 rounds in 2010) on all major Tours (PGA, European, and Challenge).

Using the equation ((prior mean/prior variance)+(observed mean/observed variance))/((1/prior variance)+(1/observed variance)) I found my prior expectation on performance, inputting data from YR N-1 for prior mean and variance and from YR N for observed mean and variance. That equation adjusts the observed performance based on what we’ve observed in the prior season to generate a true-talent level (True YR N) for YR N. I used the same equation to find the true-talent level for YR N+1: I inputted the prior generated from YR N-1 and YR N as the prior mean and the data for YR N+1 as the observed mean. This produced True YR N+1. I then compared True YR N and True YR N+1 to find the change in true talent for each age group.
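The two-stage calculation can be sketched as two applications of the same update. All means and variances below are invented for illustration; only the structure (YR N-1 prior → True YR N → True YR N+1) comes from the text:

```python
# Two-stage Bayesian true-talent estimate across three seasons.
# All numeric inputs are hypothetical; lower z-scores are better.

def bayes(prior_mean, prior_var, obs_mean, obs_var):
    """Precision-weighted combination of prior and observed performance."""
    return ((prior_mean / prior_var + obs_mean / obs_var)
            / (1 / prior_var + 1 / obs_var))

# Stage 1: YR N-1 as prior, YR N as observation -> True YR N
true_n = bayes(prior_mean=0.10, prior_var=0.03, obs_mean=-0.20, obs_var=0.04)

# Stage 2: True YR N as prior, YR N+1 as observation -> True YR N+1
true_n1 = bayes(prior_mean=true_n, prior_var=0.03, obs_mean=-0.35, obs_var=0.04)

aging_delta = true_n1 - true_n  # change in true talent attributed to the age group
```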

I weighted the results using the harmonic mean of rounds played in YR N and YR N+1. For example, there were 18 golfers for age 26, so I took the sum of the harmonic means of rounds and weighted each golfer’s change in true talent by his share of that total. This produced the total change in true talent due to age for each age group.
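For one age group, the weighting looks like this. The rounds and deltas are invented; the point is that golfers with more rounds in both seasons move the age-group average more:

```python
# Harmonic-mean weighting of true-talent changes within one age group.
# All golfer data below is hypothetical.

def harmonic_mean(a, b):
    return 2 * a * b / (a + b)

golfers = [  # (rounds in YR N, rounds in YR N+1, change in true talent)
    (80, 90, -0.20),
    (60, 30, -0.05),
    (100, 100, -0.12),
]

weights = [harmonic_mean(n0, n1) for n0, n1, _ in golfers]
total = sum(weights)

# Weighted average change in true talent for the age group.
age_delta = sum(w / total * d for w, (_, _, d) in zip(weights, golfers))
```

The harmonic mean penalizes lopsided samples: a golfer with 60 and 30 rounds gets a weight of 40, less than the 45 an arithmetic mean would give, reflecting that the thinner season limits what the pairing tells us.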

If a golfer had no performance in YR N-1 I used +0.08 (slightly below PGA Tour average) as their YR N-1 prior. In most cases, these players qualified via Qualifying School and +0.08 is the observed true-talent for Q-School golfers for 2009-2013. Only 8 golfers had 0 rounds in YR N-1 however.


age    delta    N
20    -0.05    2
21    -0.06    3
22    -0.01    6
23    -0.05    8
24    -0.07    9
25    -0.11    11
26    -0.13    18
27    -0.13    23
28    -0.14    29
29    -0.12    36
30    -0.13    34
31    -0.11    39
32    -0.12    36
33    -0.11    34
34    -0.13    34
35    -0.12    36
36    -0.11    37
37    -0.10    42
38    -0.08    26
39    -0.05    30
40    -0.01    21
41    0.03    35
42    0.07    28
43    0.12    19
44    0.13    17
45    0.15    13
46    0.21    17
47    0.25    19
48    0.31    13
49    0.36    12
50    0.35    9
51    0.45    4
52    0.47    2

bayesian aging


The curve generated is very similar to that of the prior study regressing to a mean of +0.00. The peak is slightly lower and the decline is deeper in the late 40s, but otherwise this study supports my prior conclusion of aging with a peak in the mid 30s and subsequent decline.

The Aging Curve for PGA Tour Golfers (Part II)

Yesterday I posted the results of my study on aging among PGA Tour members. You can read the methodology at the link, but basically it compared pairs of seasons by age to find how much a player should be expected to improve or decline based solely on age (I included a mechanism to regress performance in an attempt to find “true talent”).  At the end I said I’d like to try a different regression mechanism that I hoped would produce a more accurate representation of true talent.

I’ve found before that it’s correct to regress PGA Tour performance around 30% to the mean to find true talent. However, that’s most accurate for golfers who play something like a full season (ie, 50-100 rounds worldwide/season). For regular Tour members, regressing 30% is correct, but for golfers playing only partial seasons it’s likely not regressing enough. A performance over 20 rounds is more likely to be the product of luck than a performance over 60 rounds. That’s problematic for this study because it doesn’t regress more extreme good or bad performances enough to the mean. You’ll see the errors that result when I compare the two studies below.

In prior research comparing sets of rounds [1], I’ve found that adding 25.5 rounds of average (0.00) performance properly regresses a performance to the mean. This means for a player with around 60 rounds, the 30% figure quoted above is accurate. For those playing more, like Brendon de Jonge’s 118 rounds in 2012, regressing 30% is way too much. We know a lot more about de Jonge’s true talent in 118 rounds than we do about, say, Jason Day’s 60 round sample in 2012, enough to regress de Jonge only 18%. Similarly, Hideki Matsuyama’s 26 major tour rounds in 2013 tell us much less about his true talent, and by adding 25.5 rounds of average he gets regressed 50% to the mean.

Sample & Design:

The same sample and methodology as the above quoted study were used, except instead of regressing using the equation True Talent=(.6944*Observed)+0.01, I simply added 25.5 rounds of average performance to every observed performance: True Talent=((Observed Z*Observed N)/(Observed N + 25.5)).
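The 25.5-rounds mechanism is easy to express as a function, and doing so reproduces the regression percentages quoted earlier: ~18% for de Jonge’s 118 rounds, ~30% for Day’s 60, and ~50% for Matsuyama’s 26:

```python
# Regression to the mean by adding 25.5 rounds of average (0.00) performance.

PAD = 25.5  # rounds of average play added to every observed performance

def true_talent(observed_z, observed_n, pad=PAD):
    """Observed z-score shrunk toward 0.00 by `pad` rounds of average play."""
    return observed_z * observed_n / (observed_n + pad)

def regression_share(observed_n, pad=PAD):
    """Fraction of the observed performance regressed away."""
    return pad / (observed_n + pad)

for rounds in (118, 60, 26):
    print(rounds, round(regression_share(rounds), 2))  # 0.18, 0.3, 0.5
```

This is why a fixed 30% regression only happens to be right near 60 rounds: the shrinkage automatically scales with sample size instead of treating every season as equally informative.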

I still did not weight my data.

age         delta      N
19           0.02        3
20           -0.02      2
21           -0.03      4
22           0.01        8
23           -0.03      8
24           -0.01      11
25           -0.06      16
26           -0.02      23
27           -0.01      30
28           -0.01      39
29           -0.03      46
30           0.04        45
31           0.00        49
32           -0.01      44
33           -0.02      43
34           0.04        46
35           0.01        46
36           -0.02      49
37           0.01        51
38           0.04        38
39           0.03        34
40           0.03        38
41           0.05        40
42           0.03        28
43           0.01        27
44           0.04        21
45           0.10        18
46           0.00        28
47           0.03        22
48           0.06        15
49           0.03        16
50           0.02        10
51           0.00        6
52           0.07        2

aging w25.5regression

The smoothed curve averages the improvement of year N-1, N, and N+1.

The results were much different using a more accurate regression mechanism. There is an observed slow improvement in true talent of around -0.02/season (negative values are better) from 19 to 29. Between 30 and 37 the curve is more or less flat, declining almost imperceptibly. Beginning in the late 30s there is a steady decline of around +0.04/season, which was also observed (though to a greater extent) in the previous study.

With this more accurate methodology, I think the previous study can be discarded. There IS age related improvement in a golfer’s twenties. Golfers tend to peak between 29 and 34, with a sharp decline around 38 onwards. This study does not necessarily disprove my prior hypothesis that there is a decline based on lessened commitment to practice/preparation among the more transient PGA Tour members, but it certainly means there is a larger improvement in the 20s being observed among the more permanent members.

[1] This study ordered PGA Tour rounds for a large group of golfers over a full-season from oldest to newest. I then selected two samples – one comprised of the even number rounds and one of odd number rounds – and compared them to see how predictive one half was of the other. I expect to reproduce that study with a larger sample of seasons and golfers soon.

Predicting the Professional Performance of Collegiate Golfers (Part IV)

Earlier this week I posted the latest version of a study measuring how collegiate golfers perform in their first professional season compared to their average Sagarin rating in college. That study used every season of collegiate data, but considering golfers typically improve from freshman to senior year, do the final two seasons of college predict pro performance better than using up to four seasons?

My sample was the same as the previous study linked above, except I used only the final two seasons of collegiate competition. For golfers like Rickie Fowler, who played only two seasons, the observed college performance didn’t change. For others it did.

N=52. Average Sagarin rating=70.47. Average pro performance=+0.15.

college golf regression 4

The results were slightly less predictive (R^2=0.294, R=0.54) than using all four seasons of data (R^2=0.356, R=0.59), suggesting that including the earlier data provides some value in predicting later results. I would guess this is because the college season is so short (around 40 rounds); using four seasons provides twice the sample size and a more reliable observation of performance, even if the overall performance was worse. For the record, using only the final season gives R^2=0.205, R=0.45.