Golf Analytics

How Golfers Win


Putting Driven Performance Changes are Illusory

Last week I posted about how repeatable performance on different shot types was from season to season. Tee to green play is more repeatable than putting which is more repeatable than scrambling. That makes sense once you realize that golfers play 2-3 times more tee to green shots than meaningful putts in a round; there’s just more inherent randomness in a season’s worth of putts than in a season’s worth of tee to green shots. Golfers play even fewer scrambling shots resulting in even more randomness in a season’s worth of scrambling.

Last month I also examined how repeatable small samples (4-8 tournaments) of putting performances are, in the context of discussing why I expected Jimmy Walker’s performance to regress to the mean. That micro-study indicated that there was very little correlation between a golfer’s performance in a 4-8 tournament sample of putts and the following 4-8 tournament sample of putts. On the whole, performances in such short samples regress almost entirely to the mean.

Those two lines of inquiry led me to examine whether putting was more random than tee to green performance. I have always believed that improvements/declines that were driven by over-performance in putting were less real than those driven by tee to green over-performance, but I had never actually tested that hypothesis. The key question is whether changes in performance driven by putting are less persistent than those driven by tee to green play. That is, when a golfer performs better over the first half of a season, and much of the improvement can be traced back to an improvement in his putting stats, will that golfer continue to perform better in the second half of the season? The evidence says changes in performance driven by putting are more illusory than changes in performance driven by tee to green play.

Design:

I gathered the tournament by tournament overall, tee to green, and putting performances of all PGA Tour golfers in rounds measured by the ShotLink system for 2011-Present. I divided those rounds into roughly half-season chunks (January-May 2011, May-November 2011, January-May 2012, May-November 2012, January-May 2013, May-September 2013, October 2013-Present). Each chunk included around 15-18 tournaments. I considered all golfers who recorded at least 20 rounds in consecutive half-season chunks.

To measure putting performance I used the PGA Tour’s Strokes Gained Putting stat and to measure tee to green performance I used my own overall ratings with putting performance subtracted out. This methodology is consistent with my measurement of tee to green performance in numerous recent posts.

Half-Season Correlations by Shot Type:

First, I measured how repeatable putting and tee to green performance was between half-season samples, much like the full-season samples used in this study. I included all golfers with at least 20 rounds in consecutive half-season samples and compared each half-season to the half-season that directly followed, including 2nd halves to 1st halves of following calendar years. This yielded samples of ~800 golfers for both tee to green and putting. Graphs are below.

half tee to green

half putting

Tee to green performance was again more repeatable than putting performance. In the study linked above consecutive full-seasons of tee to green performance were correlated at a R=0.69 level. I found a correlation of R=0.62 between consecutive half-seasons, understandably less given the smaller number of rounds/shots played. The full-season correlation for putting was R=0.55. Half-season putting performances were similarly less correlated than full-seasons at R=0.40. Both these findings are consistent with the understanding that randomness between samples increases when fewer rounds/shots are compared. Most importantly, putting is less repeatable than tee to green play.
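The R values above are ordinary Pearson correlations between paired half-season samples. A minimal sketch of that calculation follows; the function name and the five made-up golfers are assumptions for illustration, not the data behind the graphs.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up strokes relative to Tour average for five golfers (not real data).
first_half = [-0.40, -0.10, 0.05, 0.30, 0.55]
second_half = [-0.25, -0.20, 0.10, 0.15, 0.40]

r = pearson_r(first_half, second_half)
print(f"R between consecutive half-seasons: {r:.2f}")
```

Each pair here is one golfer's performance in a half-season and in the half-season that directly follows, which is how the samples above were matched.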

Persistence of Changes in Performance by Shot Type:

Next, I measured how persistent changes in performance are when considering putting and tee to green play. That is, when a golfer improves their putting over a half-season sample, how much of that performance is retained in the following half-season? If 100% of the performance is retained, changes in putting performance over a half-season entirely represent a change in true talent. If 0% of the performance is retained, changes in putting performance over a half-season entirely represent randomness. The same for tee to green play. My assumption was that a larger percent of performance would be retained for tee to green play than putting, meaning that half-season samples of putting are more affected by randomness than half-seasons of tee to green play.

To measure the effect, I first established prior expectations of performance for every golfer in my sample. I simply averaged performance in tee to green play and putting for the three years prior to the beginning of each half-season sample. For example, for the May-November 2011 sample, I averaged play between May 2008 and May 2011. This is not an ideal measure of performance, but it provides a consistent baseline for comparisons to be made.

I removed all golfers from the sample who had no prior performances. This reduced my sample to around 750 consecutive half-seasons.

The values I compared were the initial delta (Prior minus 1st Half-season) and the subsequent delta (Prior minus 2nd Half-season). Using this method I can find how persistent a change in performance is between two half-seasons. I did this for both putting and tee to green play. Graphs are below.

persist tee to green

persist putting

Changes in tee to green play were twice as persistent as changes in putting play, meaning golfers who improved their tee to green play retained twice as much of those improvements as golfers who improved a similar amount in putting. Golfers maintained around 60% of their tee to green improvements, but only 30% of their putting improvements. This indicates that putting performances regress more sharply to prior expectations than tee to green performances.
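One way to read the "percent retained" figures is as the slope of a line fit through the (initial delta, subsequent delta) pairs. The sketch below assumes an ordinary least squares fit with an intercept, which may not match exactly how the graphs were fit; the helper names and the four golfers are invented for illustration.

```python
def deltas(prior, first_half, second_half):
    """Initial and subsequent deltas, both measured as prior minus half-season."""
    return prior - first_half, prior - second_half

def ols_slope(xs, ys):
    """Slope of the least squares line of ys on xs (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Made-up golfers: (prior, 1st half-season, 2nd half-season), not real data.
# The 2nd halves are constructed so exactly 60% of each change is retained.
golfers = [
    (0.0, -0.5, -0.30),
    (0.1, 0.3, 0.22),
    (-0.2, -0.1, -0.14),
    (0.3, 0.0, 0.12),
]

pairs = [deltas(*g) for g in golfers]
initial = [p[0] for p in pairs]
subsequent = [p[1] for p in pairs]
persistence = ols_slope(initial, subsequent)
print(f"share of the initial change retained: {persistence:.2f}")
```

A slope near 1 would mean the change was entirely real, and a slope near 0 would mean it was entirely noise.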

Are Putting Performances More Illusory?

Finally, I gathered the data from above to measure whether changes in performance driven by putting are less real than changes in performance driven by tee to green play. I ran a linear regression using the initial delta for overall performance and the initial delta for putting performance as independent variables and the subsequent delta for overall performance as the dependent variable. In short, given a certain overall change in performance and a certain change in putting performance over the first half-season, how much of that overall change in performance is retained over the second half-season?

As the following table shows golfers retain much more of their improvement or decline when that improvement or decline occurred in tee to green shots than if it occurred in putting. The columns show improvements/declines in overall play (considering all shots) and the rows show improvements/declines solely in putting. The table shows that a golfer who improves overall by 0.50 strokes will retain only a quarter of their improvement if all of the improvement was due to putting (0.50), while they will retain over half of their improvement if none of the improvement was due to putting (0.00). The equation used to produce this chart is Subsequent Delta = (0.56 * Initial Overall Delta) – (0.28 * Initial Putting Delta).

delta comparisons
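The two cases described above can be reproduced directly from the regression equation. This is a small sketch that plugs numbers into the stated coefficients (0.56 and 0.28); the function names are my own and the grid of putting deltas is just an illustrative choice.

```python
def subsequent_delta(overall_delta, putting_delta):
    """Subsequent Delta = 0.56 * Initial Overall Delta - 0.28 * Initial Putting Delta."""
    return 0.56 * overall_delta - 0.28 * putting_delta

def retained_share(overall_delta, putting_delta):
    """Fraction of the initial overall change expected to carry over."""
    return subsequent_delta(overall_delta, putting_delta) / overall_delta

overall = 0.50  # strokes of overall improvement in the first half-season
for putting in (0.00, 0.25, 0.50):
    share = retained_share(overall, putting)
    print(f"putting delta {putting:.2f}: retains {share:.0%} of {overall:.2f} strokes")
```

With all of the 0.50 stroke improvement coming from putting the retained share is 28%, and with none of it coming from putting the share is 56%, matching the "quarter" and "over half" described above.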

Discussion:

These findings should fundamentally alter how we discuss short-term changes in performance. I’ve already shown repeatedly that performances better than prior expectation will regress to the mean over larger samples. That idea is consistent across sports analytics. However, these findings indicate that the amount of regression depends on which part of a golfer’s game is improving or declining. Golfers who improve on the basis of putting are largely getting lucky and will regress more strongly to the mean than golfers who improve on the basis of the tee to green game. Those who improve using the tee to green game are showing more robust improvements which should be expected to be more strongly retained.

The golfers who represent either side of this for the 2014 season are Jimmy Walker and Patrick Reed. I’ve discussed both in the past month, alluding to how Walker’s improvements were almost entirely driven by putting and how Reed’s were mostly driven by tee to green play. Based off these findings, Reed is more likely to retain his improvements over the rest of the season, all else being equal, than Walker.

 

All graphs/charts are denominated in strokes better or worse than PGA Tour average. Negative numbers indicate performances better than PGA Tour average.


An Aging Curve for Putting

In prior posts on the PGA Tour aging curve I’ve established that golfers tend to peak in their late 20s and sustain that peak for nearly a decade. They begin to decline, on average, in their late 30s and their skills degrade far below where they started in the early 20s. In short, golfers experience a small and steady increase in performance in their twenties before suffering a large and steady decrease in performance in their forties. However, all of those studies considered performance in the aggregate – driving, approach shots, the short game, and putting – which prevents deeper analysis of why golfers improve slightly before declining greatly. This post attempts to construct a typical aging curve for PGA Tour golfers’ putting games.

Initial Thoughts:

I anticipated that golfers would noticeably improve their putting games in their twenties; they would learn to read greens better and approach putts using a more optimal strategy gained through experience. I then anticipated that they would decline by age 40. This decline is suggested by the constant references to age-related putting yips. Because putting is not the main driver of performance differences between PGA Tour players, I expected that the age-related putting improvements and declines would be modest relative to overall gains (the general aging curve for overall performance shows a per year improvement of ~0.01 standard deviations from age 20 to 30 and a per year decline of ~0.04 standard deviations from age 38 to 50).

Design:

As with my prior aging curve work, I’m using the delta method which measures the change between Year 1 and Year 2. Mitchell Lichtman explains the concept in this article and a general Google search for delta method aging curve provides more information.

The major impediment to this study is the consideration of survivor bias. The only accurate measure of putting skill is the PGA Tour’s strokes gained putting (SGP) statistic. This stat is the same as I use in general for my overall performance analyses, except it’s denominated in strokes instead of standard deviations. However, the PGA Tour only gathers the data to calculate SGP in regular PGA Tour events (not majors, events outside the United States, events opposite other PGA Tour tournaments, or Web.com Tour events). This means that golfers who played on the PGA Tour in Year 1, but not in Year 2, would not have an SGP measure in Year 2 from which to calculate the delta. When I ended up forming my sample, roughly a quarter of seasons that qualified in Year 1 did not qualify in Year 2.

I included in my sample all golfers who recorded at least 30 measured rounds (rounds where the ShotLink system was available to calculate SGP) in both Year 1 and Year 2. The years used were 2008-09, 2009-10, 2010-11, 2011-12, and 2012-13. 1021 seasons met the criteria for Year 1, while 769 met the criteria for both Year 1 and Year 2 and were included in the sample. These included seasons averaged an SGP of 0.02 (above-average) and averaged 69 measured rounds. 252 seasons did not meet the criteria and were discarded from the sample. These seasons averaged an SGP of -0.04 (below average) and averaged 54 rounds played. This suggests that on average those included in the sample were better putters and likely better golfers overall.

Results:

My results showed a very slight increase in putting skill in the twenties, followed by a steady decline beginning in the mid-thirties. A graph of the curve follows with a smoothed aging curve in blue. I smoothed the curve using a weighted average of the two years before and after the age in question.
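For readers unfamiliar with the delta method, here is a rough sketch of the general idea: average the year-over-year change at each age, chain those averages into a curve, then smooth with a five-year weighted window. The function names, the (1, 2, 3, 2, 1) weights, and the sample seasons are all assumptions for illustration; the actual weights used for the graph are not stated here.

```python
from collections import defaultdict

def delta_method_curve(paired_seasons):
    """Chain average Year 1 -> Year 2 changes by age into a cumulative curve.

    paired_seasons: iterable of (age_in_year1, sgp_year1, sgp_year2).
    The curve is relative to the youngest age in the sample (defined as 0).
    """
    changes = defaultdict(list)
    for age, year1, year2 in paired_seasons:
        changes[age].append(year2 - year1)
    curve = {}
    running = 0.0
    for age in sorted(changes):
        running += sum(changes[age]) / len(changes[age])
        curve[age + 1] = running
    return curve

def smooth(curve, weights=(1, 2, 3, 2, 1)):
    """Weighted average over the two ages before and after each age.

    The weights are hypothetical; missing neighbors are simply skipped.
    """
    smoothed = {}
    for age in sorted(curve):
        total = 0.0
        weight_sum = 0.0
        for offset, weight in zip(range(-2, 3), weights):
            neighbor = age + offset
            if neighbor in curve:
                total += weight * curve[neighbor]
                weight_sum += weight
        smoothed[age] = total / weight_sum
    return smoothed

# Made-up paired seasons (age in Year 1, SGP Year 1, SGP Year 2), not real data.
sample = [(25, 0.0, 0.1), (25, 0.2, 0.1), (26, 0.1, 0.0)]
curve = delta_method_curve(sample)
print(smooth(curve))
```

This sketch does nothing about survivor bias; it only uses golfers who have both seasons, which is exactly the limitation discussed in the design section.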

SGPAgingCurve

What surprised me was the small size of the improvement and decline. Recall in terms of overall performance golfers improve by around 0.01 standard deviations each season between age 20 and 30. The overall improvement in putting performance up to a peak in the early thirties is equal to one season’s worth of overall performance improvement. Putting improvements are a very minor part of the age-related improvement of golfers in their twenties.

Similarly, the general age-related performance decline per season from the late 30s is roughly 0.04 standard deviations. The total decline attributable to putting is only 0.07 standard deviations. I can only conclude, again, that putting does not form a significant part of the age-related declines in golfers.

Further Discussion:

When I initially observed these results I guessed that survivor bias was distorting the results somehow. In my first foray into constructing an aging curve, I failed to properly account for survivor bias and my result was an aging curve that was largely flat until the mid-thirties before declining steeply. That graph looks a lot like the one linked above.

To test whether survivor bias was affecting my results I constructed another overall aging curve using only the golfers and seasons used for this study (in fact, I also only included the results from rounds played on ShotLink courses). The same sample of 769 seasons was used, using the z-score method to measure performance on all strokes. The graph this study produced is linked below, smoothed using the five year weighting method described above. In red is performance on all strokes, in blue is performance on only putting strokes, and in green is performance on non-putting strokes.

SGPAgingCompared

The overall performance looks almost identical to my aging curves that incorporated measures to eliminate the impact of survivor bias. Overall performance shows a small steady improvement through the early thirties followed by a steady decline from the mid-to-late thirties. More importantly, this graph shows the impact of putting on overall improvement and decline. In short, there is very little impact. Almost 100% of the improvement up to age 30 is due to non-putting strokes and over 80% of the decline experienced from age 39 to 50 is in non-putting strokes.

This suggests that putting performance changes very little during a golfer’s career. While overall performance declines by 1.50 strokes on average from peak to age 50, putting performance declines by less than 0.25 strokes on average over the same period.

This suggestion has interesting implications. Most importantly, do golfers who rely on their putting for success decline differently than golfers who rely more on their long game? I’ll try to answer that in a future post.

Review of Every Shot Counts – Mark Broadie

Mark Broadie’s Every Shot Counts: Using the Revolutionary Strokes Gained Approach to Improve Your Golf Performance and Strategy (2014) is the long-awaited full-length explanation of his strokes gained research. Broadie had published numerous academic papers discussing his strokes gained method and the PGA Tour has been showing the Strokes Gained Putting stat for a few years, so much of this material is merely rehashed from articles others have written or from his 2011 paper “Assessing Golfer Performance on the PGA Tour”. It’s well known that Broadie’s research has disproved putting as the most important part of the game and has elevated the long-game (driving and long approach shots) in its place, but where this book shines is in its lessons for applying this new knowledge to actually playing the game, whether you’re a pro, advising a pro, or an amateur.

Broadie spends the first six chapters basically explaining the strokes gained method. He covers why putting is overrated, why traditional putting statistics are worthless, and how the strokes gained method works. He then introduces strokes gained for the tee-to-green game. Broadie establishes why the long-game is so important in separating elite pros from average pros, average pros from good amateurs, and good amateurs from 90 handicappers; in the process he shows why Tiger dominated golf so much in the last decade (he was good at everything, but the #GOAT at playing long approach shots). This part of the book is worthwhile for the more in-depth exploration of the strokes gained method, but if you’ve read his academic work feel free to skim it for the handful of insights.

Essentially, Broadie’s work is about how fractional strokes are so important in separating pro golfers. The best and worst golfers in a PGA Tour tournament are separated by 2.5 strokes/round. Most of that separation is manifested in things like hitting an extra green each round, driving the ball five yards further, leaving your shots from the sand a foot closer, and/or hitting a single approach shot within birdie range. His research argues for a strategy that considers all possible shots and outcomes of those shots, and selects the highest expected value play. In this recent interview, Broadie says that most golfers don’t play aggressively enough; they leave putts short of the hole, lay-up on par 5s, and hit woods/hybrids off the tee when they’d be better served hitting driver. Central to his work is the idea that being much closer to the hole is worth playing out of the rough/fairway bunker.

Broadie finally explores new ground in the final three chapters, laying out how this new knowledge should be applied to all aspects of the game. In Chapter Seven, Broadie explores what the strokes gained analysis means for putting and how to figure out how aggressive to be on long putts. He explains that many PGA Tour golfers aren’t aggressive enough with their putting; they often purposefully don’t hit putts with enough force to get to the hole, ensuring that they miss the putt. There’s a lot of work in this chapter figuring out optimal aim points from different locations on the green; very interesting work for amateurs who are looking to improve their strategy on the green.

In Chapter Eight he explores how to optimize your long game to shave wasted strokes. Much of the chapter is spent on figuring out the proper way to target drives to ensure you miss the dangerous hazards (water, out of bounds), even if you are forced to play less from the fairway. This section would be very useful for amateurs who often find themselves wasting strokes off the tee by not being cognizant of where the dangerous areas of the course are. Broadie also spends this chapter detailing why lay-ups are typically a minus EV play – particularly notable this week after the way Patrick Reed laid-up so poorly down the stretch at Doral.

Chapter Nine is a detailed look at numerous different practice methods that use the ideas behind valuing each stroke and playing the highest EV game. I mainly skimmed these, but amateurs might find the lessons/games useful for improving their play.

I did pick up a few interesting lessons:

1. It’s well established that those who hit for more distance off the tee usually hit fewer than average fairways, but Broadie has actually found that longer players have a smaller degree of error in terms of how off-line they hit their shots. In fact, the only reason many long hitters hit fewer fairways is because when they do hit a shot with a larger than average degree of error, the increased distance causes it to fly/roll further off-line – into bunkers and the rough. Driving is basically a geometry problem where a smaller angle and larger hypotenuse can produce a larger miss.

2. Broadie introduces the concept of “median leave” in Chapter Five. The PGA Tour publishes stats showing the average proximity to the hole from approach shots, the rough, green-side bunkers, etc. However, Broadie argues we should use the median proximity instead because it’s not distorted by larger misses (like when you fly the green and leave it 50 yards from the hole). Median leave is simply the distance remaining to the pin after the shot divided by the distance to the pin before the shot. So a 150 yard approach to 18 feet would be a median leave of 4%. The best approach shot players have a median leave of 5.5% – equivalent to hitting it a median proximity of 29 feet from the average PGA approach shot (175 yards).

3. When discussing optimal driving strategy he explains the idea of “shot pattern”. Your shot pattern is all the possible results of each type of shot, considering the distance you can hit with a club, the degree of accuracy, and any spin/fade/draw/slice/etc. you can play. Golf is a game where each swing is essentially slightly random – a golfer might swing perfectly, contact the ball perfectly, judge the wind perfectly, and get the right amount of spin when he lands it on the green, but more likely his swing will be slightly off or he’ll mishit it slightly or the wind will push it offline a bit, or it will roll-out when it hits the green. The optimal golfer will know their 95% confidence interval for a 125 yard wedge shot, their average degree of miss when they hit driver, and all the possible results of an approach shot if the greens are firmer than expected. The optimal golfer will play their shots with all that understanding and avoid playing shots that are excessively conservative or needlessly risky.
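The geometry in the first lesson and the leave arithmetic in the second can be checked with a few lines. This is only a sketch: the function names are my own, and the 300 yard/3.0 degree and 270 yard/3.2 degree drives are invented numbers, not figures from the book. The 150 yard and 175 yard examples are the ones quoted above.

```python
from math import radians, sin
from statistics import median

def lateral_miss(distance_yards, offline_degrees):
    """Yards off-line for a shot of a given length hit at a given angle of error."""
    return distance_yards * sin(radians(offline_degrees))

def leave_pct(start_yards, remaining_feet):
    """Distance remaining after the shot as a percent of the starting distance."""
    return (remaining_feet / 3.0) / start_yards * 100

def median_leave_pct(shots):
    """Median leave over (start_yards, remaining_feet) pairs."""
    return median(leave_pct(start, remaining) for start, remaining in shots)

# Hypothetical long hitter (smaller angle) vs shorter hitter (larger angle).
long_miss = lateral_miss(300, 3.0)
short_miss = lateral_miss(270, 3.2)
print(f"long hitter misses by {long_miss:.1f} yds, shorter hitter by {short_miss:.1f} yds")

# The examples from the text: 150 yards to 18 feet, and a 5.5% leave from 175 yards.
print(f"{leave_pct(150, 18):.1f}% leave")
print(f"{175 * 0.055 * 3:.1f} feet remaining")
```

Even with the smaller angle of error, the hypothetical long hitter ends up further off-line, which is the "smaller angle, larger hypotenuse" point in the first lesson.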

All in all, it’s a worthwhile book if you’re interested in applying Broadie’s research to your golf game or at least interested in how a pro might apply it to how they work around a golf course. Broadie has plenty of evidence of some of the elite golf instructors already using this kind of stuff to help their clients excel. On the other hand if you’re just interested in the research itself, reading the literature I linked above is sufficient. His initial six chapters don’t provide a substantial amount of expansion on his earlier papers.

Thoughts on Patrick Reed (Without Using “Confident” or “Cocky”)

Patrick Reed won a tournament yesterday – his third win since August – and in the process delivered pre-round and post-round interviews where he said he thought he was a top five player in the world. There’s been a lot of bullshit spewed already about his comments so I’m going to try to avoid any of that. I am going to lay out some reasons for and against the idea of Patrick Reed being an elite golfer, with the knowledge that anyone who thinks they know for sure is full of it.

Reasons to Doubt

The main argument against Reed being elite is his aggregate play up to this point in his career. Going beyond his three wins in 51 starts, when you consider all of his rounds (not just the ones since August), Patrick Reed’s performance hasn’t been much different than an average PGA Tour cardholder. I have 180 rounds for him between the Web.com Tour and PGA Tour going back to before he turned pro in 2011. In those rounds he’s played to just barely above the level of an average cardholder (-0.17). 180 rounds isn’t the definitive picture of a golfer, but it tells us that in general he’s been essentially average over a fairly large sample of results over mostly the last three seasons.

As I just wrote in a piece last week, the first two months of the season, when considered alongside the last two years of data, provide little extra information about how a golfer will perform going forward. Reed played to a rating of -0.08 in all rounds prior to January 1st 2014 and he’s played to a rating of -0.81 (over 2 strokes better) in 24 rounds since then. In general, past performances have shown that we should place about 3.5 times as much weight on those prior rounds compared to the rounds from the beginning of the season. Using this line of thinking, Patrick Reed should be considered an above-average PGA Tour player, but no better than Kevin Chappell or Russell Knox or other young guys who no one pays an extra second of attention.

Now some might point to his age saying that plenty of young players break-out in their early to mid 20s. I re-ran the study from the piece linked above to factor in age. The methodology is outlined in the piece, but basically I used a regression analysis to predict performance from March to December of a season using the January/February performance as one variable and the previous two full seasons as another variable. I ran the analysis this time using all seasons from age 27 and younger, age 28 to 38, and age 39 and older. I used the age 27 cut-off because that is where my prior aging studies have shown general age-related improvement halts.

In fact, the age of the player does affect how strongly we should believe in early season improvements/declines, though the evidence still favors the prior two seasons. For the age 27 and under group, the weight was about 2.4 times stronger for the prior seasons than the early season form. The weight on the prior seasons was around 5 times stronger for the age 28 to 38 group and nearly 4 times stronger for the age 39 and older group. Consider that all seasons produced a weight of 3.5 times for the prior seasons and it’s clear there’s an effect for younger golfers. So that indicates that we should believe more in early season improvements for young players, but that we should still defer heavily to the prior performance data. Using this method to project Patrick Reed, I’d compare his abilities now to Billy Horschel or Harris English. Plenty of folks think they’re very good players, but no one (bookmakers included) considers them elite by any stretch.

I then set up a regression which attempts to predict the delta of the remaining ten months of performance with age and the delta between the prior two seasons and the first two months as independent variables. The top row of the graph below is the delta between the prior two seasons and the first two months (negative means improvement/positive means decline), while the first column is age. Each cell represents the expected delta between the prior two seasons and the remaining ten months of the season based on a golfer’s first two month delta and their age. You can see that younger golfers that outperform their prior two years are expected to retain more of their improvements over the rest of the season than peak aged or past-peak golfers.

ValueHotStartGraphByAge

Reasons to Believe

Now that I’ve laid out the reasons to doubt Reed, here are a few reasons to think that this may be more real than the general model predicts.

1. Reed was an outstanding amateur golfer, especially during his final two seasons in college. The gold-standard for measuring collegiate golf performance is Jeff Sagarin’s rankings. Sagarin uses a method that compares who you beat/lose to in the same tournament and how much you beat/lose to them. College golf doesn’t provide a huge sample of results – a golfer might complete 40 rounds during a season – but it works in general. During Reed’s two seasons at Augusta State, he played 20 tournaments and finished 4th and 9th in the nation in Sagarin’s rankings (and led Augusta State to two straight NCAA Championships). Fewer than ten others have finished with a better average rank in college than Reed (including Bill Haas, Ryan Moore, and Dustin Johnson). Elite college performance at least establishes that Reed isn’t coming out of nowhere; this guy was lighting tournaments up in college.

2. I don’t consider Monday qualifier results in my database. The data is provided by local PGA chapters using multiple spellings of names and it’s generally a hassle to collect. For most guys that wouldn’t be a huge issue, but Reed was 6/8 in Monday Qualifiers in 2012, earning his way into six tournaments when he had no Tour status. Monday qualifiers are held at nearby courses with around 100 golfers participating (mainly PGA Tour members without the status to enter the tournament directly, Web.com golfers, or minor tour pros). Of those ~100, the best four scores over a single round qualify to enter the tournament. Because only the top four advance, these qualifiers require a golfer to play around the level of peak Tiger Woods for a round to qualify. In short, Reed playing that well in 6/8 qualifiers should inflate his overall rating by a small amount.

3. Most importantly, Reed isn’t getting terribly lucky putting so far this season. Putting drives a lot of luck on Tour – largely because it’s easier to sink an extra eight footer every round for two months than it is to randomly pick up an extra couple yards of driving distance on every hole. When I examined Jimmy Walker’s game a few weeks ago, all of Walker’s improvement in 2014 could be attributed to a strokes gained putting that was inflated nearly a full stroke above his career average. In Reed’s case his putting numbers are slightly higher than his career average, but nothing similar to Walker’s stats.

Entering 2014 he had gained 0.27 strokes on the field through his putting and was basically Tour average in driving/scrambling/approach shots/etc. in his career. So far this season, he’s gained 0.52 strokes from putting and 1.91 strokes from driving/scrambling/approach shots/etc. About 10-15% of his improvement can be traced to his putting and the rest to his driving, iron play, and short game. In short, he’s not relying on a lucky putter like Walker, instead he’s hitting his driver and irons more consistently – leading to more distance, more greens hit, and more birdie opportunities.
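The 10-15% figure follows from the numbers in the paragraph above. A quick check, treating "basically Tour average" as exactly 0.00 strokes for the non-putting part of his career (that value is my assumption, as is the function name):

```python
def putting_share_of_improvement(prior_putting, prior_other, current_putting, current_other):
    """Share of the total strokes gained improvement that came from putting."""
    putting_gain = current_putting - prior_putting
    other_gain = current_other - prior_other
    return putting_gain / (putting_gain + other_gain)

# Career entering 2014: +0.27 putting, roughly 0.00 elsewhere (assumed).
# 2014 so far: +0.52 putting, +1.91 from driving/scrambling/approach shots/etc.
share = putting_share_of_improvement(0.27, 0.00, 0.52, 1.91)
print(f"putting accounts for {share:.0%} of the improvement")
```

That works out to 0.25 strokes of putting improvement out of 2.16 total, a little under 12%.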

This is where I should sum up all the evidence and declare a winner. Is Patrick Reed going to keep winning tournaments, maybe a Major this season? Or is he going to regress to just being another guy grinding for his card? But I don’t really have any idea. I do hope we start getting more post-round interviews that are heavier on bravado than modesty.

What’s a Hot Start Worth?

This weekend marks the eighth weekend of professional golf in 2014 and the beginning of the Florida Swing of the PGA Tour schedule. So far this season, guys like Jimmy Walker, Patrick Reed, and Harris English have started off playing like top 20 players in the world. With almost two months of the year (not to mention five months of the Tour season) in the books, it feels like we’re reaching the point where we can start to tell who’s struggling, who’s excelling, who’s going to contend for a Major, who’s the next Big Thing, etc. This post is designed to throw some water on those ideas. Two months doesn’t tell us very much about how the rest of the season will play out, at least when compared to a much larger sample of past tournaments.

To test how predictive the first two months of the golf season are of the rest of the season, I gathered all players 2010 to 2013 who played at least 50 rounds in the two seasons prior to the season in question (i.e., 2008-09 for 2010, 2009-10 for 2011, etc.) and who played any rounds in the first two months of the season in question and in the remaining calendar year of the season in question. I kept these requirements fairly loose, but tested other combinations. In total, I found 1984 seasons from players on the PGA Tour, European Tour, Web.com Tour, and Challenge Tour. I found the average performance in z-score for the two years prior to the season in question, the first two months of the season in question, and the remainder of the season in question.

To test whether the first two months were predictive of the remainder of the season I first simply found the correlation between the first two months and the remainder of the season. I found a strong correlation (R=0.57) between the two, indicating that the first two months were highly predictive of the rest of the season. However when I examined the correlation between the prior two seasons and the remainder of the season in question (while ignoring the most recent two months), the correlation grew even larger (R=0.68). Note again that this ignores the first two months of the season. That is, if you lock me in a room from New Year’s until March without access to any information about professional golf I will do a better job predicting the season than someone who only relies on who’s playing well in January and February.

Now, obviously you don’t have to ignore one set of data (two year average) in favor of another (two month average). I ran a simple linear regression of the two variables (two year average and two month average) on the rest-of-season average to see if the accuracy would improve. Indeed, including both variables increased the correlation slightly to 0.72, meaning that this model explains over half of the variance in the remainder of the season (this again shows how random golf is). More interesting are the coefficients the regression spits out: Y = (0.72*Two Year)+(0.20*Two Month)+0.04. That is, the two year average is 3.5 times more important than the two month average.
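As a rough illustration, the fitted equation above can be applied directly to a golfer's two averages. The function name and the example golfer below are hypothetical; only the three coefficients come from the regression. As elsewhere on this blog, lower z-scores are better.

```python
def predict_rest_of_season(two_year_z, two_month_z):
    """Apply the regression quoted above: Y = 0.72*TwoYear + 0.20*TwoMonth + 0.04."""
    return 0.72 * two_year_z + 0.20 * two_month_z + 0.04


# Hypothetical golfer: solid two-year average (-0.30), very hot start (-0.80).
prediction = predict_rest_of_season(-0.30, -0.80)
print(round(prediction, 3))
```

Even with a start half a standard deviation better than his two-year level, this made-up golfer is projected to finish much closer to his two-year average than to his hot two months.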

I followed this study up with another that repeated the methodology but winnowed the sample down by restricting it to players with > 100 rounds in the previous two seasons, > 5 rounds in the first two months, and > 19 rounds in the remainder of the season. The results were consistent with what was observed earlier. Using the smaller sample (N=1300 now) slightly improved the predictive strength and also slightly increased the importance of the two year average relative to the two month average.

However, I conducted a further study that showed that drastic improvements in z-score in the first two months were much less predictive of the remainder of the season than in the general sample. Using the stricter sampling method above, I split the seasons into those where the two month average was at least 0.30 standard deviations better than the two year average (basically the sixth of the sample that improved the most), those where the two month average was at least 0.30 standard deviations worse than the two year average (basically the sixth of the sample that declined the most), and the remaining 2/3rds of the sample. I then ran the regression separately on each of the three groups (+0.30, -0.30, and middle).

The results showed that when considering only those who improved the most, you should almost completely ignore what happened in the first two months and rely on the two year average to predict going forward. For the other two groups, the results were largely consistent with what was observed in the previous studies – two year average is roughly 3 times more important than two month average.

Now, I have to stress that the sample of those who improved the most is only 192 seasons and that the standard errors of the coefficients are large (0.11). The confidence interval for the two year coefficient is 0.66 – 1.13, centered on 0.90 while the confidence interval for the two month coefficient is -0.19 to 0.24, centered on 0.03. The standard errors for the previous studies were much smaller (0.02 to 0.03). The finding that two month average should be largely ignored for those who showed the most improvement certainly needs to be tested further with more data.

I am much more confident in the main conclusions, however. When attempting to predict performance over the rest of the season – like who will contend for Majors, Ryder Cup berths, and the FedEx Cup – weigh more heavily how a golfer has played in the prior few seasons than how they’ve started off the calendar year. If that means we pump the brakes a little on Walker, Reed, and English, so be it. And don’t write off Tiger, Kuchar, Poulter, and Luke Donald for a poor couple months. Those guys have shown for years that they belong in the world’s elite; that’s worth more than a cold start.

Regression Rules Everything

This post will be number/graph heavy, but it explains perhaps the most important concept in predicting golf performance – everyone regresses to the mean, no matter their performance. Below are two charts that show this effect in action. The first uses large buckets and compares all players’ performance in seasons with N > 50 rounds with their performance (regardless of N) in the subsequent season. The second shows similar data broken down at a more granular level, and also includes what percentage of seasons meet the criteria. Read the buckets as seasons within 0.05 standard deviations.

initialtosubseqseasons

tableofsubsequentseasons

In the first graph, all golfers better than +0.30 (approximately Web.com Tour average) in year 1 declined in year 2. Those worse (think Challenge Tour average) did not improve or decline, on average. Only those who performed very poorly in year 1 actually improved. For those better than PGA Tour average, the decline was fairly uniform (~0.05 to ~0.10 standard deviations). Remember, these are the aggregation of huge samples; many players improved at all skill levels, but on average regression/decline ruled everything.

In the second graph, the most important lesson is how rare the truly elite seasons are. Only roughly 1/4 of seasons came in below -0.15 (which is roughly the talent level of the average PGA Tour card holder). The cut-off for the top 5% of seasons (2010-2012) came in at -0.45. Also, the regression of almost all players is evident; no bucket better than +0.35 improved in the subsequent season.

This data is fairly strong evidence that we should expect decline from most performances, on average. In fact, based on the rarity of rounds and the demonstrated regression, we should be skeptical about predicting any elite performance to be repeated the following season.

Bayesian Prediction of Golfer Performance (Individual Tournament)

I’ve posted several studies attempting to predict golfer performance. One attempted to find the importance of the previous week when predicting the following week. The study was not particularly sophisticated (simple linear regression), but the results indicated that the previous week’s performance should be valued at around 10% of the projection for the golfer the following week (90% would be the two-year performance). Another study attempted to predict golfer performance for an entire season using prior season data. That study found that no matter how many years are used or whether those years are weighted for recency, the resulting correlation is ~70%. Doing better than that for full-season prediction would require an additional level of sophistication beyond aggregating prior seasons or weighting data for recency.

This post, however, concerns predicting individual tournament performance using my Bayesian rankings. These rankings are generated each week by combining prior performance and sample performance using the equation ((prior mean/prior variance)+(observed mean/observed variance))/((1/prior variance)+(1/observed variance)). In this way, each golfer’s prediction for a week is updated when new information is encountered. The prior mean for a week is the Bayesian mean generated the prior week. My rankings also slowly regress to a golfer’s two-year performance if they are inactive for a period of weeks. For each week, the prior mean is calculated using the equation  (((Divisor – (Weeks since competed)) / Divisor) * (Prior Mean)) + ((1 – ((Divisor – (Weeks since competed)) / Divisor)) * (Two-year Z-Score)). I use 50 as the Divisor, which weights two-year performance at 2% for 1 week off, 27% for 5 weeks off, and 69% for 10 weeks off.
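The two equations above can be sketched in a few lines. The function names are my own, and one detail is an assumption: the inactivity formula is applied once per week off, each time to the previous week's result. That reading is what reproduces the 2%, 27%, and 69% weights quoted above; a single application at 5 or 10 weeks would give 10% and 20% instead.

```python
def bayes_update(prior_mean, prior_var, obs_mean, obs_var):
    """Precision-weighted combination of prior and observed performance."""
    numerator = prior_mean / prior_var + obs_mean / obs_var
    denominator = 1 / prior_var + 1 / obs_var
    return numerator / denominator


def decay_to_two_year(prior_mean, two_year_z, weeks_off, divisor=50):
    """Regress the prior toward the two-year z-score, one step per week off.

    Assumption: the weekly formula is applied cumulatively.
    """
    mean = prior_mean
    for week in range(1, weeks_off + 1):
        keep = (divisor - week) / divisor
        mean = keep * mean + (1 - keep) * two_year_z
    return mean


def two_year_weight(weeks_off, divisor=50):
    """Share of the decayed prior that comes from the two-year z-score."""
    # With prior 0 and two-year 1, the result is exactly that share.
    return decay_to_two_year(0.0, 1.0, weeks_off, divisor)


for weeks in (1, 5, 10):
    print(weeks, round(two_year_weight(weeks), 2))
```

When the prior and the observed sample have equal variance, `bayes_update` simply averages the two means; a noisier observed sample (larger variance) pulls the result back toward the prior.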

To measure how predictive these rankings were, I gathered data for all golfers who had accumulated 100 rounds on the PGA, European, Web.com, or Challenge Tour between 1-2010 and 7-2013. My sample was 643 golfers. I then examined performance in all tournaments between 3-28-2013 and 8-8-2013, a sample of 6246 tournaments played. I then generated Bayesian rankings predicting performance before each of these tournaments. The mean of my predictions was +0.08, indicating I expected the sample to be slightly worse than PGA average. I then compared each prediction to the golfer’s actual performance.

The table below shows the performance of Bayesian and pure Two-year predictions by including all predictions within +/- 0.05 from the displayed prediction (ie, -0.50 includes all predictions between -0.45 and -0.55). The accompanying graph shows the same information with best-fit lines.

BayesianPredictions

BayesianPredictionsGraph

Obviously, the Bayesian and Two-year predictions perform similarly. To test which is better I computed the mean square error of each, which shows how closely the predictions matched actual performance. I also included “dumb” predictions which simply predict that every performance will equal the mean of all predictions (+0.08 for Bayesian, +0.055 for Two-year). The “dumb” predictions are the baseline for judging any predictions. If a prediction can’t beat it, it’s worthless.

The mean square error for the Bayesian predictions was 0.381 and 0.446 for the “dumb” predictions. The mean square error for the Two-year predictions was 0.389 and 0.452 for the “dumb” predictions. So both sets of predictions provide value over the “dumb” predictions, but both perform fairly similarly when compared to the “dumb” predictions (-0.065 for Bayesian and -0.063 for Two-year).
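For readers who want to run the same kind of comparison, here is a minimal sketch of mean square error against the “dumb” baseline. The predictions and results are made-up numbers, not data from the study, and the function names are hypothetical.

```python
def mean_square_error(predictions, actuals):
    """Average squared gap between predicted and actual z-scores."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)


def dumb_baseline(predictions):
    """Predict the mean of all predictions for every golfer."""
    mean = sum(predictions) / len(predictions)
    return [mean] * len(predictions)


# Made-up predictions and results for four tournament starts.
predictions = [-0.4, -0.1, 0.1, 0.4]
actuals = [-0.6, 0.2, -0.1, 0.7]

model_mse = mean_square_error(predictions, actuals)
baseline_mse = mean_square_error(dumb_baseline(predictions), actuals)
print(round(model_mse, 3), round(baseline_mse, 3), round(model_mse - baseline_mse, 3))
```

A negative difference means the predictions beat the baseline, which is how the -0.065 and -0.063 figures above should be read.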

This study indicates two things: first, that using Bayesian methods to predict golfer performance doesn’t substantially improve accuracy relative to unweighted aggregation of the last two years of performance, and second, that predicting golfer performance in individual tournaments is very difficult. A mean square error of 0.38 indicates an average miss of 3.5 strokes for golfers playing four rounds and 2.5 strokes for golfers playing two rounds.

The Aging Curve for PGA Tour Golfers (Part III) – Using Bayesian Prior

Several weeks ago I posted two studies on aging among PGA Tour golfers, the most recent of which compared sequential seasons, regressing both seasons to PGA Tour average based on the number of rounds a golfer had played in the seasons. DSMok1 suggested modifying the amount and degree of regression by including a better prior, which makes more sense than regressing every golfer to the same mean. Instead of simply adding 25.5 rounds of average play to each golfer’s season, I found a Bayesian prior based on play in the prior season and measured the change in performance from that prior in the following season.

Sample and Design:

I included every player with >20 PGA Tour rounds in a season for 2010, 2011, and 2012. This limited my sample to 703 seasons. I then gathered data for YR N-1, YR N, and YR N+1 (ie, 2009, 2010, and 2011 for golfers with >20 rounds in 2010) on all major Tours (PGA, European, Web.com, and Challenge).

Using the equation ((prior mean/prior variance)+(observed mean/observed variance))/((1/prior variance)+(1/observed variance)) I found my prior expectation on performance, inputting data from YR N-1 for prior mean and variance and from YR N for observed mean and variance. That equation adjusts the observed performance based on what we’ve observed in the prior season to generate a true-talent level (True YR N) for YR N+1. I used the same equation to find the true-talent level for YR N+1, with the prior generated from YR N-1 and YR N as the prior mean and the data for YR N+1 as the observed mean. This produced True YR N+1. I then compared True YR N and True YR N+1 to find the change in true-talent for each age group.

I weighted the results using the harmonic mean of rounds played in YR N and YR N+1. For example, there were 18 golfers for age 26, so I summed the harmonic means of rounds across those golfers and weighted each golfer’s change in true talent by their share of that total. This produced my total change in true-talent due to age for each age-group.
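The weighting step can be sketched as below. This assumes "share of the total rounds" means each golfer's harmonic mean of rounds divided by the sum of harmonic means for the age group; the function names and the two example golfers are hypothetical.

```python
def harmonic_mean(rounds_a, rounds_b):
    """Harmonic mean of rounds played in two seasons."""
    return 2 * rounds_a * rounds_b / (rounds_a + rounds_b)


def weighted_change(golfers):
    """Round-weighted average change in true talent for one age group.

    golfers: list of (rounds in YR N, rounds in YR N+1, change in true talent).
    """
    weights = [harmonic_mean(n0, n1) for n0, n1, _ in golfers]
    total = sum(weights)
    return sum(w * change for w, (_, _, change) in zip(weights, golfers)) / total


# Two made-up golfers in the same age group.
age_group = [(60, 60, -0.10), (20, 60, 0.10)]
print(round(weighted_change(age_group), 4))
```

The harmonic mean is dominated by the smaller of the two seasons, so the golfer with only 20 rounds in YR N counts for half as much as the golfer with 60 rounds in both years.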

If a golfer had no performance in YR N-1 I used +0.08 (slightly below PGA Tour average) as their YR N-1 prior. In most cases, these players qualified via Qualifying School and +0.08 is the observed true-talent for Q-School golfers for 2009-2013. Only 8 golfers had 0 rounds in YR N-1, however.

Results:

20    -0.05    2
21    -0.06    3
22    -0.01    6
23    -0.05    8
24    -0.07    9
25    -0.11    11
26    -0.13    18
27    -0.13    23
28    -0.14    29
29    -0.12    36
30    -0.13    34
31    -0.11    39
32    -0.12    36
33    -0.11    34
34    -0.13    34
35    -0.12    36
36    -0.11    37
37    -0.10    42
38    -0.08    26
39    -0.05    30
40    -0.01    21
41    0.03    35
42    0.07    28
43    0.12    19
44    0.13    17
45    0.15    13
46    0.21    17
47    0.25    19
48    0.31    13
49    0.36    12
50    0.35    9
51    0.45    4
52    0.47    2

bayesian aging

Discussion:

The curve generated is very similar to that of the prior study regressing to a mean of +0.00. The peak is slightly lower and the decline is deeper in the late 40s, but otherwise this study supports my prior conclusion of aging with a peak in the mid 30s and subsequent decline.

The Aging Curve for PGA Tour Golfers (Part II)

Yesterday I posted the results of my study on aging among PGA Tour members. You can read the methodology at the link, but basically it compared pairs of seasons by age to find how much a player should be expected to improve or decline based solely on age (I included a mechanism to regress performance in an attempt to find “true talent”). At the end I said I’d like to try a different regression mechanism that I hoped would produce a more accurate representation of true talent.

I’ve found before that it’s correct to regress PGA Tour performance around 30% to the mean to find true talent. However, that’s most accurate for golfers who play something like a full season (ie, 50-100 rounds worldwide/season). For regular Tour members, regressing 30% is correct, but for golfers playing only partial seasons it’s likely not regressing enough. A performance over 20 rounds is more likely to be the product of luck than a performance over 60 rounds. That’s problematic for this study because it doesn’t regress more extreme good or bad performances enough to the mean. You’ll see the errors that result when I compare the two studies below.

In prior research comparing sets of rounds [1], I’ve found that adding 25.5 rounds of average (0.00) performance properly regresses a performance to the mean. This means for a player with around 60 rounds, the 30% figure quoted above is accurate. For those playing more, like Brendon de Jonge’s 118 rounds in 2012, regressing 30% is way too much. We know a lot more about de Jonge’s true talent in 118 rounds than we do about, say, Jason Day’s 60 round sample in 2012, enough to regress de Jonge only 18%. Similarly, Hideki Matsuyama’s 26 major tour rounds in 2013 tell us much less about his true talent, and by adding 25.5 rounds of average he gets regressed 50% to the mean.

Sample & Design:

The same sample and methodology as the above quoted study were used, except instead of regressing using the equation True Talent=(.6944*Observed)+0.01, I simply added 25.5 rounds of average performance to every observed performance: True Talent=((Observed Z*Observed N)/(Observed N + 25.5)).
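A short sketch of the 25.5-round rule follows, with function names of my own choosing. The share of regression is 25.5/(N + 25.5), which reproduces the 18%, 30%, and 50% figures quoted above for 118, 60, and 26 rounds.

```python
REGRESSION_ROUNDS = 25.5  # rounds of average (0.00) play added to every sample


def regression_fraction(n_rounds):
    """Share of the observed performance that is regressed to the mean."""
    return REGRESSION_ROUNDS / (n_rounds + REGRESSION_ROUNDS)


def true_talent(observed_z, n_rounds):
    """True Talent = (Observed Z * Observed N) / (Observed N + 25.5)."""
    return observed_z * n_rounds / (n_rounds + REGRESSION_ROUNDS)


for rounds in (118, 60, 26):
    print(rounds, round(regression_fraction(rounds), 2))
```

The same observed z-score is therefore trusted far more over 118 rounds than over 26, which is the behavior the fixed 30% regression could not provide.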

I still did not weight my data.

Results:
age         delta      N
19           0.02        3
20           -0.02      2
21           -0.03      4
22           0.01        8
23           -0.03      8
24           -0.01      11
25           -0.06      16
26           -0.02      23
27           -0.01      30
28           -0.01      39
29           -0.03      46
30           0.04        45
31           0.00        49
32           -0.01      44
33           -0.02      43
34           0.04        46
35           0.01        46
36           -0.02      49
37           0.01        51
38           0.04        38
39           0.03        34
40           0.03        38
41           0.05        40
42           0.03        28
43           0.01        27
44           0.04        21
45           0.10        18
46           0.00        28
47           0.03        22
48           0.06        15
49           0.03        16
50           0.02        10
51           0.00        6
52           0.07        2

aging w25.5regression

The smoothed curve averages the improvement of year N-1, N, and N+1.
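One way to compute that kind of smoothing is a centered three-point average, sketched here on the first five deltas from the table (ages 19 to 23). Treating the smoothing as a simple unweighted mean, and shrinking the window at the two ends of the age range, are assumptions rather than a description of the exact chart.

```python
def smooth3(values):
    """Centered 3-point moving average; the window shrinks at the ends."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Deltas for ages 19-23 from the table above.
deltas = [0.02, -0.02, -0.03, 0.01, -0.03]
print([round(v, 3) for v in smooth3(deltas)])
```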

The results here were much different using a more accurate regression mechanism. There is an observed slow increase in true talent of around -0.02/season from 19 to 29. Between 30 and 37 the curve is more or less flat, declining almost imperceptibly. Beginning in the late 30s is the steady decline of around 0.04/season that was also observed (though to a greater extent) in the previous study.

Discussion:
With this more accurate methodology, I think the previous study can be discarded. There IS age-related improvement in a golfer’s twenties. Golfers tend to peak between 29 and 34, with a sharp decline from around 38 onwards. This study does not necessarily disprove my prior hypothesis that there is a decline based on lessened commitment to practice/preparation among the more transient PGA Tour members, but it certainly means there is a larger improvement in the 20s being observed among the more permanent members.

[1] This study ordered PGA Tour rounds for a large group of golfers over a full season from oldest to newest. I then selected two samples – one composed of the even-numbered rounds and one of the odd-numbered rounds – and compared them to see how predictive one half was of the other. I expect to reproduce that study with a larger sample of seasons and golfers soon.
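The odd/even comparison in the footnote can be sketched as a split-half correlation. The three golfers and their round-by-round z-scores below are invented for illustration, the function names are hypothetical, and using a Pearson correlation across golfers is an assumption about how "predictive" was measured.

```python
from math import sqrt


def split_half_means(rounds):
    """Mean of odd-numbered rounds (1st, 3rd, ...) and of even-numbered rounds."""
    odd = rounds[0::2]
    even = rounds[1::2]
    return sum(odd) / len(odd), sum(even) / len(even)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented z-scores, ordered oldest to newest, for three hypothetical golfers.
seasons = {
    "golfer_a": [-0.5, -0.3, -0.6, -0.2],
    "golfer_b": [0.1, 0.3, -0.1, 0.2],
    "golfer_c": [0.6, 0.2, 0.5, 0.4],
}
halves = [split_half_means(rounds) for rounds in seasons.values()]
r = pearson_r([h[0] for h in halves], [h[1] for h in halves])
print(round(r, 2))
```

With real data, the lower this correlation is for a given number of rounds, the more rounds of average play need to be added to regress a performance properly.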

Predicting the Professional Performance of Collegiate Golfers (Part IV)

Earlier this week I posted the latest version of a study measuring how collegiate golfers perform in their first professional season compared to their average Sagarin rating in college. That study used every season of collegiate data, but considering that golfers typically improve from freshman to senior year, do the final two seasons of college predict pro performance better than using up to four seasons?

My sample was the same as the previous study linked above, except I used only the final two seasons of collegiate competition. For golfers like Rickie Fowler, who played only two seasons, the observed college performance didn’t change. For others it did.

N=52. Average Sagarin rating=70.47. Average pro performance=+0.15.

college golf regression 4

The results were slightly less predictive (R^2=0.294, R=0.54) than using all four seasons of data (R^2=0.356, R=0.59), suggesting that including the earlier data provides some value in predicting later results. I would guess this is because the college season is so short (around 40 rounds); using four seasons provides twice the sample size and a more reliable observation of performance, even if the overall performance was worse. For the record, using only the final season gives R^2=0.205, R=0.45.