As with recent discussions of NHL metrics, this article looks at whether weighting goals more heavily than other goal attempts can improve soccer betting models. Find out if this method can help your model predict soccer games more accurately than the standard Total Shots Ratio (TSR).
Goals are the final arbiters of a soccer game. The side that scores the most wins, while everything else is incidental. However, the problem for bettors wishing to develop models to help predict future results is that goals in soccer are relatively scarce.
Both a goal and shot approach can be shown to correlate well to predicting future events
Shots, however, are more frequent, so, borrowing from hockey, bettors use shot data to estimate team quality and, from that, to make predictions. Each approach has merit.
Goals are relatively easy to track, although some bettors may wish to remove rarer events such as penalty kicks and own goals. Shots, by contrast, depend on the interpretation of the person recording the event, while blocked efforts and multiple shots within a short time frame may further cloud the issue.
What is indisputable is that both a goal approach and a shot approach (which includes goals as well) can be shown to correlate well with future outcomes, something soccer bettors are starting to recognise.
It is perhaps unfortunate that goals and shots have become separated in the search to project future game outcomes. A shot, or more accurately a goal attempt (a term that makes clear headers are included), is the culmination of events leading to a side trying to score. These attempts have a variety of outcomes: saves, misses, blocks and goals.
Each event, and the rate at which a side allows its opponents to create similar types of chances, potentially tells bettors about the quality of that side. Rather than discard large amounts of useful data, as a goal-only approach may do, or treat wayward shots that miss the goal as equal to successful scoring attempts, it seems sensible to include everything but weight events differently.
The recent discussion around weighted shots in hockey has inevitably spilled over into soccer.
Ultimately the merits of such approaches, no matter how intuitively satisfying it is to attach more weight to goals, are measured by whether the added information correlates more strongly with future outcomes than rival models do.
So, given the same amount of historical data, we can test which approach, weighted or unweighted, delivers the better correlation with future events. (Shot data is taken from the football data site.)
The simplest test is to use the previous season's data as a predictor of a team's performance the following year. So we are using shots and goals from one season to predict an indicator of achievement, such as total league points, in the next.
The data from the football data site does not separate out blocked shots, but it is detailed enough to classify each attempt as one that misses the target, one that is saved or one that results in a goal. We can therefore calculate season-long differentials for each Premier League team in these three classes of goal attempt by subtracting the totals amassed by their opponents in the same categories.
Using Arsenal as a test case for weighting goals differently to shots
For example, Arsenal in 2012/13 had 72 shots that resulted in a goal and faced 37 that ended as a goal against for a goal difference of +35. Similar calculations gave them a +60 differential for attempts that went wide or high of the goal and a +93 differential for on target shots.
Having posted these numbers in the 2012/13 season, Arsenal then went on to win 79 points the following year.
As a comparison that treats all attempts as equally weighted, Arsenal took 546 goal attempts during the 2012/13 season and gave up 358. Expressed as a ratio, as has become commonplace in soccer, they enjoyed about 60.4% of the shots taken in their matches (546 out of 904).
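The Arsenal figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and reading the +93 "on target" differential as saved attempts (on target but not scored) is an inference from the three-way classification described earlier, supported by the fact that the three class differentials then sum to the overall attempt differential.

```python
# Arsenal 2012/13 figures quoted in the article.
goals_for, goals_against = 72, 37
attempts_for, attempts_against = 546, 358

# Differentials by outcome class, as quoted in the article.
goal_diff = goals_for - goals_against  # goals scored minus goals conceded
wide_diff = 60                         # attempts that missed the target
saved_diff = 93                        # on-target attempts (assumed: saved, not scored)

# Unweighted view: Arsenal's share of all attempts in their matches.
shot_ratio = attempts_for / (attempts_for + attempts_against)

# If the three classes cover every attempt, their differentials
# should add up to the overall attempt differential.
total_diff = goal_diff + wide_diff + saved_diff

print(f"goal difference: {goal_diff:+d}")
print(f"shot ratio: {shot_ratio:.3f}")
print(f"sum of class differentials: {total_diff:+d}")
print(f"overall attempt differential: {attempts_for - attempts_against:+d}")
```

The last two printed values agree, which suggests the three categories account for all of the recorded attempts.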
Over 10 EPL seasons suitably weighted goal data appears to be a much better indicator
If we collect such data over multiple seasons, for each side that survived in the Premier League to play the following year, we can run a regression. This gives the strength of the relationship, denoted by the correlation coefficient 'r' or, more typically, its square 'r^2'. For the data categorized by shot outcome, it also gives the weightings attached to goals and to shots that do not result in a score.
Over the course of the last ten completed Premier League seasons, categorized data, suitably weighted, appears to be a much better indicator of the points a side wins in the following season than simply counting all attempts as equals.
The r^2 values are 0.70 for the weighted, categorized data and just 0.57 for the unweighted shot ratio when regressing data from 2002/03 to 2011/12 against future point totals.
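For readers who want to run this kind of regression themselves, here is a minimal sketch of an ordinary least-squares fit in plain Python. It is not the code behind the figures quoted above: the function name `fit_ols` and its interface are hypothetical, and it assumes the predictors are not perfectly collinear.

```python
def fit_ols(rows, targets):
    """Least-squares fit of targets on rows, with an intercept.

    rows    -- one tuple of predictor values per team-season
    targets -- the value to predict (e.g. next season's points)
    Returns (coefficients, intercept, r_squared).
    """
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0, *row] for row in rows]
    k = len(X[0])

    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    # Assumes X'X is non-singular (no perfectly collinear predictors).
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]

    # r^2 = 1 - (residual sum of squares / total sum of squares).
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - f) ** 2 for y, f in zip(targets, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return beta[1:], beta[0], 1 - ss_res / ss_tot
```

For the weighted model, each row would hold a team's goal, wide-shot and on-target differentials from one season, with that team's points in the following season as the target. For the unweighted model, each row would hold just the shot ratio. Comparing the two returned r^2 values is the comparison made in the text.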
Using shot type, next season’s points are projected by the following equation:
Next Season’s Points = (Previous Season’s Goal Difference * 0.47) + (Previous Wide Shots Differential * 0.01) + (Previous Shots on Target Differential * 0.054) + 51
The comparable regression equation for shot ratios is:
Next Season’s Points = (Previous Season’s Shot Ratio as a decimal*167) - 31.4
If we use Arsenal’s data from 2012/13 to predict their points in 2013/14, the weighted model projects 73 points compared with just 69 for the equally weighted shot ratio. Arsenal actually recorded 79 points, which is closer to the weighted projection.
Correlations for other useful measures of future team achievement, such as goal difference over the remainder of a season using data from matches already played, also show higher 'r' values when goals are weighted more heavily than when all shots are treated as equals. And a more accurate prediction of a side’s future goal difference may lead to more accurate individual match predictions.
Soccer betting models often improve in small steps. For the moment, weighting goals may be one such step forward, although it has yet to be tested against the bookmaker’s odds. This article highlights how weighting goals more heavily can improve the correlation with future outcomes.
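The two regression equations can be applied to Arsenal's 2012/13 numbers in a few lines. The coefficients come straight from the equations above; the function names are illustrative only.

```python
def weighted_projection(goal_diff, wide_diff, on_target_diff):
    """Next season's points from the outcome-weighted equation."""
    return 0.47 * goal_diff + 0.01 * wide_diff + 0.054 * on_target_diff + 51


def shot_ratio_projection(shot_ratio):
    """Next season's points from the unweighted shot-ratio equation."""
    return 167 * shot_ratio - 31.4


# Arsenal 2012/13: differentials of +35 (goals), +60 (wide), +93 (on target),
# and 546 attempts for versus 358 against.
weighted = weighted_projection(35, 60, 93)
unweighted = shot_ratio_projection(546 / (546 + 358))

print(f"weighted projection: {weighted:.1f} points")
print(f"shot-ratio projection: {unweighted:.1f} points")
```

Rounded to whole points, these reproduce the 73 and 69 quoted above. Note how little the wide-shot term contributes: a +60 differential adds only 0.6 of a point, while the +35 goal difference adds more than 16.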