How to use TSR data to model game outcomes for betting
By Mark Taylor Aug 1, 2014
Contemporary soccer analysts are increasingly relying on Total Shot Ratios (TSR) to produce goal expectations, where Poisson models have previously been used. Is TSR flexible enough, however, to predict match results or point totals? This article investigates.
In an earlier article, we looked at the gradual move away from using traditional methods of judging team ability in soccer, which rely on goals, towards newly acquired data, such as shots.
Goal attempts, which include headers as well as shots, satisfy many of the requirements of a useful explanatory and predictive measure of team quality. They are far more numerous than goals scored or allowed, and they appear to correlate well with team success.
Better sides tend to have more attempts on goal and restrict the opposition to fewer such attempts, especially over a full season. Equally importantly, shot ratios in the past are reasonably predictive of shot ratios in the future.
This creates a clear chain of reasoning: previous total shot ratio figures can be used to predict expected shot numbers in the future, and the correlation between shots and performance can then be used to predict match results or point totals.
Much of the posted work on total shot ratios in soccer revolves around establishing the relationship between TSR and points accrued. TSR from previous seasons or multiple seasons with an appropriate weighting is then used to predict league standings in the following season.
Above we’ve plotted the TSR of sides that remained in the Premier League from one season against their point totals in the subsequent EPL season. There is a definite correlation, although there is also some dispersion around the line of best fit. For these EPL seasons at least, there is a stronger relationship when points in year N-1 are simply used to predict points in year N.
Of course, the main aim of a soccer model is to produce odds for the outcomes of individual matches. While TSR lacks the flexibility to price many of the secondary markets, such as total goals, correct scores and handicaps, it can be adapted to individual game outcomes.
A perennial problem when collecting data to evaluate team quality is how far to go back in time. More recent events are most relevant to the quality of the current team, but too little data may be insufficient for the repeatable signal to emerge from the noise.
With smaller amounts of data we cannot be sure that a side with superior figures is really the better of the two teams.
To demonstrate, from the end of March to the end of the 2013/14 season, Tottenham’s TSR from the previous 30 matches remained relatively consistent, between 0.55 and 0.57. When Spurs’ TSR on the day of a match was calculated from just the previous 10 games, it was more volatile from late March to early May, ranging from a below average 0.48 to a peak of nearly 0.61.
The more stable 30 game average may be more credible than the peaks and troughs seen over a much shorter time span, where opponent strength, in game effects, short term injury issues and simple good or bad fortune may be much more dominant in determining shooting ratios.
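The trade-off between a long and a short window can be explored with a small helper. The sketch below assumes the usual definition of TSR (shots for divided by total shots in a team's matches) and pools the shots across the window rather than averaging per-match ratios; the shot counts are made up for illustration and are not real match data.

```python
def total_shot_ratio(shots_for, shots_against):
    """TSR = shots for / (shots for + shots against)."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded")
    return shots_for / total


def rolling_tsr(matches, window):
    """TSR pooled over the most recent `window` matches.

    `matches` is a list of (shots_for, shots_against) tuples, oldest first.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = matches[-window:]
    shots_for = sum(m[0] for m in recent)
    shots_against = sum(m[1] for m in recent)
    return total_shot_ratio(shots_for, shots_against)


# Hypothetical shot counts, oldest match first (illustration only).
example_matches = [(12, 10), (15, 9), (8, 14), (11, 11), (17, 6), (9, 13)]

long_window = rolling_tsr(example_matches, window=6)
short_window = rolling_tsr(example_matches, window=3)
print(round(long_window, 3), round(short_window, 3))
```

In practice the two windows would be 30 and 10 league games, as in the Tottenham example.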
However, the longer time scale of 30 games may span a very different side taking to the field, or a managerial change and all the tactical nuances that may accompany it. The data will also inevitably stretch into previous seasons, and even into different divisions for recently promoted teams, necessitating the application of a correction factor to any data used from outside the top flight.
Historical TSR data to model soccer matches
Despite these difficulties, we can calculate a TSR from previous matches for each of the two teams in a Premier League contest. We can further use historical TSR match ups from previous seasons to establish the relationship between each team’s pre game TSR and the actual outcome of the game.
Logistic regression, where an outcome, such as a home victory, either happens or it doesn’t, is one route to determining match probabilities from game day TSRs for each team.
When Sunderland travelled to Tottenham on April 7th 2014, the visitors’ TSR over the last 30 games was 0.43; over the last 10 it was slightly better, at 0.44. That of their hosts was 0.56 and 0.51, respectively.
Using historical, out of sample results from five previous seasons from soccer-data.co.uk and a logistic regression, the chance of a Spurs win at home to Sunderland based on 30 game averages was 68%.
The calculation involves two steps. The respective home and away constants were obtained by running a logistic regression on matches played during the five most recent seasons of Premiership games, excluding 2013/14. The predictor variables were each team’s TSR from their previous 30 league games and the output was whether or not the result was a home win.
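For readers who want to reproduce this kind of fit, here is a minimal sketch. The article does not say which software was used, so this uses plain gradient ascent on the log-likelihood rather than a statistics package, and it runs on synthetic matches generated from the constants quoted in this article, not on real results. The function name, learning rate and iteration count are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, outcomes, learning_rate=1.0, iterations=2000):
    """Fit P(home win) = sigmoid(b0 + b1*HTSR + b2*ATSR) by gradient ascent.

    rows: list of (home_tsr, away_tsr); outcomes: 1 for a home win, else 0.
    Returns (b0, b1, b2).
    """
    b0 = b1 = b2 = 0.0
    n = len(rows)
    for _ in range(iterations):
        g0 = g1 = g2 = 0.0
        for (htsr, atsr), y in zip(rows, outcomes):
            error = y - sigmoid(b0 + b1 * htsr + b2 * atsr)
            g0 += error
            g1 += error * htsr
            g2 += error * atsr
        # Step along the average gradient of the log-likelihood.
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n
        b2 += learning_rate * g2 / n
    return b0, b1, b2


# Synthetic data (assumption): outcomes drawn from the article's 30 game formula.
random.seed(0)
rows, outcomes = [], []
for _ in range(500):
    htsr = random.uniform(0.35, 0.65)
    atsr = random.uniform(0.35, 0.65)
    p_home = sigmoid(8.19 * htsr - 6.44 * atsr - 1.08)
    rows.append((htsr, atsr))
    outcomes.append(1 if random.random() < p_home else 0)

b0, b1, b2 = fit_logistic(rows, outcomes)
print(f"F = ({b1:.2f}*HTSR) + ({b2:.2f}*ATSR) + ({b0:.2f})")
```

With only 500 simulated matches the recovered constants will be noisy, but the home coefficient should come out positive and the away coefficient negative. A dedicated regression routine would converge faster and also report standard errors.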
First, calculate F = (8.19*HTSR) - (6.44*ATSR) - 1.08
where HTSR is the TSR of the home side over the previous 30 matches and ATSR is that of the away team.
Then, to convert this to a probability of the home team winning the game, take
Home Win Probability = exp(F) / (1 + exp(F)) = 0.68
The 30 game HTSR for Spurs was 0.56 and Sunderland’s ATSR was 0.43, giving Tottenham a likely winning probability of 0.68 when these numbers are put into the equations above.
This compares to a much lower home win probability of 0.53 when judged on TSR over just the previous 10 games. The home and away constants used to determine the value of F using TSR from the previous 10 matches are 7.11 and 5.12 respectively, and the lone constant is 1.23, applied with the same signs as in the 30 game formula. These slightly different constants are again derived from five seasons of Premiership data, but using each team’s TSR over the previous 10 games.
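The two-step calculation is easy to check in code. The sketch below uses the constants quoted in this article for both windows; the function and dictionary names are illustrative, and treating the 10 game lone constant as subtracted is an assumption made by analogy with the 30 game formula.

```python
import math

# Constants quoted in the article, keyed by window length:
# (home coefficient, away coefficient, lone constant).
COEFFICIENTS = {
    30: (8.19, 6.44, 1.08),
    10: (7.11, 5.12, 1.23),  # sign of the lone constant assumed as for 30 games
}


def home_win_probability(home_tsr, away_tsr, window=30):
    """Convert the two teams' pre-game TSRs into a home win probability."""
    home_coef, away_coef, constant = COEFFICIENTS[window]
    f = home_coef * home_tsr - away_coef * away_tsr - constant
    return math.exp(f) / (1.0 + math.exp(f))


# Tottenham v Sunderland, using the TSRs given in the text.
p30 = home_win_probability(0.56, 0.43, window=30)
p10 = home_win_probability(0.51, 0.44, window=10)
print(round(p30, 2))  # 0.68
print(round(p10, 2))
```

The 10 game figure lands very close to the 0.53 quoted above; a difference in the second decimal place is to be expected because the published TSRs are rounded.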
The odds generally available about a Tottenham victory equated to a probability of around 0.62.
30 Game HTSR Table

| Home Team | Away Team | Home TSR | Away TSR | Home Win Probability | Quoted Probability |
|---|---|---|---|---|---|
| Aston Villa | Fulham | 0.48 | 0.38 | 0.60 | 0.48 |
| Cardiff | C. Palace | 0.38 | 0.46 | 0.29 | 0.40 |
| **Chelsea** | Stoke | 0.62 | 0.43 | 0.77 | 0.76 |
| **Everton** | Arsenal | 0.54 | 0.53 | 0.47 | 0.39 |
| **Hull** | Swansea | 0.46 | 0.54 | 0.32 | 0.36 |
| **Man City** | Southampton | 0.64 | 0.60 | 0.57 | 0.73 |
| Newcastle | Man Utd | 0.56 | 0.53 | 0.51 | 0.23 |
| Norwich | WBA | 0.45 | 0.50 | 0.35 | 0.41 |
| **Tottenham** | Sunderland | 0.56 | 0.43 | 0.68 | 0.62 |
| West Ham | Liverpool | 0.37 | 0.57 | 0.15 | 0.11 |
10 Game HTSR Table

| Home Team | Away Team | Home TSR | Away TSR | Home Win Probability | Quoted Probability |
|---|---|---|---|---|---|
| Aston Villa | Fulham | 0.48 | 0.37 | 0.57 | 0.48 |
| Cardiff | C. Palace | 0.41 | 0.42 | 0.39 | 0.40 |
| **Chelsea** | Stoke | 0.65 | 0.44 | 0.76 | 0.76 |
| **Everton** | Arsenal | 0.48 | 0.46 | 0.45 | 0.39 |
| **Hull** | Swansea | 0.47 | 0.57 | 0.31 | 0.36 |
| **Man City** | Southampton | 0.65 | 0.65 | 0.51 | 0.73 |
| Newcastle | Man Utd | 0.56 | 0.55 | 0.49 | 0.23 |
| Norwich | WBA | 0.47 | 0.54 | 0.34 | 0.41 |
| **Tottenham** | Sunderland | 0.51 | 0.44 | 0.53 | 0.62 |
| West Ham | Liverpool | 0.30 | 0.58 | 0.11 | 0.11 |
In the tables above I have listed the remaining games from the weekend of April 5th 2014, using both 10 and 30 game TSR for comparison. Actual home wins are shown in bold.
Chelsea, Everton and Tottenham were winning teams available at a value price using 30 game TSR, compared to just Everton using TSR from 10 games.
However, Villa, Newcastle and West Ham were also potential value bets on the 30 game TSR and all three failed to win, while Cardiff, Norwich, Hull and Man City looked poor value to win at home and the first two lost.
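The value comparison can be reproduced mechanically from the tables. In this sketch a home side counts as "value" when the model probability is strictly greater than the quoted probability; that threshold is an assumption, and it ignores the bookmaker's margin and any staking considerations. The probabilities are copied from the two tables above.

```python
from typing import NamedTuple


class Fixture(NamedTuple):
    home: str
    model_30: float  # home win probability from 30 game TSR
    model_10: float  # home win probability from 10 game TSR
    quoted: float    # probability implied by the available odds


FIXTURES = [
    Fixture("Aston Villa", 0.60, 0.57, 0.48),
    Fixture("Cardiff", 0.29, 0.39, 0.40),
    Fixture("Chelsea", 0.77, 0.76, 0.76),
    Fixture("Everton", 0.47, 0.45, 0.39),
    Fixture("Hull", 0.32, 0.31, 0.36),
    Fixture("Man City", 0.57, 0.51, 0.73),
    Fixture("Newcastle", 0.51, 0.49, 0.23),
    Fixture("Norwich", 0.35, 0.34, 0.41),
    Fixture("Tottenham", 0.68, 0.53, 0.62),
    Fixture("West Ham", 0.15, 0.11, 0.11),
]


def value_home_sides(fixtures, model_field):
    """Home sides whose model probability exceeds the quoted probability."""
    return [f.home for f in fixtures if getattr(f, model_field) > f.quoted]


value_30 = value_home_sides(FIXTURES, "model_30")
value_10 = value_home_sides(FIXTURES, "model_10")
print("30 game value:", value_30)
print("10 game value:", value_10)
```

On these figures the 30 game model flags six home sides and the 10 game model three, which matches the teams discussed in the text.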
With the occasional exception, such as Cardiff, Man City and Spurs, each time frame generates similar odds for a home win. Intuitively, I would expect the larger sample size to outperform that based on 10 observations, but the formula is there for anyone to experiment with by varying the amount of data or even allowing greater weightings for the most recent matches.
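One way to experiment with heavier weighting of recent matches is an exponentially decaying weight. This is only one possible scheme and is not taken from the article: the `decay` parameter and the function name are hypothetical, and the regression constants would need to be re-estimated for any new TSR definition.

```python
def weighted_tsr(matches, decay=0.95):
    """TSR with exponentially decaying weights on older matches.

    `matches` is a list of (shots_for, shots_against), oldest first.
    The most recent match has weight 1, the one before it `decay`,
    then `decay ** 2`, and so on. `decay=1.0` gives the unweighted TSR.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    weighted_for = 0.0
    weighted_against = 0.0
    for age, (shots_for, shots_against) in enumerate(reversed(matches)):
        weight = decay ** age
        weighted_for += weight * shots_for
        weighted_against += weight * shots_against
    total = weighted_for + weighted_against
    if total == 0:
        raise ValueError("no shots recorded")
    return weighted_for / total


# Hypothetical shot counts, oldest first (illustration only).
sample = [(10, 10), (20, 10)]
print(weighted_tsr(sample, decay=1.0))  # 0.6
print(weighted_tsr(sample, decay=0.5))  # 0.625
```

A decay close to 1 behaves like a long window and a smaller decay like a short one, so a single parameter can be tuned instead of choosing between 10 and 30 games.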
The methodology allows shots to be converted into useable match odds, although a Poisson approach remains more versatile. But it is an area that is ripe for further investigation.
Mark Taylor is a freelance soccer and NFL writer who, along with producing expert content for Pinnacle Sports, also runs his own soccer analytics blog, the Power of Goals.