Offensive and Defensive Augmented APM for Soccer

Posted on 8/28/2022

Tags: Aug APM, Soccer, APM


Summary

In soccer, adjusted plus-minus typically produces overall ratings. In contrast, other sports split adjusted plus-minus into offense and defense.

This article introduces offensive and defensive Augmented APM for soccer. It's a tricky problem, so we start by explaining what makes it tricky. Then, we show how design-matrix weights achieve realistic adjusted plus-minus ratings without a prior. Finally, we extend our FIFA prior to work for offense and defense, which results in our final Augmented APM model.

Our soccer app is updated with the latest results.


Adjusted plus-minus for soccer is tricky. And getting adjusted plus-minus to work for offense and defense is even trickier.

But splitting adjusted plus-minus has clear benefits. The clearest benefit is that offensive and defensive APM reflects how soccer works. Some teams are better on offense, and other teams are better on defense. Splitting APM rewards offensive players on good offensive teams, and defensive players on good defensive teams.

You could argue that great offensive players, such as Lionel Messi, influence both offense and defense. For example, Messi forces opposing teams to play defensively, and increases his team's time of possession. What is the right way to balance this?

It helps to consider a thought experiment. What would happen if Messi went to the MLS? To be concrete, let's say Messi transferred to the San Jose Earthquakes. It's easy to imagine San Jose scoring many more goals. But would their defense get better?

Our model takes a middle path on this question. We make it possible for offensive players to impact defense (and vice versa). But we give offensive players more offensive impact, and defensive players more defensive impact. We'll have more on the specifics later in the post.

So benefit #1 is that offensive/defensive APM is more realistic. Another significant benefit is the emergence of a high-quality defensive statistic. Defensive statistics are notoriously difficult. Not just in soccer, but in all sports. Defensive APM is a large leap forward in quantifying defensive impact in soccer.

This parallels the development of defensive statistics in basketball. Back in the day, the only defensive statistics available were blocks and steals. This led observers to view high block, high steal players as the best defenders in the league. You can see this by looking at the blocks and steals of the defensive player of the year. Defensive APM was a significant breakthrough. It's now one of the most influential defensive statistics in the NBA.

This article walks through our offensive and defensive Augmented APM model for soccer. We built the model in two phases. Phase one gets pure adjusted plus-minus, with no prior. Phase two incorporates a prior into the pure adjusted plus-minus model from phase one. Both are tricky. Let's start with phase one.

How we compute Pure APM for offense and defense

Before we compute more nuanced, Bayesian adjusted plus-minus models, the first step is to compute pure adjusted plus-minus.

Pure APM simply means adjusted plus-minus with no prior. It only considers how many goals were scored or given up by a player's team, controlling for teammates and opponents.
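As a rough illustration, pure APM can be sketched as a ridge regression that shrinks every rating toward zero. This post doesn't spell out the estimator, so treat the ridge penalty, the value of `lam`, and the toy 2-v-2 segments below as assumptions. To keep it short, the sketch fits a single overall rating per player from goal difference. The offensive/defensive version would use two columns per player instead of one.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge(rows, y, lam):
    """Ridge fit with shrinkage toward zero: solve (X'X + lam * I) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Toy data (made up): four players, three segments.
# +1 means the player is on the first team, -1 on the second team.
players = ["A", "B", "C", "D"]
rows = [
    [1, 1, -1, -1],   # A, B vs C, D
    [1, -1, 1, -1],   # A, C vs B, D
    [1, -1, -1, 1],   # A, D vs B, C
]
goal_diff = [2, 1, 1]  # first team's goals minus second team's goals

ratings = ridge(rows, goal_diff, lam=1.0)
for name, value in zip(players, ratings):
    print(name, round(value, 3))
```

Player A is on the winning side of every segment, so A ends up with the highest rating. The other three players are separated only by which segments they shared with A, which is the "controlling for teammates and opponents" part.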

Pure APM is a highly important signal. It separates how much of a player's rating is due to the prior, and how much is due to their performance. That's why we include it in our apps.

The question is: how does pure APM for offense/defense work in soccer?

Why pure APM for offense/defense in soccer is tricky

Pure APM for offense and defense in soccer is tricky because offensive players and defensive players always play together. For example, Salah plays most of his minutes with van Dijk, even though Salah mostly contributes on offense and van Dijk mostly contributes on defense.

This makes it hard to know which players contribute more on offense, and which players contribute more on defense. In other words, Pure APM has no idea that Salah plays forward, and van Dijk plays center back.

Let's make the issue concrete. Here are the top ten players, ranked by pure offensive APM, in the 2021-2022 season:

Player               Position       Pure Offensive APM
Manuel Neuer         Goalkeeper     0.148
Oleksandr Zinchenko  Fullback       0.142
Kylian Mbappe        Attacking Mid  0.129
Florian Wirtz        Midfielder     0.121
Leroy Sane           Attacking Mid  0.120
Moussa Diaby         Attacking Mid  0.117
Georginio Wijnaldum  Midfielder     0.116
Rodrygo              Attacking Mid  0.115
Mahmoud Dahoud       Midfielder     0.111
Niklas Sule          Center Back    0.108

This kind of makes sense. Some of the best offensive players (Mbappe, Sane) are ranked in the top ten. But other results are just too weird. Notably, Neuer is a goalie, who contributes nothing on offense. But in this table, Neuer is the #1 offensive player. This is an obvious miss. And we also see center backs and fullbacks in the top ten: also clear misses.

What about defense? Here are the top ten players ranked by pure defensive APM:

Player           Position       Pure Defensive APM
Marcos Acuna     Fullback       0.100
Virgil van Dijk  Center Back    0.098
Luis Hernandez   Center Back    0.094
Merih Demiral    Center Back    0.093
Antonio Rudiger  Center Back    0.087
Lars Stindl      Attacking Mid  0.086
Thierry Correia  Fullback       0.085
Lovro Majer      Midfielder     0.084
Jorginho         Midfielder     0.083
Pierre Kalulu    Fullback       0.083

Like offense, this kind of makes sense. Van Dijk and Rudiger are well-known defensive players, and it's great to see them in the top ten. Even Jorginho makes some sense, despite being a midfielder. But again, there are just too many obvious misses. Lars Stindl is not a top-ten defensive player.

In sum, pure APM doesn't pass the eye test. How can we fix it?

How design weights fix the problem

The egalitarian principle behind pure APM is that every player contributes equally. This principle rewards players who are overlooked by traditional box score statistics. Shane Battier is the canonical NBA example.

Equal weights make less sense in soccer. In soccer, forwards can stand 50 yards away while the rest of their team plays defense. These forwards have nothing to do with a team's success on this defensive possession. Why should they get any credit?

Fortunately, there's a straightforward way to redistribute credit: design-weighted APM. Design-weighted APM changes the 1's and -1's in our design matrix to more nuanced values (0.7, -1.3, etc.).

To demonstrate, recall the image in our original post where we showed what the design matrix for adjusted plus-minus looks like:

The 1's, 0's, and -1's under the player columns represent player contributions to the game segment. Design-weighted APM changes these values. For example, Salah may get a 1.3 for offense, and a 0.7 for defense. That's all there is to it.
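Here is a minimal sketch of what that change looks like for a single row of the design matrix. The column layout (one offense column and one defense column per player), the helper name `design_row`, and the weight values are all assumptions made up for illustration. They are not the fitted weights from the model.

```python
# Hypothetical position weights, made up for illustration only.
OFF_WEIGHT = {"FW": 1.3, "AM": 1.3, "MF": 1.0, "FB": 0.8, "CB": 0.7, "GK": 0.3}
DEF_WEIGHT = {"FW": 0.7, "AM": 0.7, "MF": 1.0, "FB": 1.2, "CB": 1.3, "GK": 1.3}


def design_row(attackers, defenders, positions, columns, weighted=True):
    """Build one row: one attacking team in one game segment.

    Attackers get a positive entry in their offense column.
    Defenders get a negative entry in their defense column.
    With weighted=False every entry is 1 or -1 (pure APM).
    """
    row = [0.0] * len(columns)
    for player in attackers:
        w = OFF_WEIGHT[positions[player]] if weighted else 1.0
        row[columns[(player, "off")]] = w
    for player in defenders:
        w = DEF_WEIGHT[positions[player]] if weighted else 1.0
        row[columns[(player, "def")]] = -w
    return row


positions = {"Salah": "AM", "van Dijk": "CB", "Benzema": "FW", "Carvajal": "FB"}

# Assign each (player, side) pair a column index.
columns = {}
for player in positions:
    for side in ("off", "def"):
        columns[(player, side)] = len(columns)

liverpool = ["Salah", "van Dijk"]
madrid = ["Benzema", "Carvajal"]

pure_row = design_row(liverpool, madrid, positions, columns, weighted=False)
weighted_row = design_row(liverpool, madrid, positions, columns)
print(pure_row)
print(weighted_row)
```

In the weighted row, Salah's offense entry is larger than van Dijk's, so a goal scored in this segment is credited more to Salah. The regression itself is unchanged. Only the entries in the matrix differ.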

Does it work? Let's look at the top offensive players according to design-weighted APM:

Player               Position       Design-Weighted Off APM  Pure Off APM Rank
Kylian Mbappe        Attacking Mid  0.165                    3
Leroy Sane           Attacking Mid  0.155                    5
Florian Wirtz        Midfielder     0.149                    4
Rodrygo              Attacking Mid  0.137                    8
Karim Benzema        Forward        0.137                    13
Moussa Diaby         Attacking Mid  0.137                    6
Thomas Muller        Attacking Mid  0.131                    17
Pep Biel             Attacking Mid  0.126                    15
Patrik Schick        Forward        0.125                    12
Vinicius Junior      Attacking Mid  0.124                    14
Mohamed Salah        Attacking Mid  0.124                    22
Robert Lewandowski   Forward        0.118                    31
Georginio Wijnaldum  Midfielder     0.116                    7
Kevin De Bruyne      Attacking Mid  0.116                    35
Erling Haaland       Forward        0.115                    18

The top offensive players now look way more reasonable. Notably, there aren't any defenders, and great offensive players (Benzema, Lewandowski, Haaland, De Bruyne) burst into the top fifteen.

Here are the top defensive players according to design-weighted APM:

Player              Position       Design-Weighted Def APM  Pure Def APM Rank
Marcos Acuna        Fullback       0.113                    1
Virgil van Dijk     Center Back    0.112                    2
Luis Hernandez      Center Back    0.111                    3
Merih Demiral       Center Back    0.110                    4
Adama Soumaoro      Center Back    0.101                    9
Antonio Rudiger     Center Back    0.099                    5
Thierry Correia     Fullback       0.099                    7
Pierre Kalulu       Fullback       0.098                    9
Eric Dier           Center Back    0.091                    12
John Stones         Center Back    0.090                    14
Damien Da Silva     Center Back    0.090                    14
Alessandro Bastoni  Center Back    0.089                    13
Lovro Majer         Midfielder     0.088                    8
Jorginho            Midfielder     0.088                    9
Max Kilman          Center Back    0.087                    12

Like offense, the results make a lot more sense. We no longer have any forwards/attacking mids in the top fifteen.

Design-weighted APM passes the eye test!

For a deeper understanding of how design-weights impact player ratings, it helps to analyze the results in aggregate. The following image compares player ratings with even weights vs. design weights, split by player position:

This figure demonstrates two important points:

  1. Design-weighted results make offensive ratings for offensive players more extreme. The same goes for the defensive ratings of defensive players.
  2. Design-weighted results push the defensive ratings of offensive players towards 0, instead of making them extremely negative. The same goes for the offensive ratings of defensive players.

To understand #1, compare the distribution of offensive APM for forwards and attacking mids. The distribution of offensive ratings for forwards and attacking mids is much wider with design weights. This is good: we give more offensive credit to forwards and attacking mids for offensive contributions. Defense is the same. Center backs and fullbacks have a larger variance in defensive ratings under design-weighted APM.

#2 is a subtle but important modeling choice. There are two options. Option one makes the defensive ratings for forwards extremely negative, and the offensive ratings for defenders extremely negative. Option two pushes offensive players' defensive ratings, and defensive players' offensive ratings, towards 0.

We prefer option two. Option two reflects that a poor defensive forward wouldn't hurt a team's defensive performance. However, a bad offensive forward would hurt a team's offense. In our view, this is a realistic assumption.

What design weights we used, and how we pick them

The previous section showed that design-weights work. The next question: which design-weights should we use?

When you pick design-weights, there are two things to consider:

  1. Which criteria to use (position, distance, etc.)?
  2. What values to assign (1, 0.7, etc.), based on the criteria?

Our model uses player position as a criterion. There are obvious improvements possible (distance, number of touches, etc.). But position is simple to implement, works well, and can be iterated on in the future.

We selected values by running cross-validation with different sets of weights, for each position. Our principle was to improve interpretability, without sacrificing prediction accuracy. In the end, we found design-weights that marginally improved prediction accuracy (over even weights). Slightly better predictions, coupled with drastically improved interpretability, made the decision to use design-weights a no-brainer.
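The selection loop can be sketched roughly as follows. Everything here is an assumption for illustration: the ridge estimator, the penalty `lam`, the three candidate weight sets, the helper names, and the synthetic 3-v-3 data. The only point is the mechanics: build the design matrix once per candidate, score each candidate by out-of-fold prediction error, and compare.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge(rows, y, lam):
    """Ridge fit: solve (X'X + lam * I) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def build_rows(segments, positions, index, off_w, def_w):
    """Two columns per player: offense at 2*i, defense at 2*i + 1."""
    rows = []
    for attackers, defenders in segments:
        row = [0.0] * (2 * len(index))
        for p in attackers:
            row[2 * index[p]] = off_w[positions[p]]
        for p in defenders:
            row[2 * index[p] + 1] = -def_w[positions[p]]
        rows.append(row)
    return rows


def cv_error(rows, y, lam=1.0, k=5):
    """Mean squared out-of-fold prediction error over k folds."""
    total = 0.0
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        held_out = [i for i in range(len(y)) if i % k == fold]
        beta = ridge([rows[i] for i in train], [y[i] for i in train], lam)
        for i in held_out:
            pred = sum(x * b for x, b in zip(rows[i], beta))
            total += (y[i] - pred) ** 2
    return total / len(y)


# Hypothetical candidates: (offense weights, defense weights) by position.
candidates = {
    "even": ({"FW": 1.0, "MF": 1.0, "CB": 1.0}, {"FW": 1.0, "MF": 1.0, "CB": 1.0}),
    "mild": ({"FW": 1.3, "MF": 1.0, "CB": 0.7}, {"FW": 0.7, "MF": 1.0, "CB": 1.3}),
    "strong": ({"FW": 1.8, "MF": 1.0, "CB": 0.2}, {"FW": 0.2, "MF": 1.0, "CB": 1.8}),
}

# Synthetic data: six made-up players, random 3-v-3 segments.
rng = random.Random(0)
positions = {"F1": "FW", "F2": "FW", "M1": "MF", "M2": "MF", "D1": "CB", "D2": "CB"}
index = {p: i for i, p in enumerate(positions)}
true_beta = [rng.gauss(0.0, 0.2) for _ in range(2 * len(index))]
segments = []
for _ in range(60):
    order = list(positions)
    rng.shuffle(order)
    segments.append((order[:3], order[3:]))
true_rows = build_rows(segments, positions, index, *candidates["mild"])
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 0.3)
     for row in true_rows]

errors = {name: cv_error(build_rows(segments, positions, index, off_w, def_w), y)
          for name, (off_w, def_w) in candidates.items()}
best = min(errors, key=errors.get)
print(errors)
print("lowest cross-validated error:", best)
```

In practice the comparison would not be decided by error alone. As described above, the goal was better interpretability without giving up prediction accuracy, so a candidate that roughly ties on error but produces more sensible rankings would still be preferred.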

How we compute Augmented APM for offense and defense

Quick recap: the previous section extends pure adjusted plus-minus from overall to offense and defense. To build Augmented APM, we need a prior. The question is: what prior should we use?

For overall Augmented APM, we use a player's overall FIFA rating. So an obvious extension is to use FIFA ratings for offense and defense. Before you scoff, let me briefly make the case why FIFA ratings are useful in APM generally, and specifically for offense and defense.

Why FIFA ratings work well as APM priors

Despite the importance of rating players, there aren't too many systems that holistically rank every player in the world. Much less keep the ratings up to date. FIFA ratings are one of these systems.

And for our purposes, FIFA ratings work well because they estimate the same quantity APM estimates. Specifically, FIFA and APM both estimate how good each player is right now. This key fact explains why FIFA ratings provide a useful prior in Augmented APM models.

Some folks complain that FIFA ratings are subjective. I get it.

It's true that FIFA ratings aren't based on observable statistics, like goals and assists. But FIFA ratings are the product of a vast network of scouts, coaches, and fans. And this network produces a clearly useful signal. In fact, we've shown that FIFA ratings improve the prediction accuracy of our model.

And that's just for overall APM. FIFA ratings shine when we need a prior for offense and defense. As mentioned earlier, good defensive statistics are hard to come by, and FIFA defensive ratings are deeply intuitive. To demonstrate, here are the top ten defensive players in the world according to FIFA:

Player             FIFA Defensive Rating
Virgil van Dijk    270
N'Golo Kante       269
Marquinhos         268
Giorgio Chiellini  265
Kalidou Koulibaly  265
Ruben Dias         264
Milan Skriniar     261
Raul Albiol        261
Mats Hummels       261
Stefan de Vrij     260

It's hard to get more intuitive than that.

We did look into other defensive statistics. But we couldn't find a good match. Transfer market values give too much weight to young players. And box-score statistics, like interceptions, and even goals, don't account for the time of possession: players with the most interceptions tend to play on the worst teams. There's room to improve this, but it will take time.

Finally, keep in mind that we can use as many signals as we want in the Augmented APM prior:

As we get better and better defensive statistics, they can be seamlessly integrated into our model. Future blog posts will cover these integrations. But because FIFA ratings provided the clearest path towards version 1, that's what we use.

The learned FIFA prior for offense and defense

In our basketball model post, we showed how our model learns the importance of different prior components as part of the model fitting process.

We take the same approach for soccer. Specifically, FIFA ratings are split into various components, and we use several of these components in our prior:

  • Attacking (offense)
  • Movement (offense)
  • Skill (offense)
  • Power (offense)
  • Mentality (offense)
  • Defending (defense)
  • Goalkeeping (defense)

Each rating can be in the offensive prior, defensive prior, or both. It's tricky to place ratings like Skill into offense or defense. But you need to make choices. In the end, we used Defending and Goalkeeping for the defensive prior, and the rest for the offensive prior. In our view, this was the cleanest way to separate a player's offensive and defensive skills.

Let's look at the results. Here are the fitted values for the offensive and defensive coefficients over the past five seasons:

For offense, Attacking and Movement are the most important FIFA ratings. This is consistent in every season we look at. Power and Skill are somewhat important in a single season, but mostly hover around 0. For defense, the Defending and Goalkeeping ratings work as expected, and are highly predictive of a player's defensive performance.
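One way to write down a prior that is learned during fitting is shown below. The notation is introduced here purely for illustration, and the exact objective is an assumption, since this post does not state the formula. Here z_{ik} is player i's FIFA component k, gamma_k is the fitted coefficient for that component, delta_i is how far player i's rating moves away from the prior, x_{si} is the design-matrix entry for player i in segment s, and y_s is the outcome for that segment.

```latex
\beta_i = \sum_k \gamma_k \, z_{ik} + \delta_i,
\qquad
(\hat{\gamma}, \hat{\delta}) =
\arg\min_{\gamma,\,\delta}
\sum_s \Big( y_s - \sum_i x_{si}\, \beta_i \Big)^2
+ \lambda \sum_i \delta_i^2
```

Under this formulation, a player with no minutes has x_{si} = 0 in every segment, so delta_i only appears in the penalty and is fit to 0. That player's rating is then exactly the prior term, the weighted sum of their FIFA components. Offense and defense would each get their own set of components and coefficients.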

Offensive and Defensive APM is now live!

The results of our offensive and defensive model for the last five seasons are now live in our soccer app. The ratings include matches from:

  • The big 5 leagues,
  • Champions, Europa, and Conference League,
  • World Cup, EuroCup, and Copa America.

Let's look at some highlights. Here are the top players in the latest, 2021-2022 season:

Player               Position  Team                 Aug APM  Aug APM Off  Aug APM Def
Karim Benzema        FW        Real Madrid          0.288    0.269        0.019
Leroy Sane           AM        Bayern Munich        0.270    0.245        0.025
Kylian Mbappe        AM        Paris Saint-Germain  0.265    0.258        0.007
Mohamed Salah        AM        Liverpool            0.250    0.240        0.010
Ben Chilwell         FB        Chelsea              0.246    0.125        0.121
Son Heung-min        AM        Tottenham Hotspur    0.243    0.218        0.025
Gerard Moreno        FW        Villarreal           0.242    0.232        0.010
Mikel Merino         MF        Real Sociedad        0.239    0.123        0.116
Dani Carvajal        FB        Real Madrid          0.237    0.099        0.138
Alexandre Lacazette  FW        Arsenal              0.237    0.190        0.047

As expected, Benzema is the #1 player. Benzema led Real Madrid to a Champions League title and is the favorite to win the Ballon d'Or.

Now let's look at the top ten offensive players. The following table shows their offensive Augmented APM rating, Adjusted (Pure) APM rating, and Prior rating. The prior rating is what their rating would be if they played 0 minutes, and is a direct translation of a player's FIFA rating:

Player              Position  Team                 Aug APM Off  Adj. APM Off  Aug APM Prior Off
Karim Benzema       FW        Real Madrid          0.269        0.137         0.218
Kylian Mbappe       AM        Paris Saint-Germain  0.258        0.165         0.212
Leroy Sane          AM        Bayern Munich        0.245        0.155         0.169
Patrik Schick       FW        Bayer Leverkusen     0.241        0.125         0.154
Mohamed Salah       AM        Liverpool            0.240        0.124         0.196
Robert Lewandowski  FW        Bayern Munich        0.232        0.118         0.218
Gerard Moreno       FW        Villarreal           0.232        0.086         0.186
Ciro Immobile       FW        Lazio                0.221        0.091         0.174
Son Heung-min       AM        Tottenham Hotspur    0.218        0.103         0.163
Erling Haaland      FW        Dortmund             0.214        0.115         0.158

Good adjusted plus-minus ratings blend common sense and a few surprises. If the list was 100% common sense, the ratings wouldn't tell us anything. If the ratings had 0% common sense, we wouldn't trust them.

You can see a nice mix of common sense and surprise in these ratings. There are many familiar names in the top ten (Benzema, Mbappe), but we also have Patrik Schick. Schick is a surprise, and APM indicates that Schick may be undervalued, and a good transfer target for teams in search of a forward.

Now let's look at the top defensive players:

Player           Position  Team                 Aug APM Def  Adj. APM Def  Aug APM Prior Def
Virgil van Dijk  CB        Liverpool            0.196        0.112         0.121
Manuel Neuer     GK        Bayern Munich        0.191        0.075         0.146
Marcos Acuna     FB        Sevilla              0.183        0.113         0.099
Marquinhos       FB        Paris Saint-Germain  0.168        0.075         0.139
Alisson          GK        Liverpool            0.162        0.059         0.133
Ederson          GK        Manchester City      0.159        0.052         0.138
Antonio Rudiger  CB        Chelsea              0.158        0.099         0.096
John Stones      CB        Manchester City      0.157        0.090         0.078
Gleison Bremer   CB        Torino               0.148        0.075         0.076

Again, we see a nice mix of common sense and surprise. Van Dijk is the top defensive player in the world. Van Dijk combines the #1 prior rating for defenders, with the #2 pure defensive APM rating, which is a powerful combination.

Marcos Acuna is a surprise. Acuna's defensive rating is high because he was a top defender on Sevilla, and Sevilla allowed the fewest goals in La Liga. Gleison Bremer is another surprise. He played on a Torino team with the 5th best defense in Serie A.

These results only scratch the surface. Our interactive soccer app has so much more. Go check it out!

Future app releases

We plan to release ratings for the current season around a month into the season. We'll continuously update the app on a daily or weekly cadence throughout the season.

If you're interested in this post and application, please sign up for our newsletter, or contact us if you have any questions, comments, or feedback! Thanks for taking the time to read.

Acknowledgements

Thanks to Francesca Matano, Collin Politsch, Taylor Pospisil, Lars Magnus Hvattum, Brian MacDonald, and Michael Shuckers for helping me think through the ideas in this blog post.