As part of our project to track stats for women’s soccer matches (please join and help us get more data!), we’ve been working on adding location data to virtually every action we track. Until now, if you’ve been following some of the stuff I’ve posted on Twitter or the WoSo Stats Shiny app, it’s largely been summary data devoid of location data. That is to say, it adds up aggregates of certain stats (such as total passes attempted by a player or team) or in some cases calculates additional stats based on those basic stats (such as a player’s passing completion percentage), none of which take into account where a player was on the field.
This time, I’m going to look at location-based data. In this post, to make things simple, I’m going to focus on one match, the USA-Germany SheBelieves 2016 match. To make things even simpler, I’m also just going to look at passing and possession. This is an early dive into the location data we’re getting from this project, and how it can complement what we already know about a match based on its summary stats and, well, actually watching the game.
One of the most interesting things I found while exploring the stats this project is generating was the impact of pressure on a player’s passing completion percentage. I expected, based on intuition, to see a player’s passing completion percentage go down with pressure, but what I saw was that, on average, it barely had an impact.
What you’re looking at is the impact that pressure had on a player’s open play passing completion percentage. Open play passes are all passes that aren’t throw ins, free kicks, corner kicks, goal kicks, or goalkeeper throws or dropkicks. I excluded those because, by definition, they can never be “under pressure” from a defender. In the chart above, the further to the right the bar is, the better the player’s open play passing completion percentage got under pressure. To account for differences in open play passing attempts, the darker the green, the more open play passes that player attempted under pressure.
For me, this was a bit of a head-scratcher at first, as I noticed similar numbers across different matches. The median difference is +15%, so it looks like more players’ passing completion percentage actually got better under pressure. I initially chalked this up to, well, these are the two best teams in the world and great players should continue to make good passes under pressure.
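For anyone who wants to try this comparison on their own data, here is a minimal sketch of how the per-player difference and its median could be computed. The record layout, the player names, and the `pressure_differences` helper are all hypothetical and made up for illustration; this is not the project's actual schema or code.

```python
from collections import defaultdict
from statistics import median

# Hypothetical pass log: (player, pass_type, under_pressure, completed).
# Field names and values are illustrative, not the WoSo Stats schema.
PASSES = [
    ("Player A", "open", True, True),
    ("Player A", "open", True, True),
    ("Player A", "open", False, True),
    ("Player A", "open", False, False),
    ("Player A", "throw_in", False, True),    # not open play, excluded
    ("Player B", "open", True, False),
    ("Player B", "open", True, True),
    ("Player B", "open", False, True),
    ("Player B", "open", False, True),
    ("Player B", "goal_kick", False, False),  # not open play, excluded
]


def pressure_differences(passes):
    """Return {player: (difference in percentage points, pressured attempts)}."""
    # counts[player][under_pressure] = [completed, attempted]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for player, pass_type, under_pressure, completed in passes:
        if pass_type != "open":
            continue  # keep open play passes only
        tally = counts[player][under_pressure]
        tally[0] += int(completed)
        tally[1] += 1

    result = {}
    for player, by_pressure in counts.items():
        p_done, p_att = by_pressure[True]
        u_done, u_att = by_pressure[False]
        if p_att == 0 or u_att == 0:
            continue  # no comparison possible without both kinds of attempts
        diff = 100 * p_done / p_att - 100 * u_done / u_att
        result[player] = (diff, p_att)
    return result


differences = pressure_differences(PASSES)
median_difference = median(diff for diff, _ in differences.values())
```

A positive difference means the player completed a higher share of open play passes under pressure. The attempt count is returned alongside it, since that is what the color shading in the chart is meant to convey.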
However, upon further thought, this does make some sense, which merits further analysis later on. A player under pressure is probably going to be more likely to revert to a “safer” pass, such as a backwards pass, or be forced into a riskier play, such as a take on, due to not having enough space or time to get a pass off. Conversely, a player who isn’t under pressure, with more time and space with the ball, might be more likely to attempt a riskier pass, such as a launched ball, or skip the pass altogether and opt for a shot.
It seems pressure might be a better predictor of a player’s passing completion percentage once we are able to break down those decisions a little better, but I’ll save that for another day. What I do want to get at is what happens to these passing stats when we break them down by location.
Adding Location Data
For each pass attempt, we tracked its origin (i.e. where the player was passing from) according to which one of the following “zones” on the field she was in.
For this analysis, I grouped together passes in the defensive middle third and attacking middle third as passes that generally happened in the middle third. Now, what happens to a player’s open play passing completion percentage when she’s passing from within that all-important attacking third?
It drops for pretty much everyone in the match who attempted an open play pass in the attacking third. Again, darker colors indicate more attacking third passing attempts, and the further to the right the bar is the better that player’s passing completion percentage got in the attacking third, compared to her passes in the defensive and middle thirds.
There are some outliers here. Lloyd, Horan, and Pugh had some very stark differences in completion percentage, but that is partly because they barely attempted any passes from within the attacking third. In general, though, it appears that most players in this match had their passing completion percentage negatively affected.
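The zone grouping and the attacking-third comparison could be sketched along these lines. The zone codes (`D3`, `DM3`, `AM3`, `A3`), the sample passes, and the helper name are assumptions made up for this example; the project's real zone labels may well differ.

```python
from collections import defaultdict

# Hypothetical zone codes; the project's real zone labels may differ.
ZONE_TO_THIRD = {
    "D3": "defensive",
    "DM3": "middle",    # defensive middle third
    "AM3": "middle",    # attacking middle third
    "A3": "attacking",
}

# Hypothetical open play passes: (player, origin_zone, completed).
PASSES = [
    ("Player A", "A3", True),
    ("Player A", "A3", False),
    ("Player A", "DM3", True),
    ("Player A", "AM3", True),
    ("Player A", "D3", True),
    ("Player A", "AM3", False),
    ("Player B", "A3", False),
    ("Player B", "D3", True),
    ("Player B", "DM3", True),
]


def attacking_third_difference(passes):
    """Return {player: (difference in percentage points, attacking third attempts)}."""
    # tallies[player][bucket] = [completed, attempted]
    tallies = defaultdict(lambda: {"attacking": [0, 0], "elsewhere": [0, 0]})
    for player, zone, completed in passes:
        third = ZONE_TO_THIRD[zone]
        bucket = "attacking" if third == "attacking" else "elsewhere"
        tallies[player][bucket][0] += int(completed)
        tallies[player][bucket][1] += 1

    result = {}
    for player, tally in tallies.items():
        a_done, a_att = tally["attacking"]
        e_done, e_att = tally["elsewhere"]
        if a_att == 0 or e_att == 0:
            continue  # need attempts in both areas to compare
        diff = 100 * a_done / a_att - 100 * e_done / e_att
        result[player] = (diff, a_att)
    return result


zone_differences = attacking_third_difference(PASSES)
```

In this made-up data, Player B's difference is extreme because it rests on a single attacking third attempt. That is the same small-sample effect behind the outliers mentioned above, and the reason the attempt count is kept next to each difference.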
One thing worth pointing out is that most of the players in the top half of the chart were German. This stands out even more when we take these two different passing completion percentages (in the attacking third vs. everywhere else) and put them on a dot plot, with a color for each team, as shown below.
The further to the right, the higher the player’s open play passing completion percentage in the defensive and middle third. The higher up, the higher the player’s open play passing completion percentage in the attacking third. The size of the dot indicates the number of open play pass attempts in the attacking third, so players who attempted more passes in that part of the field stand out more.
Almost every German player was above the median for open play passing completion percentage in the attacking third. Notably, Marozsan was the only player above the 75th percentile (better than 75% of all players in the match) for both categories. Meanwhile, it looks like Brian’s passing in this match was negatively affected the most when attempting a pass from within the attacking third.
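A check like "at or above the 75th percentile in both categories" could be sketched as follows. The player names and percentages are invented, and the choice of the "inclusive" quantile method is an assumption; a different percentile definition could shift the cut-offs slightly.

```python
from statistics import quantiles

# Hypothetical completion percentages per player:
# (defensive and middle third %, attacking third %).
PLAYERS = {
    "Player A": (90.0, 80.0),
    "Player B": (85.0, 50.0),
    "Player C": (70.0, 75.0),
    "Player D": (60.0, 40.0),
    "Player E": (75.0, 60.0),
}


def top_quartile_in_both(players):
    """Return the sorted names of players at or above the 75th percentile in both categories."""
    elsewhere = [e for e, _ in players.values()]
    attacking = [a for _, a in players.values()]
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the 75th percentile.
    e_cut = quantiles(elsewhere, n=4, method="inclusive")[2]
    a_cut = quantiles(attacking, n=4, method="inclusive")[2]
    return sorted(
        name
        for name, (e, a) in players.items()
        if e >= e_cut and a >= a_cut
    )


standouts = top_quartile_in_both(PLAYERS)
```

With a single match's worth of players, the cut-offs are sensitive to one or two values, so a result like this is better read as a pointer for closer viewing than as a ranking.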
Unfortunately for Germany, despite having better passing completion percentages in the attacking third and applying what appears to have been great pressure on the U.S. defense, they still lost due to an incredible take-on by Alex Morgan in the penalty box that led to an equalizer and an equally incredible error from Almuth Schult, the German goalkeeper, that gave Sam Mewis the game-winner.
Better passing in the attacking third, then, wasn’t enough to get Germany the win, which is really all that ultimately matters in soccer. It’ll be interesting to see, though, as we get more data for more matches, if that’s out of the ordinary. All that pressure on the U.S. defense did get the Germans a goal and credit as the only team in 2016 to date to score a goal on the United States. It may not be a guarantee of victory, but I suspect it points most teams in the right direction.
Either way, the way the U.S. goals came about is a nice segue into an analysis of take-ons (and what a player does afterwards) and changes in possessions (and where they happen), which I hope to do in the coming week with the USA-Colombia matches.
You can view the stats and visualizations used in this blog post on Tableau and the WoSo Stats Shiny app. All the source data is freely available on the GitHub repository.
Okay, if you’ve scrolled this far down then hopefully you’ll be interested enough to help us contribute to our small but growing database of women’s soccer stats. As almost everyone who’s tried to search for something as simple as passing stats for their favorite player knows, there’s a dearth of even the most basic stats for women’s soccer and really women’s sports in general.
Please help us change that, one match at a time! We need people who are willing to volunteer some time and effort (any and all would be appreciated) into logging data for women’s soccer matches. To see which matches immediately need help, check out this month’s goals. To learn how to help and get started, read here. The hope is, for starters, to track every NWSL 2016 match but we still need more people!