USWNT passing – comparing positions and opponents’ FIFA rankings

Over the past couple of days I’ve been trying to figure out how to create a Tableau workbook that aggregates all our USWNT data in a similar fashion to the NWSL 2016 Tableau workbook. The main challenge has been figuring out how to best show and compare stats from USWNT that, quite frankly, are all over the place due to how varied the quality of opponents has been.

Thankfully, we can combine the USWNT stats tables we’ve got in the GitHub repo with the database.csv file, which has data for all the matches in the WoSo Stats GitHub repo, to show something like passing stats adjusted for the opponent’s quality.

The visualizations for the USWNT data, for now, are the two worksheets in this Tableau workbook. Below, I’ll explain what each one is and go into more detail on how the data was calculated and aggregated, to make it easier for you to make similar visualizations.

I won’t delve too much into an actual analysis of the data in the two charts. There’s too much there to go into right now, and why should I have all the fun when you can dig in, too? Anyway, on to the charts.

Visualizing USWNT Open Play Passing Stats

First, here is a visualization of passing stats for the USWNT matches that we have in our database. Each mark on the chart below represents one USWNT player’s performance in one match. The x-axis is her total number of open play passes attempted during that match, and the y-axis is her open play passing completion percentage. The color is her designated “position” (more on this later), and the shape of the mark shows whether or not the opponent, at the time, had a FIFA ranking in the top 15.
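If you want to rebuild these marks outside of Tableau, each one boils down to four values. Here is a rough sketch of that calculation in Python. The field names (`op_pass_att`, `opp_fifa_rank`, and so on) are placeholders I made up for illustration, not the actual column names in the stats tables, and the example row is invented.

```python
from dataclasses import dataclass

# NOTE: every field name below is hypothetical; the real stats tables
# in the repo may label these columns differently.
@dataclass
class PlayerMatch:
    player: str
    position: str        # one designated position per player
    opp_fifa_rank: int   # opponent's FIFA ranking at the time of the match
    op_pass_att: int     # open play passes attempted
    op_pass_comp: int    # open play passes completed

def chart_mark(row: PlayerMatch, top_n: int = 15) -> dict:
    """Return the x, y, color, and shape values for one mark on the chart."""
    # A player with no attempts has no completion percentage to plot.
    pct = row.op_pass_comp / row.op_pass_att if row.op_pass_att else None
    return {
        "x_attempts": row.op_pass_att,
        "y_completion_pct": pct,
        "color_position": row.position,
        "shape_top_opponent": row.opp_fifa_rank <= top_n,
    }

# Invented example row, not real match data.
example = PlayerMatch("Example Midfielder", "midfielder", 2, 40, 32)
mark = chart_mark(example)
```

Returning `None` for a player with zero attempts keeps a divide-by-zero out of the chart data; how Tableau itself handles that case is not something I’m claiming here.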

[Chart: open play passes attempted vs. open play passing completion percentage, per USWNT player per match]

Midfielders and defenders generally pass the ball more, which is to be expected. Forwards, who are often surrounded by defenders, and goalkeepers, who may often launch the ball forward, see less of the ball and have lower passing completion percentages. It’s pretty clear that differences in passes attempted and in passing completion percentage have to do with the nature of a player’s position. We need to better adjust for position.

Adjusting For A Player’s Position

This visualization shows passing stats adjusted for a USWNT player’s position by plotting how many standard deviations she is from the average for USWNT players in her position.
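For anyone who wants to try the same adjustment outside of Tableau, here is a minimal sketch of the idea: group the player-match rows by position, then express each value as a number of standard deviations from that position’s average. The row keys are hypothetical, the sample rows are invented, and I’m using the population standard deviation (`pstdev`) as an assumption; the workbook may well use the sample version instead.

```python
from collections import defaultdict
from statistics import mean, pstdev

def position_z_scores(rows, stat):
    """Add a z-score for `stat`, computed within each position group."""
    by_position = defaultdict(list)
    for row in rows:
        by_position[row["position"]].append(row[stat])

    scored = []
    for row in rows:
        values = by_position[row["position"]]
        avg, sd = mean(values), pstdev(values)
        # A group with no spread (for example, a single row) has no
        # meaningful z-score, so leave it as None.
        z = (row[stat] - avg) / sd if sd > 0 else None
        scored.append({**row, f"{stat}_z": z})
    return scored

# Invented rows; "position" and "pass_att" are hypothetical key names.
rows = [
    {"player": "A", "position": "defender", "pass_att": 30},
    {"player": "B", "position": "defender", "pass_att": 50},
    {"player": "C", "position": "forward", "pass_att": 10},
    {"player": "D", "position": "forward", "pass_att": 20},
]
scored = position_z_scores(rows, "pass_att")
```

Running the same function once for passes attempted and once for completion percentage would give the two axes of the adjusted chart.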

[Chart: passing stats shown as standard deviations from the average for each player’s position]

Now it’s easier to spot which players, given their “designated” position, attempted to pass the ball more than average and completed their passes at a higher percentage than average. On the other hand, it’s also easier to spot which players passed the ball less than average and completed their passes at a lower percentage than average.

To account for some outliers, in the chart below I used the filters to exclude performances from any USWNT players who played fewer than 30 minutes and any USWNT players who had fewer than 10 open play pass attempts.
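Those two filters are simple to reproduce if you are working from the raw tables. A small sketch, assuming each row carries a `minutes` and an `op_pass_att` value (both key names are made up for this example, as are the rows):

```python
MIN_MINUTES = 30       # drop performances shorter than this
MIN_OP_PASS_ATT = 10   # drop performances with fewer attempts than this

def keep_performance(row):
    """True if a player-match row passes both outlier filters."""
    return row["minutes"] >= MIN_MINUTES and row["op_pass_att"] >= MIN_OP_PASS_ATT

# Invented rows for illustration only.
performances = [
    {"player": "A", "minutes": 90, "op_pass_att": 45},
    {"player": "B", "minutes": 25, "op_pass_att": 14},  # too few minutes
    {"player": "C", "minutes": 60, "op_pass_att": 9},   # too few attempts
    {"player": "D", "minutes": 30, "op_pass_att": 10},  # exactly at both cutoffs
]
kept = [row for row in performances if keep_performance(row)]
```

One thing to decide for yourself is whether to filter before or after calculating the position averages, since that changes what “average” means. This sketch doesn’t take a side on that.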

[Chart: position-adjusted passing stats, filtered to performances of at least 30 minutes and at least 10 open play pass attempts]

A few things stand out. One, it’s easier to rack up passing attempts with a high passing completion percentage against lesser opponents, as shown by how many more cross-shaped marks than circle-shaped marks sit in the upper-right. Two, playing top opposition can drastically cut down on both, with several circle-shaped marks scattered across the bottom-left corner.

Players’ “Designated” Positions and Next Steps

About the positions. Players are only given one for all their matches, instead of one for each match. This means that a player like Allie Long, who in this chart is classified as a “midfielder,” is being misrepresented for games where she played as a defender.

And even within positions, some further refinement could be used. Fullbacks like Kelley O’Hara and Ali Krieger, who are correctly classified as “defenders,” have a propensity towards lower passing completion percentages because, as fullbacks, they often play higher up the pitch where a completed pass is less likely. But because they’re defenders, they’re measured against centerbacks, who are also correctly called “defenders” but have some of the highest completion percentages in the game. As a result, a fullback’s passing completion percentage, expressed as standard deviations from the average for all defenders, looks worse than it really is.

A next step is going to be to resolve that Allie Long problem and figure out, on a match-by-match basis, a player’s position for a given match. After that, some positions could be broken down further, such as splitting defenders into fullbacks and centerbacks.

Another idea is to only show passing stats broken down by thirds of the field. I suspect the difference in passing stats vs Top 15 opponents and non-Top 15 opponents would be even more stark when we look at the attacking third.
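To give a sense of what that breakdown might look like, here is a rough sketch that groups individual passes by third of the field and by whether the opponent was in the top 15, then computes a completion rate for each group. It assumes each pass has already been tagged with a third, which is a step this post hasn’t worked out yet. All key names and rows are invented.

```python
from collections import defaultdict

def completion_by_third(passes):
    """Completion rate keyed by (third of the field, top-15 opponent?)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [completed, attempted]
    for p in passes:
        key = (p["third"], p["top15_opponent"])
        totals[key][0] += int(p["completed"])
        totals[key][1] += 1
    return {key: comp / att for key, (comp, att) in totals.items()}

# Invented passes; "third", "top15_opponent", and "completed" are
# hypothetical keys, not columns from the repo.
passes = [
    {"third": "attacking", "top15_opponent": True, "completed": False},
    {"third": "attacking", "top15_opponent": True, "completed": True},
    {"third": "attacking", "top15_opponent": False, "completed": True},
    {"third": "attacking", "top15_opponent": False, "completed": True},
    {"third": "defensive", "top15_opponent": True, "completed": True},
]
rates = completion_by_third(passes)
```

Comparing `rates[("attacking", True)]` against `rates[("attacking", False)]` is the comparison I have in mind, though these numbers are made up and say nothing about the real data.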

You can help!

This data only happens because of help from fans like you (yes, you)! The WoSo Stats project needs help to log more stats and location data for USWNT matches and past NWSL seasons. With your help, we can get even richer data to expand on what we know about the sport.

If you’re interested in logging data for matches (that are all publicly available on YouTube), read more here and email me at wosostats.team@gmail.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats GitHub repo and will help me and others do more analyses like these!

How to explore USWNT passing stats with heat maps

Over the past several months, in addition to tracking actions such as passes and interceptions, we have also been adding location data to as many USWNT and NWSL 2016 matches as we can. The process for how that works is explained here, but here’s what it ends up looking like on the match’s actions spreadsheet (note the “poss.location” and “def.location” columns) for the USA-Germany SheBelieves Cup match:

[Screenshot: actions spreadsheet for the USA-Germany SheBelieves Cup match, showing the “poss.location” and “def.location” columns]

This series of events can be seen at https://streamable.com/xskp

The values in the “poss.location” and “def.location” columns (as well as the “poss.play.destination” column, which is blank here) represent the location of the player from the “possessing” team, based on splitting up the field into different zones as shown here. In the series of events shown above, play is shown moving from Babett Peter in the defensive middle third’s right wing, back to Almuth Schult in the defensive third’s center, and then all the way to the attacking right third where Anja Mittag attempts a side pass that is recovered by Morgan Brian in her own 18-yard box. Also logged is the location of defenders doing certain defensive actions, such as applying pressure onto a pass (as Alex Morgan did) or engaging in an aerial duel with the possessing team (as Crystal Dunn did).

As you can imagine, analyzing something like this, especially over the course of an entire match, is best done in a two-dimensional format. There are only so many different stats tables you can make before you eventually need to put this on a heat map, like this!
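As a first step before any heat map, it can help to simply tally how often each zone shows up in a location column. Here is a minimal sketch that does that for “poss.location” and “def.location”. The sample rows, the “poss.player” and “def.player” headers, and the zone labels (`ZONE_A`, `ZONE_B`, `ZONE_C`) are all stand-ins I invented; the real spreadsheets use their own zone codes and have many more columns.

```python
import csv
import io
from collections import Counter

# Invented sample rows. Only "poss.location" and "def.location" are
# column names taken from the spreadsheet; the rest is hypothetical.
SAMPLE = """poss.player,poss.location,def.player,def.location
Player One,ZONE_A,,
Player Two,ZONE_B,Defender One,ZONE_B
Player Three,ZONE_C,Defender Two,ZONE_C
Player Two,ZONE_B,,
"""

def count_locations(csv_text, column):
    """Count how many rows have each zone value in the given column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        zone = (row.get(column) or "").strip()
        if zone:  # a blank cell means no location was logged for that row
            counts[zone] += 1
    return counts

poss_counts = count_locations(SAMPLE, "poss.location")
def_counts = count_locations(SAMPLE, "def.location")
```

To run this against a real match, you would read the downloaded spreadsheet’s text in place of `SAMPLE`.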

[Screenshot: example heat map]

I created heat maps like the one above for these eight 2016 USWNT matches for which we currently have location data:

  • USA-Ireland (1/23/16 – International Friendly)
  • USA-Costa Rica (2/10/16 – 2016 Olympic CONCACAF Qualifiers)
  • USA-Mexico (2/13/16 – 2016 Olympic CONCACAF Qualifiers)
  • USA-Canada (2/21/16 – 2016 Olympic CONCACAF Qualifiers)
  • USA-England (3/3/16 – 2016 SheBelieves Cup)
  • USA-France (3/6/16 – 2016 SheBelieves Cup)
  • USA-Germany (3/9/16 – 2016 SheBelieves Cup)
  • USA-Colombia (4/6/16 – International Friendly)

The heat maps were created with Excel and can be downloaded here (click on “View Raw” to download).

There’s a heat map for each match in the second sheet of the Excel workbook. Currently, the heat maps only depict completed passes that were made from within each zone. To change the player the heat map is depicting, just change the name of the player in the cell below where it says “Enter name here”.
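If you would rather build something similar in code, the core of it is small: pick a player, count her completed passes from each zone, and scale the counts so the busiest zone is the darkest. The sketch below is not how the Excel workbook is built internally. The 3x3 grid, the zone labels, and the key names are simplifications I made up, and the real zone layout is more detailed.

```python
from collections import Counter

# Hypothetical 3x3 layout of zone labels (attacking row on top).
GRID = [
    ["A_L", "A_C", "A_R"],
    ["M_L", "M_C", "M_R"],
    ["D_L", "D_C", "D_R"],
]

def heat_map(passes, player):
    """Return a grid of (count, intensity) for a player's completed passes."""
    counts = Counter(
        p["zone"] for p in passes
        if p["player"] == player and p["completed"]
    )
    peak = max(counts.values(), default=0)
    # Intensity is the count scaled to 0..1 against the busiest zone.
    return [
        [(counts[z], counts[z] / peak if peak else 0.0) for z in row]
        for row in GRID
    ]

# Invented passes for illustration only.
passes = [
    {"player": "Example Player", "zone": "M_C", "completed": True},
    {"player": "Example Player", "zone": "M_C", "completed": True},
    {"player": "Example Player", "zone": "A_R", "completed": True},
    {"player": "Example Player", "zone": "A_R", "completed": False},
    {"player": "Someone Else", "zone": "D_C", "completed": True},
]
grid = heat_map(passes, "Example Player")
```

Note that the name comparison is exact, so a misspelled or differently capitalized name produces an empty map rather than an error.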


Next to each heat map is a big table of stats and player info, which is where the heat map is getting its data. Don’t change any of this! Unless you really, really know what you’re doing. Make sure the player name you type in for a heat map matches the name of the player in that heat map’s adjacent stats table.

If worst comes to worst and you mess something up, just re-download the Excel workbook from the GitHub repo.

This is still a work in progress that I figured out over the course of a night. Way more than just passes can be put on this heat map, and we also have location data for more than just USWNT matches (we also have NWSL 2016 matches!). For now, though, this works.

If you run into any issues, send me a tweet at @WoSoStats or email me at wosostats.team@gmail.com.

We need volunteers!

If you made it this far, maybe you’re willing to help us log even more data! We are always in need of more volunteers to help us log match actions and location data for women’s soccer matches. Without the help of fans volunteering their time for this project, none of this data is possible. No experience is necessary, just a willingness to learn. Read more about how to help here: https://wosostats.wordpress.com/how-to-help/