Over the past couple of days I’ve been trying to figure out how to create a Tableau workbook that aggregates all our USWNT data in a similar fashion to the NWSL 2016 Tableau workbook. The main challenge has been figuring out how to best show and compare stats from USWNT that, quite frankly, are all over the place due to how varied the quality of opponents has been.
Thankfully, we’re able to use all the USWNT stats tables we’ve got in the GitHub repo and use the database.csv file, with data for all the matches in the WoSo Stats GitHub repo, to create something that can show something like passing stats adjusted for the opponent’s quality.
The visualizations for the USWNT data, for now, are the two worksheets in this Tableau workbook. Below, I’ll explain what each one is, and some more detail on how how the data was calculated and aggregated to make it easier for you to make similar visualizations.
I won’t delve too much into an actual analysis of the data in the two charts. There’s too much there to go into right now – and why have all the fun when you can do that, too? Anyways, on to the charts
Visualizing USWNT Open Play Passing Stats
First, this visualization of USWNT passing stats for the USWNT matches that we have in our database. Each mark on the chart below represents a USWNT player from a match in our database. The x-axis is her total number of open play passes attempted during that match, the y-axis is her open play passing completion percentage. The color is her designated “position” (more on this later) and the shape of the mark is whether or not the opponent, at the time, had a FIFA ranking in the top 15.
Midfielders and defenders generally pass the ball more, which is to be expected. Forwards, who are often surrounded by defenders, and goalkeepers, who may often launch the ball forward, see less of the ball and have lower passing completion percentages. It’s pretty clear that differences in passes attempted and in passing completion percentage have to do with the nature of a player’s position. We need to better adjust for position.
Adjusting For A Player’s Position
This visualization shows passing stats adjusted for a USWNT player’s position by using her standard deviation from the average for USWNT players in her position.
Now it’s easier to spot which players, given their “designated” position, attempted to pass the ball more than average and completed their passes at a higher percentage than average. On the other hand, it’s also easier to spot which players passed the ball less than average and completed their passes at a lower percentage than average.
To account for some outliers, in the chart below I used the filters to exclude performances from any USWNT players who played less than 30 minutes and any USWNT players who had less than 10 open play pass attempts.
A few things stand out. One, it’s easier to rack up more passing attempts with a high passing completion percentage against lesser opponents, as indicated by how many more cross-shaped marks compared to circle-shaped marks are in the upper-right. And playing top opposition can drastically cut down on both, with several circle-shaped marks spread out throughout the bottom-left corner.
Players’ “Designated” Positions and Next Steps
About the positions. Players are only given one for all their matches, instead of one for each match. This means that a player like Allie Long who in this chart is classified as a “midfielder” is being misrepresented for games where she has played as a defender.
And even within positions, some further refinement could be used. Fullbacks like Kelley O’Hara and Ali Krieger, who are correctly classified as “defenders,” have a propensity towards lower passing completion percentages because, as fullbacks, they often play higher up the pitch where a completed pass is less likely. But because they’re defenders, their passing completion percentage’s standard deviation from the average for all defenders looks worse than it really is because they’re counted against centerbacks, who are also correctly called “defenders” but have some of the highest completion percentages in the game.
A next step is going to be to figure out a way to resolve that Allie Long problem and figure out, on a match-by-match basis, a player’s position for a given match. And then further breaking down some positions like defenders into fullbacks and centerbacks.
Another idea is to only show passing stats broken down by thirds of the fields. I suspect the difference in passing stats vs Top 15 opponents and non-Top 15 opponents would be even more stark when we look at the attacking third.
You can help!
This data only happens because of help from fans like you (yes, you)! The WoSo Stats project needs help to log more stats and location data for USWNT stats, and past NWSL seasons. With your help, we can get even more richer data to expand on what we know about the sport.
If you’re interested in logging data for matches (that are all publicly available on YouTube), read more here and email me at email@example.com or send me a DM at @WoSoStats on Twitter. All the data logged will be publicly available on the WoSo Stats Github repo and will help me and others do more analyses like these!
Pingback: Applied Sports Science newsletter – July 18, 2017 | Sports.BradStenger.com