{"id":23,"date":"2013-02-18T18:49:40","date_gmt":"2013-02-19T02:49:40","guid":{"rendered":"http:\/\/gamesetmap.com\/?p=23"},"modified":"2013-02-18T18:53:21","modified_gmt":"2013-02-19T02:53:21","slug":"using-arcgis-for-sports-analytics","status":"publish","type":"post","link":"http:\/\/gamesetmap.com\/?p=23","title":{"rendered":"Using ArcGIS for sports analytics"},"content":{"rendered":"<p>The statistical component of sport has always provided a fascinating way to analyze performance and success. This might simply be the final score, but for some sports, such as football, baseball, cricket, golf and tennis, meaningful analysis of every facet of the game and a player or team\u2019s actions is part of the essence of the game itself. It is as common to see statistics and graphical summaries of the action reported as it is to see the action itself and this provides a fascinating insight into strategy as well as an explanation of outcome.\u00a0In this blog entry we explore the results of the London Olympics Gold Medal tennis match between Roger Federer and Andy Murray to show how you can use GIS to identify particular patterns within the match that may not have been exposed by using traditional non-geographical analysis and display techniques.<\/p>\n<p>Created using ArcGIS, figure 1 shows the location of where each player played a winning shot and their movement during every point of the gold medal match.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/FedererMurrayOlympicVisualisation1.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/FedererMurrayOlympicVisualisation1-1024x724.jpg\" width=\"640\" height=\"452\" \/><\/a><\/p>\n<p><em>Figure 1: An infographic showing the player movement and winning shot positions from the Olympic Gold Medal Match between Roger Federer and Andy Murray.<\/em><\/p>\n<p>Whilst figure 1 certainly carries a lot of 
visual impact, it doesn\u2019t actually tell us a whole lot. The player movement lines overlap one another and make it hard to distinguish which line relates to which point. We cannot tell the direction of movement in many cases because there are no directional arrows. The infographic also doesn\u2019t show where the winning stroke landed, or the direction of the shot. It also fails to show the temporal component of the match.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/TennisData.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/TennisData-1024x724.jpg\" width=\"640\" height=\"452\" \/><\/a><\/p>\n<p><em>Figure 2: The complete data set from the Olympic Gold Medal Match. 1708 point locations were collected from the three-set match.<\/em><\/p>\n<p><strong>Capturing the data<\/strong><\/p>\n<p>For the study we captured the tennis match data using\u00a0<a href=\"http:\/\/resources.arcgis.com\/en\/help\/main\/10.1\/index.html#\/Working_with_ArcGlobe_and_ArcScene\/00q8000000sv000000\/\">ArcScene 10.1<\/a>\u00a0and video footage of the match (see figure 3). We built a court at a scale of 1:1 in its correct geographic location (Centre Court at Wimbledon) and were able to quickly capture the location of each player\u2019s stroke and corresponding ball bounce for the match entirely from the video footage. At each location we collected a set of key attributes such as who played the stroke, what type of stroke it was, the stroke number, point number, game number, set number, who was serving, etc. 
The data captured provides a statistical summary of every shot in the match.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/ArcSceneImage1.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/ArcSceneImage1.jpg\" width=\"706\" height=\"490\" \/><\/a><\/p>\n<p><em>Figure 3: Video footage of the match in ArcScene. The red dots represent the player\u2019s stroke position and ball bounce. The green lines represent the direction of ball travel for each shot.<\/em><\/p>\n<p>By using ArcScene we were able to plot the player\u2019s position and ball bounces to within +\/-20cm using the\u00a0<a href=\"http:\/\/resources.arcgis.com\/en\/help\/main\/10.1\/index.html#\/\/00q800000124000000\">3D editing tools<\/a>. We approximated the camera angle of the video footage and set our data view to match. This made the data capture process rapid and increased accuracy, compared to a 2D environment, because we were able to continuously match the changing camera view in the video by using the\u00a0<a href=\"http:\/\/resources.arcgis.com\/en\/help\/main\/10.1\/index.html#\/Using_the_3D_Navigate_tool\/00q80000011z000000\/\">Navigate Scene<\/a>\u00a0control in ArcScene. This also helped us counter the scale distortion in the camera view when capturing points at the end furthest from the camera.<\/p>\n<p>Once all of the point data was captured, we used the\u00a0<a href=\"http:\/\/resources.arcgis.com\/en\/help\/main\/10.1\/index.html#\/\/0017000000tv000000\">XY To Line<\/a>\u00a0tool to create connectivity between the points using the shot, point, game and set number attributes. The lines are instrumental in allowing us to visualize stroke patterns (as you will see later in the blog entry). 
We ran the same XY To Line process to create player movement lines.<\/p>\n<p><strong>Visualising the data<\/strong><\/p>\n<p>Statistics from the match tell us that Andy Murray made a total of 18 winners to Roger Federer\u2019s 13. What these statistics don\u2019t tell us is where those winners occurred, the stroke of each winner, when the winner occurred and what led to the winning shot occurring. They also fail to show us any potential stroke patterns during the match. By capturing and storing all of the match data in a file geodatabase (figure 4) we are able to take advantage of the geo-location of these winners and create some interesting visualizations to tell a far more interesting story than single snapshots allow.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/table.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/table-1024x439.jpg\" width=\"640\" height=\"274\" \/><\/a><\/p>\n<p><em>Figure 4: Using a file geodatabase to store sports data in ArcGIS<\/em><\/p>\n<p>One of the challenges in dealing with sports data is that there are many instances of similar events occurring at the same or similar locations over relatively small periods of time. This often results in very tight clusters of points over very small areas of your court, pitch or field. If your data has an element of connectivity, you will additionally have overlapping lines along similar bearings and distances or lines that run in completely random directions, depending on the type of sport you are analyzing. This provides us with an interesting challenge of how to represent and compare this information meaningfully.<\/p>\n<p>One way to make sense of so many overlapping points and lines is to use a visualization technique (often promoted by Edward Tufte) called Small Multiples (see figure 5). 
Small multiples use a series of common basemaps (in our case a tennis court) with different slices of data on top of each map. The maps are arranged in a logical sequence, much like animated movie frames. Small multiples are useful to disaggregate your data, reducing the visual complexity and quantity of information so that it can more easily be seen and interpreted.<\/p>\n<div>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/TennisAnalyticsMurray2.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/TennisAnalyticsMurray2-1024x724.jpg\" width=\"640\" height=\"452\" \/><\/a><\/p>\n<p><em>Figure 5: Andy Murray\u2019s winning three-shot sequence visualized using small multiples. The green lines represent the forehand winning strokes and the blue lines, the backhand winning strokes.<\/em><\/p>\n<p>Figure 5 allows us to very quickly see some important patterns from the match that were not visible using traditional tabular statistics. The most immediate pattern observed is the direction of each winning shot (half of Murray\u2019s backhands were down-the-line winners). You can also quickly identify the position where the player made the winning shot (half of Murray\u2019s shots were made deep inside the court, near or around the service line) and the type of shot that was played (Murray\u2019s ratio of forehand to backhand winners was 10 to 8). Temporally, we can see that 7 of Murray\u2019s winners were made on game point, either for or against him. 
Figure 6 illustrates the amount of information each small multiple conveys and, therefore, the potential for recognition of patterns across a game or match.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/small-multiples1.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/small-multiples1-1024x465.jpg\" width=\"640\" height=\"290\" \/><\/a><\/p>\n<p><em>Figure 6: An explanation of the variables being mapped in the small multiples matrix<\/em><\/p>\n<p>Each individual image presents a second level of visual information that is likely to suit coaches, players or die-hard fans who want to know a little more about the game\u2019s pattern of play than maybe your average tennis fan or someone scanning the morning news. We have added some important temporal labels to the images to help users identify when the winning shot occurred, and we have varied the colour and line weight of lines in each image to reflect a level of importance and distinguish between line classes. Each stroke location is dynamically labeled from the stroke field in our file geodatabase, as is the sequence number. The player movement lines show us where the player has run from to make the winning shot. In 6 of Andy Murray\u2019s winners, he moved a considerable distance across the court to make the winning shot. The player movement lines also allow us to see the previous one or two strokes without actually showing the stroke lines on the map.<\/p>\n<p>You will notice we are only showing the two shots prior to the winning shot being made. We are mapping the \u2018set-up\u2019 stroke (point 1), the opponent\u2019s returning stroke (point 2) and the winning stroke (point 3). 
Showing more than two lead up strokes prior to the winning shot can cause confusion and potential distraction to the user (figure 7).<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/GeneralisationComparison.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/GeneralisationComparison-1024x258.jpg\" width=\"640\" height=\"161\" \/><\/a><\/p>\n<p><em>Figure 7: The image on the left displays all of the strokes (14 in total) leading up to the 4th winning shot. The image on the right displays only two shots leading up to the winning shot.<\/em><\/p>\n<p>Some generalization is needed to ensure you don\u2019t overwhelm the user with information. Finding the correct balance of generalization is one aspect of the research that we are continuing to explore. Trying to determine how many events, and what type of events led to a particular event happening is incredibly dynamic and problematic so it is vital erroneous assumptions aren\u2019t introduced during generalization.<\/p>\n<p>In order for the small multiples to work better in sequence we rotated the data frame of each image using the\u00a0<a href=\"http:\/\/resources.arcgis.com\/en\/help\/main\/10.1\/index.html#\/\/006600000004000000\">Data Frame tools<\/a>\u00a0in ArcGIS. This allowed us to map all Murray\u2019s shots from one end and Federer\u2019s from another which enabled clearer patterns in the match to be seen. 
Whilst it was suitable in this instance to shift all of a player\u2019s strokes to one end for visualization, in some cases this might not be suitable if, for instance, there were particular weather conditions that made play at one end more challenging. In this situation, being able to assess how different players react to different conditions might be an important component of the pattern of the match itself.<\/p>\n<p>Having already explored Murray\u2019s winning shot sequence, let\u2019s take a quick look at Federer\u2019s three-stroke winning pattern in figure 8, below.<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/TennisAnalyticsFederer.jpg\"><img loading=\"lazy\" class=\"aligncenter\" alt=\"\" src=\"http:\/\/blogs.esri.com\/esri\/arcgis\/files\/2012\/09\/TennisAnalyticsFederer-1024x724.jpg\" width=\"640\" height=\"452\" \/><\/a><\/p>\n<p><em>Figure 8: Roger Federer\u2019s winning three-shot sequence. The green lines represent the forehand winning stroke and the blue lines, the backhand winning stroke.<\/em><\/p>\n<p>Federer made only two winners on his backhand side (indicated by the blue lines) and 10 out of his 13 winners came directly as a result of moving his opponent off the court with a wide serve, leaving an open court for Federer to hit an easy winner into. His two backhand winners were both struck with little or no room for error. These two shots could have easily missed the mark, leaving Federer only 11 winners from 3 sets of tennis, all from the forehand side. Five of Federer\u2019s 13 winners came on game point, either for or against him.<\/p>\n<p>The small multiple format was perfect for this type of analysis. We were able to present a series of events over time in a logical, clear and concise manner. The two examples of gameplay explored in this blog entry show how powerful representing the results of sports data in a graphic form can be using GIS. 
By glancing at the images you take more away from the data than you would by simply seeing the totals of each winner in tabular form. By exploring them in detail we are able to reveal dimensions in the points, games and match that are simply impossible to gauge from other approaches. We are currently working on ways to animate particular scenes and looking into applications that serve the data up in an online environment giving users the ability to query the map for themselves and run their own analysis on the data.<\/p>\n<p>Sports analytics is a growing field, but currently a less frequented one in the world of GIS. Some of the world\u2019s largest sporting organizations such as Manchester City, Adidas and Nike, and leagues such as the EPL, NBA and AFL, are capturing every movement their players make and recording their actions. The challenge is to understand the best way to present this data to the players, coaches, media and fans.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The statistical component of sport has always provided a fascinating way to analyze performance and success. 
This might simply be the final score, but for some sports, such as football, baseball, cricket, golf and tennis, meaningful analysis of every facet &hellip; <a href=\"http:\/\/gamesetmap.com\/?p=23\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[4,7,8,5,6],"_links":{"self":[{"href":"http:\/\/gamesetmap.com\/index.php?rest_route=\/wp\/v2\/posts\/23"}],"collection":[{"href":"http:\/\/gamesetmap.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/gamesetmap.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/gamesetmap.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/gamesetmap.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=23"}],"version-history":[{"count":4,"href":"http:\/\/gamesetmap.com\/index.php?rest_route=\/wp\/v2\/posts\/23\/revisions"}],"predecessor-version":[{"id":26,"href":"http:\/\/gamesetmap.com\/index.php?rest_route=\/wp\/v2\/posts\/23\/revisions\/26"}],"wp:attachment":[{"href":"http:\/\/gamesetmap.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=23"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/gamesetmap.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=23"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/gamesetmap.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=23"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}