Mining Hawk-Eye data. It’s time to show the tennis world what they’re missing out on.

It’s been a little quiet around here recently, so I wanted to update you all on what’s been happening at Gamesetmap. The good news is there has been plenty of great work going on.

As many of you know I have been chasing down access to Hawk-Eye data for over 18 months now. One of my earliest posts on Gamesetmap outlined the challenges of getting access to the data. You can check out my post titled “Unlocking Hawk-Eye data: What it means for tennis, the ATP, WTA and ITF”.

Today, I’m pleased to announce that I now have access to a select number of matches, with the opportunity to purchase further matches in future. This is an enormous breakthrough for tennis, and for Gamesetmap!

Hawk-Eye tennis data

Let the Hawk-Eye data mining journey begin. A screen dump of the raw ball trajectory data from Hawk-Eye.

Generating the ball trajectory from the raw Hawk-Eye data was made possible with the help of Darren O’Shaughnessy. Darren runs a small consulting company in sports analytics and informatics called Ranking Software. He was able to extract the information I needed from the raw Hawk-Eye files in order to get started analysing the data. The image above shows the ball trajectory from the matches I have thus far.

Late last year I created a 3D Diorama using Hawk-Eye player tracking data from the same matches (see below):

A Diorama of Player Movement in Sport

The two datasets (player tracking and ball trajectory) provide an insight into tennis matches that has rarely been seen before.

Stay tuned over the next few months as I dig through these fabulous datasets to uncover the spatial patterns that exist in tennis.

Visualisations created using 3D ArcGIS.


Roland Garros Men’s Final 2014 – Game Tree

Rafael Nadal continued his dominance on Court Philippe Chatrier with an emphatic 3-6, 7-5, 6-2, 6-4 win over Novak Djokovic to take out the French Open for an incredible 9th time. Below are Novak and Rafa’s interactive Game Trees. You can explore the interactive game tree here.

Roland Garros Game Tree

Roland Garros Men’s Final 2014 Game Trees  (click to enlarge).

How the Game Tree Works

Each point is color-coded to reflect the momentum in each game. A match that is dominated by the server is highlighted with a thicker, outside flow through the ‘positive’ blue points of the Game Tree. More tightly contested service games result in thicker lines through the ‘neutral’ (white) and ‘negative’ (red) points of the Game Tree. You can click on each line to reveal how many times the score passed through that point in the match.

A Summary of the Roland Garros Game Tree

  • Nadal won 16 service games. He was broken 3 times.
  • Djokovic won 14 service games. He was broken 6 times.
  • Nadal won 6 service games from 40-30 (38%), and 5 from 40-0 (31%).
  • Djokovic won 9 service games from 40-15 (64%), and 2 from 40-30 (14%).
  • Djokovic lost his serve 3 times (50%) from 40-Ad. He lost his serve twice at 30-40 (33%), and once from 15-40 (17%).
  • Nadal lost his serve 2 times from 30-40 (67%). The one other time he lost serve was at 40-Ad.
  • Nadal won 74% of first points while serving.
  • Djokovic won 70% of first points while serving.
  • Neither player served out a shocker of a service game (losing from 0-40). However, both players found themselves at 0-30 once, but each time they were able to push the score back to 15-30.
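The percentages in this summary appear to be shares of each player's total holds (for games won) or total breaks (for games lost). A minimal Python sketch of that arithmetic follows; the helper name `share` is my own, and the counts are the ones quoted in the list above.

```python
def share(count: int, total: int) -> int:
    """Return count/total as a whole-number percentage, rounding half up."""
    return int(count / total * 100 + 0.5)


# Totals quoted in the summary above.
NADAL_HOLDS = 16
DJOKOVIC_HOLDS = 14
DJOKOVIC_BREAKS = 6

print(share(6, NADAL_HOLDS))      # Nadal's holds from 40-30
print(share(5, NADAL_HOLDS))      # Nadal's holds from 40-0
print(share(9, DJOKOVIC_HOLDS))   # Djokovic's holds from 40-15
print(share(3, DJOKOVIC_BREAKS))  # Djokovic's breaks from 40-Ad
```

Reading the figures this way, Nadal's 6 holds from 40-30 are 6 of his 16 holds, not 6 of every game that reached 40-30.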

Most Frequently Played Points

Game Tree Roland Garros

Roland Garros Men’s Final Most Frequently played points (click to enlarge).

Nadal played most of his points down the outside positive section of the game tree (as indicated by the rich yellow markers on the above diagram). Other commonly played points by Nadal were played at 15-15, 30-30 and 40-30 (as indicated by the lighter yellow markers).

Djokovic played most of his points down the positive section of the game tree, but with a little less safety than Nadal. Most of Djokovic’s points were played within one point of the neutral section of the game tree, whereas Nadal’s were played within two points of the neutral section of the game tree.

The Big Difference: 30-30

The big difference between the two players’ game trees is at 30-30.

Nadal played 30-30 seven times, and was able to convert 100% of those into a very positive position, 40-30 (see below). Djokovic also played 30-30 seven times, but was only able to get to 40-30 three times (43%). He was forced into a very dangerous position (30-40) four times (57%).

Roland Garros_30-30

Nadal                                          Djokovic


The Game Tree was originally developed to give us a better understanding of the final score, and how close a match was. It provides one of many ways to analyse the final result of a tennis match. In this match the game trees for each player were remarkably similar, hinting that the match was tightly contested. No player likes to see too many (if any) thick lines extending from the bottom four red circles of the game tree. (Note: lines extending from any of the four bottom red circles mean a player lost his serve.) In this match Djokovic saw his service broken 6 times (once at 15-40, twice at 30-40, and three times at 40-Ad). Djokovic played the 15-15 point better than Nadal, but clearly Nadal played 30-30 far better than Djokovic. Neither player played through deuce a lot (in comparison to other matches). However, Nadal was able to convert more winning games from deuce than Djokovic. Nadal converted 75% of games played through deuce, while Djokovic converted only 50%. At 40-30, Nadal also had a better conversion rate than Djokovic (66% to 40%).

As we see in so many of these matches, games are won and lost at a few very important points. The game tree suggests there wasn’t much in this match between these two players. It does, however, suggest that Nadal was perhaps the better ‘big point’ player, and as we know in tennis, big points win matches.

Make your own conclusion about the final using the interactive Game Tree here.



Journal of Medicine and Science in Tennis publishes Spatial Serve Variation article

I’m proud to announce that the Journal of Medicine and Science in Tennis (JMST) recently published my research on spatio-temporal serve variation in their October 2013 issue.


The Journal of Medicine and Science in Tennis (JMST) is an international, peer-reviewed journal produced by the Society for Tennis Medicine and Science (STMS) in co-operation with the ITF, the ATP, and the WTA Tour. It is published three times a year (February, June, and October).

The goal of the STMS is to be a forum for, and a disseminator of, tennis-specific and tennis-related information, and to act as a catalyst for advancement in tennis sports medicine and science.

Journal of Medicine and Science in Tennis

The cover of the October 2013 issue of the JMST.

JMST Article

It is an honor to have my work published in such a prestigious and well respected journal. It’s great to see this type of analytical tennis work getting some serious traction in well respected areas of the tennis profession. Stay tuned throughout 2014 for more geo-tennis work from GameSetMap!

The article was prepared alongside Dr Mark Kovacs.


The Federer Backhand – Using Analytics from a Coaching Perspective

Earlier this month Dr Mark Kovacs and I published an article in the May/June edition of Tennis Pro Magazine titled “The Federer Backhand – Using Analytics from a Coaching Perspective”.

TennisPro is the award-winning magazine of PTR (Professional Tennis Registry), published bi-monthly and distributed to PTR members in more than 117 countries.

Tennis Pro Magazine

TennisPro. The official magazine from the PTR.

The article analyzed which specific areas of the Federer backhand produce errors, and where these errors occur. The purpose of the article was to help coaches better understand how to use analytics to help players improve, and also how to scout future opponents to find areas of weakness.

Federer Backhand

Page 1 from the article.

Federer Backhand 2

Pages 2 and 3 from the article.

The article evaluates the 2012 Olympic Final between Roger Federer and Andy Murray. Andy Murray won the match 6-2, 6-1, 6-4.

You can read the article in full at the PTR website.

I hope you enjoy this example of how statistics and analytics are becoming more crucial in modern tennis at every level of the game.


Animating Player Movement using Hawk-Eye Optical Tracking Data

Over the last few nights I have been experimenting with ways to animate player movement in tennis, using optical tracking data from Hawk-Eye. Displaying how movement changes over time (or any temporal data for that matter) has long been, and still is, a great challenge for data visualization experts, animators and cartographers. This very quick animation highlights some of the advantages and disadvantages of 2D animations.

Over the next month or so as the project progresses I’ll start sharing some of my experiences of what I’ve learnt.

Until then…


IBM “Data Wall” @ 2013 US Open

Late last year I received a call from the very talented crew at Hush in New York to help on a project they were building for the 2013 US Open and IBM. You can check out a video they put together to promote the work below.

The IBM Data Wall had three layers of interaction: Playground View, Simple View and Detail View. Each digital “tennis ball” represented a match and the wealth of data that every match produces, and sometimes even included predictions of the outcome.

My primary role on the project was to:

  • contribute ideas and design sketches
  • be a source of tennis knowledge
  • act as a source of data interpretation

Sean Klassen in a recent article on the Communication Arts website had this to say about the project…

“Sports are chock-full of data enthusiasts, so it’s great to see IBM and the U.S. Open taking advantage of that fact with some of the most beautiful infographics I’ve ever seen.”

The full article by Communication Arts which reviews the application can be read here.

Some of my design sketches and ideas were transformed by the guys at Hush into these fabulous interactive infographics which were used in the final application (see below).

IBM Data Wall Important Point

Tennis Serve Speed Infographic


The main interface for exploring the Detail View ended up like this:

US Open 2013 IBM Data Wall

The guys at Hush had some fun with the concept of an exploding tennis ball for the Playground View – they literally built this overnight!

Exploding Yellow Tennis Ball

A more comprehensive collection of screen shots from the application can be seen at Hush’s official website.



Presenting a Diorama of Player Movement in Sport

Earlier today Sports Performance and Tech magazine published my article on visualizing Hawk-Eye player tracking data. The article explores the value of 3D and Space Time Cubes when displaying spatio-temporal data.

To create the Diorama I used 3D optical tracking data from an official Hawk-Eye tennis match played between Roger Federer and Paul-Henri Mathieu at the Swiss Indoors in Basel, 2012.

Sports Performance and Tech Magazine

A Diorama of Player Movement in Sport

The full article can be read here.

A Diorama of Player Movement in Sport

To explore the 3D Diorama in more detail please visit the app here.

*** The app is best viewed on a computer or laptop using Google Chrome ***


How We Made Nadal’s Interactive Game Tree

Late last year we published an interactive game tree celebrating Rafael Nadal’s historic 2013 season. The Game Tree allows users to visually explore how easily, or not, Nadal won each of his 666 service games in the Masters 1000 Tournaments, Grand Slams and World Tour Finals he played in 2013. This rare point-by-point summary shows where Nadal’s history-making season was won, and rarely lost.

Nadal Game Tree

Figure 1. Nadal’s Interactive Game Tree was released after the completion of the World Tour Finals, November, 2013. Click here to view the application.

How the Project Began…

The idea for the project came about after years of frustration at never really knowing how close a match was just by looking at the final score. For example, a 6-4, 6-4 score-line could mean multiple things: one break of serve, or multiple breaks of serve. The winner may have won their service games easily, or they might have been hotly contested. Clearly, the final score alone gives little indication of the competitiveness of the match. To ease this frustration we set out to find a way to graphically present how hard Nadal was challenged in his matches during the 2013 season.

Our Inspiration

Inspired by Donato Ricci et al.’s (2008) game tree-like infographic (Figure 2), we set out to illustrate the path to victory using game tree theory. Sometimes referred to as a tree of possibilities, a game tree represents paths from a starting point to an end point, often in a game scenario like chess. Tennis plugs perfectly into a game tree, as each game starts at 0-0 and moves in one direction only through the tree, depending on who wins each point.

Ricci et al’s, (2008) map

Figure 2. Mapping relationships between events. Ricci et al.’s (2008) map of the most common research methodologies used by various Italian design firms.
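To make the "one direction only" idea concrete, here is a minimal Python sketch of a single service game traced through the tree. It is an illustration only, not the code behind the application (which was written in JavaScript). The function names and the 'S'/'R' encoding of point winners are my own assumptions; the score labels (Deuce, Ad-40 for the server's advantage, 40-Ad for the receiver's) follow the usage in these posts.

```python
from __future__ import annotations

POINT_NAMES = ["0", "15", "30", "40"]


def score_label(server: int, receiver: int) -> str:
    """Name the score from the server's side, e.g. '40-30', 'Deuce', 'Ad-40'."""
    if server >= 3 and receiver >= 3:
        if server == receiver:
            return "Deuce"
        return "Ad-40" if server > receiver else "40-Ad"
    return f"{POINT_NAMES[server]}-{POINT_NAMES[receiver]}"


def game_path(point_winners: str) -> list[str]:
    """Trace one service game; 'S' means the server won the point, 'R' the receiver."""
    server = receiver = 0
    path = [score_label(server, receiver)]
    for winner in point_winners:
        if winner == "S":
            server += 1
        else:
            receiver += 1
        # A game ends once a player has at least four points and leads by two.
        if max(server, receiver) >= 4 and abs(server - receiver) >= 2:
            path.append("Hold" if server > receiver else "Break")
            break
        path.append(score_label(server, receiver))
    return path


print(game_path("SSSS"))  # ['0-0', '15-0', '30-0', '40-0', 'Hold']
print(game_path("SRSRSRSS"))
```

Each list is one path from the root of the tree (0-0) to a leaf (Hold or Break).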

In order to determine the effectiveness of Nadal during his service games we mapped the frequency of paths from one point in a game to the next. To do this, we borrowed concepts from a 19th-century cartographic method called flow mapping. Flow maps were first introduced by Henry Drury Harness in the Atlas to Accompany Second Report of the Railway Commissioners, Ireland (1837) (Figure 3).

Henry Drury Harness Map

Figure 3: Henry Drury Harness introduced the first flow map in 1837. The map varies line thickness to convey the quantity of traffic flowing between Irish cities.

The lines connecting each point in the Game Tree became the quantitative flow lines, and were scaled proportionally representing the number of times Nadal played through each point. The various line thicknesses allowed us to very quickly identify the most common path during each service game.
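The proportional scaling described above can be sketched in a few lines of Python. This is a hedged illustration, not the application's actual drawing code. The function names, the 20-pixel maximum width and the two sample paths are assumptions made up for the example.

```python
from collections import Counter


def edge_counts(paths):
    """Count how often play moved from one score to the next across all games."""
    counts = Counter()
    for path in paths:
        counts.update(zip(path, path[1:]))
    return counts


def line_widths(counts, max_width=20.0):
    """Scale each count linearly so the busiest line is max_width pixels wide."""
    if not counts:
        return {}
    busiest = max(counts.values())
    return {edge: max_width * n / busiest for edge, n in counts.items()}


# Two made-up service games, each written as a path through the tree.
sample_paths = [
    ["0-0", "15-0", "30-0", "40-0", "Hold"],
    ["0-0", "15-0", "15-15", "30-15", "40-15", "Hold"],
]

for edge, width in line_widths(edge_counts(sample_paths)).items():
    print(edge, width)
```

Both sample games pass through 0-0 to 15-0, so that line is drawn at full width and every other line at half width.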

The Data & Technology Behind the Game Tree

To create the game tree we began by downloading all of the appropriate matches from the William Hill sports website as XML files. Each match was available as a separate XML file and these files contained high-level information about the match (players, tournament info, date, etc.), along with a detailed point-by-point breakdown of the match. After a preliminary assessment of the data we developed a JavaScript application, which looped through the files and began to process the points.

William Hill Data

Figure 4. An extract of data from the XML game files used in the game tree.

We then prepared a series of functions in JavaScript to mimic the behavior of the game tree. The Game Tree at present only maps Nadal’s service games, so all points in his opponents’ service games were simply skipped and tie-break points were ignored. As the points were looped through and processed, we used the Raphaël JavaScript library to draw and animate the entire game tree using SVG (Scalable Vector Graphics). Some additional jQuery code was then added to hook up the tournament and match filters. The application was framed using HTML5, CSS3, Sass, Compass, and the Mueller Grid System.
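The filtering step (keep one player's service games, skip everything else) can be sketched as follows. Note the assumptions: the real application was written in JavaScript, and the XML layout below (`game` elements with `server` and `tiebreak` attributes, `point` elements with a `winner` attribute) is entirely hypothetical, invented for this example. It is not the William Hill schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, made up for illustration only.
SAMPLE_XML = """
<match>
  <game server="Nadal" tiebreak="false">
    <point winner="Nadal"/><point winner="Nadal"/><point winner="Djokovic"/>
    <point winner="Nadal"/><point winner="Nadal"/>
  </game>
  <game server="Djokovic" tiebreak="false">
    <point winner="Djokovic"/><point winner="Djokovic"/>
    <point winner="Djokovic"/><point winner="Djokovic"/>
  </game>
  <game server="Nadal" tiebreak="true">
    <point winner="Nadal"/><point winner="Djokovic"/>
  </game>
</match>
"""


def service_games(xml_text, player):
    """Return the player's service games as strings of 'S' (server won) and 'R'."""
    root = ET.fromstring(xml_text.strip())
    games = []
    for game in root.iter("game"):
        if game.get("tiebreak") == "true":
            continue  # tie-break points are ignored
        if game.get("server") != player:
            continue  # the opponent's service games are skipped
        games.append("".join(
            "S" if point.get("winner") == player else "R"
            for point in game.iter("point")
        ))
    return games


print(service_games(SAMPLE_XML, "Nadal"))  # ['SSRSS']
```

Only the first game survives the filter: the second is served by the opponent and the third is a tie-break.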

Designing the Application

Our design work started off by defining what the users’ expectations of the application were, and working out the simplest way of fulfilling their needs.

We defined a number of core functions the app should support:

  • The ability to compare game tree patterns at both the tournament and game level.
  • Multiple filtering at the season, tournament, and match level.
  • Interaction with the flow lines should reveal the exact quantities per line.
  • Tournaments should appear in the order they occurred, and the score should appear alongside each match.

Once we defined the core functions of the app we started sketching out how the game tree would support the application, and how we would visually organize the content for mobile, tablet and desktop devices.

Some of the earlier game tree concepts were centered on a circular game tree, before slowly transitioning to a more conventional representation of the tree diagrams (Figure 5).

Sketching out the Game Tree

Figure 5. Sketching the game tree designs. From here it was a matter of refining the triangular game tree until the design began to solidify.

It was important that we designed the game tree to be responsive across small and large devices. We needed to ensure a seamless user experience regardless of device type or size. To do this we introduced some mobile-ready functions into the design. For example, we collapsed the menu on smaller devices so the game tree remained the focal point of the application, and we re-arranged the text on the opening page for smaller devices (Figure 6).

Responsive Design

Figure 6. Designing the optimal viewing experience across tablet and mobile devices forced a reshuffle of some of the key elements of the application.

Each point in the Game Tree was color coded to reflect the momentum in each game: dark blue represents positive (+) momentum, red represents negative (-) momentum, and the neutral points down the spine of the tree were colored white (Figure 7).


Figure 7. Each point in the Game Tree is color coded to reflect momentum in the match.
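One plausible reading of this color scheme is that a point's color depends only on whether the server is ahead, level or behind at that score. The Python sketch below assumes exactly that. The function name and the rule itself are my interpretation of the description, not code taken from the application.

```python
def momentum_colour(server_points: int, receiver_points: int) -> str:
    """Colour a score by the server's lead: ahead, behind, or level."""
    lead = server_points - receiver_points
    if lead > 0:
        return "blue"   # positive momentum for the server
    if lead < 0:
        return "red"    # negative momentum for the server
    return "white"      # neutral spine: 0-0, 15-15, 30-30, Deuce


print(momentum_colour(3, 1))  # 40-15
print(momentum_colour(0, 1))  # 0-15
print(momentum_colour(2, 2))  # 30-30
```

Under this rule 0-15 falls in the red zone and 30-30 sits on the white spine, which matches how those scores are described elsewhere in the article.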

Results and Analysis

Nadal’s (6-2, 3-6, 6-4, 6-1) win against Novak Djokovic at last year’s US Open final illustrates the analytical power of the game tree (Figure 8).

Nadal v Djokovic Game Tree

Figure 8. The US Open final played between Rafael Nadal and Novak Djokovic. The game tree clearly highlights where Nadal played the majority of points on his serve (Deuce to Ad-40 – 12 times)

The score from the match (6-2, 3-6, 6-4, 6-1) indicates a fairly one-sided match. But the game tree tells us that Nadal won 6 of his service games from Ad-40 (more than from any other point). He and Novak wrestled back and forth between Deuce and Ad-40 12 times on Nadal’s serve. The frequency/line thickness through this part of the tree suggests that Novak had many opportunities to break Nadal’s serve, and that perhaps this match was much closer than the score suggests.

Nadal’s victory against Stanislas Wawrinka in the final of Madrid (6-2, 6-4) shows us how brutal Nadal can be when serving (Figure 9).

Nadal v Wawrinka Game Tree

Figure 9. An almost perfect service pattern. Nadal’s victory against Wawrinka in the final of Madrid (6-2, 6-4).

In 9 service games, Wawrinka never saw an opportunity to break Nadal in this match, coming close only once at deuce. Nadal’s remaining service games were won from commanding positions in the game (4 times each from 40-15 and 40-0). Nadal was in the red zone only twice (at 0-15), but each time he quickly pulled the momentum back in his favor for a quick path to winning the game. Whilst the final score suggests a relatively straightforward win for Nadal, it’s not until we see his service games visualized in this manner that we truly understand his dominance in the match.


We believe this is the first-ever interactive point-by-point Game Tree covering an entire season of tennis matches for one player.

In both the Djokovic and Wawrinka examples presented above, the game tree enabled a better understanding of the match than simply seeing the final score. The game tree presents opportunities for further analysis as well. For example, we are able to determine where Nadal is most effective on serve. We can see that at Deuce, Nadal beats his opponents more than at any other point. He fights back and forth between 40-Ad and Ad-40 (like against Djokovic in the US Open Final), but rarely loses when he is serving at Deuce. Across his 666 service games last season, his opponents only had a 1 in 5 chance (0.2) of winning the game from Deuce onwards.
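As a rough way to read the 1-in-5 figure, suppose the server wins every point independently with the same probability p. That is a simplifying assumption of this sketch, not something measured in the article. Under it, the chance of holding from Deuce is p² / (p² + (1 − p)²), because the game ends only when one player wins two points in a row. The function name below is my own.

```python
def hold_from_deuce(p: float) -> float:
    """Chance the server wins the game from Deuce, given point-win probability p."""
    q = 1.0 - p
    return p * p / (p * p + q * q)


# A server who wins two points in three holds from Deuce about 80% of the time,
# which leaves the receiver roughly the 1-in-5 chance quoted above.
print(round(hold_from_deuce(2 / 3), 3))
```

In other words, under this simple model a 0.2 break chance from Deuce corresponds to a server winning about two of every three points played from there.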

The simplicity of the Game Tree application, and its ability to graphically present traditional statistical data in a unique and informative way allows users to better understand the final score of a match and how games are played out over time.

Craig O’Shannessy, leading tennis analyst for the NY Times and the ATP, and former panelist at the MIT Sloan Sports Analytics Conference, labeled the application “pioneering, and groundbreaking”. It has featured heavily on well-respected data visualization websites, and Nadal’s Game Tree captured the imagination of tennis analysts, fans, and data visualization experts worldwide for its originality and function.

Stay tuned for further interactive sports visualizations in 2014!

Click here to view the Nadal Game Tree application.

This article was written for the MIT Sloan Sports Conference.

Damien Saunder (formerly Demaj) is a Geospatial Designer at Esri where he designs and builds online interactive maps. He is continually rethinking spatial analytics for tennis via @damiensaunder

David Webb is the web team lead at Rady Children’s Hospital-San Diego, where he builds responsive web sites and web applications. He enjoys experimenting and tackling interesting challenges via


Talk Like a Tennis Player: Word Clouds

Recently ASAP Sports released a bunch of Australian Open press conference transcripts via their website. I thought I’d have some fun and visualize some of the transcripts using a technique called a Word Cloud, or Tag Cloud. A word cloud is a cool way of visualizing text data. A word cloud application ‘tags’ the most frequently used words in a document and makes those words appear bigger than the other less commonly used words. Word clouds allow us to very quickly visualize the most commonly used words in a website, document or, in our case, a press conference transcript. Ever wanted to know how to talk like a tennis player? Well here’s your chance!

Creating a word cloud is really easy. There are many apps available online that do the tagging and arrangement for you. I used an app called Tagul. Let’s start with the new Swiss star of men’s tennis, Stanislas Wawrinka.

Wawrinka Word Cloud

A blog like this wouldn’t be complete if it didn’t include one of the great characters of women’s tennis, and 2014 Australian Open Champ, Li Na.

Li Na Word Cloud

Now let’s take a look at a couple of the other big names from the Australian Summer. Eugenie Bouchard was the rising star in women’s tennis in 2013, and she kicked off 2014 with a bang making a deep run into the semis at the Australian Open.

Bouchard Word Cloud

Rafael Nadal missed last year’s Australian Open through injury, but made an emphatic comeback in 2014, making the final of the first Grand Slam of the year. Let’s check out his word cloud.

Nadal Word Cloud

The top five most commonly used words by each player were:

Wawrinka: really (102), match (84), play (72), know (65), always (59)

Li Na: think (94), know (73), play (64), match (58), yeah (54)

Bouchard: really (72), think (61), well (47), know (46), bit (33)

Nadal: very (118), play (85), playing (60), great (59), against (58)

THE TOP FIVE: play (221), know (184), really (174), think (155), match (142)
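The combined totals appear to be the sums of the per-player top-five counts listed above, so a word that falls outside a player's top five contributes nothing for that player. A short Python sketch that reproduces them; the variable names are my own.

```python
from collections import Counter

# Per-player top-five counts, copied from the lists above.
TOP_FIVE = {
    "Wawrinka": {"really": 102, "match": 84, "play": 72, "know": 65, "always": 59},
    "Li Na": {"think": 94, "know": 73, "play": 64, "match": 58, "yeah": 54},
    "Bouchard": {"really": 72, "think": 61, "well": 47, "know": 46, "bit": 33},
    "Nadal": {"very": 118, "play": 85, "playing": 60, "great": 59, "against": 58},
}

combined = Counter()
for counts in TOP_FIVE.values():
    combined.update(counts)  # adds each player's counts to the running totals

print(combined.most_common(5))
# [('play', 221), ('know', 184), ('really', 174), ('think', 155), ('match', 142)]
```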

So what are the words you need to know if you want to talk like a tennis player? Well you need to use “Play” a lot in your conversation. Be sure to tell everyone how you “Really Think” the “Match” went and tell the press you “Know” you’re “Playing” “Great”. Don’t forget to drop in some other fillers like “Yeah” (Li Na), “Bit” (Bouchard), “More” (Nadal) and “Always” (Wawrinka).

Given that English is the native language of only one of these players (Bouchard), there are a surprising number of similarities amongst these four. I guess there is an element of monotony in the questions being asked during each press conference, hence the same words are used each time to answer them!

OK, so I’m not going to get all geeky and try to run some deep and meaningful analysis on these word clouds, because quite frankly they were just a bit of fun. However, we do know word clouds provide an effective way of representing text data in a fun and interesting way. In this case we were able to graphically summarize almost 30,000 words from the Aussie Open press conferences for Wawrinka, Li Na, Bouchard and Nadal (for what it’s worth!).

Note: The Tagul word cloud filters out common words like “a”, “some”, “this”, “us” etc.
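For anyone curious about what a word cloud app does under the hood, the counting and filtering can be sketched in Python as below. This is a minimal illustration and not Tagul's actual method. The stop-word list is a small made-up sample (Tagul's real list is not known to me), and the sample sentence is invented, not a transcript quote.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real app would use a much longer one.
STOP_WORDS = {"a", "some", "this", "us", "the", "i", "to", "and", "was"}


def top_words(text, n=5):
    """Return the n most frequent words in text, ignoring stop words and case."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(n)


sample = "I really think the match was really tough. I think I really know."
print(top_words(sample, 2))  # [('really', 3), ('think', 2)]
```

A word cloud then simply draws each word at a size proportional to its count.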