Recently ASAP Sports released a bunch of Australian Open press conference transcripts via their website. I thought I’d have some fun and visualize some of the transcripts using a technique called a Word Cloud, or Tag Cloud. A word cloud is a cool way of visualizing text data. A word cloud application ‘tags’ the most frequently used words in a document and makes those words appear bigger than the less commonly used words. Word clouds allow us to very quickly visualize the most commonly used words in a website, a document or, in our case, a press conference transcript. Ever wanted to know how to talk like a tennis player? Well, here’s your chance!
Creating a word cloud is really easy. There are many apps available online that do the tagging and arrangement for you. I used an app called Tagul. Let’s start with the new Swiss star of men’s tennis, Stanislas Wawrinka.
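If you're curious what the 'tagging' step might look like under the hood, here is a minimal Python sketch. It is only an illustration of the general idea, not how any particular app works: the function name `word_sizes`, the font size range and the linear scaling are all assumptions of mine.

```python
import re
from collections import Counter

def word_sizes(text, min_size=10, max_size=60):
    """Map each word to a font size that grows with how often it appears.

    The size range and the linear scaling are illustrative assumptions.
    """
    # Lower-case the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets max_size; the rest scale down linearly.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }

sizes = word_sizes("I play my game and I really play well")
print(sizes["play"], sizes["really"])
```

The more often a word shows up, the bigger its number, which is all a word cloud really needs before it starts arranging the words on the page.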
A blog like this wouldn’t be complete if it didn’t include one of the great characters of women’s tennis, and 2014 Australian Open Champ, Li Na.
Now let’s take a look at a couple of the other big names from the Australian summer. Eugenie Bouchard was the rising star in women’s tennis in 2013, and she kicked off 2014 with a bang, making a deep run into the semis at the Australian Open.
Rafael Nadal missed last year’s Australian Open through injury, but made an emphatic comeback in 2014, making the final of the first Grand Slam of the year. Let’s check out his word cloud.
The top five most commonly used words by each player were:
Wawrinka: really (102), match (84), play (72), know (65), always (59)
Li Na: think (94), know (73), play (64), match (58), yeah (54)
Bouchard: really (72), think (61), well (47), know (46), bit (33)
Nadal: very (118), play (85), playing (60), great (59), against (58)
THE TOP FIVE: play (221), know (184), really (174), think (155), match (142)
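For anyone who wants to check the arithmetic, the combined totals match what you get by adding up the four top-five lists above. Here is a small Python sketch of that sum. The variable names are mine, and note that it only sees words that made a player's top five, so a word sitting just outside someone's list is not counted.

```python
from collections import Counter

# Per-player top-five counts, copied from the lists above.
top_five = {
    "Wawrinka": {"really": 102, "match": 84, "play": 72, "know": 65, "always": 59},
    "Li Na": {"think": 94, "know": 73, "play": 64, "match": 58, "yeah": 54},
    "Bouchard": {"really": 72, "think": 61, "well": 47, "know": 46, "bit": 33},
    "Nadal": {"very": 118, "play": 85, "playing": 60, "great": 59, "against": 58},
}

# Counter.update adds counts together instead of overwriting them.
combined = Counter()
for counts in top_five.values():
    combined.update(counts)

overall = combined.most_common(5)
print(overall)
```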
So what are the words you need to know if you want to talk like a tennis player? Well, you need to use “Play” a lot in your conversation. Be sure to tell everyone how you “Really Think” the “Match” went, and tell the press you “Know” you’re “Playing” “Great”. Don’t forget to drop in some other fillers like “Yeah” (Li Na), “Bit” (Bouchard), “More” (Nadal) and “Always” (Wawrinka).
Given that English is the native language of only one of these players (Bouchard), there are a surprising number of similarities among the four. I guess there is an element of monotony in the questions asked at each press conference, hence the same words are used each time to answer them!
OK, so I’m not going to get all geeky and try to run some deep and meaningful analysis on these word clouds, because quite frankly they were just a bit of fun. However, we do know word clouds provide a fun and interesting way of representing text data. In this case we were able to graphically summarize almost 30,000 words from the Aussie Open press conferences for Wawrinka, Li Na, Bouchard and Nadal (for what it’s worth!).
Note: The Tagul word cloud filters out common words like “a”, “some”, “this”, “us” etc.
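If you wanted to reproduce that kind of filtering yourself, a rough Python sketch could look like the one below. The stop-word list here is a made-up example (Tagul's actual list isn't shown in this post), and `top_words` is just a name I picked.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; not Tagul's actual list.
STOP_WORDS = {"a", "some", "this", "us", "the", "and", "i", "to", "is"}

def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    kept = [w for w in words if w not in stop_words]
    return Counter(kept).most_common(n)

sample = "I think this is a match I know I can play, and I think I play well"
print(top_words(sample, n=2))
```

Without the filter, “I” would swamp everything else in that sample sentence, which is exactly why the common words get dropped before the cloud is drawn.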