Word Analysis of the Presidential Debate

After the Republican and Democratic conventions wrapped, I wrote a brief analysis of each nominee’s convention speech, which I think gave some interesting insight into what is most important to each candidate. Since Monday was the first debate between the two presidential candidates, I thought it might be interesting to do a similar analysis. As with the convention speeches, I wondered which topics each candidate addressed most frequently, how those topics differed between the two, and how much time each candidate spent talking about their opponent.

Analyzing the Word Usage
Fortunately, many media outlets provided a full transcript of the debate. I obtained a copy of the transcript from the Washington Post. From there, I separated out the words of each candidate, then used a tool to parse out each word and its frequency of use (this is the same tool I used for the speech analysis). Once I had the frequency of each word, I did some cleanup, filtering out pronouns, prepositions, conjunctions, and other common words, and removing any words that were used fewer than six times.
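
For readers who want to reproduce the counting step without a dedicated tool, here is a minimal Python sketch. It is not the tool used for this analysis. It assumes the transcript marks each turn with an upper-case speaker label such as "CLINTON:" at the start of a line, and the stop-word list is a small hypothetical stand-in for the much broader cleanup described above. The function names are my own.

```python
import re
from collections import Counter, defaultdict

# Assumption: each turn starts with an upper-case label, e.g. "TRUMP: ...".
SPEAKER_LINE = re.compile(r"^([A-Z]+):\s*(.*)$")

# Hypothetical, deliberately short stop-word list for illustration only.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "for",
    "with", "i", "you", "he", "she", "it", "we", "they", "is", "are",
    "was", "were", "be", "have", "has", "that", "this", "our",
}

def split_by_speaker(transcript):
    """Group transcript text by speaker label; unlabeled lines continue the current turn."""
    by_speaker = defaultdict(list)
    current = None
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        match = SPEAKER_LINE.match(line)
        if match:
            current, line = match.group(1), match.group(2)
        if current is not None and line:
            by_speaker[current].append(line)
    return {name: " ".join(parts) for name, parts in by_speaker.items()}

def word_frequencies(text, min_count=6, stop_words=STOP_WORDS):
    """Count words, drop stop words, and keep words used at least min_count times."""
    tokens = (t.strip("'") for t in re.findall(r"[a-z']+", text.lower()))
    counts = Counter(t for t in tokens if t and t not in stop_words)
    return {word: n for word, n in counts.items() if n >= min_count}
```

Calling `word_frequencies(split_by_speaker(transcript)["CLINTON"])` would then give one candidate's filtered counts. The default `min_count=6` mirrors the "fewer than six times" cutoff; a real run would need a far larger stop-word list to match the cleanup described here.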

From the data, we can make a few observations:
  • Mr. Trump spoke a lot more than Secretary Clinton, using a total of 8,674 words compared to Clinton’s 6,384, a difference of over 35%.
  • As in the convention speeches, both candidates spent a lot of time speaking about their opponent. Trump’s #3 word was “Secretary” (27 occurrences) and his #5 word was “Clinton” (22 occurrences); he also said “Hillary” eight times. Clinton’s #2 word was “Donald” (25 occurrences). Interestingly, she only said “Trump” twice, preferring to refer to her opponent by his first name.
  • Not surprisingly, since this was a debate and the candidates were being asked similar questions, there is a lot of overlap in each candidate’s most commonly used words. These include “People”, “Good”, “Country”, and “Jobs”.
  • The most commonly used words focused on the economy, as shown by the heavy use of “Jobs”, “Business”, “Economy”, “Tax”, “Money”, and “Companies”.
  • Clinton’s 10 most used words were:
1.  People – 33 occurrences
2.  Donald – 25 occurrences
3.  Good – 17 occurrences
4.  Jobs – 17 occurrences
5.  Country – 16 occurrences
6.  Tax – 16 occurrences
7.  Business – 15 occurrences
8.  Work – 15 occurrences
9.  American – 13 occurrences
10.  Economy – 13 occurrences

  • Trump’s 10 most used words (including a tie for tenth) were:
1.  Country – 51 occurrences
2.  People – 36 occurrences
3.  Secretary – 27 occurrences
4.  Years – 23 occurrences
5.  Clinton – 22 occurrences
6.  Companies – 20 occurrences
7.  Good – 20 occurrences
8.  Jobs – 19 occurrences
9.  Money – 17 occurrences
10 (tie).  Bad – 16 occurrences
10 (tie).  Great – 16 occurrences
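
The arithmetic behind these observations can be checked with a short Python sketch. The word counts and totals are copied from the lists and figures above; the helper names are my own, and the tie handling is one reasonable reading of how a "top 10" can end up with 11 entries.

```python
# Counts copied from the two top-word lists above.
clinton_top = {
    "People": 33, "Donald": 25, "Good": 17, "Jobs": 17, "Country": 16,
    "Tax": 16, "Business": 15, "Work": 15, "American": 13, "Economy": 13,
}
trump_top = {
    "Country": 51, "People": 36, "Secretary": 27, "Years": 23, "Clinton": 22,
    "Companies": 20, "Good": 20, "Jobs": 19, "Money": 17, "Bad": 16, "Great": 16,
}

def percent_more(larger, smaller):
    """How much bigger `larger` is than `smaller`, as a percentage of `smaller`."""
    return (larger - smaller) / smaller * 100

def top_with_ties(counts, n=10):
    """Return the n most used words, plus any words tied with the nth place."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [(word, count) for word, count in ranked if count >= cutoff]

# Words that appear in both candidates' lists.
shared = sorted(set(clinton_top) & set(trump_top))

print(round(percent_more(8674, 6384), 1))  # 35.9
print(shared)                              # ['Country', 'Good', 'Jobs', 'People']
print(len(top_with_ties(trump_top)))       # 11
```

The shared words match the overlap noted above, and Trump's total word count comes out roughly 36% higher than Clinton's.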

Word Cloud
As with the speech analysis, I thought word clouds would be a good way to visualize this data set, so I used Tableau Public to create a word cloud for each candidate, as shown below.

And, just for fun, I also analyzed the word usage of the moderator, Lester Holt. Of course, he did not speak nearly as much as the candidates, and he spent a fair amount of time just trying to keep things under control, but his word usage is still interesting. His top words, not surprisingly, were the candidates’ names: he said “Mr.” and “Trump” 27 times each, “Secretary” 22 times, and “Clinton” 21 times.

If you have any questions or comments, please let me know. And, if you’d like to interact with the visualization, feel free to check it out on my Tableau Public page:!/vizhome/First2016PresidentialDebate/WordCloud

Ken Flerlage, September 28, 2016

