InvisibleBlueUnicorn on
I think Harris did talk about tariffs.
Spaghet-3 on
To be fair, Trump never says “Russia” once. It’s always “russia russia russia.” Does that *really* count as 3 times?
talrich on
Obamacare looks like a notable gap. I wonder how much of that gap is explained by Harris referring to the same policy by its formal name, as the Affordable Care Act or ACA?
4 Comments
* **Data source:** June 2024 debate – [CNN debate transcript](https://edition.cnn.com/2024/06/27/politics/read-biden-trump-debate-rush-transcript/index.html), September 2024 debate – [ABC debate transcript](https://abcnews.go.com/Politics/harris-trump-presidential-debate-transcript/story?id=113560542)
* **Data processing:** Python
* **Data visualization:** Excel
Shown are word counts by each candidate:
* Singular and plural forms of a noun were counted as one word, e.g. *job* = job + jobs, *tax* = tax + taxes, *woman* = woman + women, etc.
* Contractions were expanded for word-count purposes, e.g. *I’m* was counted as two words: ‘I’ and ‘am’
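For anyone curious how these two rules could look in Python, here is a minimal sketch. The lookup tables `CONTRACTIONS` and `PLURALS` and the function name `count_words` are assumptions for illustration only; the post does not show the actual code or the full mappings used for the chart.

```python
import re
from collections import Counter

# Hypothetical lookup tables: the real mappings are not given in the post.
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "we're": "we are"}
PLURALS = {"jobs": "job", "taxes": "tax", "women": "woman"}

def count_words(text):
    """Count words after expanding contractions and merging plurals."""
    # Lowercase and turn curly apostrophes into straight ones.
    cleaned = text.lower().replace("\u2019", "'")
    # A token is a run of letters, optionally followed by 'letters (e.g. i'm).
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", cleaned)
    counts = Counter()
    for token in tokens:
        # Expand a contraction into its parts, otherwise keep the token.
        for word in CONTRACTIONS.get(token, token).split():
            # Map a known plural to its singular form before counting.
            counts[PLURALS.get(word, word)] += 1
    return counts

sample = "I’m creating jobs. One job, two jobs; women and a woman pay taxes."
counts = count_words(sample)
print(counts["job"], counts["woman"], counts["i"], counts["am"])
```

A lookup table is used for plurals here because irregular forms such as *women* cannot be handled by simply stripping a trailing "s".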
Since both candidates theoretically had equal time to speak, I chose to show the raw word frequency counts rather than each word's percentage of the total words spoken by that candidate (or a rank). The idea is that during the event a specific word is spoken by a candidate, and heard by the audience, a certain number of times. In my opinion, this is no less meaningful than the percentage of total words.
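To make the difference between the two measures concrete, here is a small sketch of the alternative that was not used: converting raw counts into a percentage of one speaker's total words. The function name `share_of_total` and the toy numbers are made up for illustration and are not taken from the transcripts.

```python
from collections import Counter

def share_of_total(counts):
    """Return each word's count as a percentage of all words counted."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: 100 * n / total for word, n in counts.items()}

# Toy numbers, NOT real debate data.
toy_counts = Counter({"tax": 3, "job": 1})
print(share_of_total(toy_counts))
```

With equal speaking time the two measures tend to rank words similarly, but the percentage also depends on how many words a speaker fits into that time, while the raw count does not.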
Side note: because the transcripts may contain spelling mistakes and/or variant spellings of words, the counts may not be exact. For example, the word ‘healthcare’ sometimes appears in the transcript as two words, ‘health care’. I believe the margin of error should be roughly similar for both candidates, but I did not manually check every word, nor did I read the whole transcript.
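One way such spelling variants could be merged before counting is sketched below. This is an assumption about a possible cleanup step, not something the post says was done; the `VARIANTS` table and the function name `normalize_variants` are hypothetical.

```python
import re

# Hypothetical variant table: multi-word spelling -> single canonical word.
VARIANTS = {"health care": "healthcare"}

def normalize_variants(text):
    """Replace known multi-word variants with one canonical spelling."""
    for phrase, canonical in VARIANTS.items():
        # Allow any whitespace between the words of the phrase and
        # require word boundaries so longer words are left alone.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in phrase.split()) + r"\b"
        text = re.sub(pattern, canonical, text, flags=re.IGNORECASE)
    return text

print(normalize_variants("Health care costs and healthcare plans"))
```

Running this on the transcript text before tokenizing would make both spellings land on the same counter entry. Any variant not listed in the table would still be counted separately.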