Political speech is the main way politicians communicate with their constituents. In the modern political sphere, politicians have several avenues for making their platform known and advocating for their agenda. From formal floor speeches to shorter, snappier social media posts (such as those on X, formerly known as Twitter), the information age has made political speech much more diverse. But does the content of what is being said differ from one speech type to another? Do politicians discuss different topics when they address their fellow politicians versus when they post to social media?
To get a broad understanding of the content of different speech types, I applied the Latent Dirichlet Allocation (LDA) model to the political nonprofit VoteSmart’s database of political speech from officials and candidates. The LDA model gives a general idea of the content of a set of speeches by separating them into a predefined number of topics (set by the user), where each topic is a probability distribution over words. By looking at the most probable words for each topic, it becomes much easier to see what topics are mentioned across a given set of speeches.
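To make the idea concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, one common way to fit the model, on a toy corpus of stemmed words. It is an illustration only: the article does not say which LDA implementation or hyperparameters were used, so the function names (`fit_lda`, `top_words`), the `alpha` and `beta` values, and the toy documents are all assumptions.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one word Counter per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # total words per topic
    assignments = []

    # Start from a random topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this word's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Weight each topic by how much the document uses it and
                # how much the topic uses this word, then resample.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


def top_words(topic_word, n=3):
    """List the n most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Toy corpus (hypothetical): each document is a list of word stems.
docs = [
    ["health", "care", "insur", "health", "cost"],
    ["energi", "gas", "price", "energi", "oil"],
    ["health", "insur", "cost", "care"],
    ["gas", "price", "oil", "energi"],
]
topics = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words(topics), start=1):
    print(f"Topic {k}: {words}")
```

The printed word lists are what would then be labeled by hand. On a corpus this small the split between topics can vary with the seed, so the output should be read as illustrative rather than stable.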
I analyzed a random sample of 10,000 political speeches from VoteSmart’s database. Grouping them by speech type, I used the LDA model to identify 6 topics for each group. I then visualized the most prevalent words for each topic. The words appear as abbreviated versions of themselves, or “stems”. Stemming allows the model to treat words with similar roots (such as senator, senators, senate, senatorial, etc.) as the same, making analysis more insightful. I manually labeled each topic based on its identifying words.
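The grouping and stemming steps can be sketched as follows. This is a simplified illustration, not the preprocessing actually used: `crude_stem` is a deliberately crude suffix stripper standing in for a real stemmer (such as a Porter stemmer), and the record layout, speech-type names, and sample texts are hypothetical.

```python
import re
from collections import defaultdict

# Suffixes to strip, checked in order (longer, more specific ones first).
SUFFIXES = ("orial", "ors", "ial", "ing", "or", "es", "ed", "e", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_and_stem(records):
    """Map each speech type to a list of documents, each a list of stems."""
    groups = defaultdict(list)
    for speech_type, text in records:
        tokens = re.findall(r"[a-z]+", text.lower())
        groups[speech_type].append([crude_stem(t) for t in tokens])
    return dict(groups)


# Hypothetical records: (speech type, raw text).
records = [
    ("Letter", "Senators urge the Senate to act"),
    ("Social media", "Proud to stand with our senator today"),
]
grouped = group_and_stem(records)

print([crude_stem(w) for w in ["senator", "senators", "senate", "senatorial"]])
print(grouped["Letter"])
```

All four "senat-" words collapse to the same stem, which is why the topic word lists show truncated forms. Each group in `grouped` would then be passed to the topic model separately.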
The first group analyzed was political letters. When written by politicians, letters are typically addressed to other members of government. They tend to be a demonstration of action in support of a specific policy: for example, a state’s governor writing to the Secretary of Energy to advocate for lower gas prices. While formally addressed to an official, political letters often provide politicians with a means to demonstrate their advocacy to their constituents. I labeled the 6 topics based on which words were most likely to appear:
Topic 1: Procedural Matters
Topic 2: General Legislation Advocacy
Topic 3: Foreign Military Policy
Topic 4: COVID-19 Policy
Topic 5: Energy Policy
Topic 6: Healthcare Policy
Note: Topic 2 seems to be picking up on general language common to letters – formally addressed, nonspecific advocacy.
Executive Orders
Executive orders are a very narrow form of political speech, making them interesting to investigate through a topic model. Written by either presidents or governors taking near-unilateral action, they are each generally limited in their scope and are directed towards specific issues.
The model picked up on executive orders pertaining to the following (manually labeled):
Topic 1: Education
Topic 2: Administrative language
Topic 3: Fiscal policy
Topic 4: Commerce/Manufacturing
Topic 5: Administrative language
Topic 6: Voting Rights
Note: Topics 2 and 5 seem to be picking up on the more formal language required of a legal document such as an executive order.
Social Media Posts
Social media in general, and X (the app formerly known as Twitter) in particular, has seen a meteoric rise in popularity as a form of political messaging over the last 5 years. Posts on platforms such as X, Instagram, and Facebook have become a readily accessible, easily digestible form of communication between politicians and their constituents. However, character limits and the fast-paced nature of scrolling on a timeline often result in shorter messages. The data reflect this; social media posts were the shortest “speeches” I analyzed, averaging only 48 words per post. I labeled the 6 topics that were most likely to appear:
Topic 1: General contemporary political commentary
Topic 2: COVID-19/Labor policy
Topic 3: Legislation/Donation advocacy
Topic 4: Healthcare
Topic 5: Donald Trump
Topic 6: Infrastructure
Speeches (Including Floor Speeches and Statements)
This is the most archetypal form of political speech. This category includes long-form, first-person speeches generally delivered to a live audience. Averaging around 934 words per speech in the sample, they are usually not constrained to specific language or addressed to a specific person. The model picked up on word probabilities pertaining to the following (manually labeled) topics:
Topic 1: Foreign Policy
Topic 2: Legislative Advocacy
Topic 3: Senate Floor Speeches
Topic 4: Healthcare spending
Topic 5: Healthcare
Topic 6: Welfare
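The average lengths quoted for social media posts and speeches are per-group means of word counts. A minimal sketch of that calculation is below. It assumes words are counted by splitting on whitespace, which the article does not specify, and the function name `average_word_counts` and the sample records are hypothetical.

```python
from collections import defaultdict


def average_word_counts(records):
    """Return the mean number of whitespace-separated words per speech type."""
    totals = defaultdict(lambda: [0, 0])  # speech type -> [word total, doc count]
    for speech_type, text in records:
        totals[speech_type][0] += len(text.split())
        totals[speech_type][1] += 1
    return {t: words / n for t, (words, n) in totals.items()}


# Hypothetical records: (speech type, raw text).
sample = [
    ("Social media", "Proud to vote yes today"),
    ("Social media", "Join us tonight"),
    ("Speech", "Mr. President, I rise today to speak on the bill before us"),
]
averages = average_word_counts(sample)
for speech_type, avg in averages.items():
    print(f"{speech_type}: {avg:.1f} words on average")
```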
I also analyzed op-eds – published articles written by politicians. They provide an opportunity for political speech distinct from most others. They are meant to be read within a publication, and are explicitly addressed to the public. They are often confined to specific political issues on which the author is qualified to comment. The model picked up on word probabilities pertaining to the following (manually labeled) topics:
Topic 1: Small Business Policy
Topic 2: Healthcare/Social spending
Topic 3: Government Spending
Topic 4: Economic Policy
Topic 5: Women’s Rights
Topic 6: Gun Control
The analysis of the topics confirmed conventional wisdom surrounding how different forums of political speech differ from each other. The form of speech that appears most distinct from the others is social media posts. Not only are they the shortest by a significant margin, they also seem to be much more focused on contemporary issues, with words like “today”, “POTUS”, and “candidate”. Social media posts are also the only speech type to reveal discussion of a particular political figure (Donald Trump) through topic analysis. This supports the idea that politicians use social media not only to advocate for a legislative agenda, but also to discuss contemporary politics and support or attack other candidates.
Analyzing the topics from the other forms of speech revealed discussion of a variety of specific legislative issues, though some were almost ubiquitous. Healthcare appeared as a labeled topic in every speech type except executive orders (sometimes in more than one topic within a given speech type), and economic policy language appeared in several speech types as well.
When reading or listening to politicians and their statements, it’s important to keep in mind to whom their words are addressed and the platform through which they’re delivered. Factors like these often inform what to expect – political speech is not the same across different contexts. The diversity of political speech across platforms presents a challenge to American voters: how to judge a candidate holistically when there is simply too much information to consider to do so realistically. While the volume and variety of candidate speeches is just one facet of this issue, it underscores the importance of how citizens get informed during election seasons. Organizations like VoteSmart not only make the kind of analysis in this article possible with their diligent research, but also help Americans make sense of the increasingly complex landscape of information before them. Having access to unbiased, reliable, and accessible information is critical to voters making informed decisions, and to keeping democracy healthy.
Written by Vote Smart Special Interest Groups and API Intern - Mark Verzhbinsky