Twitter Public Health Sentiment

Basic Overview

Status: Completed

Timeline: 1 month (April 2022)

Technology: R

Note: Source code for the Twitter data mining is not provided, out of an abundance of caution for my own developer account, but I am happy to talk through my process. I accessed Twitter's API via the rtweet package; nothing particularly complicated or technically worth bragging about.

Download Source Code

Why Twitter for Public Health?

80%+ of Americans Get News from Digital Devices

A large majority of Americans get news at least sometimes from digital devices, according to a Pew Research Center survey conducted Aug. 31-Sept. 7, 2020.

Social media has a disproportionate effect on the spread of false information

The Center for Countering Digital Hate (CCDH), a nonprofit whose work focuses on misinformation and hate disseminated online, conducted a study to examine the origins of the anti-vaccine sentiment that gained momentum on social networking platforms during the coronavirus pandemic. The results pinpointed a group of 12 individuals, collectively referred to as "the disinformation dozen" in the CCDH's report, who are at the forefront of false information campaigns targeting COVID-19 vaccines on Facebook, Instagram and Twitter.

Method

  • Identify a variety of news organizations representing various viewpoints in American politics
  • Find “high influence” followers of these news organizations on Twitter (the top 50 followers per org, ranked by their own follower count)
  • Draw 1000 tweets from these followers' timelines
  • Isolate tweets with public health terms
  • Perform sentiment analysis on these public health related tweets
  • Test for differences in sentiment by news org followed, using one-way ANOVA and Tukey grouping
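The filtering and scoring steps above can be sketched in base R. This is a minimal illustration only, not the project's code: the term list, the tiny word lexicon, and the example tweets are all hypothetical, and the scoring rule (sum of per-word values) is an assumption about how a simple lexicon-based sentiment score might work.

```r
# Hypothetical inputs: neither the term list nor the lexicon comes from the project.
health_terms <- c("vaccine", "covid", "mask", "cdc", "pandemic")
lexicon <- c(safe = 1, effective = 1, great = 1,
             dangerous = -1, scam = -1, afraid = -1)

# Made-up tweets standing in for the mined timelines.
tweets <- data.frame(
  org  = c("Org A", "Org A", "Org B", "Org B"),
  text = c("The vaccine is safe and effective",
           "Great game last night",
           "Masks are a scam",
           "COVID numbers are out today"),
  stringsAsFactors = FALSE
)

# Isolate tweets that mention at least one public health term.
# Note: this is substring matching, so "mask" also matches "masks".
pattern  <- paste(health_terms, collapse = "|")
relevant <- tweets[grepl(pattern, tweets$text, ignore.case = TRUE), ]

# Score a tweet as the sum of the lexicon values of its words;
# words missing from the lexicon contribute nothing.
score_tweet <- function(text) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  sum(lexicon[words], na.rm = TRUE)
}

relevant$sentiment <- vapply(relevant$text, score_tweet, numeric(1),
                             USE.NAMES = FALSE)
print(relevant)
```

Substring matching is quick to write but can pull in irrelevant tweets, so a real term list would need to be checked against a sample of the matches.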

Summary of Results

News Org                         Number of Tweets  Average Sentiment  Median Sentiment
ABC News                         1258              -0.11              0
BBC News (World)                 1224              -0.02              0
Bloomberg                        582               -0.17              0
CBS News                         1953              -0.2               0
CDC                              2660              -0.01              0
CNBC                             1234              0.15               0
CNN                              423               0.13               0
Forbes                           1460              0.05               0
Fox News                         1094              -0.22              0
NBC News                         733               0.22               1
Reuters                          1835              0.07               0
The Associated Press             2566              -0.12              0
The Economist                    1315              0.11               0
The Guardian                     2077              -0.06              0
The New York Times               1255              -0.03              0
The Wall Street Journal          923               0.16               0
The Washington Post              1075              -0.08              0
TIME                             415               -0.05              0
World Health Organization (WHO)  1450              0.11               0

Figure: Sentiment by News Org Followed (boxplot)
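A summary like this one (count, mean, and median per news org) can be produced with a few lines of base R. The sketch below uses made-up scores for two hypothetical orgs, and the column names are my own. It also shows how a group can have a median of 0 and a clearly nonzero mean when most tweets score 0 and a few score strongly.

```r
# Toy scored tweets for two hypothetical orgs; not the project's data.
scored <- data.frame(
  org       = c("Org A", "Org A", "Org A", "Org A", "Org B", "Org B", "Org B"),
  sentiment = c(0, 0, 0, -2, 1, 1, 0)
)

# Split the scores by org, then compute one summary row per org.
by_org <- split(scored$sentiment, scored$org)
summary_tbl <- data.frame(
  org              = names(by_org),
  n_tweets         = vapply(by_org, length, integer(1)),
  mean_sentiment   = round(vapply(by_org, mean, numeric(1)), 2),
  median_sentiment = vapply(by_org, median, numeric(1)),
  row.names        = NULL
)
print(summary_tbl)
```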

Tukey Summary Groups

The results above may be hard to interpret and lack confidence intervals, so to determine which differences are significant I performed a one-way ANOVA on sentiment across news orgs, followed by Tukey's honest significant difference (HSD) test at a 95% confidence level. Each news org was then assigned to one or more groups, each represented by a letter. If two news orgs share a letter, there is nothing in the data to suggest a significant difference between them. If they share no letters, there is evidence that the difference in sentiment is significant.
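As a rough illustration of that test, here is a minimal sketch using `aov()` and `TukeyHSD()` from base R's stats package. The data are synthetic (three hypothetical groups drawn from normal distributions), so the output says nothing about the real tweets. The step that turns pairwise results into group letters is not shown; a package such as multcompView can do that, but whether the project used it is an assumption on my part.

```r
set.seed(42)

# Synthetic sentiment scores for three hypothetical orgs; not the project's data.
toy <- data.frame(
  org       = factor(rep(c("Org A", "Org B", "Org C"), each = 200)),
  sentiment = c(rnorm(200, mean = -0.2),
                rnorm(200, mean = 0),
                rnorm(200, mean = 0.2))
)

# One-way ANOVA: does mean sentiment differ by org at all?
fit <- aov(sentiment ~ org, data = toy)
print(summary(fit))

# Tukey's HSD: which specific pairs of orgs differ, at a 95% confidence level?
tukey <- TukeyHSD(fit, "org", conf.level = 0.95)
print(tukey)

# Pairs with an adjusted p-value below 0.05 are treated as significantly different.
pairs       <- as.data.frame(tukey$org)
significant <- rownames(pairs)[pairs[["p adj"]] < 0.05]
print(significant)
```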

News Org                         Groups
ABC News                         abce
BBC News (World)                 abcdef
Bloomberg                        abce
CBS News                         a
CDC                              bcdef
CNBC                             df
CNN                              abcdef
Forbes                           bdef
Fox News                         ac
NBC News                         d
Reuters                          bdef
The Associated Press             ace
The Economist                    bdf
The Guardian                     abcef
The New York Times               abcdef
The Wall Street Journal          bdf
The Washington Post              abcdef
TIME                             abcdef
World Health Organization (WHO)  bdf
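To make the letter rule concrete, the small helper below checks whether two news orgs share a group letter. The letters are copied from four rows of the table above; the function name `shares_letter` is my own and purely illustrative.

```r
# Group letters copied from the table for four of the news orgs.
groups <- c("CBS News" = "a", "Fox News" = "ac",
            "NBC News" = "d", "CDC" = "bcdef")

# TRUE if the two orgs have at least one group letter in common.
shares_letter <- function(a, b) {
  letters_a <- strsplit(groups[[a]], "")[[1]]
  letters_b <- strsplit(groups[[b]], "")[[1]]
  length(intersect(letters_a, letters_b)) > 0
}

shares_letter("CBS News", "Fox News")  # TRUE: no evidence of a difference
shares_letter("CBS News", "NBC News")  # FALSE: evidence of a significant difference
shares_letter("NBC News", "CDC")       # TRUE: both carry the letter "d"
```

Read this way, NBC News (d) differs significantly from CBS News (a), but not from the CDC (bcdef).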

Conclusions

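There are some significant differences between groups in my findings, which suggests that the news sources a person turns to may affect their perception of various public health efforts. For example, CBS News followers (average sentiment -0.2, group a) share no group letter with NBC News followers (average sentiment 0.22, group d).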

However, this is a tiny cross-section of data: only ~38,000 tweets were determined to be relevant from a pool of ~750,000. A longer-term project with a larger pool of data might bring more clarity and confidence to the results. The Twitter API's rate limits severely restricted how many tweets I could mine in a short time.

I may also need to work from a pared-down list of public health terms and do more data exploration for relevance (especially outliers!).