Scholarly Works (2 results)

Article
Peer Reviewed

Pride, Love, and Twitter Rants: Combining Machine Learning and Qualitative Techniques to Understand What Our Tweets Reveal about Race in the US

UC San Francisco Previously Published Works (2019)

Objective: Describe variation in the sentiment of tweets using race-related terms and identify themes characterizing the social climate related to race. Methods: We applied a stochastic gradient descent classifier to conduct sentiment analysis of 1,249,653 US tweets from 2015-2016 that used race-related terms. To evaluate accuracy, manual labels were compared against computer labels for a random subset of 6,600 tweets. We conducted qualitative content analysis on a random sample of 2,100 tweets. Results: Agreement between computer labels and manual labels was 74%. Tweets referencing Middle Eastern groups (12.5%) or Blacks (13.8%) had the lowest positive sentiment, compared with tweets referencing Asians (17.7%) and Hispanics (17.5%). Qualitative content analysis revealed that most tweets fell into three categories: negative sentiment (45%), positive sentiment such as pride in culture (25%), and navigating relationships (15%). Although all tweets used one or more race-related terms, negative-sentiment tweets that were not derogatory, or whose central topic was not race, were common. Conclusion: This study harnesses relatively untapped social media data to develop a novel area-level measure of social context (sentiment scores) and highlights some of the challenges in doing this work. New approaches to measuring the social environment may enhance research on social context and health.
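
The agreement figure reported in this abstract compares computer labels with manual labels for the same tweets. The sketch below shows one plausible way to compute simple percent agreement. The function name, the three-way label set, and the toy data are assumptions made for illustration; the abstract does not specify how agreement was calculated.

```python
def percent_agreement(manual_labels, computer_labels):
    """Percentage of items on which the two label lists match exactly."""
    if len(manual_labels) != len(computer_labels):
        raise ValueError("label lists must be the same length")
    if not manual_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(m == c for m, c in zip(manual_labels, computer_labels))
    return 100.0 * matches / len(manual_labels)


# Hypothetical toy data, not taken from the study.
manual = ["negative", "positive", "neutral", "negative"]
computer = ["negative", "positive", "negative", "negative"]

agreement = percent_agreement(manual, computer)
print(agreement)  # 75.0 (3 of 4 labels match)
```

Simple percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different calculation.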

Article
Peer Reviewed

Exploring U.S. Shifts in Anti-Asian Sentiment with the Emergence of COVID-19

UC San Francisco Previously Published Works (2020)

Background: Anecdotal reports suggest a rise in anti-Asian racial attitudes and discrimination in response to COVID-19. Racism can have significant social, economic, and health impacts, but there has been little systematic investigation of increases in anti-Asian prejudice. Methods: We utilized Twitter's Streaming Application Programming Interface (API) to collect 3,377,295 U.S. race-related tweets from November 2019 to June 2020. Sentiment analysis was performed using a support vector machine (SVM), a supervised machine learning model. Accuracy for identifying negative sentiment, comparing the machine learning model against manually labeled tweets, was 91%. We investigated changes in racial sentiment before and following the emergence of COVID-19. Results: The proportion of negative tweets referencing Asians increased by 68.4% in relative terms (from 9.79% in November to 16.49% in March). In contrast, the proportion of negative tweets referencing other racial/ethnic minorities (Blacks and Latinx) remained relatively stable during this time period, declining less than 1% for tweets referencing Blacks and increasing by 2% for tweets referencing Latinx. Common themes that emerged during the content analysis of a random subsample of 3,300 tweets included racism and blame (20%), anti-racism (20%), and daily life impact (27%). Conclusion: Social media data can be used to provide timely information to investigate shifts in area-level racial sentiment.
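
The 68.4% figure in this abstract is consistent with a relative change between the two reported proportions, not a difference in percentage points. The arithmetic can be checked with a short sketch; the function and variable names are hypothetical, and only the two proportions come from the abstract.

```python
def relative_change_percent(before, after):
    """Relative change from `before` to `after`, expressed as a percentage."""
    return 100.0 * (after - before) / before


# Proportions of negative tweets referencing Asians, as reported in the abstract.
november = 9.79   # percent
march = 16.49     # percent

relative = relative_change_percent(november, march)
point_difference = march - november

print(round(relative, 1))          # 68.4 (relative increase, percent)
print(round(point_difference, 2))  # 6.7 (absolute increase, percentage points)
```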
