- Nguyen, Thu T;
- Criss, Shaniece;
- Dwivedi, Pallavi;
- Huang, Dina;
- Keralis, Jessica;
- Hsu, Erica;
- Phan, Lynn;
- Nguyen, Leah H;
- Yardi, Isha;
- Glymour, M Maria;
- Allen, Amani M;
- Chae, David H;
- Gee, Gilbert C;
- Nguyen, Quynh C
Background: Anecdotal reports suggest a rise in anti-Asian racial attitudes and discrimination in response to COVID-19. Racism can have significant social, economic, and health impacts, but there has been little systematic investigation of increases in anti-Asian prejudice. Methods: We used Twitter's Streaming Application Programming Interface (API) to collect 3,377,295 U.S. race-related tweets from November 2019 to June 2020. Sentiment analysis was performed using a support vector machine (SVM), a supervised machine learning model. Accuracy for identifying negative sentiment, comparing the machine learning model with manually labeled tweets, was 91%. We investigated changes in racial sentiment before and after the emergence of COVID-19. Results: The proportion of negative tweets referencing Asians increased by 68.4% in relative terms (from 9.79% in November to 16.49% in March). In contrast, the proportion of negative tweets referencing other racial/ethnic minorities (Blacks and Latinx) remained relatively stable during this period, declining less than 1% for tweets referencing Blacks and increasing by 2% for tweets referencing Latinx. Common themes that emerged during the content analysis of a random subsample of 3,300 tweets included racism and blame (20%), anti-racism (20%), and daily life impact (27%). Conclusion: Social media data can provide timely information for investigating shifts in area-level racial sentiment.
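
The 68.4% figure in the Results is a relative change between two proportions, not a difference in percentage points. The short sketch below reproduces that arithmetic from the two values reported in the abstract. The function name `relative_change_pct` is hypothetical and chosen for illustration; it is not taken from the authors' analysis code.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    if before == 0:
        raise ValueError("relative change is undefined when the baseline is zero")
    return (after - before) / before * 100.0


# Proportions of negative tweets referencing Asians, as reported in the abstract.
november_pct = 9.79
march_pct = 16.49

# Absolute change, in percentage points.
absolute_change_points = march_pct - november_pct

# Relative change, as a percentage of the November baseline.
relative_change = relative_change_pct(november_pct, march_pct)

print(f"absolute change: {absolute_change_points:.2f} percentage points")
print(f"relative change: {relative_change:.1f}%")
```

The absolute change is 6.70 percentage points, which corresponds to the 68.4% relative increase over the November baseline.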