eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Previously Published Works

A longitudinal and geospatial analysis of COVID-19 tweets during the early outbreak period in the United States

Abstract

Introduction

Early reports of COVID-19 cases and deaths may not accurately convey community-level concern about the pandemic during early stages, particularly in the United States where testing capacity was initially limited. Social media interaction may elucidate public reaction and communication dynamics about COVID-19 in this critical period, during which communities may have formulated initial conceptions about the perceived severity of the pandemic.

Methods

Tweets were collected from the Twitter public API stream, filtered for keywords related to COVID-19. Using a pre-existing training set, a support vector machine (SVM) classifier was used to identify a larger set of geocoded tweets in which users self-reported COVID-19 symptoms, concerns, and experiences. We then assessed the longitudinal relationship between the identified tweets and the number of officially reported COVID-19 cases using linear and exponential regression at the U.S. county level. Changes in tweet activity, including geospatial clustering, were also assessed for the five most populous U.S. cities.
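The comparison of linear and exponential regression described above can be illustrated with a minimal sketch. This is not the authors' code: the function names (`ols`, `fit_linear`, `fit_exponential`, `lagged_pairs`), the use of R² as the comparison metric, the log-transform approach to the exponential fit, and the data are all assumptions made for illustration. The numbers are synthetic and do not come from the study.

```python
import math


def ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination on the original scale of ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fit_linear(tweets, cases):
    """R^2 of the model cases = a + b * tweets."""
    slope, intercept = ols(tweets, cases)
    preds = [intercept + slope * t for t in tweets]
    return r_squared(cases, preds)


def fit_exponential(tweets, cases):
    """R^2 of the model cases = A * exp(k * tweets).

    Fitted by least squares on log(cases), so every case count must be > 0.
    """
    k, log_a = ols(tweets, [math.log(c) for c in cases])
    preds = [math.exp(log_a + k * t) for t in tweets]
    return r_squared(cases, preds)


def lagged_pairs(tweets, cases, lag):
    """Pair tweet counts on day d with case counts on day d + lag."""
    if lag == 0:
        return list(tweets), list(cases)
    return list(tweets[:-lag]), list(cases[lag:])


# Synthetic daily counts for one hypothetical county (not study data).
tweets = [10, 20, 30, 40, 50, 60]
cases = [2 * math.exp(0.05 * t) for t in tweets]

for lag in (0, 1, 2):
    x, y = lagged_pairs(tweets, cases, lag)
    print(lag, fit_linear(x, y), fit_exponential(x, y))
```

Repeating the fit over several values of `lag` is one simple way to examine how the tweet-to-case relationship changes with the time offset between the two series.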

Results

From an initial dataset of 60 million tweets, we analyzed 459,937 tweets that contained COVID-19-related keywords and were geolocated to U.S. counties. The number of tweets increased throughout the study period, although the pattern varied between city centers and residential areas. Tweets identified as reporting COVID-19 symptoms or concerns appeared to be more predictive of active COVID-19 cases as the temporal distance between tweets and reported cases increased.

Conclusion

Results from this study suggest that social media communication dynamics during the early stages of a global pandemic may vary geographically among communities and that targeted pandemic communication is warranted. User engagement on COVID-19 topics may also be predictive of future confirmed case counts, though further studies are needed to validate these findings.

