Skip to main content
Open Access Publications from the University of California

Identifying Health-Related Discussions of Cannabis Use on Twitter by Using a Medical Dictionary: Content Analysis of Tweets

Published Web Location


The cannabis product and regulatory landscape is changing in the United States. Against the backdrop of these changes, there have been increasing reports on health-related motives for cannabis use and adverse events from its use. The use of social media data in monitoring cannabis-related health conversations may be useful to state- and federal-level regulatory agencies as they grapple with identifying cannabis safety signals in a comprehensive and scalable fashion.


This study attempted to determine the extent to which a medical dictionary-the Unified Medical Language System Consumer Health Vocabulary-could identify cannabis-related motivations for use and health consequences of cannabis use based on Twitter posts in 2020.


Twitter posts containing cannabis-related terms were obtained from January 1 to August 31, 2020. Each post from the sample (N=353,353) was classified into at least 1 of 17 a priori categories of common health-related topics by using a rule-based classifier. Each category was defined by the terms in the medical dictionary. A subsample of posts (n=1092) was then manually annotated to help validate the rule-based classifier and determine if each post pertained to health-related motivations for cannabis use, perceived adverse health effects from its use, or neither.


The validation process indicated that the medical dictionary could identify health-related conversations in 31.2% (341/1092) of posts. Specifically, 20.4% (223/1092) of posts were accurately identified as posts related to a health-related motivation for cannabis use, while 10.8% (118/1092) of posts were accurately identified as posts related to a health-related consequence from cannabis use. The health-related conversations about cannabis use included those about issues with the respiratory system, stress to the immune system, and gastrointestinal issues, among others.


The mining of social media data may prove helpful in improving the surveillance of cannabis products and their adverse health effects. However, future research needs to develop and validate a dictionary and codebook that capture cannabis use-specific health conversations on Twitter.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View