Open Access Publications from the University of California

Building a Psychological Ground Truth Dataset with Empathy and Theory-of-Mind During the COVID-19 Pandemic


As the mental health crisis deepens with the prolonged COVID-19 pandemic, there is an increasing need to understand individuals’ emotional experiences. We built a large-scale Korean text corpus with five self-labeled psychological ground truths: empathy, loneliness, stress, personality, and emotions. From October to December 2020, we collected 19,025 documents describing daily emotional experiences from 3,805 Korean residents. We also collected 42,128 sentences with different levels of theory-of-mind; each sentence was annotated by trained psychology students and reviewed by experts. Participants ranged in age from their early 20s to their late 80s and came from varied social and economic backgrounds. The pandemic affected the daily lives of most participants, who often reported negative emotional experiences. Using a topic model, we identified the most frequent topics: responses to confirmed cases, health concerns about family members, anger towards people not wearing masks, stress-relief strategies, lifestyle changes, and preventive practices. We then trained a Word2Vec model to examine the specific words associated with each topic from the topic model. The dataset will serve as benchmark data for large-scale computational methods that identify mental health levels from text. We expect this dataset to be used and transformed in many creative ways to mitigate COVID-19-related mental health problems.
