Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/30272
Title: Analysis of Long COVID Phenotypes and their Impact on Mental Health and Daily Functioning: Insights from Twitter
Authors: Markovikj, Marko; Dobreva, Jovana; Lucas, Mary; Vodenska, Irena; Chitkushev, Lou; Trajanov, Dimitar
Keywords: Long COVID, data mining, computer science, NLP
Issue Date: 2023
Publisher: Belgrade: Institute of Molecular Genetics and Genetic Engineering
Conference: 4th Belgrade Bioinformatics Conference
Abstract: We investigate Long COVID from the user perspective using Twitter data. The dataset comprises approximately 1.9 million tweets posted between March 2020 and March 2022. Before analysis, each tweet was preprocessed to obtain its raw text: mentions, emoticons, links, and hashtags were removed so that the analysis could concentrate on word phrases, and the remaining text was lemmatized. To reduce the number of distinct phenotypes under investigation and to simplify the presentation of results, the collected data were categorized into five overarching groups: Cardiovascular, Respiratory, Daily Living, Neurological and Mental Health, and Other. The analysis began with basic statistics and was then extended to identify characteristic periods for the phenotypes based on dynamic timelines. We also explored the relationships between the phenotypes, as well as the interdependence between phenotypes and geolocation. Among the words most commonly used by individuals describing their experiences during the Long COVID period, “Ampicillin” was tweeted 125,295 times, “Suffer” 125,113 times, “Death” 121,156 times, and “Vaccine” 108,968 times. We observe distinct patterns in the emergence of certain phenotypes during this period, particularly in relation to quality of life. On August 1, 2020, the term “quality of life” was mentioned in only 223 tweets, whereas one year later, during the same month, it was mentioned in 1,663 tweets. Our findings reveal that the occurrence of Long COVID phenotypes is influenced by both temporal and geographical factors. A clear trend emerges from the dataset: neurological symptoms, along with symptoms that impede individuals’ daily functioning, exhibit the highest prevalence, particularly during the latter half of the analyzed period. This period corresponds to a time when an increasing number of individuals had recovered from COVID-19 and were reporting their experiences with Long COVID. Notably, fatigue, depression, stress, and anxiety emerge as the most prevalent phenotypes. This investigation of the complex interactions between Long COVID phenotypes, mental health, and the manifestation of diverse symptoms offers insight into the profound consequences for individuals’ lives. The findings shed light on the significant burden posed by Long COVID and its cascading effects on individuals’ well-being and on society at large.
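The preprocessing and counting steps described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the regular expressions, the function names (`clean_tweet`, `word_counts`, `monthly_group_counts`), and the small `PHENOTYPE_GROUPS` lexicon with its phrase-to-group assignments are all assumptions made for the example. Lemmatization is omitted because it would require an external NLP library.

```python
import re
from collections import Counter

# Assumed cleaning patterns; the study's exact rules are not given in the abstract.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Rough coverage of common emoji and pictograph code points.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")

# Hypothetical lexicon: phrases and their group assignments are illustrative only.
PHENOTYPE_GROUPS = {
    "quality of life": "Daily Living",
    "anxiety": "Neurological and Mental Health",
    "depression": "Neurological and Mental Health",
    "chest pain": "Cardiovascular",
    "shortness of breath": "Respiratory",
}


def clean_tweet(text):
    """Strip links, mentions, hashtags, and emoji, then lowercase and normalize spaces."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.lower().split())


def word_counts(tweets):
    """Count word occurrences across all cleaned tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(TOKEN_RE.findall(clean_tweet(tweet)))
    return counts


def monthly_group_counts(dated_tweets):
    """Count tweets per (YYYY-MM, group); dated_tweets yields (ISO date, text) pairs."""
    counts = Counter()
    for date, text in dated_tweets:
        cleaned = clean_tweet(text)
        # A tweet counts once per group, even if it matches several phrases.
        groups = {
            group
            for phrase, group in PHENOTYPE_GROUPS.items()
            if re.search(r"\b" + re.escape(phrase) + r"\b", cleaned)
        }
        for group in groups:
            counts[(date[:7], group)] += 1
    return counts
```

Calling `monthly_group_counts` on date-stamped tweets yields a timeline per group, which is the kind of table from which a comparison such as the August 2020 versus August 2021 "quality of life" counts could be read.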
URI: http://hdl.handle.net/20.500.12188/30272
Appears in Collections: Faculty of Computer Science and Engineering: Journal Articles

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.