IEEE Transactions on Services Computing | 2021
How COVID-19 Information Spread in US? The Role of Twitter as Early Indicator of Epidemics
Abstract
This paper exploits Twitter to improve the understanding of COVID-19 diffusion and of people's reactions to it. To this end, the objectives are to identify the key terms and features used in tweets and to measure interest in COVID-19 topics. To address these goals, the paper proposes an approach that combines peak detection and clustering techniques. Space-time features are extracted from the tweets and modeled as time series. Peaks are then detected, and the textual features are clustered based on their co-occurrence in the tweets. Each resulting cluster is then associated with a topic. Experiments performed on a real-world dataset of tweets related to COVID-19 in the US show that the approach is able to detect several relevant topics of varying importance and character. A case study on the correlation of Twitter data with confirmed COVID-19 cases is also presented, evaluating the feasibility of exploiting Twitter to predict the diffusion of the outbreak. The results highlight a high correlation between tweets and real COVID-19 data, proving that Twitter can be considered a reliable indicator of epidemic spread and that data generated by user activity on social media is becoming an invaluable source for capturing and understanding epidemic outbreaks.
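The pipeline outlined in the abstract (time series, peak detection, co-occurrence clustering, correlation with confirmed cases) can be illustrated with a minimal Python sketch. The abstract does not specify the algorithms, so everything below is an assumption made for illustration: the mean-plus-`z`-standard-deviations peak rule, the connected-components grouping of co-occurring terms, the use of Pearson correlation, and all function names and toy data are hypothetical and are not the authors' method.

```python
from collections import Counter
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev


def detect_peaks(series, z=1.5):
    """Indices of local maxima above mean + z * std (an assumed rule)."""
    threshold = mean(series) + z * pstdev(series)
    return [
        i for i in range(1, len(series) - 1)
        if series[i] > threshold
        and series[i] >= series[i - 1]
        and series[i] >= series[i + 1]
    ]


def cooccurrence_clusters(tweets, min_count=2):
    """Group terms that co-occur in at least min_count tweets.

    Clusters are connected components of the thresholded co-occurrence
    graph, computed with a small union-find (an assumed stand-in for the
    paper's clustering technique).
    """
    pair_counts = Counter()
    for terms in tweets:
        pair_counts.update(combinations(sorted(set(terms)), 2))

    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for (a, b), count in pair_counts.items():
        if count >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data, invented for illustration only.
daily_tweet_counts = [5, 6, 5, 30, 7, 6, 5, 6, 28, 6]
tokenized_tweets = [
    ["mask", "lockdown"],
    ["mask", "lockdown"],
    ["vaccine", "trial"],
    ["vaccine", "trial"],
    ["mask", "trial"],
]

peaks = detect_peaks(daily_tweet_counts)
clusters = cooccurrence_clusters(tokenized_tweets)
```

In this sketch, each returned cluster would then be labeled with a topic by hand, and `pearson` would be applied to a tweet-count series and a confirmed-case series of the same length, possibly after shifting one of them by a few days to test whether tweets lead cases.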