This dashboard presents the first results of our analysis of Twitter data relating to COVID. We analyze mentions of COVID-related terms over time, in particular mentions of COVID symptoms. We note a strong correlation between the number of tweets mentioning symptoms and the number of victims in Île-de-France (Paris region).
We collected tweets specifically from users in Île-de-France. We first used the Streaming API to identify users in the Paris area, then collected the historical data from these users. This dashboard presents our analyses, based on 30,000 Twitter users and a total of about 33 million tweets dating from December 2019 onward. We exclude retweets from this analysis, which leaves 17 million tweets.
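As an illustration of the filtering and counting described above, here is a minimal sketch in plain Python. It is not the code from our notebook: the symptom keywords, the dictionary layout of a tweet, and the helper names `is_retweet` and `daily_symptom_counts` are all assumptions made for this example, and the sample tweets are invented.

```python
from collections import Counter
from datetime import date

# Hypothetical symptom keywords (French); the list used in the real analysis may differ.
SYMPTOM_TERMS = ("fièvre", "toux", "fatigue")


def is_retweet(tweet: dict) -> bool:
    """Assumed convention: a retweet has a 'retweeted_status' field or starts with 'RT @'."""
    return "retweeted_status" in tweet or tweet["text"].startswith("RT @")


def daily_symptom_counts(tweets):
    """Count, per day, the non-retweet tweets that mention at least one symptom term."""
    counts = Counter()
    for tweet in tweets:
        if is_retweet(tweet):
            continue  # retweets are excluded from the analysis
        text = tweet["text"].lower()
        if any(term in text for term in SYMPTOM_TERMS):
            counts[tweet["date"]] += 1
    return counts


# Invented sample data, for illustration only.
tweets = [
    {"date": date(2020, 3, 1), "text": "J'ai de la fièvre et de la toux"},
    {"date": date(2020, 3, 1), "text": "RT @someone: fièvre depuis hier"},
    {"date": date(2020, 3, 2), "text": "Beau temps à Paris"},
    {"date": date(2020, 3, 2), "text": "Grosse fatigue aujourd'hui"},
]
counts = daily_symptom_counts(tweets)
```

Simple substring matching like this is crude (it ignores negation and word boundaries), so it should be read as a starting point rather than a faithful description of the pipeline.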
The graphs are interactive, i.e., one can select the variables of interest in the legend.
You can find the notebook that produced these analyses on GitHub and also run it directly in MyBinder.
We used public data from Santé Publique France on emergency room visits and SOS Médecins data related to COVID. We use the number of emergency room visits for suspected COVID and the number of hospitalizations for suspected COVID. We present the raw (daily) data, as well as the data averaged over 3 days.
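The 3-day averaging can be sketched as follows. This example assumes a trailing window (each day is averaged with the two days before it); whether the dashboard uses a trailing or a centered window is not specified here, and the function name `rolling_mean` and the numbers are made up for illustration.

```python
def rolling_mean(values, window=3):
    """Trailing moving average.

    The first window-1 positions are averaged over the values available so far,
    so the output has the same length as the input.
    """
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Invented daily counts, for illustration only.
daily_visits = [10, 20, 30, 40, 50]
smoothed = rolling_mean(daily_visits)
```

A trailing window delays peaks slightly compared with a centered one, which matters when comparing the timing of two curves.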
We notice that the curve of tweets mentioning symptoms and the emergency curves look very similar, with the tweet curve preceding the emergency curves by about 11 days.
When we shift the tweet curve by 11 days, the two curves overlap.
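The shift itself is a simple operation on the dates. The sketch below assumes each curve is stored as a mapping from date to value; the helper names `shift_series` and `align` are hypothetical, and the data are synthetic, built so that the second series copies the first one 11 days later.

```python
from datetime import date, timedelta


def shift_series(series, days):
    """Return a copy of a {date: value} series with every date moved forward by `days`."""
    return {d + timedelta(days=days): v for d, v in series.items()}


def align(a, b):
    """Pair up the values of two {date: value} series on the dates they share."""
    common = sorted(set(a) & set(b))
    return [(d, a[d], b[d]) for d in common]


# Synthetic data: emergencies reproduce the tweet counts 11 days later, scaled by 10.
start = date(2020, 3, 1)
tweet_counts = {start + timedelta(days=i): v for i, v in enumerate([1, 4, 9, 4, 1])}
emergencies = {d + timedelta(days=11): 10 * v for d, v in tweet_counts.items()}

shifted = shift_series(tweet_counts, 11)
pairs = align(shifted, emergencies)  # (date, shifted tweet count, emergencies)
```

Because the two series have different scales, they are best plotted on separate axes or normalized before being compared visually.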
We also analyzed the evolution of the number of deaths due to COVID, based on data from OpenCOVID19-fr. We notice a similar trend between the number of tweets mentioning symptoms and the number of deaths in Île-de-France, but with a lag of around 20 days.
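One possible way to quantify such a lag, which is not necessarily how the figures quoted here were obtained, is to scan candidate lags and keep the one that maximizes the Pearson correlation between the two curves. The sketch below uses synthetic data and hypothetical helper names (`pearson`, `best_lag`); it assumes both series are daily lists of equal length.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def best_lag(leading, lagging, max_lag):
    """Return (lag, r) maximizing the correlation of leading[t] with lagging[t + lag]."""
    best = (0, -2.0)
    for lag in range(max_lag + 1):
        x = leading[: len(leading) - lag]  # drop the tail that has no counterpart
        y = lagging[lag:]                  # drop the head that has no counterpart
        r = pearson(x, y)
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic triangular bumps: the second peaks 20 days after the first.
tweets = [max(0, 10 - abs(t - 15)) for t in range(60)]
deaths = [max(0, 10 - abs(t - 35)) for t in range(60)]
lag, r = best_lag(tweets, deaths, max_lag=30)
```

On real data the maximum is usually much flatter than in this toy example, so an estimated lag should be read as approximate.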