How has sentiment towards Victoria’s lock-down changed over time?
It is quite clear to most in Melbourne that public opinion towards the lock-down has changed over its duration. We notice shifts both in support for lock-down policy itself and in our sentiment towards living under these restrictions. These two phenomena are not necessarily linked: one can become increasingly frustrated and upset over the realities of living under lock-down without changing one’s belief in the effectiveness of lock-down policies or in the strength of the moral reasoning underpinning them. Thus, we should distinguish between support for lock-down and sentiment towards life under lock-down.
Opinion polls on preferred premier may give us occasional insights into how attitudes have changed with respect to the first of these, support for lock-down policy. We can assume that changes in support for Dan Andrews are due to people agreeing or disagreeing with the lock-down policies of his government.
However, there is little in the way of quantifying how the feelings or attitudes of Melburnians towards life under lock-down have changed over its course, particularly on a day-to-day basis: for instance, whether optimism and positivity have given way to more pessimistic, angry attitudes.
In this article I look at one possible way of getting a sense of these changes. The r/melbourne subreddit contains daily discussion threads about COVID-19 in Melbourne. If harnessed correctly, these threads are a rich source of data on the feelings of Victorians towards the government’s lock-down policies. Using the NLTK Python package to perform some very rudimentary natural language processing, I attempt to do exactly that. I quantitatively evaluate the sentiment of the comments in each daily thread and then summarize the thread with a few different metrics (details in a second). The metrics are designed to capture the level of frustration in a way that can be compared across all the days for which a discussion thread exists.
I want to stress that the sentiment score for a given day’s discussion thread should not be taken to reflect sentiment or support towards Dan Andrews or the government. It is true that if a lot of people are making negative comments about the Andrews government, this would be reflected in the overall sentiment metrics. However, it could also be the case that, for example, people are making many negative comments about the opposition, or about life under lock-down policies without necessarily believing the policies to be misguided, so we can’t use this to gauge support for the government’s policies. Instead, I think that this captures the general moods and attitudes of Victorians (or at least of Victorian Reddit users) towards the lock-down, as conveyed through the tone they use when discussing it. If they are feeling positive about lock-down then, I believe, they are likely to sound more positive in their comments, whatever it is they are talking about. The opposite case holds too.
Before going into the results, I’ll describe how I scraped and analyzed the Reddit data, as well as the overall metrics I use to condense the series of sentiment scores for posts in a daily thread into a single value representing the sentiment on that day.
I created a Python script that goes through the daily discussion threads from the 27th of September to the 23rd of October and records all top-level comments (comments that are not replies to any other comment) and the associated karma (the net number of positive votes the comment has received from other users). I then use NLTK’s VADER sentiment analysis package to compute the compound ‘polarity’ of each comment. Polarity refers to how positive or negative the sentiment or ‘mood’ behind a comment is. It is a score between -1 and 1, with -1 representing the most negative comment possible and 1 representing the most positive comment possible.
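To make the collection and scoring step concrete, here is a minimal sketch of how it could look. This is not the original script: the names `ScoredComment`, `make_vader_scorer`, `fetch_top_level_comments` and `score_comments` are made up for illustration, and I am assuming PRAW is used for the scraping and that the VADER lexicon has already been downloaded with `nltk.download("vader_lexicon")`.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class ScoredComment(NamedTuple):
    body: str
    karma: int
    polarity: float  # VADER compound score, between -1 and 1


def make_vader_scorer() -> Callable[[str], float]:
    """Return a function mapping a piece of text to VADER's compound polarity."""
    # Imported lazily so the rest of this sketch works without NLTK installed.
    # Assumes nltk.download("vader_lexicon") has been run once beforehand.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]


def fetch_top_level_comments(reddit, thread_url: str) -> List[Tuple[str, int]]:
    """Return (body, karma) pairs for the top-level comments of one thread.

    `reddit` is assumed to be an authenticated praw.Reddit instance.
    """
    submission = reddit.submission(url=thread_url)
    # Expand every "load more comments" placeholder; slow on large threads.
    submission.comments.replace_more(limit=None)
    # Iterating the comment forest directly yields top-level comments only.
    return [(comment.body, comment.score) for comment in submission.comments]


def score_comments(
    comments: Iterable[Tuple[str, int]],
    scorer: Callable[[str], float],
) -> List[ScoredComment]:
    """Attach a polarity score to each (body, karma) pair."""
    return [ScoredComment(body, karma, scorer(body)) for body, karma in comments]
```

In use, something like `score_comments(fetch_top_level_comments(reddit, url), make_vader_scorer())` would be run once per daily thread. Passing the scorer in as an argument keeps the scoring step easy to check without a network connection.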
Based on these comments, their karma and their polarity, I created a few metrics to quantify the overall ‘positivity’ of the day.
- Mean polarity: as the name implies, this is simply the average polarity score over the comments taken from the relevant daily discussion thread.
- Proportion charged: of the comments with a polarity NOT equal to zero (i.e. just those comments that are positive or negative), the proportion that are positive.
- Proportion total: the same as the above, but with positive comments taken as a proportion of all comments, including those that are neutral.
- Beta: for each thread I compute a simple linear regression of karma on sentiment, i.e. I estimate the average change in a comment’s karma associated with an increase in that comment’s positivity. I divide this coefficient by 10, so that the value represents the average change in karma associated with polarity being increased by 0.1.
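The four metrics above can be sketched in a few lines of plain Python. Again, this is an illustration rather than the original script: the function name `daily_metrics` and the dictionary keys are my own, and the slope is computed with the textbook least-squares formula instead of a statistics library.

```python
from typing import Dict, Sequence


def daily_metrics(polarity: Sequence[float], karma: Sequence[float]) -> Dict[str, float]:
    """Summarize one daily thread from per-comment polarity and karma."""
    n = len(polarity)
    if n == 0 or n != len(karma):
        raise ValueError("polarity and karma must be non-empty and equally long")

    positive = sum(1 for p in polarity if p > 0)
    charged = sum(1 for p in polarity if p != 0)  # positive or negative

    mean_polarity = sum(polarity) / n
    mean_karma = sum(karma) / n

    # Least-squares slope of karma on polarity: cov(x, y) / var(x).
    sxx = sum((p - mean_polarity) ** 2 for p in polarity)
    sxy = sum((p - mean_polarity) * (k - mean_karma) for p, k in zip(polarity, karma))
    slope = sxy / sxx if sxx > 0 else float("nan")

    return {
        "mean_polarity": mean_polarity,
        "proportion_charged": positive / charged if charged else float("nan"),
        "proportion_total": positive / n,
        # Dividing by 10 gives the change in karma per 0.1 increase in polarity.
        "beta": slope / 10,
    }
```

Calling `daily_metrics` once per thread gives one row per date, which is the shape needed for plotting each metric over time.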
The beta value is important, as it may be a better measure of general sentiment than the others. It could be the case that the same users post on most discussion threads and that their attitudes are reasonably constant. However, the support for these comments from other users (as evidenced by their karma) may vary over time. This measure seeks to capture the sentiment of those users who do not comment and express their sentiment only via their upvotes and downvotes on other comments.
I quantify these values for each date in the range mentioned and graph them over time. I also graph the relation between polarity and karma for selected dates.
Here’s the Python script I used for those interested:
This rudimentary sentiment analysis reveals some interesting trends in how the feelings of Victorians have changed over time.
I’ll let the graphs speak for themselves and then go into the details a bit more.
Here’s a table of the results:
The correlation between the different metrics over time is also of some interest:
Here we see that all measures except beta are highly correlated with one another. This is to be expected, as mean polarity and the two proportion-of-positive-comments measures are all meant to capture the overall positivity of a particular day’s discussion thread.
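For anyone wanting to reproduce this kind of comparison, a pairwise correlation table can be built along the following lines. This is only a sketch: I am assuming Pearson correlation is the measure in question, and the names `pearson` and `correlation_table` are hypothetical helpers, not part of any library.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must be equally long with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a series is constant
    return sxy / sqrt(sxx * syy)


def correlation_table(
    series: Dict[str, Sequence[float]],
) -> Dict[Tuple[str, str], float]:
    """Correlation for every ordered pair of named daily series."""
    names = list(series)
    return {(a, b): pearson(series[a], series[b]) for a in names for b in names}
```

Here `series` would map each metric name to its list of daily values, in date order.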
We see that the mean polarity and proportion of positive comments measures tend to zigzag, almost as if one day’s frustration begets the next day’s optimism, which in turn begets the following day’s frustration, though this is almost certainly reading too much into noisy data. There seems to be a general trend of positive sentiment decreasing, reaching its lowest point on the 10th of October, and from there climbing up again (still zigzagging), as if spirits are rising with the ‘end’ in sight.
Beta does not correlate as strongly with the other measures (although still reasonably strongly with mean and proportion charged). This could suggest that there are discrepancies between the popularity of negative/positive comments and the frequency of these comments, illustrating a divergence between the opinions of the posters and the opinions of all Reddit users (in the r/melbourne subreddit), many of whom do not post.
The p-values of the beta coefficients (testing the null hypothesis β = 0 against the alternative β ≠ 0) vary. For the 18th of October, the 27th of September and the 8th of October (all of which are discussed in detail below), they are respectively 0.1, 0.3 and a value so close to 0 that my console displays it as 0.
Within the range for which I have data, there were important press conferences on the 27th and the 18th where plans for the future and easing of restrictions were announced.
Looking at the data for the 18th, it is a local minimum with respect to the mean and proportion measures. This (and again it could just be ‘noise’) indicates that perhaps people were initially disappointed with the modest easing of restrictions announced on this date relative to their expectations (hence the drop in sentiment from the previous days).
As for the 27th, I sadly do not have data prior to this date. From this date until the 10th of October, sentiment seemed to trend downwards, perhaps as people grew disillusioned with the impact of the eased restrictions on the quality of their lives under lock-down.
Interestingly, the beta scores on both these dates were quite high, suggesting that positivity is rewarded by the threads’ readers and that people prefer positive messages when changes to restrictions are announced. Though the relations between karma and sentiment may not look particularly strong on these dates (and indeed, they are not), the difference is most obvious when compared to a date where negative comments are more popular.
I hope this has been an enjoyable and informative read. I plan to extend this analysis at some point in the future. Prior to the daily discussion threads being posted, there were mega-threads that seemed to span several days. I could use PRAW to scrape comments from these threads along with their date of posting, and then group comments by date to do a similar analysis that extends further back to the start of the second lock-down. This would no doubt give a better picture of trends. However, it seems to take a long time to scrape comments from very large threads, so it may be a while before I return to this.
If you have any feedback please let me know.
Thanks for reading!