
What’s Almost Left Unsaid: An Analysis of Harvard Confessions

An analysis of confessions and sentiment on Harvard Confessions.


By Jenny Gu, Melissa Kwan, Sahana Srinivasan & Yijiang Zhao
06-02-2019

The Facebook page Harvard Confessions allows students to submit notes to crushes, rants about school, and generally controversial opinions under the veil of anonymity. Started in March of this year, it has amassed over 1,500 likes on Facebook. On the whole, what do Harvard students feel the need to confess to the Internet? HODP scraped the text, dates, and likes of the first 365 confessions, posted over the course of two months up through mid-April, to find out.

The most popular topics of conversation were friends, love lives, and Harvard, as seen below in a word cloud that compiles the most popular words used in all posts, excluding basic articles and common words. The word cloud also shows that multi-post threads focused on a single niche topic sometimes dominate: the Asian-American Association, for example, makes the list (seen between the ‘g’ and ‘i’ of ‘girl’).


A lot of the posts, be they shoutouts to crushes, compliments, or complaints, were about other people.
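
For readers curious how the word counts behind a word cloud like this can be produced, here is a minimal sketch. It is not the code we ran: the stop-word list, the tokenizing rule, and the sample posts are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the real list of excluded words is not given here.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "to", "of", "in", "is", "are",
    "i", "you", "my", "me", "it", "that", "so",
}

def top_words(posts, n=5):
    """Return the n most frequent words across all posts, skipping stop words."""
    counts = Counter()
    for post in posts:
        # Lowercase the text and treat runs of letters/apostrophes as words.
        tokens = re.findall(r"[a-z']+", post.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Made-up sample posts, not real confessions.
sample_posts = [
    "To the girl in my section: you are so kind",
    "Harvard is hard and my friends are tired",
    "My friends are the best part of Harvard",
]
print(top_words(sample_posts, 3))
```

A word cloud then simply scales each word by its count, which is why a long thread on one niche topic can push an otherwise rare word into the picture.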

We correlated the number of likes with the presence of specific words to see whether certain topics made posts more popular. All correlation coefficients were below 0.3, too low to be definitively meaningful: the coefficient was 0.29 between post length and number of likes, 0.215 between likes and explicit mentions of Harvard, and 0.24 between the presence of expletives and mentions of Harvard. Overall, 22 percent of posts mentioned Harvard explicitly.
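
As a rough illustration of this kind of check, the sketch below computes a correlation between a 0/1 "mentions a word" indicator and the number of likes. It assumes a Pearson coefficient and a simple case-insensitive substring match; neither is confirmed as the exact method used, and the `pearson` and `mentions` helpers and the sample posts are invented for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Assumes both lists vary; a constant list would cause a division by zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mentions(text, word):
    """Return 1 if the word appears in the text (case-insensitive), else 0."""
    return 1 if word.lower() in text.lower() else 0

# Made-up (text, likes) pairs for illustration; not the scraped data.
posts = [
    ("Harvard dining halls need better hours", 40),
    ("Shoutout to my roommate", 12),
    ("Why is Harvard so stressful", 35),
    ("I miss my dog", 8),
]
has_harvard = [mentions(text, "harvard") for text, _ in posts]
likes = [n for _, n in posts]
print(round(pearson(has_harvard, likes), 3))
```

With only four toy posts the coefficient comes out far higher than anything we saw in the real data; the point is the mechanics, not the number.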

We're Sad, But Not That Sad

Here are some examples of how posts were categorized:

[Images: example posts categorized as positive, negative, and neutral]

So how did we fare overall? We subtracted 0.5 from the results to shift the sentiment range from [0, 1] to [-0.5, 0.5], so that a neutral sentiment sits at 0. After aggregating the results, we found that the mean sentiment of the posts we analyzed was -0.129 and the variance was 0.026, meaning that the majority of posts fell in the neutral-to-moderate range. Note from the right skew that the number of extremely negative posts (below -0.3) far outweighed the number of extremely positive posts (above 0.3).


The distribution of post sentiment, with 0 representing completely neutral. Results are noticeably skewed right.
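
The recentering step is easy to sketch. The example below uses made-up raw scores in [0, 1] (not our data) and a hypothetical `recenter` helper; it shows that subtracting 0.5 moves the mean but leaves the variance untouched. The last line converts the reported variance of 0.026 into a standard deviation of roughly 0.16, which is one way to read "most posts sit close to neutral".

```python
import math
from statistics import mean, pvariance

def recenter(scores):
    """Shift sentiment scores from [0, 1] to [-0.5, 0.5] so that neutral is 0."""
    return [s - 0.5 for s in scores]

# Made-up raw sentiment scores in [0, 1], for illustration only.
raw_scores = [0.10, 0.35, 0.40, 0.45, 0.90]
centered = recenter(raw_scores)

print("mean:", round(mean(centered), 3))
print("variance:", round(pvariance(centered), 3))

# Standard deviation implied by the variance reported in the article.
print("reported std:", round(math.sqrt(0.026), 3))
```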

A Rise in Popularity

As could be expected, the responses to a post (the aggregate of views, likes, and comments) increased over time as the page amassed more likes. The break in data in mid-March corresponds to spring break. (The current, active page has 1,641 likes as of early June; for comparison, the well-established MIT Confessions Facebook page has about 33,000.)


A plot of likes, comments, and views of all posts from a given day against date of the posts, for the first two months of posts.
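
A plot like this starts from per-day totals. Here is a minimal sketch of that aggregation, assuming each post is a record with `date`, `likes`, `comments`, and `views` fields; the field names, the `daily_totals` helper, and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import date

METRICS = ("likes", "comments", "views")

def daily_totals(posts):
    """Sum likes, comments, and views over all posts made on the same day."""
    totals = defaultdict(lambda: dict.fromkeys(METRICS, 0))
    for post in posts:
        day = totals[post["date"]]
        for metric in METRICS:
            day[metric] += post[metric]
    # Return the days in chronological order; days with no posts are absent.
    return dict(sorted(totals.items()))

# Made-up records for illustration; not the scraped data.
sample = [
    {"date": date(2019, 3, 25), "likes": 10, "comments": 2, "views": 300},
    {"date": date(2019, 3, 10), "likes": 4, "comments": 1, "views": 120},
    {"date": date(2019, 3, 25), "likes": 6, "comments": 0, "views": 150},
]
for day, total in daily_totals(sample).items():
    print(day.isoformat(), total)
```

Because days without posts simply do not appear in the result, a gap such as spring break shows up as a break in the plotted line rather than as a run of zeros.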


Each point represents the number of posts of that sentiment on that day. Posts were not made every day.



Spring Break Blues

Based on the graphs above, we analyzed the mood of the posts over time. Perhaps unsurprisingly, the number of negative posts increased drastically right after classes resumed following spring break (which ran from March 16 to March 24). Note when reading the line graph that no posts were made over spring break. Admittedly, the overall number of posts also spiked after the hiatus, but the number of positive posts actually dropped right after break. The Confessions page appears to reflect the overall mood of the student body and the elevated stress of coming back to campus.

Detailed Sentiment Analysis

What confession topics are the most common? And which are the most popular? To find out, we manually categorized each post by topic and as negative, positive, or neutral. Compilation posts were treated as a single post, since they tend to share a common theme; this may skew the representation of positive posts, because “compliment compilations” are a frequently recurring type of post. Negative posts received a marginally greater average number of likes than positive posts did, and neutral posts were noticeably less popular. It seems we tend to show approval more actively for more polarized posts.


Plot of likes against our manual sentiment categorization.

The seven categories of confession topics we used were love lives, campus, general life, compliments, school, replies to other posts, and friends. Posts on love, dating, and hookups made up a plurality of the page, followed by campus and then general life.


Post breakdown by topic.

Below is a breakdown of the mood of posts within each category. Posts about campus life and Harvard were mostly negative and had a considerably higher proportion of negative posts than any other category, even love. General life posts were more often positive than posts on other topics. Posts about love lives were also overwhelmingly negative or neutral.


Sentiment breakdown by topic.
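
A breakdown like this comes down to counting labels within each topic and dividing by the topic's total. The sketch below assumes each manually labeled post is a `(topic, sentiment)` pair; the `sentiment_shares` helper and the sample labels are made up for illustration.

```python
from collections import Counter, defaultdict

SENTIMENTS = ("negative", "neutral", "positive")

def sentiment_shares(labeled_posts):
    """For each topic, return the fraction of posts with each sentiment label."""
    counts = defaultdict(Counter)
    for topic, sentiment in labeled_posts:
        counts[topic][sentiment] += 1
    shares = {}
    for topic, counter in counts.items():
        total = sum(counter.values())
        # A Counter returns 0 for labels that never occurred in this topic.
        shares[topic] = {s: counter[s] / total for s in SENTIMENTS}
    return shares

# Made-up labels for illustration; not our manual categorization.
sample = [
    ("campus", "negative"),
    ("campus", "negative"),
    ("campus", "neutral"),
    ("campus", "positive"),
    ("general life", "positive"),
    ("general life", "neutral"),
]
for topic, share in sentiment_shares(sample).items():
    print(topic, share)
```

Comparing proportions rather than raw counts matters here, since the topics differ a lot in how many posts they contain.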

We also broke down the average number of likes within each category of post. Campus- and Harvard-related posts received on average the most likes, perhaps because they are the most universally relatable, especially when compared to personal life confessions or individual shoutouts.


Average number of likes by topic.

So what do Harvard students need to confess anonymously on the Internet? Mostly lamentations on Harvard; love, hookups, and dating; and life in general. As someone just scrolling through the page might surmise, our confessions lean negative. It makes sense that it might be easier to tell a Google Form things that are hard to admit or complain about in person, and it makes sense that the page has been getting more popular as more people read, relate to, and want to submit their own confessions. And though the page is dominated by negative and neutral posts, the positive ones (mostly compliments or shout-outs to specific friends or crushes) do exist and still get likes.


Harvard Open Data Project