The Simpsons by The Data


The iconic animated show “The Simpsons” is in its 28th season with almost 600 episodes. The first season started in 1989, so it has aired for almost three decades. As a data science student at Metis (NYC Cohort 10), I wanted my third project to be about something I am really interested in. I’ve always wanted to work for a media/TV company and have been a fan of The Simpsons since I was growing up. When I came across a dataset about The Simpsons on Kaggle, I had to dive in and start exploring before I had even decided what to do with the project. The data can be found here.

I created a D3.js / Dimple.js chart of all the episodes from 27 seasons of the show. I wanted to be able to identify the best and worst episodes at a glance. Each episode is a data point, with the air date (by year) on the x-axis and the IMDB rating on the y-axis. The size of each circle represents the number of US viewers. I gathered additional writer and director information from Wikipedia and included it in the tooltip.
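For anyone who wants to build something similar, here is a minimal sketch of how a bubble chart like this could be wired up with Dimple.js. It is an illustration, not the exact code behind the chart: the field names (`title`, `air_date`, `imdb_rating`, `us_viewers_in_millions`), the date format, and the chart size are all assumptions, and the tooltip customization is left out.

```javascript
// Parse a value to a number, treating null/undefined/empty strings as missing.
function toNumber(value) {
  if (value === null || value === undefined || String(value).trim() === "") {
    return NaN;
  }
  return Number(value);
}

// Keep only the episodes that have everything the chart needs.
// Field names are assumed for illustration, not taken from the actual dataset.
function toChartRows(episodes) {
  return episodes
    .map((ep) => ({
      title: ep.title,
      air_date: ep.air_date, // assumed to be a "YYYY-MM-DD" string
      imdb_rating: toNumber(ep.imdb_rating),
      us_viewers_in_millions: toNumber(ep.us_viewers_in_millions),
    }))
    .filter((row) =>
      Boolean(row.air_date) &&
      Number.isFinite(row.imdb_rating) &&
      Number.isFinite(row.us_viewers_in_millions));
}

// Draw the chart in the browser. `dimple` is the global object provided by the
// Dimple.js script; `selector` is a CSS selector for the container element.
function drawEpisodeChart(dimple, selector, episodes) {
  const svg = dimple.newSvg(selector, 900, 500); // size is an arbitrary choice
  const chart = new dimple.chart(svg, toChartRows(episodes));
  chart.addTimeAxis("x", "air_date", "%Y-%m-%d", "%Y"); // air date, shown by year
  chart.addMeasureAxis("y", "imdb_rating");             // IMDB rating
  chart.addMeasureAxis("z", "us_viewers_in_millions");  // circle size
  // One bubble per episode title; this assumes titles are unique, since rows
  // sharing the same series value would be aggregated together.
  chart.addSeries("title", dimple.plot.bubble);
  chart.draw();
  return chart;
}
```

In a page that loads D3 and Dimple.js, this could be called as `drawEpisodeChart(dimple, "#chart", episodes)` once the episode data has been loaded.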

What is happening?

The IMDB rating declined over the 27 seasons. We can also see the circles gradually getting smaller, which means viewership declined as well. As most Simpsons fans would agree, the first 8 seasons have great ratings. So if you are planning to watch The Simpsons again or for the first time, you know which ones to watch!
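One way to check this trend without the chart is to average the IMDB rating per season. Here is a minimal sketch, assuming each episode record has `season` and `imdb_rating` fields (hypothetical names, not confirmed from the dataset):

```javascript
// Average IMDB rating per season, sorted by season number.
// `season` and `imdb_rating` are assumed field names.
function meanRatingBySeason(episodes) {
  const totals = new Map(); // season -> { sum, count }
  for (const ep of episodes) {
    const rating = Number.parseFloat(ep.imdb_rating);
    if (!Number.isFinite(rating)) continue; // skip episodes without a rating
    const entry = totals.get(ep.season) || { sum: 0, count: 0 };
    entry.sum += rating;
    entry.count += 1;
    totals.set(ep.season, entry);
  }
  return [...totals.entries()]
    .sort(([a], [b]) => a - b)
    .map(([season, { sum, count }]) => ({ season, meanRating: sum / count }));
}
```

Printing or plotting the result of `meanRatingBySeason(episodes)` should make any season-over-season decline easy to spot.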

The purpose of this visualization was to practice D3 and simply to have fun. The visualization itself was not part of my project. The actual project was about applying Natural Language Processing to the script data. I ended up summarizing episodes with LDA topic modeling and doing a bunch of exploratory analysis, which turned out to be quite interesting. Of course, as a data science student who had just started exploring NLP techniques and had limited time for the project, I faced a lot of limitations.

Update June, 2017

I interviewed with Electronic Arts, and they wanted me to present a recent data science project as part of the final interview. I thought this Simpsons project was perfect since EA developed a Simpsons mobile game. I didn’t get the job, but it was really fun traveling to EA’s HQ in Redwood City, CA and sharing what I learned with data science practitioners in Silicon Valley. My PowerPoint slides for the interview can be found here.
