Graphing Swear Words In Movies

Jos Dirksen uses Spark and D3 to count and graph swear words in movies:

So how do we do this? Well, the first thing to do is get the number of swear words per minute. I mentioned that for the original article someone just counted every swear word; in our case, we’re just going to parse a subtitle file and extract the swear words from that.

Without going into too much detail, you can find the code I’ve experimented with in this gist (it’s very ugly code, since I just hacked something together that worked).

Jos includes counts for four movies.  This link does contain a few bad words, but if you get past that, it’s a good pattern for analyzing word counts in general.
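
As a rough illustration of the per-minute counting idea, here is a minimal sketch in plain Python rather than Spark. It is not Jos's code: it assumes the subtitles are in the common SRT layout (a timing line such as `00:01:15,000 --> 00:01:17,500` followed by dialogue), and the `TRACKED_WORDS` set and `counts_per_minute` function are hypothetical names with a placeholder word list.

```python
import re
from collections import Counter

# Hypothetical placeholder word list; substitute the terms you want to track.
TRACKED_WORDS = {"darn", "heck"}

# Matches the start of an SRT timing line, e.g. "00:01:15,000 --> 00:01:17,500".
TIMING = re.compile(r"(\d{2}):(\d{2}):(\d{2}),\d{3}\s*-->")


def counts_per_minute(srt_text, tracked=TRACKED_WORDS):
    """Return a Counter mapping minute offset -> number of tracked words."""
    counts = Counter()
    minute = None  # minute of the most recent timing line seen
    for line in srt_text.splitlines():
        match = TIMING.match(line.strip())
        if match:
            hours, minutes, _seconds = (int(g) for g in match.groups())
            minute = hours * 60 + minutes
            continue
        if minute is None:
            continue  # skip anything before the first timing line
        # Cue index lines contain only digits, so they yield no words here.
        for word in re.findall(r"[a-z']+", line.lower()):
            if word in tracked:
                counts[minute] += 1
    return counts


sample = """1
00:00:05,000 --> 00:00:07,000
Well, darn it.

2
00:01:15,000 --> 00:01:17,500
Heck, darn, what now?
"""

print(sorted(counts_per_minute(sample).items()))
```

The resulting minute-to-count pairs are the kind of series you could then hand to a charting library such as D3. For a large batch of subtitle files, the same per-line logic could in principle be distributed with Spark, though that is beyond this sketch.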


July 2016