Category: Visualization

Word Cloud Visual

Devin Knight shows off the word cloud custom visual in Power BI:

Key Takeaways

  • Great for parsing unstructured data

  • Utilize stop words to remove commonly used filler words like a, the, an, etc…

    • You can use the default stop words that are provided and add your own words that you would like to remove from the visual.
  • The size of each word in the visual tells you how frequently that word is used.

Cf. yesterday’s word cloud example.  I’m not sure how truly valuable word clouds are for visualization purposes, but at least they’re fun to peruse.

Word Clouds In Python

Allison Tharp shows how to generate a word cloud using Python:

Every week, someone on Reddit posts a “word cloud” on all of the NFL teams’ subreddits.  These word clouds show the most used words on that subreddit for the week (the larger the word, the more it was used).  These word plots are always really fascinating to me, so I wanted to try to make some for myself.  In this tutorial, we’ll be making the following word cloud from my board game stats Twitter feed, @BGGStats.

Looks like the implementation is fairly straightforward, so check it out.
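
For a feel of what sits underneath any word cloud, here is a minimal sketch of the counting step, using only the Python standard library.  It is not Allison's code: the tiny stop word list, the sample sentence, and the `word_frequencies` and `font_sizes` helpers are all made up for illustration, and the linear scaling from count to font size is just one plausible choice.

```python
import re
from collections import Counter

# A tiny, illustrative stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in", "is", "it"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase words in text, skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def font_sizes(freqs, min_size=10, max_size=48):
    """Scale each word's count linearly into a font size range."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = hi - lo or 1  # avoid dividing by zero when all counts match
    return {
        word: min_size + (count - lo) * (max_size - min_size) / span
        for word, count in freqs.items()
    }


# Made-up sample text, only here to exercise the helpers.
sample = "The game of the week is a board game, and the board is full."
freqs = word_frequencies(sample)
sizes = font_sizes(freqs)
```

The most frequent words end up at `max_size` and the rarest at `min_size`; a real word cloud library would then handle layout and rendering.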

NetworkD3

Vessy combines JavaScript and R to visualize networks:

The networkD3 package provides a function called igraph_to_networkD3 that takes an igraph object and converts it into a format that networkD3 uses to create a network representation. As I used an igraph object to store my network, including node and edge properties, I was hoping that I might only need to use this function to create a visualization of my network. However, this function does not work exactly like that (which is not that surprising, given the differences in how D3.js works and how an igraph object is defined). Instead, it extracts lists of nodes and edges from the igraph object, but not the information about all node and edge properties (the exception is a priori specified information about nodes’ membership groups/clusters, which can be derived from one or more network properties, e.g., node degree). Additionally, the igraph_to_networkD3 function does not plot the network itself, but only extracts parameters that are later used in the forceNetwork function that plots the network.

This is the kind of thing I want to see when working with network data.  It doesn’t necessarily scale, but given how well the human eye tracks relationships, this is very useful.
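
To make the "extract nodes and edges" step a bit more concrete, here is a rough sketch of that kind of conversion.  It is written in Python rather than R and is not networkD3's implementation; the `to_nodes_and_links` helper is hypothetical, and the field names (`name`, `group`, `source`, `target`, `value`) and the zero-based link indices are my assumption about the shape D3-style force layouts expect.

```python
def to_nodes_and_links(edges, groups=None):
    """Turn (source, target) name pairs into node and link lists.

    Links refer to nodes by zero-based position in the node list.
    `groups` optionally maps a node name to a group/cluster id.
    """
    groups = groups or {}
    index = {}   # node name -> position in `nodes`
    nodes = []
    for src, dst in edges:
        for name in (src, dst):
            if name not in index:
                index[name] = len(nodes)
                # Default every node to group 1 when no membership is given.
                nodes.append({"name": name, "group": groups.get(name, 1)})
    links = [
        {"source": index[s], "target": index[t], "value": 1}
        for s, t in edges
    ]
    return nodes, links


# Made-up three-node network for illustration.
nodes, links = to_nodes_and_links(
    [("a", "b"), ("b", "c")],
    groups={"a": 1, "b": 2, "c": 2},
)
```

Note that, as in the quoted description, any other node or edge properties are simply dropped here; only names, group membership, and connections survive the conversion.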

Analyzing The Simpsons

Todd Schneider has a fun analysis of the Simpsons:

Per Wikipedia:

While later seasons would focus on Homer, Bart was the lead character in most of the first three seasons

I’ve heard this argument before, that the show was originally about Bart before switching its focus to Homer, but the actual scripts only seem to partially support it.

Bart accounted for a significantly larger share of the show’s dialogue in season 1 than in any future season, but Homer’s share has always been higher than Bart’s. Dialogue share might not tell the whole story about a character’s prominence, but the fact is that Homer has always been the most talkative character on the show.

My reading is that it took a couple seasons for show writers to realize that Homer is the funniest character and that Bart’s character was too context-sensitive to be consistently funny.  It took quite a bit more time before merchandisers figured that out, to the extent that they ever did.
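
If you want to try the dialogue share idea on your own scripts, the calculation itself is simple.  The sketch below is not Todd's code: the `dialogue_share` helper is hypothetical, the numbers are invented purely to exercise it, and I am assuming share is measured by word count rather than by number of lines.

```python
from collections import defaultdict


def dialogue_share(rows):
    """Compute each character's share of dialogue per season.

    `rows` is an iterable of (season, character, word_count) tuples.
    Returns {season: {character: fraction_of_season_words}}.
    """
    totals = defaultdict(int)
    by_character = defaultdict(lambda: defaultdict(int))
    for season, character, words in rows:
        totals[season] += words
        by_character[season][character] += words
    return {
        season: {name: count / totals[season] for name, count in chars.items()}
        for season, chars in by_character.items()
        if totals[season]  # skip seasons with no recorded words
    }


# Invented word counts, NOT real figures from the show.
rows = [
    (1, "Homer", 300), (1, "Bart", 250), (1, "Other", 450),
    (2, "Homer", 350), (2, "Bart", 150), (2, "Other", 500),
]
shares = dialogue_share(rows)
```

Comparing `shares[season]["Homer"]` against `shares[season]["Bart"]` across seasons is the kind of check the quoted passage describes.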

Stream Graphs

Devin Knight continues his visualization series with the Stream Graph:

Key Takeaways

  • Works and looks similar to a Stacked Area Chart but with a wiggle feature that gives it a more fluid look and feel

  • Great for displaying data that changes over time

At first, I read this as “Steam Graph,” which made it sound like a steampunk visualization with unnecessary pipes and mechanical accouterments, but alas, it was not meant to be.  I do like the stream graph visual, though.

Table Heatmaps

Devin Knight continues his Power BI custom visuals series:

  • In the Table Heatmap the color of the boxes is determined by the value in your measure.

  • Only 1 category field can be used, which will dynamically generate the number of columns based on the number of distinct values your field has.

  • The number and types of colors can be changed using some of the settings we’ll discuss below.

I can see the table heatmap being a good visual for calendars.
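
The "color determined by the value" idea can be sketched in a few lines.  This is not how the Power BI visual computes its colors (I have not seen its source); it is a simple two-color linear interpolation, and the `lerp_color` and `heatmap_colors` helpers and the default white-to-red palette are made up for illustration.

```python
def lerp_color(low_rgb, high_rgb, t):
    """Blend two RGB colors; t=0 gives low_rgb, t=1 gives high_rgb."""
    t = max(0.0, min(1.0, t))  # clamp so out-of-range values stay valid
    return tuple(round(a + (b - a) * t) for a, b in zip(low_rgb, high_rgb))


def heatmap_colors(values, low_rgb=(255, 255, 255), high_rgb=(255, 0, 0)):
    """Map each value to a color based on its position between min and max."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    span = hi - lo or 1  # avoid dividing by zero when all values match
    return [lerp_color(low_rgb, high_rgb, (v - lo) / span) for v in values]


# Made-up measure values for three boxes.
colors = heatmap_colors([0, 5, 10])
```

The smallest value gets the low color, the largest gets the high color, and everything else lands in between; a visual with more than two colors would interpolate between several stops instead.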

Tornado Visual

Devin Knight looks at the Tornado chart:

  • The Tornado has a few limitations that you should be aware of before using it

    • If there’s a legend value it should only have 2 distinct values

    • Each distinct category value is a separate bar with left and right parts

    • Alternatively, you can have two measure values and compare them without a legend

I’m split on whether I like the tornado or not.  It is intuitive and information-dense, which are two major factors in its favor.  It is, however, difficult to read and compare.  This seems like a useful “big picture” chart, but you’d want to organize the data in a different way when you start drilling down.
