Press "Enter" to skip to content

Category: Visualization

Mapping Geospatial Data

The folks at Sharp Sight Labs have a great blog post on mapping geospatial data using R:

If you’ve learned the basics of data visualization in R (namely, ggplot2) and you’re interested in geospatial visualization, use this as a small, narrowly-defined exercize to practice some intermediate skills.

There are at least three things that you can learn and practice with this visualization:

  1. Learn about color: Part of what makes this visualization compelling are the colors. Notice that in the area surrounding the US, we’re not using pure black, but a dark grey. For the title, we’re not using white, but a medium grey. Also, notice that for the rivers, we’re not using “blue” but a very specific hexadecimal color. These are all deliberate choices. As an exercise, I highly recommend modifying the colors. Play around a bit and see how changing the colors changes the “feel” of the visualization.

  2. Learn to build visualizations in layers: I’ve emphasized this several times recently, but layering is an important principle of data visualization. Notice that we’re layering the river data over the USA country map. As an exercise, you could also layer in the state boundaries between the country map and the rivers. To do this, you can use map_data().

  3. Learn about ‘Spatial’ data: R has several classes for dealing with ‘geospatial’ data, such as ‘SpatialLines‘, ‘SpatialPoints‘, and others. Spatial data is a whole different animal, so you’ll have to learn its structure. This example will give you a little experience dealing with it.

I also like the iterative approach they discuss.  You’ll almost never get it right the first go-around, but one of the nice things about ggplot2 is that it’s designed to be iterative:  you layer your changes on, making it a bit easier to fiddle with them to get the visualization just right.

Comments closed

Using Sparklyr To Analyze Flight Data

Aki Ariga uses sparklyr on Apache Spark 2.0 to analyze flight data living in S3:

Using sparklyr enables you to analyze big data on Amazon S3 with R smoothly. You can build a Spark cluster easily with Cloudera Director. sparklyr makes Spark as a backend database of dplyr. You can create tidy data from huge messy data, plot complex maps from this big data the same way as small data, and build a predictive model from big data with MLlib. I believe sparklyr helps all R users perform exploratory data analysis faster and easier on large-scale data. Let’s try!

You can see the Rmarkdown of this analysis on RPubs. With RStudio, you can share Rmarkdown easily on RPubs.

Sparklyr is an exciting technology for distributed data analysis.

Comments closed

Superheat

David Smith shows off a very cool heatmap package called superheat:

While the superheat pacakge uses the ggplot2 package internally, it doesn’t itself follow the grammar of graphics paradigm: the function is more like a traditional base R graphics function with a couple of dozen options, and it creates a plot directly rather than returning a ggplot2 object that can be further customized. But as long as the options cover your heatmap needs (and that’s likely), you should find it a useful tool next time you need to represent data on a grid.

The superheat package apparently works with any R version after 3.1 (and I can confirm it works on the most recent, R 3.3.2). This arXiv paper provides some details and several case studies, and you can find more examples here. Check out the vignette for detailed usage instructions, and download it from its GitHub repository linked below.

Click through for some great-looking examples.

Comments closed

R Visuals In Power BI

Ryan Wade ties ggplot2 visuals into Power BI:

The package that we are going to use to develop our custom visualization is ggplot2. The ggplot2 package is arguably the most popular data visualization package in R. It is based on the “grammar of graphics” concept that was created by the statistician, Leland Wilkinson. The ggplot2 package allows you to approach creating charts and graphs in the same manner that Bob Ross approached painting trees in the forest. With ggplot2 you are able to start with a blank canvas and add layers upon layers via short code snippets that builds on each other until you end up with the desired visualization.

The pbix file that is being used in this blog can be found here: http://bit.ly/2jwoCyP. The GentleIntroToR_ChartExample.pbix file contains an example of using R to create a box plot chart that shows the distribution of player scores for the L.A. Lakers. Chiclet slicers were added that allows you to filter by division and/or opponent. The R visualization was created in four steps.

Check out the PBIX file.

Comments closed

Animating Visuals In R

Tomaz Kastrun shows how to create animated charts in R using ggplot2:

In addition to R code, the ImageMagic program needs to be installed on your machine, as well. Also the speed, quality and many other parameters can be set, when creating animated gif.

Animated gif can be also included into your SSRS report, your Sharepoint site or any other site – like my blog 🙂 and it will stay interactive. In Power BI, importing animated gif as a picture, unfortunately will not work.

Be very careful with this, as not everything supports animated GIFs and you can make some really painful graphs if you try hard enough…

Comments closed

Sankey Bar Charts

Devin Knight continues his custom visuals series:

In this module you will learn how to use the Sankey Bar Chart Power BI Custom Visual.  The Sankey Bar Chart is used to show a flow of data between different stages of a process.

It’s an interesting mix of sankey, bar chart, and funnel.  In other words, you may only have one thing you can use it for, but it’ll be a really good use.

Comments closed

Sorting By Column In Power BI

Reza Rad explains how to sort a column by another column’s value in Power BI visuals:

Problem happens when you want a Text field to be ordered based on something different than the value of the field. For example if you look at above chart you can see that months ordered from April to September. This is not order of months, this is alphabetical order. If you change the sorting of visual, it will only change it from A to Z, or Z to A. To make it in the order of month numbers you have to do it differently.

Read on for the solution.

Comments closed

7 Visualizations In R

Dikesh Jariwala provides sample R code for seven common visualizations:

In your day-to-day activities, you’ll come across the below listed 7 charts most of the time.

  1. Scatter Plot
  2. Histogram
  3. Bar & Stack Bar Chart
  4. Box Plot
  5. Area Chart
  6. Heat Map
  7. Correlogram

We’ll use ‘Big Mart data’ example as shown below to understand how to create visualizations in R. You can download the full dataset from here.

That’s a nice set of visuals, covering a broad swath of potential visualization scenarios.

Comments closed

Horizontal Funnel

Devin Knight shows off the horizontal funnel Power BI custom visual:

In this module you will learn how to use the Horizontal Funnel Power BI Custom Visual.  The Horizontal Funnel functions somewhat similar to the traditional funnel but it allows you to display a secondary measure and has a few more customizations than you would normally get. You’ll find that the Horizontal Funnel is great for displaying a flow of data.

One of the better non-sales uses of funnels I’ve seen is tracking completion rates on multi-page forms or multi-step processes.  If you see a huge drop-off at one step in the process, it might indicate a bug in the form or some incongruity with the end user’s expectation.

Comments closed