Remove Chart Clutter

Melissa Yu provides advice on improving your data visualization skills:

Common chart clutter items include:

  • 3-dimensional effects

  • Dark gridlines (use soft gray gridlines or eliminate gridlines when possible)

  • Overuse of bright, bold colors

  • Unnecessary use of all uppercase text (uppercase text is only necessary when calling attention to an element)

Basically, remove every visualization “feature” that Excel 97 gave you…

San Francisco Crime Analysis

Vimal Natarajan shows off some R charts using crime incident data:

By analyzing the plot above, we can arrive at the following insights:

  • The number of crimes steadily declines from midnight, is lowest during the early morning hours, and then increases to a peak around 6 PM. This is the same insight we arrived at in my previous analysis, but here we have categorized by police district and still see the same pattern.

  • As seen in the previous plot, Park and Richmond districts have the lowest number of crimes throughout the day.

  • As highlighted in red in the plot above, the maximum number of crimes happens in the Southern district around 6 PM.

I would prefer to see code here, but it does serve to give you an idea of what R can do.
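
For a rough idea of the aggregation behind a plot like this, here is a minimal sketch in Python rather than R. The records are made up for illustration and are not the San Francisco data, and the function names are hypothetical, not anything from Vimal's post.

```python
from collections import Counter

# Hypothetical incident records: (police district, hour of day 0-23).
# These values are invented for illustration only.
incidents = [
    ("SOUTHERN", 18), ("SOUTHERN", 18), ("SOUTHERN", 12),
    ("PARK", 18), ("RICHMOND", 5), ("SOUTHERN", 18),
    ("MISSION", 18), ("MISSION", 0), ("PARK", 5),
]


def count_by_district_hour(records):
    """Count incidents for each (district, hour) pair."""
    return Counter(records)


def busiest_cell(counts):
    """Return the ((district, hour), count) pair with the highest count."""
    return counts.most_common(1)[0]


def totals_by_district(counts):
    """Collapse the hourly counts into one total per district."""
    totals = Counter()
    for (district, _hour), n in counts.items():
        totals[district] += n
    return totals


counts = count_by_district_hour(incidents)
peak = busiest_cell(counts)
district_totals = totals_by_district(counts)
```

The `counts` mapping is what a heat map of district by hour would be drawn from, and `peak` is the cell you would highlight.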

Flood Visualization

David Smith points out an animated flood chart using R:

As more settlements in Texas and France are impacted by severe flooding, this is a good time to thank the hydrologists at the NOAA who forecast river level rises in advance and give residents in affected areas time to move to higher ground. Along with topographic, rainfall, and weather data, monitoring stations maintained by NOAA and the USGS along rivers provide critical real-time information about river levels. NOAA scientists access this data using the dataRetrieval package for R, which they then incorporate into flood prediction models and use to generate animations like this one of the flood of the Delaware in February this year.

Looks like I’ve got a new blog to follow…

R: Using Images As Labels

Jonathan Carroll shows how to use images as labels in R:

There are probably very few cases for which this is technically a good idea (trying to be a featured author on JunkCharts might very well be one of those reasons). Nonetheless, there are at least a couple of requests for this floating around on stackoverflow; here and here for example. I struggled to find any satisfactory solutions that were in current working order (though perhaps my Google-fu has failed me).

Jonathan is rather against this idea, and it does seem like the answer is a hack.  I suppose the real answer is “sometimes an image isn’t worth a thousand words.”

Retention Analytics

Patrick LeBlanc shows collegiate retention data using Power BI:

Partnering with Stetson University, I am happy to share the first of many Power BI Higher Education Analytics solutions. This solution shows student persistence, retention, and graduation patterns, leveraging BANNER as the data source. Year-over-year retention and graduation rates can be filtered to allow deeper examination of trends at the college and major level. Additional views, including retention and graduation rate tables by major and ethnicity, are included within the report solution. The entire solution with documentation can be downloaded here.

The following image shows the first view within the report: overall persistence, retention, and graduation rates by year of first time student cohort. This report allows users to quickly show institutional retention and graduation trends across time, with the option to filter the view to show only specific colleges and/or majors.

This also serves as a Power BI demo, in case you’re hurting for good examples.

Power BI Custom Visuals Course

Devin Knight is starting a free course on custom visuals in Power BI:

Welcome to an exciting new FREE class that I am launching today! Over the next year (that’s right, a year!) I will be releasing one module a week detailing how to work with all of the Power BI visuals available in the Custom Visuals Gallery. You might ask why I am doing this. Well, the Microsoft Power BI team and the Power BI Community, through the Custom Visuals Gallery, have expanded the data visualization capabilities of Power BI drastically but unfortunately have provided little and in some cases no direction on how to use these new features. These Custom Visuals are designed by Microsoft on occasion, but more often than not the Power BI Community has put in a lot of hard work to provide these great new features for everyone to use. My thought is that if the Power BI Community is willing to design and publish these without asking individuals for payment, then I would love to provide training on these features to you for free as well.

This sounds like a nice course.  Good on Devin for doing this.

Box And Whisker Plots

Slava Murygin shows how to create a box and whisker plot in SSMS using spatial data types:

If you have no idea what a Box-and-Whisker Plot is, please visit the following link: http://www.wellbeingatschool.org.nz/information-sheet/understanding-and-interpreting-box-plots

First, I will show how to do it using the AdventureWorks database in SQL Server 2014.

We will analyze the amounts of individual Sales Order lines within each month.

The first step is to create a data set to process. That data set will contain the month, the single line amount, and the order number of that record within the month.

This is really cool…but I wonder if it wouldn’t be better to do this in R, where it’d take a lot less code.  If you can’t reach out to R, though, this is a good way of visualizing results.
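
To show what a box and whisker plot actually has to compute per month, here is a minimal sketch in Python. It is not Slava's T-SQL approach: the monthly amounts are invented, `box_plot_stats` is a hypothetical helper, and the 1.5 × IQR whisker rule is one common convention that I am assuming here, not necessarily the one used in the post.

```python
import statistics


def box_plot_stats(values):
    """Return the numbers a box-and-whisker plot draws for one group.

    Assumes the common Tukey convention: whiskers extend to the most
    extreme points within 1.5 * IQR of the quartiles.
    """
    data = sorted(values)
    # Quartiles by linear interpolation over the observed data.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical order line amounts grouped by month (illustration only).
line_amounts_by_month = {
    "2014-01": [10, 20, 30, 40, 50],
    "2014-02": [5, 15, 25, 35, 200],
}

summary = {
    month: box_plot_stats(amounts)
    for month, amounts in line_amounts_by_month.items()
}
```

Each entry in `summary` holds the box edges, the median line, the whisker ends, and any points that would be drawn as outliers for that month.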

Ambari With Grafana

Sid Wagle shows Grafana, a dashboard builder for Ambari:

Grafana provides a powerful and customizable dashboard builder for visualizing time series data. Ambari installs Grafana v2.6 as a Master Component of AMS and adds a datasource for AMS to Grafana. The dashboard builder is supported through a Metadata API in AMS that allows easy discovery of metrics, applications and hosts which are the key components that formalize an API call to AMS. There has been significant work put into creating templated dashboards for Hadoop ecosystem services tailored towards analyzing issues and performance bottlenecks on the Hadoop cluster. The following is an image of the dashboard builder highlighting the metric name drop down with type ahead and auto complete along with options to apply aggregate functions as needed based on whether the metric is a GAUGE or a COUNTER.

This is the beginning of a good visualization system for Hadoop metrics.
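
The GAUGE versus COUNTER distinction matters because the two kinds of metric need different aggregates. Here is a minimal sketch in Python of that general idea; it is not the AMS API or Grafana's implementation, the `aggregate` function is hypothetical, and I am assuming the usual monitoring convention that a gauge is averaged while an ever-increasing counter is turned into a rate. Counter resets are ignored for simplicity.

```python
def aggregate(samples, metric_type):
    """Summarize a time series according to its metric type.

    samples: list of (timestamp_seconds, value) pairs sorted by time.
    metric_type: "GAUGE" (point-in-time reading) or "COUNTER"
                 (monotonically increasing total).
    """
    if not samples:
        raise ValueError("need at least one sample")
    if metric_type == "GAUGE":
        # A gauge is a reading such as memory in use; average the readings.
        return sum(value for _, value in samples) / len(samples)
    if metric_type == "COUNTER":
        # A counter only grows; the useful number is its rate of change.
        (t0, v0), (t1, v1) = samples[0], samples[-1]
        if t1 == t0:
            raise ValueError("need samples at two distinct times")
        return (v1 - v0) / (t1 - t0)
    raise ValueError(f"unknown metric type: {metric_type}")


# Hypothetical samples (illustration only).
heap_used_pct = [(0, 40.0), (60, 60.0)]    # gauge readings
requests_total = [(0, 100), (60, 400)]     # counter readings

avg_heap = aggregate(heap_used_pct, "GAUGE")
request_rate = aggregate(requests_total, "COUNTER")  # requests per second
```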

Dashboard Design

Melissa Yu explains how people look at dashboards:

Dashboards can be used to communicate a dense collection of information efficiently on a single canvas. Your audience has a limited amount of time to monitor key metrics to get a quick status and identify anything that needs attention. The attention span of the average human has gone from about 12 seconds in 2000 (when mobile phones became mainstream) to about 8 seconds today – a second less than a goldfish – according to a 2015 study.

Following data visualization design principles is key to making your dashboard easily consumable. A poorly designed dashboard can make your eyes jump all over the screen. While it won’t give you much insight, it may cause a headache. In the Western world, we read from top left to right, then zig-zag down left and scroll right again (in a Z-pattern). Understanding where the audience’s eyes will start and travel next allows you to guide them through your dashboard.

Check the link for more details.

Graphing With Microsoft R Open

David Smith points out a free e-book on creating effective graphs with Microsoft R Open:

The examples were done using Microsoft R Open, but since it’s 100% compatible with R the code works with any relatively recent R version.

Naomi and Joyce presented several examples from their e-book in a recent webinar (presented by Microsoft), and fielded lots of interesting questions from the audience. If you’d like to see the recorded webinar and also receive a copy of the slides and the e-book, follow the link below to register to receive the materials via email.

The book is free, and the code is available on GitHub. What more could you ask for?
