Category: Visualization

WVPlots 1.0.0

Published 2018-05-29 by Kevin Feasel

Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. We are excited to announce that WVPlots is now at version 1.0.0 on CRAN!

The idea is: we sacrifice some of the flexibility and composability inherent to ggplot2 in R for a menu of prescribed presentation solutions. This is a package to produce plots while you are in the middle of another task.

I like this idea: I know the kind of plot I need and just want to throw something together for myself to give me an idea of the underlying data.

Power BI Color Palettes

Published 2018-05-29 by Kevin Feasel

Meagan Longoria helps us choose a color palette for Power BI reports:

A color palette is simply a collection of colors applied to the visual elements in your report. What we typically refer to as color is a combination of three main properties: hue (base color on the color wheel), intensity (brightness or gray-ness) and value (lightness or darkness). You can build an engaging and professional looking report with just 6 colors. It’s possible to have fewer colors or more colors, but 6 should cover many common visualization needs. If you are using more than 6 colors, you might want to check that you are optimizing engagement and cognitive load.

Main color – default color on graphs
Color 2 – used when multiple colors are needed in a graph or report
Color 3 – used when multiple colors are needed in a graph or report and Color 2 has already been used
Highlight color – a color used to highlight important data points to make them stand out from other points on the page
Border color – a light color used for borders on tables and KPIs where necessary
Title color – color used for visual titles and axis labels as appropriate
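
As a rough illustration of how hue, saturation, and value combine into the six roles listed above, here is a minimal Python sketch using the standard library's colorsys module. The recipe is an assumption made for illustration, not something from the linked post: the specific saturation and value numbers, the neighbouring-hue offsets, and the use of HSV saturation as a stand-in for what the excerpt calls intensity are all hypothetical choices.

```python
import colorsys


def hsv_to_hex(hue, saturation, value):
    """Convert HSV components (each in the range 0..1) to a '#rrggbb' string."""
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


def build_palette(base_hue):
    """Derive six role-named colors from one base hue (hypothetical recipe)."""
    complement = (base_hue + 0.5) % 1.0  # opposite side of the color wheel
    return {
        "main": hsv_to_hex(base_hue, 0.70, 0.70),
        # Neighbouring hues for the second and third series colors.
        "color_2": hsv_to_hex((base_hue + 1 / 12) % 1.0, 0.50, 0.80),
        "color_3": hsv_to_hex((base_hue - 1 / 12) % 1.0, 0.40, 0.60),
        # A saturated complement stands out against the main color.
        "highlight": hsv_to_hex(complement, 0.90, 0.95),
        # Low saturation and high value give a light border color.
        "border": hsv_to_hex(base_hue, 0.05, 0.90),
        # Low value gives a dark color for titles and axis labels.
        "title": hsv_to_hex(base_hue, 0.30, 0.30),
    }


palette = build_palette(0.58)  # 0.58 is an arbitrary blue-ish starting hue
for role, color in palette.items():
    print(f"{role:>9}: {color}")
```

The resulting hex codes could then be pasted into a report theme; whether they actually look good together still needs a human eye.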

There’s a lot of good advice in here.

Visualization Over Kafka And KSQL

Published 2018-05-24 by Kevin Feasel

Shant Hovsepian shows off a data visualization tool which can read Kafka Streams data:

KSQL is a game-changer not only for application developers but also for non-technical business users. How? The SQL interface opens up access to Kafka data to analytics platforms based on SQL. Business analysts who are accustomed to non-coding, drag-and-drop interfaces can now apply their analytical skills to Kafka. So instead of continually building new analytics outputs due to evolving business requirements, IT teams can hand a comprehensive analytics interface directly to the business analysts. Analysts get a self-service environment where they can independently build dashboards and applications.

Arcadia Data is a Confluent partner that is leading the charge for integrating visual analytics and BI technology directly with KSQL. We’ve been working to combine our existing analytics stack with KSQL to provide a platform that requires no complicated new skills for your analysts to visualize streaming data. Just as they will create semantic layers, build dashboards, and deploy analytical applications on batch data, they can now do the same on streaming data. Real-time analytics and visualizations for business users have largely been a misnomer until now. For example, some architectures enabled visualizations for end users by staging Kafka data into a separate data store, which added latency. KSQL removes that latency to let business users see the most recent data directly in Kafka and react immediately.

Click through for a couple repos and demos.

Creating Choropleths With ggcounty

Published 2018-05-21 by Kevin Feasel

Sebastian Sauer has a quick example of using ggcounty to plot data on a map of US counties:

This post shows how easy it can be to build a visually pleasing plot. We will use hrbrmstr’s ggcounty, which is an R package at this GitHub repo. The graphics engine is, as in most of my plots, Hadley Wickham’s ggplot. All built on R. Standing on shoulders…

Disclaimer: This example draws heavily on hrbrmstr’s example on this page. All credit is due to Rudy, and to those on whose work he built.

In just a few lines of code, you can have a pretty nice map.

Overlaying Visuals In Power BI

Published 2018-05-21 by Kevin Feasel

Annie Xu gives us two methods for being able to jump between two visuals in the same space:

Disconnected Table method:

This method is aimed more at Power BI modelers. Basically, the idea is to have a field in an independent table (no relationship to other tables) act as a slicer holding your measure choices, and then create a measure that uses the SELECTEDVALUE function to switch dynamically between the underlying measures based on the choice made in the slicer.
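
The switching logic behind the disconnected table method can be sketched outside of Power BI. The Python below is only an analogy, not DAX and not the author's code: the sales rows, the measure names, and the helper functions are all hypothetical. The one behaviour it borrows from SELECTEDVALUE is that a single distinct selection is returned as-is, while anything else falls back to a default.

```python
# Hypothetical fact rows; field names are illustrative only.
sales = [
    {"region": "East", "revenue": 120.0, "units": 10},
    {"region": "West", "revenue": 80.0, "units": 8},
    {"region": "East", "revenue": 50.0, "units": 2},
]

# The "disconnected table": measure names with no relationship to the fact rows.
measures = {
    "Revenue": lambda rows: sum(row["revenue"] for row in rows),
    "Units": lambda rows: sum(row["units"] for row in rows),
}


def selected_value(selection, default):
    """Return the selected item if exactly one distinct value is selected, else the default."""
    distinct = set(selection)
    return next(iter(distinct)) if len(distinct) == 1 else default


def dynamic_measure(rows, slicer_selection):
    """Evaluate whichever measure the slicer currently points at."""
    choice = selected_value(slicer_selection, default="Revenue")
    return measures[choice](rows)


print(dynamic_measure(sales, ["Units"]))             # one choice: the Units measure
print(dynamic_measure(sales, []))                    # nothing selected: the default
print(dynamic_measure(sales, ["Units", "Revenue"]))  # ambiguous: the default
```

In a real model the same fallback decision has to be made explicitly, since a slicer with nothing (or several things) selected gives no single value to switch on.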

Click through for both methods.

Building Flow Charts In R

Published 2018-05-10 by Kevin Feasel

Alan Haynes shows how to build flow charts in R using the grid and Gmisc packages:

Flow charts are an important part of a clinical trial report. Making them can be a pain though. One good way to do it seems to be with the grid and Gmisc packages in R. X and Y coordinates can be designated based on the center of the boxes in normalized device coordinates (proportions of the device space – 0.5 is the middle), which saves a lot of messing around with corners of boxes and arrows.

A very basic flow chart, based very roughly on the CONSORT version, can be generated as follows…

Click through for sample code and a resulting image. H/T R-bloggers
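
To make the normalized device coordinate idea from the excerpt concrete, here is a small Python sketch of the arithmetic involved. It is not the grid or Gmisc API: the function names, box sizes, and the 800 by 600 device are hypothetical, and it assumes the origin sits at the bottom-left of the device with y increasing upward.

```python
def npc_to_device(x_npc, y_npc, device_width, device_height):
    """Scale a point given as proportions of the device (0..1) to device units."""
    return x_npc * device_width, y_npc * device_height


def box_corners(box):
    """Return (left, bottom, right, top) in NPC for a box described by its center."""
    half_w = box["width"] / 2
    half_h = box["height"] / 2
    return (box["x"] - half_w, box["y"] - half_h, box["x"] + half_w, box["y"] + half_h)


def vertical_arrow(upper_box, lower_box):
    """Arrow from the bottom-center of the upper box to the top-center of the lower box."""
    _, upper_bottom, _, _ = box_corners(upper_box)
    _, _, _, lower_top = box_corners(lower_box)
    return (upper_box["x"], upper_bottom), (lower_box["x"], lower_top)


# Two boxes placed by their centers only; 0.5 is the horizontal middle of the device.
enrolled = {"x": 0.5, "y": 0.75, "width": 0.5, "height": 0.125}
randomised = {"x": 0.5, "y": 0.25, "width": 0.5, "height": 0.125}

start, end = vertical_arrow(enrolled, randomised)
print("arrow in NPC:", start, end)
print("arrow on an 800x600 device:",
      npc_to_device(*start, 800, 600), npc_to_device(*end, 800, 600))
```

Because everything is derived from the box centers, moving a box means changing one pair of numbers, and the corners and arrow endpoints follow.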

Building Palettes From Pictures In R

Published 2018-05-10 by Kevin Feasel

Andrea Cirillo takes inspiration from the great works to build palettes:

If you see this painting you will find a profusion of colours with a great equilibrium between different hues, the bold use of complementary colours and the ability expressed in the “chiaroscuro” technique. While I was looking at the painting I started wondering how we moved from this wisdom to the ugly charts you can easily find within today’s corporate reports (find a great sample on the WTF visualization website).

This is where paletter comes from: bringing Renaissance wisdom and beauty into the plots we produce every day.

Introducing paletter

paletter is a lean R package which lets you draw an optimized palette of colours from any custom image. The package extracts a custom number of representative colours from the image. Let’s try to apply it on the “Vergine con il Bambino, angeli e Santi” before looking into its functional specification.

It’s an interesting package. I’ll have to play around with it.

ggplot2 Coordinate Systems

Published 2018-05-08 by Kevin Feasel

Lea Waniek walks us through coordinate systems in ggplot2:

The coordinate system can be manipulated by adding one of ggplot’s different coordinate systems. When you are imagining a coordinate system, you are most likely thinking of a Cartesian one. The Cartesian coordinate system combines the x and y dimensions orthogonally and is ggplot’s default (coord_cartesian).

There are also several variations of the familiar Cartesian coordinate system in ggplot, namely coord_fixed, coord_flip and coord_trans. For all of them, the displayed section of the data can be specified by defining the range depicted on the x (xlim =) and y (ylim =) axes. This allows you to “zoom in” or “zoom out” of a plot. It is a great advantage that all manipulations of the coordinate system only alter the depiction of the data but not the data itself.
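
The last point in the excerpt, that coordinate manipulations change the depiction but not the data, can be shown with a small Python sketch. This is an analogy rather than ggplot2 code: the points and the zoom and flip helpers are hypothetical, and only the behaviour (a view window or an axis swap that leaves the underlying data untouched) is taken from the excerpt.

```python
from statistics import mean

# Hypothetical data with one outlier in y.
points = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0), (5, 50.0)]


def zoom(data, xlim, ylim):
    """Return the points inside the viewing window; the input list is not modified."""
    (x_min, x_max), (y_min, y_max) = xlim, ylim
    return [(x, y) for x, y in data if x_min <= x <= x_max and y_min <= y <= y_max]


def flip(data):
    """Swap the axes of every point, again leaving the input untouched."""
    return [(y, x) for x, y in data]


visible = zoom(points, xlim=(0, 6), ylim=(0, 10))

# Zooming hides the outlier from view but does not remove it from the data,
# so a summary computed on the data still includes it.
print("points drawn:", len(visible), "of", len(points))
print("mean of all y values:", mean(y for _, y in points))
print("mean if the data had been filtered instead:", mean(y for _, y in visible))
print("flipped:", flip(points)[:2])
```

The gap between the two means is the practical reason to care: filtering the data before plotting changes every statistic computed from it, while changing the view does not.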

I tend to avoid polar coordinates, but that’s mostly because I don’t work in a space which benefits from them.

Limitations Of Mapping In Power BI

Published 2018-05-08 by Kevin Feasel

David Stelfox points out a limitation in Power BI and tries to circumvent it with R to some limited effect:

This results in a row per ride and visualises pretty well in SSMS. If you are familiar with the geography of London you can make out the river Thames toward the centre of the image and Regents Park towards the top left:

This could be overlaid on a shape file of London or a map from another provider such as Google Maps or Mapbox.

However, when you try to load the dataset into Power BI, you find that Power BI does not natively support Geography data types. There is an idea you can vote on here to get them supported: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/12257955-support-sql-server-geometry-geography-data-types-i

Hit up that idea link if you want to see geography type support within Power BI.

Demos Using Amazon QuickSight

Published 2018-04-30 by Kevin Feasel

Karthik Kumar Odapally and Pranabesh Mandal have several example visuals that you can generate using Amazon QuickSight:

Typical Amazon QuickSight workflow

When you create an analysis, the typical workflow is as follows:

Connect to a data source, and then create a new dataset or choose an existing dataset.
(Optional) If you created a new dataset, prepare the data (for example, by changing field names or data types).
Create a new analysis.
Add a visual to the analysis by choosing the fields to visualize. Choose a specific visual type, or use AutoGraph and let Amazon QuickSight choose the most appropriate visual type, based on the number and data types of the fields that you select.
(Optional) Modify the visual to meet your requirements (for example, by adding a filter or changing the visual type).
(Optional) Add more visuals to the analysis.
(Optional) Add scenes to the default story to provide a narrative about some aspect of the analysis data.
(Optional) Publish the analysis as a dashboard to share insights with other users.

It’s interesting to see how Amazon is trying to move this functionality from third-party tools (Power BI, Tableau, etc.) and notebooks right into the set of AWS offerings. Contrast this with the way that Microsoft is building in Jupyter with Azure Notebooks.
