
Category: Visualization

Overlaying Visuals In Power BI

Annie Xu gives us two methods for switching between two visuals in the same space:

Disconnected Table method:

This method is geared more towards Power BI modelers. Basically, the idea is to use a field in an independent table (no relationship to other tables) as a slicer holding your measure choices, and then create a measure that uses the SELECTEDVALUE function to dynamically switch which measure it refers to, based on the choice made in the slicer.

Click through for both methods.
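
To make the disconnected-table idea concrete, here is a rough sketch of the same logic in Python rather than DAX. The sample rows, the measure names, and the helper functions are all made up for illustration; the only thing borrowed from the post is the pattern of a standalone list of choices driving which measure gets evaluated.

```python
# Hypothetical sample data and measure names, for illustration only.
sales = [
    {"product": "A", "revenue": 120.0, "units": 3},
    {"product": "B", "revenue": 80.0, "units": 5},
]


def total_revenue(rows):
    return sum(r["revenue"] for r in rows)


def total_units(rows):
    return sum(r["units"] for r in rows)


# The "disconnected table": a set of choices with no relationship to the data.
MEASURE_CHOICES = {"Revenue": total_revenue, "Units": total_units}


def selected_value(selection, default=None):
    """Mimic the idea behind SELECTEDVALUE: return the single selected
    item, or the default when zero or several items are selected."""
    return selection[0] if len(selection) == 1 else default


def dynamic_measure(rows, slicer_selection):
    """Evaluate whichever measure the slicer currently points at."""
    choice = selected_value(slicer_selection, default="Revenue")
    return MEASURE_CHOICES[choice](rows)
```

Calling `dynamic_measure(sales, ["Units"])` evaluates the units measure, while an empty or multi-item selection falls back to the default, which is the same role the alternate result plays in SELECTEDVALUE.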


Building Flow Charts In R

Alan Haynes shows how to build flow charts in R using the grid and Gmisc packages:

Flow charts are an important part of a clinical trial report. Making them can be a pain though. One good way to do it seems to be with the grid and Gmisc packages in R. X and Y coordinates can be designated based on the center of the boxes in normalized device coordinates (proportions of the device space – 0.5 is the middle), which saves a lot of messing around with corners of boxes and arrows.

A very basic flow chart, based very roughly on the CONSORT version, can be generated as follows…

Click through for sample code and a resulting image.  H/T R-bloggers
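
As a small aside on why center-based normalized coordinates are convenient, here is a sketch in Python (not the R code from the post). The function names are my own, and I am assuming the origin sits at the bottom left of the device, with 0 to 1 spanning its full width and height.

```python
def box_corners(cx, cy, width, height):
    """Return (left, bottom, right, top) for a box centred at (cx, cy).

    All values are proportions of the device space, so 0.5 is the middle.
    """
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


def to_pixels(x, y, device_width, device_height):
    """Scale a normalized point to a device of the given pixel size."""
    return (x * device_width, y * device_height)


# A box in the middle of the device, half as wide and a quarter as tall.
corners = box_corners(0.5, 0.5, 0.5, 0.25)
```

You only ever state the center and size; the corners fall out of the arithmetic, and the same numbers work regardless of the output device's actual dimensions.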


Building Palettes From Pictures In R

Andrea Cirillo takes inspiration from the great works to build palettes:

If you see this painting you will find a profusion of colours with a great equilibrium between different hues, the bold usage of complementary colours and the ability expressed in the “chiaroscuro” technique. While I was looking at the painting, I started wondering how we moved from this wisdom to the ugly charts you can easily find within today’s corporate reports (find a great sample on the WTF visualization website).

This is where Paletter comes from: bringing the Renaissance wisdom and beauty into the plots we produce every day.

Introducing paletter

PaletteR is a lean R package which lets you draw an optimized palette of colours from any custom image. The package extracts a custom number of representative colours from the image. Let’s try to apply it to the “Vergine con il Bambino, angeli e Santi” before looking into its functional specification.

It’s an interesting package.  I’ll have to play around with it.
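
For a feel of what "extracting a custom number of representative colours" can look like, here is a bare-bones k-means sketch in Python. To be clear, this is a generic technique I am swapping in for illustration; the excerpt does not say how the package picks its colours, and the function names and the evenly spaced starting centroids are my own assumptions.

```python
def nearest(pixel, centroids):
    """Index of the centroid closest to the pixel (squared RGB distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(pixel, centroids[i])),
    )


def extract_palette(pixels, k, iterations=10):
    """Cluster RGB tuples into k groups and return the rounded group means."""
    if len(pixels) < k:
        raise ValueError("need at least k pixels")
    # Deterministic start: take k evenly spaced pixels as initial centroids.
    step = len(pixels) // k
    centroids = [pixels[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for px in pixels:
            clusters[nearest(px, centroids)].append(px)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [
            tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return [tuple(round(c) for c in centroid) for centroid in centroids]
```

Feed it a list of `(r, g, b)` tuples read from an image and a palette size, and you get back that many colours. A real implementation would likely work in a perceptual colour space and choose starting points more carefully.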


ggplot2 Coordinate Systems

Lea Waniek walks us through coordinate systems in ggplot2:

The coordinate system can be manipulated by adding one of ggplot’s different coordinate systems. When you are imagining a coordinate system, you are most likely thinking of a Cartesian one. The Cartesian coordinate system combines the x and y dimensions orthogonally and is ggplot’s default (coord_cartesian).

There are also several variations of the familiar Cartesian coordinate system in ggplot, namely coord_fixed, coord_flip and coord_trans. For all of them, the displayed section of the data can be specified by defining the maximal value depicted on the x (xlim =) and y (ylim =) axis. This allows you to “zoom in” or “zoom out” of a plot. It is a great advantage that all manipulations of the coordinate system only alter the depiction of the data but not the data itself.

I tend to avoid polar coordinates, but that’s mostly because I don’t work in a space which benefits from them.
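
The point about coordinate systems changing the depiction but not the data can be shown with a toy sketch in Python. This is a conceptual illustration only, not how ggplot2 is implemented; the function names are hypothetical, and a real plotting library clips the drawing rather than filtering a list.

```python
def visible(points, xlim=None, ylim=None):
    """Return the points inside the viewing window.

    The input list is left untouched; None means "no limit on this axis".
    """
    def inside(value, limits):
        return limits is None or limits[0] <= value <= limits[1]

    return [(x, y) for x, y in points if inside(x, xlim) and inside(y, ylim)]


def flipped(points):
    """Swap the axes for display, in the spirit of a flipped coordinate system."""
    return [(y, x) for x, y in points]


def mean_y(points):
    return sum(y for _, y in points) / len(points)


data = [(1, 10), (2, 20), (3, 30), (4, 40)]
zoomed = visible(data, xlim=(2, 3))
```

After zooming, `data` still holds all four points, so any summary computed from it (such as `mean_y(data)`) is the same as before; only what would be drawn has changed.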


Limitations Of Mapping In Power BI

David Stelfox points out a limitation in Power BI and tries to circumvent it with R to some limited effect:

This results in a row per ride and visualises pretty well in SSMS. If you are familiar with the geography of London you can make out the river Thames toward the centre of the image and Regents Park towards the top left:

This could be overlaid on a shape file of London or a map from another provider such as Google Maps or Mapbox.

However, when you try to load the dataset into Power BI, you find that Power BI does not natively support Geography data types. There is an idea you can vote on here to get them supported: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/12257955-support-sql-server-geometry-geography-data-types-i

Hit up that idea link if you want to see geography type support within Power BI.


Demos Using Amazon QuickSight

Karthik Kumar Odapally and Pranabesh Mandal have several example visuals that you can generate using Amazon QuickSight:

Typical Amazon QuickSight workflow

When you create an analysis, the typical workflow is as follows:

  1. Connect to a data source, and then create a new dataset or choose an existing dataset.

  2. (Optional) If you created a new dataset, prepare the data (for example, by changing field names or data types).

  3. Create a new analysis.

  4. Add a visual to the analysis by choosing the fields to visualize. Choose a specific visual type, or use AutoGraph and let Amazon QuickSight choose the most appropriate visual type, based on the number and data types of the fields that you select.

  5. (Optional) Modify the visual to meet your requirements (for example, by adding a filter or changing the visual type).

  6. (Optional) Add more visuals to the analysis.

  7. (Optional) Add scenes to the default story to provide a narrative about some aspect of the analysis data.

  8. (Optional) Publish the analysis as a dashboard to share insights with other users.

It’s interesting to see how Amazon is trying to move this functionality from third-party tools (Power BI, Tableau, etc.) and notebooks right into the set of AWS offerings.  Contrast this with the way that Microsoft is building in Jupyter with Azure Notebooks.


Creating Seaborn Plots With R

Abdul Majed Raja shows how to call Python from R and build plots using the Seaborn Python package:

The reticulate package provides a comprehensive set of tools for interoperability between Python and R. The package includes facilities for:

  • Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session.
  • Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays).
  • Flexible binding to different versions of Python including virtual environments and Conda environments.

Reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability.

The more common use of reticulate I’ve seen is running TensorFlow neural networks from R.


Creating Map Plots With ggmap

Laura Ellis shows how to use the ggmap package to create choropleth maps in R:

In the last map, it was a bit tricky to see the density of the incidents because all the graphed points were sitting on top of each other.  In this scenario, we are going to make the data all one color and we are going to set the alpha variable which will make the dots transparent.  This helps display the density of points plotted.

Also note, we can re-use the base map created in the first step “p” to plot the new map.

Check it out.  This is an introduction to creating choropleths, making it a good start.
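
To see why transparency reveals density, here is the arithmetic as a tiny Python sketch. I am assuming the usual "over" compositing of same-coloured points, where each extra point covers a fraction alpha of whatever background is still showing through; the function name is my own.

```python
def stacked_opacity(alpha, n):
    """Combined opacity of n same-coloured points drawn on top of each other.

    Each layer lets (1 - alpha) of what is beneath it show through, so n
    layers let (1 - alpha) ** n through in total.
    """
    return 1 - (1 - alpha) ** n


# Opacity for 1 through 5 overlapping points at an illustrative alpha of 0.5.
levels = [stacked_opacity(0.5, n) for n in range(1, 6)]
```

A single point stays faint while a pile of points approaches full opacity, so darker regions of the map correspond to more incidents.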


Faceting With R And SQL Server ML Services

Marlon Ribunal has a quick example showing how to build faceted plots with SQL Server ML Services and ggplot2:

In my previous post, I demonstrated how easy it is to create a bar graph in SQL Server 2017 In-Database Machine Learning using R.

We’re going to build upon that basic graph.

Sometimes data analysis requires us to look at an overview of our data across specific partitions, say a year. For example, we want to see how our product groups fare on a month-to-month basis across the last 4 years.

From a data analytics perspective, there are quite a handful of data points in this requirement – data aggregate (quantity), monthly periods, and year partitions.

One of the approaches to handling such a requirement is to use a facet. Faceting is a way of plotting subsets of data into a matrix of panels based on one or more variables – or facets.

Click through for the example and code.  Facets are quite useful, but they run the risk of misleading if you squeeze too many onto the screen.  The same line can look quite different with a “tall” facet versus a “wide” facet, and that can change how people interpret your visual.
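
Stripped of the plotting, faceting is just splitting the rows into one subset per value of the facet variable. Here is a minimal Python sketch of that step; the column names and sample rows are hypothetical, and the linked post does this with ggplot2 in R rather than by hand.

```python
from collections import defaultdict


def facet(rows, key):
    """Split rows into one list per distinct value of the facet variable."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[key]].append(row)
    return dict(panels)


# Hypothetical monthly quantities, to be faceted by year.
orders = [
    {"year": 2016, "month": 1, "quantity": 5},
    {"year": 2017, "month": 1, "quantity": 7},
    {"year": 2016, "month": 2, "quantity": 3},
]
panels = facet(orders, "year")
```

Each entry in `panels` would become its own small plot, all sharing the same layout so they can be compared at a glance.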


Building Forest Plots With ggplot2

Faisal Atakora shows how to build a forest plot using ggplot2:

To build a Forest Plot in R, the forestplot package is often used. However, I find ggplot2 to have more advantages in making Forest Plots, such as enabling the inclusion of several variables with many categories in a lattice form. You can also use any scale of your choice, such as a log scale. In this post, I will introduce how to plot Risk Ratios and their Confidence Intervals for several conditions.

Click through for the script.  You might also want to compare it to the forestplot package to see how these differ.
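
If you need to produce the numbers that go into a forest plot, here is a sketch of a risk ratio with a confidence interval in Python. It uses the standard log-scale (Wald) interval; the function and parameter names are mine, and I am not claiming this is how the linked post computes its values.

```python
import math


def risk_ratio_ci(exposed_events, exposed_total, control_events, control_total, z=1.96):
    """Return (risk ratio, lower bound, upper bound).

    The interval is built on the log scale, which is why forest plots of
    ratios are usually drawn on a log axis. z=1.96 gives roughly 95%.
    """
    risk_exposed = exposed_events / exposed_total
    risk_control = control_events / control_total
    rr = risk_exposed / risk_control
    # Standard error of log(RR) for two independent groups.
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / control_events - 1 / control_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

Note that the interval is symmetric around the estimate only on the log scale, and the formula breaks down if either group has zero events.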
