Outliers In Histograms

Edwin Thoen has an interesting solution to a classic problem with histograms:

Two strategies that make the above into something more interpretable are taking the logarithm of the variable, or omitting the outliers. Neither shows the original distribution, however. Another way to go is to create one bin for all the outlier values. This way we would see the original distribution where the density is the highest, while at the same time getting a feel for the number of outliers. A quick and dirty implementation of this would be:

hist_data %>%
  mutate(x_new = ifelse(x > 10, 10, x)) %>%
  ggplot(aes(x_new)) +
  geom_histogram(binwidth = .1, col = "black", fill = "cornflowerblue")

Edwin then shows a nicer solution, so read the whole thing.
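
To make the single-outlier-bin idea concrete, here is a minimal sketch in base R. It is not Edwin's code: the data are synthetic, and the cutoff of 10 and the bin width of 0.5 are assumptions picked only for illustration.

```r
# Synthetic, made-up data: mostly small values plus a handful of large outliers.
set.seed(42)
x <- c(rexp(1000, rate = 1), runif(20, min = 50, max = 500))

cap <- 10                  # hypothetical cutoff; pick one that suits your data
x_capped <- pmin(x, cap)   # every value above the cap is pulled down to the cap

# How many observations were folded into the outlier bin.
n_outliers <- sum(x > cap)

# Bins of width 0.5 from 0 to the cap; the last bin also holds the capped outliers.
h <- hist(x_capped, breaks = seq(0, cap, by = 0.5), plot = FALSE)
plot(h,
     main = "Outliers collected in the last bin",
     xlab = "x (values above 10 shown as 10)")
```

The shape of the bulk of the distribution stays visible, and the height of the final bar gives a rough sense of how many outliers there are. Labeling that last bin clearly matters, since it no longer represents a range of width 0.5.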

Time Brush Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Time Brush Power BI Custom Visual. The Time Brush gives you the ability to both filter your report and see a graphical representation of your data at the same time. The name Time Brush comes from the behavior used when you select the values you’d like to filter.

The use of color is an interesting take on combining continuous data points with categorical representations of those points.

Ignoring SSAS Dynamic Formatting

Chris Webb shows that tools like Power BI ignore formatting in SCOPE statements:

What’s more (and this is a bit strange) if you look at the DAX queries that are generated by Power BI to get data from the cube, they now request a new column to get the format string for the measure even though that format string isn’t used. Since it increases the amount of data returned by the query, this extra column can have a negative impact on query performance if you’re bringing back large amounts of data.

There is no way of avoiding this problem at the moment, unfortunately. If you need to display formatted values in Power BI you will have to create a calculated measure that returns the value of your original measure, set the format string property on that calculated measure appropriately, and use that calculated measure in your Power BI reports instead.

Click through for more details and a workaround.

Power BI Line Dot Charts

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Line Dot Chart Power BI Custom Visual.  The Line Dot Chart gives you the ability to make a more engaging line chart that can actually be animated across time.

This seems more like a fun chart than a useful chart, but I could see it being visually engaging for demonstrating relatively low-frequency events.

Basics Of R Plotting

Aman Tsegai shows some basic ways to customize R’s plot function:

We’re going to be using the cars dataset that is built in R. To follow along with real code, here’s an interactive R Notebook. Feel free to copy it and play around with the code as you read along.

So if we were to simply plot the dataset using just the data as the only parameter, it’d look like this:


The plot function is great for cases where you don’t much care how the visual looks, and the simplicity is great for throwaway visuals.
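
For reference, a minimal sketch of the bare call and one customized variant follows. The title, axis labels, symbol, and color are illustrative choices of mine rather than anything taken from Aman's post; only `plot` and the built-in `cars` dataset come from the excerpt.

```r
# `cars` ships with base R: 50 rows, columns `speed` (mph) and `dist` (stopping distance, ft).
str(cars)

# The bare call: a scatter plot with default labels and symbols.
plot(cars)

# The same data with a few common arguments set; the specific title,
# labels, symbol, and color below are illustrative choices.
plot(
  cars$speed, cars$dist,
  main = "Stopping distance by speed",
  xlab = "Speed (mph)",
  ylab = "Stopping distance (ft)",
  pch  = 19,           # solid circles
  col  = "steelblue"
)
```

Arguments such as `main`, `xlab`, `ylab`, `pch`, and `col` cover most quick adjustments before it becomes worth reaching for ggplot2.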

Network Navigator Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Network Navigator Power BI Custom Visual.  You may find the need to use the Network Navigator when you’re trying to find links between different attributes in a dataset. It does this by visualizing each attribute as a node and the strength of activity between those nodes can be represented in multiple ways.

Click through to get to Devin’s video.  This visual looks interesting for graphical analysis, like trying to tease out common connections or discovering dependencies.

ggedit 0.2.0

Jonathan Sidi announces ggedit 0.2.0:

ggedit is an R package that is used to facilitate ggplot formatting. With ggedit, R users of all experience levels can easily move from creating ggplots to refining aesthetic details, all while maintaining portability for further reproducible research and collaboration.
ggedit is run from an R console or as a reactive object in any Shiny application. The user inputs a ggplot object or a list of objects. The application populates Bootstrap modals with all of the elements found in each layer, scale, and theme of the ggplot objects. The user can then edit these elements and interact with the plot as changes occur. During editing, a comparison of the script is logged, which can be directly copied and shared. The application output is a nested list containing the edited layers, scales, and themes in both object and script form, so you can apply the edited objects independent of the original plot using regular ggplot2 grammar.

This makes modifying ggplot2 visuals a lot easier for people who aren’t familiar with the concept of aesthetics and layers—like, say, the marketing team or management.

OLAP Limitations In Tableau

Tim Cost points out areas of friction when trying to use Tableau to connect to a multi-dimensional Analysis Services cube:

I love Tableau, I do NOT however, love working with Tableau when it is connected to an OLAP cube (like Microsoft SQL Server Analysis Services).  I don’t enjoy working with cube data in Tableau because basically all the coolest parts of Tableau won’t work or won’t work in the ways you might expect.  I don’t see this as a failing of Tableau, I lay the blame on the OLAP cube.  The main issue with working against a cube in Tableau is that you talk to a cube with MDX, where we talk to almost every other data source with SQL.  MDX (or Mind Destroying Expressions as I think of them), are just a huge pain to work with.  As hard as it is for ME to write MDX, for Tableau it’s even harder. Here are some things that you should consider before committing to a Tableau project with Microsoft SQL Server Analysis Services as a data source

Click through for ten such considerations.

Attribute Slicer

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Attribute Slicer Power BI Custom Visual.  Using the Attribute Slicer you have the ability to filter your entire report while also being able to visibly see a measure value associated with each attribute.

Click through for the video as well as more details.  This looks like a very interesting way of integrating a slicer with some important metric, like maybe including dollar amounts per sales region and then filtering by specific regions to show more detailed analyses.

Grafana On Elasticsearch

Daniel Berman shows how to replace Kibana with Grafana:

While the two tools are very similar in terms of what can be done with the data itself, the main differences between Kibana and Grafana lie in configuring how the data is displayed. Grafana has richer display features and more options for playing around with how the data is represented in the graphs.

While it takes some time getting accustomed to building graphs in Grafana — especially if you’re coming from Kibana — the data displayed in Grafana dashboards can be read and analyzed more easily.

I prefer Grafana over Kibana for a few reasons, so I’m happy to see Grafana articles popping up.

