Day: February 16, 2022

Visualizing High-Density Regions with R

The rOpenSci team covers the history of the gghdr package:

This was how being a newcomer to rOpenSci OzUnconf 2019 felt. It was incredible to be a part of such a diverse, welcoming and inclusive environment. I thought it would be fun to blog about how it all began, and the twists and turns we experienced along the way as we developed the gghdr package. The package provides tools for plotting highest density regions with ggplot2 and was inspired by the package hdrcde developed by Rob J Hyndman. The highest density region approach of summarizing a distribution is useful for analyzing multimodal distributions and can be composed of numerous disjoint subsets. For example, the histogram of the highway mileage (hwy) data from the mpg dataset (a) shows that cars with 6 cylinders (cyl) are bimodally distributed, which is reflected in the highest density region (HDR) boxplot (c) but not in the standard boxplot (b). Hence, we see that HDRs are useful in displaying multimodality in the distribution.

Read on for a short history of an interesting package.
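To make the idea of a highest density region concrete, here is a minimal sketch in plain Python rather than R. It does not use gghdr or hdrcde and is not how those packages compute HDRs: the function names, the fixed bandwidth, the grid size, and the sample values are all assumptions made up for illustration. It estimates a density on a grid, keeps the densest grid points until the requested share of mass is covered, and merges neighbors into intervals.

```python
import math


def gaussian_kde(sample, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
        for x in grid
    ]


def highest_density_region(sample, coverage=0.5, bandwidth=1.0, points=400):
    """Return (low, high) intervals that together hold roughly `coverage` of the mass."""
    pad = 3 * bandwidth
    low, high = min(sample) - pad, max(sample) + pad
    step = (high - low) / (points - 1)
    grid = [low + i * step for i in range(points)]
    dens = gaussian_kde(sample, grid, bandwidth)

    # Take grid points from densest to sparsest until the target mass is reached.
    order = sorted(range(points), key=lambda i: dens[i], reverse=True)
    total = sum(dens)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += dens[i] / total
        if mass >= coverage:
            break

    # Merge runs of adjacent kept grid points into intervals.
    intervals, start = [], None
    for i in range(points):
        if i in keep and start is None:
            start = i
        if i not in keep and start is not None:
            intervals.append((grid[start], grid[i - 1]))
            start = None
    if start is not None:
        intervals.append((grid[start], grid[-1]))
    return intervals


# Made-up bimodal sample (not the mpg data): two clusters of values.
sample = [17, 18, 18, 19, 19, 19, 20, 20, 26, 27, 27, 28, 28, 28, 29, 29]
regions = highest_density_region(sample, coverage=0.5, bandwidth=0.8)
print(regions)
```

With a bimodal sample like this one, the region comes back as disjoint intervals, one per cluster, which is the property a standard boxplot cannot show.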

Filtered Indexes and Functions

Eitan Blumin looks at filtered indexes:

In fact, absolutely no functions of any kind can be used within the WHERE clause of a filtered index. Not even schema-bound user-defined scalar functions.

Unfortunately, as stated in the Microsoft Docs page about Filtered Indexes, the WHERE clause of a filtered index can only support simple comparison operators.

Well, it’s not entirely true, as you CAN actually use some functions, but on two conditions:

Read the whole thing. Eitan lays out one limitation of filtered indexes and provides a couple of potential workarounds.
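For a feel of what a simple-comparison filter predicate looks like in practice, here is a small sketch using SQLite's partial indexes from Python as a stand-in. This is an assumption-heavy analogy: SQLite is not SQL Server, its rules for what the index predicate may contain differ from SQL Server's, and the table, index, and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    -- Partial index: only rows that satisfy the simple comparison are indexed.
    CREATE INDEX ix_open_orders ON orders (customer_id) WHERE status = 'open';
""")
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(i % 50, "open" if i % 10 == 0 else "closed") for i in range(1000)],
)


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)


# The query repeats the index predicate, so the partial index is a candidate.
matching = plan("SELECT id FROM orders WHERE status = 'open' AND customer_id = 5")
# Without the predicate, the index cannot answer the query on its own.
non_matching = plan("SELECT id FROM orders WHERE customer_id = 5")
print(matching)
print(non_matching)
```

The general lesson carries over even though the engines differ: the query has to imply the index's predicate before the optimizer can consider the index.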

Verbalizing a Chart

Alex Velez reminds us of the spoken side of communication:

I’m confident that I could overcome some of these design challenges by effectively explaining the graph to someone else. Will it be a perfect data communication? No—but sometimes, we have to deal with less-than-ideal circumstances like time limitations, or not having control over our designs. Knowing how to verbalize a graph can be a practical solution when faced with these constraints.

I should caveat this by clarifying that my intention is not to say that we shouldn’t spend time on our visualizations. But too often, we focus only on the visual. We believe that a graph or a picture is worth a thousand words. Or maybe we assume that because we created the chart, we will automatically know how to talk through it. I am super guilty of this!

Read on for some tips on vocalizing a visual.

Unicode and Data Length

Kevin Wilkie lays out an argument:

If you truly need the UNICODE characters in your data, go ahead and use them! If not though, please make your DBA happy by not using them. Since UNICODE characters take up twice the amount of space as the ASCII versions do, your DBAs will recommend using the ASCII versions if you are not going to be using any UNICODE characters.

Read on for the justification. But I’m still NVARCHAR (Almost) Everywhere.
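The size argument is easy to see with a quick Python sketch. The assumptions here: UTF-16LE stands in for how NVARCHAR stores characters, Windows-1252 stands in for a single-byte VARCHAR code page, and the sample strings are made up. It only counts character bytes and ignores any per-row or per-column overhead.

```python
text = "Curated SQL"

# Assumption: UTF-16LE approximates NVARCHAR, cp1252 a single-byte VARCHAR collation.
nvarchar_bytes = len(text.encode("utf-16-le"))
varchar_bytes = len(text.encode("cp1252"))
print(nvarchar_bytes, varchar_bytes)

# The other side of the trade-off: some text has no single-byte representation.
japanese = "日本語"
try:
    japanese.encode("cp1252")
    fits_in_single_byte = True
except UnicodeEncodeError:
    fits_in_single_byte = False
print(fits_in_single_byte)
```

The doubling is real for plain ASCII text, and so is the failure when a character outside the code page turns up, which is why I lean toward NVARCHAR anyway.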

Fun with the SSMS Extended Events UI

Grant Fritchey airs a few grievances:

I like Extended Events and I regularly use the Session Properties window to create and explore sessions. I’m in the window all the time, noting its quirks & odd behaviors, even as it helps me get stuff done. However, I found a new one. Let me tell you about just a few of them.

Click through for some examples of UI oddities when working with session properties.

Power BI Implementation Planning Guidance

Melissa Coates has a plan:

I’m really excited to share with you that we’re working on a new set of Power BI guidance called “Power BI Implementation Planning.”

The first pieces of content are some of the most common Power BI usage scenarios. In terms of scope of content, we’re just getting started – there will be LOTS more content to come. We’ll be iteratively publishing additions to this set of content over the next several months.

There are a lot of smart people working on this project.

Replication in Azure DB for MySQL

Arun Sirpal explains how you can set up replication with Azure DB for MySQL:

No doubt there will be a need for you to split off your analytical queries from the main database for performance reasons.

If you have been following me in the past with Azure SQL DB, you would use failover group read endpoints. With MySQL, we need to build a read-only replica on another server. This uses MySQL’s native binlog replication feature, which is great to hear. This form of replication is asynchronous.

Read on to see how.
