Month: December 2017

Failover Groups In Azure SQL Database

Published 2017-12-27 by Kevin Feasel

Jim Donahoe shows off Failover Groups in Azure SQL Database. Part 1 involves setting up a Failover Group:

In my former company, we had 22 web applications that all had connections to various databases. We had all of our databases configured for Geo-Replication already, but if we had to fail over, we still had to update each connection string for the web apps along with others, which became a tedious process. In came Failover Groups to the rescue! With a Failover Group, I was able to create two endpoints that stayed the same no matter which server was primary/secondary. I liked to think of these as my Availability Group Listeners as they kinda serve the same functionality: Route traffic to a node depending on if it’s read-only or not. Best part? It’s configured through the Azure Portal SO EASILY! You can use PowerShell as well, but for this blog post, I will walk through the creation via the Portal. I will make a separate post or attach a script at some point for the PowerShell deployment.
Before we start the configuration portion of this though, let’s take a look at how Microsoft defines what a Failover Group is. I found this definition here: “Azure SQL Database auto-failover groups (in-preview) is a SQL Database feature designed to automatically manage geo-replication relationship, connectivity, and failover at scale.” Sounds pretty interesting, right? Let’s make one!

In Part 2, Jim shows how to connect to Azure SQL Database using the Failover Group listener:

Well, now that the easy stuff is out of the way, let’s talk about how you connect to these groups via SSMS. This is where some of the confusion happens. When I first configured a Failover Group, the first thing I tried to do was connect to the Primary server via SSMS thinking it will work just like an Always On Listener in traditional SQL Server…NEWP!

If you’re running a production database on Azure SQL Database, you might want to look at Failover Groups.
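As a rough illustration of the "two endpoints that stay the same" idea, here is a minimal Python sketch that builds connection strings against a failover group's listeners. It assumes the listener naming convention I recall from Microsoft's documentation (`<group>.database.windows.net` for read-write, `<group>.secondary.database.windows.net` for read-only) and ODBC-style keywords; neither appears in Jim's excerpts. The group name `my-fog` and database name `appdb` are hypothetical, and credentials are left out on purpose.

```python
from typing import Dict


def listener_hosts(failover_group: str) -> Dict[str, str]:
    """Return the assumed read-write and read-only listener host names."""
    return {
        # Assumption: the read-write listener follows the current primary.
        "read_write": f"{failover_group}.database.windows.net",
        # Assumption: the read-only listener routes to the secondary.
        "read_only": f"{failover_group}.secondary.database.windows.net",
    }


def connection_string(failover_group: str, database: str, read_only: bool = False) -> str:
    """Build an ODBC-style connection string (no credentials) for a listener."""
    hosts = listener_hosts(failover_group)
    host = hosts["read_only"] if read_only else hosts["read_write"]
    parts = [
        f"Server=tcp:{host},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Declare read intent so the connection is treated as read-only.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    print(connection_string("my-fog", "appdb"))
    print(connection_string("my-fog", "appdb", read_only=True))
```

The point of the sketch is that the application only ever stores the group name, so a failover should not require touching any connection string.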

Comments closed

The Argument For Single-Socket Servers

Published 2017-12-27 by Kevin Feasel

Joe Chang wants us to think about socket counts:

It might seem that the 2-socket system continues to be a good choice, as two processors with an intermediate number of cores is less expensive than one processor with twice as many cores. An example is the Xeon Gold 6132 14-core versus the Xeon Platinum 8180 28-core processors. In addition, the two-socket system has twice the memory capacity and nominally twice as much memory bandwidth.
So, end of argument, right? Well, no.

Click through for his argument in favor of single-socket machines for OLTP systems.


Power BI Usage Metrics

Published 2017-12-27 by Kevin Feasel

Gogula Aryalingam shows how to access Power BI usage metrics for a report or dashboard:

Each app workspace gets its own report usage metrics data set; it’s just that you don’t see it when you are in the portal. In order to access it (at least for now) you need to use Power BI Desktop. When you open Power BI Desktop, you need to sign in with the appropriate login, and then choose Power BI service from the Get Data menu item. You are then shown a set of app workspaces; within each you will find a list of all the datasets that were ever published to that workspace. Additionally, Power BI will also give you two more datasets: Report Usage Metrics Model and Dashboard Usage Metrics Model. However, these data models will only show up if you have attempted to view usage metrics at least once on one of the reports of the app workspace.

Read the whole thing.


When The Power BI Work Is Done

Published 2017-12-27 by Kevin Feasel

Melissa Coates has a great checklist to help you figure out if your Power BI dashboard is done:

Auto time intelligence is enabled by default, and it applies to each individual PBIX file (there’s not a global option). For most datetime columns that exist in the dataset, a hidden date table is created in the model to support time-oriented DAX calculations. This is great functionality for newer users, or if you have a very simple data model. However, if you typically utilize a standard Date table, then you will want to disable the hidden date tables to reduce the file size. (Tip: You can view the hidden date tables if you connect to the PBIX via DAX Studio.)

There are a lot of good things to think about here.


Using The Squint Test

Published 2017-12-27 by Kevin Feasel

Meagan Longoria gives us the squint test:

While you can definitely perform the Squint Test on your report within Power BI Desktop, I recommend also testing in a browser once the report is deployed to PowerBI.com or to the Power BI Report Server portal since colors and objects may be slightly different there.
The Squint Test is also used in web page design, so web developers have made tools to aid them in this check. While just squinting at the page is perfectly sufficient, using a browser extension or another tool allows you to easily share your findings with others. In the Chrome Browser, there is a free extension called The Squint Test. This extension places an eye icon near the top right of the browser window. Clicking the icon provides a slider that allows you to increase or decrease the amount of blur applied to the page.

Meagan also has an example of applying this test and picks a dashboard where she can make some improvements, so check it out.


The ggplot2 Books

Published 2017-12-26 by Kevin Feasel

Hadley Wickham has a couple of books which teach a lot about ggplot2. The first one I’d recommend is R For Data Science, which he co-wrote with Garrett Grolemund and which is available for free online:

To map an aesthetic to a variable, associate the name of the aesthetic to the name of the variable inside aes(). ggplot2 will automatically assign a unique level of the aesthetic (here a unique color) to each unique value of the variable, a process known as scaling. ggplot2 will also add a legend that explains which levels correspond to which values.
The colors reveal that many of the unusual points are two-seater cars. These cars don’t seem like hybrids, and are, in fact, sports cars! Sports cars have large engines like SUVs and pickup trucks, but small bodies like midsize and compact cars, which improves their gas mileage. In hindsight, these cars were unlikely to be hybrids since they have large engines.

Wickham also has the source to build his ggplot2 book online. If you don’t want to build the source, you also have the option of buying the book.
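To make the "scaling" step from the excerpt concrete, here is a toy Python sketch that assigns one color to each unique value of a variable and returns the legend alongside. This is not how ggplot2 implements scales; the function name `scale_discrete`, the palette, and the sample car classes are all made up for illustration.

```python
from typing import Dict, List, Tuple


def scale_discrete(values: List[str], palette: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Map each unique value to one palette entry; return colors and legend."""
    levels = sorted(set(values))  # one level per unique value, in a stable order
    if len(levels) > len(palette):
        raise ValueError("palette has fewer colors than there are unique values")
    legend = dict(zip(levels, palette))  # which level corresponds to which color
    colors = [legend[v] for v in values]  # the aesthetic value for each data point
    return colors, legend


if __name__ == "__main__":
    car_classes = ["compact", "suv", "2seater", "suv"]
    colors, legend = scale_discrete(car_classes, ["red", "green", "blue"])
    print(colors)
    print(legend)
```

The legend is just the mapping itself, which is why ggplot2 can generate one automatically whenever an aesthetic is mapped to a variable.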


A Layered Grammar Of Graphics

Published 2017-12-26 by Kevin Feasel

Hadley Wickham describes some of the decisions he made when putting together ggplot2:

In the examples above, we have seen some of the components that make up a plot:
• data and aesthetic mappings,
• geometric objects,
• scales, and
• facet specification.
We have also touched on two other components:
• statistical transformations, and
• the coordinate system.
Together, the data, mappings, statistical transformation, and geometric object form a layer. A plot may have multiple layers, for example, when we overlay a scatterplot with a smoothed line.

This isn’t an article about how to use ggplot2; rather, it’s an article about implementation decisions, and I think it’s useful to see some of the logic behind them.
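The layer idea in the excerpt (data, mappings, a statistical transformation, and a geometric object) can be sketched as a small data structure. The Python below is only an illustration of that decomposition, not ggplot2's design; the class names, the `flat_mean` stat (a crude stand-in for a real smoother), and the sample rows are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]


def identity(rows: List[Row]) -> List[Row]:
    """Default stat: leave the mapped data unchanged."""
    return rows


def flat_mean(rows: List[Row]) -> List[Row]:
    """Toy stat: replace every y with the mean of y (stand-in for a smoother)."""
    mean = sum(r["y"] for r in rows) / len(rows)
    return [{**r, "y": mean} for r in rows]


@dataclass
class Layer:
    data: List[Row]
    mapping: Dict[str, str]  # aesthetic name -> column name
    geom: str                # name of the geometric object, e.g. "point"
    stat: Callable[[List[Row]], List[Row]] = identity

    def compute(self) -> List[Row]:
        # Apply the aesthetic mapping, then the statistical transformation.
        mapped = [{aes: row[col] for aes, col in self.mapping.items()} for row in self.data]
        return self.stat(mapped)


@dataclass
class Plot:
    layers: List[Layer] = field(default_factory=list)

    def render(self) -> List[Tuple[str, List[Row]]]:
        # A real system would draw; here we just return what each geom receives.
        return [(layer.geom, layer.compute()) for layer in self.layers]


if __name__ == "__main__":
    cars = [{"displ": 1.0, "hwy": 30.0}, {"displ": 2.0, "hwy": 20.0}]
    aes = {"x": "displ", "y": "hwy"}
    plot = Plot([Layer(cars, aes, "point"), Layer(cars, aes, "line", stat=flat_mean)])
    print(plot.render())
```

Two layers sharing the same data and mapping but differing in stat and geom is the "scatterplot with a smoothed line" case from the excerpt.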


The Grammar Of Graphics

Published 2017-12-26 by Kevin Feasel

Leland Wilkinson has written the book on how we should write systems which visualize data:

This book was written for statisticians, computer scientists, geographers, research and applied scientists, and others interested in visualizing data. It presents a unique foundation for producing almost every quantitative graphic found in scientific journals, newspapers, statistical packages, and data visualization systems. This foundation was designed for a distributed computing environment (Internet, Intranet, client-server), with special attention given to conserving computer code and system resources.

There’s no free copy of this book, and it’s a very expensive textbook. Most people will get more from derivative works, but if you’ve thought about putting together a graphics library, this is a must-read.


Data Visualization For Social Science

Published 2017-12-26 by Kevin Feasel

I’ve started reading Kieran Healy’s book, Data Visualization For Social Science. He has a free draft available online, and it automatically builds nightly so you’re seeing the latest version. From the preface:

This book is a hands-on introduction to the principles and practice of looking at and presenting data using R and ggplot. R is a powerful, widely used, and freely available programming language for data analysis. You may be interested in exploring ggplot after having used R before, or be entirely new to both R and ggplot and just want to graph your data. I do not assume you have any prior knowledge of R.
After installing the software we need, we begin with an overview of some basic principles of visualization. We focus not just on the aesthetic aspects of good plots, but on how their effectiveness is rooted in the way we perceive properties like length, absolute and relative size, orientation, shape, and color. We then learn how to produce and refine plots using ggplot2, a powerful, versatile, and widely-used visualization library for R (Wickham 2016a). The ggplot2 library implements a “grammar of graphics” (Wilkinson 2005). This approach gives us a coherent way to produce visualizations by expressing relationships between the attributes of data and their graphical representation.
Through a series of worked examples, you will learn how to build plots piece by piece, beginning with scatterplots and summaries of single variables, then moving on to more complex graphics. Topics covered include plotting continuous and categorical variables, layering information on graphics; faceting grouped data to produce effective “small multiple” plots; transforming data to easily produce visual summaries on the graph such as trend lines, linear fits, error ranges, and boxplots; creating maps, and also some alternatives to maps worth considering when presenting country- or state-level data. We will also cover cases where we are not working directly with a dataset, but rather with estimates from a statistical model. From there, we will explore the process of refining plots to accomplish common tasks such as highlighting key features of the data, labeling particular items of interest, annotating plots, and changing their overall appearance. Finally we will examine some strategies for presenting graphical results in different formats, and to different sorts of audiences.

I’m less than halfway through the book so far, but it is quite an approachable look at the ggplot2 library with a bit of discussion on what makes for quality graphics.


Merry Christmas—No Curation Today

Published 2017-12-25 by Kevin Feasel

Merry Christmas to all. If Christmas isn’t your bag, happy Monday.

Curation resumes tomorrow with a few long-read items, and then resumes for real on Wednesday.
