Month: February 2018

Faceted ggplot2

I have another post in my ggplot2 series, this time covering facets:

Notice that we create a graph per continent by setting facets = ~continent.  The tilde there is important—it’s a one-sided formula.  You could also write c("continent") if that’s clearer to you.

I also set the number of columns, guaranteeing that we see no more than 3 columns of grids. I could alternatively set nrow, which would guarantee we see no more than a certain number of rows.

There are a couple other interesting features in facet_wrap. First, we can set scales = "free" if we want to draw each grid as if the others did not exist. By default, we use a scale of “fixed” to ensure that everything plots on the same scale. I prefer that for this exercise because it lets us more easily see those continental clusters.

Facets let you compare multiple graphs quickly.  They’re great for fast comparison, but as I show in the post, you can distort the way the data looks by lining it up horizontally or vertically.

Comments closed

Exploring MSSQL-CLI

Drew Furgiuele hates the pernicious effect of graphical user interfaces and wants to return to nobler times:

Okay, fine, who’s this really for?

Anyone, I suppose, but in reality? If you’re stuck on a remote system that only allows terminal logins, this is a super handy tool to use. If you’re used to text-based editors, like Emacs, Nano, or even vi or vim, you’ll feel right at home in here. You can type in commands and then run queries. There’s even multi-line support for more complex stuff. There’s another side to this too: if you’re the kind of DBA that gets really mad if someone installs management studio on a server, then this might be a solution: it’s got a very small installation footprint, and you don’t get a full UI experience so unless you know all the big T-SQL commands for heavy administration (like for, say, dealing with an availability group or adding permissions, or taking or restoring backups), you won’t get much use out of it. But in a pinch, on a server, in a crisis where you can’t tell if SQL is up and serving queries? It just might work out well for you.

Read the whole thing for an enlightening Q&A session, although the question-asker is kind of a jerk to the answerer.

When Nanoseconds Count

Joe Chang thinks about single-socket servers:

There is a mechanism by which we can significantly influence memory latency in a multi-processor (socket) server system, that being memory locality. But few applications actually make use of the NUMA APIs in this regard. Some hypervisors like VMware allow VMs to be created with cores and memory from the same node. What may not be appreciated, however, is that even local node memory on a multi-processor system has significantly higher latency than memory access on a (physical) single-socket system.

That the single-processor system has low memory latency was an interesting but non-actionable bit of knowledge, until recently. The widespread practice in the IT world was to have the 2-way system as the baseline standard. Single-socket systems were relegated to small business and turnkey solutions. From long ago to a few years ago, there was a valid basis for this, though the reasons changed over the years. When multi-core processors began to appear, the 2-way became much more powerful than necessary for many secondary applications. But this was also the time virtualization became popular, which gave new reason to continue the 2-way as baseline practice.

Joe points out that for a highly-used transactional system, the lower memory latency might make a single-socket server perform better than a multi-socket server.

Options For Deploying Power BI Reports

Eugene Meidinger covers the various deployment options for Power BI:

Even worse, Power BI is rapidly being iterated on. This is great for users, but a challenge for people trying to keep up with the technology. One year ago the following deployment options didn’t exist.

  1. Sharing individual reports (Jan 2018)
  2. “Apps” (May 2017)
  3. SharePoint Embedding (Feb 2017)
  4. Power BI Premium (May 2017)
  5. Power BI Report Server (June 2017)
  6. Power BI Embedded V2 (May 2017)

It can be a real challenge to keep up. I think that a lot of the dust has settled when it comes to deployment options. I don’t see them adding a lot of new methods. But I expect there to be many small tweaks as time goes on. In fact I had to make two changes to my slides this morning because they announced changes yesterday!

In contrast, I expect another six to be added to this list in the coming three months.  Because it’s Power BI and the only rule behind Power BI is that there must be more.

More Keyboard Shortcuts

Andy Mallon has a great shortcuts cheat sheet for SQL Server Management Studio and SQL Operations Studio:

Nearly two years ago, I first published my Shortcuts cheat sheet. Since then, thousands of people have downloaded it. I’ll be the first to admit that I didn’t expect it to be as much of a hit as it has been. When I give my one-hour talk in person, I bring card stock handouts of my cheat sheets, too. I also ask people for their favorite shortcuts, and I’ve learned some great new hidden gems.

I’ve been working on some updates, and the updated version is ready to go. I’ve added a bunch more shortcuts, and even added shortcuts for SQL Operations Studio. It’s two pages now, for double the fun!

That’s great stuff.  Learning these keyboard shortcuts will provide a nice marginal benefit to your productivity.

Themes And Legends In ggplot2

I have another part of my ggplot2 series up, this time on themes and legends:

You are not limited to using defaults in your graphs.  Let’s go back to the minimal theme but change the fonts a bit.  I want to make the following changes:

  1. Use Gill Sans fonts instead of the default

  2. Increase the title font size a little bit

  3. Decrease the X axis font size a little bit

  4. Remove the Y axis; the subtitle makes it clear what the Y axis contains

By the time we’re through this, we have publication-quality visuals in a few dozen lines of code.  I also have provided a bonus rant on Windows and R and fonts because that’s a nasty experience.

Installing Apache Mesos On EC2

Anubhav Tarar has a guide for setting up Apache Mesos along with Spark and Hadoop on EC2:

Apache Mesos is an open source project for managing computer clusters, originally developed at the University of California. It sits between the application layer and the operating system to help applications run efficiently in large-scale distributed environments.

In this blog, we will see how to set up a Mesos client and master on EC2 from scratch.

Read on for the step-by-step guide.

Backup Management Is More Than Taking Backups

Kenneth Fisher makes a great point regarding backups:

I’ve said before that backups are at once one of the easiest things DBAs do, one of the most important, and one of the most complicated. Take a full backup, restore it. Pretty simple right? And yet it’s vital when accident or corruption require recovering data. And as simple as it can be on the surface, the more you dig, the more there is to know, and the more complicated it can become. Well, one of those complications is the backup of the backup files. I mean, assuming you are using native backups, that full backup is sitting on a drive somewhere, and hopefully, that drive gets backed up right?

Why? Well, for performance purposes you probably back up your databases locally. To a drive attached to the server. Now you may not, heck you could be backing up to Azure, but for the sake of this argument let’s say you are. Part of a careful disaster recovery plan is making sure you have access to those backups. I’ve heard stories of entire data centers going underwater (literally). You need to at least have a copy of your backups in a separate system, separate location from production.

The proliferation of S3/Blob Storage for “warm” backups and Glacier/Cool Blob Storage for “cold” backups has made it much cheaper to retain longer-term backups.

SQL In Kubernetes On Docker On Windows

Andrew Pruski is two buzzwords away from sending me into sensory overload:

Now, if this is the first time working with Kubernetes you won’t have to perform the next couple of steps but just to confirm, run the following:

kubectl config current-context

If your shell cannot find the kubectl command, add
C:\Program Files\Docker\Docker\Resources\bin\
to your PATH environment variable and restart your shell.

If the command outputs anything other than docker-for-desktop you will need to switch to the desktop cluster.

Click through to see how to set this up.

Finding Palindromes With T-SQL

Chris Hyde has started a new series on palindromes in T-SQL:

Immediately I realized that this algorithm will need to accomplish two different things.  I first need to remove all non-alphabetic characters from the string I am testing, because while “able was I ere I saw Elba” is palindromic even leaving the spaces intact, this will not work for other well-known palindromes such as “A man, a plan, a canal, Panama!”  Then the second task is to check that the remaining string is the same front-to-back as it is back-to-front.

With the help of Elder’s Dead Roots Stirring album I set out to find the most efficient T-SQL code to accomplish this task.  My plan was to avoid resorting to Google for the answer, but perhaps in a future post I will go back and compare my solution to the best one I can find online.  For this first post in the series I will tackle only the first task of removing the non-alphabetic characters from the string.

Read on to see how Chris takes on this task.
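
To make the two tasks concrete, here is a minimal sketch in Python rather than T-SQL, so it is not Chris's solution. The function names are hypothetical, and ignoring letter case is my assumption (the quoted "A man, a plan, a canal, Panama!" example only works if case is ignored).

```python
def strip_non_alphabetic(text: str) -> str:
    """Task 1: keep only alphabetic characters (drops spaces, digits, punctuation)."""
    # str.isalpha() also accepts non-ASCII letters; that is a choice made here,
    # not something the original post specifies.
    return "".join(ch for ch in text if ch.isalpha())


def is_palindrome(text: str) -> bool:
    """Task 2: check that the cleaned string reads the same in both directions."""
    # Lowercasing is an assumption so that "A" and "a" compare as equal.
    cleaned = strip_non_alphabetic(text).lower()
    return cleaned == cleaned[::-1]


if __name__ == "__main__":
    for phrase in ("able was I ere I saw Elba",
                   "A man, a plan, a canal, Panama!",
                   "Curated SQL"):
        print(phrase, "->", is_palindrome(phrase))
```

Splitting the work into two functions mirrors the two tasks in the quote: the cleaning step can be tested on its own before the comparison step is added.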
