
Author: Kevin Feasel

Digging Into The In-Memory Columnstore Location

Niko Neugebauer does some investigation into where, exactly, memory-optimized columnstore data goes:

This is a rather simple blog post that is dedicated to the theme of the In-Memory Columnstore Indexes location. This has been a constant topic of discussion over a long period of time, even during public events – and there is a need to clear up this topic.

I have assumed that the In-Memory Columnstore structures (Segments, Dictionaries, …) are located In-Memory, but there have been voices that I greatly respect pointing out that the Columnstore Object Pool is actually the exact location of any Columnstore structures, and there is nothing better than to take this feature for a ride and see what the SQL Server engine is actually doing.

Niko shows off a couple of useful DMVs along the way, too.

Comments closed

Analyzing Spatial Data With Cosmos DB

Ben Jarvis shows how to query spatial data from Cosmos DB:

The above code connects to Cosmos DB and retrieves the details for the specified base airfield. It then calculates the range of the aircraft in meters by multiplying the endurance (in hours) by the true airspeed in knots (nautical miles per hour) and then multiplying that by 1852 (the number of meters in a nautical mile). A LINQ query is then run against Cosmos DB using the built-in spatial functions to find airfields within the specified distance. The result is then converted into a JSON array that can be understood by the Google Maps API that is being used on the client side.

The client side uses the Google Maps API to plot the airfields on a map, giving us a view like the one below when given a base airfield of Blackbushe (EGLK), a true airspeed of 100 kts and an endurance of 4.5 hours.
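To make the arithmetic in that excerpt concrete, here is a minimal Python sketch of the range calculation, followed by a simple distance filter. This is not Ben's .NET code, and it does not call Cosmos DB: the function names are hypothetical, and the haversine great-circle formula with a mean Earth radius of 6,371 km is an assumption standing in for the built-in spatial functions, so distances may differ slightly from what the database returns.

```python
import math

METERS_PER_NAUTICAL_MILE = 1852  # exact, by definition of the nautical mile


def range_meters(true_airspeed_kts: float, endurance_hours: float) -> float:
    """Range = speed (nautical miles per hour) * time (hours) * meters per nautical mile."""
    return true_airspeed_kts * endurance_hours * METERS_PER_NAUTICAL_MILE


def haversine_meters(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance between two points given in decimal degrees.

    Assumption: a spherical Earth with the given mean radius.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def within_range(base, candidates, max_distance_m):
    """Return the names of candidates no farther than max_distance_m from base.

    base is a (lat, lon) tuple; candidates maps a name to a (lat, lon) tuple.
    """
    return [
        name
        for name, (lat, lon) in candidates.items()
        if haversine_meters(base[0], base[1], lat, lon) <= max_distance_m
    ]


# The figures from the excerpt: 100 kts true airspeed, 4.5 hours endurance.
print(range_meters(100, 4.5))  # 833400.0 meters, i.e. 450 nautical miles
```

With the excerpt's inputs, the aircraft can reach anything within roughly 833 km of the base airfield, which is the distance the spatial query filters on.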

Click through for .NET code to load and analyze the data.


Faceted ggplot2

I have another post in my ggplot2 series, this time covering facets:

Notice that we create a graph per continent by setting facets = ~continent.  The tilde there is important—it’s a one-sided formula.  You could also write c("continent") if that’s clearer to you.

I also set the number of columns, guaranteeing that we see no more than 3 columns of grids. I could alternatively set nrow, which would guarantee we see no more than a certain number of rows.

There are a couple other interesting features in facet_wrap. First, we can set scales = "free" if we want to draw each grid as if the others did not exist. By default, we use a scale of “fixed” to ensure that everything plots on the same scale. I prefer that for this exercise because it lets us more easily see those continental clusters.

Facets let you compare multiple graphs quickly.  They’re great for fast comparison, but as I show in the post, you can distort the way the data looks by lining it up horizontally or vertically.


When Nanoseconds Count

Joe Chang thinks about single-socket servers:

There is a mechanism by which we can significantly influence memory latency in a multi-processor (socket) server system, that being memory locality. But few applications actually make use of the NUMA APIs in this regard. Some hypervisors like VMware allow VMs to be created with cores and memory from the same node. What may not be appreciated, however, is that even local node memory on a multi-processor system has significantly higher latency than memory access on a (physical) single-socket system.

That the single processor system has low memory latency was an interesting but non-actionable bit of knowledge, until recently. The widespread practice in the IT world was to have the 2-way system as the baseline standard. Single socket systems were relegated to small business and turnkey solutions. From long ago to a few years ago, there was a valid basis for this, though the reasons changed over the years. When multi-core processors began to appear, the 2-way became much more powerful than necessary for many secondary applications. But this was also the time virtualization became popular, which gave new reason to continue the 2-way as baseline practice.

Joe points out that for a highly-used transactional system, the lower memory latency might make a single-socket server perform better than a multi-socket server.


Exploring MSSQL-CLI

Drew Furgiuele hates the pernicious effect of graphical user interfaces and wants to return to nobler times:

Okay, fine, who’s this really for?

Anyone, I suppose, but in reality? If you’re stuck on a remote system that only allows terminal logins, this is a super handy tool to use. If you’re used to text-based editors, like Emacs, Nano, or even vi or vim, you’ll feel right at home in here. You can type in commands and then run queries. There’s even multi-line support for more complex stuff. There’s another side to this too: if you’re the kind of DBA that gets really mad if someone installs Management Studio on a server, then this might be a solution. It’s got a very small installation footprint, and you don’t get a full UI experience, so unless you know all the big T-SQL commands for heavy administration (like for, say, dealing with an availability group or adding permissions, or taking or restoring backups), you won’t get much use out of it. But in a pinch, on a server, in a crisis where you can’t tell if SQL is up and serving queries? It just might work out well for you.

Read the whole thing for an enlightening Q&A session, although the question-asker is kind of a jerk to the answerer.


Options For Deploying Power BI Reports

Eugene Meidinger covers the various deployment options for Power BI:

Even worse, Power BI is rapidly being iterated on. This is great for users, but a challenge for people trying to keep up with the technology. One year ago the following deployment options didn’t exist.

  1. Sharing individual reports (Jan 2018)
  2. “Apps” (May 2017)
  3. SharePoint Embedding (Feb 2017)
  4. Power BI Premium (May 2017)
  5. Power BI Report Server (June 2017)
  6. Power BI Embedded V2 (May 2017)

It can be a real challenge to keep up. I think that a lot of the dust has settled when it comes to deployment options. I don’t see them adding a lot of new methods. But I expect there to be many small tweaks as time goes on. In fact I had to make two changes to my slides this morning because they announced changes yesterday!

In contrast, I expect another six to be added to this list in the coming three months.  Because it’s Power BI and the only rule behind Power BI is that there must be more.


More Keyboard Shortcuts

Andy Mallon has a great shortcuts cheat sheet for SQL Server Management Studio and SQL Operations Studio:

Nearly two years ago, I first published my Shortcuts cheat sheet. Since then, thousands of people have downloaded it. I’ll be the first to admit that I didn’t expect it to be as much of a hit as it has been. When I give my one-hour talk in person, I bring card stock handouts of my cheat sheets, too. I also ask people for their favorite shortcuts, and I’ve learned some great new hidden gems.

I’ve been working on some updates, and the updated version is ready to go. I’ve added a bunch more shortcuts, and even added shortcuts for SQL Operations Studio. It’s two pages now, for double the fun!

That’s great stuff.  Learning these keyboard shortcuts will provide a nice marginal benefit to your productivity.


Themes And Legends In ggplot2

I have another part of my ggplot2 series up, this time on themes and legends:

You are not limited to using defaults in your graphs.  Let’s go back to the minimal theme but change the fonts a bit.  I want to make the following changes:

  1. Use Gill Sans fonts instead of the default

  2. Increase the title font size a little bit

  3. Decrease the X axis font size a little bit

  4. Remove the Y axis; the subtitle makes it clear what the Y axis contains

By the time we’re through this, we have publication-quality visuals in a few dozen lines of code.  I also have provided a bonus rant on Windows and R and fonts because that’s a nasty experience.


Installing Apache Mesos On EC2

Anubhav Tarar has a guide for setting up Apache Mesos along with Spark and Hadoop on EC2:

Apache Mesos is an open source project for managing computer clusters, originally developed at the University of California, Berkeley. It sits between the application layer and the operating system to help applications run efficiently in large-scale distributed environments.

In this blog, we will see how to set up a Mesos client and master on EC2 from scratch.

Read on for the step-by-step guide.


Backup Management Is More Than Taking Backups

Kenneth Fisher makes a great point regarding backups:

I’ve said before that backups are at once one of the easiest things DBAs do, one of the most important, and one of the most complicated. Take a full backup, restore it. Pretty simple, right? And yet it’s vital when accident or corruption requires recovering data. And as simple as it can be on the surface, the more you dig, the more there is to know, and the more complicated it can become. Well, one of those complications is the backup of the backup files. I mean, assuming you are using native backups, that full backup is sitting on a drive somewhere, and hopefully, that drive gets backed up, right?

Why? Well, for performance purposes you probably back up your databases locally, to a drive attached to the server. Now you may not (heck, you could be backing up to Azure), but for the sake of this argument let’s say you are. Part of a careful disaster recovery plan is making sure you have access to those backups. I’ve heard stories of entire data centers going underwater (literally). You need to at least have a copy of your backups in a separate system, in a separate location from production.

The proliferation of S3/Blob Storage for “warm” backups and Glacier/Cool Blob Storage for “cold” backups has made it much cheaper to retain longer-term backups.
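As a rough illustration of keeping a second copy, here is a minimal Python sketch that copies backup files from a local folder to another location and checks each copy with a SHA-256 checksum. It is a sketch under stated assumptions rather than anything from Kenneth's post: the function names, the `*.bak` pattern, and the idea of a mounted target folder are all hypothetical, and a real setup would more likely push to object storage or a separate backup system.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large backup files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_backups(source_dir: Path, target_dir: Path, pattern: str = "*.bak") -> List[Path]:
    """Copy matching files to target_dir and return the paths that were copied.

    Files already present in target_dir with an identical checksum are skipped.
    Assumption: target_dir lives on a separate system (for example, a mounted
    share in another location); this function cannot verify that.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source_dir.glob(pattern)):
        dst = target_dir / src.name
        if dst.exists() and sha256_of(dst) == sha256_of(src):
            continue  # an identical copy is already in place
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        if sha256_of(dst) != sha256_of(src):
            raise IOError(f"Checksum mismatch after copying {src.name}")
        copied.append(dst)
    return copied
```

A matching checksum only shows that the copy is byte-for-byte identical to the original file. It says nothing about whether the backup itself can be restored, so restore tests are still needed.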
